Wednesday, June 21, 2023

Lots of Effort, Terrible Result

On June 13, 2023, a seminar was held at CERN to report some of the results and methods used by one of the major collaborations involved in the LHC. One of the motivations for this work is the fact that there is matter in the universe but very little antimatter. It is expected that whatever created the initial, very hot, universe would have produced as much matter as antimatter. These components would then have interacted and annihilated, leaving neither matter nor antimatter, just a lot of photons. We are an indication that this didn't happen.

Beginning in the 1960s and continuing into the 2000s, experimental evidence showed that certain particle interactions violated matter/antimatter symmetry. This was incorporated into the Standard Model of Particle Physics back in the '60s. However, the asymmetry involved wasn't enough to explain the amount of matter we see.

When the LHC was designed, decades ago, it was built with four interaction regions. One of these hosts LHCb, an experiment specifically designed to determine how well the asymmetry predicted by the Standard Model matches experimental results. That shows how important this question is to physics. The recent seminar presented the current status of that data analysis, and it shows that the measured asymmetry is consistent with the predictions of the Standard Model.

So why am I writing this? I saw a post that linked to an article that was clearly related to this seminar. It also contained a significant amount of material clearly intended to provide background on the subject for readers that aren't familiar with the field. Why don't I say it was about the results from the seminar? Because what it reported implied the opposite of what actually happened.

Let's look at both articles. One says, correctly, "The weak force of the Standard Model of particle physics is known to induce a behavioural difference between matter and antimatter". The other says, incorrectly, the opposite: "The Standard Model of physics tells us that if we substitute a particle for its antiparticle, it should still operate within the laws of physics in the same way". There are numerous other examples of the second article getting things completely wrong, most importantly in the way that the seminar's results are portrayed.

The first, the correct one, says "...  the new LHCb results, which are more precise than any equivalent result from a single experiment, are in line with the values predicted by the Standard Model". The other says the results do "... not fully answer why there is more matter than antimatter in the universe, [the experimental results] will help constrain models that do attempt to explain this strange asymmetry". Although it isn't explicit, it implies that the results show something new, the exact opposite of what is true.

How does this happen? The "journalist" could simply have copied, or slightly reworded, the article linked above. Instead they clearly expended lots of effort. Unfortunately, they had essentially no understanding of any of the physics involved.

I was made aware of this when a link to the incorrect article was posted by a friend. Reading it made it clear that the information wasn't trustworthy. A quick search found the article at the top of this rant. I sent the link as a comment to my friend's post. The fact that people are far more likely to come across popular articles on things like this is not a surprise. The problem is that articles like this almost always get something, or in this case essentially everything, wrong.

Wednesday, May 3, 2023

A Superposition of Errors

For a really long time, I've been trying to construct a simple explanation of Quantum Mechanics (QM) that could be understood without a background in the math used in the formal studies of the subject. I hope to eventually do that but there are several subtopics. I'm not sure which to tackle first and I keep getting stuck.

Recently I came across a science blogger's article about quantum computing that said:

Qubits exploit the quantum phenomenon of superposition, the ability for a particle to be in more than one state at once. A qubit can therefore be in any state between 0 and 1 inclusive, and in fact can be in every state from 0 to 1 at the same time.

Yes, qubits exploit superposition. Essentially every other part of this is wrong. Superposition is not a quantum phenomenon. A qubit is not in more than one state at a time; it is in a state formed by a superposition of other states. There are no states between 0 and 1, so a qubit cannot be in a state between zero and one. The assertion that it is "in every state from 0 to 1 at the same time" isn't even wrong.

So, thanks to this "incentive" I'll address superposition.

Superposition is a general property of an enormous category of mathematical relationships. One set of those is used in QM but it is not a "quantum phenomenon".

To explain the rest of these errors I need to explain a bit about superposition.

I'll use light waves as an example. As I hope everyone reading this knows, light is an electromagnetic wave. One particularly simple and useful way to view a light wave is that it has its electric field value changing like a sine wave in a single direction. Light like this is said to be polarized in that direction. 

A particular electromagnetic wave can be polarized in the vertical direction. Another possible wave is one that is polarized horizontally. Electromagnetic waves are in the category of mathematics that supports superposition. This means that the sum of any solutions to the relevant equations is also a solution. So we can add the horizontal and vertical waves and the result will also be a valid electromagnetic wave. I chose these options because adding combinations of vertical and horizontal waves, also known as forming a superposition, results in an electromagnetic wave polarized at any possible angle. (For those who aren't spooked by trig functions: to set the polarization at an angle θ from vertical, the vertical component is cos(θ) and the horizontal component is sin(θ).) Any linearly polarized light wave can be constructed from a superposition of horizontally and vertically polarized waves. Light that is polarized at some angle to the direction in which the polarization is being measured is not in more than one polarization state at a time. It is a distinct polarization state that is formed by the superposition of other states.

If we consider light as a series of photons we start to see some of the effects of QM. A photon of light is an all-or-nothing sort of thing. If the photon encounters a polarizing filter it either goes through it or it doesn't. The proportion of photons that will go through is related to the amount of horizontal and vertical polarization in the superposition. Each photon either passes through the filter or it doesn't. There are only two possible outcomes; the photon has an intrinsic "two-valueness".
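To make this concrete, here is a minimal sketch (in Python, with made-up variable names) of the picture above: a polarization state at angle θ written as a superposition of vertical and horizontal states, and the resulting all-or-nothing behavior of individual photons at a vertical filter.

```python
import numpy as np

# Basis states: vertical and horizontal polarization.
V = np.array([1.0, 0.0])
H = np.array([0.0, 1.0])

def polarized_at(theta_deg):
    """State polarized at angle theta from vertical: a superposition of V and H."""
    t = np.radians(theta_deg)
    return np.cos(t) * V + np.sin(t) * H

def pass_probability(state, analyzer=V):
    """Probability that a single photon in `state` passes a filter aligned with `analyzer`."""
    return float(np.dot(analyzer, state) ** 2)

state = polarized_at(30.0)          # a single, definite polarization state
p = pass_probability(state)         # cos^2(30 deg) = 0.75
print(f"P(pass vertical filter) = {p:.3f}")

# Each photon is all-or-nothing: simulate many photons hitting the filter.
rng = np.random.default_rng(0)
passed = rng.random(100_000) < p
print(f"Fraction transmitted in simulation: {passed.mean():.3f}")
```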

(The above treatment of polarized light ignores several very important aspects of the topic, like circular polarization and polarization measurements at arbitrary angles. These don't have any relevance to the topic of this entry, and they don't relate to the general topic of superposition in quantum computers.)

This superposition is, mathematically, the same as superposition in the context of quantum computers.

Many quantum mechanical systems can only be in discrete states. Let's consider the (so-called) spin of an electron. The spin, or angular momentum, of a regular object is a vector whose direction is the axis of rotation and whose magnitude depends on the distribution of mass and the rate of rotation. For an electron, the spin behaves in a way that has no classical analog. No matter how the electrons are aligned, and no matter what axis the spin is measured along, the result always has the same magnitude, either along or opposite that measurement axis. Explaining what this means and why it's so weird is far beyond the scope of this rant. This "two-valueness" is true for all qubits, not just the ones based on electron spin.

So, how should that article have described qubits? Here's a possibility.

Qubit is a portmanteau of quantum and bit. Qubits take advantage of superposition, a fundamental property of quantum systems that can make any combination of states a possible state. Qubits, like regular bits, have two states, usually called 0 and 1. Superposition allows qubits to be in other states, ones that aren't possible with regular bits, where the value is both 0 and 1 at the same time. When combined with another quantum behavior, entanglement, this greatly increases the amount of information that can be encoded in a set of qubits.
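For readers who like code, here is a toy sketch of that description, in plain Python rather than any real quantum computing library. A qubit is represented by two amplitudes, and measuring it always gives one of the two values, with probabilities set by the superposition.

```python
import numpy as np

def qubit(a, b):
    """A qubit state a|0> + b|1>, normalized so |a|^2 + |b|^2 = 1."""
    state = np.array([a, b], dtype=complex)
    return state / np.linalg.norm(state)

def measure(state, shots=10_000, seed=0):
    """Simulate repeated measurements in the 0/1 basis."""
    p0 = abs(state[0]) ** 2
    rng = np.random.default_rng(seed)
    return (rng.random(shots) >= p0).astype(int)  # 0 with probability p0, else 1

# An equal superposition of 0 and 1: a single, well-defined state,
# but one that has no counterpart among regular bits.
psi = qubit(1, 1)
outcomes = measure(psi)
print(f"P(0) ~ {np.mean(outcomes == 0):.3f}, P(1) ~ {np.mean(outcomes == 1):.3f}")
```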

Monday, July 11, 2022

Infrared is NOT heat

There has been a lot of coverage of infrared (IR) astronomy recently and, with the release of the first JWST image, this should continue. One of the most common things I've come across in this area is the assertion that IR radiation is heat.

It isn't.

Explaining why very knowledgeable people say this, and why it is wrong, will take a bit of background.

First, let's talk about light. For the average person that's a really familiar topic. Light is the stuff we see. But we can do better than that. It has been known since 1865, when James Clerk Maxwell derived the speed of electromagnetic (EM) waves, that the light our eyes can see is just one kind of electromagnetic wave. Let's call that visible light. The study of electricity and magnetism (E&M) is very mature and, without going into any detail here, one of the very well understood aspects of E&M is that energy can be sent through space as a wave of intertwined electric and magnetic fields. These all travel at the same speed and the speed derived from Maxwell's Equations is the measured speed of visible light.

It was noticed by William Herschel in 1800 that a prism produces more than just the colors that we can see. If you take light from the Sun and pass it through a prism, as was done by Newton, it is split into a fan of color in the familiar pattern of a rainbow, Red-Orange-Yellow-Green-Blue-Indigo-Violet (or not so familiar since indigo isn't a color we encounter often). Herschel was interested in the amount of energy in different colors of light. Thermometers with blackened bulbs were placed so that different parts of the spread-out sunlight would heat them up. He noticed that a thermometer placed past the red light coming out of the prism would also heat up. In fact, the thermometer beyond the red got even warmer than the one in red light. This showed that there is something coming from the Sun that transferred energy, was bent by a prism, and was not visible.

We have since learned that there is a wide range of EM waves that physicists call "light", everything from radio waves to gamma rays. Only a tiny fraction can be seen with our eyes and is called "visible light". This is even true for sunlight. Less than half of the energy emitted from the Sun is in visible light. A bit more, still less than half, is IR. Most of the rest is ultraviolet (UV).

Next, let's talk about heat. All matter is made of molecules. These molecules are moving around, even in a solid. That kinetic energy, the energy of motion, is what heat actually is. When an object absorbs more energy it gets warmer. This is true no matter how hot the object is.

What kind of energy is emitted by an object at a given temperature? You probably remember that heat is transferred in three different ways: Convection, conduction, and radiation. Since we're going to be talking about objects that aren't touching anything, only radiation matters. What determines the properties of the radiation given off by an object? This is another fascinating topic but the most important things to know are:

1) The radiation given off is EM waves

2) (almost) All objects made of (almost) any material give off (almost) the same radiation when at the same temperature. (The "almosts" will be ignored from now on.)

3) All objects (at a nonzero temperature) give off the most radiation at a wavelength that gets shorter as the temperature gets higher.

4) If object A is at a higher temperature than object B, object A will give off more radiation than object B at ALL wavelengths. 

For objects that are a few thousand degrees that peak is in visible light. For temperatures in the millions of degrees it is in X-rays. For the coldest objects warmed only by the cosmic background radiation it is microwaves. All of these types of radiation tell you (something) about the temperature of the object giving off that radiation. 
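If you want to check point 3 yourself, here is a small sketch using Wien's displacement law (peak wavelength ≈ 2.898 mm·K divided by the temperature). The temperatures are just illustrative round numbers.

```python
# Wien's displacement law: peak wavelength = b / T, with b ~ 2.898e-3 m*K.
B_WIEN = 2.898e-3  # meters * kelvin

examples = {
    "Sun-like surface (~5800 K)": 5800,
    "Room-temperature object (~300 K)": 300,
    "Million-degree plasma (1e6 K)": 1e6,
    "Cosmic microwave background (~2.7 K)": 2.7,
}

for name, temperature in examples.items():
    peak = B_WIEN / temperature  # meters
    print(f"{name}: peak emission near {peak:.3e} m ({peak * 1e9:.1f} nm)")
```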

What happens when various forms of EM waves are absorbed by an object? When the energy in that wave is absorbed it heats the object up. This is true of all forms of EM waves. This highlights the first misconception caused by saying that "Infrared is heat". Most people think that IR is particularly good at heating things. It is, but not because IR is heat. It is because most of the materials we deal with in our day-to-day lives absorb IR quite strongly. This is not true in general.

The association of IR with heat is mostly an accident of the way it was discovered and the temperatures we normally experience.

IR is useful in astronomy NOT because of this association. Here are a few of the reasons that an IR telescope is useful:

We think of space as empty, and compared to what we usually experience it is. But there are large volumes that contain lots of dust. That dust absorbs visible light. It absorbs IR far less. This allows us to see inside these clouds or even through them. This has allowed us to track individual stars as they orbit the supermassive black hole at the center of our galaxy. This work won a recent Nobel Prize in Physics.

As the universe expands, light traveling through it is stretched along with it. The earliest stars that we think exist have most of their visible light shifted into the IR. That means that we need to be able to detect IR light to see them. (A quick calculation follows these examples.)

Different molecules absorb light in characteristic patterns in many different parts of the EM spectrum. Many of the most interesting constituents of planetary atmospheres have their most distinctive absorption features in IR light.

Many of the asteroids in our solar system reflect very little of the light that illuminates them. This means that they are very faint. But absorbing that light means that they warm up and emit light of their own, much of it IR light.
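To put a rough number on the redshift point above: light is stretched by a factor of (1 + z), where z is the redshift. The value z = 10 below is only an illustrative assumption for very early galaxies, not a measurement.

```python
# Cosmological redshift: observed wavelength = emitted wavelength * (1 + z).
z = 10                      # illustrative redshift for a very early galaxy
visible_nm = (400, 700)     # roughly the range our eyes can see, in nanometers

observed_um = tuple(wl * (1 + z) / 1000 for wl in visible_nm)
print(f"Visible light ({visible_nm[0]}-{visible_nm[1]} nm) arrives at "
      f"{observed_um[0]:.1f}-{observed_um[1]:.1f} micrometers: well into the infrared.")
```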

Being able to explore the universe in IR with JWST is sure to teach us a LOT. But it isn't because IR is heat.

Friday, December 17, 2021

The Dangers of “Good” Science Communication

One of the central lessons I have learned from my time in the Skeptic community is that the veracity of information isn't the primary factor in determining what people accept as true. Some groups have known this for a very long time. The practices of the advertising and marketing sectors are largely determined, whether they realize it or not, by the psychology of belief. In our highly connected, ad-driven world, knowing how to use the largely unconscious factors that affect attention and acceptance is central to success. This has worked its way into almost every aspect of our lives. One of my favorite science YouTube channels recently did a video about some of the things that affect the popularity of a video and interactions with the algorithm that determines how often that video is presented and to whom. The subject has also come up in print as various science communicators talk about the importance of headlines and the way they present information.

As all of these people get better and better at presenting information in a way that appeals to our psychology, they are able to make content that is more convincing and more likely to "go viral". I see this as a significant, and essentially ignored, danger. People can be wrong. The more expertise you have in a field the less likely you are to be wrong. This is a particular danger in science communication. The material is often quite subtle. Without sufficient expertise in the subject material, it is likely that the message will misinform as much as, or even more than, it informs. This is fairly well understood and recognized, at least in the abstract. As Neil deGrasse Tyson put it recently in the ad for his MasterClass: "One of the great challenges in life is knowing enough to think you're right but not enough to know you're wrong". As science communicators get better and better at presenting their material in a convincing manner, the material is more likely to stick in people's memories. When they present incorrect information in this more convincing manner, their audience accepts it and remembers it even better.

Let's consider a specific, quite narrow, topic: waste from thorium reactors. The amount of misinformation on this subject is enormous. I've seen trusted science communicators assert that the waste from thorium reactors is far less radioactive and has a shorter half-life than that of current reactors. This is not only wrong, it reinforces a fundamental misunderstanding of the subject. If something is less radioactive it has, by definition, a longer half-life. That's simply what the words mean. It is impossible for something to be both less radioactive and have a shorter half-life. The half-life is the length of time needed for half of the sample to decay. A substance with a very long half-life is very difficult to distinguish from one that is not radioactive. This mistake is often made in the opposite direction. You will see pronouncements about the danger of "highly radioactive materials with a long half-life". Such materials, by definition, cannot exist. One science communicator, popular among skeptics, explained that material in a thorium reactor is "completely burned" so it "has had almost all its radioactivity already spent". As if radioactivity is a substance that is released in nuclear reactors. Not only is this not the way it works, it encourages people to think of reactors in ways that are simply wrong. Such misinformation can only make things worse. When that misinformation is skillfully communicated, it does so to a greater extent.

I have written about this type of problem before. That post was about the use of "whiz-bang" visuals, one of the many ways a video is made more appealing. So what can be done? A simple, fairly effective solution is both obvious and not practical: restrict science communication to people that are true experts in the field being communicated. Another, slightly more practical option, is to get science communicators to confirm what they say with subject matter experts. Yet another option is for topics that need lots of background information to be comprehensible, like the characteristics of spent fuel from thorium reactors, to be out of bounds for science communicators.
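Going back to the half-life claim above, here is a small sketch to make the relationship concrete. For a given number of atoms, the activity (decays per second) is N·ln(2)/t½, so a longer half-life necessarily means a less radioactive sample. The half-lives below are purely illustrative.

```python
import math

def activity(num_atoms, half_life_seconds):
    """Decays per second for a sample: A = N * ln(2) / t_half."""
    return num_atoms * math.log(2) / half_life_seconds

N = 1e20  # the same number of atoms in each hypothetical sample
samples = {
    "half-life of 30 years": 30 * 365.25 * 86400,
    "half-life of 30,000 years": 30_000 * 365.25 * 86400,
}

for label, t_half in samples.items():
    print(f"{label}: {activity(N, t_half):.3e} decays per second")
# The sample with the 1000x longer half-life is 1000x less radioactive:
# for a given amount of material you cannot have both "more radioactive"
# and "longer half-life" at the same time.
```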


Wednesday, November 10, 2021

Nuclear Waste as Nuclear Fuel? Not really.

I keep running across the assertion that nuclear waste can be used as nuclear fuel. In several cases it is asserted that it is only regulations that prevent this use. One went so far as to say "[W]hat’s the difference between nuclear fuel and nuclear waste. The scientific answer is really nothing, there is no difference. The only difference is a technological one. Nuclear fuel is radioactive material we know how to burn as fuel and radioactive waste is nuclear material we don’t know how to burn as fuel but there is no other real, physical difference between fuel and waste. It’s a technological difference only."

The quote in the previous paragraph is simply false. As will be explained below, nuclear waste, no matter what reasonable definition you use, contains a significant amount of material that cannot be used in a nuclear reactor. For the rest of this I'll address the more reasonable versions of the assertion.

To honestly make the claim that nuclear waste can be used as nuclear fuel several additional, and extremely important, details need to be added. Here's an overview. Details follow.

It is helpful to think of two types of fuel, active fuel and potential fuel. Active fuel undergoes a reaction that releases large amounts of energy, allowing the reactor to generate power. Potential fuel undergoes a reaction (or reactions) that convert it to an active fuel. In our day-to-day world only active fuel is thought of as fuel. Most current nuclear reactors use U-235 as active fuel and in normal operation almost all of it is used up. At the same time a small amount of the U-238, acting as potential fuel, is converted to Pu-239 that then acts as active fuel. All of the uranium originally put into the reactor is fuel but the vast majority is potential fuel and only a small amount of that either starts as or can be converted to active fuel in current reactors. It is sometimes said that the only difference between nuclear fuel and nuclear waste is the type of reactor it is used in. If by "fuel" we mean active fuel, which is the only kind we are familiar with in normal life, this is simply wrong. It is possible to build reactors where virtually all of the potential fuel is converted to active fuel but those reactors are complex and expensive and they require a large amount of reprocessing to get anywhere near this point. It should also be noted that these future reactors can NOT be operated using only what we now call waste (more detail below). New, active fuel is needed.

The first problem is that the term "nuclear waste" means entirely different things to different people. For this discussion we'll be considering "spent nuclear fuel". This is a small fraction of the radioactive material produced in a reactor and is the only really problematic material short of a nuclear accident.

For this discussion we can consider most nuclear fuel as just a mixture of two isotopes of uranium. The majority is U-238, about 95% of it. It is potential fuel so it cannot take part in the power generating nuclear reactions. The rest, about 5%, is U-235. This is the active fuel that produces the desired energy in the reactor.

When the spent nuclear fuel is removed most of the U-235 has been used up; it now makes up less than 1% of the material. It has been converted into what are called fission byproducts. These consist of a wide range of materials, some quite radiologically hazardous, but for consideration as fuel they are worthless. Some of them are worse than useless: they will absorb neutrons, suppressing the desired nuclear reactions. They are truly waste. A small amount of the U-238 has been converted to plutonium, mostly Pu-239 with a smaller amount of Pu-240 and other isotopes. In total about 1% is plutonium. Some of the uranium is also converted to other uranium isotopes.

The claim that spent fuel is reusable is based on reprocessing. The idea is that by a variety of processes the spent fuel can be separated into components and then placed back into the reactors that it came from. Often the claim is made with numbers attached. Most commonly I've seen it as something like this paraphrased example: "Only around 5% of the energy available in nuclear fuel is extracted in our current reactors. 90% of the rest could be extracted if the fuel were reprocessed".

The 5% number should sound familiar, it is the original amount of U-235. The other 95% is based on the idea that the U-238 can also be used as a nuclear fuel. 

This is almost correct if we are careful about what kind of fuel we mean and, more importantly, realize that it requires a type of reactor that is different from essentially all of the ones currently in use. The fact that these issues are almost never mentioned when the claim is made is what makes the claim so misleading.

To understand this better a bit more background is required. All of the reactors in this discussion are fission reactors. Nuclear fission occurs when a neutron is absorbed by a nucleus and that nucleus splits, roughly in half, and often emits a number of neutrons. There are many isotopes that will undergo fission but the important subset for this discussion are those that emit enough neutrons when they fission that a chain reaction is possible. These are called fissile materials. There are only a few such isotopes and three of them are most important for nuclear power: U-233, U-235, and Pu-239. Of these only U-235 exists in any noticeable amount in nature, and there isn't much: it is just 0.7% of natural uranium. The remaining 99.3% is essentially all U-238.

Most power reactors use uranium that has been enriched in U-235 to about 5%. This is done by separating out almost pure U-238, increasing the proportion of U-235. The separated material is known as depleted uranium. Enrichment is done so that the neutrons released by fission are likely enough to reach another U-235 nucleus and sustain the reaction. This results in the roughly 95% U-238 mentioned above. As noted earlier, the operation of the reactor results in some Pu-239 being created, some of it used as active fuel, but about 1% of the spent fuel is plutonium.
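As a rough illustration of what enrichment involves, here is a standard mass-balance sketch of how much natural uranium feed it takes to make 1 kg of 5% enriched fuel. The tails assay (the U-235 left behind in the depleted uranium) is an assumed, typical value, not something discussed above.

```python
def feed_per_kg_product(product_assay, feed_assay=0.00711, tails_assay=0.0025):
    """U-235 mass balance: kg of natural uranium feed per kg of enriched product.
    F/P = (x_product - x_tails) / (x_feed - x_tails)."""
    return (product_assay - tails_assay) / (feed_assay - tails_assay)

kg_feed = feed_per_kg_product(0.05)   # 5% enriched fuel
print(f"~{kg_feed:.1f} kg of natural uranium per kg of 5% enriched fuel")
print(f"~{kg_feed - 1:.1f} kg of that becomes depleted uranium")
```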

The transformation from the potential fuel, U-238, to Pu-239, is a multi-step process.

U-238 + neutron → U-239
U-239 decays to Np-239 (half-life of about 25 minutes)
Np-239 decays to Pu-239 (half-life of about 2.5 days)
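Here is a quick sketch of how fast that chain plays out, using simple exponential decay and the approximate half-lives quoted above.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of a sample not yet decayed after `hours`: 0.5^(t / t_half)."""
    return 0.5 ** (hours / half_life_hours)

U239_HALF_LIFE_H = 25 / 60     # ~25 minutes
NP239_HALF_LIFE_H = 2.5 * 24   # ~2.5 days

# After a few hours essentially all of the U-239 has become Np-239 ...
print(f"U-239 left after 3 hours: {fraction_remaining(3, U239_HALF_LIFE_H):.4f}")
# ... but it takes a couple of weeks for nearly all of the Np-239 to become Pu-239.
print(f"Np-239 left after 14 days: {fraction_remaining(14 * 24, NP239_HALF_LIFE_H):.4f}")
```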

Near the end of the useful life of the nuclear fuel a significant fraction of the energy comes from the fission of Pu-239. It is possible to design a reactor to optimize the production of Pu-239. These are called "breeder" reactors and they produce more active fuel than the active fuel they consume. Breeder reactors sound too good to be true. Producing more fuel than is consumed doesn't seem possible, but it is. That's one of the problems with this entire topic. So much of what happens is so far outside of the intuition developed in our day-to-day world that we are often led to conclude things that aren't true.

As an interesting aside, the oft mentioned Thorium reactor is also a breeder. Here Th-232 is a potential fuel that goes through a similar (but slower) multi-step reaction to produce the active fuel U-233, the third fissile isotope.

Current reprocessing of spent fuel produces, among other things, MOX fuel. This is Mixed OXide fuel, made of oxides of uranium and plutonium; current reactors can use this material as a minority of their fuel. Almost all of it is produced by combining plutonium from reprocessing with depleted uranium. Depleted uranium is used because the uranium obtained from reprocessing (RepU) contains various impurities that impede the nuclear reactions and have other negative consequences that are beyond the scope here. This is in direct contradiction to the common assertion that MOX fuel is composed mostly of the reprocessed material and results in more of the potential fuel being used.

In conclusion, there is a lot of misinformation about nuclear power and so-called "nuclear waste". Much of that misinformation greatly exaggerates the dangers and problems, but a significant amount is being spread by nuclear proponents. While it is possible to process spent nuclear fuel and extract virtually all of the nuclear energy available in the uranium fuel (both active and potential), this is far more complex and expensive than proponents imply, it would require reactors that are significantly different from those in use today, and it would not use all of the material as fuel.

Monday, July 19, 2021

Vaccination Fraction and the Base Rate Fallacy

I keep seeing references to the fraction of people with COVID related conditions that have been vaccinated. A recent story implied that the fact that about half of the infections in Israel were in vaccinated people was evidence that the vaccines aren't working. There is a nice post that looks at the stories about Israel. Half sounds like a lot only if you don't take into account the base rate of vaccination. Since 85% of the people have been vaccinated, the fact that a far smaller fraction of the infected were vaccinated shows that the vaccines do work. Not taking the base rate into account, or deceptively not including it in the information provided, is the Base Rate Fallacy.

What is the relationship between vaccine efficacy, vaccination rate, and the fraction of those affected that are vaccinated? It turns out that it isn't as simple as you might expect.

Say the fraction of people that are vaccinated is v, and the fraction of illnesses the vaccine prevents is e. Then the fraction of infected people that are vaccinated is:

f = v / (v + (1-v)/(1-e))
Let's look at some examples. If we set both e and v to 0.85, meaning that 85% of the population has been vaccinated and that the vaccine is 85% effective in preventing disease, then the fraction of the cases that are vaccinated is expected to be just under half. For the current situation in the US let's take e=98% and v=30%. This gives us 0.008, which is a pretty good match for the oft repeated "99% of those hospitalized for COVID in the US have not been vaccinated".

I see a slight danger here. If we consider e=0.98 and v=0.5 we get 0.019. The vaccinated fraction more than doubles. The reason it goes up as the vaccination rate goes up is obvious once it is pointed out. Take the extreme case, where everyone is vaccinated, v=1. Then the fraction of affected people that were vaccinated is also 100%, since that's the only kind of people there are. This is the extreme case of the Base Rate Fallacy. I both hope and fear that when the US gets to v=0.5 we will hear people using this doubling in the vaccinated fraction as evidence that vaccines aren't working. I hope for this because it will mean that v has gotten to 0.5, and I fear it will reduce future vaccinations. If we ever get to v=0.95 the fraction will be greater than 1/4, which will doubtless be trumpeted as a failure.
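Here is a minimal sketch of the formula (derived below) that reproduces the examples above.

```python
def vaccinated_fraction_of_cases(v, e):
    """Fraction of cases occurring in vaccinated people, for vaccination
    rate v and vaccine efficacy e."""
    return v * (1 - e) / (v * (1 - e) + (1 - v))

for v, e in [(0.85, 0.85), (0.30, 0.98), (0.50, 0.98), (0.95, 0.98), (1.00, 0.98)]:
    f = vaccinated_fraction_of_cases(v, e)
    print(f"v={v:.2f}, e={e:.2f} -> fraction of cases vaccinated = {f:.3f}")
# v=1.00 gives 1.000: when everyone is vaccinated, every case is a vaccinated case.
```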

How do we get this equation?

Let's define some factors. Let's say that if G (for group) unvaccinated people are exposed we get Gc infections. That defines the degree of contagion c. If G vaccinated people are exposed there will be Gc(1-e) infections where e is the efficacy of the vaccine so we have a modified contagion factor of c(1-e).

If we have a group of N people then the number of cases expected in the vaccinated group, Cv, is Nv (the number of vaccinated people) times the modified contagion factor c(1-e).

The number of cases in the unvaccinated group, Cu, is N(1-v) (the number of unvaccinated people) times the contagion factor c.

So the fraction of cases that are vaccinated is the number of cases in vaccinated people divided by the total number of cases, Cv/(Cu+Cv), or

Nvc(1-e) / (Nvc(1-e) + N(1-v)c)

There is a common factor of Nc, so that cancels; the number of people and the contagion factor don't enter into the final result. So we get:

v(1-e) / (v(1-e) + (1-v))

We could also divide through by (1-e) and get the expression at the top.

Friday, July 2, 2021

The Analemma

Most of us have seen an analemma though many are not familiar with the term. If you've looked at a globe or world map you've probably seen a curve that looks like a figure-8 somewhere in the Pacific Ocean. Like this (source: Wikimedia Commons):


The analemma is the path in the sky the sun follows at the same local time every day over the course of a year. This is the subject of some beautiful images. This one is tilted because the image shows the location of the Sun in the morning, not noon as will be used in the rest of this post.



If you haven't thought about it, it is hard to connect that statement with that shape. In fact, even if you have thought about it, it isn't easy. So what's going on here?

There are LOTS of explanations of this on the web that range from simplistic to opaque. They treat the subject with varying degrees of completeness and comprehensibility. I looked at a lot of them and didn't find a single one that explained the features in a way that's comprehensive, correct, and accessible. So I decided to write this one.

Some things are easy to see. The Earth has a tilted axis that remains fairly fixed in orientation as it goes around the Sun. Except for areas near the equator, that means the Sun is higher in the sky in local summer than in the winter. For a lot of this discussion we'll be talking about the subsolar point, the spot on the Earth where the sun is directly overhead. This point is at its most northward at the June solstice and most southward at the December solstice. Each of those latitudes is removed from the equator by the tilt of the axis. The full range of this is just twice the tilt of the axis.

That explains most of what's going on but you might think that the sun would just move up and down in a straight line. There are three reasons that this isn't the case. None are obvious without a fair amount of thought and only one is explained well, or even at all, in most treatments of the analemma.

First we need to talk about one of my favorite bits of pedantry. When asked what the length of a day is on Earth most people have no problem coming up with 24 hours. And that's right, or almost right. The problem is that there are at least three kinds of days. For most planets, including Earth, the day is determined mostly by the rate at which it rotates. In fact, when you look up the length of the day for any planet other than Earth what you often find is the rotation period measured with respect to the distant stars. This is called the sidereal day and for Earth it is 23 hours 56 minutes 4.09 seconds (approximately). So why do we say that a day is 24 hours? Are we just rounding up? No, that's not what's happening. There is a kind of day that is 24 hours, or pretty darn close. But we aren't there yet.

When people think of the word "day" they think about the time the Sun is above the horizon. At first blush you might think that the length of time it takes for the sun to go all around the sky is one sidereal day. But it isn't. That's because as the Earth turns it also goes around the Sun. Since both of these rotations are in the same direction it takes a bit longer for the subsolar point to return to (near) the same spot than one sidereal day. To make this clear, look at this. (source: Wikimedia Commons):


Marked "1" we see our Blue Marble with a marker pointed to the Sun. Next, at "2", Earth has rotated exactly once so it is pointing in the same direction as before but it has moved a bit around the Sun so it isn't pointed at the Sun. At position "3" it has rotated a bit more and it now pointed at the Sun. This take about four minutes. The time from noon to noon is called a Solar Day. Over the course of an entire year the number of sidereal days is one more than the number of solar days. (This figure is not an accurate representation of the motion of the Earth. The effect and relative sizes are distorted for clarity). So for a planet with a perfectly circular orbit and no axial tilt the Sun would trace the same path in the sky and be at the same point at noon every day.

But the Earth doesn't go around the Sun at a constant rate. It moves faster when closer to the Sun so it takes more time than usual to point back toward the Sun. This effect adds about 10 seconds to the length of a Solar Day in January and shortens it by about the same in June. This means that on some days it takes longer for the Sun to return to the same point in the sky, on others it takes less time. This produces side to side motion as seen in the analemma.

To see this explicitly, let's consider the point where the prime meridian and the equator cross. This is shown by the red pole pointing straight at the camera in the image below. The axis is the thick green pole and there are rings for the equator and the prime meridian.


If we look at the "Earth" (in quotes because there is no axial tilt) from the Sun once every average solar day we see two effects. The "Earth" changes size because the distance changes and it appears to rotate slightly. The spot where the Sun is overhead (subsolar) is indicated by a white dot.


It is easy to see some of the effect of a tilted axis. Let's consider the simplest case: A perfectly circular orbit with an axial tilt of 23.4°. Here are four images showing the situation at the equinoxes and the solstices.


We see that the Sun is overhead in the southern hemisphere in the first image, then at the equator, the northern hemisphere, and the equator again. So, we have the location of the Sun in the sky at successive mean Solar days moving north to south because of the Earth's tilt and east to west because of the elliptical orbit. This is where most explanations of the analemma stop. But it turns out that there are other effects. Even if our orbit were perfectly circular, the Sun's location at solar noon would still move east to west. This shift is also due to the Earth's tilt.

Let's look at a simple, but very exaggerated, situation. A planet with no axial tilt, a perfectly circular orbit, and a sidereal day 1/8 of its year. To start, as we did before, let's look from the direction of the Sun at local noon at the point with the red pole. 


Now let's move ahead one sidereal day. The axis is pointing in the same direction it was in the first image but the angle has changed.



If we advance forward to one solar day from the starting point it looks exactly like the first image. This gives the expected behavior. As we go through the year, after each solar day the Sun returns to the same point in the sky.

But things are different when there is an axial tilt. Here's a set of images that show the differences. The ones on the left are the same as those above. The ones on the right show what happens with a tilt of 45°. We start at the time of the southern summer solstice, so the planet has its southern hemisphere tilted the maximum amount toward the Sun.

Left column: zero tilt. Right column: 45 degree tilt.

The first four images are just what we expect. For the top two the subsolar point is on the meridian and at a latitude that equals the axial tilt. The next two show the result of having revolved 1/3 of the way around the sun just as the planet has completed one solar rotation. But the last images are a surprise. At first we might expect the subsolar point to be along the marked line of longitude in both, but it isn't. The key to understanding this is to look at the axis. The North Pole has moved to the right and the subsolar point is displaced to the right of the meridian. If we look at the animation below showing an entire year of this, we see that the pole appears to shift both left and right as well as north and south.


If we look from this planet, along the direction of the red pole in the animation above we would see the Sun trace out the pattern seen below:


This computation was done with a circular orbit; all of the motion is caused by the large, 45°, tilt. If we try using the values for Earth, a tilt of 23.4° and an eccentricity (a measure of the non-circularity of the orbit) of 0.0167, we get this:


This is not the shape of the analemma as seen from Earth. It's pretty close, but not quite right. So, what went wrong? There is one more factor that affects the shape. We have taken into account the tilt of the axis producing motions in both the vertical and horizontal directions and the eccentricity of the orbit producing its own side to side motion. What we are missing is the relationship between the date of perihelion and the date of the solstice. For Earth, the southern solstice happens about 2 weeks before perihelion. In the computation above they happened on the same date. If we make this adjustment we get a good match:


This can produce some very different shapes. If we leave the date difference at 2 weeks, and increase the eccentricity to 0.3 and the tilt to 45° we get:


We can see both the strange shape and the effect of the large eccentricity, which makes the apparent size of the Sun change by quite a bit.
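For anyone who wants to play with this themselves, here is a rough sketch that approximates the Earth's analemma using two standard textbook approximations, one for the solar declination (the north-south motion from the tilt) and one for the equation of time (the east-west motion). It is an approximation with Earth's actual tilt and orbit baked in; it is not the simulation used to make the figures here, and it won't reproduce the exaggerated examples above.

```python
import math

def declination_deg(day_of_year):
    """Approximate solar declination: the north-south position of the subsolar point."""
    return -23.44 * math.cos(math.radians(360 / 365 * (day_of_year + 10)))

def equation_of_time_min(day_of_year):
    """Approximate equation of time in minutes: how far solar noon drifts
    from mean clock noon (a standard empirical formula for Earth)."""
    b = math.radians(360 / 365 * (day_of_year - 81))
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

# Sampling once a week traces out the figure-8: the east-west offset
# (4 minutes of time = 1 degree) plotted against the declination.
for day in range(1, 366, 7):
    east_west_deg = equation_of_time_min(day) / 4
    print(f"day {day:3d}: {east_west_deg:+6.2f} deg east-west, "
          f"{declination_deg(day):+6.2f} deg north-south")
```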

I'll end this with one more simulation. Here is what the analemma looks like on Mars: