Thursday, September 1, 2016

Live From Low-Earth Orbit!

It looks like I disappeared again. Or maybe I was just too faint to detect above the noise of the internet. Sorry about that. To make up for my absence, this post will have a whole bunch of pictures. After all, there is a favorable exchange rate between pictures and words.

What's brought me out of hiding today is a very cool new account on Twitter. Because the Hubble Space Telescope kind of belongs to the American public, it has started live tweeting where it's looking, what tools it's using to do that looking, and who told it to look there. So you get stuff like this:
The picture is not what Hubble was looking at right then but an image pulled from the Sloan Digital Sky Survey. Hubble can't usefully beam images directly to us, because everything Hubble (and all other telescopes) looks at has to be processed. This notion makes people grumble, because they want to see the raw, unmanipulated data in its purest form rather than rely on whatever artistic license NASA has exercised.

But raw images in astronomy (and raw data more generally in science) simply aren't useful. In fact, they don't even exist, because any contact with an instrument inevitably distorts the data. The purpose of processing images, then, is to remove the imprint of the instrument on the image and hopefully recover what's actually there.

The coolest part about Hubble_Live is that it tweets out this process, too. There are many ways astronomers attempt to extract the true signal from the data collected, but I want to talk about three of the big ones I've learned about and which Hubble employs. These are the dark, bias, and flat field (DARK-EARTH) calibrations.

Hubble performs these calibrations in order to figure out how it's interfering with the pictures it's taking. To see what these calibrations do, I want to show you some data my classmates and I took with a much smaller, terrestrial telescope last fall. We were looking at the Ring Nebula, which Hubble has an obnoxiously gorgeous picture of here for reference:

NASA, ESA, and the Hubble Heritage (STScI / AURA)- ESA / Hubble Collaboration
The Ring Nebula is faint, so to image it we tracked it for two minutes, letting the charge-coupled device (CCD) at the bottom of the telescope count up the photons streaming in from space. But a CCD is not really a camera. It's more accurate to think of a CCD as an electron counter.

At each pixel, there's the electric equivalent of a little bucket that collects electrons and converts them into a voltage that can be measured and manipulated digitally by a computer. Ideally, the way the CCD counts electrons is by having them knocked into the pixel bucket by incoming photons. But there are other sources of electrons, too. If you don't take them into account, you end up with an image that doesn't correspond to what you were looking at. So here's the raw data of the Ring Nebula taken by our telescope:
Ignore the numbers.
As you can see, well, err. Now I said this is the raw data, not an image, because what I'm really showing you is a two dimensional matrix where the intensity at each pixel is proportional to the number of electrons that were counted at that pixel. There's no sense in which this is representative of what a human would see if they had eyes as big as a telescope and could store light for two minutes. It's just a graphical representation of the electrons counted. All pictures you see--whether from Hubble or your smartphone--are just that. The difference is sometimes we want to modify that matrix so it looks something like what people see.

I'm being a little disingenuous here, though. The Ring Nebula is in this data, but because it is very faint compared to some of the pixels in the image, it's not apparent. I can turn up the contrast by bounding the brightness levels you're allowed to see, and then the nebula does appear.
Color photography is so overrated.
But I haven't done anything scientific here. I haven't calibrated the data at all, just chosen what data to show you. This isn't a more accurate or useful representation of the data. To get a scientifically meaningful image, we have to account for all the extra electrons our CCD has picked up.
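The contrast trick amounts to clipping the matrix to a chosen range before displaying it. Here is a minimal sketch in Python with NumPy. The 3x3 frame, the pixel values, and the display limits are all invented for illustration and are not our telescope's data.

```python
import numpy as np

# Hypothetical raw counts: a faint "nebula" pixel (1200) on a ~1000-count
# background, plus one very bright pixel that dominates the display range.
raw = np.array([[1000.0, 1010.0, 990.0],
                [1005.0, 1200.0, 60000.0],
                [995.0, 1000.0, 1008.0]])

def stretch(frame, lo, hi):
    """Clip counts to [lo, hi] and rescale to 0..1 for display only."""
    clipped = np.clip(frame, lo, hi)
    return (clipped - lo) / (hi - lo)

full_range = stretch(raw, raw.min(), raw.max())   # nebula nearly invisible
bounded = stretch(raw, 950.0, 1250.0)             # nebula stands out

print(full_range[1, 1])  # faint pixel on the full scale
print(bounded[1, 1])     # same pixel after bounding the display range
```

Nothing in `raw` changes here. Only the mapping from counts to gray levels does, which is why this is a display choice and not a calibration.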

One electron source is the instrument itself, which because it is not at a temperature of absolute zero consists of vibrating molecules that can occasionally knock an electron into the pixel bucket. This is called the "dark current," because it shows up even when the telescope isn't looking at anything. The warmer your telescope, the larger the dark current will be, which means that weak signals can be lost in the noise of the telescope's heat. You can minimize that heat and detect faint signals by keeping your telescope cold (like, say, by putting it in space).

To determine the dark current, Hubble does a dark calibration, which essentially amounts to taking a picture of the same exposure length as your actual picture but with the lens cap on. That way the only electrons detected will be those coming from the heat of the instrument. Once you know what this average amount of heat is, you can subtract it from the electron counts of your image. Here is an image of the dark frame from our observations:
Think TV static.
The intensity of our dark frame is about 60% of the intensity of our image, which means that by subtracting it from the image, we're losing a lot of information on faint sources. But if we don't subtract the dark current, we're overestimating the true brightness of the Ring Nebula by a factor of roughly 2.5, which would lead to some pretty bad science on our part.
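To make that factor of 2.5 concrete, here is a small sketch of dark subtraction with NumPy. The counts are made up so that the dark frame is 60% of the raw background level. They are not our actual frames.

```python
import numpy as np

# Hypothetical counts for illustration only.
image = np.array([[100.0, 100.0], [100.0, 250.0]])   # raw exposure
dark = np.full_like(image, 60.0)                     # same-length dark frame

signal = image - dark            # counts left after removing dark current

# On the background pixels the dark frame is 60% of the raw counts,
# so skipping the subtraction overestimates the signal by 100/40 = 2.5.
overestimate = image[0, 0] / signal[0, 0]
print(overestimate)
```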

Another source of electrons is the electronic components of the CCD. To operate properly, a CCD requires a certain voltage to be applied across it constantly, and that voltage shows up as extra counts at every pixel. Hubble measures this with its BIAS calibration, so named because you can think of the CCD voltage as being a bias introduced into the electronics in order to produce usable data. Telescopes acquire a bias frame by taking a zero-second exposure that doesn't let in dark current electrons or photoelectrons. Hubble does this separately from taking its dark calibration, but in certain situations you can also simply assume that your dark frame includes the bias electrons. In that case, subtracting out the dark frame gets rid of the bias electrons, too. That was the case for the data we collected. If you look at what's left over after this subtraction, this is the image you get:
The Thumbprint Nebula (I've zoomed in a bit here.)
While this looks worse than the artificially contrasted version up above, the Ring Nebula does pop right out when the telescope's heat and bias are removed, without manually adjusting the contrast. By fiddling with contrast, you can create spurious images that don't represent anything actually out there. No artificial images happened to be produced in this case, but we can't be sure that the structure we see in the Ring Nebula is the true structure except by removing the dark current and bias electrons.

Finally (for our scenario), the individual pixels in the CCD might have varying levels of light sensitivity. Since we want each photon to count equally, we have to adjust for these effects. Balancing the variable sensitivity is known as flat fielding, and you produce a flat field by shining a light of uniform intensity across the CCD. When you do this, the CCD should record the same number of electrons (more or less) at each pixel. If some regions of the CCD are too bright or too dim, you know this corresponds to unequal sensitivity. To remove the effects of this sensitivity, you divide your image by the (normalized) flat field, so that the brightness at each pixel is adjusted by a factor inversely proportional to its sensitivity.
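Here is a rough sketch of flat fielding in NumPy. The 2x2 "detector," its sensitivity pattern, and the lamp level of 1000 counts are invented for illustration, and the sketch assumes dark and bias counts have already been removed from both frames.

```python
import numpy as np

# Hypothetical 2x2 detector: the right-hand column is 20% less sensitive.
sensitivity = np.array([[1.0, 0.8], [1.0, 0.8]])

true_sky = np.array([[50.0, 50.0], [50.0, 200.0]])   # what we want to recover
observed = true_sky * sensitivity                    # what the CCD records

# Flat field: uniform illumination seen through the same sensitivity pattern.
flat = 1000.0 * sensitivity
flat_normalized = flat / flat.mean()                 # average value of 1

# Dividing undoes the pixel-to-pixel variation (up to one overall scale factor).
corrected = observed / flat_normalized
print(corrected)
```

After the division, the two equally bright background pixels in the top row read the same again, even though the detector recorded them differently.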

In space, unfortunately, it's difficult to shine a uniformly bright light on Hubble. You might think the Sun would work, but the Sun is way too bright and even a very short exposure would saturate Hubble’s sensors. Saturation causes electrons to bleed over into neighboring pixels and gives you electron counts that are not proportional to the number of photons detected. Instead, Hubble takes flat fields by looking at the Earth, which (with a lot of processing, aided by the fact that the Earth moves beneath Hubble very quickly, blurring any image it takes) can reproduce a flat field. So the DARK-EARTH calibration is Hubble's way of adjusting for the varying sensitivity of its equipment.

On Earth, flat fields are usually produced by shining a light on the dome of your observatory and having the telescope look at that, or by looking at a small region of the twilight sky before any stars become visible. Here's the flat field we produced:
I think the telescope has floaters.
I suspect we actually did a very poor job of shining light uniformly, as I think you can see our light source positioned on the right side there. The smudges, however, probably are true variations in the pixel sensitivity, so dividing by the flat field should remove those. (The ring-like smudge in the middle is an eerie coincidence.) After dividing our image by the flat field, we get this picture:

Possibly the Eye of Sauron (More zooming done.)
The main visual advantage seems to be increased clarity of the inner region of the nebula.

None of these, of course, look like the beautiful pictures we see from Hubble or APOD. There are two reasons for this. One, our telescope simply doesn't have the resolution (or other exquisite features) that Hubble does, so there's a limit to how nice a picture it can take. The second reason, however, is that pretty pictures are created to be pretty, not for doing science. As I said above, this is just a representation of the data, but there are other representations.

In fact, one purpose of this lab was to determine the three dimensional structure of the nebula. That is, is it a donut or a shell? A picture alone can be deceiving. But other methods of interpreting the data might be more useful. So here's another representation, plotting the brightness of the nebula along a particular axis in different wavelengths of light:
Graphs are the best, you guys.
Doing some math on graphs like these, we were able to show that the Ring Nebula is probably more like a thin shell of material than a donut, despite its visual appearance. The ring is a bit of an illusion. But a graph like this is only accurate because of the processing done to remove observational artifacts, even though that processing does not produce an aesthetically pleasing picture.

Nevertheless, what's astronomy without cool pictures? In addition to looking at the nebula with a clear filter, we also used filters that passed only red light from glowing hydrogen and blue/green light from doubly-ionized oxygen. When you clean up that data, assign a color to each filter, and plot them on top of each other, you get this:
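The "assign a color to each filter" step can be sketched like this. The frames are tiny invented arrays, and mapping hydrogen to red and oxygen to green plus blue is just one possible choice, not necessarily the exact recipe behind the picture.

```python
import numpy as np

def normalize(frame):
    """Rescale a calibrated frame to the 0..1 range for display."""
    frame = frame - frame.min()
    return frame / frame.max()

# Hypothetical calibrated frames through two narrowband filters.
hydrogen = np.array([[0.0, 10.0], [20.0, 40.0]])   # red light from hydrogen
oxygen = np.array([[30.0, 15.0], [0.0, 5.0]])      # blue/green light from oxygen

h = normalize(hydrogen)
o = normalize(oxygen)

# Color assignment is a choice: hydrogen -> red, oxygen -> green and blue.
rgb = np.stack([h, o, o], axis=-1)
print(rgb.shape)   # (rows, columns, 3)
```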
Insert riff on Beyoncé lyrics here.
That's not really what the Ring Nebula looks like, but it is one way of seeing it.

Thursday, March 17, 2016

On Guessing

This is a follow-up to my Lagrange point post. At the end, I briefly mentioned the L4/L5 Lagrange points, which are stable and form equilateral triangles with the masses of a three-body system. I'd like to delve into the physics of these points a bit to illustrate something about how physicists solve problems.

That is, physicists (in general) do not like doing calculations. They don't want to sit around all day crunching numbers to arrive at an answer. When you solve a physics problem, the goal is to build as simple a model as possible that captures the essential features of what you're studying. (This is where the spherical cow jokes come in.) That way, if you're lucky, you can avoid having to do a lot of math. Instead you can arrive at the answer you want by symmetry, or dimensional analysis, or guessing.

Guessing is an important part of the physicist's toolkit and some of what makes doing these problems fun (for me, at least). It's easy to stare at a problem for hours and feel overwhelmed by the complexity of it. I liken this to how it feels when you've just begun to write something. You have a blank screen and a blinking cursor in front of you and there's nothing more terrifying or paralyzing.

In writing, sometimes the solution is to just start writing and see where the story takes you. And so it follows with physics. If you have a complex problem, at times the best strategy is to just guess at the answer and see where the physics takes you. In this way, doing physics can be a lot like playing a game or solving a puzzle. It's fun, and I seriously wouldn't still be in school if I thought otherwise.

So let's return to the L4/L5 Lagrange points. In class, when discussing the three-body problem, our professor performed enough derivation to get us to believe that stable orbits can exist. He went through the same argument I used about rotating frames and centrifugal force. So a test mass is in a stable orbit when gravity and centrifugal force cancel out. He then gave us the punch line, telling us where the Lagrange points are, but didn't go through the math of actually finding them. Why not? Because if you do the derivation, the equations of motion you end up having to solve are:

I should probably credit Massimo Ricotti for this.
I'm not going to attempt to explain what all that means. It's ugly, and you wouldn't want to solve that unless you had no other choice. But there is another way. Our professor mentioned that when thinking about the 5 Lagrange points, you can guess where 2 of them (L4/L5) must be.

This intrigued me, which is why we're here today. What makes it possible to guess these locations? As we saw with the L2 point, its exact location is related to the cube root of the ratio between the two big masses. This is (probably) not something you could just pull out of thin air. But that's not the case for L4 and L5. The location of one of these points is at the vertex of an equilateral triangle that has the two large masses at the other vertices. Flip this triangle over and you get the other one. How massive the objects are isn't relevant at all; distance is the only important variable (and two masses can basically orbit each other at any distance they like). So you could conceivably guess the answer just by looking at the problem.

There are a lot more MS Paint illustrations coming. You've been warned.
But what makes equilateral triangles, as aesthetically pleasing as they are, physically appealing? Let's consider a special case and then move on to a more general scenario.

Forget the Earth-Moon system and consider two stars of equal mass in circular orbits about each other. In that case, the stars are actually orbiting their center of mass, which is halfway between the two for equal mass stars. A third body that's motionless in the rotating frame also orbits the center of mass, which means centrifugal force pushes away from that center. To make the problem even simpler, let's put the third body equidistant from the two stars.

I'm a big fan of purple.
Then the forces of gravity to the left and right cancel out, leaving only gravity pulling down and centrifugal force pushing up. To get our Lagrange point, we just need those forces to balance. This means we have to guess how far up from the center the Lagrange point is.

First, let's consider gravity. The total strength of gravity depends on the inverse square of the distance to the stars, d. But we don't want the total force, only the vertical component. That part is a fraction of the total, and that fraction is equal to h/d. This means gravity now depends on the distance to the center of mass and the inverse cube of the distance to the stars.

On the other hand, centrifugal force depends on the distance to the center of mass, h, and the inverse cube of the distance between the stars, a. Our gravity and centrifugal terms are nearly the same, except one uses a and the other d. But we're trying to find d, so let's just guess that d=a. Then all the lengths of our triangle are equal and we've found a point where all the forces cancel out--a Lagrange point. (This guess works because the constants in each equation are the same. Otherwise, d might just be proportional to a.)
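Written out, the argument of the last two paragraphs looks roughly like this. The symbols m (mass of each star), m_t (the test mass), and ω (the angular speed of the pair) are labels introduced here for the sketch, while h, d, and a are as defined above. The expression for ω² is Kepler's third law for two stars of total mass 2m separated by a.

```latex
\begin{align*}
F_{g,\text{vertical}} &= 2\,\frac{G m\, m_t}{d^2}\cdot\frac{h}{d}
                       = \frac{2 G m\, m_t\, h}{d^3} \\
F_c &= m_t\,\omega^2 h, \qquad \omega^2 = \frac{G\,(2m)}{a^3}
      \;\Rightarrow\; F_c = \frac{2 G m\, m_t\, h}{a^3} \\
F_{g,\text{vertical}} = F_c &\;\Rightarrow\; d^3 = a^3 \;\Rightarrow\; d = a
\end{align*}
```

The factor of h and all the constants drop out, which is why the guess d = a works exactly and not just proportionally.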

So there we have it. Using a few reasonable assumptions, a simple model, and nothing more than geometry, we've found the Lagrange points. Where do we go from here? How about back to the Sun-Earth system, where one of the two masses is much, much bigger than the other. If that's the case, then the center of mass moves to the sun, and centrifugal force points directly away from it.

It's a trap!
If we maintain our equilateral triangle guess, where does that leave us? With a problem. The problem is that if you rotate the above picture so that the sun's gravity vector and the centrifugal vector are horizontal, you're left with the Earth's gravity vector at an angle of 60° away from horizontal. This is bad because the "vertical" component of the Earth's gravity isn't balanced by anything else, which means that no matter what values you insert into your equation, there is no equilibrium point. Uh-oh.

But our graph has fooled us here. You see, by moving the center of mass directly on top of the sun, we are implicitly saying that the Earth has no mass whatsoever. And if that's the case, then it has no gravitational force, which means it doesn't need to be counteracted at all. In the limit where the Earth has no mass, the three-body problem reduces to the one-body problem. So there is a point of stability at the equilateral triangle, but also at any point along the same circular orbit.

This wasn't a totally useless exercise, however. It shows us that it's reasonable to expect L4/L5 to be stable from one extreme of equal masses to the other extreme of just one big mass. But we haven't yet proven that the L4/L5 points exist where they do for any arbitrary masses. How do we do that? First, let's make a generic diagram describing the situation.

You made it.
Let's say that Star A has a mass of m and Star B has a mass of km, where k is some fraction between 0 and 1. This means we can vary between the two extremes of equal mass (k=1) and one dominant mass (k=0). The smaller k is, the farther to the left the center of mass moves, the smaller Star B's gravity vector is, and the more horizontal the centrifugal vector gets. This should mean that the forces pointing to the right stay balanced. Additionally, as k gets smaller, there is less overall gravity pointing down, but because the centrifugal force is getting more horizontal, that gravity has less it needs to counteract. So our equilateral triangle still looks good.
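Before proving anything, the guess can be sanity-checked numerically. The sketch below is one way to do that, under a few labeled assumptions: units where G = m = a = 1, the center of mass placed at the origin, and the stars' angular speed taken from Kepler's third law. The function name and coordinate layout are made up for this example.

```python
import math

def net_acceleration(k, a=1.0, m=1.0, G=1.0):
    """Net acceleration (gravity + centrifugal) on a test mass at the
    equilateral-triangle point, in the frame rotating with the two stars."""
    # Star A (mass m) and Star B (mass k*m), with the center of mass at the origin.
    xa = -k * a / (1.0 + k)
    xb = a / (1.0 + k)
    omega_sq = G * m * (1.0 + k) / a**3      # Kepler's third law for the pair

    # Test mass at the apex of the equilateral triangle above the stars.
    x = 0.5 * (xa + xb)
    y = 0.5 * math.sqrt(3.0) * a

    ax, ay = omega_sq * x, omega_sq * y      # centrifugal: points away from the origin
    for star_x, star_mass in ((xa, m), (xb, k * m)):
        dx, dy = star_x - x, -y              # vector from test mass to star
        r = math.hypot(dx, dy)
        ax += G * star_mass * dx / r**3
        ay += G * star_mass * dy / r**3
    return ax, ay

for k in (1.0, 0.5, 0.1, 0.001):
    ax, ay = net_acceleration(k)
    print(k, ax, ay)    # both components come out at rounding-error level
```

A check like this does not prove the result, but it shows the forces balancing at the equilateral point across the whole range of k.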

To prove the general validity of our guess, let's see what happens if the interior angles are some arbitrary angle, rather than the 60° they must be. We have to compare the combined vertical force of gravity to the vertical centrifugal force. Using trig, we can find the distance from the test mass to a star in terms of a and θ. Because of the inverse square law of gravity, a is going to be squared. Trig also gets us the vertical component of that force in terms of θ.

On the other hand, centrifugal force depends on the distance to the center of mass, l. But because we only want the vertical component, the actual location of the center of mass is irrelevant and all we need is h, which again can be found in terms of a and θ. As before, centrifugal force also depends on the inverse cube of a, so some canceling of exponents means it's the inverse square of a that shows up.

Because both expressions depend on the square of a, we can get rid of it. Both forces are also equally dependent on the sum of the masses of the two stars, so we can cancel the mass terms, too. This means our equation is now defined entirely in terms of θ. After a little algebra, we can arrive at the following equality:

sin(θ) = 1/2

Everything else in our equation is gone. All that matters is the angle between h and d. Now, I just happen to know that the sine of 30° is 1/2. This means the full interior angle is 60°. With our guess that the test mass is halfway between the two stars, the only possibility is an equilateral triangle with interior angles of 60° and lengths of a. (A similar argument can be made for the horizontal components of the forces.)
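For anyone who wants to see the "little algebra," here is one way it can go. It assumes θ is the angle between h and d and that the test mass sits halfway between the stars horizontally, so each star is a horizontal distance a/2 away. The symbols m_t (test mass) and ω (angular speed) are labels introduced for this sketch.

```latex
\begin{align*}
d &= \frac{a}{2\sin\theta}, \qquad h = \frac{a\cos\theta}{2\sin\theta} \\
F_{g,\text{vertical}} &= \frac{G\,(m + km)\,m_t}{d^2}\cos\theta
                       = \frac{4\,G m (1+k)\, m_t \sin^2\theta\cos\theta}{a^2} \\
F_{c,\text{vertical}} &= m_t\,\omega^2 h, \qquad \omega^2 = \frac{G m (1+k)}{a^3}
  \;\Rightarrow\; F_{c,\text{vertical}} = \frac{G m (1+k)\, m_t \cos\theta}{2\,a^2 \sin\theta} \\
F_{g,\text{vertical}} = F_{c,\text{vertical}} &\;\Rightarrow\; 4\sin^2\theta = \frac{1}{2\sin\theta}
  \;\Rightarrow\; \sin^3\theta = \frac{1}{8}
  \;\Rightarrow\; \sin\theta = \frac{1}{2}
\end{align*}
```

Both a and the mass factor m(1+k) cancel, leaving a condition on θ alone.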

I should note that this doesn't prove that there aren't other Lagrange points forming different triangles when the test mass is not halfway between. To see that there can't be other points of stability (except on the line joining the two stars), you need to solve for the effective potential of the force fields at work in this system. That can't be done by guessing, but it can be done by drawing! Unfortunately, drawing equipotential surfaces would strain my artistic talents past their breaking point. Here's some computer art instead.

Credit: NASA / WMAP Science Team

Wednesday, March 9, 2016

Lagrange Point 2: Newton's Redemption

This past November, I had the opportunity to tour Goddard Space Flight Center. Although we saw many cool operations (including a gigantic cryogenic chamber!), the most interesting was the under-construction James Webb Space Telescope. I had intended to write about the visit at the time, but I spent much of my fall semester trying not to hyperventilate instead. However, we just covered some relevant material in my theoretical astrophysics course, so let's take a look now.

A full-scale model. Credit: NASA
JWST gets called the successor to Hubble, but calling it the sequel would probably be more appropriate. It promises to explore material untouched by the first one, it's going to have even more spectacular visuals, and it's way over budget and behind schedule. The two features that most distinguish it from Hubble are its size (bigger) and its wavelengths of interest (longer).

Longer means infrared. Being an infrared telescope, JWST will see through dust, directly image planets, and peer further back in time at objects redshifted out of the visible range. But infrared telescopes come with some complications. On Earth, we don't do a lot of infrared astronomy, partly because the atmosphere absorbs too much of it, but also because stuff too cold to emit visible light (basically everything on Earth) is usually spilling out lots of infrared instead. We can't do IR astronomy on Earth for the same reason we can't do visible astronomy during the day: it's too bright.

That's why JWST will be in space. But even in space, the Earth and sun loom large. Keep the telescope too near the Earth, and the Earth warms it up, generating noise in the cameras. JWST must be kept cold, much colder than the objects it wants to look at. The only way to accomplish that is to put it far away from the Earth and hold up a shield to block the Earth and sun. The trick is that you want to be able to block both bodies at the same time, which wouldn't work if you just flung the satellite into any old orbit. The farther you get from the sun, the longer your year (Kepler's third law says the cube of your semi-major axis is proportional to the square of your year), so the sun and Earth will change relative positions in the sky.
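To put a number on that, here is Kepler's third law as a tiny function. It uses the standard convenience that for orbits around the sun, with distance in astronomical units and time in years, the law reduces to P² = a³. The function name and the example distance of 1.01 AU are just illustrative choices.

```python
def period_years(a_au):
    """Orbital period in years for a small body circling the sun at a_au
    astronomical units (Kepler's third law: P^2 = a^3 in these units)."""
    return a_au ** 1.5

print(period_years(1.0))    # Earth: 1 year
print(period_years(1.01))   # 1% farther out: about 1.5% longer year
```

A satellite left on an ordinary orbit just 1% farther out would slowly fall behind the Earth, and the sun and Earth would drift apart in its sky.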

You need to find an orbit that's far away, stable, and lets you block two objects at once--tricky. Arranging three objects in space is known as the three-body problem in celestial mechanics, and it has a long history. When Newton first formulated his laws of motion and gravity, he was able to solve the one- and two-body problems. That is, he could tell you how a tiny, insignificant planet would orbit a gigantic star (the one-body problem) or how two comparable objects would orbit each other (the two-body problem), but he was not able to count any higher than 2. Newton reasoned that minuscule interactions from nearby planets would build up over time and slowly destabilize orbits, and he assumed the only solution was divine intervention.

Astronomers, physicists, and mathematicians spent a long time looking for more precise answers. It turns out there is no generic solution to the three-body problem, no simple orbit that works for any configuration of three or more masses. Using perturbation theory, you can account for the infinitesimal, cumulative influences of many bodies over time, but in the long run (millions of years), orbits become chaotic. Chaotic doesn't necessarily mean that a planet will be flung from the solar system, but that we eventually can't say with any precision where in an orbit a planet will be at any given time.

A couple mathematicians were able to work out very specific periodic solutions to what gets called the restricted three-body problem, or the 2+1 body problem: two large gravitating masses, one tiny mass that is virtually insignificant. In just the right location relative to the big ones, the small one can be stable. Nowadays these are known as the Lagrange points, in honor of one of the mathematicians who worked them out (Euler already had enough named after him).

This seems perfect for JWST. If there's a line between the sun and the Earth, we want JWST to be on that line out past Earth. Can we find a Lagrange point there?

In space, lines are purple.
Well, first let's backtrack just a second. There isn't really a line connecting the sun and the Earth, because the Earth is constantly in motion about the sun at ~30 km/s. The only way to draw such a line is if we imagine ourselves moving along at the same angular speed as the Earth so that it appears stationary.

Notice I said angular speed, which measures how much angle is swept out in a given time rather than how much distance is covered. If you think about a spinning tire, the outer bits are moving faster than the inner bits, because the bigger the radius, the larger the circumference covered in the same amount of time. But they are both covering the same fraction of a circle in the same time, and thus both have the same angular speed. If different bits moved at different angular speeds, they wouldn’t keep the same relative positions and the tire would spin apart.
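The relationship between the two kinds of speed is just v = ωr. As a quick check on the ~30 km/s figure, here is a sketch using standard values for the Earth-sun distance and the length of the year (the constant and function names are made up for this example):

```python
import math

AU_KM = 1.495978707e8          # Earth-sun distance in km
YEAR_S = 365.25 * 24 * 3600    # one year in seconds

omega = 2 * math.pi / YEAR_S   # angular speed in radians per second

def linear_speed(radius_km):
    """Speed (km/s) of a point at radius_km moving with angular speed omega."""
    return omega * radius_km

print(linear_speed(AU_KM))            # Earth: about 29.8 km/s
print(linear_speed(1.01 * AU_KM))     # 1% farther out, same angular speed: 1% faster
```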

We want our frame and JWST to be moving at the same angular speed as the Earth. But in establishing this frame of reference, we have invalidated Newton's laws of motion. We are no longer in an inertial frame, which is one moving at a constant velocity. Circular motion is not constant, because velocity includes direction.

What does it mean for Newton's laws to be invalidated? It means that an object not experiencing any net force will seem to accelerate away. For our rotating frame, maintaining circular motion requires constant force toward the center of the circle. Tie a ball to the end of a string and spin the ball in a circle. The tension along the string is the radial force that maintains circular motion. If the ball comes loose, it will fly off in a straight line. But from the frame of the spinning string, which can continue spinning as long as you supply a force, the ball will appear to curve away. This tendency to accelerate away from a spinning frame can be accounted for if we invent a fictitious force--centrifugal force--that acts in opposition to whatever force maintains circular motion--centripetal force.

So if we look at the Earth from a rotating frame, JWST will seem to experience a centrifugal force pushing it outward, away from the sun and the Earth. In order to have the telescope remain stationary in our rotating frame, the force from gravity must balance the centrifugal force.

Doing physics really involves making diagrams like this.
So here's our three-body problem. JWST is pulled inward by the gravity of the sun at a distance of a+d and by the gravity of the Earth at a distance of just d. That sum is:

$$F_g = \frac{G\, m_{\text{sun}}\, m_{\text{jwst}}}{(a+d)^2} + \frac{G\, m_{\text{earth}}\, m_{\text{jwst}}}{d^2}$$

And it's pulled outward by the centrifugal force which results from the angular motion of the system. How do we characterize the centrifugal force? It's the square of the angular speed times the distance from the center of mass (the sun, in this case) times the mass of the accelerating object. The angular speed is inversely proportional to the period, the Earth’s year. So centrifugal force involves the inverse square of the period. Using Kepler’s relation between period and semi-major axis, we can substitute in that quantity (a in our diagram). Doing some algebra, that gives us a centrifugal force of:

$$F_c = \frac{G\,(m_{\text{sun}} + m_{\text{earth}})\, m_{\text{jwst}}\,(a+d)}{a^3}$$
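The "some algebra" can be sketched as follows, where ω (the angular speed) and P (the length of the year) are symbols introduced here for illustration, and Kepler's third law is written in its Newtonian form:

```latex
\begin{align*}
F_c &= m_{\text{jwst}}\,\omega^2\,(a+d), \qquad \omega = \frac{2\pi}{P} \\
P^2 &= \frac{4\pi^2 a^3}{G\,(m_{\text{sun}} + m_{\text{earth}})}
  \;\Rightarrow\; \omega^2 = \frac{G\,(m_{\text{sun}} + m_{\text{earth}})}{a^3} \\
F_c &= \frac{G\,(m_{\text{sun}} + m_{\text{earth}})\, m_{\text{jwst}}\,(a+d)}{a^3}
\end{align*}
```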

And we want Fg to equal Fc. If we cancel some stuff out, we arrive at the following expression, which is defined purely in terms of the masses and the distances between them:

$$\frac{m_{\text{sun}}}{(a+d)^2} + \frac{m_{\text{earth}}}{d^2} = \frac{(m_{\text{sun}} + m_{\text{earth}})(a+d)}{a^3}$$

We're trying to solve for d, the point at which all these forces cancel out. But there's a problem. If we were to multiply all these terms out (FOIL!), we'd find this was a quintic function, which means there'd be a d^5. And there is no equivalent of the quadratic formula for quintic equations. So we have to make some approximations. We have to assume that the sun is so much bigger than the Earth (true, in this case) that the Earth can be ignored whenever the two terms are added together. And we also assume that d is much smaller than a, which lets us do some mathematical tricks. If you make those approximations, and then do some more algebra, you eventually find that:

$$d = a \left( \frac{m_{\text{earth}}}{3\, m_{\text{sun}}} \right)^{1/3}$$
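One way to fill in those steps, as a sketch: write x = d/a (a shorthand introduced here), drop the Earth's mass where it is added to the sun's, and use the small-x approximation (1+x)^(-2) ≈ 1 - 2x.

```latex
\begin{align*}
\frac{m_{\text{sun}}}{(1+x)^2} + \frac{m_{\text{earth}}}{x^2}
  &\approx m_{\text{sun}}\,(1+x)
  && \text{(multiply through by } a^2,\; m_{\text{sun}} + m_{\text{earth}} \approx m_{\text{sun}}) \\
m_{\text{sun}}\,(1 - 2x) + \frac{m_{\text{earth}}}{x^2}
  &\approx m_{\text{sun}}\,(1+x)
  && \text{(}x \ll 1\text{)} \\
\frac{m_{\text{earth}}}{x^2} &\approx 3\, m_{\text{sun}}\, x
  \;\Rightarrow\; x^3 \approx \frac{m_{\text{earth}}}{3\, m_{\text{sun}}}
\end{align*}
```

The factor of 3 comes from the 2x of the weakened solar gravity plus the 1x of the increased centrifugal force.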

That is the location of the second Lagrange point (and the first one, but on the other side). Plugging in the relevant numbers, d = 1.5 million km, which is curiously 1/100 Earth’s distance from the sun. The sun is a little more than a hundred times wider than the Earth, which means that from L2, the Earth and sun appear just about the same size--more or less the size of the moon as seen from Earth. And that means JWST can easily block both of them with the same shield. (The similarity in angular size really is a happy coincidence that has to do with an accidental congruence of densities, radii, and that factor of 3 up there. Try it with any other planet and it doesn't work.)
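Here is a rough numerical check of those numbers. It plugs standard textbook values for the masses, distances, and diameters into the approximate formula, then solves the full balance equation by bisection for comparison. The constant names, the bracketing interval, and the choice of bisection are all assumptions of this sketch.

```python
import math

M_SUN = 1.989e30      # kg
M_EARTH = 5.972e24    # kg
A_KM = 1.496e8        # Earth-sun distance, km

# Approximate formula: d = a * (m_earth / (3 m_sun))^(1/3)
d_approx = A_KM * (M_EARTH / (3 * M_SUN)) ** (1 / 3)

def imbalance(d):
    """Gravity minus centrifugal term (per unit G*m_jwst) at distance d past Earth."""
    gravity = M_SUN / (A_KM + d) ** 2 + M_EARTH / d ** 2
    centrifugal = (M_SUN + M_EARTH) * (A_KM + d) / A_KM ** 3
    return gravity - centrifugal

# Bisection on the full (unapproximated) balance equation.
lo, hi = 1.0e5, 1.0e7
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if imbalance(mid) > 0:
        lo = mid      # gravity still wins: the balance point is farther out
    else:
        hi = mid
d_exact = 0.5 * (lo + hi)

# Small-angle apparent sizes from L2, in degrees (diameter / distance).
earth_angle = math.degrees(12_742 / d_approx)
sun_angle = math.degrees(1.3927e6 / (A_KM + d_approx))

print(d_approx, d_exact)        # both close to 1.5 million km
print(earth_angle, sun_angle)   # both about half a degree
```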

So there you have it. When the combined gravitational pull of the Earth and sun cancels out the centrifugal force pushing JWST away, the telescope remains stationary with respect to Earth’s motion about the sun. It sits 1.5 million km behind the Earth and completes an orbit in a year despite being farther away from the sun.

But that's not quite the end of the story. It turns out that L1, L2, and L3 (on the other side of the sun from the Earth) are only metastable, which means a slight push sends an object flying off into a new orbit. So we can put satellites there, but they require station keeping to prevent them from falling away. L4 and L5, which form equilateral triangles with the two big masses of the 2+1 problem, are stable. Consequently, we actually find families of asteroids called the Trojans at the Sun-Jupiter L4 and L5 points. Also, I’ve totally neglected the Coriolis effect here, which is another fictitious force that pops up when… oh dear, look at that word count.

Wednesday, February 24, 2016

When the Moon Hits Your Eye, There's Some Math Using Pi

It was warm and clear this past weekend, so I did some late night observation with my new binoculars. Weather is the bane of astronomers, unless you only work with space telescopes or you do neutrino observation or now even gravitational wave detection. Actually, I'm not so sure about that last one. I imagine a light drizzle could easily be mistaken for colliding black holes.

You're welcome, Celestron.
Anyway, it struck me while I was observing that the moon is very bright. Whenever I found it in my binoculars, I flinched momentarily before I adjusted to the stark change between black sky and white moon. And indeed, at night, the moon is by far the brightest thing in the sky (except for inconveniently placed streetlamps).

But it turns out the moon is pretty dim, too, when considered from another perspective (no, not the dark side). So why does the moon shine in the first place? While it does have a temperature, the vast majority of its thermal radiation is not in the visible range. Instead, of course, the moon borrows its light from the sun, reflecting it back toward us.

Naively, then, you might expect the moon would be roughly the same brightness as the sun. And when you look at a full moon hovering imperiously in the night, washing out all the stars in the sky, it does seem darn bright. However, our eyes (and the rest of our senses) are pretty terrible at discerning objective levels of radiant power. The moon is bright only relative to the sky and the stars. In astronomical terms, the sun is much, much more luminous than the moon.

Measured with fancy equipment, the apparent magnitude of the sun is about -27, while the apparent magnitude of the moon is roughly -13. If you remember from my nerdrage over Star Wars, larger magnitudes are dimmer, the visible stars are around magnitudes 1-6, and the scale is not linear. From this we can tell that the sun is way brighter than the moon, the moon is way brighter than the stars, and astronomers use a needlessly cumbersome system for quantifying brightness.

If you do the math, 10^(14/2.5), a magnitude difference of 14 is about a factor of 400,000 in brightness. Yes, objectively, the sun is 400,000 times brighter than the moon (as seen from Earth). So when the moon shines its paltry reflected sunlight back at you, what happens to the other 99.99975% of the light? How do we go from a sun’s worth of light to one moon unit (a Zappa)?
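If you'd rather let a computer do the math, here is a minimal sketch of the magnitude-to-brightness conversion. The function and variable names are made up for illustration; the magnitudes are the rounded values quoted above.

```python
# Convert a difference in apparent magnitude into a brightness (flux) ratio.
# Names are hypothetical; magnitudes are the rounded figures from the post.
SUN_MAGNITUDE = -27.0
MOON_MAGNITUDE = -13.0


def brightness_ratio(dimmer_magnitude, brighter_magnitude):
    """How many times brighter the second object appears than the first."""
    return 10 ** ((dimmer_magnitude - brighter_magnitude) / 2.5)


ratio = brightness_ratio(MOON_MAGNITUDE, SUN_MAGNITUDE)
missing_percent = 100 * (1 - 1 / ratio)

print(f"The sun appears about {ratio:,.0f} times brighter than the moon")
print(f"Light unaccounted for: {missing_percent:.5f}%")
```

Every 5 magnitudes is a factor of exactly 100, which is why the exponent divides by 2.5.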

It's at this point you may recall that different objects reflect and absorb different amounts of light. That's why color exists, after all. You can also measure an overall amount of reflectivity, which gets called albedo. The Bond albedo of an object is just the fraction of incoming light that is reflected rather than absorbed. Freshly fallen snow has an albedo as high as 0.9, whereas asphalt can be as low as 0.04. The moon's average albedo is 0.12, which means 88% of the sun's light is absorbed. But 88% is not 99.99975%. From albedo considerations alone, the moon is still too bright by a factor of 48,000. How does the moon get rid of the rest of its sunlight?

The problem is that we're thinking of the moon as a giant, flat mirror directly reflecting the sun's light toward us. But the moon is not a mirror. You can tell this because it doesn't look like the sun. A mirror exhibits specular reflection, which means incoming light bounces off cleanly at a particular angle. If it comes in 30° one way, it bounces off 30° the other way. And since all the light bounces in the same way, mirrors reproduce an image of what’s reflecting off of them.

Ignore everything about this picture that is ridiculous.
Non-mirrors (everything else) reflect light diffusely, which from the name alone suggests the process is not so orderly. On the moon, the properties of the rough, irregular regolith on the surface determine how light is reflected, but the gist is that it’s very strongly dependent on the phase angle. In fact, the moon has an opposition effect, which does tend to bounce light directly back when light is coming from behind us, i.e. when the moon is full. Even still, the picture above doesn't hold.

I admit I struggled with this problem for a bit before finding a suitable answer. Here's what I did to solve it. How do you account for a factor like 48,000? Well, let's compare some relevant numbers. The moon is 384,400 km away from us on average. Its radius is 1,737 km. The Earth's radius is 6,400 km. The distance from the sun to us is 150,000,000 km. Hmm, I can’t think of anything else that might be important.

The distance from the sun can't matter, because we're dealing with the apparent brightness of the sun, which is how bright it looks to us from here on Earth. Distance already factors into the 400,000 figure. The Earth's radius can't matter, because we're talking about how bright the moon is to our eyes. If the Earth were the size of a pin (and we were still the same distance from the moon), it wouldn't affect the light that hits our eyes. So the only two numbers that can matter are the moon's radius and its distance from us.

Well, what's 384,400/1,737? 221. 221 doesn't look very good, but if we square it, we get about 49,000. That's very close, within a few percent, of our factor of 48,000 (which is a heavily rounded figure). Okay, but why does squaring matter?*
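Here is that bit of numerology as a minimal sketch, so you can see how close the two numbers really are. The variable names are mine; the inputs are the rounded figures used in this post.

```python
# Compare the brightness factor left over after albedo with (distance / radius)^2.
# All inputs are the rounded values from the post; names are hypothetical.
SUN_TO_MOON_BRIGHTNESS = 400_000
MOON_ALBEDO = 0.12
MOON_DISTANCE_KM = 384_400
MOON_RADIUS_KM = 1_737

# What still needs explaining once albedo is accounted for.
unexplained_factor = SUN_TO_MOON_BRIGHTNESS * MOON_ALBEDO

# The moon's distance measured in moon radii, then squared.
distance_in_radii = MOON_DISTANCE_KM / MOON_RADIUS_KM
geometric_factor = distance_in_radii ** 2

mismatch = geometric_factor / unexplained_factor - 1

print(f"Unexplained factor: {unexplained_factor:,.0f}")
print(f"Distance in moon radii: {distance_in_radii:.1f}")
print(f"Squared: {geometric_factor:,.0f} (off by {mismatch:.1%})")
```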

In the illustration above, we're thinking that the moon intercepts the sun's light and shines this perfect sun laser back at us. If that's the case, then we are hit by a circle of light with the area of the moon's disc. The area of a circle is πr². (I told you π was involved.) If the above relation is valid, then we are really being hit by a circle of light with the radius of the distance between the Earth and the moon. How could that be? Imagine that instead of the moonlight bouncing straight back at us, it spreads out in a cone, with the angle between the edge of the cone and the line connecting the Earth and the moon being 45°.

Jobs I won't get upon completion of my degree include: NASA artist
In that case, a cross-section of the cone forms a right triangle, with half its base being equal to its height, the Earth-Moon distance. And if you turn the cone toward us, you see that the base is a circle. So instead of the light being concentrated into a disc the size of the moon, it's spread out into a disc with a radius of the lunar distance, which dilutes the light by a factor of 48,000 or so, because the Moon is much farther away from us than it is big.

Why would the light reflect back that way? It probably doesn't, exactly. The process by which the moon reflects light is complicated and is modeled with something called a bidirectional reflectance distribution function. But the opposition effect means a full moon tends to reflect light directly back, so everything coming back at an angle of 45° or less seems reasonable. But we're ignoring for a moment that the moon is not a point source, so that right circular cone probably looks different at other latitudes. On average, though, it works out to produce the above picture.

Anyway, that's probably enough MS Paint illustration from me for one blog post. Also, this is a reasonable length, so I better stop now before things get out of hand.

*Update: My solution to the problem posed in this post is almost certainly wrong. I believe I was right about the square relation between the moon's radius and distance from us, but wrong about why that relationship is important. That's the tricky thing about proportionality arguments: without constants, you can fool yourself about what you're talking about. Anyway, I think I've figured out the real answer.

So one of the issues that bothered me about my solution is that it relies on the moon being this weird, hard to study surface, but gives you an answer with a simple and neat geometric interpretation. That seemed unlikely, but the math worked so I accepted the answer anyway. But it turns out that the moon's surface is both harder and easier to analyze than I realized. Before I get to that, however, there's another important issue.

When I first considered this problem, I assumed the answer was that the inverse square law causes the light reflected from the moon to diminish so that it is less luminous than the sun. But after some thought, that didn't seem plausible. You see, when the sun's light travels to us, it loses some intensity because of the inverse square law, just like gravity gets weaker with distance.

For the moon's light, however, that light goes the extra distance from the Earth to the moon and back again (for a full moon). But the distance to the sun is 150,000,000 km, and the distance to the moon is 384,400 km, which means the additional distance traveled is only about 0.5% more, which is only going to lose you about 1% of your intensity from the inverse square law, and not the factor of 48,000 we needed. So I figured that couldn't be the answer.

What I was failing to consider, however, was that light reflecting off of the moon changes the applicability of the inverse square law. The inverse square law isn't mysterious. Rather, it's a consequence of geometry in a 3-dimensional world. If an object emits light radially from a point source, then at any given distance from the source, the light will be spread out on a spherical shell around the source. As the distance grows, the light falls off with the square of the distance, because the surface area of a sphere is 4πr².

But any real emitter is not actually a point source. The sun radiates the light we see from its surface, which is (almost perfectly) spherical. All this means, however, is that there is some defined power at the surface, and we can imagine that power increasing to infinity as we dip below that surface to a point. But here's the key: the power radiated per unit area has some value at 1 radius out (the surface), and that power drops to 1/4 its original value at 2 radii, 1/9 its original value at 3 radii, and so on. Note that this exactly mirrors (ha) my original answer. At 221 moon radii (384,400/1,737), the power has been reduced by a factor of 221² ≈ 49,000.
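To make the "count in radii" idea concrete, here is a minimal sketch of the inverse square falloff for a spherical emitter, measured relative to its own surface. The function name is hypothetical, and it assumes perfectly radial emission.

```python
# Inverse-square falloff for a spherical emitter, relative to its surface.
# Assumption: perfectly radial emission; the function name is hypothetical.
MOON_DISTANCE_KM = 384_400
MOON_RADIUS_KM = 1_737


def relative_power_per_area(distance, source_radius):
    """Power per unit area at `distance`, as a fraction of the value at the surface."""
    return (source_radius / distance) ** 2


for radii in (1, 2, 3):
    fraction = relative_power_per_area(radii, 1)
    print(f"{radii} radii out: {fraction:.4f} of the surface value")

dilution = 1 / relative_power_per_area(MOON_DISTANCE_KM, MOON_RADIUS_KM)
print(f"At the Earth, moonlight is diluted by a factor of about {dilution:,.0f}")
```

Only the ratio of distance to radius matters here, which is why a small emitter seen from many radii away ends up so much dimmer than its surface.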

This answer being applicable, however, requires that light reflected from the moon is emitted radially (from the half that is facing the sun, anyway), which seemed implausible to me in the beginning given how complicated the moon's regolith is supposed to be. But it turns out that if you assume the moon is an ideal diffuse reflecting surface, then radial emission is what happens.

For a specular reflecting surface, the incident angle of the light exactly determines the angle of reflection. But for an ideal diffuse surface, the incident angle is not important at all, and the light reflects in a random direction. If the light reflects entirely randomly, then on average the angle of reflection will be exactly perpendicular to the surface, because any angle away from perpendicular will be balanced out. So on average, a diffuse reflector looks like a radial emitter and follows the inverse square law.

The complicated surface of the moon, with its opposition effect, means that the "on average" part up there is not strictly speaking true, but it apparently doesn't have enough of an effect to eliminate the approximately true inverse square relation that shows up. Radial emission from the moon seems to drop off more quickly than radial emission from the Sun because a radial emitter that has the Sun's apparent brightness at 1 lunar radius is actually a weaker source than one with the same apparent brightness at 1 solar radius. If you expand the lunar emitter to the size of the solar emitter, then your power/area is reduced accordingly and you have a dimmer surface, so of course its power will fall off more quickly than the solar emitter's.

Well, anyway, so much for this post being a reasonable length.

Sunday, February 14, 2016

The Equivalence Post

About twenty years ago--maybe right around the time LIGO was finally getting funding, when the gravitational waves it just detected were still a couple dozen star systems away--my elementary school class did a living wax museum. We researched a historical figure, dressed up as our subject, and, when a "visitor" to the museum pressed a red dot on our hand, recited a first-person speech based on our research. Unrepentant early nerd that I was, I chose Albert Einstein.

I don't really remember anything about the contents of my monologue. I probably gave a brief biographical sketch, but likely left out the part where Einstein bribed his first wife into divorce with Nobel money he'd yet to receive. I probably talked about the theory of relativity and how it merged space and time, but likely didn't include anything about Riemannian geometry and metric tensors.

My knowledge of the scientist and his science was patchy, to be sure, but that didn't stop me from admiring him. Einstein is the model of the lone genius working tirelessly, using nothing more than the power of his mind to change the world. For a long time, I imagined he and I were equivalent. I imagined that I alone knew the secrets of the universe and that my solitude represented nothing more than the gap in intellect between myself and others.

Before the inevitable deconstruction of that paragraph, let's talk a bit about Einstein the genius. While E=mc² is his most famous equation, it's not the equation that made him famous. Physicists will tell you that general relativity was his crowning achievement.

GR grew out of Einstein's attempt to extend his special theory of relativity to gravity. SR and electromagnetism fit together perfectly, but gravity did not behave. According to Newton, gravity acts instantaneously, and that didn't sit well with light speed being the ultimate limit. To reconcile gravity with relativity, Einstein looked at a subtle difference between the electrostatic force and the force of gravity.

When two charged particles are sitting next to each other, the electrostatic force that one feels is proportional to the product of their charges divided by the square of the distance between them--simple enough. When two masses are sitting next to each other, the gravitational force on one is proportional to the product of their masses divided by the square of the distance between them. The forces are nearly identical, just swapping charge for mass.

But when a particle feels a force, it follows Newton's second law and accelerates by an amount inversely proportional to its mass, which is what inertia is all about. This means the mass term from gravity and the mass term from inertia cancel out and bodies under the force of gravity experience the same acceleration regardless of their masses. We know this; it's just the idea that a hammer and a feather (ignoring air resistance) fall at the same rate.
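Here is a minimal sketch of that cancellation, using Newton's law of gravitation and the second law near the Earth's surface. The constants are standard rounded values, and the hammer and feather masses are illustrative numbers I picked, not anything measured here.

```python
import math

# Newtonian gravity near Earth's surface: the falling object's mass cancels out.
# Constants are rounded textbook values; the masses are illustrative assumptions.
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS_KG = 5.972e24
EARTH_RADIUS_M = 6.371e6


def gravitational_force(gravitating_mass_kg):
    """Force on an object at Earth's surface; proportional to its 'gravitating' mass."""
    return G * EARTH_MASS_KG * gravitating_mass_kg / EARTH_RADIUS_M ** 2


def acceleration(force_newtons, inertial_mass_kg):
    """Newton's second law: a = F / m, where m is the 'inertial' mass."""
    return force_newtons / inertial_mass_kg


hammer_kg, feather_kg = 1.32, 0.03
a_hammer = acceleration(gravitational_force(hammer_kg), hammer_kg)
a_feather = acceleration(gravitational_force(feather_kg), feather_kg)

print(f"Hammer:  {a_hammer:.2f} m/s^2")
print(f"Feather: {a_feather:.2f} m/s^2")
```

The force on the hammer is far larger, but so is its inertia, so both come out at the same acceleration, provided the two kinds of mass really are the same quantity.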

Thank you, NASA.
This quirk of gravity gets called the equivalence principle, because it seems to show that "gravitating" mass and "inertial" mass are equivalent, even though there's no particular reason why they need to be.

As Einstein thought about this peculiarity of gravity, he was struck with what he called "the happiest thought" of his life. He postulated a modification to the equivalence principle, which is that being in a gravitational field is equivalent to being in an accelerated reference frame. What he meant was that gravity is not a real force but an effect we observe, so there's no difference between your car seat pushing up against you when you hit the gas and the Earth holding you down.

The link to the other equivalence principle is that, in free fall, any object falling with you moves at the same rate, and the same thing is true in an accelerated reference frame, because the acceleration you feel is a result of the frame (your car, a rocket) and not your mass.

This happiest thought led Einstein to the conclusion that being in free fall in a gravitational field is just as "natural" as being at rest. When you do feel a force (your car seat, the ground), that's just an object getting in the way of your natural path through spacetime. As usual for Einstein, his next step was to imagine what this meant for light.

Assuming his principle is true, weird things happen in gravity. Say you're in a rocket ship at rest in space. If a beam of light comes in one window, it will trace a straight line through the rocket ship and out another window. If you're moving at a constant speed, you observe the exact same thing, because special relativity says you can't tell the difference between different inertial frames.

If you're accelerating, the light will trace out a parabolic curve, because you're moving faster when the light leaves the rocket than when the light enters it. The equivalence principle says you can't tell the difference between gravity and acceleration, so the same thing should happen if you're in a gravitational field. Light passing near the Sun, for example, will curve.

Now it's all well and good to say this happens because of the equivalence principle, but that's not a mechanism. If there isn't a force causing the light to curve, what's doing it? Einstein says this is the wrong question to ask and that what looks like a force is just light taking the only path available.

Here's an imperfect analogy: imagine you're driving up a mountain, maneuvering through twisting switchbacks. If you veer one way, you fall off the mountain. If you veer the other way, you crash into the side of it. So you stick to one narrow path. To the GPS satellites monitoring the position of your phone (but not the mountain or the road), it looks as if your phone, you, and the car are being pushed around by some mysterious force, but in reality you are simply following the only path available.

Except you might think, well that works for light zooming around at 300,000 km/s, but what if there's nothing propelling me? Why am I following any path at all? And the answer is that we are all following a path constantly through spacetime. We're moving forward through time. But in the presence of a gravitational field, spacetime gets warped, and your straight path through it moves a little bit out of time and into space. The "speed" you had going through time gets converted into speed in space, which is why clocks slow down close to a black hole.

Figuring out the specifics of how mass could warp spacetime took Einstein about a decade, but he finally succeeded in 1915, giving the world general relativity. With it came a number of predictions, including the bending of starlight, the correct shape of Mercury's orbit, and the fact that accelerating masses will send out gravitational waves that stretch and shrink spacetime as they pass by. Finally detecting those waves reaffirmed Einstein's genius one more time a century after he first proposed them. And all of that came from Einstein tinkering around with the fact that all objects fall at the same rate.

I said earlier that I equated myself to Einstein, but the truth is I'm no Einstein. I'm a pretty smart guy, but not a genius, and certainly not one of the greatest scientific minds in history, capable of deducing fundamental and quantitative physical truths about the universe from simple thought experiments. What can I possibly hope to achieve compared to that?

But there is an equivalence between me and Einstein, because in reality he was no Einstein, either. It took him a decade to complete general relativity because, talented though he was at math, he was not a mathematician and had to learn an entirely foreign branch of it to make his theory work. He got help from a mathematician friend of his, Marcel Grossmann, who was familiar with Riemannian geometry. That branch of math was invented in the 19th century by a couple of guys, including Bernhard Riemann.

The idea of looking at space and time as a unified thing was partly inspired by Hermann Minkowski, who applied geometrical concepts to Einstein's special relativity. Before Einstein even got to special relativity, which was critical for getting to GR, he frequently discussed difficult subjects with a group of like-minded friends that maybe ironically called themselves the Olympia Academy. And most of the pieces for SR were put in place by earlier physicists, such as Hendrik Lorentz and George FitzGerald.

Black holes were first theorized about by Karl Schwarzschild, who found one of the simplest solutions to Einstein's field equations while fighting in the trenches during WWI. Roy Kerr figured out how rotating black holes behave. And many others over the ensuing decades contributed to the theory.

As far as gravitational waves are concerned, Einstein himself waffled on whether they even existed. But even so, he originally showed only that they could exist and radiate away energy. Solving general relativity for the shape of gravitational waves emitted by two inspiraling, merging black holes took until the 2000s. In fact, it was only accomplished with the help of supercomputers using numerical techniques.

And even ignoring the many contributions from theorists not named Einstein, his prediction about gravitational waves would have meant nothing if we did not have the means to detect them. The feat accomplished by LIGO this past week involved scientists who are experts in interferometry, optics, vacuum chambers, thermodynamics, seismology, statistics, etc. The effort required theorists, as well as experimentalists, engineers, and technicians.

I don't mean to imply that Einstein's work would be for naught without the janitors who cleaned his office, that he couldn't have done it without all the little people supporting him. I mean that Einstein's contribution to the discovery was only one part of a vast web of contributions by a host of extremely talented people, alive and dead, who did things Einstein couldn't have done.

On Thursday, we all learned the magnitude of what they had accomplished. Rumors of the discovery had been swirling around for a while before it was announced. By the time I arrived at school on Thursday to watch the LIGO press conference, I had a pretty good idea of what they were going to say.

Yet that didn't detract from the occasion. Packed into a lounge in the physics department, students, TAs, professors, and I--maybe a hundred altogether--watched the press conference webcast on a giant screen. We all cheered when the discovery was confirmed and cheered again when we heard the primary paper had already been peer reviewed. Half an hour in, I had to leave to go to my theoretical astrophysics course. There, the professor and TA set up a projector and we all continued to watch the press conference. When the webcast ended, the professor took questions about gravitational waves.

Being a part of that, in the minutest and most indirect way, was thrilling. It was a day when Einstein's greatest theory was confirmed yet again, when a new field of astronomy began, and when a thousand scientists got to tell the whole world about the amazing thing they had discovered.

There's a certain--possibly strained--equivalence to my wax museum Einstein moment from 20 years earlier. School was involved, as well as a story about Einstein. But this time I was listening to that story. My passion for science and learning has remained constant, but the attitude has changed. Back then, and for a very long time after that, I took joy in knowing more than others, in being the smartest guy in the room.

Now I know that's not the case. But I also know it doesn't matter. We just don't learn about the universe by sitting alone and thinking brilliant thoughts. That is, at most, one part of the process. So I don’t have to be a mythical genius to contribute. I can be a part of something amazing, of humanity's quest to understand the world around us, just by collaborating with others who are as passionate as I am. I haven't done it yet, obviously, but just as Einstein's magnificent theory has been reaffirmed, so too has my drive to be a scientist.

Sunday, February 7, 2016

Who Cares What Old, Dead White Guys Thought?

The title of this post is inaccurate if you don't consider the ancient Greeks to have been white. But that's probably not a discussion I want to get into right now. Anyway, today we're discussing my ancient philosophy course from last semester, or more precisely, my Socrates, Plato, and Aristotle course.

There are two main points I'd like to articulate: (1) if philosophy has made objective advancements in the last 2,400 years, why should we care what philosophers thought 2,400 years ago, and (b) man, I had a really annoying classmate in my ancient philosophy class. In essence, I'm wondering whether it was worth it to take this class, just as I had similar concerns about the value of paper writing in my philosophy in literature class from last spring.

To think about the first point, there are two paths you can go down. First, you can go the "philosophy is the mother of science" route and wonder where that leaves philosophy nowadays. That is, there used to be essentially no distinction between being a philosopher and a scientist. Science is a relatively new word, and people like Newton were referred to as "natural philosophers." Science was just doing philosophy about nature rather than philosophy about justice or god or what have you.

The usual argument you see here is that philosophy birthed the sciences we're familiar with today, and where it's done so, philosophy is obsolete and the science is all that's left. There are still philosophers of physics today (after all, I took a class on that, too), but they're not doing physics. Philosophers of physics no longer ask whether the world is made of four fundamental elements, or if all matter is composed of atoms, or if the planets travel in perfect circles, because physicists have definitively answered those questions (no, depends, no).

So the domain of philosophy has shrunk. Where philosophy about the natural world is still relevant, it's in asking questions about physical models, rather than coming up with the models themselves. (Metaphysicians might disagree, but a lot of modern philosophers don't hold metaphysics in particularly high regard, as I understand it.) Similar shrinkage has occurred in the other sciences, with psychology being one of the latest disciplines to squeeze philosophy further.

Here I want to look at a particularly egregious example from my ancient philosophy course, Plato's tripartite soul. Plato reasoned that a statement and its contradiction cannot both be true at the same time. This is reasonable and one of the foundations of classical logic. Take a statement like, "The sun is yellow." Either that statement is true, or the statement, "The sun is not yellow" is true. They can't both be true, because one implies a contradiction of the other.

So then let's look to the soul. We've all had the experience of simultaneously wanting and not wanting the same thing. "I want to eat that chocolate cake" and "I don't want to eat that chocolate cake" are thoughts we can have at the same time. In the first instance, it's our carnal desire for the cake, but in the second instance, it's our willpower that's talking. But if the law of non-contradiction holds, it can't possibly be true that we can both want and not want a piece of chocolate cake simultaneously.

...unless we have a divided soul, as alluded to above. Plato identifies three different competing interests in the human psyche that can produce contradictory desires. Roughly, these are the appetitive, passionate, and rational parts of the soul. They are distinct and incompatible, Plato argues, otherwise the law of non-contradiction is contradicted.

And that's all well and good, and proceeds from some reasonable assumptions, but it's baloney as far as modern neuroscience and psychology are concerned. What a hundred years of research into the brain has taught us is that the brain is really complicated, possibly the most complicated three pounds in the universe, and it's decidedly not true that you can chop it up into distinct, one-pound chunks.

(I've cleverly switched from talking about the soul to talking about the brain, but a distinction between the two was not necessarily important to Plato, and science says that "the mind is what the brain does.")

There are two main ways in which Plato's tripartite soul fails as a theory. The first is that there are probably many components to the human psyche, far more than three. The second is a subtle problem that has plagued philosophers for thousands of years, which is that it's possible for words and concepts such as "want" to have different meanings depending on the context. So you can want something, and you can want* something. The former may mean "desire enough to actively pursue," whereas the latter might be "like thinking about but have no inclination to pursue." In that case, you can not want something, and also want* it, and there is no contradiction.

This is a tricky problem that crops up all over the place, which is why analytic philosophers spend large chunks of their time trying to tease apart just what we mean when we talk about seemingly plain concepts such as free will or beauty or truth.

But if all we have to go on is what remains of a large, sometimes disjointed collection of Plato's writings, it's easy to find flaws in his logic. His work cannot defend itself. It's also possible that those old, dead white guys were just wrong about stuff. They had a limited amount of data and lacked the thousands of years of philosophical tradition (that they began) to draw upon.

Which brings me to my annoying classmate. During lecture, he frequently raised his hand and asked the instructor questions such as, "But doesn't that produce a contradiction?" and "But wouldn't that mean nothing is beautiful?" and "But didn't Plato condone slavery?" And every single time, the instructor would engage with him and answer his questions in a thoughtful manner.

Terrible, right? Provoking the instructor into discussing philosophy with us. Well, yes. We had two lecture periods and one discussion period per week, and he brought up his objections during the lecture period. His interruptions were so frequent that there was material we were never able to cover in class. And all of this was possible because, yes, duh, Socrates and Plato and Aristotle were wrong about stuff. It was very frustrating, but I suspect I'm coming off as kind of petulant here, so let's go back to Plato for a moment.

While Plato did divide the mind into three different parts, he had particular affection for one of those parts: the rational mind. It was through employing the rational mind in dialectic that truth could be revealed. This is where Plato's allegory of the cave comes in. Plato conceived of a metaphor where the reality we perceive is just shadow puppets lit by torchlight that we are forced to watch in some kinky Clockwork Orange setup.

Philosophers, however, have broken out of the cave and can see real objects illuminated by the pervasive, powerful sun. So there's a distinction between the ever-changing, distorted, and 2-dimensional shadows we think of as reality and the constant, colorful, 3-dimensional objects that actually compose reality. When we see a chair, we are only seeing an indistinct, imperfect shadow of a chair that does not fully encompass the essence of true chairness.

At first blush, this whole idea seems patently ridiculous. We all accept that our eyes can deceive us and that reality is maybe actually electrons and protons, but it seems laughable to suggest that in some eternal, unchanging realm there exist the true forms of the objects we behold here. Where is this realm? Is there a form of the electric fan there, the cell phone, the credit card offer?

Well, it's unclear how diverse Plato intended his realm of forms to be, but he almost certainly thought it was populated by mathematical objects. Many ancient Greeks (including Plato) took math and geometry as the model of a priori knowledge, knowledge we could come to know just by thinking logically and without relying on evidence from our senses. To Plato, this meant accessing Platonic forms.

So there's some ideal triangle out there, as well as a perfectly straight, infinitesimally thin line, and also the true form of the number 5. Again, this sounds plainly absurd. But let's look at a particular number, such as the ratio between the circumference and diameter of a circle: π.

In a little over a month, it will be Pi Day, which means the internet will be stuffed with memes about π pies and whether τ is the true constant and how π is a magical number that contains everything in the universe.

That last one relies on a conjectured property of π, that it is a normal number. A normal number is one whose endless, non-repeating digits are distributed uniformly, with no numeral, and no string of numerals of a given length, being more likely than any other. Assuming that’s true, then if you peer deep enough into the digits of π, you will eventually find your telephone number, or a bitmap of your face, or your life story written out in ASCII code.
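If you want to go digit hunting yourself, here is a minimal sketch that computes decimals of π with integer arithmetic and then counts and searches them. Machin's formula is just a convenient choice on my part, and the function names are made up; a thousand digits proves nothing about normality, of course.

```python
from collections import Counter

# Digits of pi via Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239).
# Assumption: Machin's formula and these helper names are my own choices.


def arctan_inverse(x, unity):
    """Return arctan(1/x) scaled by `unity`, using the Taylor series in integers."""
    power = unity // x          # unity / x^(2k+1), starting at k = 0
    total = power
    x_squared = x * x
    divisor = 3
    sign = -1
    while power:
        power //= x_squared
        total += sign * (power // divisor)
        sign = -sign
        divisor += 2
    return total


def pi_digits(n_decimals):
    """Return pi as a digit string: '3' followed by n_decimals decimal digits."""
    guard = 10                  # extra digits to absorb rounding error
    unity = 10 ** (n_decimals + guard)
    pi_scaled = 4 * (4 * arctan_inverse(5, unity) - arctan_inverse(239, unity))
    return str(pi_scaled // 10 ** guard)


decimals = pi_digits(1000)[1:]          # drop the leading 3
counts = Counter(decimals)
position = decimals.find("999999")      # -1 would mean "not found"

for numeral in sorted(counts):
    print(numeral, counts[numeral])
print("First run of six 9s starts at index", position)
```

Swap in your own phone number for the search string, but expect to need a lot more than a thousand digits.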

But you'll also find a lot of nonsense, and there's no way to tell the true from the false, so this is more like Borges' Library of Babel than, say, the Encyclopedia Galactica. It’s true that highly random data has a lot of information in it, but there’s nothing profound about that; that’s numerology, not number theory.

Additionally, it turns out that almost all (real term) of the real numbers are normal, but it's not easy to pick out any particular number and say that it's normal. Currently, there is no proof that π is normal, although the evidence suggests that it is.
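If you want to poke at that evidence yourself, here is a minimal Python sketch that tallies how often each numeral shows up in the first 1,000 digits of π. The digit generator is a spigot algorithm usually credited to Jeremy Gibbons; the function names and the choice of 1,000 digits are mine, picked only for illustration. Roughly even counts are consistent with normality but prove nothing.

```python
from collections import Counter
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time: 3, 1, 4, 1, 5, ...

    Unbounded spigot algorithm; uses only integer arithmetic.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled, so emit it and rescale the state.
            yield n
            q, r, t, k, n, l = (
                10 * q, 10 * (r - n * t), t, k,
                (10 * (3 * q + r)) // t - 10 * n, l,
            )
        else:
            # Not enough precision yet; fold in another term of the series.
            q, r, t, k, n, l = (
                q * k, (2 * q + r) * l, t * l, k + 1,
                (q * (7 * k + 2) + r * l) // (t * l), l + 2,
            )


def digit_counts(num_digits):
    """Count occurrences of each numeral among the first num_digits digits."""
    return Counter(islice(pi_digits(), num_digits))


counts = digit_counts(1000)
for numeral in range(10):
    print(numeral, counts[numeral])
```

Each numeral should land somewhere near 100, which is what you'd hope for from a normal number, though a finite sample can only ever be suggestive.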

But what if there is no proof? What if it turns out to be impossible to demonstrate rigorously that π is a normal number? (You can often prove that it's impossible to prove something in math, but maybe a proof is just never found.) In math, a statement is only taken to be true if it can be proven via deductive logic. So if there is no proof that π is normal, is it normal?

Well you're probably thinking, it's either normal or not, duh. Its being normal doesn't depend on whether or not we're smart enough to prove it. The Earth was four and a half billion years old long before we were able to show, scientifically, that it was. But look what's happened here. We've asserted that π has definite properties independent of our conception of it. That is, we're saying π is real, as real as the Earth, and that it has a form beyond our crude and incomplete perceptions.

So perhaps Plato's forms are not as crazy as they sound. Now, I'm not arguing that Plato is correct and that numbers are "real." This is a lively debate in the philosophy of mathematics (a subject I'll have more to say about at the end of this semester), with the other positions being "idealist" and "anti-realist." But Plato originated (or was the best, earliest articulator of) one tradition in this philosophical debate.

Which brings me back to my annoying classmate. If instead of a philosophy course, this had been a course on the history of Ancient Greece, at no point during the lecture would a classmate have interrupted the instructor with, "But teacher, weren't the Athenians wrong to butcher and enslave whole cities?" Of course they were wrong! That is clearly not up for debate. What's interesting, however, is why the Greeks did what they did, and how their actions propagated through history. That is, I want to understand the legacy they left behind, the traditions they began.

And that's how I look at an ancient philosophy course. To me, it's not primarily about finding all the myriad logical inconsistencies in the thoughts of some old, dead white guys, but about understanding how their thinking shaped humanity for millennia to come. In some cases, their ideas are obsolete and need to be discarded, while in others they represent the seeds of debates still flourishing in philosophy now. The greatest difference I see is that philosophers today strive for precision and nuance so as to avoid falling into the same old traps. But we couldn't have gotten here, couldn't have learned that lesson, without first falling in.

Thursday, January 21, 2016


This is the second year in a row that I've seen an article decrying our collective cyber stupidity because of the awful passwords we use to protect ourselves. And this is the second year in a row that I've rolled my eyes very hard at the article because of its mathematical ignorance. This is the first year I've decided to blog about it, though.

The article linked to above lists the most popular passwords found in databases of stolen passwords. At the top of the list are groaners such as "password", "123456", and “qwerty”. How could those be the most popular, when everyone knows China is hacking its way into our country and we're using passwords to protect our finances, identities, and porn habits? How could everyone be so stupid?

Well, the truth is, very few people have to be stupid for those passwords to be the most popular. In fact, there's an easy to imagine scenario in which no one is so cyber-challenged. Let's see how.

The most popular password is the one that gets used more than any other individual password. This doesn't mean it's used by a majority of people, obviously, just as Donald Trump isn't supported by a majority of Republicans. Additionally, when we're ranking password popularity, we're doing so by login rather than by person, because that's how the data comes to us. So password popularity is measured as logins/password.

And what's being railed against in the above article is that the passwords with the highest logins/password are bad ones. But what makes a password bad? Ease of guessing--those that take the least time to crack are the least secure.

This quality is quantified in a password's information entropy, which is a measure of the number of bits needed to specify the password. In other contexts, a piece of data's information entropy tells you how much that data can be compressed. The higher the entropy, the more bits needed to specify the data, the fewer bits you can get rid of and still preserve it.

When I think entropy, I think physics. Most people probably do, too, knowing it has something to do with thermodynamics and disorder. You probably know the second law of thermodynamics, which is usually stated as something like, "The entropy (disorder) of a system tends to increase."

The "tends to" there indicates that this is a probabilistic law. That is, if you have a box with octillions of gas molecules all bouncing around at different speeds and directions, it's hard to say exactly what they're going to do, but you can say what they're likely to do. And it turns out that a box of gas is more likely to go to a high entropy state than a low one. The reason is that there are many more high entropy states than low ones available.

This is where the connection to disorder comes in. The canonical example is probably an egg. An intact egg is a very ordered thing. It has a specific shape, and you can't change the shape of the egg without changing the fact that it's an intact egg. Thus order means low entropy, because there are only a small number of ways for an egg to be an egg.

Scrambled eggs, on the other hand, are disordered and high entropy. The high entropy results from the fact that you can rearrange your egg particles (eggs are made of egg particles, right?) in many, many different ways but still end up with the same basic breakfast: scrambled eggs.

How does this connect back to information and passwords? Because as the entropy of a system increases, it takes longer and longer to accurately describe the system in detail. With low entropy, high order systems, there might be one law of nature telling you why the system is shaped the way it is, which means it's easy to specify it in detail. But with a high entropy system, there are many microstates that are approximately the same, so you need to be more and more detailed if you want to specify a particular one. "No, the one with particle 1,036,782,561 going this way, not that way."

So high entropy data doesn't compress as easily because there are many high entropy systems, which means it takes a lot of bits to differentiate between two chunks of data. And this is also why high entropy passwords are more secure: because if you're randomly guessing a password, it takes you much, much longer to get through all the available high entropy passwords than it does the low entropy passwords.
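The compression half of that claim is easy to check. Here is a small sketch using Python's standard `zlib` module; the two 30,000-byte test inputs (a repeated word and a blob of random bytes) are my own stand-ins for the intact egg and the scrambled eggs.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of data after zlib compression at max level."""
    return len(zlib.compress(data, 9))


ordered = b"egg" * 10_000        # 30,000 bytes with an obvious pattern
scrambled = os.urandom(30_000)   # 30,000 bytes of operating-system randomness

print("ordered:  ", len(ordered), "->", compressed_size(ordered))
print("scrambled:", len(scrambled), "->", compressed_size(scrambled))
```

The ordered input shrinks to a tiny fraction of its size, while the random input barely shrinks at all (it usually grows by a few bytes of overhead).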

But that's also why the least secure passwords will always be the most popular ones. Compared to the secure passwords, there just aren't that many bad passwords out there, because bad passwords are low entropy. The logins/password for bad passwords is going to be high essentially by definition. Here's a toy model to demonstrate.

Mathematically, the entropy of a system (S) is proportional to the log of the number of microstates (n) that correspond to a single macrostate. Computer people like to do things in binary, so they use a log base of two: S = log_2(n). Now let’s take some real data and see what we find. Using this website, I have found the entropy of each of the 25 most popular passwords. Their average entropy is 20.12. Using my password manager, I've found the average entropy of 10 randomly generated strong passwords (I got lazy, but the variation in entropy was low): 80.84.
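To get a feel for what those bit counts mean, here is a rough Python sketch of the simplest possible estimate: assume every character is picked uniformly at random from some alphabet, so n is the alphabet size raised to the password length. The function name `brute_force_bits` and the example alphabets are hypothetical, and this is almost certainly not how the website or the password manager score things (real estimators penalize dictionary words and keyboard patterns), so don't expect it to reproduce 20.12 or 80.84 exactly.

```python
import math


def brute_force_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits, S = log_2(n), with n = alphabet_size ** length.

    Assumes each character is chosen independently and uniformly at random.
    """
    return length * math.log2(alphabet_size)


# A six-digit PIN-style password such as "123456": 10 symbols, 6 characters.
print(round(brute_force_bits(10, 6), 2))

# A 13-character password drawn from the 94 printable ASCII symbols.
print(round(brute_force_bits(94, 13), 2))
```

Even this naive estimate puts a six-digit password near 20 bits and a long random one in the 80s, the same ballpark as the averages above.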

So the average good password is ~4 times as strong as the average bad password. If we assume there are only 25 bad passwords (there are many more, but more makes the point even stronger), and that the population of logins (p) uses either good passwords or bad passwords, we can write an expression comparing password popularity (logins/password). For our model, let’s see what it would take for good passwords to be just as popular as bad passwords:

p_bad/n_bad = p_good/n_good

How does the number of good passwords compare to the number of bad ones? Well, from the log formula up there, if we multiply the strength of a bad password by 4, we get 4S = 4 log_2(n). From the rules of logs, we can take that 4 on the outside of the log and bring it in: 4S = log_2(n^4). So if you have n bad passwords, then you have n^4 good passwords.

p_bad/n_bad = p_good/n_bad^4

Solving for the ratio of logins using bad passwords to good, we get:

p_bad/p_good = 1/n_bad^3

Now let’s plug in n_bad = 25.

p_bad/p_good = 1/15625 = 0.000064

This means that as long as more than 0.0064% of all logins use bad passwords, they will be the most popular. Stating the converse, 99.9935% of all logins can use strong passwords, and the bad ones will still be more common.
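Here is the toy model as a few lines of Python, in case you want to try other numbers. It assumes exactly what the model assumes: 25 bad passwords, good passwords 4 times as strong (so 25 to the 4th power of them), and logins split cleanly between the two groups. The function names and the million-login example are made up for illustration.

```python
from fractions import Fraction


def break_even_ratio(n_bad: int, strength_factor: int) -> Fraction:
    """Ratio of bad logins to good logins at which both kinds of password
    have the same logins/password, given n_good = n_bad ** strength_factor."""
    n_good = n_bad ** strength_factor
    return Fraction(n_bad, n_good)


def logins_per_password(logins: int, n_passwords: int) -> float:
    """Popularity of a single password if logins are spread evenly."""
    return logins / n_passwords


ratio = break_even_ratio(25, 4)
print(ratio, float(ratio))  # 1/15625 6.4e-05

# Example: one million logins, of which only 100 (0.01%) use a bad password.
bad_popularity = logins_per_password(100, 25)
good_popularity = logins_per_password(999_900, 25 ** 4)
print(bad_popularity, round(good_popularity, 2))  # 4.0 2.56
```

With just 0.01% of logins using bad passwords, each bad password still gets more use than each good one.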

Of course, in the real world, there are more than 25 bad passwords (and waaaay more than 25^4 good passwords), and people aren't divided up into binary good and bad password users. But I think this demonstrates that very few people need actually be stupid for the above article to be true.

And as I said, it's possible that no one is stupid because this is based on logins rather than users. All it takes is that more than 0.0064% of the time you need to pick a username and password for a site, it's a site for rating cat videos and you rightly don't care about security.

Tuesday, January 19, 2016

Quantifying Weirdness

Quantum mechanics is weird; there's no doubt about that. It’s got wave-particle duality, the uncertainty principle, and spooky action at a distance. Other fields have weird results, too, but although we might comment on the peculiarity of a particular finding, we do not indict other fields as a whole. With quantum mechanics in particular, though, it seems like its idiosyncrasies leave people with the feeling that it is either too weird to be right or too weird to be understood.

Well, today I'd like to help dispel those attitudes, particularly the first one—or at the very least put a number on just how weird quantum mechanics is. To do so, I'm going to be regurgitating material I learned in my philosophy of physics course.

In order to quantify the weirdness of quantum mechanics, we'll be exploring the phenomenon of quantum entanglement. Hopefully, we'll be able to unravel some of its mysteries and not get caught in a web of confusion.

I'm sorry, I promise there will be no more entanglement puns.

Entanglement first gained widespread awareness in physics after a 1935 paper by Einstein, Podolsky, and Rosen, henceforth known as the EPR paper. Einstein was unhappy with how that paper turned out, but he articulated his thoughts more clearly to his colleagues (especially Schrödinger) in private. Additionally, the thought experiment proposed then was more complicated than it had to be. The upshot is I'll be talking about this from a slightly more modern perspective; but historically, the EPR paper is one of the jumping-off points for discussing quantum funny business.

So here's entanglement. In quantum mechanics, particles like electrons are described by a wave function which tells you the probability of finding the electron in a particular state. One such state is spin which, because of weird quantum mechanical reasons, can be either up or down. So the wave function could say there's a 50% chance the spin is up and a 50% chance it's down, for example.

You won't know what the spin is until you measure it. When you do so, the language is that the wave function “collapses,” so now it's just in one state, either up or down, instead of a superposition of both.

If two electrons are hanging out, normally you have two wave functions to keep track of. But if two electrons get created together in a particular process, then they will be described by a single wave function. Once that happens, barring interference from the outside world, it is not possible to decompose that wave function into two separate ones.

Where before your wave function for a single electron said there was a 50/50 chance of spin-up or spin-down, now it might say something like there is a 50% chance that electron A is spin-up and electron B is spin-down, and a 50% chance that electron A is spin-down and electron B is spin-up. So if electron A is in your lab, and electron B is down the road at the chemist, and you measure electron A to be spin-up, then you know the wave function has collapsed to "A up, B down." This means you also know, without having measured it, that electron B is now spin-down. If you do later measure it, you will always find it to be spin-down if A was up.

Here's where things get weird. Again, as long as you prevent your electrons from being interfered with, they remain entangled until you measure the spin of one of them, no matter how far apart the electrons get. So if electron A is in your lab, and you send electron B to Alpha Centauri, when you measure the spin of electron A, you instantly know, across a distance that would take light 4 years to travel, what the spin of electron B is.

This is weird.

Here's another scenario. This one is totally going to blow your mind. Imagine you are playing a game with a street magician. He's got two hands and one coin. While your back is turned, he puts the coin in one of his hands and then asks you to guess where the coin is. There's a 50/50 chance for either hand. You say left hand. He opens, and reveals that there is no coin there.

Now here's the wacky part. Assuming the magician exhibits no trickery and that the coin is in one of his hands, you now know, as if by magic, that the coin is in his right hand. Even if the magician performs some real magic and sends his right hand to Alpha Centauri after hiding the coin, you know instantly, across a distance that would take light 4 years to travel, that the coin is in his right hand. Information has traveled faster than light—a clear violation of Einstein's special relativity!

Okay, no matter how hard I try, I can't make that second scenario sound as weird as the first one. But why not? Because you're saying, “Silly Ori Vandewalle (if that even is your real name), nothing spooky is going on here. The coin's location is a result of the magician's actions before the hands are separated. Revealing the hand doesn't decide the fate of the coin. Duh.”

This is essentially the argument that Einstein made in the EPR paper. If two electrons are entangled, and one of them is sent to Alpha Centauri, and measuring the spin of one tells you the spin of the other, then the only reasonable conclusion you can draw is that the spins were determined beforehand.

The name of the EPR paper is, "Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?" Following Betteridge's law, Einstein posited the answer was no. That's because quantum mechanics can only tell you the probability of the electron's spin being up. But just as with the magician's coin, Einstein argued, this probability represents nothing more than our ignorance, not any actual indeterminacy on the part of the coin or the electron.

So is the weirdness gone?

Well, let's see if we can't make this spooky action even more mundane. Another way to think of this result is that the two electrons are correlated. If two objects are correlated, they have a common cause. A caused B, or B caused A, or C caused both A and B. So we are suggesting that some common cause configured both spins beforehand but didn't bother to tell the wave function this.

In the 60s, physicist John Stewart Bell developed a theorem that must be true about any three binary properties of a single system. This theorem tells us something important about common causes. There are a few assumptions that go into the theorem, the most relevant of which is that, once you measure property A, that measurement can't affect properties B and C before you measure them.

Let's go through Bell's theorem with cookies so that I can distract you from the fact that we're doing math.

By Kimberly Vardeman from Lubbock, TX, USA (Perfect Chocolate Chip Cookies) [CC BY 2.0], via Wikimedia Commons
Say you've baked a batch of cookies, and the cookies can be large or not large (L, ~L), have walnuts or no walnuts (W, ~W), and have chocolate chips or no chocolate chips (C, ~C). Now say you want to know how many large, non-walnut cookies you have. We'll call that N(L, ~W). This number is the sum of all large, non-walnut, chocolate chip cookies N(L, ~W, C) and all large, non-walnut, non-chocolate chip cookies N(L, ~W, ~C). This must be true, because whether or not a cookie has chocolate chips does not affect its size or walnut content.

Similarly, the number of cookies with walnuts but no chocolate chips is N(L, W, ~C) + N(~L, W, ~C) because size doesn't matter. And finally, the number of large, non-chocolate chip cookies is N(L, W, ~C) + N(L, ~W, ~C) because walnuts don't matter.

Now let's add together the number of large, non-walnut cookies and the number of walnut cookies with no chocolate chips. That quantity is:

N(L, ~W, C) + N(L, ~W, ~C) + N(L, W, ~C) + N(~L, W, ~C)

If you notice, the second and third terms are also the terms for the number of large, non-chocolate chip cookies. That means our sum is always at least as great as the number of large, non-chocolate chip cookies.

Now let's make a slight shift and talk instead about probabilities. If you randomly reach out for a cookie, the probability that you get a particular kind is directly proportional to how many cookies of that kind there are to take. This means we can reword Bell's cookie theorem thusly:

The probability of choosing a large, non-walnut cookie or a walnut, non-chocolate chip cookie is always greater than or equal to the probability of choosing a large, non-chocolate chip cookie.

This theorem is true regardless of how many of each cookie there actually are, because at no point in demonstrating this did we use numbers. It's also true no matter what kinds of properties we're talking about, so long as they are binary properties, because we could just as easily say L stands for lemon cookies or even something non-cookie-related.
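If you don't trust the algebra, you can throw random batches at it. The sketch below is a brute-force sanity check in Python, not a proof: it invents 10,000 batches with random counts of each of the eight cookie types and confirms that N(L, ~W) + N(W, ~C) >= N(L, ~C) every time. The helper names, the batch sizes, and the random seed are arbitrary choices of mine.

```python
import itertools
import random


def bell_cookie_check(counts):
    """Check N(L, ~W) + N(W, ~C) >= N(L, ~C) for one batch.

    counts maps (large, walnut, chip) tuples of booleans to a cookie count.
    """
    def number_of(predicate):
        return sum(c for kind, c in counts.items() if predicate(*kind))

    large_no_walnut = number_of(lambda large, walnut, chip: large and not walnut)
    walnut_no_chip = number_of(lambda large, walnut, chip: walnut and not chip)
    large_no_chip = number_of(lambda large, walnut, chip: large and not chip)
    return large_no_walnut + walnut_no_chip >= large_no_chip


rng = random.Random(0)
kinds = list(itertools.product([True, False], repeat=3))  # all 8 cookie types
batches = [{kind: rng.randint(0, 50) for kind in kinds} for _ in range(10_000)]

all_hold = all(bell_cookie_check(batch) for batch in batches)
print(all_hold)  # True
```

No batch ever fails, because each cookie carries all three properties at once, whether or not anyone has looked at them.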

But what's more, this theorem tells us about correlations. You see, if I give instructions to a thousand people to bake exactly the number of cookies I say and have each person randomly select and eat one cookie, we'll find that Bell's cookie theorem holds true. The probabilities will be maintained across all kitchens, because the cookie batches are correlated--spooky baking at a distance. The correlation is a result of the common cause known as me giving out instructions.

Now let's switch gears and talk about sunglasses—or as I prefer to call them, quantum shields. Polarized sunglasses only admit light that oscillates in a particular direction (up and down or left and right, for example). If you have horizontally polarized sunglasses, then only light waving from left to right (from the frame of the frames) will get through. But light coming from the sun is equally likely to be waving in any direction, so if you think about it, polarized sunglasses should only let a tiny, infinitesimal amount of light through—only light that is exactly horizontal and nothing at any other angle. Yet this isn't what happens. Polarized sunglasses will absorb roughly half the incident light and let the rest pass. Why is that?

Well, let's talk about the quanta of light, photons. A single photon doesn't have a direction it's waving, but it does have a polarization that is based on its spin. When a photon passes through sunglasses, the photon's spin is measured by the polarizing filter. Before the measurement, it's in a superposition of horizontal and vertical spin based on the angle of its spin (the direction it's waving).

When it's measured, that superposition collapses so that its spin is either horizontal or vertical. If it ends up being horizontal, it passes through. Otherwise, it's absorbed. The closer the angle of its spin is to horizontal, the higher the probability that it collapses to a horizontal spin. In this way, light from any polarization (except exactly vertical) can pass through, but the odds of it doing so go down the further away from horizontal you get, and anything that does pass through will subsequently be measured as horizontal. So sunglasses are quantum shields.
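The "roughly half" figure falls out of a quick simulation. This Python sketch assumes a specific rule the text above doesn't spell out: a photon whose polarization is tilted by some angle from horizontal gets through with probability cos^2 of that angle (the standard result known as Malus's law). The function names, photon count, and seed are hypothetical choices for the demo.

```python
import math
import random


def pass_probability(angle_deg: float) -> float:
    """Chance that a photon tilted angle_deg away from horizontal passes
    a horizontal polarizing filter (assumed cos^2 rule, Malus's law)."""
    return math.cos(math.radians(angle_deg)) ** 2


def fraction_passed(num_photons: int, seed: int = 0) -> float:
    """Send unpolarized light (uniformly random angles) at the filter and
    return the fraction of photons that make it through."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(num_photons):
        angle = rng.uniform(0.0, 180.0)
        if rng.random() < pass_probability(angle):
            passed += 1
    return passed / num_photons


print(pass_probability(0))       # horizontal light always passes
print(pass_probability(45))      # halfway between: about a coin flip
print(fraction_passed(100_000))  # sunlight: close to one half
```

Averaging cos^2 over every possible angle gives exactly one half, which is why the simulated fraction hovers near 0.5.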

"Oakley half wire" by Jpogi at Licensed under Public Domain via Commons
This probability of getting a particular spin works for electrons, too, such as the two entangled ones in our EPR thought experiment. Instead of a polarizing filter, we use magnets to measure an electron’s spin. Before we talked about a 50/50 chance of an electron being up or down, but these odds can be adjusted by rotating our magnets in exactly the same way that light waves rotated away from horizontal have different odds of passing through sunglasses.

But this adds a new wrinkle to our thought experiment. Before, getting a spin-up on Earth meant the Alpha Centauri electron would be spin-down 100% of the time. If we rotate the Earth magnet by some angle θ, then that perfect correlation stops being 100%. It turns out that the odds of one being spin-up and the other spin-down are equal to cos^2(θ/2), where θ is the angle between the two magnets.

We can carry out this experiment many times, creating entangled electrons and sending them to Alpha Centauri. A third of the time, we can measure with one magnet oriented at 0 degrees and the other at θ degrees clockwise, a third with one θ degrees and the other φ degrees, and a third with one 0 degrees and the other φ degrees. In this way, we are measuring three different binary properties of the system. Bell's theorem applies.

An entangled pair can be spin-up at 0 degrees and spin-down at θ degrees, spin-up at θ degrees and spin-down at φ degrees, or spin-up at 0 degrees and spin-down at φ degrees.

Bell's theorem tells us, then, that P(θ) + P(φ-θ) >= P(φ). Using the cosine formula up there, this comes out to cos^2(θ/2) + cos^2([φ-θ]/2) >= cos^2(φ/2). Okay. Looks fine.

Except this isn't always true, depending on the angles you pick. Sometimes, the left-hand side will be less than the right-hand side. If you subtract the right from the left, then whenever Bell’s inequality is violated, the expression will be negative. You can see when that happens in this graph.

I am a Matlab Master.
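For anyone without Matlab handy, here is a rough Python sketch of the same check. It takes P(x) = cos^2(x/2) for every term, computes left-hand side minus right-hand side on a grid of angles, and collects the pairs where the result goes negative. The function names and the 10-degree grid are my own choices, so this is a stand-in for the graph rather than a reproduction of it.

```python
import math


def p_opposite(angle_deg: float) -> float:
    """Probability of opposite spins for magnets angle_deg apart: cos^2(angle/2)."""
    return math.cos(math.radians(angle_deg) / 2) ** 2


def bell_margin(theta_deg: float, phi_deg: float) -> float:
    """P(theta) + P(phi - theta) - P(phi); negative means the inequality fails."""
    return (p_opposite(theta_deg)
            + p_opposite(phi_deg - theta_deg)
            - p_opposite(phi_deg))


# Scan a coarse grid of angle pairs and keep the ones that break the inequality.
violations = [
    (theta, phi)
    for theta in range(0, 361, 10)
    for phi in range(0, 361, 10)
    if bell_margin(theta, phi) < -1e-9
]

print(bell_margin(60, 120))    # positive: inequality satisfied here
print(bell_margin(180, 360))   # about -1: inequality violated here
print(len(violations) > 0)     # True
```

At θ = 180 and φ = 360, both terms on the left vanish while the right-hand side equals 1, so the inequality fails about as badly as it can.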
So what does it mean for Bell’s inequality to be violated? Well, in the case of our cookies, the correlation was upheld because I sent out a common set of instructions to all the bakers. This is the common cause of the correlation. We saw that this common cause would lead to adherence to Bell's inequality for any set of three binary properties of a system. This means that a common cause cannot be the origin of the correlation between entangled electrons. They aren’t deciding their configuration beforehand.

What Bell's theorem does permit is a non-local connection—the electrons instantly updating each other on their spin, or electrons that are governed by interactions across all of space. The other usual possible explanation for EPR and Bell is that electrons don't have any intrinsic reality, that realism itself is a foolish idea. No one likes either of these possibilities.

There are alternative ways of deriving, formulating, and generalizing Bell's theorem. When you do so via the CHSH inequality, you find that classical correlations can be no higher than 2. But quantum correlations violate this limit and can be as high as 2√2. And yet we can imagine other correlations, such as the Popescu-Rohrlich box, that are even higher than 2√2—correlations that you cannot reach even with entangled, non-local/non-real electrons.
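Those three numbers can be reproduced in a few lines. The Python sketch below uses one common form of the CHSH quantity, |E(a,b) - E(a,b') + E(a',b) + E(a',b')|, where each E is the average product of two ±1 measurement results. The classical bound comes from trying every pre-decided set of answers; the quantum value assumes the singlet correlation E = -cos(angle between magnets), which follows from the cos^2(θ/2) rule, evaluated at the textbook settings of 0, 90, 45, and 135 degrees. The function names and those angle choices are assumptions of mine, not something taken from a specific derivation.

```python
import itertools
import math


def chsh(e_ab: float, e_ab2: float, e_a2b: float, e_a2b2: float) -> float:
    """CHSH value |E(a,b) - E(a,b') + E(a',b) + E(a',b')|."""
    return abs(e_ab - e_ab2 + e_a2b + e_a2b2)


# Classical: each particle carries pre-decided answers (+1 or -1) for both
# settings on its side. Try all 16 combinations and keep the best.
classical_max = max(
    chsh(a * b, a * b2, a2 * b, a2 * b2)
    for a, a2, b, b2 in itertools.product([-1, 1], repeat=4)
)


def singlet(angle_a_deg: float, angle_b_deg: float) -> float:
    """Assumed quantum correlation for an entangled pair: -cos(angle difference)."""
    return -math.cos(math.radians(angle_a_deg - angle_b_deg))


a, a2, b, b2 = 0, 90, 45, 135
quantum = chsh(singlet(a, b), singlet(a, b2), singlet(a2, b), singlet(a2, b2))

# Popescu-Rohrlich box: perfectly correlated for three setting pairs and
# perfectly anti-correlated for the fourth.
pr_box = chsh(1, -1, 1, 1)

print(classical_max)      # 2
print(round(quantum, 4))  # 2.8284
print(pr_box)             # 4
```

Checking only the pre-decided strategies is enough for the classical case, since any random mixture of them averages out to something no larger than the best single one.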

So quantum mechanics is weird. But it's only weirder than regular spooky action at a distance by a factor of √2, or ~41%. Although √2 is irrational, so maybe quantum mechanics is unreasonably weird.