
Saturday, April 14, 2018

A World of Pure Imagination

In the philosophy of mathematics—hold on, hold on, I promise this is good—there's a perennial debate about whether numbers are real or just something we made up. This argument elicits a kind of irritated shrug from most people, but there is a fairly reliable way to evoke some pushback and/or incredulity: assert that imaginary numbers exist.

An imaginary number is the square root of a negative number, which of course doesn't make sense; any real number multiplied by itself comes out positive (or zero). But mathematics is all about laying down axioms and seeing what logically follows. So instead of treating √-1 as a calculator error, we can simply declare that √-1 = i and see where that leads.

Alright, you think, but we can't just declare things into existence. What does an imaginary number even mean in the real world? You can have 3 apples, or maybe even -3 apples if you owe someone, but 3i apples has no concrete, physical interpretation, right?

Well, it turns out that by allowing complex numbers—a set that includes both real and imaginary numbers—we open up a new space for doing mathematics and physics. In fact, if we want to explain the bewildering diversity of chemical elements or the solidity of matter, we have to explore this imaginary space. Could anything be more concrete?

Take a look and you'll see...

Before we delve into the physics, let's make sure we have a little intuition about complex numbers. The imaginary unit, i, is the square root of -1. Just based on that, we see imaginary numbers cycle:

i*i = -1, because that's our definition

(i*i)*i = -1*i = -i

(i*i)*(i*i) = -1*-1 = 1

And (i*i*i*i)*i = 1*i = i again

This cycle lends itself to a neat geometric interpretation. Instead of the humdrum xy-plane, we can imagine a complex plane like this:

By Svjo [CC BY-SA 4.0], from Wikimedia Commons
Here, the horizontal axis is real and the vertical axis imaginary. Complex numbers are pairs of the form a + bi, representing coordinates (or a vector) on our plane. If we draw a circle counter-clockwise through the points 1, i, -1, and -i, you see they follow the same cycle as our imaginary multiplication. So you can rotate through the complex plane just by multiplying complex numbers.
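If you want to watch this cycle happen, here's a quick sketch in Python, whose built-in complex type writes i as 1j:

```python
# Multiplying by i rotates a point in the complex plane by 90 degrees
# counter-clockwise; four multiplications bring you back where you started.
z = 1 + 0j
for step in range(5):
    print(step, z)   # 1, i, -1, -i, then 1 again
    z *= 1j
```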

It might look like we've only renamed a plain plane, but this space gives us flexibility the real numbers lack. Real numbers sometimes fall down on the job when you're trying to solve polynomial equations. But once you admit i as a square root of -1, every polynomial equation has a full set of complex roots. Geometrically, this lets us access points on the complex plane through simple multiplication, without having to rely on more cumbersome machinery.

Okay, finding polynomial roots probably sounds pretty boring, so we're not going to dwell on that. We'll mostly think in terms of complex rotation and how that permits us to peek into weird, non-Euclidean spaces where up and down no longer work the way they should. But know that in the background, these imaginary roots are letting us do a bunch of linear algebra by providing solutions to otherwise unsolvable equations.

We'll begin with a spin...

Let's turn back to physics. Explaining how the properties of chemical elements—the gregariousness of carbon, the aloofness of neon—arise from quantum mechanics goes like this: the protons and neutrons of an atom are squeezed into a tiny nucleus while the electrons whizz by in concentric orbital shells. How “filled” the outermost shell is (mostly) determines the chemical properties of an element. So whatever keeps these negative nancies from clumping together is responsible for, well, basically all macroscopic structure.

The culprit is the Pauli exclusion principle, which says that no two particles with half-integer spin (such as electrons) can occupy the same quantum state. Spin is intrinsic angular momentum, measured in units of ħ. If you measure the spin of an electron along some axis, you get either +1/2 (referred to as spin up) or -1/2 (upside down—spin down), with no other possible outcomes.

To keep track of the spin state of an electron, we can write a wave function that looks like this:

|↑⟩

Flip the electron upside down and the spin state is:

|↓⟩

Then flip it back right side up and you get:

-|↑⟩

Wait, what? We seem to have gained a minus sign somehow. In fact, you have to rotate an electron a full 720° to cycle back to the state you started with. The minus sign doesn't matter much in measurement, because anything we observe in quantum mechanics involves the square of the wave function, but its presence in the math is pivotal.

Say a transporter accident duplicates Kirk and the two end up fighting.

Credit: Paramount Pictures and/or CBS Studios
There’s a brawl, both men lose their shirts, and one emerges victorious. How does Spock tell if the original Kirk won or lost? If Kirk is a subatomic particle, we’re left with two possible states that look the same when measured. Either original Kirk wins and duplicate Kirk loses:

|W⟩|L⟩

Or vice versa:

|L⟩|W⟩

Each one will scream, "Spock... it’s... me!" but there's no evil mustache to differentiate them. With identical quantum particles, this symmetry of exchange is mathematically equivalent to taking one particle and flipping it around 360°; in both cases you end up with observationally indistinguishable states.

But there are still two outcomes. Whenever we're dealing with multiple possibilities in quantum mechanics, it's time for you-know-who and his poor cat. Just as the cat can be in a superposition of alive and dead, a Kirk particle can be in a superposition of winning and losing.

Nothing weird happens when you mix and match bosons (particles with integer spin like photons). They exchange symmetrically and their superposition looks like this:

|W⟩|L⟩ + |L⟩|W⟩

But electrons (and other half-integer fermions) are antisymmetric; a 360° flip gives us that minus sign. So their superposition is:

|W⟩|L⟩ - |L⟩|W⟩

Put both particles into the same state and the two terms of this expression become identical; subtracting one from the other equals 0. Any place where the wave function is 0, we have a 0% chance of finding a particle. So two electrons will never end up in a fight in the first place. (Kirk, then, is clearly a boson.) Replace "fight" with "spin up state in the 1s shell of a hydrogen atom" and you've got the beginnings of chemistry and matter.
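You can watch the cancellation happen with a little linear algebra. Here's a minimal sketch in Python with NumPy, treating |W⟩ and |L⟩ as basis vectors and building two-particle states with the Kronecker (tensor) product:

```python
import numpy as np

# |W> and |L> as basis vectors of a two-state "Kirk particle".
W = np.array([1.0, 0.0])
L = np.array([0.0, 1.0])

def antisym(a, b):
    """Antisymmetric (fermionic) combination of a two-particle state."""
    return np.kron(a, b) - np.kron(b, a)

print(antisym(W, L))  # nonzero: different states can coexist
print(antisym(W, W))  # all zeros: identical states cancel out entirely
```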

What we'll see will defy explanation...

Okay, so how do we make sense of the weird minus sign a rotated electron acquires? This perplexing behavior originates with their 1/2 spin, which we can only understand if we venture back into the world of imaginary numbers, to a place called Hilbert space.

Physicists discovered that electrons were spin-1/2 as a result of the Stern-Gerlach experiment, in which Stern and Gerlach sent silver atoms (and their attendant electrons) through a magnetic field. Spin up particles were deflected in one direction, spin down particles in the other. That there were only two possible values along a given axis was weird enough, but follow-up experiments revealed even stranger behavior.

By Theresa Knott from en.wikipedia - Own work, CC BY-SA 3.0, Link
If you collect all the |↑⟩ electrons and send them through another S-G apparatus, only |↑⟩ electrons come through. You're giving me a look, I can tell; what's weird about that? Well, we're still dealing with quantum mechanics, so we always have to consider superposition. Maybe the state after detection is |↑⟩ + |↓⟩ and there's a chance one will come out |↓⟩.

Experiment says no. This is a little weird. It means +1/2 spin doesn't overlap at all with -1/2 spin (positively or negatively). That should only be the case for vectors at right angles to each other. Somehow, these up and down arrows behave as if they're orthogonal.

Say we've been measuring spin along the z-axis until now. We can set up a second S-G apparatus that measures along x (or y) and then send |↑z⟩ electrons through that. The z- and x-axes are at right angles, so there should definitely be no overlap. But electrons are capricious; they split evenly between |↑x⟩ and |↓x⟩, even though an arrow only pointing up clearly has no component in any other direction.

A pattern is emerging here. The 180° separation between |↑⟩ and |↓⟩ acts like a right angle. Right angles act like they’re only separated by 45°. And a full 360° rotation just turns a vector backward, giving it the minus sign at the center of all this. All our angles are halved. The space electrons inhabit is weird, as if someone tried to grab hold of all the axes and pull them together like a bouquet of flowers.

Try to imagine that if you can, but don't worry if you can't; we're not describing a Euclidean space. You can sort of squeeze the z- and x-axes closer together, but any attempt to bring the y-axis in while also maintaining the 90° separation between any up and down and 45° separation between any right angle just won't work.

The only way we can fit the y-axis in there is to deploy a new degree of rotation distinct from Euclidean directions. That sounds like a job for the complex plane. In fact, our inability to properly imagine this space is directly analogous to not being able to find real roots for a system of equations, which as we know is where complex numbers shine. Vectors that are too close in real space can be rotated away from each other in complex space to give us the properties we need.

From this mathematical curiosity—a space where rotation and orthogonality are governed by complex numbers—we find an accurate description of the subatomic particles that serve as matter's scaffolding. Electrons are best thought of not as tiny, spinning balls of charge but as wave functions rotating through a complex 2D vector space.
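As a minimal sketch of that 720° behavior, here's the standard z-axis rotation operator for a spin-1/2 state in Python; the half-angles in the exponents are the complex machinery at work:

```python
import numpy as np

def rotate_spinor(theta):
    """Rotate a spin-1/2 state about the z-axis by theta radians.
    Note the half-angles: the spinor picks up phases exp(-/+ i*theta/2)."""
    return np.array([[np.exp(-1j * theta / 2), 0],
                     [0, np.exp(1j * theta / 2)]])

up = np.array([1, 0])  # |up> along z

print(rotate_spinor(2 * np.pi) @ up)  # [-1, 0]: a 360° turn flips the sign
print(rotate_spinor(4 * np.pi) @ up)  # [ 1, 0]: 720° restores the original
```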

So what does it mean to have 3i apples? Nothing. But then, what does it mean to have 3 apple juice? Not every number pairs sensibly with every thing. The physical reality of complex numbers only manifests at the quantum level. To many philosophers, this indispensable presence demands ontological commitment. This is a way of saying, "Well, I guess if anything is real, that is." And how are we to say otherwise? Complex numbers might come from a world of pure imagination, but they're necessary for describing this world; shouldn't that count for something?

Credit: Warner Bros. for this picture and the song lyrics.

Wednesday, October 11, 2017

The United Federation of Paradox

In Star Trek, the Federation is a post-capitalist utopia where citizens act out of a desire to better themselves or civilization rather than attain monetary wealth. It's not entirely clear how this utopia came about, but we're often told humanity transcended its violent, greedy impulses through cultural evolution. A more cynical view is that the advent of replicators eliminated most scarcity, and with it any need to be violent or greedy.

I would like to offer an alternative hypothesis: Every episode, Starfleet ships employ technology that permits time travel, so the Federation should be able to Seven Days its way out of any mistake on the path to utopia. You see, the physics of the 20th century—special relativity—tells us that any method of FTL (whether warp drive, subspace communication, or galactic spore network) is also a method of time travel. FTL permits time travel because reality has no rigid, universal stage on which all events play out. Instead, space, time, and the events that occupy space and time are all linked together by a consistent set of interrelationships.

Galileo made this argument while trying to convince others that a spinning, moving Earth wouldn't throw everything out of whack. What we now call Galilean relativity says the laws of motion don't depend on your (inertial) frame of reference. A frame of reference is just a perspective from which to observe the universe. If you're sitting in a chair reading this, you and the chair constitute a frame; if you're hurtling through interstellar space (at a constant speed) in a starship, that's another frame. Also, go you.

Galilean relativity means that as long as you occupy an inertial frame, you never notice anything funny that doesn't accord with the laws of motion. Whether you're in a turbolift or a shuttlecraft, if your velocity is constant, a tossed ball will land where you expect it and you won't feel any mysterious forces pushing on you. The upshot is that no frame of reference is privileged or reflects what's "really" happening. All are equally valid.

The tricky part is translating one reference frame to another. Walking down the aisle of a plane, everyone on the plane can treat you as moving only a few miles per hour. Everyone on the ground, however, needs a way to combine your velocity and the plane's. This feat is accomplished via a transformation, which is just a mathematical tool for moving between reference frames. In Galilean relativity, that transformation is easy and basically commonsense: to an observer on the ground, your speed = plane speed + walking speed.

It is these transformations—which spell out equally valid and consistent ways of interpreting reality from different frames of reference—that allow for time travel. To see how, we have to move from Galilean relativity to Einstein's special relativity.

Special relativity is a generalization of the Galilean variety. There are two postulates that end up having deep consequences:

(1) The laws of physics don't depend on your frame of reference.

This is an expansion of Galileo's rules to include electromagnetism.

(2) The speed of light (c) is a law of physics.

This postulate is implicitly included in the first one, because Maxwell's equations for electromagnetism predict a speed of light. It's the revolutionary part in all this, though, so Einstein spelled it out explicitly.

By itself, a law that dictates a speed is not terribly noteworthy. Any wave equation specifies the speed at which the wave travels. We usually think of waves as traveling through a medium, in which case Galilean relativity might apply. To an outside observer, the total wave speed = medium speed + wave equation speed. Physicists assumed this applied to light as well and proposed a luminiferous aether to serve as a reference frame and medium.

The trouble was, the properties required by a luminiferous aether (given how light behaved) seemed ludicrous and unphysical, and when measured, c always seemed to be the same. Additionally, and famously, the Michelson-Morley experiment failed to detect any sign of the aether. The alternative, according to Einstein, is that c is not defined relative to a frame of reference; instead, the speed of light is a law of physics and the same for all inertial observers.

But this violates the rules of the Galilean transformation, because it means you can't add velocities when light is involved. If a Klingon runs at you firing a laser pistol (canon in some of the TOS era), Galileo says the laser's speed = Klingon running speed + c. Einstein says the speed is always only c, for both you and the Klingon. And that means we need a new transformation that is, as before, equally valid and consistent for all inertial frames of reference. For special relativity, that's called the Lorentz transformation.
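Here's a rough sketch of the two addition rules side by side (the Klingon's running speed is, of course, made up):

```python
C = 299_792_458.0  # speed of light, m/s

def galileo(u, v):
    """Galilean velocity addition: speeds simply add."""
    return u + v

def einstein(u, v):
    """Relativistic velocity addition, from the Lorentz transformation."""
    return (u + v) / (1 + u * v / C**2)

klingon = 5.0  # a charging Klingon, m/s
print(galileo(klingon, C))   # c + 5 m/s: faster than light (wrong)
print(einstein(klingon, C))  # exactly c, no matter how fast the Klingon runs
```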

Rather than just show you the Lorentz transformation (it involves c and some square roots and reduces to the Galilean transformation at everyday speeds), I want to provide a visual explanation for how all observers can measure the same c. Memory Alpha says Vulcan is 16 light years from Earth. So let's imagine there's a starbase between the two planets, 8 light years from each. If the starbase emits a radio signal traveling at c, it reaches both Earth and Vulcan 8 years later. How do we represent this graphically?

Credit: Paramount/CBS for the Trek stuff and NASA for the Earth stuff.
The x-axis (horizontal) is distance in light years and the t-axis (vertical) is time in years. If our reference frame is the starbase and the planets are not moving relative to it, then they move upward in time without moving left or right through space. The radio signals, on the other hand, move 1 light year per year, so they travel 45 degrees out from the starbase. Where the radio signal and the world line of a planet intersect is the location in spacetime (at the planet, 8 years in the planet's future) where the signal reaches the planet.

Now let's say the Enterprise is at the starbase and starts heading toward Vulcan at sublight impulse speeds. What does that look like?

Credit: Paramount/CBS for the Trek stuff and NASA for the Earth stuff.
Because impulse is slower than light, its path is tilted more toward the vertical than the radio signal; more time is required to go the same distance. Since we’re dealing with special relativity, there is an inertial reference frame following along with the Enterprise, and from that frame we have to measure the same c. According to the graph, this doesn't seem possible. It sure looks like the radio signal hasn’t gotten as far away from the Enterprise as it has from the starbase (horizontal distance) in the same amount of time (vertical distance).

So here's where we need to perform a coordinate transformation that takes us from the reference frame of the starbase to the reference frame of the Enterprise. For a frame centered on one inertial object, the object's position doesn't change in time. For the starbase, that means its path through spacetime follows the vertical—or time—axis. So then let's define a new time axis (t') for the Enterprise which follows its diagonal path. If c is the same in all reference frames, that means we also need a new space axis (x'), which has the same angular separation from the radio signal as t’.

Credit: Paramount/CBS for the Trek stuff and NASA for the Earth stuff.
Because x' and t' are tilted toward the radio signal by the same amount, the signal still moves 1 light year per year in this new reference frame; the ratio doesn't change. This has weird consequences, though. For starters, reconciling a constant c seems to have involved squishing space and time together. But it gets worse.

In the starbase reference frame, lines parallel to the x-axis are single moments in time. Any event on such a parallel line happens simultaneously for all observers sharing that frame. For the Enterprise frame, simultaneous events happen on lines parallel to the x' axis, which is a diagonal line that cuts through time in the starbase frame. This means events that are simultaneous in the Enterprise frame happen at different times for observers in the starbase frame, and vice versa.

For example, if you draw a line parallel to the x'-axis through the moment when the radio signal reaches Vulcan, you see that the event of the signal reaching Earth is ahead of that line; it happens later in the Enterprise's frame, despite the two planets being equidistant from the starbase. This is (a) the relativity of simultaneity, (b) patently ridiculous, (c) absolutely true, and (d) the feature we want to exploit to travel through time and create a problem-free utopia.

Normally (in special relativity), observers disagreeing on the order of events doesn't matter. If observers are limited to light speed or less, by the time they're able to meet up and discuss the discrepancies, all the events they disagree about are in everybody's past. FTL lets you circumvent this restriction.

So here's how to resolve every 42-minute Star Trek plot in 3 easy steps. The scenario presented here is set up for graphical simplicity; it smooths over a few wrinkles and might not perfectly align with Star Trek technology. (Then again, neither does Star Trek technology.)

Step 1: A space-ooze-energy monster attacks the Defiant, but it turns out the creature is just misunderstood. To restock on redshirts, Worf activates the Lorentz Protocol! Via subspace, the Defiant sends a message to Deep Space Nine.

Credit: Paramount/CBS
If subspace communication is instantaneous (which it looks close enough to being in most episodes), then Worf just finds the Bajoran system along the x-axis and puts the message there. Because no time passes, the message arrives along the x-axis.

Step 2: On DS9, Sisko gives the message to O'Brien, who hops into a runabout and flies away from the Defiant at impulse (some speed close to c).

Credit: Paramount/CBS
In our diagram, we're now switching to the runabout's moving reference frame. Its speed relative to the Defiant establishes a new frame of reference.

Step 3: The runabout sends a warning about the interdimensional slug to the Defiant's location in space via subspace.

Credit: Paramount/CBS
Because we are in a new reference frame moving relative to the Defiant, an "instantaneous" subspace message no longer appears somewhere on the horizontal line but along the runabout's x'-axis, which intersects the Defiant's spacetime location in its past.

Ultimately, the speed of the runabout and its distance from the Defiant determine, via a pretty simple triangle, how far into the Defiant's past the subspace warning goes. Arrange things correctly and Worf gets the warning before ever running into the crystalline spider-snake.
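As a back-of-the-envelope sketch, ignoring the time the first two steps take: the runabout's x'-axis has slope v/c² in the Defiant's frame, so an "instantaneous" signal sent back across a distance x lands about v·x/c² years early.

```python
C = 1.0    # work in units where c = 1 (years and light years)
v = 0.6    # runabout's impulse speed, as a fraction of c (made up)
x = 8.0    # runabout's distance from the Defiant, light years (made up)

dt_past = v * x / C**2   # tilt of the x'-axis times the distance
print(f"The warning arrives {dt_past:.1f} years in the Defiant's past")
```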

But of course, now Worf's gone and killed his own grandfather (who he may have been?—time travel!). That is, if he receives the warning before sending out the message to request a warning, then he avoids the cybernetic mind worm attack and never needs to send out a message in the first place. Paradox!

This is the central reason why physicists think FTL communication or travel is a non-starter. Other aspects of special relativity prohibit accelerating up to c, but nothing in the theory rules out processes that are faster than light from the start. They do, however, invariably lead to issues with causality.

There's a saying about this. Pick two: special relativity, FTL, or causality.

As we've just seen, special relativity + FTL means you lose a coherent narrative leading from the past to the future. You can preserve causality with FTL but only if you abandon the rules of special relativity. Or you can live in the universe we seem to inhabit, which has relativity and causality but loses all that FTL fun.

Of course, when asked to pick two, Star Trek usually just picks one: FTL. Most time travel stories in Trek are rife with causality issues that are usually intentionally ignored, except by having characters say things like, "Oh yeah, I totally flunked temporal mechanics at Starfleet Academy, haha!" And relativity is almost entirely absent; there's rarely any mention of time dilation or length contraction or all the other wacky things that happen when you get close to c.

Nevertheless, the United Federation of Planets is a utopia, and it must have gotten there somehow... or will get there... or will have already gotten there. (Oh boy. Consult Dr. Streetmentioner's book for tense corrections.) Or maybe not—after all, utopia does mean no-place.

Wednesday, July 5, 2017

From the Earth to the Moon

I recently finished reading The Birth of a New Physics, by I. Bernard Cohen, which describes the 17th century transition from Aristotelian to Newtonian physics. This reminded me of a demonstration I did for my astronomy sections last semester, in which I tried to impress them with the power of Newtonian unification. (It didn't work.) And yesterday was the day we celebrate projectile motion, so that's as good an excuse as any to revisit the topic.

As I mentioned in my last post, I think we suffer from presentism that makes it difficult for us to understand how our predecessors saw the world. To remedy that, I've been reading a lot of history of science recently; I want to understand the role that science has played in changing our conception of the world.

When reading history of science, I sometimes struggle with the seemingly glacial pace of scientific advances that I, with my present level of education, can work out in a few lines. I am no genius, so why did it take humanity's greatest scientific minds generations to find the same solutions? The answer is these solutions originally required deep conceptual shifts that for me—thanks to the work of those scientists—are now completely in the background. Here's an example that I think simultaneously demonstrates the power of Newtonian analysis and the elusiveness of the modern scientific perspective.

Aristotelian physics held that everything from the moon up moved only in circles and was perfect and unchanging, while everything below the moon was imperfect, impermanent, and either drawn toward or away from the center of the universe. The critical thing is that the motion of objects on earth—projectiles, boats, apples—operated according to fundamentally different rules than the motion of stars, planets, and other celestial objects.

What Newton did was to show the same rules apply everywhere, to everything. His laws of motion and gravity work for cannon balls, birds, the moon, and even once-in-a-lifetime comets. This is where our presentism hurts us, because that radical idea seems completely obvious now. Of course physics underlies both airplanes and space probes. Duh.

In the abstract, that's an easy case to make. But the demonstration I did in class, which is a modern-ish take on an analysis Newton himself performed, might be able to show how cool and counterintuitive this unification really is.

Consider this: if you drop a rock from a given height and time its descent, you can explain why a month is roughly 30 days long. These two facts seem completely unrelated but turn out to be connected by a simple law.

Aristotelian physics says that heavy objects are naturally drawn toward the center of the universe and that the celestial moon naturally moves about the Earth in a perfect circle. But even ignoring the Aristotelian perspective, from our modern vantage the link between these two facts seems kind of incredible. We have some vague idea that the length of a month is connected to the cycles of the moon, and we know that gravity makes rocks fall, but the moon is clearly not falling and rocks have nothing to do with calendars; so how are these facts related?

Now, I'm not shocking anybody by saying that gravity is the common factor, but I want to show you how relatively simple it is to work this out using the tools Newton gave us.

Newton's law of universal gravitation says that gravity is an inverse square force. In fact, other scientists before Newton (Kepler, Hooke) had suggested this. It was known that the intensity of light falls off with the square of distance; maybe the same principle worked for gravity, too. Force is proportional to acceleration, so you can measure it by timing falling objects (or the period of a pendulum, which was the most precise method available in Newton's time). At the surface of the earth, this acceleration is 9.8 m/s² and usually denoted with a g.

If the earth is also pulling on the moon, and gravity is an inverse square law, we can find out how much earth's gravity is accelerating the moon. Divide the distance to the moon by the radius of the earth (figures known since the ancient Greeks), square the result, and that's how much weaker gravity's action on the moon is.

The distance to the moon is about 60 times the radius of the earth, so earth’s gravity pulls on the moon with 1/3600 the strength that it pulls on a rock near the surface. But even so, shouldn't the moon be here by now? It's obvious that the moon is circling the earth and not slamming into us.

What we need here is another law. We see circular motion on earth, too. Imagine tying a string to a rock and spinning the rock around. What keeps the rock moving in a circle? The string, which is taut. The string pulls on the rock so that it doesn't go flying off. But if the string is pulling the rock inward, why doesn't the rock come inward toward your finger? Well, imagine slowing down the spin rate of the rock. Do that and the whole thing will fall limp. There is a specific speed required to keep the string taut. In fact, if you spin too fast, the string will break and the rock will fly off.

So here's the law. When considering circular motion, inward (centripetal) acceleration is equal to the square of the spin rate (angular velocity) times the radius. The faster you spin the rock, the harder the string needs to pull on it to keep it from flying off.

If we assume the moon is going around the earth in a perfect circle, and we suppose that gravity is pulling it inward at 1/3600 the strength it does on earth's surface, then we can figure out the moon's spin rate (around the earth), too. A little algebra gets us this formula:

ω = √(g / (60³ re))
re is the radius of the earth. The angular velocity ω is how many radians per second the moon moves. To figure out how many seconds it takes to make a single orbit, you basically just flip the expression upside down and multiply by 2π to get a full circle. That gives you:

t = 2π √(60³ re / g)
Plug in the right numbers (re = 6378 km, g = 9.8 m/s²) and you arrive at a t of about 2.35 million seconds, which comes out to roughly 27.3 days (the sidereal period).
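If you'd like to check the arithmetic, the whole calculation fits in a few lines of Python:

```python
import math

g  = 9.8       # surface gravity, m/s^2
re = 6.378e6   # radius of the earth, m

# Gravity at the moon is g/3600; set that equal to the centripetal
# acceleration w^2 * (60 * re) and solve for the period t = 2*pi/w.
t = 2 * math.pi * math.sqrt(60**3 * re / g)
print(t / 86400)   # ~27.3 days: the sidereal month
```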

This is a couple days off from 29.5 days, which is how long it takes the moon to go through a complete set of phases (the synodic period). The difference is due to the fact that after those 27.3 days, the earth has also moved about 1/13 of the way around the sun, changing where the sun is in the sky. Because the phase of the moon arises from its position relative to the sun, it takes the moon a couple more days to catch up with the sun’s new position.

Those complications aside, the ease with which you can find the moon's sidereal period from a measurement of surface gravity is both stunning and surprising. The calculation is literally only a few lines long. Here, look for yourself:

Credit: Me me me
I'm not showing you this to impress you with my mathematical talent, but to bring you back to my initial perplexity. Why did it require an intellectual titan such as Newton to figure this out? That is, what conceptual leaps were necessary? I don't know that I can answer that question completely, but here's a partial explanation that comes in large part from Cohen's book.

First of all, as I've said, Newton had the creativity and imagination to suggest a unified physics at all. Others at the time were formulating laws that applied to the heavens (Kepler's laws of planetary motion) and even physical mechanisms by which the planets moved (Descartes' vortices), but none imagined that a single law lay behind falling apples, the tides, planetary orbits, the moon's phases, the movement of Jupiter's satellites, and the orbits of comets.

Furthermore, Newton's laws of motion serve as a starting point for conceptualizing the moon's orbit. Aristotelian physics held that circular motion was perfect because celestial objects could return to their starting point indefinitely, continuing the motion for all eternity. Circular motion required no further explanation.

But Newton's first law says that objects have inertia, that they will continue in straight lines (or remain motionless) unless acted on by an outside force. This law isn't a formula but a tool for analysis. If you assume it is true, then you can look at any physics problem and immediately identify where the forces are. Thus, we can look at the moon, see that it is not moving in a straight line, and conclude there must be some force acting on it.

As I mentioned before, others had already proposed an inverse square law to explain gravity. Simply writing down the law of universal gravitation was not Newton's accomplishment. Instead, what Newton did was to prove mathematically that a body obeying Kepler's laws of planetary motion must be acted on by an inverse square force, and, conversely, that an inverse square force will always produce orbits that resemble the conic sections (circles, ellipses, parabolas, or hyperbolas).

The proof Newton develops is heavily geometrical and begins by looking at an object moving freely through space that is periodically pushed toward a central focus. Newton then reduces the time between impulses until the force becomes continuous and the orbit, which began as a gangly polygon, curves into an ellipse. The important aspect here is there are two components to an orbiting body's motion: a central force acceleration and a velocity tangent to that acceleration.

What this means is the moon is falling toward the earth just as surely as an apple is. The difference is the moon is also moving in another direction so quickly that it continually misses the earth. This is what it means to orbit. As Douglas Adams said, "There is an art to flying, or rather a knack. The knack lies in learning how to throw yourself at the ground and miss."

Credit: Newton Newton Newton
All this groundwork (and more) was necessary so that Newton could justify a key step in those few lines of math I showed you up above. (I should point out that Newton's work didn't look anything like mine, because the notation and norms of math were very different back then.) The key step is that I equate the moon's acceleration due to gravity (am) with the centripetal acceleration of uniform circular motion (ac). While the units are the same, a priori there's no reason to think the two are related.

Without a mathematical and physical framework detailing how mass, force, and gravity interact, equating those two conceptions of acceleration is nothing more than taking a wild guess. And if you're guessing, that means there are probably plenty of other guesses you could have made as well. This is what our presentism—replete with all the right guesses—hides from us. At each moment when a scientist does what comes naturally to us now, they had innumerable other options before them. The achingly slow pace of scientific discovery, then, is a result of all the frameworks and ideas and theories leading to those other guesses, equally valid a priori, that turned out not to be right.

As I've written before, in physics it is sometimes easy to guess the right answer. What I hope this post does is demonstrate that guessing—that moment of eureka when the correct answer finally materializes—is only the proverbial tip of the iceberg when it comes to science. This is important to remember when you think you’ve been struck by inspiration and arrived at a brilliant new truth about... whatever. Our popular conception of history valorizes those moments, but a fuller understanding of history vindicates the slow, haphazard, incremental work that must come first. If that work isn’t there, maybe your new truth isn’t, either.

Wednesday, April 12, 2017

The Pale Blue Discourse

By sheer coincidence, xkcd recently did a comic on why the sky is blue at about the same time the astronomy class I TA got to its unit on light and optics.

Credit: xkcd
The Wednesday before that comic appeared, I led a discussion in which I explained why, in fact, the sky is blue. The comic argues against starting out with Rayleigh scattering because, essentially, that's just a fancy name for the specific reason the sky is blue, when the general reason is just that things are the color they are because they reflect that color.

I agree with this argument on one level, and one of the reasons I mentioned the sky's blueness in discussion is because it's an example of one of the three broad reasons why an object is a particular color (reflection/absorption, spectral lines, and thermal radiation). But I also mentioned the blue of the sky because Rayleigh scattering is interesting in a couple ways.

First of all, one way to think about the color of the sky is instead to think about the color of the sun. Sunlight is white (composed of all the colors in the visible spectrum), yet the sun is yellow. Why? Because Rayleigh scattering scatters some wavelengths (blue) more than others (red). The result is that wherever you look, you're looking at the sun; it just depends on whether or not the sun's photons had to bounce around a few times before they got to your eyes (and consequently look like they're coming from somewhere other than the sun).

The second reason I brought up Rayleigh scattering is that, for most objects that are a particular color by dint of reflection, the explanation for why is both complicated (a specific configuration of quantum mechanical energy levels) and unilluminating (it just worked out that way). By contrast, Rayleigh scattering is one of the few instances where the explanation is fairly simple and clear. We can see the process at work throughout the day. Shorter wavelengths of light scatter away as they pass through air. The more air they pass through, the more they scatter. This is why sunsets and sunrises are particularly red: the sunlight is moving through more atmosphere (because the sun is not just straight up), and the blue light has a lot of opportunities to get lost along the way.
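The arithmetic behind that is short enough to show. Rayleigh scattering goes as the inverse fourth power of wavelength, so, taking rough values of 450 nm for blue light and 650 nm for red:

```python
# Rayleigh scattering strength scales as 1/wavelength^4.
blue, red = 450.0, 650.0      # wavelengths in nanometers
print((red / blue) ** 4)      # ~4.3: blue scatters about four times as much
```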

But ultimately, xkcd is right that blue is just the color of air, as long as we want to think of color as a property of an object. And why wouldn't we? Well, we can engage in some fun-sucking reductionism by pointing out there is no blueness contained within air, just as there is no greenness contained within leaves. Color arises out of an object's interaction with light and eyes, and it just so happens that a particular interaction involving the sky produces blue. Many philosophers will want to push back against this kind of reductionism by saying, well, okay, then that's just what we mean by the property of blueness: being so configured that interaction with light and eyes produces the subjective experience of blue.

This is a common theme in analytic philosophy. Science has a tendency to unravel our everyday notions by telling us things like, no, we don't really ever touch an object; it's just the electric forces of our skin interacting with the electric forces of the couch. But philosophers balk at this by arguing that we clearly successfully communicate something when we say that, for example, humans have touched the surface of the moon. So let it be that what touching really means is... you get the idea.

But then what does it really mean to say that an object is blue, if blueness is a property that arises only through interaction? Well let's do a little thought experiment. Imagine that one of those TRAPPIST-1 worlds—tidally locked into its orbit around a cool red dwarf—has an atmosphere just like ours. On tidally locked worlds, the sun never rises or sets. One half of the planet is always facing the sun, while the other half never sees it. This could lead to a situation (although an atmosphere probably helps to mitigate it) where one half is a blasted hell hole and the other is a frozen wasteland. Consequently, many scientists and SF authors have imagined life arising only in a narrow strip of twilight at the terminator between night and day. There, the temperature might be just right for life. With a cool red sun (meaning much less blue light to start with) always on the horizon, a sky such as ours might always be some shade of red.

Credit: ESO
Nevertheless, scientific-minded aliens in the twilight might eventually learn the composition of the atmosphere, learn about Rayleigh scattering, and come up with a neat science fact: you know, if you were to shine an enormous amount of white light through our atmosphere, it would appear blue. But is that a good reason to say that the atmosphere is, in fact, blue?

Let's go a step further. Say that the general lack of short wavelength light means that these aliens' eyes never evolved sensitivity to blue light at all. Again, they could perform experiments and develop a theory of optics, but there's no situation in which they would describe the sky as blue, because they have no concept of blue at all.

However, blue-seeing humans are only 40 light years away, so we might someday travel there and explain the reality to them. We might say, your sky looks red, but that is only an illusion. If your eyes were sensitive to short wavelength light, and your planet were not tidally locked, and your star were luminous enough to shine brightly across the specific range of 400-700 nanometers, then you'd see that in reality your sky is, in fact, blue. The aliens would twirl their fuzzy tentacles in derision and laughter, as aliens are wont to do.

Now you might object here and say that we have plenty of names for things we don't have direct subjective experience of. For example, we've labeled the rest of the electromagnetic spectrum, from gamma rays on up to radio waves, even though we only have access to a tiny bit of that spectrum. And that's true enough, but we wouldn't say that the color of an object is x-ray. There might be some property there, but it's not color.

Okay, but let's turn the tables around here. Maybe TRAPPIST aliens are sensitive to infrared light and have a whole host of specific names for the wavelengths they subjectively experience in that range. That sounds a lot like color, too, and it seems anthropocentric of us to deny them their infrared colors. So we can say that blue is a human (or Earth creature) color and that an object is that color when it reflects light in a particular range of wavelengths. That's what color is: the subjective experience of a particular wavelength of light.

But then the aliens might ask, so what's the wavelength of this "brown" color you humans are always talking about? Brown does not have a wavelength; it doesn't show up in the rainbow. Brown is a color humans experience because our perception of color is based on more than just wavelength; it also includes contrast levels and overall brightness. Brown only shows up when something with a red or yellow wavelength is dim compared to what’s next to it.

Purple, too, is not a "real" color by the rough definition given above. It is not composed of a single wavelength but multiple wavelengths that our brains interpret as a single color. Why? Because we don't actually have perfect, exact wavelength detectors in our eyes. Instead, we have three different kinds of cones (photoreceptor cells) that absorb light in three ranges of wavelengths that overlap a bit.

Credit: Vanessaezekowitz at Wikipedia
Our brain figures out what color we're seeing not by identifying a particular wavelength but by adding up how much each type of cone has been stimulated. When a blue cone starts firing more than the rest, our brain will interpret that as seeing blue. But we don't have purple cones. Instead, the human brain has made up the color purple for those situations when our blue and red cones are firing at equal rates.

So what do we say when the aliens ask what it means for something to be purple? Oh, an object is purple when it reflects both short wavelength and long wavelength visible light in a situation where creatures evolved to pick out that combination as signifying something distinctive. Ah, yes, of course.

All of this is not to say that there's no such thing as color, or that trees aren't brown. Again, it does no one any good to object to every statement about the color of an object by saying, "Well actually, leaves absorb everything but green!" So yes, the sky is blue because air is blue. That is a perfectly fine answer that conveys an important aspect of what color is all about. But that important aspect might not be that color depends on reflection; rather, it might be that the idiosyncratic history of our sun, our planet, and our species have led to the subjective experience of color.

Thursday, March 17, 2016

On Guessing

This is a follow-up to my Lagrange point post. At the end, I briefly mentioned the L4/L5 Lagrange points, which are stable and form equilateral triangles with the masses of a three-body system. I'd like to delve into the physics of these points a bit to illustrate something about how physicists solve problems.

That is, physicists (in general) do not like doing calculations. They don't want to sit around all day crunching numbers to arrive at an answer. When you solve a physics problem, the goal is to build as simple a model as possible that captures the essential features of what you're studying. (This is where the spherical cow jokes come in.) That way, if you're lucky, you can avoid having to do a lot of math. Instead you can arrive at the answer you want by symmetry, or dimensional analysis, or guessing.

Guessing is an important part of the physicist's toolkit and some of what makes doing these problems fun (for me, at least). It's easy to stare at a problem for hours and feel overwhelmed by the complexity of it. I liken this to how it feels when you've just begun to write something. You have a blank screen and a blinking cursor in front of you and there's nothing more terrifying or paralyzing.

In writing, sometimes the solution is to just start writing and see where the story takes you. And so it follows with physics. If you have a complex problem, at times the best strategy is to just guess at the answer and see where the physics takes you. In this way, doing physics can be a lot like playing a game or solving a puzzle. It's fun, and I seriously wouldn't still be in school if I thought otherwise.

So let's return to the L4/L5 Lagrange points. In class, when discussing the three-body problem, our professor performed enough derivation to get us to believe that stable orbits can exist. He went through the same argument I used about rotating frames and centrifugal force. So a test mass is in a stable orbit when gravity and centrifugal force cancel out. He then gave us the punch line, telling us where the Lagrange points are, but didn't go through the math of actually finding them. Why not? Because if you do the derivation, the equations of motion you end up having to solve (in the rotating frame, with the stars at x₁ and x₂ and at distances r₁ and r₂ from the test mass) are:

ẍ − 2ωẏ = ω²x − Gm₁(x − x₁)/r₁³ − Gm₂(x − x₂)/r₂³
ÿ + 2ωẋ = ω²y − Gm₁y/r₁³ − Gm₂y/r₂³

I should probably credit Massimo Ricotti for this.
I'm not going to attempt to explain what all that means. It's ugly, and you wouldn't want to solve that unless you had no other choice. But there is another way. Our professor mentioned that when thinking about the 5 Lagrange points, you can guess where 2 of them (L4/L5) must be.

This intrigued me, which is why we're here today. What makes it possible to guess these locations? As we saw with the L2 point, its exact location is related to the cube root of the ratio between the two big masses. This is (probably) not something you could just pull out of thin air. But that's not the case for L4 and L5. The location of one of these points is at the vertex of an equilateral triangle that has the two large masses at the other vertices. Flip this triangle over and you get the other one. How massive the objects are isn't relevant at all; distance is the only important variable (and two masses can basically orbit each other at any distance they like). So you could conceivably guess the answer just by looking at the problem.

There are a lot more MS Paint illustrations coming. You've been warned.
But what makes equilateral triangles, as aesthetically pleasing as they are, physically appealing? Let's consider a special case and then move on to a more general scenario.

Forget the Earth-Moon system and consider two stars of equal mass in circular orbits about each other. In that case, the stars are actually orbiting their center of mass, which is halfway between the two for equal mass stars. A third body that's motionless in the rotating frame also orbits the center of mass, which means centrifugal force pushes away from that center. To make the problem even simpler, let's put the third body equidistant from the two stars.

I'm a big fan of purple.
Then the forces of gravity to the left and right cancel out, leaving only gravity pulling down and centrifugal force pushing up. To get our Lagrange point, we just need those forces to balance. This means we have to guess how far up from the center the Lagrange point is.

First, let's consider gravity. The total strength of gravity depends on the inverse square of the distance to the stars, d. But we don't want the total force, only the vertical component. That part is a fraction of the total, and that fraction is equal to h/d. This means gravity now depends on the distance to the center of mass and the inverse cube of the distance to the stars.

On the other hand, centrifugal force depends on the distance to the center of mass, h, and the inverse cube of the distance between the stars, a. Our gravity and centrifugal terms are nearly the same, except one uses a and the other d. But we're trying to find d, so let's just guess that d=a. Then all the lengths of our triangle are equal and we've found a point where all the forces cancel out--a Lagrange point. (This guess works because the constants in each equation are the same. Otherwise, d might just be proportional to a.)
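Here's a quick numerical check of that guess in Python, with everything set to 1 in natural units:

```python
import math

# Equal-mass binary: stars at x = -a/2 and +a/2, test mass at (0, h).
G, m, a = 1.0, 1.0, 1.0
w2 = G * (m + m) / a**3            # orbital rate squared (Kepler's third law)

h = a * math.sqrt(3) / 2           # the guess: equilateral triangle, so d = a
d = math.hypot(a / 2, h)           # actual distance to each star

gravity_down   = 2 * G * m * h / d**3   # vertical pull from both stars
centrifugal_up = w2 * h                 # push away from the center of mass

print(d)                                # 1.0: the triangle really is equilateral
print(gravity_down - centrifugal_up)    # ~0: the forces balance
```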

So there we have it. Using a few reasonable assumptions, a simple model, and nothing more than geometry, we've found the Lagrange points. Where do we go from here? How about back to the Sun-Earth system, where one of the two masses is much, much bigger than the other. If that's the case, then the center of mass moves to the sun, and centrifugal force points directly away from it.

It's a trap!
If we maintain our equilateral triangle guess, where does that leave us? With a problem. The problem is that if you rotate the above picture so that the sun's gravity vector and the centrifugal vector are horizontal, you're left with the Earth's gravity vector at an angle of 60° away from horizontal. This is bad because the "vertical" component of the Earth's gravity isn't balanced by anything else, which means that no matter what values you insert into your equation, there is no equilibrium point. Uh, oh.

But our graph has fooled us here. You see, by moving the center of mass directly on top of the sun, we are implicitly saying that the Earth has no mass whatsoever. And if that's the case, then it has no gravitational force, which means it doesn't need to be counteracted at all. In the limit where the Earth has no mass, the three-body problem reduces to the one-body problem. So there is a point of stability at the equilateral triangle, but also at any point along the same circular orbit.

This wasn't a totally useless exercise, however. It shows us that it's reasonable to expect L4/L5 to be stable from one extreme of equal masses to the other extreme of just one big mass. But we haven't yet proven that the L4/L5 points exist where they do for any arbitrary masses. How do we do that? First, let's make a generic diagram describing the situation.

You made it.
Let's say that Star A has a mass of m and Star B has a mass of km, where k is some fraction between 0 and 1. This means we can vary between the two extremes of equal mass (k=1) and one dominant mass (k=0). The smaller k is, the farther to the left the center of mass moves, the smaller Star B's gravity vector is, and the more horizontal the centrifugal vector gets. This should mean that the forces pointing to the right stay balanced. Additionally, as k gets smaller, there is less overall gravity pointing down, but because the centrifugal force is getting more horizontal, that gravity has less it needs to counteract. So our equilateral triangle still looks good.

To prove the general validity of our guess, let's see what happens if the interior angles are some arbitrary angle, rather than the 60° they must be. We have to compare the combined vertical force of gravity to the vertical centrifugal force. Using trig, we can find the distance from the test mass to a star in terms of a and θ. Because of the inverse square law of gravity, a is going to be squared. Trig also gets us the vertical component of that force in terms of θ.

On the other hand, centrifugal force depends on the distance to the center of mass, l. But because we only want the vertical component, the actual location of the center of mass is irrelevant and all we need is h, which again can be found in terms of a and θ. As before, centrifugal force also depends on the inverse cube of a, so some canceling of exponents means it's the inverse square of a that shows up.

Because both expressions depend on the square of a, we can get rid of it. Both forces are also equally dependent on the sum of the masses of the two stars, so we can cancel the mass terms, too. This means our equation is now defined entirely in terms of θ. After a little algebra, we can arrive at the following equality:

sin(θ) = 1/2

Everything else in our equation is gone. All that matters is the angle between h and d. Now, I just happen to know that the sine of 30° is 1/2. This means the full interior angle is 60°. With our guess that the test mass is halfway between the two stars, the only possibility is an equilateral triangle with interior angles of 60° and lengths of a. (A similar argument can be made for the horizontal components of the forces.)
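If you don't trust the trig, here's a sketch that computes the net force (gravity plus centrifugal) at the equilateral point for a range of mass ratios k:

```python
import numpy as np

def net_force(k, a=1.0, G=1.0, m=1.0):
    """Net force on a test mass at the equilateral-triangle point,
    with Star B's mass equal to k times Star A's."""
    A = np.array([0.0, 0.0])                   # Star A, mass m
    B = np.array([a, 0.0])                     # Star B, mass k*m
    com = (m * A + k * m * B) / (m + k * m)    # center of mass
    w2 = G * m * (1 + k) / a**3                # orbital rate squared
    p = np.array([a / 2, a * np.sqrt(3) / 2])  # the L4 guess

    grav_A = -G * m * (p - A) / np.linalg.norm(p - A)**3
    grav_B = -G * k * m * (p - B) / np.linalg.norm(p - B)**3
    centrifugal = w2 * (p - com)
    return grav_A + grav_B + centrifugal

for k in (1.0, 0.5, 0.1, 0.01):
    print(k, net_force(k))   # ~[0, 0] at every mass ratio
```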

I should note that this doesn't prove that there aren't other Lagrange points forming different triangles when the test mass is not halfway between. To see that there can't be other points of stability (except on the line joining the two stars), you need to solve for the effective potential of the force fields at work in this system. That can't be done by guessing, but it can be done by drawing! Unfortunately, drawing equipotential surfaces would strain my artistic talents past their breaking point. Here's some computer art instead.

Credit: NASA / WMAP Science Team

Sunday, February 14, 2016

The Equivalence Post

About twenty years ago--maybe right around the time LIGO was finally getting funding, when the gravitational waves it just detected were still a couple dozen star systems away--my elementary school class did a living wax museum. We researched a historical figure, dressed up as our subject, and, when a "visitor" to the museum pressed a red dot on our hand, recited a first-person speech based on our research. Unrepentant early nerd that I was, I chose Albert Einstein.

I don't really remember anything about the contents of my monologue. I probably gave a brief biographical sketch, but likely left out the part where Einstein bribed his first wife into divorce with Nobel money he'd yet to receive. I probably talked about the theory of relativity and how it merged space and time, but likely didn't include anything about Riemannian geometry and metric tensors.

My knowledge of the scientist and his science was patchy, to be sure, but that didn't stop me from admiring him. Einstein is the model of the lone genius working tirelessly, using nothing more than the power of his mind to change the world. For a long time, I imagined he and I were equivalent. I imagined that I alone knew the secrets of the universe and that my solitude represented nothing more than the gap in intellect between myself and others.

Before the inevitable deconstruction of that paragraph, let's talk a bit about Einstein the genius. While E=mc² is his most famous equation, it's not the equation that made him famous. Physicists will tell you that general relativity was his crowning achievement.

GR grew out of Einstein's attempt to extend his special theory of relativity to gravity. SR and electromagnetism fit together perfectly, but gravity did not behave. According to Newton, gravity acts instantaneously, and that didn't sit well with light speed being the ultimate limit. To reconcile gravity with relativity, Einstein looked at a subtle difference between the electrostatic force and the force of gravity.

When two charged particles are sitting next to each other, the electrostatic force that one feels is proportional to the product of their charges divided by the square of the distance between them--simple enough. When two masses are sitting next to each other, the gravitational force on one is proportional to the product of their masses divided by the square of the distance between them. The forces are nearly identical, just swapping charge for mass.

But when a particle feels a force, it follows Newton's second law and accelerates by an amount inversely proportional to its mass, which is what inertia is all about. This means the mass term from gravity and the mass term from inertia cancel out and bodies under the force of gravity experience the same acceleration regardless of their masses. We know this; it's just the idea that a hammer and a feather (ignoring air resistance) fall at the same rate.

Thank you, NASA.
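In code, the cancellation is almost embarrassingly simple, but it makes the point:

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24    # mass of the earth, kg
r = 6.378e6     # radius of the earth, m

for m in (0.1, 1.0, 1000.0):     # feather, hammer, boulder (kg)
    force = G * M * m / r**2     # gravitational force on the object
    accel = force / m            # Newton's second law: a = F/m
    print(m, accel)              # ~9.8 m/s^2 every time; m cancels out
```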
This quirk of gravity gets called the equivalence principle, because it seems to show that "gravitating" mass and "inertial" mass are equivalent, even though there's no particular reason why they need to be.

As Einstein thought about this peculiarity of gravity, he was struck with what he called "the happiest thought" of his life. He postulated a modification to the equivalence principle, which is that being in a gravitational field is equivalent to being in an accelerated reference frame. What he meant was that gravity is not a real force but an effect we observe, so there's no difference between your car seat pushing up against you when you hit the gas and the Earth holding you down.

The link to the other equivalence principle is that, in free fall, any object falling with you moves at the same rate, and the same thing is true in an accelerated reference frame, because the acceleration you feel is a result of the frame (your car, a rocket) and not your mass.

This happiest thought led Einstein to the conclusion that being in free fall in a gravitational field is just as "natural" as being at rest. When you do feel a force (your car seat, the ground), that's just an object getting in the way of your natural path through spacetime. As usual for Einstein, his next step was to imagine what this meant for light.

Assuming his principle is true, weird things happen in gravity. Say you're in a rocket ship at rest in space. If a beam of light comes in one window, it will trace a straight line through the rocket ship and out another window. If you're moving at a constant speed, you observe the exact same thing, because special relativity says you can't tell the difference between different inertial frames.

If you're accelerating, the light will trace out a parabolic curve, because you're moving faster when the light leaves the rocket than when the light enters it. The equivalence principle says you can't tell the difference between gravity and acceleration, so the same thing should happen if you're in a gravitational field. Light passing near the Sun, for example, will curve.
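You can even estimate the size of the curve. Suppose the rocket is w meters wide and accelerating at a. The light crosses in time t = w/c, and in that time the rocket picks up extra speed a·t, so from inside, the beam appears to drop by

d = ½·a·t² = ½·a·(w/c)²

That quadratic dependence on horizontal distance is a parabola. Plugging in Earth-like numbers (a = 9.8 m/s², w = 10 m) gives a drop of about 5×10⁻¹⁵ meters, which is why you've never noticed light bending in your living room.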

Now it's all well and good to say this happens because of the equivalence principle, but that's not a mechanism. If there isn't a force causing the light to curve, what's doing it? Einstein says this is the wrong question to ask and that what looks like a force is just light taking the only path available.

Here's an imperfect analogy: imagine you're driving up a mountain, maneuvering through twisting switchbacks. If you veer one way, you fall off the mountain. If you veer the other way, you crash into the side of it. So you stick to one narrow path. To the GPS satellites monitoring the position of your phone (but not the mountain or the road), it looks as if your phone, you, and the car are being pushed around by some mysterious force, but in reality you are simply following the only path available.

Except you might think, well, that works for light zooming around at 300,000 km/s, but what if there's nothing propelling me? Why am I following any path at all? And the answer is that we are all constantly following a path through spacetime: we're moving forward through time. But in the presence of a gravitational field, spacetime gets warped, and your straight path through it tilts a little bit out of time and into space. The "speed" you had going through time gets converted into speed in space, which is why clocks slow down close to a black hole.

Figuring out the specifics of how mass could warp spacetime took Einstein about a decade, but he finally succeeded in 1915, giving the world general relativity. With it came a number of predictions, including the bending of starlight, the correct shape of Mercury's orbit, and the fact that accelerating masses will send out gravitational waves that stretch and shrink spacetime as they pass by. Finally detecting those waves reaffirmed Einstein's genius one more time, a century after he first proposed them. And all of that came from Einstein tinkering around with the fact that all objects fall at the same rate.

I said earlier that I equated myself to Einstein, but the truth is I'm no Einstein. I'm a pretty smart guy, but not a genius, and certainly not one of the greatest scientific minds in history, capable of deducing fundamental and quantitative physical truths about the universe from simple thought experiments. What can I possibly hope to achieve compared to that?

But there is an equivalence between me and Einstein, because in reality he was no Einstein, either. It took him a decade to complete general relativity because, talented though he was at math, he was not a mathematician and had to learn an entirely foreign branch of it to make his theory work. He got help from a mathematician friend of his, Marcel Grossmann, who was familiar with Riemannian geometry. That branch of math was invented in the 19th century by a couple of guys, including Bernhard Riemann.

The idea of looking at space and time as a unified thing was partly inspired by Hermann Minkowski, who applied geometrical concepts to Einstein's special relativity. Before Einstein even got to special relativity, which was critical for getting to GR, he frequently discussed difficult subjects with a group of like-minded friends who, perhaps ironically, called themselves the Olympia Academy. And most of the pieces for SR were put in place by earlier physicists, such as Hendrik Lorentz and George FitzGerald.

Black holes were first theorized about by Karl Schwarzschild, who found one of the simplest solutions to Einstein's field equations while fighting in the trenches during WWI. Roy Kerr figured out how rotating black holes behave. And many others over the ensuing decades contributed to the theory.

As far as gravitational waves are concerned, Einstein himself waffled on whether they even existed. But even so, he originally showed only that they could exist and radiate away energy. Solving general relativity for the shape of the gravitational waves emitted by two inspiraling, merging black holes took until the 90s. In fact, it was only accomplished with the help of supercomputers using numerical techniques.

And even ignoring the many contributions from theorists not named Einstein, his prediction about gravitational waves would have meant nothing if we did not have the means to detect them. The feat accomplished by LIGO this past week involved scientists who are experts in interferometry, optics, vacuum chambers, thermodynamics, seismology, statistics, etc. The effort required theorists, as well as experimentalists, engineers, and technicians.

I don't mean to imply that Einstein's work would be for naught without the janitors who cleaned his office, that he couldn't have done it without all the little people supporting him. I mean that Einstein's contribution to the discovery was only one part of a vast web of contributions by a host of extremely talented people, alive and dead, who did things Einstein couldn't have done.

On Thursday, we all learned the magnitude of what they had accomplished. Rumors of the discovery had been swirling around for a while before it was announced. By the time I arrived at school on Thursday to watch the LIGO press conference, I had a pretty good idea of what they were going to say.

Yet that didn't detract from the occasion. Packed into a lounge in the physics department, students, TAs, professors, and I--maybe a hundred altogether--watched the press conference webcast on a giant screen. We all cheered when the discovery was confirmed and cheered again when we heard the primary paper had already been peer reviewed. Half an hour in, I had to leave to go to my theoretical astrophysics course. There, the professor and TA set up a projector and we all continued to watch the press conference. When the webcast ended, the professor took questions about gravitational waves.

Being a part of that, in the minutest and most indirect way, was thrilling. It was a day when Einstein's greatest theory was confirmed yet again, when a new field of astronomy began, and when a thousand scientists got to tell the whole world about the amazing thing they had discovered.

There's a certain--possibly strained--equivalence to my wax museum Einstein moment from 20 years earlier. School was involved, as well as a story about Einstein. But this time I was listening to that story. My passion for science and learning has remained constant, but the attitude has changed. Back then, and for a very long time after that, I took joy in knowing more than others, in being the smartest guy in the room.

Now I know that's not the case. But I also know it doesn't matter. We just don't learn about the universe by sitting alone and thinking brilliant thoughts. That is, at most, one part of the process. So I don’t have to be a mythical genius to contribute. I can be a part of something amazing, of humanity's quest to understand the world around us, just by collaborating with others who are as passionate as I am. I haven't done it yet, obviously, but just as Einstein's magnificent theory has been reaffirmed, so too has my drive to be a scientist.

Thursday, January 21, 2016

Passwordocalypse

This is the second year in a row that I've seen an article decrying our collective cyber stupidity because of the awful passwords we use to protect ourselves. And this is the second year in a row that I've rolled my eyes very hard at the article because of its mathematical ignorance. This is the first year I've decided to blog about it, though.

The article linked to above lists the most popular passwords found in databases of stolen passwords. At the top of the list are groaners such as "password", "123456", and "qwerty". How could those be the most popular, when everyone knows China is hacking its way into our country and we're using passwords to protect our finances, identities, and porn habits? How could everyone be so stupid?

Well, the truth is, very few people have to be stupid for those passwords to be the most popular. In fact, there's an easy-to-imagine scenario in which no one is so cyber-challenged. Let's see how.

The most popular password is the one that gets used more than any other individual password. This doesn't mean it's used by a majority of people, obviously, just as Donald Trump isn't supported by a majority of Republicans. Additionally, when we're ranking password popularity, we're doing so by login rather than by person, because that's how the data comes to us. So password popularity is measured as logins/password.

And what's being railed against in the above article is that the passwords with the highest logins/password are bad ones. But what makes a password bad? Ease of guessing--those that take the least time to crack are the least secure.

This quality is quantified in a password's information entropy, which is a measure of the number of bits needed to specify the password. In other contexts, a piece of data's information entropy tells you how much that data can be compressed. The higher the entropy, the more bits needed to specify the data, the fewer bits you can get rid of and still preserve it.

When I think entropy, I think physics. Most people probably do, too, knowing it has something to do with thermodynamics and disorder. You probably know the second law of thermodynamics, which is usually stated as something like, "The entropy (disorder) of a system tends to increase."

The "tends to" there indicates that this is a probabilistic law. That is, if you have a box with octillions of gas molecules all bouncing around at different speeds and directions, it's hard to say exactly what they're going to do, but you can say what they're likely to do. And it turns out that a box of gas is more likely to go to a high entropy state than a low one. The reason is that there are many more high entropy states than low ones available.

This is where the connection to disorder comes in. The canonical example is probably an egg. An intact egg is a very ordered thing. It has a specific shape, and you can't change the shape of the egg without changing the fact that it's an intact egg. Thus order means low entropy, because there are only a small number of ways for an egg to be an egg.

Scrambled eggs, on the other hand, are disordered and high entropy. The high entropy results from the fact that you can rearrange your egg particles (eggs are made of egg particles, right?) in many, many different ways but still end up with the same basic breakfast: scrambled eggs.

How does this connect back to information and passwords? Because as the entropy of a system increases, it takes longer and longer to accurately describe the system in detail. With low entropy, high order systems, there might be one law of nature telling you why the system is shaped the way it is, which means it's easy to specify it in detail. But with a high entropy system, there are many microstates that are approximately the same, so you need to be more and more detailed if you want to specify a particular one. "No, the one with particle 1,036,782,561 going this way, not that way."

So high entropy data doesn't compress as easily because there are many high entropy systems, which means it takes a lot of bits to differentiate between two chunks of data. And this is also why high entropy passwords are more secure: because if you're randomly guessing a password, it takes you much, much longer to get through all the available high entropy passwords than it does the low entropy passwords.
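If you want to put numbers on this, one common back-of-the-envelope estimate (and I'm only guessing that the website below does something similar) is length × log₂(pool size), where the pool is the set of characters the password draws from. A minimal sketch in Python:

import math
import string

def estimate_entropy_bits(password):
    """Crude estimate: length * log2(size of the character pool).
    The pool is inferred from which character classes appear.
    Illustrative only; real strength checkers are more sophisticated."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)  # 32 printable ASCII symbols
    return len(password) * math.log2(pool) if pool else 0.0

print(estimate_entropy_bits("123456"))          # ~19.9 bits
print(estimate_entropy_bits("kT7#vPq9$xLm2!"))  # ~91.8 bits

By this measure, "123456" takes about 20 bits to specify, while a random 14-character mess takes about 90--the same ballpark as the averages below.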

But that's also why the least secure passwords will always be the most popular ones. Compared to the secure passwords, there just aren't that many bad passwords out there, because bad passwords are low entropy. The logins/password for bad passwords is going to be high essentially by definition. Here's a toy model to demonstrate.

Mathematically, the entropy of a system (S) is proportional to the log of the number of microstates (n) that correspond to a single macrostate. Computer people like to do things in binary, so they use a log base of two: S = log₂(n). Now let’s take some real data and see what we find. Using this website, I have found the entropy of each of the 25 most popular passwords. Their average entropy is 20.12 bits. Using my password manager, I've found the average entropy of 10 randomly generated strong passwords (I got lazy, but the variation in entropy was low): 80.84 bits.

So the average good password has ~4 times the entropy of the average bad password. If we assume there are only 25 bad passwords (there are many more, but more makes the point even stronger), and that the population of logins (p) uses either good passwords or bad passwords, we can write an expression comparing password popularity (logins/password). For our model, let’s see what it would take for good passwords to be just as popular as bad passwords:

p_bad/n_bad = p_good/n_good

How does the number of good passwords compare to the number of bad ones? Well, from the log formula up there, if we multiply the entropy of a bad password by 4, we get 4S = 4·log₂(n). From the rules of logs, we can take that 4 on the outside of the log and bring it inside: 4S = log₂(n⁴). So if you have n bad passwords, then you have n⁴ good passwords.

p_bad/n_bad = p_good/n_bad⁴

Solving for the ratio of logins using bad passwords to good, we get:

p_bad/p_good = 1/n_bad³

Now let’s plug in n_bad = 25.

p_bad/p_good = 1/15625 = 0.000064

This means that as long as more than 0.0064% of all logins use bad passwords, the bad ones will be the most popular. Stated as the converse: 99.9936% of all logins can use strong passwords, and the bad ones will still be more common.
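If you'd rather check the arithmetic than trust my algebra, here's the whole toy model in a few lines of Python:

# Toy model: S = log2(n), so there are n = 2**S passwords at entropy S.
S_bad, S_good = 20.12, 80.84  # average entropies measured above
n_bad = 25                    # assume only 25 bad passwords exist

# Good passwords have ~4x the entropy, so n_good = n_bad**4.
n_good = n_bad ** 4           # 390,625

# Equal popularity (logins per password) would require
# p_bad / n_bad = p_good / n_good, which rearranges to:
p_bad_over_p_good = n_bad / n_good
print(p_bad_over_p_good)      # 6.4e-05, i.e. 0.0064% of logins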

Of course, in the real world, there are more than 25 bad passwords (and waaaay more than 25⁴ good passwords), and people aren't divided up into binary good and bad password users. But I think this demonstrates that very few people need actually be stupid for the above article to be true.

And as I said, it's possible that no one is stupid because this is based on logins rather than users. All it takes is that more than 0.0064% of the time you need to pick a username and password for a site, it's a site for rating cat videos and you rightly don't care about security.

Tuesday, January 19, 2016

Quantifying Weirdness

Quantum mechanics is weird; there's no doubt about that. It’s got wave-particle duality, the uncertainty principle, and spooky action at a distance. Other fields have weird results, too, but although we might comment on the peculiarity of a particular finding, we don't indict those fields as a whole. With quantum mechanics in particular, though, it seems like its idiosyncrasies leave people with the feeling that it is either too weird to be right or too weird to be understood.

Well, today I'd like to help dispel those attitudes, particularly the first one—or at the very least put a number on just how weird quantum mechanics is. To do so, I'm going to be regurgitating material I learned in my philosophy of physics course.

In order to quantify the weirdness of quantum mechanics, we'll be exploring the phenomenon of quantum entanglement. Hopefully, we'll be able to unravel some of its mysteries and not get caught in a web of confusion.

I'm sorry, I promise there will be no more entanglement puns.

Entanglement first gained widespread attention in physics after a 1935 paper by Einstein, Podolsky, and Rosen, henceforth known as the EPR paper. Einstein was unhappy with how that paper turned out, but he articulated his thoughts more clearly to his colleagues (especially Schrodinger) in private. Additionally, the thought experiment proposed then was more complicated than it had to be. The upshot is that I'll be talking about this from a slightly more modern perspective; but historically, the EPR paper is one of the jumping-off points for discussing quantum funny business.

So here's entanglement. In quantum mechanics, particles like electrons are described by a wave function, which tells you the probability of finding the electron in a particular state. One such property is spin, which, for weird quantum mechanical reasons, can be either up or down. So the wave function could say there's a 50% chance the spin is up and a 50% chance it's down, for example.

You won't know what the spin is until you measure it. When you do so, the language is that the wave function “collapses,” so now it's just in one state, either up or down, instead of a superposition of both.

If two electrons are hanging out, normally you have two wave functions to keep track of. But if two electrons get created together in a particular process, then they will be described by a single wave function. Once that happens, barring interference from the outside world, it is not possible to decompose that wave function into two separate ones.

Where before your wave function for a single electron said there was a 50/50 chance of spin-up or spin-down, now it might say something like there is a 50% chance that electron A is spin-up and electron B is spin-down, and a 50% chance that electron A is spin-down and electron B is spin-up. So if electron A is in your lab, and electron B is down the road at the chemist, and you measure electron A to be spin-up, then you know the wave function has collapsed to "A up, B down." This means you also know, without having measured it, that electron B is now spin-down. If you do later measure it, you will always find it to be spin-down if A was up.

Here's where things get weird. Again, as long as you prevent your electrons from being interfered with, they remain entangled until you measure the spin of one of them, no matter how far apart the electrons get. So if electron A is in your lab, and you send electron B to Alpha Centauri, when you measure the spin of electron A, you instantly know, across a distance that would take light 4 years to travel, what the spin of electron B is.

This is weird.

Here's another scenario. This one is totally going to blow your mind. Imagine you are playing a game with a street magician. He's got two hands and one coin. While your back is turned, he puts the coin in one of his hands and then asks you to guess where the coin is. There's a 50/50 chance for either hand. You say left hand. He opens, and reveals that there is no coin there.

Now here's the wacky part. Assuming the magician exhibits no trickery and that the coin is in one of his hands, you now know, as if by magic, that the coin is in his right hand. Even if the magician performs some real magic and sends his right hand to Alpha Centauri after hiding the coin, you know instantly, across a distance that would take light 4 years to travel, that the coin is in his right hand. Information has traveled faster than light—a clear violation of Einstein's special relativity!

Okay, no matter how hard I try, I can't make that second scenario sound as weird as the first one. But why not? Because you're saying, “Silly Ori Vandewalle (if that even is your real name), nothing spooky is going on here. The coin's location is a result of the magician's actions before the hands are separated. Revealing the hand doesn't decide the fate of the coin. Duh.”

This is essentially the argument that Einstein made in the EPR paper. If two electrons are entangled, and one of them is sent to Alpha Centauri, and measuring the spin of one tells you the spin of the other, then the only reasonable conclusion you can draw is that the spins were determined beforehand.

The name of the EPR paper is, "Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?" Following Betteridge's law, Einstein posited the answer was no. That's because quantum mechanics can only tell you the probability of the electron's spin being up. But just as with the magician's coin, Einstein argued, this probability represents nothing more than our ignorance, not any actual indeterminacy on the part of the coin or the electron.

So is the weirdness gone?

Well, let's see if we can't make this spooky action even more mundane. Another way to think of this result is that the two electrons are correlated. If two objects are correlated, they have a common cause. A caused B, or B caused A, or C caused both A and B. So we are suggesting that some common cause configured both spins beforehand but didn't bother to tell the wave function this.

In the 60s, physicist John Stewart Bell developed a theorem that must hold for any three binary properties of a single system. This theorem tells us something important about common causes. A few assumptions go into the theorem, the most relevant being that measuring property A can't affect properties B and C before you measure them.

Let's go through Bell's theorem with cookies so that I can distract you from the fact that we're doing math.

By Kimberly Vardeman from Lubbock, TX, USA (Perfect Chocolate Chip Cookies) [CC BY 2.0], via Wikimedia Commons
Say you've baked a batch of cookies, and the cookies can be large or not large (L, ~L), have walnuts or no walnuts (W, ~W), and have chocolate chips or no chocolate chips (C, ~C). Now say you want to know how many large, non-walnut cookies you have. We'll call that N(L, ~W). This number is the sum of all large, non-walnut, chocolate chip cookies N(L, ~W, C) and all large, non-walnut, non-chocolate chip cookies N(L, ~W, ~C). This must be true, because whether or not a cookie has chocolate chips does not affect its size or walnut content.

Similarly, the number of cookies with walnuts but no chocolate chips is N(L, W, ~C) + N(~L, W, ~C) because size doesn't matter. And finally, the number of large, non-chocolate chip cookies is N(L, W, ~C) + N(L, ~W, ~C) because walnuts don't matter.

Now let's add together the number of large, non-walnut cookies and the number of walnut cookies with no chocolate chips. That quantity is:

N(L, ~W, C) + N(L, ~W, ~C) + N(L, W, ~C) + N(~L, W, ~C)

If you notice, the second and third terms are also the terms for the number of large, non-chocolate chip cookies. That means our sum is always at least as great as the number of large, non-chocolate chip cookies.

Now let's make a slight shift and talk instead about probabilities. If you randomly reach out for a cookie, the probability that you grab a particular kind is directly proportional to how many cookies of that kind there are to take. This means we can reword Bell's cookie theorem thusly:

The probability of choosing a large, non-walnut cookie or a walnut, non-chocolate chip cookie is always greater than or equal to the probability of choosing a large, non-chocolate chip cookie.

This theorem is true regardless of how many of each cookie there actually is, because at no point in demonstrating this did we use numbers. It's also true no matter what kinds of properties we're talking about, so long as they are binary properties, because we could just as easily say L stands for lemon cookies or even something non-cookie-related.
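And if you don't believe me, you can brute-force it. Here's a sketch that bakes random (hypothetical) batches and checks the inequality against every one:

import itertools
import random

# One count for each of the 8 cookie types: (large?, walnuts?, chips?).
for _ in range(10000):
    N = {combo: random.randint(0, 100)
         for combo in itertools.product([True, False], repeat=3)}
    n_L_notW = N[(True, False, True)] + N[(True, False, False)]
    n_W_notC = N[(True, True, False)] + N[(False, True, False)]
    n_L_notC = N[(True, True, False)] + N[(True, False, False)]
    # Bell's cookie theorem: holds no matter what the counts are.
    assert n_L_notW + n_W_notC >= n_L_notC

print("10,000 random batches, zero violations")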

But what's more, this theorem tells us about correlations. You see, if I give instructions to a thousand people to bake exactly the number of cookies I say and have each person randomly select and eat one cookie, we'll find that Bell's cookie theorem holds true. The probabilities will be maintained across all kitchens, because the cookie batches are correlated--spooky baking at a distance. The correlation is a result of the common cause known as me giving out instructions.

Now let's switch gears and talk about sunglasses—or as I prefer to call them, quantum shields. Polarized sunglasses only admit light that oscillates in a particular direction (up and down or left and right, for example). If you have horizontally polarized sunglasses, then only light waving from left to right (from the frame of the frames) will get through. But light coming from the sun is equally likely to be waving in any direction, so if you think about it, polarized sunglasses should only let a tiny, infinitesimal amount of light through—only light that is exactly horizontal and nothing at any other angle. Yet this isn't what happens. Polarized sunglasses will absorb roughly half the incident light and let the rest pass. Why is that?

Well, let's talk about the quanta of light: photons. A single photon doesn't have a direction it's waving, but it does have a polarization based on its spin. When a photon passes through sunglasses, the photon's spin is measured by the polarizing filter. Before the measurement, it's in a superposition of horizontal and vertical spin, with weights set by the angle of its polarization (the direction it's waving).

When it's measured, that superposition collapses so that its spin is either horizontal or vertical. If it ends up being horizontal, it passes through. Otherwise, it's absorbed. The closer the angle of its spin is to horizontal, the higher the probability that it collapses to a horizontal spin. In this way, light from any polarization (except exactly vertical) can pass through, but the odds of it doing so go down the further away from horizontal you get, and anything that does pass through will subsequently be measured as horizontal. So sunglasses are quantum shields.

"Oakley half wire" by Jpogi at en.wikipedia.com. Licensed under Public Domain via Commons
This probability of getting a particular spin works for electrons, too, such as the two entangled ones in our EPR thought experiment. Instead of a polarizing filter, we use magnets to measure an electron’s spin. Before, we talked about a 50/50 chance of an electron being up or down, but these odds can be adjusted by rotating our magnets, in exactly the same way that light waves rotated away from horizontal have different odds of passing through sunglasses.

But this adds a new wrinkle to our thought experiment. Before, getting a spin-up on Earth meant the Alpha Centauri electron would be spin-down 100% of the time. If we rotate the Earth magnet by some angle θ, then that perfect correlation stops being 100%. It turns out that the odds of one being spin-up and the other spin-down are equal to cos²(θ/2), where θ is the angle between the two magnets.

We can carry out this experiment many times, creating entangled electrons and sending them to Alpha Centauri. A third of the time, we can measure with one magnet oriented at 0 degrees and the other at θ degrees clockwise, a third with one at θ degrees and the other at φ degrees, and a third with one at 0 degrees and the other at φ degrees. In this way, we are measuring three different binary properties of the system. Bell's theorem applies.

An entangled pair can be spin-up at 0 degrees and spin-down at θ degrees, spin-up at θ degrees and spin-down at φ degrees, or spin-up at 0 degrees and spin-down at φ degrees.

Bell's theorem tells us, then, that P(θ) + P(φ-θ) >= P(φ). Using the cosine formula up there, this comes out to cos²(θ/2) + cos²([φ-θ]/2) >= cos²(φ/2). Okay. Looks fine.

Except this isn't always true, depending on the angles you pick. Sometimes, the left-hand side will be less than the right-hand side. If you subtract the right from the left, then whenever Bell’s inequality is violated, the expression will be negative. You can see when that happens in this graph.

I am a Matlab Master.
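If you don't have Matlab handy, a few lines of Python (a sketch using the cos²(θ/2) formula from above) will find violations just as well:

import math

def P(angle_deg):
    """Odds of one spin-up, one spin-down at this relative angle."""
    return math.cos(math.radians(angle_deg) / 2) ** 2

def bell(theta, phi):
    """Left side minus right side; negative means Bell is violated."""
    return P(theta) + P(phi - theta) - P(phi)

print(bell(60, 120))   # +1.25: inequality satisfied
print(bell(150, 300))  # -0.62: inequality violated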
So what does it mean for Bell’s inequality to be violated? Well, in the case of our cookies, the correlation was upheld because I sent out a common set of instructions to all the bakers. This is the common cause of the correlation. We saw that this common cause would lead to adherence to Bell's inequality for any set of three binary properties of a system. This means that a common cause cannot be the origin of the correlation between entangled electrons. They aren’t deciding their configuration beforehand.

What Bell's theorem does permit is a non-local connection—the electrons instantly updating each other on their spin, or electrons that are governed by interactions across all of space. The other usual possible explanation for EPR and Bell is that electrons don't have any intrinsic reality, that realism itself is a foolish idea. No one likes either of these possibilities.

There are alternative ways of deriving, formulating, and generalizing Bell's theorem. When you do so via the CHSH inequality, you find that classical correlations can be no higher than 2. But quantum correlations violate this limit and can be as high as 2√2. And yet we can imagine other correlations, such as the Popescu-Rohrlich box, that are even higher than 2√2—correlations that you cannot reach even with entangled, non-local/non-real electrons.
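To see where those numbers come from, here's a sketch of the CHSH arithmetic, assuming the standard singlet correlation E(a,b) = -cos(a-b) for detectors at angles a and b, and the textbook optimal settings:

import math

def E(a, b):
    """Quantum correlation between detectors at angles a, b (radians)."""
    return -math.cos(a - b)

# CHSH combination: S = E(a,b) - E(a,b') + E(a',b) + E(a',b')
# Local hidden variables force |S| <= 2.
a1, a2 = 0.0, math.pi / 2
b1, b2 = math.pi / 4, 3 * math.pi / 4
S = E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2)
print(abs(S))            # 2.828..., i.e. 2*sqrt(2)
print(2 * math.sqrt(2))  # the quantum (Tsirelson) bound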

So quantum mechanics is weird. But it's only weirder than regular spooky action at a distance by a factor of √2, or ~41%. Although √2 is irrational, so maybe quantum mechanics is unreasonably weird.