Beware Scientific Metaphors

I’m about a quarter finished with Isabel Paterson’s The God of the Machine, which I’m finally reading after several years of intending to. So far, it’s been both pleasurable and interesting. My main reservation, however, has been an extended metaphor which both illustrates the central idea and potentially undermines it.

Paterson develops a notion of energy to describe the synthesis of material resources, cultural virtue, and human capital which results in creativity and production. As metaphors go, this is not a bad one. That said, my engineering background gives me cause for concern. It isn’t clear that Paterson has a firm understanding of energy as a scientific concept, and her analogy may suffer for it. Complicating matters, she sometimes also writes about “energy” as if it were electricity, which is another can of worms in and of itself.

Mechanical energy behaves oddly enough for human purposes, being generally conserved between gravitational potential and kinetic energy, and dissipated through friction and heating. It emphatically does not spring ex nihilo into cars and trains. Coal and oil hold chemical potential energy, which is released as thermal energy and then converted into kinetic energy, and thus motion, to drive an engine.

Electrical energy is even weirder. It’s been enough years since I finished my physics that I won’t attempt to explain the workings in detail. (My electronics class this spring bypassed scientific basis almost entirely.) Suffice to say that the analogy of water moving through a pipe is not adequate beyond the basics.

Atomic energy, the most potent source yet harnessed, does seem to create energy, but at a cost. A nuclear generating station physically destroys a small part of a uranium atom’s mass, converting it via Einstein’s famous relation to useful energy. But more on that in later posts.
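To put a rough number on how small that “small part” is, here is a minimal sketch. It assumes round textbook values that I am supplying myself (about 200 MeV released per uranium-235 fission, an atomic mass of about 235 u); none of these figures come from Paterson.

```python
# Rough estimate of the fraction of a uranium-235 atom's mass that is
# converted to energy in a single fission event (E = m * c^2).
# Assumed round textbook values, not taken from the post:
ENERGY_PER_FISSION_MEV = 200.0   # approximate energy released per U-235 fission
U235_MASS_U = 235.04             # approximate atomic mass of U-235, in atomic mass units
MEV_PER_U = 931.494              # energy equivalent of one atomic mass unit, in MeV


def mass_fraction_converted(energy_mev: float = ENERGY_PER_FISSION_MEV,
                            mass_u: float = U235_MASS_U) -> float:
    """Return (energy released) / (rest-mass energy of the whole atom)."""
    rest_energy_mev = mass_u * MEV_PER_U
    return energy_mev / rest_energy_mev


if __name__ == "__main__":
    fraction = mass_fraction_converted()
    print(f"Fraction of mass converted per fission: {fraction:.3%}")
```

Under those assumptions the answer comes out to roughly 0.09% of the atom’s mass, which is why the energy is better described as converted than created.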

I won’t say that the “energy” metaphor is strictly speaking wrong, because I haven’t done the work of dissecting it in detail. Paterson was a journalist and writer, but she was also self-educated, and therefore we cannot easily assess the scope and accuracy of her knowledge of such phenomena. But I don’t think it matters: even if the metaphor is faulty, the concept which it tries to communicate seems, on its face, quite plausible without grounding in the physical sciences.

I bring this up now, well before I’ve finished the book, because I’ve seen much worse analogies from writers with much less excuse to make them. The God of the Machine was published in 1943. Authors today have a cornucopia of factual knowledge at their fingertips and still screw it up. For instance, take this caption from my statics textbook:


Hibbeler, R. C., Engineering Mechanics: Statics & Dynamics, 14th ed., Pearson Prentice Hall, Hoboken, 2016.


There is no excuse for a tenured professor (or, more plausibly, his graduate students) to screw this up. The correct equation is on that very page and they couldn’t even be bothered to run the numbers and see that, no, you’re not significantly lighter in low Earth orbit. From my perspective, such a blatant error is unconscionable in the opening pages of a professional text.

Now that isn’t exactly a metaphor, but it illustrates the risks of discussing fields nominally close to your own which nevertheless you know very little about. Imagine the danger of using metaphors from totally different fields you’ve never formally studied.

So, I would advise writers to be sparing with scientific metaphors. If you can learn the science correctly, that’s great: you’ll construct metaphors that are both interesting and accurate. But as we’ve seen above, even PhDs make stupid mistakes. Err on the side of caution.

Book Review: The Signal and the Noise

Supposedly Nate Silver’s credibility took a major hit last November, which will no doubt discourage many potential readers of his book. This interpretation is wrong, but palatable, because the sorts of commentators who would come to such conclusions shouldn’t be trusted with it. This book is about how to be more intelligent when making predictions and be wrong less often. Such an attitude is not common—most “predictions” are political pot-shots or, as discussed previously, avaricious attempts to put the cart before the horse.

Let’s begin with a discussion of a few major tips. Most of these things should be taught in high school civics (how can you responsibly vote without a concept of base rates?!), but aren’t. Perhaps the most important thing is to limit the number of predictions made, so you can easily come back and score them. Calibration is recommended—nine out of ten predictions made with 90% confidence should come true.
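Scoring your own predictions takes very little machinery. The sketch below is one way to do it, not Silver’s method: the function names and the ten-prediction track record are invented for illustration. It groups predictions by stated confidence, reports how often each group came true, and also computes a Brier score (the mean squared gap between confidence and outcome).

```python
from collections import defaultdict


def calibration_table(predictions):
    """Group (stated_confidence, came_true) pairs by stated confidence.

    Returns {confidence: (number_of_predictions, observed_frequency)}.
    """
    buckets = defaultdict(list)
    for confidence, came_true in predictions:
        buckets[confidence].append(1 if came_true else 0)
    return {c: (len(hits), sum(hits) / len(hits))
            for c, hits in sorted(buckets.items())}


def brier_score(predictions):
    """Mean squared gap between stated confidence and outcome (0 is perfect)."""
    return sum((c - (1 if t else 0)) ** 2 for c, t in predictions) / len(predictions)


# Hypothetical track record: ten predictions made at 90% confidence, nine correct.
example = [(0.9, True)] * 9 + [(0.9, False)]

if __name__ == "__main__":
    for confidence, (n, freq) in calibration_table(example).items():
        print(f"stated {confidence:.0%}: {n} predictions, {freq:.0%} came true")
    print(f"Brier score: {brier_score(example):.3f}")
```

A well-calibrated forecaster’s observed frequency should sit close to the stated confidence in every bucket; a large gap in either direction means over- or underconfidence.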

Political pundits are terrible about these sorts of things. Meteorologists are actually great at it. Now your local weatherman is regularly wrong, but the National Weather Service makes almost perfectly calibrated forecasts1. This is, in part, because their models are under constant refinement, always seeking more accuracy. And it pays off: NWS predictions have improved drastically over the last few decades, due to improved models, more data collection, and faster computers. But more on that later.

Local meteorologists, on the other hand, are incentivized to make outlandish forecasts which drive viewership (and erode trust in their profession). One might see this as evidence that public entities make better predictions than private ones, but we quickly see that that is no panacea when we turn to seismology and epidemiology.

Part of the problem, in those fields, is that government and university researchers are under considerable pressure from their employers to develop new models which will enable them to predict disasters. This is a reasonable enough desire, but a desire alone does not a solution make. We can quite easily make statistical statements about approximately how frequently certain locations will experience earthquakes, for instance. But attempts beyond a simple logarithmic regression have so far been fruitless, not just failing to predict major earthquakes but specifically predicting that some of the most destructive earthquakes in recent memory would not occur.

Silver’s primary case study in this comes from the planning for Fukushima Daiichi Nuclear Power Plant. When engineers were designing it in the 1960s, it was necessary to extrapolate what sort of earthquake loads it might need to withstand. Fortunately, the sample size of the largest earthquakes is necessarily low. Unfortunately, there was a small dogleg in the data, an oh-so-tempting suggestion that the frequency of extremely large earthquakes was exceedingly low. The standard Gutenberg-Richter model suggests that a 9.0-magnitude earthquake would occur in the area about once every 300 years; the engineers’ adaptation suggested every 13,000. They constructed fantastical rationalizations for their model and a power station able to withstand 8.6. In March of 2011 a 9.0-magnitude earthquake hit the coast of Japan and triggered a tsunami. The rest, as they say, is history.

The problem in seismology comes from overfitting. It is easy, in the absence of hard knowledge, to underestimate the amount of noise in a dataset and end up constructing a model which predicts random outliers. Those data points don’t represent the underlying reality; rather, they are caused by influences outside the particular thing you’re wishing to study (including the imprecision of your instruments).
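A toy version of the problem, as a sketch: the four data points below are made up (a straight-line “truth” plus invented noise, with a dip at the largest magnitude), and the two fitting functions are my own illustration rather than anything from the book or from the Fukushima analysis. One model is a plain least-squares line in the spirit of Gutenberg-Richter; the other is a polynomial that passes through every point exactly.

```python
# Made-up data: log10(annual frequency) of earthquakes at or above each magnitude.
# The underlying "truth" is the straight line y = 5 - x; the wiggles are invented
# noise, including a dip at the largest magnitude (the "dogleg").
magnitudes = [5.0, 6.0, 7.0, 8.0]
log_rates = [0.1, -1.1, -1.9, -3.3]


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate(xs, ys, x):
    """Lagrange polynomial through every point: zero error on the data, i.e. overfit."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def recurrence_years(log_rate):
    """Average years between events, given log10 of the annual frequency."""
    return 10 ** (-log_rate)


slope, intercept = fit_line(magnitudes, log_rates)
line_at_9 = slope * 9.0 + intercept                      # extrapolate the simple model
wiggly_at_9 = interpolate(magnitudes, log_rates, 9.0)    # extrapolate the overfit model

if __name__ == "__main__":
    print(f"true model:         one M9 every {recurrence_years(-4.0):,.0f} years")
    print(f"straight line:      one M9 every {recurrence_years(line_at_9):,.0f} years")
    print(f"overfit polynomial: one M9 every {recurrence_years(wiggly_at_9):,.0f} years")
```

With these invented numbers the straight line puts a magnitude-9 event at roughly once per 20,000 years against a “true” 10,000, while the curve that threads every point puts it at roughly once per two million. The model that fits the data best is the one that extrapolates worst.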

And it can take a while to realize that this is the case, if the model is partially correct or if the particular outlier doesn’t appear frequently. An example would be the model developed by Professor David Bowman at California State University-Fullerton in the mid-2000s, which identified high-risk areas, some of which then experienced earthquakes. But the model also indicated that an area which soon thereafter experienced an 8.5 was particularly low-risk. Dr. Bowman had the humility to retire the model and admit to its faults. Many predictors aren’t so honest.

On the other hand, we see overly cautious models. For instance, in January of 1976, Private David Lewis of the US Army died at Fort Dix of H1N1, the same flu virus which caused the Spanish Influenza of 1918. The flu always occurs at military bases in January, after soldiers have been spread across the country for Christmas and New Year’s. The Spanish Influenza had also first cropped up at a military base, and this unexpected reappearance terrified the Center for Disease Control. Many feared an even worse epidemic. President Ford asked Congress to authorize a massive vaccination program at public expense, which passed overwhelmingly.

The epidemic never materialized. No other cases of H1N1 were confirmed anywhere in the country and the normal flu strain which did appear was less intense than usual. We still have no idea how Private Lewis contracted the deadly disease.

Alarmism, however, broke public confidence in government predictions generally and on vaccines particularly. The vaccination rate fell precipitously in the following years, opening the way to more epidemics later on.

Traditionally, this category of error was known as crying wolf. Modern writers have forgotten it and have to be reminded to not do that. Journalists and politicians make dozens if not hundreds of “predictions” each year, few if any of which are scored, in no small part because most of them turn out wrong or even incoherent.

Sadly, the pursuit of truth and popularity are uncorrelated at best. As Mr. Silver has learned, striving for accuracy and against premature conclusions is a great way to get yourself berated2. Forecasting is not the field for those seeking societal validation. If that’s your goal, skipping this book is far better than trying to balance its lessons and the public’s whim.

But let’s suppose you do want to be right. If you do, then this book can help you in that quest, though it is hardly a comprehensive text. You’ll need to study statistics, history, economics, decision theory, differential equations, and plenty more. Forecasting could be an education in its own right (though regrettably is not). The layman, however, can improve vastly by just touching on these subjects.

First and foremost is an understanding of probability, specifically Bayesian statistics. Silver has the courage to show us actual equations, which is more than can be said for many science writers. Do read this chapter.

To steal an example from another book, suppose two taxi companies operate in a particular region, distinguished by color. Blue Taxi has a larger market share. If you think you see a Green Taxi, there’s a small chance that it’s really Blue and you’re mistaken (and a smaller chance that, if you see Blue, it’s really Green). The market share is the base rate, and you should adjust up or down based on the reasons you might feel uncertain. For instance, if the lighting is poor and you’re far away, your confidence should be lower than if you’re close by at mid-day. Try thinking up a few confounders of your own.
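The taxi example can be worked through with Bayes’ rule directly. In the sketch below, the market shares (70% Blue, 30% Green) and the identification accuracies are numbers I have invented for illustration; neither this review nor the original example is being quoted.

```python
def posterior_seen_color(share_of_seen_color, accuracy):
    """P(taxi really is the color you think you saw), by Bayes' rule.

    share_of_seen_color: base rate, the fraction of taxis that are that color.
    accuracy: probability that you identify a taxi's color correctly.
    """
    correct = share_of_seen_color * accuracy
    mistaken = (1.0 - share_of_seen_color) * (1.0 - accuracy)
    return correct / (correct + mistaken)


# Hypothetical numbers: Blue has 70% of the market, Green has 30%.
BLUE_SHARE, GREEN_SHARE = 0.7, 0.3

if __name__ == "__main__":
    for label, accuracy in (("clear mid-day view", 0.9), ("poor light, far away", 0.6)):
        green = posterior_seen_color(GREEN_SHARE, accuracy)
        blue = posterior_seen_color(BLUE_SHARE, accuracy)
        print(f"{label}: saw Green -> {green:.1%} Green; saw Blue -> {blue:.1%} Blue")
```

With a 90% reliable view, a “Green” sighting is right about 79% of the time and a “Blue” sighting about 95% of the time. Drop the reliability to 60% and a “Green” sighting is more likely to be a Blue taxi than a Green one, which is the base rate asserting itself.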

To better develop your Bayesian probability estimate of a given scenario, you need to assess what information you possess and what information you don’t possess. These will be your Known Knowns and Known Unknowns. The final category is Unknown Unknowns, the things you aren’t aware are even a problem. A big part of rationality is trying to consider previously ignored dangers and trying to mitigate risk from the unforeseen.

This is much easier to do ex post facto. By that point, the signal you need to consider stands out against hundreds you can neglect. Beforehand, though, it’s difficult to determine which is the most important. Often, you’re not even measuring the relevant quantity directly but rather secondary and tertiary effects. Positive interference can create a signal where none exists. Negative interference can reduce clear trends to background noise. There’s a reason signal processing pays so well for electrical engineers.

The applications range from predicting terrorist attacks to not losing your shirt gambling. An entire chapter discusses the Poker Bubble and how stupid players make the game profitable for the much smaller pool of cautious ones. In addition to discussing the mechanics and economics of gambling, it gave me a decent explanation of how poker is played. Certainly interesting.

Another chapter tells the story of how Deep Blue beat Garry Kasparov. Entire books have been written on the subject, but Silver gives a good overview of the final match and what makes computers so powerful in the first place.

Computers aren’t actually very smart. Their strength comes from solving linear equations very, very quickly. They don’t make the kinds of arithmetic mistakes which humans make, especially when the iterations run into the millions. Chess is a linear game, however, so it was really a matter of time until algorithms could beat humans. There’s certainly a larger layer of complexity and strategy than many simpler games, but it doesn’t take a particularly unique intelligence to look ahead and avoid making mistakes in the heat of the moment.

Furthermore, the starting position of chess is always the same. This is not the case for many other linear systems, let alone nonlinear ones. Nonlinear systems exhibit extreme sensitivity to initial conditions; the weather is a classic example. The chapter on meteorology discusses this in detail: we have very good models of how the atmosphere behaves, but because we don’t know every property at every location, we’re stuck making inferences about the air in between sampling points. Add to this finite computing power, and the NWS can only (only!) predict large-scale weather systems with extreme accuracy a few days ahead.
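Sensitivity to initial conditions is easy to demonstrate on a much simpler system than the atmosphere. The sketch below uses the logistic map, a standard textbook stand-in for chaotic behavior; it is my own choice of example and has nothing to do with the NWS models. Two runs start one part in a billion apart and are compared step by step.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), a textbook chaotic map when r = 4."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0, nudge, steps):
    """Absolute gap at each step between two orbits that start `nudge` apart."""
    a = logistic_orbit(x0, steps)
    b = logistic_orbit(x0 + nudge, steps)
    return [abs(p - q) for p, q in zip(a, b)]


if __name__ == "__main__":
    gaps = divergence(0.2, 1e-9, 60)
    for step in (0, 10, 20, 30, 40, 50, 60):
        print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap roughly doubles each step, so a measurement error far too small to notice swamps the forecast after a few dozen iterations. Better instruments buy a longer useful horizon, but never an unlimited one.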

With more sampling points, more computing capacity, or more time, we could get better predictions, but all of these factors play off one another. This dilemma arises throughout prediction. More research will allow for more accurate results but delays your publication date. (This assumes that the data you need is even available: frequently, it isn’t3.)

Producing useful predictions is not about having the best data or the most computing power (though they certainly help). It is primarily about constraining your anticipation to what the evidence actually implies. Nate Silver lays out several techniques for pursuing this goal, with examples. It’s a good introduction for us laymen; experienced statisticians will probably find little they didn’t already know.

I would not recommend this book, however, unless you’re willing to do the work. Prediction is a difficult skill to master, and those without the humility to accept their inexperience can get into a lot of trouble. Should you want to test your abilities, try doing calibrated predictions and see how accurate you are. Julia Galef has a number of mostly harmless suggestions for trying this out.

If you are serious, however, The Signal and the Noise offers a quality primer on several important rationality techniques, and a good deal of information about a variety of other topics. I found it an enjoyable read and hope Nate Silver writes more books in the future.


1Major aggregators like the Weather Channel and AccuWeather tend to take the NWS predictions and paste an additional layer of modelling on top of them, for better or for worse.

2In the week before the 2016 election, several liberal commentators accused Mr. Silver of throwing the nation into unwarranted fear for only having Hillary Clinton’s odds of winning at ~70%. As it turns out, his model was one of the most balanced of mainstream predictions, yet everyone then acted as if he had reason to be ashamed for getting it wrong.

3The data may be concealed in confidential documents, nominally available but out of sight, or sitting right under your nose. Most often, however, it’s hiding in the noise. Economic forecasts suffer from this last problem. There’s econometric data everywhere, but basically no one has found more than rudimentary ways to make predictions with it. Perverse incentives complicate matters for private sector analysts, who often then ignore the few semi-reliable indicators we’ve got.

What Constitutes Space?

I’ve been writing about the assorted difficulties faced in astronautical engineering, but this presupposes a certain amount of background knowledge and was quickly getting out of hand. So let’s start with a simpler question: what is space, anyway?

Generally speaking, space is the zone beyond Earth’s atmosphere. This definition is problematic, however, because there’s no clean boundary between air and space. The US Standard Atmosphere goes up to 1000 km. The exosphere extends as high as 10,000 km. Yet many satellites (including the International Space Station) orbit much lower, and the conventional altitude considered to set the edge of space is only 100 km, or 62.1 miles.

This figure comes from the Hungarian engineer Theodore von Kármán. Among his considerable aerodynamic work, he performed a rough calculation of the altitude at which an airplane would need to travel at orbital velocity to generate sufficient lift to counteract gravity, i.e. the transition from aeronautics to astronautics. It will vary moderately due to atmospheric conditions and usually lies slightly above 100 km, but that number has been widely accepted as a useful definition for the edge of space.

To better understand this value, we need to understand just what an orbit is.

Objects don’t stay in space because they’re high up. (It’s relatively easy to reach space, but considerably harder to stay there.) The gravity of any planet, Earth included, follows an inverse square law: the force which Earth exerts on an object is proportional to the reciprocal of the square of its distance from Earth’s center. This principle is known as Newton’s Law of Universal Gravitation. Its significance for the astronautical engineer is that moving a few hundred kilometers off the surface of Earth results in only a modest reduction of downward acceleration due to gravity.
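A quick calculation shows how modest that reduction is. This is a minimal sketch that treats Earth as a uniform sphere; the constants are standard reference values for Earth’s gravitational parameter and mean radius, and the function name is my own.

```python
MU_EARTH = 3.986004418e14   # Earth's gravitational parameter GM, in m^3/s^2
R_EARTH = 6.371e6           # mean radius of Earth, in m


def gravity_at_altitude(altitude_m):
    """Gravitational acceleration (m/s^2) at a height above the mean surface."""
    return MU_EARTH / (R_EARTH + altitude_m) ** 2


if __name__ == "__main__":
    surface = gravity_at_altitude(0.0)
    for altitude_km in (0, 100, 400, 1000):
        g = gravity_at_altitude(altitude_km * 1000.0)
        print(f"{altitude_km:5d} km: g = {g:.2f} m/s^2 ({g / surface:.0%} of surface)")
```

At 400 km, roughly where the International Space Station flies, gravity is still close to nine-tenths of its surface value. Astronauts float because they are falling, not because gravity has gone away.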

To stay at altitude, a spacecraft does not counteract gravity, as an aircraft does. Instead, it travels laterally at sufficient speed that the arc of its fall matches the curvature of Earth itself. An orbit is a path that falls around an entire planet.

The classic example to illustrate this concept, which also comes from Newton, is a tremendous cannon placed atop a tall mountain (Everest’s height was not computed until the 1850s). As you can verify at home, an object thrown faster will land further away from the launch point, despite the downward acceleration being identical. In the case of our cannon, a projectile shot faster will land further from the foot of the mountain. Fire the projectile fast enough, and it will travel around a significant fraction of the Earth’s curvature. Fire it faster still1, and after a while it will swing back around to shatter the cannon from behind.

[Image: Newton’s cannon. Source: European Space Agency]
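The speed the cannon needs follows from setting gravity equal to the centripetal acceleration of a circular path, which gives v = sqrt(GM / r). The sketch below assumes a spherical Earth and no air resistance; the constants are standard reference values, and the 8.8 km “mountain” is an Everest-sized figure I picked for illustration.

```python
import math

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter GM, in m^3/s^2
R_EARTH = 6.371e6           # mean radius of Earth, in m


def circular_orbit_speed(altitude_m):
    """Speed (m/s) needed for a circular orbit at a height above the mean surface."""
    return math.sqrt(MU_EARTH / (R_EARTH + altitude_m))


def orbital_period_minutes(altitude_m):
    """Time for one circular orbit, in minutes."""
    radius = R_EARTH + altitude_m
    return 2.0 * math.pi * math.sqrt(radius ** 3 / MU_EARTH) / 60.0


if __name__ == "__main__":
    # 8.8 km is a hypothetical Everest-sized mountain; 400 km is a typical crewed orbit.
    for label, altitude_m in (("mountaintop cannon", 8.8e3), ("400 km orbit", 400e3)):
        speed = circular_orbit_speed(altitude_m)
        period = orbital_period_minutes(altitude_m)
        print(f"{label}: {speed:,.0f} m/s, one lap every {period:.0f} minutes")
```

The mountaintop shot needs about 7,900 m/s, while a spacecraft at 400 km needs a little under 7,700 m/s and circles the planet in about an hour and a half. Lower orbits are faster, higher orbits slower.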

In this light, von Kármán’s definition is genius. While there is no theoretical lower bound on orbital altitude2, below about 100 km travelling at orbital velocity will result in a net upwards acceleration due to aerodynamic lift. Vehicles travelling below this altitude will essentially behave as airplanes, balancing the forces of thrust, lift, weight and drag—whereas vehicles above it will travel like satellites, relying entirely on their momentum to stay aloft indefinitely.

But we should really give consideration to aerodynamic drag in our analysis, because it poses a more practical limit on the altitude at which spacecraft can operate. Drag is the reason you won’t find airplanes flying at orbital speeds in the mesosphere, and the reason satellites don’t orbit just above the Kármán line. Even in the upper atmosphere, drag reduces a spacecraft’s forward velocity and therefore its kinetic energy, forcing it to orbit at a lower altitude.

This applies to all satellites, but above a few hundred kilometers the effect is largely negligible. Spacecraft in low Earth orbit will generally decay after a number of years without repositioning; the International Space Station requires regular burns to maintain altitude. Below a certain altitude, drag will deorbit a satellite within a matter of days or even hours.

The precise altitude will depend on atmospheric conditions, orbital eccentricity, and the size, shape, and orientation of the satellite, but generally we state that stable orbits are not possible below 130 kilometers. This assumes a much higher apoapsis: a circular orbit below 150 km will decay just as quickly. To stay aloft indefinitely, either frequent propulsion or a much higher orbit will be necessary3.

On the other hand, it is exceedingly difficult to fly a conventional airplane above the stratosphere, and even the rocket-powered X-15 had trouble breaking 50 miles, which is the US Air Force’s chosen definition. Only two X-15 flights crossed the Kármán line.

Ultimately, then, what constitutes the edge of space? From a strict scientific standpoint, there is no explicit boundary, but there are many practical ones. Which one to choose will depend on what purposes your definition needs to address. However, von Kármán’s suggestion of 100 km has been widely accepted by most major organizations, including the Fédération Aéronautique Internationale and NASA. Aircraft will rarely climb this high and spacecraft will rarely orbit so low, but perhaps having few flights through the ambiguous zone helps keep things less confusing.

1For most manned spaceflights, this works out to about 7,700 meters per second. The precise value will depend on altitude: higher spacecraft orbit slower, and lower spacecraft must orbit faster4. In our cannon example, it would be a fair bit higher, neglecting air resistance.

2The practical lower bound, of course, is the planet’s surface. The Newtonian view of orbits, however, works on the assumption that each planet can be approximated as a single point. This isn’t precisely true: a planet’s gravitational force will vary with the internal distribution of its mass, which astrodynamicists exploit to maintain the orbits of satellites. That, however, goes beyond the scope of this introduction.

3The International Space Station orbits so low in part because most debris below 500 km reenters the atmosphere within a few years, reducing the risk of collision. This is no trivial concern—later shuttle missions to service the Hubble Space Telescope, which orbits at about 540 km, were orchestrated around the dangers posed by space junk.

4Paradoxically, we burn forward to raise an orbit, speeding up to eventually slow down. This makes perfect sense when we consider the reciprocal relationship between kinetic and potential energy, but that’s another post.

Book Review: Your Inner Fish

This book is not what I expected, but quite pleasurable to read nonetheless. Your Inner Fish does not detail the ichthyologic nature of the human body. Rather, it explores how fish moved onto land, where many now-ubiquitous adaptations came from, and how scientists figured it out.

Dr. Shubin begins with the story we all came to hear: how his team of paleontologists discovered Tiktaalik roseae. This ancient, shallow-water fish is an important transitional fossil because it was one of the first discovered with rudimentary hands. Biologists comparing the limbs of different species noticed a pattern in the limbs of land animals as far back as the mid-1800s. This pattern held only for land-adapted species: reptiles, amphibians, and mammals (including aquatic mammals that returned to the seas).

For a long time, it was believed that fish don’t exhibit this pattern. Then lungfish were discovered: living fossils which exemplify, in some ways, the transition from ocean to land. As their name implies, they possess basic lungs, and, interestingly, the beginnings of limbs.

Tiktaalik was an improvement on the lungfish. It had a flat head, for swimming in shallow water, and fin bones that show the beginning of a wrist. Together, these features show why fins evolved into arms: shallow water fish needed to do pushups. In their fish-eat-fish world, the ability to push oneself through extra-shallow patches was likely a critical advantage.

Let me tell you, exercising seems a lot less mundane when you consider that your lungfish ancestors did it to survive. That’s what your arms evolved to do. It’s only more recently we found further applications for them.

Throughout this book, Shubin is trying to explain how scientists managed to figure out our evolutionary history. He has perhaps a unique perspective to explain this process, as a paleontologist turned anatomy professor. Knowing what came before helps explain the ways in which earlier species were contorted to become the ones we see today.

Comparative anatomy and the fossil record tell us a lot about how modern species came to be. But genetics also offers considerable insight. Looking at the differences between genomes can tell us a lot about how recently certain categories of features evolved. In many cases, we can take genes from mice or fish and insert them into the DNA of invertebrates like fruit flies and get the same result. Such experiments are strong evidence that features like body plans and eyes evolved a really long time ago.

To be clear, there’s a lot of uncertainty which can probably never be resolved. We can prod algae in tanks to evolve the beginnings of multicellular bonding, but we have no idea if that particular direction is the one that our forerunners took.

Nevertheless, Your Inner Fish gives a good overview of how bacteria became bugs and fish, and how those bugs and fish became the bugs, fish, and people alive today. I certainly came away with an improved picture of how weird our bodies are and their many imperfections, though far from the whole picture. My curiosity is fairly sated, however; I’ve no plans to read the kinds of human anatomy texts I would need to really appreciate the magnitude of making men from microbes.

All told, I’d recommend Your Inner Fish as an entertaining and informative read about how human beings came to be. Neil Shubin has packed a lot of interesting scientific research into it, and with the exception of an example about hypothetical clown people in the final chapter, does a pretty good job of explaining it clearly. Definitely worth your time if the history of life on Earth intrigues you.