Book Review: The Signal and the Noise

Supposedly Nate Silver’s credibility took a major hit last November, which will no doubt discourage many potential readers of his book. That interpretation is wrong, but perhaps for the best: the sorts of commentators who would reach it shouldn’t be trusted with the book anyway. The Signal and the Noise is about how to be more intelligent when making predictions and how to be wrong less often. Such an attitude is not common—most “predictions” are political pot-shots or, as discussed previously, avaricious attempts to put the cart before the horse.

Let’s begin with a discussion of a few major tips. Most of these things should be taught in high school civics (how can you responsibly vote without a concept of base rates?!), but aren’t. Perhaps the most important thing is to limit the number of predictions made, so you can easily come back and score them. Calibration is recommended—nine out of ten predictions made with 90% confidence should come true.
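As a minimal sketch of what scoring yourself might look like (with invented predictions standing in for real ones), in Python:

```python
# Minimal calibration check: group predictions by stated confidence and
# compare that confidence with the fraction that actually came true.
# The predictions below are invented placeholders, not real forecasts.
from collections import defaultdict

predictions = [
    # (stated confidence, did it happen?)
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
    (0.7, True), (0.7, False), (0.7, True), (0.7, False),
    (0.5, True), (0.5, False),
]

buckets = defaultdict(list)
for confidence, outcome in predictions:
    buckets[confidence].append(outcome)

for confidence in sorted(buckets):
    outcomes = buckets[confidence]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {confidence:.0%}: {hit_rate:.0%} came true ({len(outcomes)} predictions)")
```

If your 90% bucket only comes true 60% of the time, you’re overconfident; if it comes true 99% of the time, you’re underconfident and leaving information on the table.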

Political pundits are terrible about these sorts of things. Meteorologists are actually great at it. Now your local weatherman is regularly wrong, but the National Weather Service makes almost perfectly calibrated forecasts[1]. This is, in part, because their models are under constant refinement, always seeking more accuracy. And it pays off: NWS predictions have improved drastically over the last few decades, due to improved models, more data collection, and faster computers. But more on that later.

Local meteorologists, on the other hand, are incentivized to make outlandish forecasts which drive viewership (and erode trust in their profession). One might take this as evidence that public entities make better predictions than private ones, but public funding turns out to be no panacea once we turn to seismology and epidemiology.

Part of the problem, in those fields, is that government and university researchers are under considerable pressure from their employers to develop new models which will enable them to predict disasters. This is a reasonable enough desire, but a desire alone does not a solution make. We can quite easily make statistical statements about approximately how frequently certain locations will experience earthquakes, for instance. But attempts beyond a simple logarithmic regression have so far been fruitless, not just failing to predict major earthquakes but specifically predicting that some of the most destructive earthquakes in recent memory would not occur.

Silver’s primary case study here comes from the planning of the Fukushima Daiichi Nuclear Power Plant. When engineers were designing it in the 1960s, they had to extrapolate what sort of earthquake loads it might need to withstand. Fortunately, the sample of the largest earthquakes is necessarily small. Unfortunately, there was a small dogleg in the data, an oh-so-tempting suggestion that the frequency of extremely large earthquakes was exceedingly low. The standard Gutenberg-Richter model suggests that a 9.0-magnitude earthquake would occur in the area about once every 300 years; the engineers’ adaptation suggested once every 13,000. They constructed fantastical rationalizations for their model and built a power station able to withstand a magnitude 8.6. In March of 2011, a 9.0-magnitude earthquake hit the coast of Japan and triggered a tsunami. The rest, as they say, is history.
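For the curious, the Gutenberg-Richter relation really is just a straight line on a log scale: log10(N) = a - b*M, where N is the annual rate of earthquakes at or above magnitude M. Here is a rough sketch of the kind of extrapolation involved, using invented catalog numbers rather than the actual record for the region:

```python
import math

# Hypothetical annual rates of earthquakes at or above each magnitude.
# These numbers are invented for illustration; they are not the Tohoku catalog.
observed = {6.0: 3.0, 6.5: 1.0, 7.0: 0.3, 7.5: 0.1}

# Least-squares fit of log10(rate) = a - b * magnitude (the Gutenberg-Richter form).
mags = list(observed)
logs = [math.log10(observed[m]) for m in mags]
n = len(mags)
mean_m = sum(mags) / n
mean_l = sum(logs) / n
b = -sum((m - mean_m) * (l - mean_l) for m, l in zip(mags, logs)) / \
    sum((m - mean_m) ** 2 for m in mags)
a = mean_l + b * mean_m

# Extrapolate the straight line out to magnitude 9.0.
rate_9 = 10 ** (a - b * 9.0)
print(f"b-value: {b:.2f}")
print(f"Implied recurrence of a 9.0: about one every {1 / rate_9:,.0f} years")
```

The temptation Silver describes is to decide that the biggest quakes deserve their own, steeper line, which is exactly the kind of “adaptation” the dogleg invites.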

The problem in seismology comes from overfitting. It is easy, in the absence of hard knowledge, to underestimate the amount of noise in a dataset and end up constructing a model which fits random outliers. Those data points don’t represent the underlying reality; rather, they are caused by influences outside the particular thing you’re trying to study (including the imprecision of your instruments).
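A toy demonstration of the trap (mine, not the book’s): fit the same noisy data with a straight line and with a high-degree polynomial, then measure each against the trend that actually generated the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# The "underlying reality" is a simple linear trend; the noise stands in for
# everything outside the thing we actually want to study.
def true_trend(x):
    return 2.0 * x + 1.0

x = np.linspace(0, 10, 12)
y = true_trend(x) + rng.normal(scale=4.0, size=x.size)

# Compare a modest model with an overfit one against the underlying trend,
# evaluated on a dense grid the models never saw.
grid = np.linspace(0, 10, 200)
for degree in (1, 7):
    coeffs = np.polyfit(x, y, degree)
    error = np.sqrt(np.mean((np.polyval(coeffs, grid) - true_trend(grid)) ** 2))
    print(f"degree {degree}: RMS error vs. the real trend = {error:.1f}")
# The high-degree fit threads the noisy points almost exactly, and in doing so
# tends to wander far from the trend that actually generated them.
```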

And it can take a while to realize that this is the case, if the model is partially correct or if the particular outlier doesn’t appear frequently. An example would be the model developed by Professor David Bowman at California State University, Fullerton in the mid-2000s, which identified high-risk areas, some of which then experienced earthquakes. But the model also indicated that an area which soon thereafter experienced an 8.5 was particularly low-risk. Dr. Bowman had the humility to retire the model and admit to its faults. Many predictors aren’t so honest.

On the other hand, we see overly cautious models. For instance, in January of 1976, Private David Lewis of the US Army died at Fort Dix of H1N1, the same flu subtype which caused the Spanish Influenza of 1918. The flu always occurs at military bases in January, after soldiers have been spread across the country for Christmas and New Year’s. The Spanish Influenza had also first cropped up at a military base, and this unexpected reappearance terrified the Centers for Disease Control. Many feared an even worse epidemic. President Ford asked Congress to authorize a massive vaccination program at public expense, which passed overwhelmingly.

The epidemic never materialized. No other cases of H1N1 were confirmed anywhere in the country and the normal flu strain which did appear was less intense than usual. We still have no idea how Private Lewis contracted the deadly disease.

The alarmism, however, damaged public confidence in government predictions generally and in vaccines particularly. The vaccination rate fell precipitously in the following years, opening the way to more epidemics later on.

Traditionally, this category of error was known as crying wolf. Modern writers seem to have forgotten it and have to be reminded not to do it. Journalists and politicians make dozens if not hundreds of “predictions” each year, few if any of which are scored, in no small part because most of them turn out wrong or even incoherent.

Sadly, the pursuit of truth and the pursuit of popularity are uncorrelated at best. As Mr. Silver has learned, striving for accuracy and against premature conclusions is a great way to get yourself berated[2]. Forecasting is not the field for those seeking societal validation. If that’s your goal, skipping this book is far better than trying to balance its lessons against the public’s whims.

But let’s suppose you do want to be right. If you do, then this book can help you in that quest, though it is hardly a comprehensive text. You’ll need to study statistics, history, economics, decision theory, differential equations, and plenty more. Forecasting could be an education in its own right (though regrettably it is not). The layman, however, can improve vastly by just touching on these subjects.

First and foremost is an understanding of probability, specifically Bayesian statistics. Silver has the courage to show us actual equations, which is more than can be said for many science writers. Do read this chapter.

To steal an example from another book: suppose two taxi companies, distinguished by color, operate in a particular region. Blue Taxi has the larger market share. If you think you see a Green taxi, there’s a real chance that it’s actually Blue and you’re mistaken (and a smaller chance that a taxi you take for Blue is really Green). The market share is the base rate, and you should adjust your estimate up or down based on the reasons you might be uncertain. For instance, if the lighting is poor and you’re far away, your confidence should be lower than if you’re close by at midday. Try thinking up a few confounders of your own.
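With made-up but conventional numbers (an 85/15 market split in Blue’s favor and eyes that are right 80% of the time), the arithmetic works out like this:

```python
# Bayes' theorem applied to the taxi example, with invented numbers:
# suppose Blue has 85% of the market and your eyes are right 80% of the time.
p_blue = 0.85                 # base rate: share of taxis that are Blue
p_green = 1 - p_blue
p_correct = 0.80              # chance you identify a taxi's color correctly

# You think you saw a Green taxi. How likely is it actually Green?
p_report_green = p_correct * p_green + (1 - p_correct) * p_blue
p_green_given_report = p_correct * p_green / p_report_green
print(f"P(really Green | you saw 'Green') = {p_green_given_report:.2f}")
# With a strong base rate against it, "I saw Green" is far less conclusive
# than it feels; worse lighting (a lower p_correct) drags it down further.
```

Run it and the answer comes out around 0.41: even though you “saw” Green, the base rate drags the probability below one half.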

To better develop your Bayesian probability estimate of a given scenario, you need to assess what information you possess and what information you don’t possess. These will be your Known Knowns and Known Unknowns. The final category is Unknown Unknowns, the things you aren’t even aware are a problem. A big part of rationality is trying to consider previously ignored dangers and trying to mitigate risk from the unforeseen.

This is much easier to do ex post facto. By that point, the signal you need to consider stands out against hundreds you can neglect. Beforehand, though, it’s difficult to determine which signal is the most important. Often, you’re not even measuring the relevant quantity directly but rather its secondary and tertiary effects. Positive interference can create a signal where none exists. Negative interference can reduce clear trends to background noise. There’s a reason signal processing pays so well for electrical engineers.
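As a toy example of the kind of trick that field relies on: below, a weak sine wave is buried in noise, and averaging many noisy records lets the signal add up coherently while the noise partially cancels. The numbers are invented, obviously.

```python
import numpy as np

rng = np.random.default_rng(1)

# A weak underlying signal, observed through heavy noise many times over.
t = np.linspace(0, 1, 500)
signal = 0.3 * np.sin(2 * np.pi * 5 * t)                      # what we want to measure
trials = signal + rng.normal(scale=1.0, size=(200, t.size))   # what we actually record

def snr(x):
    """Crude signal-to-noise ratio: signal power over residual power."""
    residual = x - signal
    return float(np.mean(signal**2) / np.mean(residual**2))

print(f"single trial SNR:   {snr(trials[0]):.3f}")
print(f"average of 200 SNR: {snr(trials.mean(axis=0)):.3f}")
# The signal adds up coherently across trials while the noise partially
# cancels, so the average is roughly 200x cleaner than any single record.
```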

The applications range from predicting terrorist attacks to not losing your shirt gambling. An entire chapter discusses the Poker Bubble and how stupid players make the game profitable for the much smaller pool of cautious ones. In addition to discussing the mechanics and economics of gambling, it gave me a decent explanation of how poker is played. Certainly interesting.

Another chapter tells the story of how Deep Blue beat Garry Kasparov. Entire books have been written on the subject, but Silver gives a good overview of the final match and what makes computers so powerful in the first place.

Computers aren’t actually very smart. Their strength comes from solving linear equations very, very quickly, and they don’t make the kinds of arithmetic mistakes which humans make, especially when the iterations run into the millions. Chess, however, is a linear game, so it was really only a matter of time until algorithms could beat humans. There’s certainly more complexity and strategy than in many simpler games, but it doesn’t take a singular intelligence to look ahead and avoid making mistakes in the heat of the moment.
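That look-ahead is, at its core, minimax search. Here is a deliberately tiny sketch over a hand-built game tree; Deep Blue’s search and evaluation were vastly more sophisticated, but the principle of propagating values up from possible futures is the same.

```python
# Toy minimax: each internal node maps move names to child nodes; leaves are
# scores from the maximizing player's point of view. The tree is made up.
game_tree = {
    "a": {"a1": 3, "a2": {"b1": -2, "b2": 12}},
    "b": {"a3": 5, "a4": {"b3": 4, "b4": 6}},
}

def minimax(node, maximizing):
    """Best achievable score assuming both sides play perfectly from here."""
    if isinstance(node, (int, float)):      # leaf: just report its value
        return node
    values = (minimax(child, not maximizing) for child in node.values())
    return max(values) if maximizing else min(values)

# The opponent replies after our first move, so their layer is minimizing.
best_move = max(game_tree, key=lambda move: minimax(game_tree[move], maximizing=False))
print("best opening move:", best_move)
```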

Furthermore, the starting position of chess is always the same. This is not the case for many other linear systems, let alone nonlinear ones. Nonlinear systems exhibit extreme sensitivity to initial conditions; the weather is the classic example. The chapter on meteorology discusses this in detail—we have very good models of how the atmosphere behaves, but because we don’t know every property at every location, we’re stuck making inferences about the air in between sampling points. Add to this finite computing power, and the NWS can only (only!) predict large-scale weather systems with extreme accuracy a few days ahead.
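To see what that sensitivity means in practice, here is the Lorenz system, a classic toy model of atmospheric convection (and emphatically not what the NWS runs), integrated from two starting points that differ by one part in a billion:

```python
# Sensitivity to initial conditions, illustrated with the Lorenz system.
def lorenz_step(x, y, z, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz equations."""
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

a = (1.0, 1.0, 1.0)           # one "measurement" of the atmosphere
b = (1.0, 1.0, 1.0 + 1e-9)    # the same measurement, off by one part in a billion

for step in range(1, 40001):
    a = lorenz_step(*a)
    b = lorenz_step(*b)
    if step % 10000 == 0:
        gap = abs(a[0] - b[0])
        print(f"t = {step * 0.001:4.0f}: difference in x = {gap:.6f}")
# The two trajectories start indistinguishable and end up wildly different.
```

Within a few dozen model-time units the two runs bear no resemblance to one another, which is why every unsampled pocket of air eventually costs you forecast accuracy.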

With more sampling points, more computing capacity, or more time, we could get better predictions, but all of these factors play off one another. This dilemma arises throughout prediction. More research will allow for more accurate results but delay your publication date. (This assumes that the data you need is even available: frequently, it isn’t[3].)

Producing useful predictions is not about having the best data or the most computing power (though they certainly help). It is primarily about constraining your anticipation to what the evidence actually implies. Nate Silver lays out several techniques for pursuing this goal, with examples. It’s a good introduction for us laymen; experienced statisticians will probably find little they didn’t already know.

I would not recommend this book, however, unless you’re willing to do the work. Prediction is a difficult skill to master, and those without the humility to accept their inexperience can get into a lot of trouble. Should you want to test your abilities, try doing calibrated predictions and see how accurate you are. Julia Galef has a number of mostly harmless suggestions for trying this out.

If you are serious, however, The Signal and the Noise offers a quality primer on several important rationality techniques, and a good deal of information about a variety of other topics. I found it an enjoyable read and hope Nate Silver writes more books in the future.



[1] Major aggregators like the Weather Channel and AccuWeather tend to take the NWS predictions and paste an additional layer of modelling on top of them, for better or for worse.

[2] In the week before the 2016 election, several liberal commentators accused Mr. Silver of throwing the nation into unwarranted fear for putting Hillary Clinton’s odds of winning at only ~70%. As it turned out, his model was one of the most balanced of the mainstream forecasts, yet everyone then acted as if he had reason to be ashamed for getting it wrong.

[3] The data may be concealed in confidential documents, nominally available but out of sight, or sitting right under your nose. Most often, however, it’s hiding in the noise. Economic forecasts suffer from this last problem: there’s econometric data everywhere, but basically no one has found more than rudimentary ways to make predictions with it. Perverse incentives complicate matters for private-sector analysts, who then often ignore the few semi-reliable indicators we’ve got.


Prediction and Primacy of Consciousness

I finished Leonard Peikoff’s Objectivism: The Philosophy of Ayn Rand in 2015, and on the whole, didn’t get that much out of it. It took a long time to slog through, and didn’t answer some of my longstanding questions about Rand’s intellectual history. I’d recommend it as a reference text, but not as an introduction to Objectivism.

This isn’t a review of OPAR; I’ve discussed it elsewhere. Today we’re going to discuss one of the few good new ideas I learned reading it: primacy of consciousness.

Objectivism advocates a worldview based on primacy of existence. Rand holds that consciousness has no power over reality in and of itself—consciousness is the process of identifying existents, not creating them. Now a conscious mind can decide to alter existents through physical action, or extrapolate the possibility of not-yet-existing existents, but the mere act of thinking cannot produce physical phenomena.[1]

Primacy of consciousness puts the cart before the horse. Perception can neither create a percept, nor modify it, nor uncreate it.[2] Sufficiently invasive methods of inquiry may do that, but the mental process of observation does not.

Let us consider a technical example. When solving engineering assignments, I am often tempted to skip checking my work. The correct answer is independent of whether I’ve made an exhaustive search for mistakes. Yet, on a certain level, it might seem that not looking will make an error go away.

But it won’t. As my structures professor often says, in aerospace engineering we have a safety factor of 1.5. In school, that’s just a target to aim for—if I screw up, the grade will point it out and I’ll feel silly for missing easy points. On the job, that’s not the case. If your work has a serious mistake, you’re going to kill people.

Or wreck the global economy.

Since starting Nate Silver’s book, perhaps the most interesting thing I’ve learned so far (besides an actually intuitive explanation of Bayes’ Theorem, contra Yudkowsky) has been just how stupid the root causes of the housing crisis were.

I’d recommend reading the book if you’d like a properly comprehensive explanation, but the executive summary is that, starting in the late 1990s, the value of houses began to skyrocket in what we now know was a real estate bubble. This was basically unprecedented in US history, which should have been a wake-up call in itself, but the problem was compounded by the fact that many investors assumed housing prices would keep going up. They wanted to bet on these increasingly risky properties, creating all sorts of creative “assets” to bundle specious loans together. Rating agencies were happy to rate these AAA, despite their being totally new and untested financial instruments. The estimated risks of default proved to be multiple orders of magnitude too low. And yet everyone believed them.

Silver describes this as a failure of prediction, of epistemology. Assessors made extremely questionable assumptions about the behavior of the economy and the likelihood of mortgage default, which are legitimate challenges in developing predictive models. Going back to my examples of structural engineering, it’s easy to drop the scientific notation on a material property when crunching the numbers. If you say that aluminum has a Young’s Modulus of 10.7, the model isn’t going to know that you meant 10.7 × 10⁶ psi, or 10.7 Msi. It’s going to run the calculations regardless of whether your other units match up, and may get an answer that’s a million times too big. Remember that your safety factor is 1.5.
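Here is that mistake in miniature, with hypothetical cantilever numbers rather than anything out of the book:

```python
# What dropping the scientific notation does to a simple stiffness calculation.
# Hypothetical cantilever beam, in pounds and inches; the numbers are made up.
P = 500.0        # tip load, lbf
L = 60.0         # beam length, in
I = 12.0         # area moment of inertia, in^4

def tip_deflection(E_psi):
    """Cantilever tip deflection under an end load: delta = P*L^3 / (3*E*I)."""
    return P * L**3 / (3 * E_psi * I)

print(f"E = 10.7e6 psi -> deflection = {tip_deflection(10.7e6):.4f} in")
print(f"E = 10.7 'psi' -> deflection = {tip_deflection(10.7):.0f} in")
# The second number is a million times larger; nothing in the arithmetic
# complains, which is exactly why the check has to come from you.
```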

I don’t think economic forecasters have explicit margins of error, but the same general principle applies. Using the wrong Young’s Modulus is an honest mistake, an accident, which is easily rectified if found early. Lots of errors in the rating agencies’ models weren’t so honest. They made what looked like big allowances for unknowns, but didn’t question a lot of their key assumptions. This speaks to a real failure of epistemic humility. They didn’t ask themselves, deep down, if their model was wrong. Not the wrong inputs, but the wrong equations entirely.

For instance, say I model an airplane’s wing as a beam, experiencing bending and axial loads, but no axial torsion. That’s a very big assumption. Say there are engines mounted on the wing—now I’m just ignoring Physics 101. If I ran the numbers and decided that propulsive and aerodynamic twisting moments were insignificant for the specific question I’m considering, then it might be an acceptable assumption. But I would need to run the numbers.
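What running them might look like, with entirely made-up section properties and loads, is a comparison of the bending stress from lift against the shear stress from the engine’s twisting moment:

```python
# "Running the numbers" on whether torsion can be neglected. Every value here
# is a hypothetical placeholder; this sketches the check, not a real design.

# Loads at the wing root
lift_moment = 2.0e6      # bending moment from distributed lift, in-lbf
engine_thrust = 20_000.0 # lbf
thrust_offset = 30.0     # distance from thrust line to shear center, in
torque = engine_thrust * thrust_offset   # twisting moment, in-lbf

# Idealized wing box: thin-walled closed rectangular section
I = 4_000.0              # second moment of area, in^4
c = 8.0                  # distance from neutral axis to outer fiber, in
enclosed_area = 600.0    # area enclosed by the box walls, in^2
t = 0.25                 # wall thickness, in

bending_stress = lift_moment * c / I                 # sigma = M*c / I
torsion_shear = torque / (2 * enclosed_area * t)     # Bredt: tau = T / (2*A*t)

print(f"bending stress:  {bending_stress:9.0f} psi")
print(f"torsional shear: {torsion_shear:9.0f} psi")
print(f"ratio (torsion/bending): {torsion_shear / bending_stress:.2f}")
# If that ratio comes out tiny, ignoring torsion for this question might be
# defensible; if it doesn't, the "simplifying" assumption was never justified.
```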

Many people, at many organizations, didn’t run the numbers in the years leading up to the financial crisis. Now not all of them were given an explicit choice—many were facing managerial pressure to meet deadlines or get specific outputs. That’s an organizational issue, but really just bumps the responsibility up a level.[3] Managers should want the correct answer, not the one that would put the biggest smile on their face.

In aerospace engineering, we have an example of what happens when you do that:

[Image: the Challenger explosion]

Just because the numbers look good on paper doesn’t mean they correspond to the real world. That’s where empirical testing comes in. Engineers test all the time, but even then, it doesn’t prevent organizational incentives from bungling the truth. If the boss wants to hear a particular answer, she may keep looking until she finds it.

Economists are worse, trying to predict a massively nonlinear system and, Silver reports, doing quite badly at it. Objectivism is very strong on the importance of saying “I know,” but rationality also depends on saying “I don’t know” when you legitimately don’t. Try to find out, but accept that some truths are harder to obtain than others.

Existence exists, and existents don’t care what you think.


[1] Outside of your body, that is. This is where the line between body and mind becomes pertinent, and about where I give up over reducibility problems. Suffice to say that if you can create matter ex nihilo, there are a lot of people who would be interested in speaking with you.

[2] Those of you with itchy fingers about quantum mechanics are politely invited to get a graduate degree in theoretical physics. We’re talking about the macroscale here.

[3] Not that responsibility is a thing that can truly be distributed:

Responsibility is a unique concept… You may share it with others, but your portion is not diminished. You may delegate it, but it is still with you… If responsibility is rightfully yours, no evasion, or ignorance or passing the blame can shift the burden to someone else. Unless you can point your finger at the man who is responsible when something goes wrong, then you have never had anyone really responsible.
 —Admiral Hyman G. Rickover, USN