Locklin on science

My favorite photo of this wacky election

Posted in stats jackass of the month, Uncategorized by Scott Locklin on November 9, 2016

This dope got lucky in 2012, essentially using “take the mean,” and was hailed as a prophet. He was wrong about virtually everything, and if someone were to make a table of his predictions over time and calculate Brier scores, I’m pretty sure he’d get a higher score than a Magic 8-Ball (with Brier scores, lower is better). Prediction is difficult, as the sage said, especially regarding the future. Claiming you can do prediction when you can’t is irresponsible and can actually be dangerous.
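For anyone who hasn't met a Brier score: it is the mean squared difference between the probability you forecast and what actually happened (1 or 0). A minimal sketch follows. The forecasts and outcomes are made up for illustration and are not anyone's real predictions, and treating the Magic 8-Ball as a flat 50/50 forecaster is my assumption.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Made-up numbers for illustration only; NOT real forecasts.
outcomes = [1, 0, 0, 1, 0]                      # what actually happened
confident_pundit = [0.1, 0.9, 0.8, 0.3, 0.7]    # confident and mostly wrong
magic_8_ball = [0.5] * len(outcomes)            # assumed to always say 50/50

pundit_score = brier_score(confident_pundit, outcomes)
eight_ball_score = brier_score(magic_8_ball, outcomes)
print(f"pundit: {pundit_score:.3f}, 8-ball: {eight_ball_score:.3f}")
```

A forecaster who always says 50/50 scores exactly 0.25, so any score above that means the forecasts were worse than knowing nothing.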

While he richly deserves to find his proper station in life as an opinionated taxi driver, this clown is unfortunately likely to be with us for years to come, bringing shame on the profession of quantitative analysis of data. We’ll be watching, Nate.



Bad models and the end of the world

Posted in econo-blasphemy, stats jackass of the month by Scott Locklin on March 23, 2014

I loathe journalism as a profession: a claque of careerist whores, half-educated back-slappers and propagandists for the oligarchical lizard people who are ruining civilization. I loathe “science journalism” particularly, as they’re generally talking about something I know about, and so their numerous impostures are more obvious.

Journalists: they're mostly like this


Today’s exhibit: the “NASA study predicting the end of Western Civilization.” The actual study can be found in PDF format here. If you read the “journalism” about this in the Guardian, CNET, or wherever else, it’s all about our impending doom. Not only do scientists tell us we are doomed, fookin’ NASA scientists tell us we are doomed.

The authors of the study are not NASA scientists. The NASA subcontractor associated with this paper would like you to know this, probably because NASA was annoyed their name was associated with this paper. The only contribution NASA made was in partially supporting an institution which partially supported one of the three authors for one semester. The author in question, Safa Motesharrei, is a grad student in “public policy” and “mathematics” at U Maryland. The other two authors are also not NASA scientists. One is allegedly a professor of … political science in Minnesota, and more recently at some cranky looking outfit called “Institute of Global Environment and Society.” The other one is a professor in the U Maryland department of Oceanic and Atmospheric sciences. Unless you consider the partial ramen noodle stipend one grad student received for one semester to be “funding,” or you consider NASA’s subcontractor to be liars, NASA did not fund or endorse this study in any meaningful way. Journalism 101, failed before even getting to the content of the article.

The three authors of this study


For what it is worth, had I published anything my sophomore year of grad school instead of trying to build a crazy vacuum chamber and catch a venereal disease, I’d have had some words in my paper thanking NASA for their support as well. This is despite the fact that most of my money came from the NSF and the University I was attending. I wasn’t in any way a NASA scientist. My institution wasn’t NASA affiliated. But we did get a NASA grant that paid for at least a month’s salary for me over the course of a year. When you get those grants, you say “thank you.”

Nafeez Ahmed, the imbecile at the Guardian who broke the story, continues to insist that NASA funded this study, despite the fact that they didn’t. I guess when someone discovers you are a shoddy journalist, the accepted thing to do these days is doltishly double down on your error. Ahmed, of course, works for some preposterous save-the-world outfit, which apparently means he can pretend he is a journalist and doesn’t have to tell the truth about anything.

Journalistic failure is to be expected these days, and NASA scientists say stupid things all the goddamned time. Still, reading the paper itself was informative. Had anybody bothered to do so, the story would have been murdered in infancy. It’s one of the godawful silliest things I have ever read.

There is a fairly standard model from ecology called the “predator prey” model. Predator/prey models were mostly developed to model exactly what it sounds like: things like wolf and moose populations in a National Park. The model makes assumptions (that predators are always hungry, that the prey will never die of old age, and that there are no other predators or prey available, for just a few examples of the limitations of the model), but if you set these equations up right, and the parameters and conditions are non-degenerate, it can model reality reasonably well. It’s really no good for predicting things, but it’s OK for modeling things and understanding how nature works. The equations look like this, where x(t) is the predator population, y(t) is the prey population, a is the predator birth rate, b is the predator death rate, c is the prey’s birth rate, and k is the predation rate; all rates are constant.

\frac{dx}{dt} = ax(t)y(t) -bx(t)

\frac{dy}{dt} = cy(t) -kx(t)y(t)

The predator/prey model is elegant, concise, and in some limited circumstances, occasionally maps onto reality. It is, of course, a model; there is no real reason to model things using this set of differential equations, and a lot of reasons not to. But sometimes it is useful. Like most good models, it is simple and doesn’t have too many parameters. Everything can be measured, and interesting dynamics result; dynamics that we can observe in nature.
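To see the sort of dynamics I mean, here is a minimal sketch that integrates the predator/prey equations with a hand-rolled fourth-order Runge-Kutta step, taking x as predators and y as prey per the definitions above. The rates and starting populations are hypothetical, picked only to give tidy cycles, and RK4 is my choice of integrator. The exact solution conserves a y - b ln y + k x - c ln x, which makes a handy sanity check on the numerics.

```python
import math


def lotka_volterra(x, y, a, b, c, k):
    """Right-hand side: x is predators, y is prey, all rates constant."""
    dx = a * x * y - b * x   # predators breed by eating prey, die at rate b
    dy = c * y - k * x * y   # prey breed at rate c, get eaten at rate k
    return dx, dy


def rk4_step(x, y, dt, *rates):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = lotka_volterra(x, y, *rates)
    k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, *rates)
    k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, *rates)
    k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y, *rates)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_new, y_new


def invariant(x, y, a, b, c, k):
    """Quantity the exact solution conserves; checks the integrator."""
    return a * y - b * math.log(y) + k * x - c * math.log(x)


# Hypothetical rates (a, b, c, k) and starting populations.
rates = (0.1, 0.5, 1.0, 0.2)
x, y = 2.0, 10.0
dt, steps = 0.01, 5000

h0 = invariant(x, y, *rates)
trajectory = [(x, y)]
for _ in range(steps):
    x, y = rk4_step(x, y, dt, *rates)
    trajectory.append((x, y))
h1 = invariant(x, y, *rates)

predators = [p for p, _ in trajectory]
print(f"predators swing between {min(predators):.2f} and {max(predators):.2f}")
print(f"invariant drift: {abs(h1 - h0):.2e}")
```

Both populations cycle around the fixed point (x = c/k, y = b/a) forever. Neither one collapses unless you add something else to the model.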

The authors of this doom-mongering paper have transformed that relatively simple set of equations (a set of equations which, mind you, produces some fairly complicated nonlinear dynamics) into this rather un-handy mess, known as HANDY (for “Human And Nature DYnamics”):

\frac{dx_c}{dt} = \beta_c x_c(t) - \alpha_c x_c(t)

\frac{dx_e}{dt} = \beta_e x_e(t) - \alpha_e x_e(t)

\frac{dy}{dt} = \gamma (\lambda -y(t)) y(t) - \delta x_c(t) y(t)

\frac{d \omega}{dt} = \delta y(t) x_c(t) - C_c(t) - C_e(t)

In this set of equations, x_c(t) is the productive peasant population, x_e(t) is the population of parasitic elites, y(t) is “natural resources” and \omega(t) is “wealth.” \lambda is a “saturation of regrowth of nature rate.” \gamma is an “exponential growth of nature rate.” \delta is a “depletion of nature” rate term. C_c(t), C_e(t) are wealth consumption rates.

To make it even more complex, \alpha_c(t), \alpha_e(t), C_c(t), C_e(t) are all functions of \omega(t), x_c(t), x_e(t):

C_c(t) = min(1,\frac{\omega(t)}{poor(t)}) s x_c(t)

C_e(t) = min(1,\frac{\omega(t)}{poor(t)}) \kappa s x_e(t)

poor(t) = \rho x_c(t) + \kappa \rho x_e(t)

poor(t) is some threshold wealth function, below which you starve, and allegedly \rho is supposed to be a minimum consumption per capita, but it really makes no sense based on the equations. s is some kind of subsistence level of wealth and \kappa is the multiple of subsistence that elites take.

Instead of contenting themselves with constant predation or death rates, this train-wreck insists on making them the following:

\alpha_c(t) = \alpha_m + max(0,1-\frac{C_c(t)}{s x_c(t)}) (\alpha_M - \alpha_m)

\alpha_e(t) = \alpha_m + max(0,1-\frac{C_e(t)}{s x_e(t)}) (\alpha_M - \alpha_m)

Where \alpha_m and \alpha_M are constants: a normal death rate and a famine death rate, the latter applying, and I quote the paper directly, “when the accumulated wealth has been used up and the population starves.”
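For the curious, the whole pile of equations above boils down to one right-hand-side function. This is a sketch under stated assumptions, not the authors' code. I read \beta_c and \beta_e as birth rates, which the equations imply but I never spelled out. I take elite consumption to be \kappa times s per head, following the description of \kappa as a multiple of subsistence. All the parameter values are hypothetical round numbers.

```python
def handy_rhs(x_c, x_e, y, w, p):
    """Time derivatives of (commoners, elites, nature, wealth).

    Assumes positive populations; p is a dict of constant parameters.
    """
    poor = p["rho"] * x_c + p["kappa"] * p["rho"] * x_e   # threshold wealth
    ration = min(1.0, w / poor)                 # fraction of full consumption
    C_c = ration * p["s"] * x_c                 # commoner consumption
    C_e = ration * p["kappa"] * p["s"] * x_e    # elite consumption (assumed kappa * s per head)

    # Death rates slide from alpha_m (fed) up to alpha_M (starving).
    span = p["alpha_M"] - p["alpha_m"]
    alpha_c = p["alpha_m"] + max(0.0, 1.0 - C_c / (p["s"] * x_c)) * span
    alpha_e = p["alpha_m"] + max(0.0, 1.0 - C_e / (p["s"] * x_e)) * span

    dx_c = p["beta_c"] * x_c - alpha_c * x_c    # beta_* read as birth rates (assumption)
    dx_e = p["beta_e"] * x_e - alpha_e * x_e
    dy = p["gamma"] * (p["lam"] - y) * y - p["delta"] * x_c * y
    dw = p["delta"] * y * x_c - C_c - C_e
    return dx_c, dx_e, dy, dw


# Hypothetical parameters, chosen as round numbers for illustration only.
params = {
    "alpha_m": 0.01, "alpha_M": 0.07,
    "beta_c": 0.03, "beta_e": 0.03,
    "s": 1.0, "rho": 2.0, "kappa": 10.0,
    "gamma": 0.01, "lam": 100.0, "delta": 0.001,
}

fed = handy_rhs(100.0, 1.0, 50.0, 1e9, params)      # wealth far above threshold
starved = handy_rhs(100.0, 1.0, 50.0, 0.0, params)  # wealth used up
print("fed:    ", fed)
print("starved:", starved)
```

Count the entries in that dictionary: ten knobs, plus four initial conditions, against four for the plain predator/prey model.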

It’s worth a look at what they’re implicitly modeling here by adding all this obfuscatory complexity. All of the following assumptions are made by this model. Very few of them are true in reality. Most of these assumptions are designed to get the answer they did.

  1. The natural resources of the earth are well modeled by the prey equation
  2. The natural resources of the earth regenerate themselves via a logistic function
  3. There are two classes of humans
  4. There is a thing called “wealth” that is consumed by the two classes of humans at different rates
  5. The elite class of humans preys on the peasants and produces nothing
  6. The peasant class is all equally productive
  7. Wealth comes from peasants exploiting nature
  8. Elites all have \kappa times a subsistence income, rather than a smooth distribution of incomes
  9. Peasants all have s , a subsistence income, rather than a smooth distribution of incomes
  10. An extra variable called “wealth” is needed to make sense of these dynamics, and this variable maps onto the thing known in common parlance as “wealth.”
  11. The wealth factor could sustain a human society for centuries after ecological collapse (page 18)
  12. Death rates increase as natural resources are consumed at a faster rate (everything about modern civilization indicates the exact opposite is true)
  13. The peasants get nothing from the elites except population control
  14. Technological change is irrelevant (yes, they argue this; page 7)
  15. This ridiculous spaghetti of differential equations actually models something corresponding to Human Civilization

There are more assumptions than this, but you get the idea: this model is ridiculous, overparameterized, and designed to get the answers that they did. If you assume parasitic non-productive elites, you get the situation where social stratification can help “cause” collapse. Of course, if you assume parasitic non-productive elites, you’re assuming all kinds of ideological nonsense that doesn’t map well onto reality.

If you assume natural resources also act like prey, you can get situations where the natural resources collapse, then the society collapses. This is no big surprise, and you don’t need these obfuscatory complications to say this: it’s in the predator-prey equations already. Why didn’t they just model humanity and nature as simple “predator/prey” above? I am guessing, because nobody would buy it if you say things that simply, and it wouldn’t be an original paper. It also doesn’t allow them to pontificate on egalitarian societies.

As for the additional “wealth” factor these clowns use to distinguish themselves from an earlier bad model; as far as I can tell, the only purpose served by this degree of freedom is making it easier to mine way more natural resources than we actually need to support a population (something that wouldn’t happen in a standard predator-prey model). It also doesn’t make any sense, modeled in this way, unless you believe grain silos contain centuries worth of corn, or that people can eat skyscrapers. That’s how their wealth equations work; they actually assume you can eat wealth.


Dr. Nafeez Ahmed: Guardian columnist who broke this story

I actually feel a bit sorry for these guys, even though they are unashamed quacks. They didn’t ask to become this famous. Somehow the zeitgeist and some imbecile activist newspaper reporter decided to make them famous as people who are really bad at modeling anything. God help these people if they attempt to model something real, like chemical reaction dynamics, or, say, the earth’s atmosphere.

Returning to the mendacious loon, Ahmed, who brought this paper to world fame and attention. He asserts that the paper actually compares historical civilizations using this model. It does nothing of the sort. The paper mentions historical civilizations, but they don’t even make legalistic arguments that, say, the ancient Egyptians, whose civilization lasted for thousands of years, somehow follow these equations.  All they say is, ancient cultures were cyclical; they rise and fall; something everyone has known from the time of Heraclitus. Cyclical behavior does not imply this complex pastiche of differential equations; there are cyclical behaviors in nature which can’t be modeled by any differential equations. Finally, Ahmed asserts that the model predicts things. It doesn’t; nor does it claim to. It claims to model things. Modeling things and predicting things are very different.

The model itself was bad enough. What the activist-reporter said about it is inexcusable. The fact that everyone credulously picked up on this nonsense without questioning how Nafeez Ahmed made his living is even worse. Science by activist press release. Yeah, thanks a lot, “science journalists.” Nobody even noticed the clown who broke this story is a goddamned 9/11 truther.

A more reliable narrator than the Ahmed bozo who broke this story


I find all this intensely sad. I’m sad for the boobs who wasted their lives cranking out a model this useless. I’m sad for our civilization that it is possible to make a living publishing rubbish, and that talented people can’t make a living doing interesting and correct research which will benefit humanity. I’m also sad that journalists aren’t fired over their credulity regarding this fraud. I’m sad that ideological hogwash is published in all the papers as some kind of scientific truth, while nobody notices simple things, like the fact that the world fisheries are presently undergoing collapse, or the fact that there are no more rhinos because Chinese people haven’t discovered viagra yet.

I’m also sad that people are so obsessed with the end of the world. Maybe some day we’ll experience some kind of ecological apocalypse, or the imbeciles in the White House will nuke the slightly less stupid cavemen in the Kremlin. Chances are pretty good though, that before these things happen, we will all be dead. Wiser men than me have pointed out that anxiety about the end of the world is a sort of transference for anxiety about one’s own impending demise. As Spengler put it, “perhaps it is not the end of the world, but just the end of you.”

HFT using neutrino physics: Stats Jackassery

Posted in physics, stats jackass of the month by Scott Locklin on June 18, 2012

I’ve always wondered what good electroweak theory could ever do for anybody, technologically speaking. The unification theory between electrical and magnetic forces produced huge technological benefits for humanity; pretty much all electrical, electronic and radio technology is the result, and the technological results happened quickly. Physicists work feverishly on unification theories, more or less because electromagnetic theory was so damned important to humanity. Electromagnetism was unified with the “weak field” way back in 1973 or so (or 1968, depending on if you count the theory before the experiment, which I don’t), and Salam, Glashow and Weinberg were awarded the Nobel Prize for it in 1979. Call it a round 40 years ago. 40 years after Maxwell’s equations (the unification theory between electricity and magnetism) were written down, humans were using electromechanical power on a wide scale, and radio was already being used (in financial applications no less). Not so much has happened technologically since folks invented electroweak theory.

Espen Haug has apparently spoken of  a potential use for electroweak theory. It got carried by Forbes. His idea, which I assume was somewhat in jest, was using electroweak theory to do high frequency trading. Because neutrinos don’t interact strongly with the rest of nature (that’s why they call it the “weak force”), you can transmit a beam of them through the earth. Basically, Haug noticed that “through the earth” is a much more straight line than “across the earth” which is how signals are generally transmitted. The perimeter of a circle is longer than its diameter. Something which has been known since people started drawing circles in patches of dirt. Therefore, you can potentially trade ahead of price movements in far-away exchanges.
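To put a number on the geometry, here is a minimal sketch comparing the surface arc with the straight chord between two points on a sphere. It takes the earth's diameter as roughly 12750 km and, as a simplifying assumption that is generous to the surface route, lets both signals travel at the vacuum speed of light.

```python
import math

EARTH_RADIUS_KM = 6375.0       # half of a roughly 12750 km diameter
C_KM_PER_S = 299_792.458       # speed of light in vacuum


def path_lengths_km(angle_rad):
    """Surface arc and straight chord for a given central angle."""
    arc = EARTH_RADIUS_KM * angle_rad
    chord = 2.0 * EARTH_RADIUS_KM * math.sin(angle_rad / 2.0)
    return arc, chord


def head_start_ms(angle_rad):
    """Milliseconds saved by the chord, assuming both signals move at c."""
    arc, chord = path_lengths_km(angle_rad)
    return (arc - chord) / C_KM_PER_S * 1000.0


for degrees in (30, 90, 180):
    arc, chord = path_lengths_km(math.radians(degrees))
    saved = head_start_ms(math.radians(degrees))
    print(f"{degrees:3d} deg: arc {arc:8.0f} km, chord {chord:8.0f} km, "
          f"head start {saved:6.2f} ms")
```

Even for exchanges on exactly opposite sides of the planet, the head start is tens of milliseconds at most, and it shrinks fast for closer pairs. That is the entire prize on offer.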

The problem with this, of course, is the fact that any beam of particles which can be transmitted through a giant piece of iron and silicon like the earth can’t be easily detected by anyone. There is a reason they call it the “weak force.” It’s really weak! Detecting any neutrinos at all is a pretty neat trick.  If we wait around a long time, and have really big detectors and a lot of neutrinos coming from somewhere, we can see a neutrino interact with a proton once in a while. People do this sort of thing for a living. It’s fairly important stuff for cosmology, astrophysics and high energy physics. Measurements are difficult, so any experiment involving neutrinos pushes knowledge forward.

I had thought about doing a Shannon type calculation, making some guesses as to neutrino flux humans are capable of producing and transmitting through the earth, and looking at cross sections of the best detectors, to see what kind of information can be transmitted in this way. Another way to think about it: how long do you have to sit around at your detector and count things to see an unambiguous signal in your Poisson noise? If it’s longer than a few milliseconds, you can’t do this trick and make HFT front-runny money. I don’t know much about neutrino detectors, but I do know that the best ones are the size of large scale mining installations, and the time frames for looking for interesting signals are measured in years. It turns out someone already did the hard work for me experimentally by building a neutrino telegraph.
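The back-of-envelope version of the counting argument goes like this. If signal events arrive at rate S and background events at rate B, then after time T you have about S T signal counts sitting on Poisson noise of about sqrt(B T). Asking for n standard deviations of significance gives T = n^2 B / S^2. A minimal sketch, using the Gaussian approximation to Poisson noise; the rates below are invented for illustration and are not measured values for any real detector.

```python
def seconds_to_detect(signal_rate, background_rate, n_sigma=5.0):
    """Counting time at which S*T reaches n_sigma * sqrt(B*T).

    Gaussian approximation to Poisson noise; rates are in events per second.
    """
    if signal_rate <= 0 or background_rate <= 0:
        raise ValueError("rates must be positive")
    return n_sigma ** 2 * background_rate / signal_rate ** 2


# Invented rates, for illustration only.
signal_rate = 0.1        # signal events per second
background_rate = 1.0    # background events per second
budget_s = 0.005         # "a few milliseconds"

wait_s = seconds_to_detect(signal_rate, background_rate)
print(f"need about {wait_s:.0f} s of counting; budget is {budget_s} s")
print("tradeable" if wait_s <= budget_s else "not tradeable")
```

The wait scales as one over the signal rate squared, so halving the neutrino flux quadruples the time you sit there counting.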

The MINERvA detector has been used for this purpose already, in concert with a beam of neutrinos from Fermilab, which is probably close to the best we can do for making lots of neutrinos. The bit rate is reported as 0.1 bits/second, with a 1% error rate. It was also only through 240 meters of rock (the whole path was about 1 km), as opposed to the diameter of the earth, which is 12750 kilometers. No high frequency trading is going to happen at 0.1 bits/second, or whatever lower rate one can get transmitting the beam through some large chord of the earth, assuming you can do that at all.

There are other problems with the idea. How do you modulate a neutrino beam? Can you do it on a millisecond timescale? Maybe you can, but accelerators are big giant things, and doing things like ramping magnetic fields in them up and down to change the energy or amplitude of neutrinos, or accelerating a bit string of neutrino-making proton clumps, takes a long time. Making an atom smasher which makes a lot of neutrinos … well, I’m guessing it will be even bigger than Fermilab, which is pretty damn big. I also don’t have a good idea of how collimated a beam of neutrinos is. My guess would be, “not very.” But even if you could make a neutrino ray with a laser-like milliradian divergence (almost certainly impossible), the beam radius on the other end of the earth will be measured in kilometers. This would imply that a detector at the other end would have to be very big indeed. Or else someone else could build a detector within the beam radius and see the same thing.
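The beam spread is one line of small-angle arithmetic: radius is roughly divergence times distance. A minimal sketch, assuming the milliradian figure is a half-angle and the path is a full earth diameter of 12750 km:

```python
def beam_radius_km(divergence_rad, distance_km):
    """Small-angle estimate: beam radius grows linearly with distance."""
    return divergence_rad * distance_km


EARTH_DIAMETER_KM = 12750.0
MILLIRADIAN = 1e-3   # assumed half-angle divergence

radius_km = beam_radius_km(MILLIRADIAN, EARTH_DIAMETER_KM)
print(f"beam radius after one earth diameter: {radius_km:.2f} km")
```

Anyone who parks a detector inside that circle sees the same bits you do.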

On the other end of things, the detector in the MINERvA experiment would indeed “fit in a basement” at someone’s trading office; it was only 5 tons of scintillators. Putting aside the beam divergence issue, this would work a lot better if it was a lot bigger. The more mass you have, the more neutrinos you can see. That’s why folks do things like using a cubic kilometer of antarctic ice pack as a detector. Assuming you could scale up the bit rate by increasing the detector size, maybe if you built one 100,000 times bigger, that would be good enough? I’m guessing that 500,000 tons of detector might cost a bit of money. I suppose it is possible, if unlikely. Submarine cables from San Francisco to New Zealand are around 80,000 tons, rather expensive, and not as complex.
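Here is the scaling guess spelled out, as a minimal sketch. It assumes the bit rate grows linearly with detector mass, which is the optimistic assumption made above rather than anything measured, and it starts from the 5 tons and 0.1 bits/second quoted for the neutrino telegraph.

```python
BASE_MASS_TONS = 5.0          # scintillator mass quoted above
BASE_BIT_RATE = 0.1           # bits/second reported for the neutrino telegraph


def scaled_detector(factor):
    """Mass and bit rate if both scale linearly with detector size (assumption)."""
    return BASE_MASS_TONS * factor, BASE_BIT_RATE * factor


mass_tons, bits_per_s = scaled_detector(100_000)
ms_per_bit = 1000.0 / bits_per_s
print(f"{mass_tons:,.0f} tons of detector, {bits_per_s:,.0f} bits/s, "
      f"{ms_per_bit:.2f} ms per bit")
```

That only gets a single bit under a millisecond. The short-path bit rate is also the best case, since the flux at the far side of the earth will be far lower.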

Something tells me the HFT boys  aren’t going to be running triangle arb on neutrino signals, like, ever. Nice funding attempt though.

I don’t think Espen Haug deserves the stats jackass award, as he’s a serious guy who knows about noise distributions. He also hasn’t written any papers on the subject. Similar comments apply to the aptly named neutrino physicist John Learned, who presumably knows about detector noise and isn’t actively agitating for neutrino ansibles. Bruce Dorminey wrote the article; he should definitely know better. Either way, Forbes published this, and it doesn’t even pass a sniff test: someone there was jackassed enough to publish this without disclaimers. For that, Bruce and his credulous numskull editors at Forbes Magazine are stats jackass of the month:

Does anyone at The Atlantic understand statistics?

Posted in finance journalism, stats jackass of the month by Scott Locklin on March 24, 2011

At least they’re not pimping talking points for the oligarchy this time: it’s just general journalistic imbecility. No, Anne Hathaway news does not drive Berkshire Hathaway price changes. No, I haven’t tried to do this regression, nor will I ever try to do this regression, because I’m not as statistically retarded as people who think the Huffington Post is anything but an exotic white noise signal generated by the amygdalas of neurotic liberal arts majors.

If anyone reading my blog has fallen victim to the latest installment of the Atlantic’s regularly scheduled moronathon, please see David Leinweber’s excellent and hysterically funny paper, Stupid Data Miner Tricks: Overfitting the S&P500. In it, he uses almost twice as many data points as the HuffPo mouth breather to show a nearly perfect correlation between Bangladeshi butter production and the S&P500. Adding in sheep population also made the regression better. Professor Leinweber’s paper is a classic in the field: anyone who cares about doing statistics properly should read and internalize its lessons.
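If you want to see why a handful of data points proves nothing, here is a minimal sketch. It is not Leinweber's method or data, just the same lesson with random numbers. It generates one series of pure noise as the "price," then searches a pile of unrelated noise series for the best correlation, once with six points and once with sixty. The candidate count of 200 and the fixed seed are arbitrary choices of mine.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_spurious_fit(n_points, n_candidates, rng):
    """Largest |r| between one noise 'target' and many unrelated noise series."""
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson_r(target, candidate)))
    return best


rng = random.Random(0)   # fixed seed so the run is repeatable
best_6 = best_spurious_fit(6, 200, rng)
best_60 = best_spurious_fit(60, 200, rng)
print(f"best |r| with  6 points: {best_6:.2f}")
print(f"best |r| with 60 points: {best_60:.2f}")
```

With six points, rummaging through a couple hundred meaningless series reliably turns up a correlation that looks impressive. With sixty points the same rummaging finds far less, though a big enough pile of candidates will still hand you something.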

As for the hedge fund consultant guy they interviewed, John Bates: I hope you are suitably embarrassed or they egregiously misquoted you. Otherwise, I hope I never have to fix one of your messes. You should hang your head in epic shame for your apparent donkey-like lack of understanding of even the most rudimentary ideas about spurious correlation. Until you make amends and grovel in shame before your professional peers for misleading the public about a six data point spurious correlation in exchange for a little publicity, I hereby award you with the very first “Locklin on science statistical jackass of the month” prize:

Enjoy your prize. If I were in charge of the guild, it would be the stockade and rotten cabbages for you. Progress software? Not the guys that make Apama? Some pimp tried to recruit me to that outfit. I couldn’t understand why a CEP would be written in Java, just as I now can’t even imagine working for a company whose CTO doesn’t understand regression.