Advice to a young social scientist

Posted in five minute university by Scott Locklin on August 28, 2015

A comment which woke me from my long nap:

“What areas of mathematics or technical knowledge would you consider necessary for a hedgie analyst or academic researcher in economics / pol science / anthropology / history? I’m not interested in bits, bolts, DNA or mechanical things, but would like to apply more rigor to social, business and economic problems.”

Simple answer: statistics (and ideas in probability). Not the baby stats rubbish where they give you a recipe and hope for the best. Not even the stuff they teach you in an experimental physics course: real statistics, like they use on Wall Street to make money.

If you want to be bleeding edge, or do some exploration on your own, there are interesting results in information theory and machine learning which can help you, but what will help you more than this is a deep understanding of plain old statistics. Frequentist, Bayesian, Topological; whatever: just learn some stats to the point where you understand how they work, what they’re good for and where they break down.

My formal training was in physics, where, generally speaking, statistical sophistication is fairly low. Physicists have the luxury of being able to construct experiments where the observation of one or two photons or some preposterously small amount of torque on a magnetometer is meaningful. Pretty much nobody but physicists has this luxury.

Physicists no longer have this luxury for the most interesting problems these days. Unfortunately nobody told them, which is why physics has been languishing in the swamplands, with “physicists” working on non-falsifiable noodle theory, cosmology and writing software for computer architectures which will probably never exist. I think it was Kelvin who said, “in science there is only physics, all the rest is stamp collecting.” When Kelvin said it, this was true: because nobody had bothered to invent statistics yet. Physics was the only real Baconian science.

Now, we have statistics. A flawed quasi-mathematical technique which is effectively how we know anything about everything that isn’t pre-1950s physics. Yes, yes, Disraeli and Mark Twain said there are “lies, damn lies and statistics.” They should have said “bad statistics,” but that’s all there was in those days. Before we had the adding machine, statistics was the purview of Gauss and people who mostly were not doing it right.

Guys like Fisher, Pearson (both of them), Kolmogorov, Neyman, de Finetti, Jeffreys, Savage, Cramer and the lot are as important to our understanding of the world as Heisenberg and Darwin. Indeed, at this point I would go so far as to say that statistics invented in the 1930s is arguably more important than physics done in the 1930s. Most of the useful new knowledge of the last 60 years is directly attributable to such men. They don’t get enough respect.

[Image: Kolmogorov getting respect]

Doing statistics well is the essence of all useful social science. As you probably have noticed, most social science is not done well. Much of social “science” isn’t very scientific; it’s often merely ideological gorp. The statistics used in the social sciences (and biological sciences and drug discovery and …) are abused so preposterously that the results appear to be mathematical and methodological jokes rather than findings which must be taken seriously. If social sciences took themselves seriously, they would be sciences rather than shaggy dog stories.

Consider psychology: according to a recent Science article, the majority of results of a sample of psychology papers can’t be reproduced. Let that sink in for a moment: more than half the results of these psychology papers are anecdotes. Part of this is because the researchers in that field are quacks and morons. Part of it is because they are evil quacks and morons. I sit in a cafe which is near the UC Berkeley psychology building, and often overhear conversations by professors, grad students and post-docs from this place. Once in a while I overhear something intelligent and salubrious. For example, I was grateful to overhear a conversation about this paper a few months ago.

However, I have often heard learned psychology department dunderheads stating what the result of their paper will be, and instructing their underlings to mine the data for p-values. I suppose they may have thought themselves speaking over the heads of the rabble, since nobody else from their department was visible. Mind you, they did this in a public place, in a town which is filled to the nostrils with people with training in rigorous subjects, like, you know, me, the buxom Russian girl reading Dirac in the corner, the options trader eating a sandwich, and the girl pouring the coffee, who is studying mathematics. This indicates to me that such people are so abysmally stupid and unaware of their own deficiencies, they couldn’t achieve a scientific result if they actually tried to do so. Have a click on this link for the UCB psychology department: at least two people on this list are cretinous scientific frauds. If the Science paper mentioned above is a representative sample, most of them are.
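
For anyone who hasn't seen what "mining the data for p-values" actually does, here is a rough simulation sketch in Python. Everything in it is an assumption made for illustration: the function names, the 20 outcomes per study, the 50 subjects per group, and the crude large-sample z-test standing in for whatever test a real paper would use. Both groups are drawn from the same distribution, so every "significant" difference it finds is a false positive.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_p(a, b):
    """Two-sided p-value from a large-sample z approximation
    (a rough stand-in for a two-sample t-test)."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def mined_study(rng, n_outcomes=20, n_per_group=50):
    """One hypothetical 'study': measure many outcomes on two groups drawn
    from the SAME distribution, then keep only the smallest p-value."""
    pvals = []
    for _ in range(n_outcomes):
        treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        pvals.append(two_sample_p(treated, control))
    return min(pvals)


def false_positive_rate(n_studies=300, seed=0, **kwargs):
    """Fraction of null studies that can report at least one p < 0.05."""
    rng = random.Random(seed)
    hits = sum(mined_study(rng, **kwargs) < 0.05 for _ in range(n_studies))
    return hits / n_studies


if __name__ == "__main__":
    print("one outcome, decided in advance:", false_positive_rate(n_outcomes=1))
    print("best of 20 mined outcomes:      ", false_positive_rate(n_outcomes=20))
```

If the 20 tests were independent and exact, the chance of at least one p < 0.05 with no real effect anywhere would be 1 - 0.95^20, or about 64%, against about 5% for a single outcome chosen in advance. The simulation should land in that neighborhood.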

Should I ever strike it rich enough to endow a foundation, I would pay legions of trained statisticians to go through the literature and eviscerate the mountains of bad “research” and arrive at the truth. If Universities were interested in advancing human knowledge, rather than advancing a tenured circle jerk which fields a football team, they’d fund entire departments of people who do nothing but act as Inquisitors about their research findings. Meanwhile I will have to content myself with instigating ambitious young people to arm themselves with the best statistical weapons they can muster, and go forth to slay dragons.

It can be done, and at this point, it can be good for your career. Examples here, here and here. There is plenty of bullshit out there, and as Thucydides (also worth a look for young social scientists) said, “the society that separates its scholars from its warriors will have its thinking done by cowards, and its fighting by fools” so get to work!

33 comments

crocodilechuck said, on August 28, 2015 at 4:02 am

Well flensed.

Mark Leeds said, on August 28, 2015 at 4:21 am

too good. and nice to see a positive blog on statistics. I think the problem with statistics is that, since you can run a regression by typing lm(y ~ x), people think they can perform regressions. it’s not like medicine, where someone wouldn’t dare (or be able) to try to do something without pretty serious training. you wouldn’t see the nurse picking up the scalpel and saying to the surgeon: “doc. chill. I got this”.
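
To make the lm(y ~ x) point concrete, here is a minimal Python sketch (a hand-rolled least squares fit, not R's lm; the toy data and the function names are made up for illustration). The data are exactly quadratic, yet the straight-line fit reports a respectable R-squared. Only the residuals give the game away.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Toy data: the true relationship is y = x^2, with no noise at all.
x = list(range(11))
y = [xi ** 2 for xi in x]

slope, intercept = ols_line(x, y)
r2 = r_squared(x, y, slope, intercept)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

if __name__ == "__main__":
    print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.3f}")
    # Residuals are positive at both ends and negative in the middle:
    # a systematic pattern, which means the linear model is the wrong model.
    print([round(r, 1) for r in residuals])
```

The fit comes back with an R-squared of about 0.93, which looks fine if the summary line is all you read. Checking the residuals for structure is the part that typing one line does not do for you.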

you should write more frequently. your articles are quite entertaining and interesting.

mvr said, on August 28, 2015 at 12:06 pm

i would be curious to hear your opinion on “mostly harmless econometrics,” i.e. the shift towards using ‘quasi-experimental’ methods in applied microeconomics (e.g. instrumental variables)

(love the Thucydides quote btw)

- Scott Locklin said, on August 28, 2015 at 2:09 pm
  
  I would say that book lives up to its name, unlike, say, Wooldridge.
  
wrmckinney said, on August 28, 2015 at 4:36 pm

Scott, if you get a chance, turn your pen against UCB’s Moskowitz bozo.

http://www.cnn.com/2015/07/28/health/cell-phones-brain-tumor-risk-berkeley/

- Scott Locklin said, on August 30, 2015 at 8:47 pm
  
  Too many bozos, not enough time. I need to put my superior statistical abilities to work and find a way to cash out so I can spend my time boob dissecting.
  
  FWIW, I do remember being forced as part of a summer internship in 1992 to listen to Indira Nair yammer on the subject of how power lines … well she had to admit there was no measurable effect, but we should all do something about it anyway, because … feeeewings. This fraud became a vice provost at CMU, one of the best engineering schools on earth, and probably makes more money than any professor, basically by regurgitating pop culture neuroses.
  
  I have a renowned scientist friend at Cal trying to get a raise. The only way he can figure out how to do it is by doing fashionable horse shit like this. Academia is corrupt and insane, and the best thing I ever did was walk out the door.
  
  Oh yeah: read Ben Mann’s “Life on Loan”: best academic comedy since “Lucky Jim.”
  
Mark Leeds said, on August 28, 2015 at 5:46 pm

Scott: Just one other thing since you do write for the sake of humanity :). I am not a political scientist nor do I know much about it. Basically, I know zero about it but when I read articles, I can follow the econometrics of them.

Anyway, a few years ago (probably five now that I think about it), I was trying to find nice theoretical-applied explanations of distributed lags and I ended up finding some really well done explanations in “political science” publications. I can’t say that the political science itself was up to snuff but the distributed lag explanations were solid and had better intuition than a lot of the econometrics journals, which often focus on limit theorems, consistency, etc.

Off the top of my head some guy named Keele seems good and a prof from Penn State whose last name started with a D. Maybe Deboeffe ? And of course Gary King is quite well known and well respected.

So, for people who are interested in the connection between political science and statistics-econometrics, the field is not totally devoid of some decent econometric work. But it is sparse. I think someone above may have even talked about the sparsity in a paper IIRC. Not that you were claiming it was devoid but I just wanted to mention it for people who are or will be interested in that connection.

- Scott Locklin said, on August 30, 2015 at 8:46 pm
  
  Fisher and Pearson (and the guys who wrote the critical p-value paper) were also psychologists, at least part time. I wouldn’t say there is nothing but frauds working in the field: some people care about truth, and are trying to make things better.
  Also, I did mention that biology and medical research are also highly suspect for mining p-values. One of my favorites was some huge long term study done by some government bureaucracy on diet. They were annoyed that eating lots of baloney and bacon didn’t have a statistically significant negative health outcome across the list, and in fact, appeared to be good for you. At least they reported it on NPR and such, though try to find this piece of information on the google machine.
  
zakdavid said, on August 29, 2015 at 7:40 am

Welcome back! You have been missed.

Maggette said, on August 29, 2015 at 12:18 pm

He’s back…:).

Lee Gomes said, on August 29, 2015 at 3:02 pm

I would simply like to repeat what Mark Leeds said, supra.

“you should write more frequently. your articles are quite entertaining and interesting.”

Brian said, on August 31, 2015 at 12:05 pm

don’t be a stranger

Toddy Cat said, on August 31, 2015 at 7:10 pm

“there was no measurable effect, but we should all do something about it anyway, because … feeeewings.”

Back in the 1960s we tore a perfectly functional country to pieces on the basis of bullsh*t “research” such as you describe above, and nobody called foul, because…(alleged) Science! Glad to know that you’re back exposing crap like this. I thought that maybe you had struck it rich and decamped to the Caymans – good to hear from you again.

Petro said, on September 1, 2015 at 3:45 am

http://www.duffelblog.com/2015/08/f-35-loses-dogfight-to-red-baron/

(duck and cover 🙂 )

Mike in Boston said, on September 8, 2015 at 4:33 am

A couple of years ago I was interviewing candidates for a job requiring a solid stat background. I’ll never forget how one (brilliant) young guy started off: “I have a Ph.D. in psychology, but please don’t hold that against me.”

Glad to be reading your stuff again. Hope you will find the time to post more often.

Matt M said, on September 19, 2015 at 3:24 pm

Hi Scott,

Many thanks for your response above. I’m pretty sure you’re correct about a lot of social science arguing not in good faith but from ideological perspectives which is why I wanted to enquire about appropriate quantitative methods. I’ve written about this here:

http://www.maloneygreenblog.com/computation-by-half/

But with only a conceptual and untested approach to major social science problems, my opinion is conjecture at this point, albeit somewhat logical.

My worry about finding the ‘truth’ in social sciences is that the reason why there is so much ideology and in some cases “fraud” as you say, is not that people are stupid per se, but that the findings are dynamite depending on how you interpret them.

For example, incarceration rates among certain communities, the lack of certain groups in certain academic fields, the main factors that cause a person to win elections etc. Some of these findings hurt ‘feewings’. Some of them even I find disappointing and upsetting.

I’ve been leafing through the CFA quant section for a primer on stats/prob but could you recommend a good textbook or course on this area to become sufficiently equipped? What are your thoughts on the CFA, out of interest?

My hunch is that, and I’m sure you’re aware, social science analysis will at best provide probable answers and ranges rather than hard linear relationships of the kind that economists, much less physicists like yourself, yearn for. For example, one could say that +1 inch of height = +5% chance of reproductive success, but more likely +1 to 5 inches of height + some minimum of variable x + unknown u = +5-10% chance at time t only.

Finance is messy which is why getting it right makes so much money for people. I’m pretty certain the top people in finance are the best social scientists, not the academics or journalists.

- Scott Locklin said, on September 20, 2015 at 10:01 pm
  
  I think the best way to think about social science is to look at it as phenomenology. Economics has some theory behind it that holds in some approximate sense (humans, or at least homo economicus, tend to not light their bankroll on fire on purpose), but it’s all very vague and approximate. Most of the theories on why such and such is the case have no predictive power out of sample anyway, so what good are they? “Knowing what the truth is” doesn’t always help you. Population IQ for example: while I think the evidence is pretty good for a large genetic component … it doesn’t really matter where it comes from. It is there. It’s a simple fact of biophysics that women are not as strong as men, yet we still have idiots who insist ladies can be Marines.
  
  I guess “knowing stats” is a sort of lifelong process. I’ve actually never read a book on the subject. For me, the two most valuable things to do when I need to understand an algorithm like, say, the Mann-Whitney U-test: first, play with an implementation, feeding it lots of weird data and edge cases, and see where it breaks and where it shines compared to something similar. Next level up, write one yourself. I don’t think the theory matters so much as having practical knowledge of, say, when to use a T-test. I’m a redneck who wants to understand all the moving parts inside. You can’t get that sort of knowledge from reading a manual; you have to take it apart and put it back together again.
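
In that spirit, a minimal "write one yourself" sketch of the Mann-Whitney U-test in Python. The names (midranks, mann_whitney_u) are made up here, and the p-value uses the normal approximation with a tie correction and no continuity correction, which is only one of several conventions. An established implementation such as scipy.stats.mannwhitneyu is the thing to compare it against.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U statistic for sample a, two-sided p-value).

    Normal approximation with tie correction, no continuity correction."""
    n1, n2 = len(a), len(b)
    pooled = list(a) + list(b)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    if var <= 0:  # every value tied: the data carry no rank information
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sqrt(var)
    return u1, 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Weird data to poke it with:
    print(mann_whitney_u([1, 2, 3], [4, 5, 6]))        # complete separation
    print(mann_whitney_u([1, 2, 3], [4, 5, 1e9]))      # same ranks, absurd outlier
    print(mann_whitney_u([5] * 4, [5] * 4))            # nothing but ties
    print(mann_whitney_u([1, 2, 2, 3], [2, 2, 4, 5]))  # heavy ties
```

The test only sees ranks, so turning the largest value into an absurd outlier changes nothing, which is where it shines against a T-test. With every value tied the variance collapses to zero, which is exactly the sort of side case where a naive implementation divides by zero.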
  
  CFA: It is certainly a broad set of subjects, and I guess I am glad I have the books from the first series. The problem with crap like CFA is it assumes everything is stocks and bonds. Even with these items, while you learn a lot, you don’t learn the important pieces, like “the basic strats that everyone uses to make money.” I started studying for it at one point years ago. Then the guy who used to run the Medallion fund for RenTech asked me WTF a CFA was, so I quit.
  
Mark Leeds said, on September 21, 2015 at 2:23 am

Hi Matt: As I meant to say earlier, but probably wasn’t clear, you can do solid econometrics-statistics in the social sciences. I don’t know if it leads to anything interesting but there are really solid statisticians-econometricians in those fields. Gary King is a top stat guy at Harvard whose total focus is political science. John Fox is a top stat and R guy (in Canada somewhere. Maybe McMaster) who does sociology. There’s nothing in the social sciences that precludes it from having solid, quality statistical time series-econometrics applied to it. And not just time series either. Cliff Clogg was at Penn State right before I got there. He was a sociologist and a really top categorical stat guy in the stat department. Unfortunately he passed away suddenly in 1994. Stephen Fienberg is another similar person at CMU.

As far as what comes out of applying solid analysis to the social science, I can’t say. Scott will have more to say on that.

Finally, I agree with Scott about playing with some test to get a feel etc., but I also think books can help, at least in my experience. The Sage series on Quantitative Applications in the Social Sciences is a series of short, skinny green books that provide introductions to all sorts of statistical topics. I’m not sure about your background but, if you want something introductory, I would check that series out. Some are good and some aren’t. I have a lot of them. They’re good when you want a quick idea of how something works without getting into all the gory, gory details. They’re almost like a chapter out of a textbook covering that respective area. Recently I was reading one called “Introduction To Time Series Analysis” (and I’ve read books like Hamilton, Hayashi) and IMHO it contained the nicest explanation of distributed lags that I had seen in a loooong time, and I’ve read a lot of articles on them going all the way back to the 50s, when they were first developed.

Some sample titles in the Sage Series are:

Introduction to Time Series Analysis
Fixed Effects Regression Models
Pooled Time Series Analysis
Generalized Linear Models
Interpreting Probability Models
Regression with Dummy Variables
Stochastic Parameter Regression Models
Multiple Time Series Models
Regression Diagnostics
Understanding Regression Assumptions

Cheap too which is always nice.

- Matt M said, on September 21, 2015 at 10:38 am
  
  Many thanks for the recommendation Mark. Will look into these. Agree with Scott as well that the best way to learn and master this stuff is through application.
  
  - Mark Leeds said, on September 21, 2015 at 6:14 pm
    
    Hi Matt: You’re welcome. Yes, definitely applications are key. But sometimes the books-theory can tell you what assumptions are being made, how you can test whether they hold, what happens when they don’t hold, etc. Also, if you use R, John Fox has a nice book with various examples of data analysis in R. I think it’s called “An R Companion to Applied Regression” (which complements his “car” package in R). All the best.
    
    Mark
    
Alex said, on October 2, 2015 at 9:50 pm

Thucydides didn’t say that – it was some 19th century Brit. I’ve generally found that to be true for most witty sound-bites attributed to a classical Greek; possibly that tells you something about the difference in quality between the two civilizations.

Toddy Cat said, on October 19, 2015 at 5:37 pm

By the way, Mr. Locklin, do you still write for Taki’s, or did he bounce you like he bounced Derbyshire? JD swears that he and Taki parted friends, but it’s interesting that Taki’s Mag rather curtly announced that “Radio Derb” would no longer be hosted immediately after Derbyshire wrote a column comparing Greece to Miley Cyrus, in terms of dysfunctionality. Touchy people out there, touchy….

Anyway, if not Taki’s, is there any other venue for which you write regularly? Like a lot of other folks commenting here, I always enjoyed your work.

- Scott Locklin said, on October 29, 2015 at 5:56 pm
  
  I’m too busy to write regularly, or to go visit Derb at the Menckenfest to find out what (if anything) happened. As far as I know I am in good standing with Mandolyna; I introduced her to the guy who wrote this:
  http://takimag.com/article/brothers_in_civilianland_max_black/print#axzz3pyrxPief
  
  Taki isn’t exactly thin skinned, so I doubt anything happened there. Probably it was just easier to do it via Unz.
  
  - Toddy Cat said, on November 11, 2015 at 10:28 pm
    
    Probably right. All this “people getting fired for expressing their opinions” stuff has me paranoid…
    
    - Scott Locklin said, on November 12, 2015 at 5:46 am
      
      I’ve never seen society so low in trust as now. Shit is completely bonkers.
      
Tom Cat said, on February 25, 2016 at 2:21 pm

It’s interesting that H.L. Mencken thought higher “math” was just complex gibberish, piffle and metaphysics:

https://en.wikipedia.org/w/index.php?title=H._L._Mencken&gettingStartedReturn=true#Science

I doubted the accuracy of these quotes when I first read them, but the sources are online and worth looking at. Mencken read and reviewed popular science books, so he actually knew quite a bit about mathematics (enough to know about Cantor’s set theory).

- Scott Locklin said, on February 25, 2016 at 3:52 pm
  
  I’ve looked at that before and have mixed feelings about it. I figure his newspaperman’s instincts were correct, even if he was wrong. I’ve certainly complained enough about physicists and “physicists” going off the deep end and doing mathematical speculation that is basically untestable, but has a lot of “woo” to it.
  
  - Tom Cat said, on March 12, 2016 at 12:20 am
    
    What I thought was:
    
    Yeah, there are lots of obvious “journalistic” mistakes he’s making (resting on superficial impressions in something obviously complex like math), but he tends to exaggerate rhetorically. Plus, he might just be referring to mathematical Platonism, not whether math is consistent or not. The mistake was relying entirely on impressions and analogies based on them, but his judgement of impressions was more accurate than most (thus encouraging that ‘mistake’). I thought the big point he might have been making is that complex stuff has to be reduced to the simple, in order to make sense of it, which is true I think.
    
    Side note: The _American Mercury_ archives online have a lot of great stuff that somehow isn’t in his published books, say this excellent putdown of Clyde Fitch, which ranks up there with his putdowns of “actors”, Jennings Bryan, La Monte, etc.:
    
    http://unz.org/Pub/AmMercury-1925jan-00124
    
Arjun Reddy said, on February 16, 2019 at 11:09 am

How were you able to learn stats and the math behind it without the help of a professor? I’m in a stage now when I basically need to sort of tinker with it, like you said, but I’m still finding it inhumanly difficult to do it without guidance

- Scott Locklin said, on February 16, 2019 at 1:49 pm
  
  Go buy Schaum outlines for $10; do all the problems in it.
  
  - Arjun Reddy said, on February 16, 2019 at 2:04 pm
    
    Thanks man. Will check them out
    
    - Scott Locklin said, on February 16, 2019 at 2:11 pm
      
      “Work a shitload of problems until you understand” is generally the right answer. Schaum’s stats book is decent and knowing it you’ll be in good shape. You can figure out pitfalls of stats later.
      
      - Arjun Reddy said, on February 16, 2019 at 6:48 pm
        
        Planning to do Cosma Shalizi’s freely available books for undergrads after I make some progress and meet the prerequisites…you were in correspondence with the man, if I remember right?
        