Casual Inference Data analysis and other apocrypha

So, I’ve been sciencing data for the last ten years or so. One of the most common questions I get from those starting out in Data Science is something like “what books should I be reading?”. It’s hard to answer this question directly, since the answer really depends on what you’re looking to learn and what you already know. I could just list every book I’ve read, but as the pretentious people of the past might say, ars longa, vita brevis - which is Latin for “I don’t have time to read all that, dude”. Instead, I’ve done the next best thing, which is to list some of my favorite books, which together cover most of the fields I think about regularly. I’ve divided it into a few sections:

Keep in mind that this list is non-exhaustive and highly subjective. It’s a guided tour of some resources I’ve found helpful, and perhaps you will too. It focuses more on things I care about (statistics, causal inference) and less on things I have only a little experience in (Deep Learning, for example).

Good reads

Big-picture overviews

Core stats/probability: All of Statistics by Larry Wasserman

This book is a map of the most important definitions and theorems in Statistics. It starts from zero and builds up to rigorous definitions of plenty of recognizable statistical methods. The first ten chapters or so are a good, thorough introduction for an undergrad (or perhaps someone who has forgotten all the proofs from undergrad).

The same author has also written All of Nonparametric Statistics, though I haven’t read it.

All of data analysis, more or less: ADAFAEPOV

Cosma Shalizi

TALR

Getting into Bayesian Analysis: the puppy book

A computation-based perspective: Computer Age Statistical Inference

Practical Causal Inference methods

Causal Inference: The Mixtape; The Effect; Causal Inference for the Brave and True

CI the mixtape, The Effect, CIFTBAT

Causal Inference Theory

Pearl’s Little Book

Morgan and Winship

Regression

Gelman and Hill regression, RAOS, Shalizi TALR

Good references

Snacktime: Some neat papers accessible to every level

blogs?

Shalizi

Gelman

Rina Artstain

A bonus - books which are about the usage of data but not necessarily written for data practitioners

The Tyranny of Metrics

Superforecasting

Seeing Like a State

Nate Silver (I know, sorry)