Where it begins: The (understandable) urge to stop early
Let me tell you a story - perhaps a familiar one.
Product Manager: Hey {data_analyst}, I looked at your dashboard! We only kicked off {AB_test_name} a few days ago, but the results look amazing! It looks like the result is already statistically significant, even though we were going t... Read more 10 Jul 2024 - 16 minute read
Scikit-learn pipelines let you snap together transformations like Legos to make a Machine Learning model. The transformers included in the box with Sklearn are handy for anyone doing ML in Python, and practicing data scientists use them all the time. Even better, it’s very easy to build your own transformer, and doing so unlocks a zillion opport... Read more 02 Jan 2024 - 6 minute read
Models of elasticity and log-log relationships seem to show up over and over in my work. Since I have only a fuzzy, gin-soaked memory of Econ 101, I always have to remind myself of the properties of these models. The commonly used $y = \alpha x ^\beta$ version of this model ends up being pretty easy to interpret, and has wide applicability acro... Read more 12 Dec 2023 - 10 minute read
We use predictive models as our advisors, helping us make better decisions using their output. A reasonable question, then, is “is my model accurate enough to be useful”? An already-present part of the process for most ML practitioners is Cross Validation, that beloved Swiss Army Knife of model validation. Anyone doing their due diligence when t... Read more 24 Sep 2023 - 8 minute read
Most useful forecasts include a range of likely outcomes
It’s generally good to try and guess what the future will look like, so we can plan accordingly. How much will our new inventory cost? How many users will show up tomorrow? How much raw material will I need to buy? The first instinct we have is usually to look at historical averages; we kno... Read more 28 Apr 2023 - 9 minute read