Blog

[Survival models] Cox proportional hazard model

Oct 11, 2021

In the previous blogpost, we talked about the hazard function - the probability that the event happens at time $t$ given that it hasn’t happened before time $t$. In the Cox proportional hazard model, we assume the hazard function is:
[Survival models] Censored data and intro to survival models

Oct 1, 2021

I recently gave a short intro to survival models to the team as part of a knowledge share session. The goal was to motivate why we should care about censored models.
Allocation of poll observers to polling stations with networkx

Nov 13, 2020

I did a little bit of work recently to help Wake County in North Carolina allocate poll observers to polling stations. One of the reasons poll observers cancel is that the polling station ends up being too far to travel to. If we account for this before we send them an allocation, we can find an optimal allocation using one of the many matching algorithms.
Thompson Sampling and COVID testing

Aug 19, 2020

We’ve been doing some work with Delhi on COVID response and thinking a lot about positivity rate and optimal testing. Say you want to catch the maximum number of positive cases. You have no idea what the positivity rate is in each ward of the city, but you expect wards close to each other to have similar rates. You have a limited number of tests. How do you optimally allocate these to each ward to maximise the number of positive cases you catch?
Linear Gaussian State Space Models

Jun 20, 2020

State space models (SSM) are a tonne of fun. I sneaked one into a post I did a while ago. In that post, I was recreating an analysis but using a state space model where the hidden state, the true $\beta$s, followed a Gaussian random walk and what we observed was the growth in GDP. In this post, I’m going to explore a generalised version of the model - the linear-Gaussian SSM (LG-SSM).
Large-Scale Hypothesis Testing (Part 2)

Apr 12, 2020

In part 1, we looked at Empirical Bayes Large-Scale Testing where we defined the data generating process as a mixture model. In this post, instead of empirically estimating $S(z)$, we assume it’s a mixture of two Gaussians and define a mixture model in pymc3. We finish by considering the local false discovery rate, which has a much cleaner Bayesian interpretation.
Large-Scale Hypothesis Testing (Part 1)

Apr 10, 2020

We take a short detour from Bayesian methods to talk about large-scale hypothesis testing. You all are probably quite familiar with the p-hacking controversy and the dangers of multiple testing. This post isn’t about that. What if you are not confirming a single hypothesis but want to find a few interesting “statistically significant” estimates in your data to direct your research?
Jackknife, Non-parametric and Parametric Bootstrap

Jan 23, 2020

In frequentist statistics, you want to know how seriously you should take your estimates. That’s easy if you’re doing something straightforward like averaging:
Logit Choice Model - Estimating on a subset of alternatives

Nov 20, 2019

In the last two posts, we explored some features of the logit choice model. In the first, we looked at systematic taste variation and how that can be accounted for in the model. In the second, we explored one of the nice benefits of the IIA assumption - we provided a random subset of alternatives of varying size to each decision maker and were able to use that to estimate the parameters.
Logit Choice Model - Advantages of IIA

Nov 11, 2019

In the last post, we talked about how this property of Independence from Irrelevant Alternatives (IIA) may not be realistic (see red bus / blue bus example). But, say you are comfortable with it and the proportional substitution that it implies, you get to use some nice tricks.
Logit Choice Model

Nov 10, 2019

I’ve been working my way through Kenneth Train’s “Discrete Choice Methods with Simulation” and playing around with the models and examples as I go. Kind of what I did with MacKay’s book. This post and the next have some key takeaways with code from chapter 3 - Logit.
Implementing ADVI with autograd

Jul 17, 2019

We use things without knowing how they work. Last time my fridge stopped working, I turned it off and on again to see if that fixed it. When it didn’t, I promptly called the “fridge guy”. If you don’t know how things work, you don’t know when and how they break, and you definitely don’t know how to fix them.
Back to Basics with David Mackay #4: HMC and Slice sampler - now with animations!

Jun 21, 2019

I just wanted to put up a few animations of HMC and slice samplers that I have been playing around with.
Back to basics with David Mackay #3: Gibbs & Slice samplers

May 4, 2019

In this post, I just implement a Gibbs and a slice sampler for a non-totally-trivial distribution. Both of these are vanilla versions – no overrelaxation for Gibbs and no elliptical slice samplers, rectangular hyper-boxes etc. I am hoping you never use these IRL. It is a good intro though.
Back to Basics with David Mackay #2: Fancy k-means

Mar 27, 2019

Following David MacKay’s book along with his videos online has been a real joy. In lecture 11, as an example of an inference problem, he goes over many variations of the k-means algorithm. Let’s check these out.
Back to Basics with David MacKay #1

Mar 6, 2019

There are two ways of learning and building intuition. From the top down, like fast.ai believes, or from the bottom up, like Andrew Ng’s deep learning course on Coursera. I’m not sure what my preferred strategy is.
Inference and EM (Baum-Welch) for HMM learning

Feb 25, 2019

Last month, I did a post on how you could set up your HMM in pymc3. It was beautiful, it was simple. It was a little too easy. The inference button makes setting up the model a breeze. Just define the likelihoods and let pymc3 figure out the rest.
Hierarchical Hidden Markov Model

Jan 25, 2019

A colleague of mine came across an interesting problem on a project. The client wanted an alarm raised when the number of problem tickets coming in increased “substantially”, indicating some underlying failure. So there is some standard rate at which tickets are raised and when something has failed or there is a serious problem, a tonne more tickets are raised. Sounds like a perfect problem for a Hidden Markov Model.
Athey's Matrix Completion Methods

Dec 2, 2018

If you want to measure the causal effect of a treatment, what you need is a counterfactual. What would have happened to the units if they had not got the treatment? Unless your unit is Gwyneth Paltrow in Sliding Doors, you only observe one state of the world. So the key to causal inference is to reconstruct the untreated state of the world. Athey et al. in their paper show how matrix completion can be used to estimate this unobserved counterfactual world. You can treat the unobserved (untreated) states of the treated units as missing and use a penalized SVD to reconstruct these from the rest of the dataset. If you are familiar with the econometric literature on synthetic controls, fixed effects, or unconfoundedness, you should definitely read the paper; it shows these as special cases of matrix completion with the missing data of a specific form. Actually, you should read the paper anyway. Most of it is quite approachable and it’s very insightful.
Another flavour of the waiting time (or inspection) paradox

Nov 14, 2018

David MacKay’s Information Theory, Inference, and Learning Algorithms, in addition to being very well written and insightful, has exercises that read like a book of puzzles. Here’s one I came across in chapter 2:
You'll be blown away by this weird trick millennials discovered to do convergence regressions.

Oct 17, 2018

Hat tip to @mkessler_DC for the clickbaitey title.
Computer Age Statistical Inference (Chapter 9)

Sep 4, 2018

I’ve been reading Efron & Hastie’s Computer Age Statistical Inference (CASI) in my downtime. Actually, I’m doing better than reading. I don’t know why I didn’t think of this earlier - the best way to truly understand the material is to have your favourite statistical package open and actually play around with the examples as you go.
Poisson Density Estimation with Gaussian Processes

Aug 22, 2018

I have been slowly working my way through Efron & Hastie’s Computer Age Statistical Inference (CASI). An electronic copy is freely available and so far it has been great, though at times I get lost in the derivations.
Implementing Fader Hardie (2005) in pymc3

Jul 8, 2018

This post gives the Fader and Hardie (2005) model the full Bayesian treatment. You can check out the notebook here.
Empirical and Hierarchical Bayes

Jun 15, 2018

In chapter 2 of BDA3, the authors provide an example where they regularize the cancer rates in counties in the US using an empirical Bayesian model. In this post, I repeat the exercise using county level data on suicides using firearms and other means.
What happened in 2006?

May 22, 2018

Anyone else feel that US mass shootings have increased over the past few years? My wife thinks that it’s just the availability heuristic at play. Well, luckily there is data out there that we can use to test it. The analysis in this post uses the dataset from Mother Jones. I did some minor cleaning that you can see in the notebook.
Latent GP and Binomial Likelihood

May 15, 2018

I did a quick intro to Gaussian processes a little while back. Check that out if you haven’t.
Playing around with SGDR

Apr 25, 2018

This is an implementation of SGDR based on this paper by Loshchilov and Hutter. Cosine annealing, which handles the learning rate (LR) decay, is now built into PyTorch, but the restart schedule and the decay rate update that goes with it are not (though PyTorch 0.4 came out yesterday and I haven’t played with it yet). The notebook that generates the figures in this post can be found here.
Gaussian Process Regressions

Apr 3, 2018

This post is an intro to Gaussian Processes.
Mapping with geopandas and friends

Mar 13, 2018

I recently had to create a bunch of maps for work. I did a bunch in d3.js a while back for India for CEA’s office and some (in non-interactive form) were included in the Indian Economic Survey.
Monte Carlo Methods

Mar 11, 2018

I imagine most of you have some idea of Monte Carlo (MC) methods. Here we’ll try and quantify it a little bit.
The connection between Simulated Annealing and MCMC (Part 3)

Mar 10, 2018

Check out part 1 and part 2. Let’s start off by writing the code for the Metropolis algorithm and comparing it to Simulated Annealing.
Why MCMC and a quick markov chains intro

Mar 3, 2018

A lot of this material is from Larry Wasserman’s All of Statistics. I love how the title makes such a bold claim and then quickly hedges by adding the subtitle “A Concise Course in Statistical Inference” (the italics are mine).
The connection between Simulated Annealing and MCMC (Part 2)

Mar 2, 2018

If you didn’t see Part 1, check that out first.
The connection between Simulated Annealing and MCMC (Part 1)

Mar 1, 2018

I was going to dive straight into it but thought I should go over Simulated Annealing (SA) first before connecting them. SA is a heuristic optimization algorithm to find the global minimum of some complex function $f(X)$ which may have a bunch of local ones. Note that $X$ can be a vector of length $n$: $X = [x_1, x_2, \dots, x_n]$

Sid Ravinutala

Data Scientist
