Lee Hodg

Category Archives: Data Science

A/B Testing

A/B Tests and Experiment Size

Let’s say you’re running an A/B test. Maybe you want to test how conversions change if you alter the design or the wording of the signup page. You split users landing on your site into two groups: the control group and the experimental group. Those in the control group see the […]

NLP Pipelines with NLTK

Natural Language Processing (NLP) applications often benefit from a pipeline that takes the raw text, processes it, and extracts relevant features before passing them to a machine learning (ML) algorithm. Normalization: from the standpoint of an ML algorithm, it may not make much sense to differentiate between different cases of a word – […]

Jupyter x AWS

Setting up JupyterHub on AWS

This guide covers the fiddly bits of deploying JupyterHub to an AWS instance. It won’t go into explicit detail about every step, as the docs already do a great job of that. The purpose of this post is to discuss the things I found tricky after the install […]