<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Rob Donnelly</title>
<link>https://robdonnelly.me/posts.html</link>
<atom:link href="https://robdonnelly.me/posts.xml" rel="self" type="application/rss+xml"/>
<description>Rob Donnelly&#39;s personal website</description>
<generator>quarto-1.6.40</generator>
<lastBuildDate>Sun, 15 Jan 2023 08:00:00 GMT</lastBuildDate>
<item>
  <title>A Simpler Alternative to X-Learner for Uplift Modeling</title>
  <link>https://robdonnelly.me/posts/2023-01-15-simple-x-learner/2023-01-15-simple-x-learner.html</link>
  <description><![CDATA[ 





<p>Meta-learners like S-Learner, T-Learner, and X-Learner are some of the most widely used approaches for Uplift modeling. When teaching about these approaches, I find that students often find the X-learner model somewhat confusing to understand. In this post, I describe a modified approach I call simplified X-learner (Xs-learner) that is easier to understand, faster to implement, and in my experience often works as well or better in practice.</p>
<section id="uplift-modeling" class="level2">
<h2 class="anchored" data-anchor-id="uplift-modeling">Uplift Modeling</h2>
<p>A/B testing is a common method used at tech companies to make informed decisions. For example, imagine you want to send out a coupon to users and you want to know how much it will increase the chances of them completing their first order with your service. By running an A/B test, you can determine on average how effective the coupon is. However, you may also want to know which users the coupon will help you generate higher profits and which users the coupon will cause you to lose money.</p>
<p>Uplift modeling is a technique that lets us go beyond learning the average effect of a treatment and instead helps us understand how the effect of the treatment varies across your users. This allows us to more efficiently decide which treatment to send to each user.</p>
</section>
<section id="meta-learners" class="level2">
<h2 class="anchored" data-anchor-id="meta-learners">Meta-learners</h2>
<p>Some of the most common approaches for solving uplift problems are known as meta-learners, because they are ways to take existing supervised learning algorithms and using their predictions in order to make estimates of the treatment effect for each user.</p>
<p>I’ll be demonstrating each of these approaches using a dataset from Lenta, a large Russian grocery store that sent out text messages to their users and saw whether it would increase their probability of making a purchase. In each of the examples I will be using the following notation:</p>
<ul>
<li>Y: Did the user make a purchase (the outcome variable)</li>
<li>T: Did the user receive a text message(the treatment variable)</li>
<li>X: All the other information we know about the user, e.g.&nbsp;age, gender, purchase history. (The Lenta dataset has almost 200 features describing each user)</li>
</ul>
<div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> seaborn <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> sns</span>
<span id="cb1-5"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.model_selection <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> train_test_split</span>
<span id="cb1-6"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.preprocessing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> OneHotEncoder</span>
<span id="cb1-7"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> xgboost <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> XGBClassifier, XGBRegressor</span>
<span id="cb1-8"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklift.datasets <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> fetch_lenta</span>
<span id="cb1-9"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklift.viz <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> plot_qini_curve</span>
<span id="cb1-10"></span>
<span id="cb1-11"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> numpy.random <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> default_rng</span>
<span id="cb1-12">rng <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> default_rng()</span></code></pre></div>
<p>We’ll use the sklift package, which has a useful function that helps download the data for the Lenta uplift experiment and do some basic processing of the data.</p>
<div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1">data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fetch_lenta()</span>
<span id="cb2-2">Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'target_name'</span>]</span>
<span id="cb2-3">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'feature_names'</span>]</span>
<span id="cb2-4"></span>
<span id="cb2-5">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'target'</span>], data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'treatment'</span>], data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'data'</span>]], axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-6">gender_map <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Ж'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'М'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>}</span>
<span id="cb2-7">group_map <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'test'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'control'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>}</span>
<span id="cb2-8">df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'gender'</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'gender'</span>].<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(gender_map)</span>
<span id="cb2-9">df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'treatment'</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'group'</span>].<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(group_map)</span>
<span id="cb2-10">T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'treatment'</span></span>
<span id="cb2-11"></span>
<span id="cb2-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Split our data into a training and an evaluation sample</span></span>
<span id="cb2-13"></span>
<span id="cb2-14">df_train, df_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_test_split(df, test_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, random_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span></code></pre></div>
<section id="s-learner" class="level3">
<h3 class="anchored" data-anchor-id="s-learner">S-Learner</h3>
<p>S-learner is the simplest and easiest to understand of these approaches. With S-learner you fit a single machine learning model using all of your data, with the treatment variable (did you get a text message) as one of the features. You can then use this model to predict “what would happen if the user got the text” and “what would happen if the user did not get the text”. The difference between these two predictions is your estimate of the treatment effect of the text message on the user.</p>
<p>In all my examples, I use XGBoost as a simple and effective baseline ML model that is fast to train and generally works well on many problems. In any real world problem you should be testing more than one type of model and should be doing cross validation to find hyperparameters that work well for your particular problem.</p>
<div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1">slearner <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBClassifier()</span>
<span id="cb3-2">slearner.fit(df_train[X<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>[T]], df_train[Y])</span>
<span id="cb3-3"></span>
<span id="cb3-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate the difference in predictions when T=1 vs T=0</span></span>
<span id="cb3-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># This is our estimate of the effect of the coupon for each user in our data</span></span>
<span id="cb3-6">slearner_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> slearner.predict_proba(df_test[X].assign(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span>{T: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>}))[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb3-7">            <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> slearner.predict_proba(df_test[X].assign(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span>{T: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>}))[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span></code></pre></div>
<p>One downside of the S-learner model is that there is nothing that tells the model to give special attention to the treatment variable. This means that often your machine learning model will focus on other variables that are stronger predictors of the outcome and end up ignoring the effect of the treatment. This means that on average your estimates of the treatment will be biased towards 0.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://robdonnelly.me/posts/2023-01-15-simple-x-learner/x-learner-s.webp" class="img-fluid figure-img"></p>
<figcaption>S-learner treatment effect distribution</figcaption>
</figure>
</div>
</section>
<section id="t-learner" class="level3">
<h3 class="anchored" data-anchor-id="t-learner">T-learner</h3>
<p>T-learner uses two separate models. The first model looks only at the users who did not receive the coupon. The second model looks only at the users who did receive the coupon. To predict the treatment effect, we take the difference between the predictions of these two models. T-learner essentially forces your models to pay attention to the treatment variable since you make sure that each of the models only focuses on either the treated or untreated observations in your data.</p>
<div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1">tlearner_0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBClassifier()</span>
<span id="cb4-2">tlearner_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBClassifier()</span>
<span id="cb4-3"></span>
<span id="cb4-4"></span>
<span id="cb4-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Split data into treated and untreated</span></span>
<span id="cb4-6">df_train_0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df_train[df_train[T] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb4-7">df_train_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df_train[df_train[T] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb4-8"></span>
<span id="cb4-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fit the models on each sample</span></span>
<span id="cb4-10">tlearner_0.fit(df_train_0[X], df_train_0[Y])</span>
<span id="cb4-11">tlearner_1.fit(df_train_1[X], df_train_1[Y])</span>
<span id="cb4-12"></span>
<span id="cb4-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate the difference in predictions</span></span>
<span id="cb4-14">tlearner_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tlearner_1.predict_proba[df_test[X]](:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb4-15">            <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> tlearner_0.predict_proba[df_test[X]](:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://robdonnelly.me/posts/2023-01-15-simple-x-learner/x-learner-t.webp" class="img-fluid figure-img"></p>
<figcaption>T-learner treatment effect distribution</figcaption>
</figure>
</div>
</section>
<section id="simplified-x-learner-xs-learner" class="level3">
<h3 class="anchored" data-anchor-id="simplified-x-learner-xs-learner">Simplified X-learner (Xs-learner)</h3>
<p>The simplified X-learner use 3 models to form its predictions. The first two are exactly the same models we used for T-learner: one model trained only using the treated observations, and the other model trained using only the untreated observations.</p>
<p>With T-learner we formed our treatment effect estimates by taking the difference between the predictions of these two models (predicted outcome when treated minus predicted outcome when untreated). The Xs-learner takes the actual outcome of the user under the treatment their received and compares that to the predicted outcome if they received the other treatment (actual outcome minus predicted outcome).</p>
<div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># We could also just reuse the models we made for the T-learner</span></span>
<span id="cb5-2">xlearner_0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBClassifier()</span>
<span id="cb5-3">xlearner_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBClassifier()</span>
<span id="cb5-4"></span>
<span id="cb5-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Split data into treated and untreated</span></span>
<span id="cb5-6">df_train_0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df_train[df_train[T] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb5-7">df_train_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df_train[df_train[T] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb5-8"></span>
<span id="cb5-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fit the models on each sample</span></span>
<span id="cb5-10">xlearner_0.fit(df_train_0[X], df_train_0[Y])</span>
<span id="cb5-11">xlearner_1.fit(df_train_1[X], df_train_1[Y])</span>
<span id="cb5-12"></span>
<span id="cb5-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate the difference between actual outcomes and predictions</span></span>
<span id="cb5-14">xlearner_te_0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> xlearner_1.predict_proba[df_train_0[X]](:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> df_train_0[Y]</span>
<span id="cb5-15">xlearner_te_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df_train_1[Y] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> xlearner_0.predict_proba[df_train_1[X]](:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
<p>We can’t use these differences directly, because we would not be able to make predictions for any new users since we wouldn’t know the actual outcomes for these new users. So we need to train one more model. This model predicts the treatment effect as a function of the X variables.</p>
<div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Even though the outcome is binary, the treatment effects are continuous</span></span>
<span id="cb6-2">xlearner_combined <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBRegressor()</span>
<span id="cb6-3"></span>
<span id="cb6-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fit the combined model</span></span>
<span id="cb6-5">xlearner_combined.fit(</span>
<span id="cb6-6">  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Stack the X variables for the treated and untreated users</span></span>
<span id="cb6-7">  pd.concat[[df_train_0, df_train_1]](X),</span>
<span id="cb6-8">  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Stack the X-learner treatment effects for treated and untreated users</span></span>
<span id="cb6-9">  pd.concat([xlearner_te_0, xlearner_te_1])</span>
<span id="cb6-10">)</span>
<span id="cb6-11"></span>
<span id="cb6-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Predict treatment effects for each user</span></span>
<span id="cb6-13">xlearner_simple_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> xlearner_combined.predict(df_test[X])</span></code></pre></div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://robdonnelly.me/posts/2023-01-15-simple-x-learner/x-learner-xs.webp" class="img-fluid figure-img"></p>
<figcaption>Xs-learner treatment effect distribution</figcaption>
</figure>
</div>
</section>
<section id="full-x-learner" class="level3">
<h3 class="anchored" data-anchor-id="full-x-learner">Full X-learner</h3>
<p>The simplified X-Learner required 3 ML models. The full X-learner as originally proposed by Künzel et al.&nbsp;requires 5 ML models.</p>
<p>Instead of fitting one combined model that predicts the treatment effects for everyone, the full X-learner uses two separate models, one for the treated users and one for the untreated users. This gives us two difference models that can predict treatment effects for new users. Künzel et al.&nbsp;recommend taking a weighted average of the two models, with the weights determined by a final propensity score model that predicts the probability of receiving the treatment.</p>
<div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define the new models that are not used in the simple version</span></span>
<span id="cb7-2">xlearner_te_model_0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBRegressor()</span>
<span id="cb7-3">xlearner_te_model_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBRegressor()</span>
<span id="cb7-4">xlearner_propensity <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> XGBClassifier()</span>
<span id="cb7-5"></span>
<span id="cb7-6">xlearner_te_model_0.fit(df_train_0[X], xlearner_te_0)</span>
<span id="cb7-7">xlearner_te_model_1.fit(df_train_1[X], xlearner_te_1)</span>
<span id="cb7-8"></span>
<span id="cb7-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate predictions from both models</span></span>
<span id="cb7-10">xlearner_te_model_0_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> xlearner_te_model_0.predict(df_test[X])</span>
<span id="cb7-11">xlearner_te_model_1_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> xlearner_te_model_1.predict(df_test[X])</span>
<span id="cb7-12"></span>
<span id="cb7-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate the propensity scores</span></span>
<span id="cb7-14">xlearner_propensity.fit(df_train[X], df_train[T])</span>
<span id="cb7-15">xlearner_propensities <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> xlearner_propensity.predict_proba[df_test[X]](:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-16"></span>
<span id="cb7-17"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate the treatment effects as propensity weighted average</span></span>
<span id="cb7-18">xlearner_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> xlearner_propensities <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>xlearner_te_model_0_te <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> xlearner_propensities)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> xlearner_te_model_1_te</span></code></pre></div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://robdonnelly.me/posts/2023-01-15-simple-x-learner/x-learner-x.webp" class="img-fluid figure-img"></p>
<figcaption>X-learner treatment effect distribution</figcaption>
</figure>
</div>
</section>
</section>
<section id="comparing-the-results" class="level1">
<h1>Comparing the&nbsp;Results</h1>
<p>We can compare the performance of each of these models using our held-out test set data. Here I am using Qini plots, which are a common approach for comparing the performance of Uplift models. Similar to an ROC curve, the higher the model’s line goes above the diagonal, the better the performance.</p>
<div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>))</span>
<span id="cb8-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> plot_qini_short(model, label, color, linestyle):</span>
<span id="cb8-3">    plot_qini_curve(df_test[Y], model, df_test[T], name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>label, ax<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>ax, perfect<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>color, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>linestyle)</span>
<span id="cb8-4">plot_qini_short(slearner_te, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Slearner'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'blue'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'solid'</span>)</span>
<span id="cb8-5">plot_qini_short(tlearner_te, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Tlearner'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'red'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'solid'</span>)</span>
<span id="cb8-6">plot_qini_short(xlearner_simple_te, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Xlearner Simple'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'purple'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'solid'</span>)</span>
<span id="cb8-7">plot_qini_short(xlearner_te, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Xlearner'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'green'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'solid'</span>)</span>
<span id="cb8-8">ax.legend(loc<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'lower right'</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://robdonnelly.me/posts/2023-01-15-simple-x-learner/uplift.webp" class="img-fluid figure-img"></p>
<figcaption>uplift curve</figcaption>
</figure>
</div>
<p>For this particular dataset, the simplified X-Learner had the best overall performance.</p>
<p>We shouldn’t draw any strong conclusions about the relative performance of difference algorithms from this single example. In my experience, which algorithm works best varies a lot depending on the specific problem you are working on. However, I do think that this example demonstrates that the simplified X-Learner (Xs-learner) is one more approach worth considering when working on Uplift problems.</p>
</section>
<section id="references" class="level1">
<h1>References</h1>
<ul>
<li>Athey, Susan, and Guido W. Imbens. Machine learning for estimating heterogeneous causal effects. №3350. 2015. <a href="https://www.gsb.stanford.edu/faculty-research/working-papers/machine-learning-estimating-heretogeneous-casual-effects" class="uri">https://www.gsb.stanford.edu/faculty-research/working-papers/machine-learning-estimating-heretogeneous-casual-effects</a></li>
<li>Künzel, Sören R., et al.&nbsp;“Metalearners for estimating heterogeneous treatment effects using machine learning.” Proceedings of the national academy of sciences 116.10 (2019): 4156–4165. <a href="http://sekhon.berkeley.edu/papers/x-learner.pdf" class="uri">http://sekhon.berkeley.edu/papers/x-learner.pdf</a></li>
<li>Gutierrez, Pierre, and Jean-Yves Gérardy. “Causal inference and uplift modelling: A review of the literature.” International conference on predictive applications and APIs. PMLR, 2017. <a href="http://proceedings.mlr.press/v67/gutierrez17a/gutierrez17a.pdf" class="uri">http://proceedings.mlr.press/v67/gutierrez17a/gutierrez17a.pdf</a></li>
</ul>


</section>

 ]]></description>
  <guid>https://robdonnelly.me/posts/2023-01-15-simple-x-learner/2023-01-15-simple-x-learner.html</guid>
  <pubDate>Sun, 15 Jan 2023 08:00:00 GMT</pubDate>
  <media:content url="https://robdonnelly.me/posts/2023-01-15-simple-x-learner/uplift.webp" medium="image" type="image/webp"/>
</item>
</channel>
</rss>
