A/B Testing, Dynamic MAB, and Causal Inference for In-Game Revenue

Background
This project focuses on a multiplayer online battle arena (MOBA) mobile game played globally by adolescents and young adults aged 15–35. The game is available on both iOS and Android. The goal is to simulate randomized controlled trials (RCTs) to assess the impact of various experimental treatments on user engagement, retention, and monetization metrics.
Problem Statement
This study investigates whether offering a timed free trial of in-game items significantly impacts the amount of money players spend in the game. It is commonly believed that limited-time previews of premium features can lead to subsequent purchases. Consequently, players who receive free trials may spend more than those who do not.
Parallel Design
In this experiment, a randomly selected group of users receives a free-trial promotion for an in-game skin item that enhances their character's power. League classification serves as a blocking factor to control for differences in spending behavior by in-game proficiency, since advanced and professional players are likely to spend more on premium items than casual users. Income level and daily playtime are included as covariates: income indicates a user's purchasing power, and playtime reflects engagement and commitment to the game, so higher-income and more active players may be more willing to buy add-on items. The average amount spent on in-game purchases is compared between groups using analysis of covariance (ANCOVA).
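As a minimal sketch of the blocked assignment described above, the snippet below shuffles players within each league and gives half of every block the free trial. The function name `blocked_assignment`, the league labels, the roster, and the 50/50 split are all assumptions made for illustration; the simulation in this project may have assigned treatment differently.

```python
import random
from collections import defaultdict

def blocked_assignment(players, seed=0):
    """Assign half of each league block to the free-trial arm at random.

    `players` is a list of (player_id, league) tuples.
    Returns a dict mapping player_id -> "free_trial" or "control".
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for player_id, league in players:
        blocks[league].append(player_id)

    assignment = {}
    for league in sorted(blocks):
        ids = blocks[league][:]
        rng.shuffle(ids)              # randomize order inside the block
        half = len(ids) // 2
        for pid in ids[:half]:
            assignment[pid] = "free_trial"
        for pid in ids[half:]:
            assignment[pid] = "control"
    return assignment

# Hypothetical roster: 3 leagues x 40 players each (not the project's data).
leagues = ["casual", "advanced", "professional"]
players = [(f"p{i}", leagues[i % 3]) for i in range(120)]
assignment = blocked_assignment(players, seed=42)

treated_per_league = {
    lg: sum(1 for pid, l in players if l == lg and assignment[pid] == "free_trial")
    for lg in leagues
}
print(treated_per_league)
```

Because the shuffle happens inside each league, every league contributes the same share of treated and control players, so league-level spending differences cannot be mistaken for a free-trial effect.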
Results & Discussions
Free Trial significantly increases spending – users who received a trial spent ~$20.5 more.
League has no impact on spending – no significant difference across leagues.
No interaction effect – Free Trial works similarly across all leagues.
Income has no impact on spending – no significant difference across income levels.
Playtime does not influence spending.
Model fit is weak – factors outside the model may be more predictive of spending.

The parallel experiment supports a statistically significant effect of the free-trial promotion: receiving a timed free trial of in-game items increases the amount a player spends, which aligns with the conventional belief that a limited-time preview of premium features leads to subsequent purchases. The free-trial promotion therefore appears to be an effective strategy for boosting game revenue. However, the experiment found no evidence of a difference in purchase amount across League or Income groups, even after accounting for the daily minutes that gamers play.
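For reference, the ANCOVA comparison in the Parallel Design section can be written as a linear model. The form below is one plausible specification rather than the exact model fitted here: it assumes Income and Playtime enter as numeric covariates (Income may instead have been coded in levels), and the coefficient symbols are introduced only for illustration. The sums run over the non-reference leagues.

```latex
\text{Purchase\_Amount}_{i}
  = \beta_0
  + \tau \,\text{Free\_Trial}_{i}
  + \sum_{l} \alpha_l \,\mathbf{1}[\text{League}_{i} = l]
  + \sum_{l} \delta_l \,\text{Free\_Trial}_{i}\,\mathbf{1}[\text{League}_{i} = l]
  + \gamma_1 \,\text{Income}_{i}
  + \gamma_2 \,\text{Playtime}_{i}
  + \varepsilon_{i}
```

Under this notation, the reported results correspond to a significant \tau (about 20.5), with no significant \alpha_l (League), \delta_l (interaction), \gamma_1 (Income), or \gamma_2 (Playtime).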
Multi-Armed Bandit (MAB)
The MAB experiment is designed to optimize monetization by testing the impact of free trials on player purchase behavior. Using Thompson sampling, we compared four variants of the trial strategy to identify which one maximizes player spending with minimal opportunity cost. The model measures success by both conversion rate and purchase amount: a conversion is recorded when a participant makes a purchase, and the purchase amount is the dollar amount they spend.
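The allocation logic of Thompson sampling can be sketched as follows. This is a simplified illustration, not the project's model: it tracks conversions only (a Beta-Bernoulli posterior per variant) and ignores purchase amounts, it has no elimination or early-stopping rule, and the conversion rates in `true_rates` are made-up values. The function name `run_bandit` is also hypothetical.

```python
import random

def run_bandit(true_rates, iterations=20, batch_size=100, seed=0):
    """Batched Thompson sampling over conversion (purchase / no purchase)."""
    rng = random.Random(seed)
    k = len(true_rates)
    wins = [0] * k      # conversions observed per variant
    losses = [0] * k    # non-conversions observed per variant
    history = []        # allocation of each batch across variants

    for it in range(iterations):
        if it == 0:
            # First batch: split evenly (assumes batch_size is divisible by k).
            alloc = [batch_size // k] * k
        else:
            alloc = [0] * k
            for _ in range(batch_size):
                # Draw one plausible conversion rate per variant from its
                # Beta posterior and give this user the best-looking variant.
                draws = [rng.betavariate(1 + wins[a], 1 + losses[a]) for a in range(k)]
                alloc[draws.index(max(draws))] += 1

        # Simulate outcomes for the batch, then update the posteriors.
        for a in range(k):
            conv = sum(rng.random() < true_rates[a] for _ in range(alloc[a]))
            wins[a] += conv
            losses[a] += alloc[a] - conv
        history.append(alloc)
    return history, wins, losses

# Hypothetical conversion rates for variants A-D (A = no free trial).
true_rates = [0.05, 0.45, 0.15, 0.20]
history, wins, losses = run_bandit(true_rates)
totals = [sum(col) for col in zip(*history)]
print("first batch:", history[0])
print("total samples per variant:", totals)
```

Variants whose posterior draws rarely come out on top receive fewer and fewer samples in later batches, which is how traffic shifts toward the strongest variant without a fixed allocation.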
Results & Discussions

In the initial iteration, the 100 samples were distributed equally among the 4 variants (25 samples per variant). Variant A (no free trial) performed the worst and was the first to be eliminated, receiving no samples in iteration 6. Variant B (skin item trial) performed the best and was the last remaining variant. Out of a maximum of 20 iterations, the model converged after only 11, for a total sample size of 1,100 (11 iterations × 100 samples).
Although some of the variants switched positions in performance, variant B remained the highest-performing variant throughout the run, with an average purchase amount of $76.58, which is ~$26.5 more than users receiving no promotion ($50.07). This is consistent with the parallel design finding that users who received a trial spent ~$20.5 more than those who did not.
The key advantage of MAB over the parallel design is that it tests multiple variants dynamically and adjusts resource allocation in real time to maximize the target outcome.
Causal Inference
We went one step beyond correlation by building a causal model to identify a cause-and-effect relationship between the free-trial promotion and in-game revenue.

In this model, ‘Income,’ ‘League,’ and ‘Playtime’ are observed pre-treatment variables. ‘Free_Trial’ is the treatment, and ‘Purchase_Amount’ is the outcome. ‘League’ is treated as an effect modifier: the effect of ‘Free_Trial’ on ‘Purchase_Amount’ may vary with the player’s league ranking, but the ranking itself does not affect the likelihood of receiving a free trial. ‘Income’ and ‘Playtime’ are covariates that account for confounding factors that might affect spending.
Results & Discussions
The backdoor estimand is identified successfully under the unconfoundedness assumption: given Income and Playtime, no unobserved confounders affect both the treatment and the outcome. By contrast, no IV or frontdoor estimand can be found, because the causal model contains no instrumental variable and no mediator between Free_Trial and Purchase_Amount.
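For readers less familiar with the terminology, the backdoor estimand can be written out explicitly. The notation below is an illustration: T stands for Free_Trial, Y for Purchase_Amount, and X for the adjustment set (Income, Playtime) named above.

```latex
\mathrm{ATE}
  = \mathbb{E}\big[\,Y \mid do(T = 1)\,\big] - \mathbb{E}\big[\,Y \mid do(T = 0)\,\big]
  = \mathbb{E}_{X}\Big[\, \mathbb{E}[\,Y \mid T = 1, X\,] - \mathbb{E}[\,Y \mid T = 0, X\,] \,\Big]
```

The second equality holds only under the unconfoundedness assumption: once X is held fixed, the remaining difference in Y between treated and untreated players is attributed to the free trial.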
ATE = 16.52 (from regression) means that, on average, receiving the free trial increases Purchase_Amount by $16.52.
ATE = 15.47 (from PSM) means that after matching treated and control groups based on propensity scores, the estimated increase is $15.47.
The two estimates are close, which indicates that the finding is consistent across estimation methods.
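To make the matching step concrete, here is a minimal sketch of one-to-one nearest-neighbor propensity score matching (with replacement) on synthetic data. Everything in it is an assumption for illustration: the data are simulated with a built-in effect of 16, the propensity score is known by construction rather than estimated from Income and Playtime, and the estimator targets the effect on the treated, which equals the ATE here only because the simulated effect is constant. It is not the procedure or data behind the estimates above.

```python
import bisect
import random

def simulate(n=2000, true_effect=16.0, seed=0):
    """Synthetic players: one confounder drives both treatment and spending."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()              # confounder scaled to [0, 1]
        score = 0.3 + 0.4 * x         # propensity score, known by construction
        treated = rng.random() < score
        y = 50.0 + true_effect * treated + 20.0 * x + rng.gauss(0.0, 5.0)
        rows.append((score, treated, y))
    return rows

def matched_effect(rows):
    """Match each treated unit to the control with the closest score."""
    controls = sorted((s, y) for s, t, y in rows if not t)
    control_scores = [s for s, _ in controls]
    diffs = []
    for s, t, y in rows:
        if not t:
            continue
        i = bisect.bisect_left(control_scores, s)
        # The nearest control is either just below or just above the score.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(controls)]
        j = min(candidates, key=lambda c: abs(control_scores[c] - s))
        diffs.append(y - controls[j][1])
    return sum(diffs) / len(diffs)

def naive_difference(rows):
    """Unadjusted difference in mean spending, for comparison."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate()
att = matched_effect(rows)
naive = naive_difference(rows)
print(f"matched estimate: {att:.2f}, naive difference: {naive:.2f}")
```

In this simulation the naive difference overstates the effect because high-score players both receive the trial more often and spend more; matching on the score removes most of that gap.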

The heatmap shows how the estimated effect changes when introducing an unobserved confounder (U) that influences both Free Trial (Treatment) and Purchase Amount (Outcome). The color indicates the estimated effect under different assumptions of U. Although rare, there are a few scenarios with extremely positive/negative outcomes.
Original estimated effect: 16.52
New effect range (after adding an unobserved confounder): [-8.48, 14.76]
The causal effect changes significantly when including U, ranging from a negative effect (-8.48) to a positive effect (14.76). Since the range includes negative values, it's possible that a Free Trial reduces the Purchase Amount under certain confounding scenarios. Also, the original effect (16.52) lies outside this range, suggesting it may not be robust to unobserved confounding.
The causal inference model controls for confounders directly by using propensity score matching. However, any confounder that is not explicitly included in the causal model is not accounted for. By contrast, randomization in the parallel design limits the influence of unknown confounders. If the confounders are known, the causal model will generally lead to more robust results than the parallel design.
Conclusion
Through parallel design, multi-armed bandit, and causal model approaches, this study explored the impact of offering timed free trials of in-game items on player spending. The parallel design allowed a controlled comparison between groups that received free trials and those that did not, revealing that free trials significantly increased spending by approximately $20.5. The multi-armed bandit experiment, using Thompson sampling, efficiently optimized the monetization strategy by identifying the most effective trial variant, and it also showed that free trials led to higher average purchase amounts. The causal model further established causation between free trials and in-game revenue, accounting for potential confounders such as income and playtime, and produced consistent estimates of the average treatment effect (ATE) through both regression and propensity score matching. Future research could explore additional variables and more complex models to better understand and optimize in-game monetization strategies.