LoL Draft Analyzer

Predicting match outcomes using only champion selections.

League of Legends Map

The Challenge: Decoding the Draft

In competitive League of Legends, victory is often decided before the game even begins. The "pick/ban" phase, where teams select their champions, is a complex strategic dance. With over 160 champions, each with unique abilities and synergies, the number of possible team compositions is astronomical. Can a machine learning model cut through this complexity and predict a winner based solely on the 10 champions chosen? This project was my attempt to answer that question.

Project Overview

The LoL Draft Analyzer is a deep neural network designed to predict match outcomes. By training on a massive dataset of past games, the model learns the intricate relationships and power dynamics between champion compositions. It doesn't know about player skill, in-game events, or item builds; its predictions are a pure reflection of the strategic advantage gained during the draft. The model consistently achieves around 55% accuracy, a meaningful edge over the 50% coin-flip baseline considering how little information it is given.

Riot Games API and data processing diagram
Data pipeline: Sourcing match IDs from players, fetching match data via the Riot API, and preprocessing for the model.

Technical Deep Dive

Data Acquisition & Preprocessing

The foundation of any good model is its data. I developed a set of Python scripts to systematically build a dataset. The process begins by fetching a list of high-ranking players, then spidering through their match histories via the official Riot Games API to collect thousands of unique match IDs. Each match is then queried for its outcome and the 10 champions involved. A key preprocessing step was converting champion names into integer IDs for the model's embedding layer. I also balanced the dataset by duplicating every match with the two teams swapped and the win/loss label flipped, effectively doubling the data and removing any inherent "blue side" or "red side" advantage from the training set.
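The encoding and mirroring steps can be sketched as follows. This is a minimal illustration of the idea, not the project's actual scripts; the champion names, match dictionary format, and function names are all hypothetical.

```python
def build_champion_index(champion_names):
    """Map each champion name to a stable integer ID for the embedding layer."""
    return {name: i for i, name in enumerate(sorted(set(champion_names)))}

def encode_match(match, champ_to_id):
    """Turn one match into (10 champion IDs, label), label=1 meaning team A won."""
    ids = [champ_to_id[c] for c in match["team_a"] + match["team_b"]]
    return ids, int(match["team_a_won"])

def mirror_matches(encoded):
    """Duplicate every match with the teams swapped and the label flipped,
    removing any side advantage from the training set."""
    mirrored = []
    for ids, label in encoded:
        mirrored.append((ids, label))
        mirrored.append((ids[5:] + ids[:5], 1 - label))  # swap teams, flip outcome
    return mirrored
```

After mirroring, both orderings of every matchup appear exactly once each, so the model cannot learn a spurious "first team listed wins more often" shortcut.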

The Neural Network Architecture

I chose a sequential, fully-connected deep neural network (DNN) built with TensorFlow. The architecture's key feature is its input layer:

  • An Embedding Layer that takes the 10 champion IDs as input. Instead of one-hot encoding, this layer learns a dense vector representation for each of the 162 champions. This allows the model to understand abstract relationships, like which champions are similar or work well together, in a multi-dimensional space.
  • The flattened output is then passed through several Dense Layers with ReLU activation, `BatchNormalization` to stabilize training, and `Dropout` to prevent overfitting.
  • The final layer is a single neuron with a Sigmoid activation function, outputting a probability between 0 and 1 representing the likelihood of victory for the first team.
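A minimal Keras sketch of this architecture is below. The 162-champion vocabulary comes from the post; the embedding dimension, layer widths, and dropout rate are illustrative placeholders, not the project's actual hyperparameters.

```python
import tensorflow as tf

NUM_CHAMPIONS = 162  # embedding vocabulary size
EMBED_DIM = 16       # illustrative; the real dimension isn't stated in the post

def build_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,), dtype="int32"),           # 10 champion IDs
        tf.keras.layers.Embedding(NUM_CHAMPIONS, EMBED_DIM),  # learned champion vectors
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.BatchNormalization(),                 # stabilize training
        tf.keras.layers.Dropout(0.3),                         # curb overfitting
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(1, activation="sigmoid"),       # P(first team wins)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Because the embedding is learned jointly with the classifier, champions that fill similar roles or combine well tend to end up near each other in the embedding space.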

An Interesting Hurdle: The Bias of Champion Win-Rates

An early challenge was the inherent bias from individual champion win-rates. Some champions are simply stronger on average, and the model could be tempted to just bet on the team with the "better" champions. I experimented with a data balancing function (`balance_winrates`) that programmatically removed winning games featuring outlier champions, normalizing their win-rates closer to 50%. While the function worked as intended, it didn't significantly improve the final model's predictive power and shrank the dataset. This was a valuable lesson: sometimes, letting the model learn the inherent strengths of champions is part of capturing the meta-game, and over-normalizing the data can strip out real signal.
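The idea behind that balancing step can be sketched as below. This is my own reconstruction of the concept, not the project's actual `balance_winrates` code; the threshold, drop probability, and helper names are assumptions.

```python
import random
from collections import defaultdict

def champion_winrates(matches):
    """Per-champion win-rate over (ids, label) pairs, where ids[:5] is the
    first team and label=1 means that team won."""
    stats = defaultdict(lambda: [0, 0])  # champ -> [wins, games]
    for ids, label in matches:
        for i, champ in enumerate(ids):
            won = label if i < 5 else 1 - label
            stats[champ][0] += won
            stats[champ][1] += 1
    return {c: w / n for c, (w, n) in stats.items()}

def balance_winrates(matches, threshold=0.52, rng=None):
    """Randomly discard some winning games that feature over-performing
    champions, pulling their observed win-rates back toward 50%."""
    rng = rng or random.Random(0)
    rates = champion_winrates(matches)
    kept = []
    for ids, label in matches:
        winners = ids[:5] if label == 1 else ids[5:]
        # Drop roughly half the wins involving a champion above the threshold.
        if any(rates[c] > threshold for c in winners) and rng.random() < 0.5:
            continue
        kept.append((ids, label))
    return kept
```

The trade-off described above is visible here: every discarded match is lost training data, and the discarded matches are precisely the ones that encode which champions are strong in the current meta.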