Genetic Algorithms (GAs) can assist finding optimal or near-optimal combinations. They can significantly reduce the development time and execution time to find a good solution.

Continuing from Part 2 which shows a concrete example of how to find the minimum of a quadratic using GAs, this section shows one way to find great fantasy-football lineups using data from FanDuel. The main objective here is to find a great solution, quickly.

FanDuel Lineup Problem

Fantasy football is a game where individuals pick players in the NFL for their team. Similar to Rotisserie Baseball, the picked players earn points based on their performance in real life. Many websites let people bet against one another to see who can create the best lineup, or a set of players.

FanDuel is one such betting site. A typical lineup, consists of 1 quarterback (QB), 2 running backs (RB), 3 wide receivers (WR), 1 tight end, (TE), 1 defense (D), and 1 kicker (k). Each player also has a “salary” associated with them which contributes to the salary limit a user has for any given lineup, further increasing the difficulty of the problem. It’s not just about picking the best players, it’s about picking the right best players and the right “sleepers” who cost less but will still perform.

Using an example spreadsheet FanDuel provided for the games on 2016-11-17, there were 28 teams to choose from. That means, with a lineup of 1QB, 2RB, 3WR, 1TE, 1D, and 1K, and the possibility of choosing between 89QBs, 168RBs, 206WRs, 141 TEs, 28Ds, and 28Ks, there are at most 89 * 168 * 167 * 206 * 205 * 204 * 141 * 28 * 28 = 2,377,947,070,807,096,320 combinations of possible lineups. Even if a computer could filter all of the valid lineups and calculate the total score at 1,000,000 lineups per second, it would still take over 27,522,535 days. Clearly, that won’t work.

Genetic algorithms are kick-ass, though! Let’s find some good solutions in a few seconds.

Read FanDuel Data Into Memory

FanDuel allows users to download player information as a CSV file, so the first step is loading it into memory and making logical Player and Lineup data structures:

import csv class FanDuelPlayer: def __init__(self, name: str, salary: int, projected_points: float, team: str, position: str, injured: bool): self.name = name self.salary = salary self.projected_points = projected_points self.team = team self.position = position self.injured = injured def __repr__(self): return '{} ({})'.format(self.name, self.position) class Lineup: def __init__(self, players: [FanDuelPlayer]): self.players = players def count(self, player: FanDuelPlayer): return self.players.count(player) def index(self, player: FanDuelPlayer): return self.players.index(player) def __len__(self): return len(self.players) def __iter__(self): return iter(self.players) def __getitem__(self, item): return self.players[item] def __setitem__(self, key, value): self.players[key] = value def _read_fanduel_data(filename: str) -> [FanDuelPlayer]: with open(filename, 'r') as f: reader = csv.DictReader(f) return [ FanDuelPlayer( name=" ".join([row['First Name'], row['Last Name']]), salary=int(row['Salary']), projected_points=float(row['FPPG']), team=row['Team'], position=row['Position'], injured=row['Injury Indicator'] in ['', 'P'] # Players with no injury or who are probable are "healthy" ) for row in reader ]

Defining Search Problem

Now on to the FanduelFootbalGeneticAlgorithm class. For this, I’m going to make the class and separate out players by their position just so it’s easier to look them up later. This can be optimized, but for readability, I’m going to structure the separation like so:

class FanDuelFootballGA(GeneticAlgorithmSearch): def __init__(self, filename: str, salary_cap=60000, **kwargs): GeneticAlgorithmSearch.__init__(self, kwargs) self._population_size = 20 self._salary_cap = salary_cap self._all_players = _read_fanduel_data(filename) self._qb_list = [x for x in self._all_players if x.position == 'QB'] self._rb_list = [x for x in self._all_players if x.position == 'RB'] self._wr_list = [x for x in self._all_players if x.position == 'WR'] self._te_list = [x for x in self._all_players if x.position == 'TE'] self._d_list = [x for x in self._all_players if x.position == 'D'] self._k_list = [x for x in self._all_players if x.position == 'K'] self.position_to_player_list_map = { 'QB': self._qb_list, 'RB': self._rb_list, 'WR': self._wr_list, 'TE': self._te_list, 'D': self._d_list, 'K': self._k_list, }

Initial Population

Each individual for the GA will be a full lineup:

def _generate_initial_population(self) -> [Lineup]: population = [] for _ in range(self._population_size): lineup = [random.choice(self._qb_list)] lineup.extend(random.sample(self._rb_list, 2)) lineup.extend(random.sample(self._wr_list, 3)) lineup.append(random.choice(self._te_list)) lineup.append(random.choice(self._d_list)) lineup.append(random.choice(self._k_list)) population.append(Lineup(lineup)) return population

Crossover

There are many was to conduct a crossover. For this example, and for simplicity, the lineups are stored as lists instead of a set of players. It makes my implementation of the crossover easier:

def _handle_crossover_between(self, chromosome1: Lineup, chromosome2: Lineup) -> Lineup: crossover_index = random.randint(0, len(chromosome1)) new_lineup = Lineup(chromosome1[:crossover_index] + chromosome2[crossover_index:]) # checking running backs while len({player for player in new_lineup if player.position == 'RB'}) < 2: self.__replace_duplicate_player_in_lineup(new_lineup, 'RB') # checking wide receivers while len({player for player in new_lineup if player.position == 'WR'}) < 3: self.__replace_duplicate_player_in_lineup(new_lineup, 'WR') return new_lineup

Lines 2 and 3 are the meat of the crossover. Essentially I’m picking a random index to split the list of each lineup coming in and squishing them together to make a new lineup. This does mean that my crossover rate is 100%.

This resulting lineup can contain duplicate RBs or WRs. I added a few extra lines to ensure duplicate players are replaced with another random player of the same position that’s not already in the lineup.

Mutation

Here, I randomly swap each gene, or player, in the lineup with another player of the same position based on the mutation rate.

def _handle_mutation_in(self, lineup: Lineup) -> Lineup: for index, player in enumerate(lineup): if random.randint(0, 100) < self.mutation_rate: # picking new random player of same position not already in lineup lineup[index] = self.__randomly_choose_player_not_in(lineup, player.position) return lineup

Filtering Valid Lineups

Remember how in the search code I had placed an option to keep an offspring or not? It wasn’t needed in the Quadratic GA example, but this problem has extra constraints. We must respect the limits of a lineup as FanDuel dictates: that the lineup doesn’t exceed the salary cap, that no lineup can have more than 3 players from the same team, and each lineup must consist of 1QB, 2RB, 3WR, 1TE, 1D, and 1K. If the lineup isn’t valid, the offspring will be discarded and the reproduction cycle will start again and create another individual to fill its place.

def _should_exclude(self, lineup: Lineup) -> bool: def count_position(position: str) -> bool: return sum(player.position == position for player in lineup) return not ( sum(player.salary for player in lineup) < self._salary_cap and count_position('QB') == 1 and count_position('RB') == 2 and count_position('WR') == 3 and count_position('TE') == 1 and count_position('D') == 1 and count_position('K') == 1 and len(set(lineup)) == 9 and max(sum(1 for player in lineup if player.team == team) for team in {player.team for player in lineup}) <= 3 )

Evaluation

Evaluation is really the sum of the projected fantasy points – easy:

def _evaluate_chromosome(self, lineup: Lineup) -> float: return sum(player.projected_points for player in lineup)

Putting It All Together

Dang, now that was pretty easy, right? Now, let’s see what kind of lineups we get as it runs!

Starting generation 0 Best score so far = 66.50055554178026 ... Starting generation 99 Best score so far = 138.4497244093153 Best Lineup $59800 Tyrod Taylor (QB) David Johnson (RB) LeSean McCoy (RB) A.J. Green (WR) Mike Evans (WR) Stefon Diggs (WR) Kyle Rudolph (TE) Philadelphia Eagles (D) Caleb Sturgis (K)

Now anyone who knows standard scoring in fantasy football will tell you that 138 points is pretty good! This lineup came in $200 under budget, and it’s valid. The best part is this calculation took about a second to run on my little laptop using a single core. Eat your heart out, Brute Force!

Like I said before though, GAs are not guaranteed to give you the absolute best solution, only a chance at a pretty good one. When I run it again, I get this lineup instead:

Starting generation 99 Best score so far = 141.2952783675421 Best Lineup $59500 Colin Kaepernick (QB) David Johnson (RB) Dion Lewis (RB) A.J. Green (WR) Mike Evans (WR) Larry Fitzgerald (WR) Greg Olsen (TE) Minnesota Vikings (D) Caleb Sturgis (K)

An even higher score for less money! But, it’s asking me to trust Kaepernick, and I…. I just can’t. It’s okay to disagree with your computer. Don’t let anyone tell you differently!

Playing around and tuning some of the metrics, I find that a population of 20 lineups for 100 generations using 10% mutation rate is sufficient to finding a pretty good lineup – certainly something better than I would makeup myself and in a much smaller amount of time. Increasing the population by a factor of two didn’t seem to have an effect on the outcome, nor did increasing the number of generations by a factor of 10. Increasing the mutation rate only seemed to stall convergence, probably because the number of genes is so small.

Now before you go out and try to win some money on FanDuel yourself, remember that fantasy-sports predictions are notoriously inaccurate when compared to the actual outcomes of a player’s performance. Garbage coming in; garbage going out. Take any prediction of any lineup with a grain of salt….. or, personally, I would recommend a gram.

See the code on GitHub.