# Softball.app Lineup Optimizer Gallery

Runs Monte Carlo simulations of games for all possible lineups. For each lineup, the optimizer averages the runs scored across its simulated games, then returns the lineup with the highest average runs scored.

### Overview

*Flowchart: simulation flow.*

This flowchart refers to the parameter *g*, a configurable value indicating the number of games to simulate. The stages with thick black outlines have sections with additional details below.

### The Lineup Pool

This optimizer simulates games for all possible lineups. The number of possible lineups for each lineup type is given by these equations, where *m* is the number of male batters and *f* is the number of female batters:

#### Standard

#### Alternating Gender

#### No Consecutive Females

#### Performance

Because of the factorial nature of these equations, adding even one more player to the lineup can make the optimizer take significantly longer to run.

Example simulation (7 innings, 10000 games)

| # players in lineup | possible lineups | runtime (ms) | ~runtime (human) |
|---|---|---|---|
| 6 | 720 | 1958 | 2 seconds |
| 7 | 5,040 | 15982 | 16 seconds |
| 8 | 40,320 | 135255 | 2 minutes |
| 9 | 362,880 | 832902 | 14 minutes |
| 10 | 3,628,800 | 6613484 | 2 hours |
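
The lineup counts in the table equal the factorial of the number of players (6! = 720, 10! = 3,628,800), which is the number of batting orders when any player may bat in any slot. A minimal Python sketch of that growth follows; `standard_lineup_count` is a hypothetical helper name used only for illustration, not part of the optimizer.

```python
import math
from itertools import permutations


def standard_lineup_count(num_players: int) -> int:
    """Number of batting orders when any player may bat in any slot."""
    return math.factorial(num_players)


# Each extra player multiplies the number of lineups to simulate.
for n in range(6, 11):
    print(n, standard_lineup_count(n))

# Enumerating the orders directly gives the same count for a small roster.
assert sum(1 for _ in permutations(range(6))) == standard_lineup_count(6)
```

Going from 9 to 10 players multiplies the search space by 10, which is roughly the jump in runtime the table shows.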

### Simulating a game

This flowchart refers to the parameter *i*, a configurable value indicating the number of innings to simulate (typically 9 for baseball and 7 for softball, but it can be anything). As before, the stage with the thick black outline has a section with additional details below.

*Flowchart: simulate a game.*
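
As a rough illustration of the game loop, here is a minimal Python sketch. It rests on several assumptions that the text does not confirm: three outs end an inning, any 0-base result counts as an out, every runner advances exactly as many bases as the batter, and the batting order carries over between innings. The names `simulate_game` and `average_runs` are hypothetical, and each player is represented by a dict mapping bases awarded to a weight.

```python
import random


def simulate_game(lineup, innings=7, rng=random):
    """Return runs scored in one simulated game.

    `lineup` is a list of per-player distributions, each a dict mapping
    bases awarded (0-4) to a weight. Assumption: a 0-base result is an out.
    """
    runs = 0
    batter = 0  # position in the batting order; carries over between innings
    for _ in range(innings):
        outs = 0
        runners = []  # base (1-3) currently occupied by each runner
        while outs < 3:
            dist = lineup[batter % len(lineup)]
            bases = rng.choices(list(dist), weights=list(dist.values()))[0]
            batter += 1
            if bases == 0:
                outs += 1
                continue
            # Assumption: every runner advances exactly as far as the batter.
            advanced = [r + bases for r in runners] + [bases]
            runs += sum(1 for r in advanced if r >= 4)
            runners = [r for r in advanced if r < 4]
    return runs


def average_runs(lineup, games=10000, innings=7, rng=random):
    """Average runs over `games` simulated games (the parameter g)."""
    return sum(simulate_game(lineup, innings, rng) for _ in range(games)) / games
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing two lineups under the same conditions.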

### Simulate a Plate Appearance

Each plate appearance result (Out, SAC, E, K, BB, 1B, 2B, 3B, HRi, HRo) is mapped to the number of bases awarded for that plate appearance. The mapping is shown in this table:

| Result | Bases |
|---|---|
| Out, SAC\*, E, K | 0 |
| 1B, BB\* | 1 |
| 2B | 2 |
| 3B | 3 |
| HRi, HRo | 4 |

We can then use the frequency of each result to build a distribution that reflects how a given player is likely to perform in a plate appearance. Whenever we need to simulate a plate appearance for that player, we draw a random sample from that player's distribution.

#### An Example

Tim's historical plate appearances are as follows: Out,1B,2B,SAC,E,HRo,3B,1B,1B,Out,Out,2B,1B,Out,Out

First we translate those results to numbers of bases using the mapping from the table above: 0,1,2,0,0,4,3,1,1,0,0,2,1,0,0

Then we build the histogram and the chance of each outcome (percentages are rounded, so they sum to 101):

| # of bases | # of times | % of plate appearances |
|---|---|---|
| 0 | 7 | 47 |
| 1 | 4 | 27 |
| 2 | 2 | 13 |
| 3 | 1 | 7 |
| 4 | 1 | 7 |

Every time we simulate a plate appearance for Tim, we draw a random result from that distribution. That is to say, for every simulated plate appearance, Tim has a 47% chance of getting out, a 27% chance of getting a single, a 13% chance of getting a double, a 7% chance of getting a triple, and a 7% chance of getting a home run. Of course, other players will have their own distributions to draw from, based on their historical performance.
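
Tim's example can be reproduced with a short Python sketch. The `BASES` dict follows the mapping table above; the names `BASES`, `history`, `counts`, and `simulate_plate_appearance` are hypothetical and chosen only for this illustration.

```python
import random
from collections import Counter

# Bases awarded per result, following the mapping table above.
BASES = {"Out": 0, "SAC": 0, "E": 0, "K": 0, "1B": 1, "BB": 1,
         "2B": 2, "3B": 3, "HRi": 4, "HRo": 4}

history = "Out,1B,2B,SAC,E,HRo,3B,1B,1B,Out,Out,2B,1B,Out,Out".split(",")

# Histogram of bases awarded across Tim's plate appearances.
counts = Counter(BASES[result] for result in history)
print(sorted(counts.items()))  # [(0, 7), (1, 4), (2, 2), (3, 1), (4, 1)]


def simulate_plate_appearance(counts, rng=random):
    """Draw a number of bases with probability proportional to its count."""
    outcomes = list(counts)
    weights = [counts[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights)[0]


print(simulate_plate_appearance(counts))  # one of 0, 1, 2, 3, 4
```

Sampling by raw counts avoids the rounding in the percentage column, so the weights always reflect the player's exact history.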

### Other Notes

Things that are not accounted for in the simulation:

- Double/triple plays
- Stolen bases
- Players who were on base advancing more bases than the hitter
- Any pitching data

\* We can debate how walks and sacrifices should be counted. It probably depends on what flavor of the sport you are playing. IMHO, sacrifices should be counted as outs in slowpitch softball and kickball, but not in baseball or fastpitch. In any event, these mappings are configurable (or will be configurable soon), so you are welcome to impose your own philosophy.

Employs the same approach as the Monte Carlo Exhaustive optimizer, but instead of simulating a fixed number of games for each lineup, it simulates a variable number. Simulations continue on a lineup until a statistical t-test determines that the expected run totals of two lineups are significantly different (at a configurable alpha value). The lineup with the lower mean is then rejected, and the one with the higher mean is remembered as the best so far.
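
A minimal Python sketch of this stopping rule follows, under stated assumptions. It computes Welch's t statistic with the standard library and compares it against a critical value taken from a normal approximation to the t distribution, which is reasonable only once the samples are large. The batch size, the `max_games` cap, and the names `welch_t` and `pick_better` are all hypothetical, not the optimizer's actual parameters.

```python
import random
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples of run totals."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def pick_better(simulate_a, simulate_b, alpha=0.05, batch=100, max_games=100000):
    """Simulate both lineups in batches until their means differ significantly.

    Returns "a", "b", or None if max_games is reached without a decision.
    Assumption: the critical value uses a normal approximation to the
    t distribution instead of an exact t quantile.
    """
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    runs_a, runs_b = [], []
    while len(runs_a) < max_games:
        runs_a.extend(simulate_a() for _ in range(batch))
        runs_b.extend(simulate_b() for _ in range(batch))
        t = welch_t(runs_a, runs_b)
        if abs(t) > critical:
            return "a" if t > 0 else "b"
    return None


# Toy stand-ins for "simulate one game and return its runs".
rng = random.Random(1)
strong = lambda: rng.choice([4, 5, 6])
weak = lambda: rng.choice([1, 2, 3])
print(pick_better(strong, weak))  # a
```

Lineups that are far apart are separated after a single batch, while close lineups keep accumulating games. Note that testing repeatedly as data arrives raises the false-positive rate above the nominal alpha, so a real implementation may need a correction for that.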

A faster (time-constrained), less accurate optimizer that doesn't test the entire search space of possible lineups. Instead, it employs simulated annealing to search only a subset of possible lineups. Like the Monte Carlo Adaptive optimizer, this optimizer uses statistical t-tests to determine when a particular lineup is better or worse than another.
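
The annealing idea can be sketched in Python as below. This is a simplification and not the optimizer's implementation: it compares lineups with a deterministic `score` function instead of t-tests on simulated games, runs for a fixed number of steps instead of a time budget, and uses a linear cooling schedule. The name `anneal` and all of its parameters are hypothetical.

```python
import math
import random


def anneal(lineup, score, steps=2000, start_temp=1.0, rng=random):
    """Search batting orders by swapping two slots at a time.

    `score` maps a lineup to an estimated average runs (higher is better).
    Worse neighbors are accepted with probability exp(delta / temperature),
    and the temperature falls linearly toward zero.
    """
    current = list(lineup)
    current_score = score(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        temp = start_temp * (1 - step / steps)  # stays above zero
        i, j = rng.sample(range(len(current)), 2)
        candidate = list(current)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score
```

Accepting some worse lineups early on lets the search escape local optima, and tracking `best` separately means the returned lineup is never worse than the starting one.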

Calculates the expected runs scored mathematically, up to a specified maximum number of batters. The number of batters is limited because there are infinitely many possibilities; in theory, many teams could bat forever.
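
One way to picture such a calculation is a memoized recursion over game states, sketched below in Python. It is an illustration under assumptions, not the optimizer's actual method: a 0-base result is an out, every runner advances exactly as many bases as the batter, and anything after `max_batters` plate appearances contributes nothing. The name `expected_runs` and its parameters are hypothetical.

```python
from functools import lru_cache


def expected_runs(lineup, innings=7, max_batters=100):
    """Expected runs for a lineup, ignoring anything past `max_batters`.

    `lineup` is a list of per-player dicts {bases: probability}.
    Assumptions: a 0-base result is an out, and every runner advances
    exactly as many bases as the batter.
    """

    @lru_cache(maxsize=None)
    def value(inning, outs, runners, batters_used):
        # Expected runs from this state until the game (or the cap) ends.
        if inning == innings or batters_used == max_batters:
            return 0.0
        total = 0.0
        dist = lineup[batters_used % len(lineup)]
        for bases, prob in dist.items():
            if bases == 0:
                if outs == 2:  # third out: next inning, bases cleared
                    rest = value(inning + 1, 0, (), batters_used + 1)
                else:
                    rest = value(inning, outs + 1, runners, batters_used + 1)
                total += prob * rest
            else:
                advanced = tuple(r + bases for r in runners) + (bases,)
                scored = sum(1 for r in advanced if r >= 4)
                on_base = tuple(r for r in advanced if r < 4)
                rest = value(inning, outs, on_base, batters_used + 1)
                total += prob * (scored + rest)
        return total

    return value(0, 0, (), 0)
```

The cap is what keeps the recursion finite: a batter who never makes an out would otherwise extend the inning without end. Because the recursion goes one level deeper per batter, `max_batters` should stay well below Python's recursion limit.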
