pulp constraint: exactly N in a category, up to X of a second choice, same category
Question:
I have a problem where I want N players from one team and up to X players from a second team, but I don’t particularly care which teams fill those constraints. For example, if N=5 and X=2, I could have 5 from one team and up to 2 from a second, different team. How would I write such a constraint?
example dataframe:
| | team | pos | name | ceil | salary |
|---|---|---|---|---|---|
| 0 | NYY | OF | Aaron Judge | 21.6631 | 6500 |
| 1 | HOU | OF | Yordan Alvarez | 21.6404 | 6100 |
| 2 | ATL | OF | Ronald Acuna Jr. | 21.5363 | 5400 |
| 3 | HOU | OF | Kyle Tucker | 20.0992 | 4700 |
| 4 | TOR | 1B | Vladimir Guerrero Jr. | 20.0722 | 6000 |
| 5 | LAD | SS | Trea Turner | 20.0256 | 5700 |
| 6 | LAD | OF | Mookie Betts | 19.5231 | 6300 |
| 7 | SEA | OF | Julio Rodriguez | 19.3694 | 5200 |
| 8 | MIN | OF | Byron Buxton | 19.3412 | 5600 |
| 9 | LAD | 1B | Freddie Freeman | 19.3393 | 5600 |
| 10 | TOR | OF | George Springer | 19.1429 | 5100 |
| 11 | NYM | OF | Starling Marte | 19.0791 | 5200 |
| 12 | ATL | 1B | Matt Olson | 19.009 | 4800 |
| 13 | ATL | 3B | Austin Riley | 18.9091 | 5200 |
| 14 | SF | OF | Austin Slater | 18.9052 | 3700 |
| 15 | NYM | 1B | Pete Alonso | 18.8921 | 5700 |
| 16 | TEX | OF | Adolis Garcia | 18.7115 | 4200 |
| 17 | TEX | SS | Corey Seager | 18.6957 | 5100 |
| 18 | TOR | OF | Teoscar Hernandez | 18.6834 | 5200 |
| 19 | CWS | 1B | Jose Abreu | 18.497 | 4600 |
| 20 | ATL | SS | Dansby Swanson | 18.4679 | 4900 |
| 21 | TEX | 2B/SS | Marcus Semien | 18.4389 | 4100 |
| 22 | NYY | 1B | Anthony Rizzo | 18.4383 | 5300 |
| 23 | NYY | 2B | Gleyber Torres | 18.39 | 4500 |
| 24 | CHC | C | Willson Contreras | 18.3452 | 5800 |
existing code snippet:
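# slate (DataFrame), player_ids and player_vars are defined earlier (not shown)
from pulp import LpProblem, LpMaximize, LpVariable, lpSum
#problem definition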
prob = LpProblem(name="DFS", sense=LpMaximize)
prob += lpSum(player_vars[i] * slate['ceil'].iloc[i] for i in player_ids), "FPTS"
#salary and total player constraints
prob += lpSum(player_vars[i] * slate['salary'].iloc[i] for i in player_ids) <= 50000, "Salary"
prob += lpSum(player_vars[i] for i in player_ids) == 10, "Total Players"
#position constraints
prob += lpSum(player_vars[i] for i in player_ids if slate['pos'].iloc[i] == 'P') == 2, "Pitcher"
prob += lpSum([player_vars[i] for i in player_ids if slate['name'].iloc[i] in slate['name'][slate['pos'].str.contains('C')].to_list()]) == 1, "Catcher"
prob += lpSum([player_vars[i] for i in player_ids if slate['name'].iloc[i] in slate['name'][slate['pos'].str.contains('1B')].to_list()]) == 1, "1B"
prob += lpSum([player_vars[i] for i in player_ids if slate['name'].iloc[i] in slate['name'][slate['pos'].str.contains('2B')].to_list()]) == 1, "2B"
prob += lpSum([player_vars[i] for i in player_ids if slate['name'].iloc[i] in slate['name'][slate['pos'].str.contains('3B')].to_list()]) == 1, "3B"
prob += lpSum([player_vars[i] for i in player_ids if slate['name'].iloc[i] in slate['name'][slate['pos'].str.contains('SS')].to_list()]) == 1, "SS"
prob += lpSum([player_vars[i] for i in player_ids if slate['name'].iloc[i] in slate['name'][slate['pos'].str.contains('OF')].to_list()]) == 3, "OF"
#no opposing pitcher constraint
for pid in player_ids:
    if slate['pos'].iloc[pid] == 'P':
        prob += lpSum([player_vars[i] for i in player_ids if
                       slate['team'].iloc[i] == slate['opp'].iloc[pid]] + [9 * player_vars[pid]]) <= 9, "P{pid}".format(pid=pid)
#three team max constraint
unique_teams = slate['team'].unique()
player_in_team = slate['team'].str.get_dummies()
team_vars = LpVariable.dicts('team', unique_teams, cat = 'Binary')
for team in unique_teams:
    prob += lpSum([player_in_team[team][i] * player_vars[i] for i in player_ids if slate['pos'].iloc[i] != 'P']) >= team_vars[team], "Team{team}Min".format(team=team)
prob += lpSum(team_vars[team] for team in unique_teams) == 3, "3 Teams"
Answers:
Here’s how I would attack this, in pseudocode:
- Make a set of teams that you can use to index a couple of new variables.
- Make subsets of your players grouped by team, or use pandas data frame filters to limit summations of players to the team of interest.
- Make 2 new binary "indicator" variables: one called use5from[team], to indicate that at least 5 players come from that team, and one called use[team], to indicate that the team has been used at all.
- Make appropriate constraints to link those to the selection variables. Something like this, where players_on[team] is the subset of player indices for that team from the step above:
for team in teams:
    5 * use5from[team] <= pulp.lpSum(x[i] for i in players_on[team])
- And for the other, a constraint to indicate any use:
for team in teams:
    use[team] <= pulp.lpSum(x[i] for i in players_on[team])
- And then make constraints that those two variables sum to at least 1 and at least 3, respectively.
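The steps above can be sketched in PuLP roughly as follows. This is a minimal, self-contained sketch on made-up toy data, not the asker's slate. The player list and the names `players_on`, `use_n_from`, `N`, `MIN_TEAMS` and `ROSTER` are all illustrative assumptions, with `N = 3` standing in for the 5 used in the pseudocode and `MIN_TEAMS = 2` standing in for the 3.

```python
import pulp

# Toy data (hypothetical): (name, team, projected points)
players = [
    ("a1", "AAA", 9), ("a2", "AAA", 8), ("a3", "AAA", 4),
    ("b1", "BBB", 7), ("b2", "BBB", 6), ("b3", "BBB", 5),
    ("c1", "CCC", 10), ("c2", "CCC", 2),
    ("d1", "DDD", 4),
]
N = 3          # assumed stack size wanted from one team
MIN_TEAMS = 2  # assumed minimum number of distinct teams used
ROSTER = 5     # assumed roster size for the toy problem

ids = range(len(players))

# Steps 1 and 2: the set of teams, and player indices grouped by team
teams = sorted({team for _, team, _ in players})
players_on = {t: [i for i in ids if players[i][1] == t] for t in teams}

prob = pulp.LpProblem("stack_sketch", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", ids, cat="Binary")  # player selection

# Step 3: two binary indicator variables per team
use_n_from = pulp.LpVariable.dicts("useNfrom", teams, cat="Binary")
use = pulp.LpVariable.dicts("use", teams, cat="Binary")

# Objective and roster size
prob += pulp.lpSum(players[i][2] * x[i] for i in ids)
prob += pulp.lpSum(x[i] for i in ids) == ROSTER

# Step 4: link the indicators to the selection variables
for t in teams:
    team_count = pulp.lpSum(x[i] for i in players_on[t])
    # use_n_from[t] can only be 1 if at least N players come from team t
    prob += N * use_n_from[t] <= team_count
    # use[t] can only be 1 if at least one player comes from team t
    prob += use[t] <= team_count

# Step 5: at least one team supplies N players; at least MIN_TEAMS teams used
prob += pulp.lpSum(use_n_from[t] for t in teams) >= 1
prob += pulp.lpSum(use[t] for t in teams) >= MIN_TEAMS

prob.solve(pulp.PULP_CBC_CMD(msg=False))

chosen = [players[i][0] for i in ids if x[i].value() > 0.5]
counts = {t: sum(1 for i in players_on[t] if x[i].value() > 0.5) for t in teams}
print(pulp.LpStatus[prob.status], chosen, counts)
```

Note that, as in the pseudocode, these constraints only force at least `N` players from some team and at least `MIN_TEAMS` teams overall. Capping a team at exactly N, or limiting a second team to at most X, would need additional upper-bound constraints that are not part of this sketch.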