Create a lambda function to set values in a column without triggering the "value is trying to be set on a copy of a slice from a DataFrame" warning

Question:

Object archive:

match_date,start_time,competition,team_home,team_away,match,tip,reliability,odds,home_goals,away_goals,score,result
2023-01-13,16:45,Italian Serie A,Napoli,Juventus,Napoli v Juventus,Under 2.5 Goals,3,1.8,,
2023-01-13,17:00,English Premier League,Aston Villa,Leeds,Aston Villa v Leeds,Over 2.5 Goals,3,1.73,,
2023-01-13,17:00,Spanish La Liga,Celta Vigo,Villarreal,Celta Vigo v Villarreal,Under 2.5 Goals,1,1.6,,
2023-01-13,17:15,Portuguese Primeira Liga,Portimonense,Santa Clara,Portimonense v Santa Clara,Santa Clara To Win,4,3.6,,
2023-01-14,09:30,English Premier League,Man Utd,Man City,Man Utd v Man City,Man City To Win,3,1.83,,
2023-01-14,09:30,English Football League - Championship,Rotherham,Blackburn,Rotherham v Blackburn,Under 2.5 Goals,3,1.73,,

Object df_new:

match_date,start_time,competition,team_home,team_away,match,home_goals,away_goals,score
2023-01-13,FT,English Premier League,Aston Villa,Leeds,Aston Villa v Leeds,2,1,2 - 1
2023-01-13,FT,Spanish La Liga,Celta Vigo,Villarreal,Celta Vigo v Villarreal,1,1,1 - 1
2023-01-13,FT,Italian Serie A,Napoli,Juventus,Napoli v Juventus,5,1,5 - 1
2023-01-13,FT,Portuguese Primeira Liga,Portimonense,Santa Clara,Portimonense v Santa Clara,0,0,0 - 0

import pandas as pd

def market_result(home,away,mkt,hg,ag):
    if (mkt == f'{home} To Win') and (hg > ag):
        return 'GREEN'
    if (mkt == f'{home} To Win') and (hg <= ag):
        return 'RED'
    if (mkt == f'{away} To Win') and (ag > hg):
        return 'GREEN'
    if (mkt == f'{away} To Win') and (ag <= hg):
        return 'RED'
    if (mkt == 'Both Teams To Score') and (hg > 0) and (ag > 0):
        return 'GREEN'
    if (mkt == 'Both Teams To Score') and ((hg == 0) or (ag == 0)):
        return 'RED'
    if (mkt == 'Both Teams To Score - No') and ((hg == 0) or (ag == 0)):
        return 'GREEN'
    if (mkt == 'Both Teams To Score - No') and (hg > 0) and (ag > 0):
        return 'RED'
    if (mkt == 'Under 2.5 Goals') and (hg+ag < 2.5):
        return 'GREEN'
    if (mkt == 'Under 2.5 Goals') and (hg+ag >= 2.5):
        return 'RED'
    if (mkt == 'Over 2.5 Goals') and (hg+ag > 2.5):
        return 'GREEN'
    if (mkt == 'Over 2.5 Goals') and (hg+ag <= 2.5):
        return 'RED'
    if (mkt == 'Under 3.5 Goals') and (hg+ag < 3.5):
        return 'GREEN'
    if (mkt == 'Under 3.5 Goals') and (hg+ag >= 3.5):
        return 'RED'
    if (mkt == 'Over 3.5 Goals') and (hg+ag > 3.5):
        return 'GREEN'
    if (mkt == 'Over 3.5 Goals') and (hg+ag <= 3.5):
        return 'RED'

def get_result(df):
    df = df[(df['score'].notnull()) & (df['result'].isnull())]
    df['result'] = df.apply(lambda x: market_result(x['team_home'], x['team_away'], x['tip'], int(x['home_goals']), int(x['away_goals'])), axis=1)
    return df

def append_matches(archive,df_new):
    df_csv = pd.read_csv(archive)
    df_csv.loc[df_csv["score"] == "", "score"] = float("NaN")
    df_csv.loc[df_csv["home_goals"] == "", "home_goals"] = float("NaN")
    df_csv.loc[df_csv["away_goals"] == "", "away_goals"] = float("NaN")
    dt = pd.read_csv(df_new)
    df_merge = df_csv.combine_first(df_csv[['match_date','competition','team_home','team_away','match']].merge(dt, "left"))[df_csv.columns.values]
    df_merge = df_merge[df_csv.columns]
    df_results = get_result(df_merge)
    df_merge.update(df_results)
    df_merge.to_csv(archive, index=False)

def main():
    append_matches('archive.csv','df_new.csv')

if __name__ == '__main__':
    main()

Error received:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
df['result'] = df.apply(lambda x: market_result(x['team_home'], x['team_away'], x['tip'], int(x['home_goals']), int(x['away_goals'])), axis=1)

How should I proceed to solve this problem?

Asked By: Digital Farmer


Answers:

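The warning appears because the boolean filter in get_result returns a slice of the DataFrame that was passed in, so pandas cannot tell whether the assignment to df['result'] is meant to reach the original. I expect your code to work without the warning when you add .copy(), which makes the filtered result an independent DataFrame: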

def get_result(df):
    df = df[(df['score'].notnull()) & (df['result'].isnull())].copy() # <-- added .copy() to the end of this line
    df['result'] = df.apply(lambda x: market_result(x['team_home'], x['team_away'], x['tip'], int(x['home_goals']), int(x['away_goals'])), axis=1)
    return df
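
As an alternative, the warning's own hint about .loc can be followed by assigning through a boolean mask on the original DataFrame instead of filtering first. The following is only a minimal sketch under some assumptions: tip_result is a hypothetical, cut-down stand-in for market_result that handles just the two 2.5-goal markets, fill_results is an illustrative name, and the three sample rows are adapted from the question's data. The cast of 'result' to object is also an assumption, added so that string results can be written into a column that read_csv may have loaded as all-NaN floats.

```python
import pandas as pd

def tip_result(tip, hg, ag):
    # Hypothetical stand-in for market_result: only the 2.5-goal markets.
    total = hg + ag
    if tip == 'Under 2.5 Goals':
        return 'GREEN' if total < 2.5 else 'RED'
    if tip == 'Over 2.5 Goals':
        return 'GREEN' if total > 2.5 else 'RED'
    return None

def fill_results(df):
    # Rows that have a score but no result yet.
    mask = df['score'].notnull() & df['result'].isnull()
    # Assumption: make the column able to hold strings before writing to it.
    df['result'] = df['result'].astype(object)
    if mask.any():
        # Assign through .loc on the original frame, so no filtered copy is modified.
        df.loc[mask, 'result'] = df.loc[mask].apply(
            lambda x: tip_result(x['tip'], int(x['home_goals']), int(x['away_goals'])),
            axis=1,
        )
    return df

# Sample rows adapted from the question; the last match has no score yet.
matches = pd.DataFrame({
    'match': ['Napoli v Juventus', 'Aston Villa v Leeds', 'Man Utd v Man City'],
    'tip': ['Under 2.5 Goals', 'Over 2.5 Goals', 'Man City To Win'],
    'home_goals': [5, 2, None],
    'away_goals': [1, 1, None],
    'score': ['5 - 1', '2 - 1', None],
    'result': [None, None, None],
})

filled = fill_results(matches)
print(filled[['match', 'tip', 'score', 'result']])
```

Because the rows are updated in place, an approach like this would probably make the separate df_merge.update(df_results) step in append_matches unnecessary, though that has not been tested against the full script.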
Answered By: René