Create dictionary through text file

Question

For a class lab, we have to read a text file that lists World Cup winners line by line, with the year, country name, coach, and captain. We are expected to build a dictionary that uses the country names as keys.

I was originally trying to make a dictionary inside a dictionary: the country name would be the key, and under that key there would be key/value pairs for wins, years, captain, and coach.

def main():
    inFile = open('world_cup_champions.txt', 'r')
    champions = createDict(inFile)
    printChamps(champions)
    

def createDict(file):
    dict = {}
    try:
        for line in file:
            line = (line.strip()).split(",")
            dict[line[1]] = {}
            dict['Wins'] = 0
            for word in line:
                if (word == dict[line[1]]):
                    dict['Wins'] += 1
                    dict['Year(s)'] = line[0]
                    dict['Captain'] = line[2]
                    dict['Coach'] = line[3]
        return (dict)
    except Exception as e:
        print(e)    

def printChamps(unsorted_dict):
    print(f"{'Country':10} {'Wins':5} {'Year':32} {'Captain':72} {'Coach':30}\n")
    sorted_dict = dict(sorted(unsorted_dict.items()))  
    for key in sorted_dict:
        print ("%-10s | %2i | %-30s | %-70s | %-50s" % (key, sorted_dict[key]['Wins'], sorted_dict[key]['Year(s)'], sorted_dict[key]['Captain'], sorted_dict[key]['Coach']))


if __name__ == '__main__':
    main()

This is what I have so far, but I got confused about building the dictionary within createDict. Here is what the text file looks like:

Year,Country,Coach,Captain
1930,Uruguay,Alberto Suppici,Jose Nasazzi
1934,Italy,Vittorio Pozzo,Gianpiero Combi
1938,Italy,Vittorio Pozzo,Giuseppe Meazza
1950,Uruguay,Juan Lopez,Obdulio Varela
1954,Germany,Sepp Herberger,Fritz Walter
1958,Brazil,Vicente Feola,Hilderaldo Bellini
1962,Brazil,Aymore Moreira,Mauro Ramos
1966,England,Alf Ramsey,Bobby Moore
1970,Brazil,Mario Zagallo,Carlos Alberto
1974,Germany,Helmut Schon,Franz Beckenbauer
1978,Argentina,Cesar Luis Menotti,Daniel Passarella
1982,Italy,Enzo Bearzot,Dino Zoff
1986,Argentina,Carlos Bilardo,Diego Maradona
1990,Germany,Franz Beckenbauer,Lothar Matthäus
1994,Brazil,Carlos Alberto Parreira,Dunga
1998,France,Aime Jacquet,Didier Deschamps
2002,Brazil,Luiz Felipe Scolari,Cafu
2006,Italy,Marcello Lippi,Fabio Cannavaro
2010,Spain,Vicente del Bosque,Iker Casillas
2014,Germany,Joachim Low,Philipp Lahm

I want to print in a format that lists the country’s name, how many wins they have, with the years that they have won.

This is the expected output:

Asked By: Yeonari

Answer 1

Use csv.DictReader. Given the right delimiter, it returns an iterator of dictionaries, and if the first line holds the column names, they become the keys.

# ########################################
# this is used just to mimic a file object
from io import StringIO

data = """Year,Country,Coach,Captain
1930,Uruguay,Alberto Suppici,Jose Nasazzi
1934,Italy,Vittorio Pozzo,Gianpiero Combi
1938,Italy,Vittorio Pozzo,Giuseppe Meazza
1950,Uruguay,Juan Lopez,Obdulio Varela
1954,Germany,Sepp Herberger,Fritz Walter
1958,Brazil,Vicente Feola,Hilderaldo Bellini
1962,Brazil,Aymore Moreira,Mauro Ramos
1966,England,Alf Ramsey,Bobby Moore
1970,Brazil,Mario Zagallo,Carlos Alberto
1974,Germany,Helmut Schon,Franz Beckenbauer
1978,Argentina,Cesar Luis Menotti,Daniel Passarella
1982,Italy,Enzo Bearzot,Dino Zoff
1986,Argentina,Carlos Bilardo,Diego Maradona
1990,Germany,Franz Beckenbauer,Lothar Matthäus
1994,Brazil,Carlos Alberto Parreira,Dunga
1998,France,Aime Jacquet,Didier Deschamps
2002,Brazil,Luiz Felipe Scolari,Cafu
2006,Italy,Marcello Lippi,Fabio Cannavaro
2010,Spain,Vicente del Bosque,Iker Casillas
2014,Germany,Joachim Low,Philipp Lahm"""

f = StringIO(data)
# ################

import csv
from collections import defaultdict


# main part
dd = defaultdict(dict)
r = csv.DictReader(f, delimiter=',')
for line in r:
    if line['Country'] in dd:
        dd[line['Country']]['Wins'] += 1
    else:
        dd[line['Country']]['Wins'] = 1

    dd[line['Country']].setdefault("Years", []).append(line['Year'])


# alphabetic order 
dd_ordered = {c: dd[c] for c in sorted(dd)}

# check output
print(*dd_ordered.items(), sep='\n')
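
The StringIO object above only stands in for a file. As a minimal sketch of the same grouping applied to a real file, the loop could be wrapped in a function like the one below. The helper name load_champions, the UTF-8 encoding, and the errors="replace" setting are assumptions for illustration, not part of the original answer.

```python
import csv
from collections import defaultdict


def load_champions(path):
    """Group World Cup wins by country (hypothetical helper name)."""
    # Every new country starts with zero wins and an empty list of years.
    champions = defaultdict(lambda: {"Wins": 0, "Years": []})
    # newline="" is what the csv module documentation recommends for files.
    # The encoding is an assumption; errors="replace" keeps a badly encoded
    # name from aborting the whole read.
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        for row in csv.DictReader(f):
            entry = champions[row["Country"]]
            entry["Wins"] += 1
            entry["Years"].append(row["Year"])
    # Return a plain dict ordered alphabetically by country.
    return {country: champions[country] for country in sorted(champions)}
```

Calling load_champions('world_cup_champions.txt') with the file from the question should then give one entry per country, each holding a 'Wins' count and a 'Years' list of strings.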

Answered By: cards

Answer 2

If you don’t want to format the table yourself, it’s best to use one of the many Python libraries available. The pandas library is probably the best choice, since it lets you easily sort the data and group it by country.

If you’re not looking to use any libraries, the solution below should help you. It creates a dictionary to save the country details which is then used to format the table at the end using a loop. You might want to update the formatting of the table.

from collections import defaultdict
def main():
    inFile = [
        "Year,Country,Coach,Captain",
        "1930,Uruguay,Alberto Suppici,Jose Nasazzi",
        "1934,Italy,Vittorio Pozzo,Gianpiero Combi",
        "1938,Italy,Vittorio Pozzo,Giuseppe Meazza",
        "1950,Uruguay,Juan Lopez,Obdulio Varela",
        "1954,Germany,Sepp Herberger,Fritz Walter",
        "1958,Brazil,Vicente Feola,Hilderaldo Bellini",
        "1962,Brazil,Aymore Moreira,Mauro Ramos",
        "1966,England,Alf Ramsey,Bobby Moore",
        "1970,Brazil,Mario Zagallo,Carlos Alberto",
        "1974,Germany,Helmut Schon,Franz Beckenbauer",
        "1978,Argentina,Cesar Luis Menotti,Daniel Passarella",
        "1982,Italy,Enzo Bearzot,Dino Zoff",
        "1986,Argentina,Carlos Bilardo,Diego Maradona",
        "1990,Germany,Franz Beckenbauer,Lothar Matthus",
        "1994,Brazil,Carlos Alberto Parreira,Dunga",
        "1998,France,Aime Jacquet,Didier Deschamps",
        "2002,Brazil,Luiz Felipe Scolari,Cafu",
        "2006,Italy,Marcello Lippi,Fabio Cannavaro",
        "2010,Spain,Vicente del Bosque,Iker Casillas",
        "2014,Germany,Joachim Low,Philipp Lahm"
    ]
    champions = createDict(inFile)
    printChamps(champions)
    

def createDict(file):
    winners = defaultdict(str)
    try:
        for line in file:
            year, country, coach, captain = (line.strip()).split(",")
            if(year == "Year"): ## ignoring the first line
                continue
            if(winners.get(country) is None): 
                winners[country] = {
                    'wins': 1,
                    'years': [year],
                    'captains': [captain],
                    'coaches': [coach]
                }
            else: 
                winners[country]['wins'] += 1
                winners[country]['years'].append(year)
                winners[country]['coaches'].append(coach)
                winners[country]['captains'].append(captain)
        return (winners)
    except Exception as e:
        print(e)    

def printChamps(unsorted_dict):
    print("-------------------------------------------------------------------------|")
    print(f"{'Country':10} | {'Wins':5} | {'Year':5} | {'Captain':20} | {'Coach':20} |")
    print("-------------------------------------------------------------------------|")
    sorted_dict = dict(sorted(unsorted_dict.items(), key=lambda x: x[1]['wins'], reverse=True))
    for key in sorted_dict:
        print ("%-10s | %5i | %-5s | %-20s | %-20s |" % (key, sorted_dict[key]['wins'], ", ".join(sorted_dict[key]['years']), ", ".join(sorted_dict[key]['captains']), ", ".join(sorted_dict[key]['coaches'])))
        print("-------------------------------------------------------------------------|")
    print("-------------------------------------------------------------------------|")
if __name__ == '__main__':
    main()

Output

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Country    | Wins  | Year                         | Captain                                                       | Coach                                                                                      |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Brazil     |     5 | 1958, 1962, 1970, 1994, 2002 | Hilderaldo Bellini, Mauro Ramos, Carlos Alberto, Dunga, Cafu  | Vicente Feola, Aymore Moreira, Mario Zagallo, Carlos Alberto Parreira, Luiz Felipe Scolari |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Italy      |     4 | 1934, 1938, 1982, 2006       | Gianpiero Combi, Giuseppe Meazza, Dino Zoff, Fabio Cannavaro  | Vittorio Pozzo, Vittorio Pozzo, Enzo Bearzot, Marcello Lippi                               |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Germany    |     4 | 1954, 1974, 1990, 2014       | Fritz Walter, Franz Beckenbauer, Lothar Matthus, Philipp Lahm | Sepp Herberger, Helmut Schon, Franz Beckenbauer, Joachim Low                               |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Uruguay    |     2 | 1930, 1950                   | Jose Nasazzi, Obdulio Varela                                  | Alberto Suppici, Juan Lopez                                                                |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Argentina  |     2 | 1978, 1986                   | Daniel Passarella, Diego Maradona                             | Cesar Luis Menotti, Carlos Bilardo                                                         |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
England    |     1 | 1966                         | Bobby Moore                                                   | Alf Ramsey                                                                                 |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
France     |     1 | 1998                         | Didier Deschamps                                              | Aime Jacquet                                                                               |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Spain      |     1 | 2010                         | Iker Casillas                                                 | Vicente del Bosque                                                                         |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

Answered By: Milly Stack

Answer 3

This solution takes advantage of the pandas library’s dataframe functionality.

import pandas as pd

print(f'Country   Wins Years')
data = pd.read_csv('world_cup_champions.txt') # Create a pandas dataframe from the data
bycountry = data.groupby('Country')  # Group the dataframe by country
for country, df in bycountry:  # For each (country) group
    years = [str(x[1]['Year']) for x in df.iterrows()] # Extract the years into a list
    print(f'{country:<10} {len(years)}  ', ', '.join(years))  # Print results

Which can be simplified to:

import pandas

print(f'Country   Wins Years')
for country, df in pandas.read_csv('world_cup_champions.txt').groupby('Country'):
    years = [str(x[1]['Year']) for x in df.iterrows()]
    print(f'{country:<10} {len(years)}  ', ', '.join(years))

Output:

Country   Wins Years
Argentina  2   1978, 1986
Brazil     5   1958, 1962, 1970, 1994, 2002
England    1   1966
France     1   1998
Germany    4   1954, 1974, 1990, 2014
Italy      4   1934, 1938, 1982, 2006
Spain      1   2010
Uruguay    2   1930, 1950
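
For comparison, a rough standard-library analogue of the grouping step is itertools.groupby. This is only a sketch: the helper name wins_by_country and the shortened sample rows are assumptions for illustration, and unlike pandas, itertools.groupby merges only adjacent items, so the rows have to be sorted by country first.

```python
from itertools import groupby
from operator import itemgetter

# Sample (year, country) pairs taken from the first rows of the question's file.
rows = [
    ("1930", "Uruguay"), ("1934", "Italy"), ("1938", "Italy"),
    ("1950", "Uruguay"), ("1954", "Germany"), ("1958", "Brazil"),
]


def wins_by_country(rows):
    """Return {country: [years]} (hypothetical helper name)."""
    # groupby only merges neighbouring items, so sort by country first.
    # sorted() is stable, so years keep their file order within a country.
    by_country = sorted(rows, key=itemgetter(1))
    return {
        country: [year for year, _ in group]
        for country, group in groupby(by_country, key=itemgetter(1))
    }


print("Country   Wins Years")
for country, years in wins_by_country(rows).items():
    print(f"{country:<10} {len(years)}   {', '.join(years)}")
```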

Answered By: C. Pappy

Answer 4

Suggestion 1: With `pandas`

If you’re allowed to use pandas, then you can just

import pandas as pd  # DataFrame.to_markdown (used below) also requires the optional tabulate package
def main():
    inp_filepath = 'world_cup_champions.txt'

    df = pd.read_csv(inp_filepath) ## read from file 
    champions_df = df.groupby('Country').agg(
        Wins=('Year',len), Years=('Year',list), 
        Captains=('Captain', list), Coaches=('Coach', list) )

    champions = champions_df.to_dict('index')

    print(champions_df[['Wins']].assign(
        Years=champions_df['Years'].apply(lambda y: ', '.join(f'{i}' for i in y)) 
    ).sort_values('Wins', ascending=False).to_markdown(tablefmt='rst'))

(Use .sort_index() instead of .sort_values('Wins', ascending=False) to sort the rows alphabetically by country rather than by most wins.)

champions would look like

{
  'Argentina': {'Wins': 2, 'Years': [1978, 1986], 'Captains': ['Daniel Passarella', 'Diego Maradona'], 'Coaches': ['Cesar Luis Menotti', 'Carlos Bilardo']},
  'Brazil': {'Wins': 5, 'Years': [1958, 1962, 1970, 1994, 2002], 'Captains': ['Hilderaldo Bellini', 'Mauro Ramos', 'Carlos Alberto', 'Dunga', 'Cafu'], 'Coaches': ['Vicente Feola', 'Aymore Moreira', 'Mario Zagallo', 'Carlos Alberto Parreira', 'Luiz Felipe Scolari']},
  'England': {'Wins': 1, 'Years': [1966], 'Captains': ['Bobby Moore'], 'Coaches': ['Alf Ramsey']},
  'France': {'Wins': 1, 'Years': [1998], 'Captains': ['Didier Deschamps'], 'Coaches': ['Aime Jacquet']},
  'Germany': {'Wins': 4, 'Years': [1954, 1974, 1990, 2014], 'Captains': ['Fritz Walter', 'Franz Beckenbauer', 'Lothar Matthäus', 'Philipp Lahm'], 'Coaches': ['Sepp Herberger', 'Helmut Schon', 'Franz Beckenbauer', 'Joachim Low']},
  'Italy': {'Wins': 4, 'Years': [1934, 1938, 1982, 2006], 'Captains': ['Gianpiero Combi', 'Giuseppe Meazza', 'Dino Zoff', 'Fabio Cannavaro'], 'Coaches': ['Vittorio Pozzo', 'Vittorio Pozzo', 'Enzo Bearzot', 'Marcello Lippi']},
  'Spain': {'Wins': 1, 'Years': [2010], 'Captains': ['Iker Casillas'], 'Coaches': ['Vicente del Bosque']},
  'Uruguay': {'Wins': 2, 'Years': [1930, 1950], 'Captains': ['Jose Nasazzi', 'Obdulio Varela'], 'Coaches': ['Alberto Suppici', 'Juan Lopez']}
}

and the printed output would be

=========  ======  ============================
Country      Wins  Years
=========  ======  ============================
Brazil          5  1958, 1962, 1970, 1994, 2002
Germany         4  1954, 1974, 1990, 2014
Italy           4  1934, 1938, 1982, 2006
Argentina       2  1978, 1986
Uruguay         2  1930, 1950
England         1  1966
France          1  1998
Spain           1  2010
=========  ======  ============================

Suggestion 2: Without `pandas`

def createDict(file_lines):
    finalDict = {}
    rKeys = ['Years', 'Coaches', 'Captains']
    fData = [l.strip().split(',') for l in file_lines[1:]]

    for year,country,coach,captain in fData:
        finalDict.setdefault(country, {'Wins':0, **{k:[] for k in rKeys}})
        finalDict[country]['Wins'] += 1
        for k,v in zip(rKeys,[year,coach,captain]): 
            finalDict[country][k].append(v)
    return finalDict

def printChamps(unsorted_dict, idx='Country',idxLen=10,ctKey='Wins',cLen=4,**lenRef):
    if not lenRef: 
        lenRef = {'Years':30} ## {'Years':30,'Captains':70,'Coaches':50} ## 
    
    vals_n_lens = [(idx,idxLen), (ctKey,cLen), *lenRef.items()]
    print("  ".join(f"{k:{l}}" for k,l in vals_n_lens))
    print("  ".join(f"{'='*len(k):{l}}" for k,l in vals_n_lens))

    for c, rDict in sorted(unsorted_dict.items()):
        vals_n_lens = [(c,idxLen), (rDict[ctKey],cLen)]
        vals_n_lens += [(', '.join(rDict.get(k,[])), l) for k, l in lenRef.items()]
        print("  ".join(f"{k:<{l}}" for k,l in vals_n_lens))

def main():
    inp_filepath = 'world_cup_champions.txt'
    with open(inp_filepath, 'r') as f: 
        champions = createDict(f.readlines())
    printChamps(champions)

champions would look the same as in the pandas solution, except that the years are strings rather than integers, and the printed output would look a bit closer to the desired output depicted in your question:

Country     Wins  Years                         
=======     ====  =====                         
Argentina   2     1978, 1986                    
Brazil      5     1958, 1962, 1970, 1994, 2002  
England     1     1966                          
France      1     1998                          
Germany     4     1954, 1974, 1990, 2014        
Italy       4     1934, 1938, 1982, 2006        
Spain       1     2010                          
Uruguay     2     1930, 1950

If you either changed the default lenRef in printChamps to {'Years':30,'Captains':70,'Coaches':50} (currently commented out), or called printChamps from main like printChamps(champions, Years=30, Captains=70, Coaches=50), then the printed output would look like

Country     Wins  Years                           Captains                                                                Coaches                                           
=======     ====  =====                           ========                                                                =======                                           
Argentina   2     1978, 1986                      Daniel Passarella, Diego Maradona                                       Cesar Luis Menotti, Carlos Bilardo                
Brazil      5     1958, 1962, 1970, 1994, 2002    Hilderaldo Bellini, Mauro Ramos, Carlos Alberto, Dunga, Cafu            Vicente Feola, Aymore Moreira, Mario Zagallo, Carlos Alberto Parreira, Luiz Felipe Scolari
England     1     1966                            Bobby Moore                                                             Alf Ramsey                                        
France      1     1998                            Didier Deschamps                                                        Aime Jacquet                                      
Germany     4     1954, 1974, 1990, 2014          Fritz Walter, Franz Beckenbauer, Lothar Matthäus, Philipp Lahm          Sepp Herberger, Helmut Schon, Franz Beckenbauer, Joachim Low
Italy       4     1934, 1938, 1982, 2006          Gianpiero Combi, Giuseppe Meazza, Dino Zoff, Fabio Cannavaro            Vittorio Pozzo, Vittorio Pozzo, Enzo Bearzot, Marcello Lippi
Spain       1     2010                            Iker Casillas                                                           Vicente del Bosque                                
Uruguay     2     1930, 1950                      Jose Nasazzi, Obdulio Varela                                            Alberto Suppici, Juan Lopez
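
The tables above are in alphabetical order. If you would rather list the countries by most wins without pandas, a sketch along these lines might work. The helper name sort_by_wins and the trimmed sample dictionary are assumptions for illustration; the sample only mirrors the 'Wins' and 'Years' keys that createDict produces.

```python
def sort_by_wins(champions):
    """Order countries by most wins, then alphabetically (hypothetical helper)."""
    # Negating the count sorts wins in descending order, while the country
    # name, the second element of the key, breaks ties in ascending order.
    return dict(
        sorted(champions.items(), key=lambda item: (-item[1]["Wins"], item[0]))
    )


# A trimmed sample in the same shape as the dictionary built by createDict.
sample = {
    "Argentina": {"Wins": 2, "Years": ["1978", "1986"]},
    "Brazil": {"Wins": 5, "Years": ["1958", "1962", "1970", "1994", "2002"]},
    "England": {"Wins": 1, "Years": ["1966"]},
    "Germany": {"Wins": 4, "Years": ["1954", "1974", "1990", "2014"]},
    "Italy": {"Wins": 4, "Years": ["1934", "1938", "1982", "2006"]},
    "Uruguay": {"Wins": 2, "Years": ["1930", "1950"]},
}

for country, details in sort_by_wins(sample).items():
    print(f"{country:<10}  {details['Wins']:<4}  {', '.join(details['Years'])}")
```

Because printChamps re-sorts its argument alphabetically, its loop would have to iterate over sort_by_wins(unsorted_dict).items() instead of sorted(unsorted_dict.items()) for the new order to show up in the table.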

Answered By: Driftr95
