Parse list of strings and find max values

Question

I’m quite new to Python and struggling to get my head round the logic in this for loop. My data has two values, a city and a temp. I would like to write a "for loop" that outputs the maximum temp for each city as follows:

PAR 31
LON 23
RIO 36
DUB 44

As it is to be used in Hadoop, I can’t use any Python libraries.

Here is my dataset:

['PAR,31',
 'PAR,18',
 'PAR,14',
 'PAR,18',
 'LON,12',
 'LON,13',
 'LON,9',
 'LON,23',
 'LON,5',
 'RIO,36',
 'RIO,33',
 'RIO,21',
 'RIO,25',
 'DUB,44',
 'DUB,42',
 'DUB,38',
 'DUB,34']

This is my code:

current_city = None
current_max = 0

for line in lines:
    (city, temp) = line.split(',')
   
    temp = int(temp)
    
    if city == current_city:
        if current_max < temp:
            current_max == temp

    current_city = city
            
print(current_city, current_max)

This was my output:

DUB 0

Asked By: angeliquelinde

Answer 1

You could iterate over your list and split each entry into city and temperature. If the city is already in the dictionary, check whether the new temperature is higher than the one stored there and, if so, replace the entry.

If the city isn’t in the dictionary yet, simply add it.


a = ['PAR,31',
 'PAR,18',
 'PAR,14',
 'PAR,18',
 'LON,12',
 'LON,13',
 'LON,9',
 'LON,23',
 'LON,5',
 'RIO,36',
 'RIO,33',
 'RIO,21',
 'RIO,25',
 'DUB,44',
 'DUB,42',
 'DUB,38',
 'DUB,34']

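max_temps = {}  # not named "dict", which would shadow the built-in type
for entry in a:
    city, temp = entry.split(",")
    temp = int(temp)
    # Store the value if the city is new or this reading is higher.
    if city not in max_temps or max_temps[city] < temp:
        max_temps[city] = temp

print(max_temps)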

Output:

{'PAR': 31, 'LON': 23, 'RIO': 36, 'DUB': 44}
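
For comparison, the single-pass loop from the question can also be repaired without a dictionary. The sketch below is not part of the original answer and assumes the lines arrive grouped by city, as they do in the sample data. The helper name max_per_city is made up for illustration. The original loop printed DUB 0 for three reasons: it used == (a comparison) where = (an assignment) was needed, it never reset the maximum when the city changed, and it printed only once after the loop.

```python
def max_per_city(lines):
    """Return (city, max_temp) pairs; assumes lines are grouped by city."""
    results = []
    current_city = None
    current_max = None
    for line in lines:
        city, temp = line.split(',')
        temp = int(temp)
        if city == current_city:
            # Same city as the previous line: keep the larger value.
            # This must be an assignment (=), not a comparison (==).
            if temp > current_max:
                current_max = temp
        else:
            # A new city starts: record the finished one, then reset.
            if current_city is not None:
                results.append((current_city, current_max))
            current_city = city
            current_max = temp
    # The last city has no following line to trigger its output.
    if current_city is not None:
        results.append((current_city, current_max))
    return results


lines = ['PAR,31', 'PAR,18', 'LON,12', 'LON,23',
         'RIO,36', 'RIO,33', 'DUB,44', 'DUB,42']
for city, temp in max_per_city(lines):
    print(city, temp)
```

If the input is not grouped by city, the same city would be reported more than once, so the dictionary approach is the safer choice in that case.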

Answered By: Lost_coder

Answer 2

Build a dictionary keyed on city names. The associated values should be a list of integers (the temperatures).

Once the dictionary has been constructed, you can iterate over its items to determine the highest value in each list of temperatures:

data = ['PAR,31',
        'PAR,18',
        'PAR,14',
        'PAR,18',
        'LON,12',
        'LON,13',
        'LON,9',
        'LON,23',
        'LON,5',
        'RIO,36',
        'RIO,33',
        'RIO,21',
        'RIO,25',
        'DUB,44',
        'DUB,42',
        'DUB,38',
        'DUB,34']
d = {}
for e in data:
    city, temp = e.split(',')
    d.setdefault(city, []).append(temp)
for k, v in d.items():
    print(k, max(map(int, v)))

Output:

PAR 31
LON 23
RIO 36
DUB 44

Answered By: Fred

Answer 3

Given the answers here are a bit verbose…

result = {}

for city, t in (l.split(',') for l in lines):
    t = int(t)
    result[city] = max(result.setdefault(city, t), t)

# you can print result however you like, e.g.:
for c, t in result.items():
    print(f"{c} {t}")

If you want to sacrifice a bit of readability for a roughly 30% performance boost, compare the values yourself instead of calling max:

for city, t in (l.split(',') for l in lines):
    t = int(t)
    old_t = result.setdefault(city, t)
    result[city] = old_t if old_t > t else t
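
The size of that speed-up will depend on the Python version and the data, so it may be worth measuring locally. The following is a rough sketch, not part of the original answer. The function names with_max and with_compare, the shortened sample data, and the repeat counts are all arbitrary choices made for illustration. timeit is only used for the local benchmark and is not needed in the Hadoop job itself.

```python
import timeit

# Sample data repeated to give the timer something measurable.
lines = ['PAR,31', 'PAR,18', 'LON,12', 'LON,23',
         'RIO,36', 'RIO,33', 'DUB,44', 'DUB,42'] * 1000


def with_max(lines):
    """Variant that calls the built-in max on every line."""
    result = {}
    for city, t in (l.split(',') for l in lines):
        t = int(t)
        result[city] = max(result.setdefault(city, t), t)
    return result


def with_compare(lines):
    """Variant that compares the two values directly."""
    result = {}
    for city, t in (l.split(',') for l in lines):
        t = int(t)
        old_t = result.setdefault(city, t)
        result[city] = old_t if old_t > t else t
    return result


# Both variants should agree before their timings are compared.
assert with_max(lines) == with_compare(lines)

print("max():     ", timeit.timeit(lambda: with_max(lines), number=20))
print("comparison:", timeit.timeit(lambda: with_compare(lines), number=20))
```

The printed numbers are total seconds for 20 runs each; only their ratio is meaningful.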

Answered By: Klas Š.
