How to sum the second element of each tuple, per fruit, when the numbers are strings?
Question:
I am trying to sum the totals per type of fruit.
This is what I have:
listfruit= [('Watermeloenen', '123,20'), ('Watermeloenen', '2.772,00'), ('Watermeloenen', '46,20'), ('Watermeloenen', '577,50'), ('Watermeloenen', '69,30'), ('Appels', '3.488,16'), ('Sinaasappels', '137,50'), ('Sinaasappels', '500,00'), ('Sinaasappels', '1.000,00'), ('Sinaasappels', '2.000,00'), ('Sinaasappels', '1.000,00'), ('Sinaasappels', '381,25')]
def total_cost_fruit_per_sort():
    number_found = listfruit
    fruit_dict = {}
    for n, f in number_found:
        fruit_dict[f] = fruit_dict.get(f, 0) + int(n)
    result = '\n'.join(f'{key}: {val}' for key, val in fruit_dict.items())
    return result

print(total_cost_fruit_per_sort())
The output should look like this:
Watermeloenen: 800
Sinaasappels: 1000
But if I run the code I get this error:
File "c:UsersengelDocumentspythoncodeextract_text.py", line 292, in <genexpr>
result = sum(int(n) for _, n in listfruit)
ValueError: invalid literal for int() with base 10: '123,20'
I am converting the second value to an int, and I also tried converting it to a float, but neither works.
Question: how can I calculate the total for each type of fruit?
Answers:
I believe you want to achieve something like this:
result = sum(float(n.replace(".", "").replace(",", ".")) for _, n in listfruit)
UPD:
listfruit = [('Watermeloenen', '123,20'), ('Watermeloenen', '2.772,00'), ('Watermeloenen', '46,20'), ('Watermeloenen', '577,50'), ('Watermeloenen', '69,30'), ('Appels', '3.488,16'), ('Sinaasappels', '137,50'), ('Sinaasappels', '500,00'), ('Sinaasappels', '1.000,00'), ('Sinaasappels', '2.000,00'), ('Sinaasappels', '1.000,00'), ('Sinaasappels', '381,25')]
def total_cost_fruit_per_sort():
    fruit_dict = {}
    for fruit, num in listfruit:
        fruit_dict[fruit] = fruit_dict.get(fruit, 0) + float(num.replace(".", "").replace(",", "."))
    return fruit_dict

for fruit, total in total_cost_fruit_per_sort().items():
    print(f"{fruit}: {total}")
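Since the values look like monetary amounts, summing them as binary floats can leave small rounding residue in the totals. Below is a minimal sketch of the same per-fruit loop using decimal.Decimal instead of float. The helper name parse_dutch_amount is illustrative and not part of the original answer, and the function takes the list as a parameter purely for convenience.

```python
from decimal import Decimal

listfruit = [
    ('Watermeloenen', '123,20'), ('Watermeloenen', '2.772,00'),
    ('Watermeloenen', '46,20'), ('Watermeloenen', '577,50'),
    ('Watermeloenen', '69,30'), ('Appels', '3.488,16'),
    ('Sinaasappels', '137,50'), ('Sinaasappels', '500,00'),
    ('Sinaasappels', '1.000,00'), ('Sinaasappels', '2.000,00'),
    ('Sinaasappels', '1.000,00'), ('Sinaasappels', '381,25'),
]


def parse_dutch_amount(text):
    """Hypothetical helper: turn '2.772,00' into Decimal('2772.00')."""
    # Drop the thousands separator, then turn the decimal comma into a dot.
    return Decimal(text.replace('.', '').replace(',', '.'))


def total_cost_fruit_per_sort(pairs):
    """Sum the amounts per fruit name, keeping exact decimal arithmetic."""
    totals = {}
    for fruit, amount in pairs:
        totals[fruit] = totals.get(fruit, Decimal('0')) + parse_dutch_amount(amount)
    return totals


totals = total_cost_fruit_per_sort(listfruit)
for fruit, total in totals.items():
    print(f"{fruit}: {total}")
```

The totals keep two decimal places because Decimal preserves the precision of its inputs, which is usually what one wants for prices.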
Here’s a solution that uses .replace to handle the . and , in the strings, and eval to turn them into numbers.
def total_cost_fruit_per_sort(debug=False):
    number_found = listfruit
    fruit_dict = {}
    for f, n in number_found:
        # '2.772,00' -> '2772.00'; eval then turns that string into a float
        # (float() would do the same without evaluating arbitrary expressions)
        parsed_n = eval(n.replace('.', '').replace(',', '.'))
        if debug:
            print(f, n, parsed_n)
        fruit_dict[f] = fruit_dict.get(f, 0) + parsed_n
    result = '\n'.join(f'{key}: {val}' for key, val in fruit_dict.items())
    return result

print(total_cost_fruit_per_sort())
This gives as output:
Watermeloenen: 3588.2
Appels: 3488.16
Sinaasappels: 5018.75
If you want to double-check the string-to-number conversions, you can run the function with debug=True.
Since your numbers are localized with specific separators for thousands and decimals, the proper way to do this is to use the locale module to parse the numbers. (They could also be "brute-force" parsed by hardcoding replacements of "." with "" and "," with ".". For beginner programmers it is sometimes more important to understand the mechanics of doing this than to copy and paste the appropriate locale functionality; a seasoned programmer has to understand both ways, and "feel in the guts" that going through locale is better in this case.)
From your fruit names, I assume you are using the NL locale, which uses "," as the decimal separator in numbers. After setting the locale, a call to locale.delocalize will convert the number representation to use no thousands separator and "." as the decimal separator, at which point one can call Python’s float (not int) to convert it to a number.
import locale
...
def total_cost_fruit_per_sort():
    # Switch numeric parsing to Dutch conventions
    locale.setlocale(locale.LC_NUMERIC, ("nl", "utf-8"))
    fruit_dict = {}
    for f, n in listfruit:
        fruit_dict[f] = fruit_dict.get(f, 0) + float(locale.delocalize(n))
    return fruit_dict  # no reason to manually convert the dict to a string: printing a dict already shows each fruit with its total

(Unfortunately, locale naming is not uniform across operating systems. If the setlocale call fails, you might need to try variants of the ("nl", "utf-8") argument. If you are on Windows and it fails, first try replacing "utf-8" with "iso8859-1".)
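Because the available locale names differ per system, one option is to try a few candidates in turn and to restore the previous setting afterwards. The following is only a sketch under assumptions: the candidate names are guesses at common spellings of a Dutch locale, and the helper name parse_with_dutch_locale is made up for illustration. Whether the thousands separator is recognized also depends on the locale data shipped with the platform, so the example parses a value without one.

```python
import locale

# Assumption: possible spellings of a Dutch locale; which ones exist depends on the OS.
CANDIDATE_LOCALES = ("nl_NL.UTF-8", "nl_NL.utf8", "nl_NL", "Dutch_Netherlands.1252")


def parse_with_dutch_locale(text):
    """Hypothetical helper: parse text as a float under the first Dutch
    locale that can be set, or return None if none is installed."""
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        for name in CANDIDATE_LOCALES:
            try:
                locale.setlocale(locale.LC_NUMERIC, name)
            except locale.Error:
                continue  # this spelling is not available here, try the next
            return locale.atof(text)
        return None
    finally:
        # Always put the numeric locale back, whatever happened above.
        locale.setlocale(locale.LC_NUMERIC, previous)


value = parse_with_dutch_locale("123,20")
print(value)
```

Note that setlocale changes process-wide state, so a helper like this is best kept out of multi-threaded code paths.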
And here is the brute-force method, for when one does not care whether the program will ever work with other numeric formats and is not worried about possible edge cases in the format:
def total_cost_fruit_per_sort():
    fruit_dict = {}
    for f, n in listfruit:
        # Remove the thousands separator, then turn the decimal comma into a dot
        n = n.replace(".", "").replace(",", ".")
        fruit_dict[f] = fruit_dict.get(f, 0) + float(n)
    return fruit_dict
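One of the edge cases mentioned above: a value written in the English style, such as "1,234.56", passes through the two replacements without an error and silently becomes 1.23456. A possible guard, sketched here under the assumption that every valid amount has dot-separated groups of three digits followed by a comma and exactly two decimals, is to validate the shape first. The names AMOUNT_RE and parse_amount_strict are illustrative only.

```python
import re

# Assumption: valid amounts look like "1.234.567,89".
AMOUNT_RE = re.compile(r"\d{1,3}(?:\.\d{3})*,\d{2}")


def parse_amount_strict(text):
    """Hypothetical helper: convert a Dutch-formatted amount to float,
    rejecting anything that does not match the expected shape."""
    if not AMOUNT_RE.fullmatch(text):
        raise ValueError(f"unexpected amount format: {text!r}")
    return float(text.replace(".", "").replace(",", "."))


print(parse_amount_strict("2.772,00"))

# The unchecked replacements accept a wrongly formatted value without complaint:
print(float("1,234.56".replace(".", "").replace(",", ".")))
```

With the check in place, the same malformed input raises a ValueError that names the offending string instead of producing a wrong total.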
Try this:
import locale
locale._override_localeconv = {'thousands_sep': '.', 'decimal_point': ','}
new_listfruit = [(n, locale.atof(v)) for n, v in listfruit]
Output:
>>> new_listfruit
[('Watermeloenen', 123.2), ('Watermeloenen', 2772.0), ('Watermeloenen', 46.2), ('Watermeloenen', 577.5), ('Watermeloenen', 69.3), ('Appels', 3488.16), ('Sinaasappels', 137.5), ('Sinaasappels', 500.0), ('Sinaasappels', 1000.0), ('Sinaasappels', 2000.0), ('Sinaasappels', 1000.0), ('Sinaasappels', 381.25)]
Now you can perform any operation on the floats (and remember that you might need to reset the locale to its default setting afterwards).
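To connect this back to the per-fruit totals, here is a small sketch that applies the override, sums the converted values per fruit, and then restores the previous override. Keep in mind that _override_localeconv is a private, undocumented attribute of CPython's locale module, so this relies on an implementation detail; the shortened sample list is only for illustration.

```python
import locale
from collections import defaultdict

# Shortened sample in the same format as the question's data.
listfruit = [
    ('Watermeloenen', '123,20'), ('Watermeloenen', '2.772,00'),
    ('Appels', '3.488,16'), ('Sinaasappels', '1.000,00'),
]

# Private CPython attribute (implementation detail): remember it so it can be restored.
saved = dict(locale._override_localeconv)
locale._override_localeconv = {'thousands_sep': '.', 'decimal_point': ','}
try:
    totals = defaultdict(float)
    for fruit, amount in listfruit:
        totals[fruit] += locale.atof(amount)
finally:
    # Reset the override so later locale calls behave as before.
    locale._override_localeconv = saved

for fruit, total in totals.items():
    print(f"{fruit}: {total:.2f}")
```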