How to print the longest sentence from a column in a csv file

Question:

I am very new to Python and am really struggling with this problem. I have a csv file with several columns, labeled "height", "weight", "full_name", etc. I'm trying to create a function that will look through the full_name column and return the longest name. (So if the longest name in the file was Rachel Smith, I'm trying to return that value.)

Here's the code that's worked the best so far:

import csv
file = "personal_data.csv"
f = open(file)
reader = csv.reader(f, delimiter=",")
col_index = next(reader).index('full_name')
highest = max(rec[col_index] for rec in reader)
print(highest) #using this statement to test if it works
f.close()

I think it’s not working because it’s only printing Rachel, not her full name, Rachel Smith. I’m not really sure though.

Asked By: Emppy


Answers:

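Without a key, max() compares strings alphabetically, so it returns the name that sorts last rather than the longest one. You can use the key= parameter of max() to compare by length instead: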

import csv

with open("personal_data.csv", "r") as f_in:
    reader = csv.reader(f_in, delimiter=",")
    col_index = next(reader).index("full_name")

    highest = max([rec[col_index] for rec in reader], key=len)  # <-- use key=len here

print(highest)  # using this statement to test if it works
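
To see the difference key=len makes, it can help to try max() on a small list first. This is only a sketch: the names below are made up for illustration and are not from the asker's file.

```python
# Hypothetical sample names, not taken from the asker's file.
names = ["Rachel Smith", "Tom Lee", "Alexandra Johnson", "Zoe Ng"]

# Default comparison is lexicographic: the string that sorts last wins.
alphabetical_last = max(names)

# key=len compares by length, but max() still returns the string itself.
longest = max(names, key=len)

print(alphabetical_last)
print(longest)
```

If several names share the maximum length, max() returns the first one it encounters.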
Answered By: Andrej Kesely

Use csv.DictReader to eliminate the need to find the full_name column index. Use max()'s key argument so that it compares the values by length while still returning the value itself, not its length.

import csv


with open('personal_data.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    longest_name = max([row['full_name'] for row in reader], key=len)

print(longest_name)
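
For readers who have not used csv.DictReader before, here is a minimal sketch of what it yields. It assumes a small in-memory CSV (built with io.StringIO) standing in for personal_data.csv; the rows are invented for the example.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for personal_data.csv.
sample = io.StringIO(
    "height,weight,full_name\n"
    "170,65,Rachel Smith\n"
    "182,80,Tom Lee\n"
)

reader = csv.DictReader(sample)
rows = list(reader)

# Each row is a dict keyed by the header names from the first line.
print(rows[0]["full_name"])
print(reader.fieldnames)
```

Because each row is looked up by column name, the code keeps working if the columns are reordered in the file.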

If the file is large enough that you care about memory usage, use map() and itemgetter() to get the names lazily and pass the resulting iterator to max(). This avoids building the full list of names in memory.

import csv
import operator


with open('personal_data.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    names = map(operator.itemgetter('full_name'), reader)
    longest_name = max(names, key=len)

print(longest_name)
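
A generator expression is another way to get the same lazy behaviour, in case map() and itemgetter() feel unfamiliar. This is a sketch under the assumption of a small in-memory CSV with invented rows, not the asker's data.

```python
import csv
import io

# Hypothetical sample data for illustration only.
sample = io.StringIO(
    "full_name,height\n"
    "Rachel Smith,170\n"
    "Alexandra Johnson,165\n"
    "Tom Lee,182\n"
)

reader = csv.DictReader(sample)
# The generator yields one name at a time; no intermediate list is built.
longest_name = max((row["full_name"] for row in reader), key=len)
print(longest_name)
```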

Packaged as a function:

import csv


def get_longest_value_from_col(filename, column_name):
    with open(filename, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        longest_name = max([row[column_name] for row in reader], key=len)

    return longest_name
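
One case worth considering is a file that has a header but no data rows, where max() raises ValueError on the empty input. The sketch below shows one possible way to handle that with max()'s default= argument. The helper name longest_value and the choice to take an open file object (so it can be tried with io.StringIO) are assumptions made for this example.

```python
import csv
import io


def longest_value(csvfile, column_name):
    """Return the longest value in column_name, or None if there are no rows.

    Hypothetical helper for illustration; takes an open file object.
    """
    reader = csv.DictReader(csvfile)
    values = (row[column_name] for row in reader)
    # default= avoids the ValueError that max() raises on an empty iterable.
    return max(values, key=len, default=None)


# Invented sample inputs: one with only a header, one with two rows.
header_only = io.StringIO("height,weight,full_name\n")
two_rows = io.StringIO("full_name\nRachel Smith\nTom Lee\n")

print(longest_value(header_only, "full_name"))
print(longest_value(two_rows, "full_name"))
```

Returning None is just one option; raising a clearer error may suit some callers better.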
Answered By: Michael Ruth