How can I convert for loop into .apply?

Question:

I’m solving algorithm problems and want to know whether I can convert this for loop to use .apply, but I don’t know how. Here is my code:

# Let's use .apply

import pandas as pd

class Solution:
    def longestCommonPrefix(self, strs: list[str]) -> str:
        result = ""

        for i in range(len(strs[0])):
            for s in strs:
                if i == len(s) or s[i] != strs[0][i]:
                    return result
            result += strs[0][i]

        return result

strs = ['flower','flow','flight']
df = pd.DataFrame([strs])
df
Solution().longestCommonPrefix(df.iloc[0])

Here is the output:

'fl'

It’s basically an algorithm that returns the longest common prefix of the given words.

Asked By: Joshua Chung

Answers:

Sure you can. Please refer to the example below. I’ve taken the liberty of refactoring your code to be a bit more concise and better suited to the .apply method, and added a little documentation.

For example:

import pandas as pd

def longestprefix(s: pd.Series) -> str:
    """Determine the longest prefix in a collection of words.
    
    Args:
        s (pd.Series): A pandas Series containing words to be analysed.
        
    Returns:
        str: The longest common series of characters, from the beginning.
    
    """
    result = ''
    for chars in zip(*s):
        # Test that all characters are the same.
        if all(chars[0] == c for c in chars[1:]):
            result += chars[0]
        else:
            break
    return result

# Create a DataFrame containing strings to be analysed.    
df = pd.DataFrame(data=['flower','flow','flight', 'flows'])

# Apply the function to each *column*. (i.e. axis=0)
df.apply(longestprefix, axis=0)

Output:

0    fl
dtype: object

To get the single string you can use:

>>> df.apply(longestprefix, axis=0).iloc[0]
'fl'
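
If you would rather keep the layout from the question (one row, one word per column, as pd.DataFrame([strs]) produces), the same function should work with axis=1, which passes each row to the function as a Series. This is only a sketch: the names row_df and row_result are mine, not from the question, and the function is repeated so the snippet runs on its own.

```python
import pandas as pd

def longestprefix(s: pd.Series) -> str:
    """Same function as above, repeated so this snippet is self-contained."""
    result = ''
    for chars in zip(*s):
        # Keep extending the prefix while every word agrees on this letter.
        if all(chars[0] == c for c in chars[1:]):
            result += chars[0]
        else:
            break
    return result

# Row-oriented frame, as built in the question: one row, three columns.
row_df = pd.DataFrame([['flower', 'flow', 'flight']])

# axis=1 applies the function to each *row* instead of each column.
row_result = row_df.apply(longestprefix, axis=1)

# One result per row; take the first (and only) one as a plain string.
print(row_result.iloc[0])
```

Either way, .apply still calls an ordinary Python function once per row or column, so the loop has moved rather than disappeared.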

Further detail:

Per the comment, to further describe why zip(*s) is used:

The zip function packs the nth letter of each word in the Series together into a tuple. In other words, it lets us compare all of the first letters, all of the second letters, and so on. To demonstrate, type list(zip(*df[0])) (where 0 is the column name) to show the items being iterated. If you iterate the Series directly, you’re only iterating over whole words, rather than over tuples of nth letters.

For example:

[('f', 'f', 'f', 'f'),  # <-- These are the values being compared.
 ('l', 'l', 'l', 'l'),
 ('o', 'o', 'i', 'o'),
 ('w', 'w', 'g', 'w')]
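
Note that there are only four tuples above because zip stops at the shortest word ('flow'), so the comparison never runs past the end of any word. Here is a small sketch of the same idea on a plain list, without pandas. The names words, columns and prefix_len are just for illustration, and the len(set(chars)) check is a swapped-in alternative to the all(...) test used in the answer.

```python
words = ['flower', 'flow', 'flight', 'flows']

# Unpack the words so zip groups the nth letter of every word together.
# zip stops at the shortest word, so no index checks are needed.
columns = list(zip(*words))

prefix_len = 0
for chars in columns:
    # A set with more than one element means the letters differ here.
    if len(set(chars)) != 1:
        break
    prefix_len += 1

# Slice the shared prefix off any one of the words.
prefix = words[0][:prefix_len]
print(columns)
print(prefix)
```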
Answered By: S3DEV