Columns must be same length as key when using lru_cache

Question:

I have a problem. I want to get the latitude and longitude coordinates for each address. Inside the function, I want to check whether an address already has a latitude and longitude value and, if so, reuse it instead of querying geolocator.geocode(df['address']) again. Unfortunately, I get the error ValueError: Columns must be same length as key.

Dataframe

                                         address  customer
0              Surlej, 7513, Silvaplana, Schweiz         1
1  Vodnikova cesta 35, 1000 Ljubljana, Slowenien         2
2              Surlej, 7513, Silvaplana, Schweiz         1

Code

from functools import lru_cache

import pandas as pd
from geopy.exc import GeocoderTimedOut
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent='testing_stackoverflow')

d = {
    "address": ['Surlej, 7513, Silvaplana, Schweiz', 'Vodnikova cesta 35, 1000 Ljubljana, Slowenien', 'Surlej, 7513, Silvaplana, Schweiz',],
    "customer": [1, 2, 1],
}
df = pd.DataFrame(data=d)
print(df)

# Cache results per address so a repeated address is geocoded only once.
@lru_cache(maxsize=None)
def function_that_returns_lat_lon_from_address(address):
    try:
        # The geocode call sits inside the try block so the timeout handler applies to it.
        location = geolocator.geocode(address, timeout=10)
        print(location)
        if location is None:
            return (None, None)
        return (location.latitude, location.longitude)
    except GeocoderTimedOut as e:
        print("Timeout ", e)
        return (None, None)

df[['lat', 'lon']] = df['address'].apply(function_that_returns_lat_lon_from_address)

What I want

                                         address  customer   latitude  longitude
0              Surlej, 7513, Silvaplana, Schweiz         1  46.459902   9.803370
1  Vodnikova cesta 35, 1000 Ljubljana, Slowenien         2  46.065523  14.490775
2              Surlej, 7513, Silvaplana, Schweiz         1  46.459902   9.803370

Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-4873cdd27090> in <module>()
     24         return(None, None)
     25 
---> 26 df[['lat', 'lon']] = df['address'].apply(function_that_returns_lat_lon_from_address)

2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/frame.py in _iset_not_inplace(self, key, value)
   3673         if self.columns.is_unique:
   3674             if np.shape(value)[-1] != len(key):
-> 3675                 raise ValueError("Columns must be same length as key")
   3676 
   3677             for i, col in enumerate(key):

ValueError: Columns must be same length as key
Asked By: Test

Answers:

Solution to the problem

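df['address'].apply(...) returns a one-dimensional Series whose elements are (lat, lon) tuples, and pandas cannot spread that Series across the two columns named in the key, hence the ValueError. Convert the Series into a list of tuples so that each row carries two items. pandas then unpacks the tuples and assigns the values to the two columns.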
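
To make the shape difference concrete, here is a minimal sketch that needs no network access. fake_lat_lon and KNOWN are hypothetical stand-ins for the geocoding function, with coordinates copied from the expected output in the question; they are not part of geopy.

```python
import pandas as pd

# Hypothetical lookup table replacing the real geocoder
# (values copied from the question's expected output).
KNOWN = {
    "Surlej, 7513, Silvaplana, Schweiz": (46.459902, 9.803370),
    "Vodnikova cesta 35, 1000 Ljubljana, Slowenien": (46.065523, 14.490775),
}

def fake_lat_lon(address):
    # Return a (lat, lon) tuple, or (None, None) for an unknown address.
    return KNOWN.get(address, (None, None))

df = pd.DataFrame({
    "address": [
        "Surlej, 7513, Silvaplana, Schweiz",
        "Vodnikova cesta 35, 1000 Ljubljana, Slowenien",
        "Surlej, 7513, Silvaplana, Schweiz",
    ],
    "customer": [1, 2, 1],
})

# apply() yields a one-dimensional Series holding one tuple per row.
as_series = df["address"].apply(fake_lat_lon)
print(as_series.shape)

# tolist() turns it into a plain list of 2-tuples, i.e. rows x 2 values.
as_list = as_series.tolist()
print(as_list[0])

# A list of 2-tuples can be assigned to two columns at once.
df[["lat", "lon"]] = as_list
print(df[["lat", "lon"]])
```

The Series has one entry per row, while the list of tuples has two values per row, which is what the two-column assignment needs.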

A slightly faster solution

df[['lat', 'lon']] = list(map(function_that_returns_lat_lon_from_address, df.address))

Alternatively, keep your original code and simply append a .tolist() conversion:

df[['lat', 'lon']] = df['address'].apply(function_that_returns_lat_lon_from_address).tolist()
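
On the caching part of the question, the lru_cache decorator already prevents a repeated address from being looked up twice. The sketch below illustrates this with a hypothetical stand-in for the geocoder (cached_lookup and the calls list are made up for the example, and the returned coordinates are dummy values), so it runs without network access.

```python
from functools import lru_cache

calls = []  # records every address that actually reaches the fake geocoder

@lru_cache(maxsize=None)
def cached_lookup(address):
    # Hypothetical stand-in for geolocator.geocode; returns dummy coordinates.
    calls.append(address)
    return (float(len(address)), 0.0)

addresses = [
    "Surlej, 7513, Silvaplana, Schweiz",
    "Vodnikova cesta 35, 1000 Ljubljana, Slowenien",
    "Surlej, 7513, Silvaplana, Schweiz",
]

coords = [cached_lookup(a) for a in addresses]

# cache_info() reports how many calls were served from the cache.
info = cached_lookup.cache_info()
print(len(calls), info.hits, info.misses)
```

With three addresses of which two are identical, the function body runs twice and the third call is served from the cache. One thing to keep in mind: lru_cache stores whatever the function returns, so a (None, None) result returned after a timeout is cached as well and that address is not retried until the cache is cleared with cache_clear().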
Answered By: Shubham Sharma