Use lambda function to pull latitude & longitude out of city names

Question:

I have a dataframe of 1100 rows with moving data: things like origin cities and countries as well as destination cities and countries.

The process I’m working through involves taking city names (e.g., Portland, Oregon) and sending them to the Nominatim search page (https://nominatim.openstreetmap.org/search/) to pull out the latitude and longitude.

I found a pretty good one-off example on Stack Overflow:

import requests
import urllib.parse

address = 'Portland, Oregon'
url = 'https://nominatim.openstreetmap.org/search/' + urllib.parse.quote(address) +'?format=json'

response = requests.get(url).json()
print(response[0]["lat"])
print(response[0]["lon"])

This works great even when I have non-city entries (e.g., Texas, United States or Bavaria, Germany).

The issue I’m running into now is that I can’t quite get the code to run down my list of locations in my dataframe column and pull out the info I need.

Here is my code:

segment1 = 'https://nominatim.openstreetmap.org/search/'
segment3 = '?format=json'
df1['json_location_data'] = df1.apply(lambda x: requests.get(segment1 + urllib.parse.quote(str(df1['Origin'])) + segment3).json())

I’m getting an error that reads:

ValueError: Expected a 1D array, got an array with shape (1100, 17)

Not sure how to fix this error, so I created a reproducible example here:

import pandas as pd
locations = ['Portland, Oregon', 'Seattle, Washington','New York, New York','Texas, United States']
df = pd.DataFrame(locations, columns=['locations'])

segment1 = 'https://nominatim.openstreetmap.org/search/'
segment3 = '?format=json'
df['json_location_data'] = df.apply(lambda x: requests.get(segment1 + urllib.parse.quote(str(df['locations'])) + segment3).json())

This works without producing any errors, but returns a column with all NAs.

How can I solve this issue and get the desired data?

Asked By: user2813606


Answers:

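You could perhaps change your code to the line below. It calls apply on the 'locations' column rather than on the whole dataframe, so x is one location string at a time, and the lambda uses x instead of the entire df['locations'] column:

df['json_location_data'] = df['locations'].apply(lambda x: requests.get(segment1 + urllib.parse.quote(str(x)) + segment3).json())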
Answered By: Govinda Rathi
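
To see why the call in the question produces a column of NAs, it can help to check what apply actually passes to the lambda, without making any network requests. The following is a minimal sketch on a two-row version of the example dataframe; the names per_column and per_element are only illustrative.

```python
import pandas as pd

locations = ['Portland, Oregon', 'Seattle, Washington']
df = pd.DataFrame(locations, columns=['locations'])

# DataFrame.apply defaults to axis=0, so the function runs once per COLUMN
# and x is the whole 'locations' Series, not a single city name.
per_column = df.apply(lambda x: type(x).__name__)

# Series.apply runs the function once per ELEMENT, so x is one string.
per_element = df['locations'].apply(lambda x: type(x).__name__)

print(per_column)   # one entry, labelled 'locations'
print(per_element)  # one entry per row, labelled 0 and 1

# The per-column result is indexed by column name, which matches none of the
# row labels (0, 1), so assigning it as a new column fills every row with NaN.
df['result'] = per_column
print(df)
```

The same index mismatch happens in the question: the result of df.apply is labelled by column name, so nothing lines up with the row index when it is assigned back.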

Here’s a version that works. Note that I’m extracting only the lat and long from the rather large structure that gets returned.

import urllib.parse
import pandas as pd
import requests

locations = ['Portland, Oregon', 'Seattle, Washington','New York, New York','Texas, United States']
df = pd.DataFrame(locations, columns=['locations'])

segment1 = 'https://nominatim.openstreetmap.org/search/'
segment3 = '?format=json'
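def getdata(loc):
    # Show progress, one location per request
    print(loc)
    # The response is a list of matches, best match first
    data = requests.get(segment1 + urllib.parse.quote(loc) + segment3).json()
    # Keep only the coordinates of the first match (raises IndexError if there are no matches)
    return {'lat':data[0]['lat'],'lon':data[0]['lon']}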

df['json_location_data'] = df['locations'].apply(getdata)
print(df)

Output:

Portland, Oregon
Seattle, Washington
New York, New York
Texas, United States
              locations                           json_location_data
0      Portland, Oregon  {'lat': '45.5202471', 'lon': '-122.674194'}
1   Seattle, Washington  {'lat': '47.6038321', 'lon': '-122.330062'}
2    New York, New York  {'lat': '40.7127281', 'lon': '-74.0060152'}
3  Texas, United States  {'lat': '31.2638905', 'lon': '-98.5456116'}
Answered By: Tim Roberts
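
For a column of 1100 locations, a slightly more defensive variant may be useful. The sketch below is not from either answer and rests on a few assumptions: it sends the query as the q parameter of the /search endpoint (as described in the Nominatim API documentation), sets an identifying User-Agent and pauses one second per request (as the public server's usage policy asks), and returns None when a location has no match. The names fetch_results, first_lat_lon, add_coordinates and the User-Agent string are hypothetical.

```python
import time

import pandas as pd
import requests

SEARCH_URL = 'https://nominatim.openstreetmap.org/search'
# Placeholder value (assumption): replace with something identifying your script.
HEADERS = {'User-Agent': 'my-geocoding-script/0.1'}


def fetch_results(loc):
    """Query Nominatim for one free-form location string and return the JSON list."""
    params = {'q': loc, 'format': 'json', 'limit': 1}
    resp = requests.get(SEARCH_URL, params=params, headers=HEADERS, timeout=10)
    resp.raise_for_status()
    time.sleep(1)  # assumption: keep to roughly one request per second
    return resp.json()


def first_lat_lon(results):
    """Return (lat, lon) as floats from a result list, or (None, None) if it is empty."""
    if not results:
        return (None, None)
    return (float(results[0]['lat']), float(results[0]['lon']))


def add_coordinates(df, column, fetch=fetch_results):
    """Return a copy of df with 'lat' and 'lon' columns looked up from df[column]."""
    coords = [first_lat_lon(fetch(loc)) for loc in df[column]]
    out = df.copy()
    out['lat'] = [lat for lat, _ in coords]
    out['lon'] = [lon for _, lon in coords]
    return out


# Example call (performs real network requests, so it is left commented out):
# df = add_coordinates(df, 'locations')
```

Passing the lookup in as the fetch argument keeps the parsing separate from the network call, so the dataframe logic can be tried with a stub function before sending 1100 real requests.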