How do I calculate speed from a gpx file if the speed tag itself is not given?

Question:

For example, I might have some information like this:

<trkpt lat="-33.8161780" lon="150.8710320">
 <ele>73.0</ele>
 <time>2017-07-08T22:05:45Z</time>
 <extensions>
  <power>0</power>
  <gpxtpx:TrackPointExtension>
    <gpxtpx:atemp>7</gpxtpx:atemp>
    <gpxtpx:hr>115</gpxtpx:hr>
    <gpxtpx:cad>27</gpxtpx:cad>
  </gpxtpx:TrackPointExtension>
 </extensions>
</trkpt>

How would I calculate speed from this info? I’ve used the python etree module to parse the file, and have all the info in a pandas database.

It was mentioned that I should probably instead show the pandas dataframe. It looks something like this:

                     longitude   latitude   ele   temp
time                
2017-07-08 22:05:45 150.8710320 -33.8161780 73.0    7
2017-07-08 22:05:46 150.8710350 -33.8161500 73.0    7
2017-07-08 22:05:47 150.8710440 -33.8161170 73.0    7
2017-07-08 22:05:48 150.8710540 -33.8160820 73.0    7
2017-07-08 22:05:49 150.8710690 -33.8160430 73.0    7

and so on.

Asked By: probot

||

Answers:

speed equals distance / time. The longitude and latitude presumably represent locations on the surface of the Earth. If we accept a sphere of radius 6371 km as an approximation of the Earth, then we can easily translate the longitude and latitude into xyz-coordinates:

r = 6371000 # meters
df['theta'] = np.deg2rad(df['longitude'])
df['phi'] = np.deg2rad(df['latitude'])
df['x'] = r*np.cos(df['theta'])*np.sin(df['phi'])
df['y'] = r*np.sin(df['theta'])*np.sin(df['phi'])
df['z'] = r*np.cos(df['phi'])

Now it is not hard to compute distances between consecutive points:

df['x2'] = df['x'].shift()
df['y2'] = df['y'].shift()
df['z2'] = df['z'].shift()
df['distance'] = np.sqrt((df['x2']-df['x'])**2 + (df['y2']-df['y'])**2 + (df['z2']-df['z'])**2)

However, this is a chord length — the straight-line distance between two points on the surface of the sphere. If the points are far apart, the chord would be burrowing through the surface of the Earth. Presumably the motion is on the surface of the Earth. So a more accurate calculation for distance would use the arclength:

df['central angle'] = np.arccos((df['x']*df['x2'] + df['y']*df['y2'] + df['z']*df['z2'])/r**2)
df['arclength'] = df['central angle']*r

The central angle is using the dot product formula.

Having computed the arclength (distance), we must now also compute the time interval between consecutive observations (i.e. rows of the DataFrame):

df['time'] = (df.index.to_series().diff() / pd.Timedelta(seconds=1))

So using speed = distance / time:

df['speed'] = df['arclength'] / df['time']  # in meters/second

import numpy as np
import pandas as pd

df = pd.DataFrame({'ele': [73.0, 73.0, 73.0, 73.0, 73.0], 'latitude': [-33.816178, -33.81615, -33.816117, -33.816082, -33.816043], 'longitude': [150.871032, 150.871035, 150.87104399999998, 150.87105400000002, 150.871069], 'temp': [7, 7, 7, 7, 7], 'time': ['2017-07-08 22:05:45', '2017-07-08 22:05:46', '2017-07-08 22:05:47', '2017-07-08 22:05:48', '2017-07-08 22:05:49']})
df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')
columns = df.columns.tolist()

r = 6371000 # radius of the Earth in meters
df['theta'] = np.deg2rad(df['longitude'])
df['phi'] = np.deg2rad(df['latitude'])
df['x'] = r*np.cos(df['theta'])*np.sin(df['phi'])
df['y'] = r*np.sin(df['theta'])*np.sin(df['phi'])
df['z'] = r*np.cos(df['phi'])
df['x2'] = df['x'].shift()
df['y2'] = df['y'].shift()
df['z2'] = df['z'].shift()
df['distance'] = np.sqrt((df['x2']-df['x'])**2 + (df['y2']-df['y'])**2 + (df['z2']-df['z'])**2)

df['central angle'] = np.arccos((df['x']*df['x2'] + df['y']*df['y2'] + df['z']*df['z2'])/r**2)
df['arclength'] = df['central angle']*r

df['time'] = (df.index.to_series().diff() / pd.Timedelta(seconds=1))
df['speed'] = df['arclength'] / df['time']  # in meters/second
df = df[columns + ['speed']]
print(df)

yields

                      ele   latitude   longitude  temp     speed
time                                                            
2017-07-08 22:05:45  73.0 -33.816178  150.871032     7       NaN
2017-07-08 22:05:46  73.0 -33.816150  150.871035     7  3.119892
2017-07-08 22:05:47  73.0 -33.816117  150.871044     7  3.712201
2017-07-08 22:05:48  73.0 -33.816082  150.871054     7  3.940673
2017-07-08 22:05:49  73.0 -33.816043  150.871069     7  4.433590

If you comment out

df = df[columns + ['speed']]

and re-run the script, you’ll see all the intermediate calculations. You’ll
notice that df['distance'] is quite close to df['arclength']. Since the
points are not very far apart on the surface of the Earth, the chord length is a
good approximation of the arclength. So for the data you posted

df['speed'] = df['distance'] / df['time'] 

would have worked just as well. However, the arclength calculation is a bit
more robust since it will give a more accurate value if the points are far
apart.

Answered By: unutbu
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.