How to get JSON data between specific date ranges in Python?

Question:

I have this JSON data set consisting of meteorite landings and I want to find the meteorite landing with the highest mass in the last 10 years.

JSON sample:

[
{
"name": "Aachen",
"id": "1",
"nametype": "Valid",
"recclass": "L5",
"mass": "21",
"fall": "Fell",
"year": "1880-01-01T00:00:00.000",
"reclat": "50.775000",
"reclong": "6.083330",
"geolocation": {
"type": "Point",
"coordinates": [
6.08333,
50.775
]
}
},
{
"name": "Aarhus",
"id": "2",
"nametype": "Valid",
"recclass": "H6",
"mass": "720",
"fall": "Fell",
"year": "1951-01-01T00:00:00.000",
"reclat": "56.183330",
"reclong": "10.233330",
"geolocation": {
"type": "Point",
"coordinates": [
10.23333,
56.18333
]
}
},

My code is below. I have managed to compute the date range for the last 10 years, but I can't work out how to find the maximum mass within that range.

import json, datetime
from dateutil.relativedelta import relativedelta
f = open('nasadata.json')
data = json.load(f)
start_date = datetime.datetime.now() - relativedelta(years = 10)
end_date = datetime.datetime.now()
for item in data:
    year = item.get('year')
    if year is not None:
        year = datetime.datetime.strptime(year, "%Y-%m-%dT%H:%M:%S.%f")
    name = item.get('name')
    mass = item.get('mass')
           
    if year is not None and year >= start_date and year <= end_date:
        max_mass = max(mass)
        print(max_mass)

How do I iterate through the JSON to get the name and mass of the heaviest meteorite within the last 10 years?

Asked By: Arijeet Acharyya


Answers:

I would make a slight optimisation by keeping the dates as strings in the "%Y-%m-%dT%H:%M:%S.%f" format, since strings in that format sort alphabetically in chronological order. You also don’t need to compare against now, as presumably no future meteorite falls are recorded. Your code can then be:

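import datetime
from dateutil.relativedelta import relativedelta

# cutoff as a string in the same format as the 'year' field, so plain string comparison works
start_date = datetime.datetime.now() - relativedelta(years = 10)
start_date = start_date.strftime("%Y-%m-%dT%H:%M:%S.%f")

# records without a 'year' default to '0', which sorts before any real date
recent = [m for m in data if m.get('year', '0') >= start_date]
# max() with default=None avoids an IndexError when nothing falls in the range
heaviest = max(recent, key=lambda m: float(m.get('mass', 0)), default=None)

print(heaviest)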
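
To try the idea without the full data file, here is a minimal self-contained sketch. The function name heaviest_since, the third record, and the fixed cutoff strings are assumptions made for illustration; a fixed cutoff is used only so the result is reproducible.

```python
def heaviest_since(records, cutoff):
    """Return the heaviest record whose 'year' string is >= cutoff, or None."""
    # Timestamps in this zero-padded layout compare chronologically as strings;
    # records without a 'year' default to '0' and are filtered out.
    recent = [r for r in records if r.get('year', '0') >= cutoff]
    # 'mass' is stored as a string, so convert it before comparing.
    return max(recent, key=lambda r: float(r.get('mass', 0)), default=None)


records = [
    {"name": "Aachen", "mass": "21", "year": "1880-01-01T00:00:00.000"},
    {"name": "Aarhus", "mass": "720", "year": "1951-01-01T00:00:00.000"},
    {"name": "Hypothetical", "mass": "5"},  # made-up record with no year
]

result = heaviest_since(records, "1900-01-01T00:00:00.000")
print(result["name"], result["mass"])  # Aarhus 720

empty = heaviest_since(records, "2000-01-01T00:00:00.000")
print(empty)  # None
```

With a cutoff computed from the current date, as in the answer, the two sample records from the question both fall outside the range, so the function would return None for them.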
Answered By: Nick

Modify your loop a little: in an if-clause, check whether the record's year falls within the last 10 years, and keep the meteorite's name if its mass is the greatest seen so far.

Finally, outside the loop, check whether any valid data was found. For example, the sample input has none (all years are more than 10 years old).

from datetime import datetime
from dateutil.relativedelta import relativedelta
now = datetime.now()
max_mass = 0
max_mass_name = None
start_date = (now - relativedelta(years=10)).strftime("%Y-%m-%dT%H:%M:%S.%f")
for item in data:
    year = item.get('year', '0')
    mass = float(item.get('mass', 0))
    # if the year is within the last 10 years and
    # the mass is the greatest so far (among those in the last 10 years),
    # then save the mass and the name
    if year >= start_date and mass > max_mass:
        max_mass = mass
        max_mass_name = item.get('name')
# check whether any valid data was found
max_mass_name = max_mass_name if max_mass_name is not None else 'No such data'

You can keep the year in text format, since strings in that format sort alphabetically in chronological order.
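
A quick way to convince yourself of this is to sort a few timestamps both as text and as parsed dates and compare the results. This is only a sketch: the third timestamp is made up, and the equivalence assumes every value is zero-padded in the same layout, as in the sample.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f"

stamps = [
    "1951-01-01T00:00:00.000",
    "1880-01-01T00:00:00.000",
    "2013-02-15T00:00:00.000",  # made-up value in the same format
]

# Sort once as plain text and once as parsed datetime objects.
by_text = sorted(stamps)
by_date = sorted(stamps, key=lambda s: datetime.strptime(s, FMT))

print(by_text == by_date)  # True
```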

Use the following code:

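import datetime

originalJson = [{...}]
today = datetime.datetime.now()
# note: replace() raises ValueError when today is February 29
tenYearsAgo = today.replace(year=today.year-10)

# keep only the records dated within the last ten years
filteredByLastTenYears = list(filter(lambda d: datetime.datetime.strptime(d['year'], '%Y-%m-%dT%H:%M:%S.%f') > tenYearsAgo, originalJson))
# mass is stored as a string, so convert it to float for a numeric sort
sortedByMass = sorted(filteredByLastTenYears, key=lambda d: float(d['mass']),reverse=True)

# heaviest record, or None if nothing falls within the range
highestMass = sortedByMass[0] if sortedByMass else None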
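
The mass values in the JSON are strings, so it matters whether they are compared as text or as numbers. A small sketch of the difference, using made-up values in the same string form as the data:

```python
masses = ["21", "720", "100000"]  # made-up values, stored as strings

# Text comparison goes character by character, so "720" beats "100000".
as_text = sorted(masses, reverse=True)
# Converting with float() gives the numeric order.
as_numbers = sorted(masses, key=float, reverse=True)

print(as_text)     # ['720', '21', '100000']
print(as_numbers)  # ['100000', '720', '21']
```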
Answered By: Pav