How do I use rsplit when adding a Requests response to a Python dictionary?
Question:
I am currently writing a script to scrape data from an API into a Python dictionary and then export the result into a JSON file. I am trying to get the file extension from a URL in the response by splitting it with .rsplit('.', 1)[-1].
The only problem is that some keys have None as their value, and this throws AttributeError: 'NoneType' object has no attribute 'rsplit'. Here is my code snippet:
d = requests.get(dataset_url)
output = d.json()
output_dict = {
    'data_files': {
        # Format to get only extension
        'format': output.get('connectionParameters', {}).get('url').rsplit('.', 1)[-1],
        'url': output.get('connectionParameters', {}).get('url'),
    },
}
An example of JSON response with the required key is as follows:
"connectionParameters": {
    "csv_escape_char": "\\",
    "protocol": "DwC",
    "automation": false,
    "strip": false,
    "csv_eol": "\n",
    "csv_text_enclosure": "\"",
    "csv_delimiter": "\t",
    "incremental": false,
    "url": "https://registry.nbnatlas.org/upload/1564481725489/London_churchyards_dwc.txt",
    "termsForUniqueKey": [
        "occurrenceID"
    ]
},
Any way to tackle this?
Answers:
Try:
url = output.get("connectionParameters", {}).get("url") or "-"
f = url.rsplit(".", 1)[-1]
output_dict = {
    "data_files": {
        "format": f,
        "url": url,
    },
}
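The or "-" replaces a None (or empty) URL with the placeholder "-" before rsplit is called, so the call always runs on a string. If the "url" value is None, output_dict will be:
{'data_files': {'format': '-', 'url': '-'}}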
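If you would rather keep None in the result (json.dump writes it as null) instead of a "-" placeholder, here is a minimal sketch of another way to do it. The helper names file_extension and build_output are made up for illustration and are not part of Requests or of the original code. It also assumes the extension should be taken from the URL path only, so a query string or a dot in the host name is not mistaken for an extension.

```python
import posixpath
from urllib.parse import urlsplit


def file_extension(url):
    """Return the extension of the URL's path without the dot, or None."""
    if not url:  # covers None and the empty string
        return None
    # Look only at the path so query strings and the host name are ignored.
    path = urlsplit(url).path
    ext = posixpath.splitext(path)[1]  # ".txt", or "" when there is no extension
    return ext[1:] or None


def build_output(output):
    """Build the dictionary from the decoded JSON response (hypothetical helper)."""
    # "or {}" also covers a "connectionParameters" key that is present but null.
    params = output.get("connectionParameters") or {}
    url = params.get("url")
    return {"data_files": {"format": file_extension(url), "url": url}}
```

You would call it as build_output(d.json()). Note that output.get('connectionParameters', {}) only falls back to {} when the key is absent; if the API returns "connectionParameters": null, the chained .get('url') raises the same AttributeError, which is why the sketch uses or {} instead.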