Unshorten an Amazon EU link using Python
Question:
I’m trying to unshorten an Amazon link using Python, given the pattern "https://amzn.eu/XXXX".
It seems the URL is not recognized.
If the URL is in the format "https://amzn.to/XXXXX", it works.
The problem appears only with amzn.eu.
This is my code. Any suggestions?
import os, pathlib, re, requests, time, warnings
from requests.packages.urllib3.exceptions import InsecureRequestWarning

def formaturl(url):
    if not re.match('(?:http|ftp|https)://', url):
        return 'http://{}'.format(url)
    return url

def unshort_link(url):
    url = formaturl(url)
    warnings.simplefilter('ignore', InsecureRequestWarning)
    session = requests.Session()
    resp = session.head(url, allow_redirects=True, verify=False)
    unshort_url = resp.url
    return unshort_url

not_working_link = 'https://amzn.eu/d/fb1IYWl'
#working_link = 'https://amzn.to/3A0milQ'
unshorted_url = unshort_link(not_working_link)
print(unshorted_url)
Answers:
The HEAD request doesn’t work on this link; it returns a 404.
However, a GET works as expected:
resp = requests.get('https://amzn.eu/d/fb1IYWl')
resp.url
# 'https://www.amazon.it/dp/B00HVFQF3I/ref=cm_sw_r_apa_i_9GRWP18TK8S32ZPVJVM7_0?_encoding=UTF8&psc=1'
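If you want a single helper that handles both amzn.to and amzn.eu, one possible approach is to try HEAD first and fall back to GET only when HEAD returns an error status. The following is a minimal sketch, not a definitive implementation: the name resolve_final_url is hypothetical, and session is assumed to be a requests.Session (or any object exposing the same head/get methods).

```python
def resolve_final_url(session, url, timeout=10.0):
    """Return the URL reached after following all redirects.

    `session` is assumed to be a requests.Session. HEAD is tried first
    because it transfers no body; if the server answers with an error
    status (for example a 404), the request is retried with GET.
    """
    resp = session.head(url, allow_redirects=True, timeout=timeout)
    if resp.status_code >= 400:
        # stream=True defers downloading the body; only resp.url is needed.
        resp = session.get(url, allow_redirects=True, timeout=timeout, stream=True)
        resp.close()
    return resp.url
```

It could be called as resolve_final_url(requests.Session(), 'https://amzn.eu/d/fb1IYWl'). The sketch leaves certificate verification enabled rather than passing verify=False.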