Python requests: stopping redirects does not work
Question:
I want to access the content of a web page, but I’m being redirected to another page even though I’ve set allow_redirects to False in my requests call. Here’s an example code snippet:
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': user_agent} # assume I inserted my user agent here
URL = "https://stackoverflow.com/questions/73909641/program-is-about-space-utilisation-i-am-getting-error-72g-value-too-great-for"
html_content = requests.get(URL, allow_redirects=False, headers=headers)
soup = BeautifulSoup(html_content.content, "html.parser")
When I run this code, I don’t get any content from the web page. However, if I set allow_redirects to True, I’m redirected to this URL: Convert between byte count and "human-readable" string.
Answers:
You’d have to log in to get to the original SO question because
anonymous users get automatically redirected to the duplicate target when trying to access questions closed as duplicates with no answers
The relevant Meta post is itself a duplicate of this question.
You can replicate this by switching to Private Mode in your browser and then opening this link:
You should get redirected to Convert between byte count and "human-readable" string
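You can also observe this redirect directly with requests. Here is a minimal sketch: since hitting the live site from an example is unreliable, a local stand-in server plays the role of Stack Overflow and issues the 302, and the `Location` target is a hypothetical placeholder. The point is what `allow_redirects=False` actually gives you: the redirect response itself, with (almost) no body to parse.

```python
# A minimal sketch of what allow_redirects=False returns.
# The local server is a stand-in for Stack Overflow's duplicate-redirect
# behaviour; the Location URL below is an illustrative placeholder.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Mimic redirecting an anonymous visitor to the duplicate target
        self.send_response(302)
        self.send_header("Location", "https://example.com/duplicate-target")
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/some-question"
resp = requests.get(url, allow_redirects=False)

print(resp.status_code)           # the 302 itself, not the target page
print(resp.headers["Location"])   # where the redirect points
print(len(resp.content))          # redirect responses carry little or no body
server.shutdown()
```

So "I don't get any content" is expected: with `allow_redirects=False` the response is the redirect, and the interesting information is in `status_code` and the `Location` header rather than in the body.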
EDIT:
You can turn off this behaviour and reach the original post with requests
by appending the following to the URL:
?noredirect=1
Here’s an example:
import requests
from bs4 import BeautifulSoup
headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
}
URL = "https://stackoverflow.com/questions/73909641/program-is-about-space-utilisation-i-am-getting-error-72g-value-too-great-for?noredirect=1"
html_content = requests.get(URL, headers=headers)
title = BeautifulSoup(html_content.content, "html.parser").select_one("title")
print(title)
Output:
<title>linux - Program is about space utilisation. i am getting error : 72G value too great for base (error token is "72") - Stack Overflow</title>
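Instead of hand-editing the URL string, you can let requests build the query string for you via the `params` argument, which is equivalent to appending `?noredirect=1`. A small sketch using a prepared request (no network call needed) to show the resulting URL:

```python
# Appending ?noredirect=1 via the params argument instead of string editing.
import requests

URL = "https://stackoverflow.com/questions/73909641/program-is-about-space-utilisation-i-am-getting-error-72g-value-too-great-for"

# Preparing the request shows the final URL without sending anything.
req = requests.Request("GET", URL, params={"noredirect": 1}).prepare()
print(req.url)  # ...?noredirect=1
```

In a real call you would pass `params={"noredirect": 1}` straight to `requests.get`, which keeps the base URL readable and handles the query-string encoding for you.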