BeautifulSoup find_all limited to 50 results?
Question:
I’m trying to get the results from a page using BeautifulSoup:
import requests
from bs4 import BeautifulSoup

req_url = 'http://www.xscores.com/soccer/livescores/25-02'
request = requests.get(req_url)
content = request.content
soup = BeautifulSoup(content, "html.parser")
scores = soup.find_all('tr', {'style': 'height:18px;'}, limit=None)
print(len(scores))
>50
I read the earlier question "Beautiful Soup findAll doesn't find them all"
and tried html.parser, lxml and html5lib, but none of them returns more than 50 results. Any suggestions?
Answers:
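Try a CSS selector query. The *= operator matches every row whose style attribute contains the given substring, not only rows where it is the whole value: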
scores = soup.select('#scoretable > tr[style*="height:18px;"]')
print(len(scores))
>613
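Try this: find the tbody with id scoretable and collect every row inside it.
import requests
from bs4 import BeautifulSoup

req_url = 'http://www.xscores.com/soccer/livescores/25-02'
request = requests.get(req_url)
html = request.text
soup = BeautifulSoup(html, "html5lib")
scoretable = soup.find('tbody', id='scoretable')
scores = scoretable.find_all('tr')
print(len(scores))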
>617
This line only finds rows whose style attribute is exactly 'height:18px;':
scores = soup.find_all('tr', {'style': 'height:18px;'}, limit=None)
If you look at the page source and search for "height:18px;" (including the quotes)
you'll see 50 matches. But if you search for height:18px;
without the quotes you'll see 613 matches, because most rows carry other style declarations alongside it.
You need to edit that line so it finds rows whose style contains height:18px; (possibly along with other values).
According to the documentation, you can pass a regex as the style argument, maybe something like this:
import re
scores = soup.find_all('tr', style=re.compile('height:18px'), limit=None)
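To see the difference between the exact match and the two substring approaches without hitting the live site, here is a minimal sketch run against a small made-up table. SAMPLE_HTML and its rows are hypothetical stand-ins for the real page, not its actual markup; only the matching behaviour is the point.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for the live page: three rows whose style contains
# "height:18px;", but only the first has nothing else in the attribute.
SAMPLE_HTML = """
<table>
  <tbody id="scoretable">
    <tr style="height:18px;"><td>A</td></tr>
    <tr style="height:18px; background:#eee;"><td>B</td></tr>
    <tr style="display:none; height:18px;"><td>C</td></tr>
    <tr style="height:30px;"><td>header</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A plain string is compared against the whole attribute value.
exact = soup.find_all("tr", {"style": "height:18px;"})

# A compiled regex is applied with search(), so it matches anywhere in the value.
by_regex = soup.find_all("tr", style=re.compile(r"height:18px"))

# The CSS *= operator is a substring match on the attribute value.
by_css = soup.select('#scoretable > tr[style*="height:18px"]')

print(len(exact), len(by_regex), len(by_css))
```

On this sample the exact match finds 1 row while the regex and the CSS selector both find 3, which mirrors the 50 versus 613 gap described above.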