How do I create a list from a webpage?


I am trying to create a list of words from the text of a website. I would then like to pick a random word from this list using the random module. I hope this makes sense.

import random as r
from bs4 import BeautifulSoup
import requests as rq

url = ''
page = rq.get(url)
soup = [BeautifulSoup(page.text, 'html.parser')]


I tried this, but I get the full list. I presume this is because the website I am scraping does not use breaks or any other separators, so I am unsure how to specify what to extract.

Asked By: Nyx



There is no need for BeautifulSoup in this context; simply split() the text from the response into a list.


import random as r
import requests as rq

url = ''
word_list = rq.get(url).text.split()
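
To pick a random word from the resulting list, random.choice() should do the job. Here is a minimal sketch; since the URL is not given, sample_text is a made-up string standing in for rq.get(url).text:

```python
import random as r

# Hypothetical sample text standing in for rq.get(url).text
sample_text = "apple banana cherry\ndate elderberry"

# split() with no arguments splits on any run of whitespace,
# including newlines
word_list = sample_text.split()

# choice() returns one element of the list, picked at random
random_word = r.choice(word_list)
print(random_word)
```

With a real page, replace sample_text with the response text; the rest stays the same.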

If you really need to use BeautifulSoup, you could call get_text() and then split():

from bs4 import BeautifulSoup
word_list = BeautifulSoup(rq.get(url).text, 'html.parser').get_text('\n', strip=True).split()

Answered By: HedgeHog

If you use [BeautifulSoup(page.text, 'html.parser')], the entire document becomes a single element of the list. Instead, convert it to a string and then use the string split() method to turn it into a list.

import random as r
from bs4 import BeautifulSoup
import requests as rq

url = ''
page = rq.get(url)
soup = str(BeautifulSoup(page.text, 'html.parser'))
soup = soup.split('\n')

Note: I kept the same approach you used so that you can see the difference.
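
The difference between wrapping something in square brackets and splitting it can be seen with a plain string. This is only an illustrative sketch; text is a made-up value standing in for the page content:

```python
# Hypothetical sample text standing in for the page content
text = "alpha\nbeta\ngamma"

# Square brackets build a list with ONE element: the whole text
wrapped = [text]

# split('\n') builds a list with one element per line
lines = text.split('\n')

print(len(wrapped))
print(len(lines))
```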

Answered By: Suramuthu R