Python: How many times each meta keyword is used in a string
Question:
I am trying to print the meta keywords of a web page from its URL, then print how many times each keyword is used inside the article. I wrote the code below to extract the meta keywords from the URL first:
import requests
from bs4 import BeautifulSoup
from readability import Document  # assuming the readability-lxml package

res = requests.get(
    'https://www.wpbeginner.com/showcase/24-must-have-wordpress-plugins-for-business-websites/',
    headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"}
)
response = res
doc = Document(response.content)
#doc.title()
#print(doc.summary())
# Article body only (readability summary), used later for counting
soup = BeautifulSoup(doc.summary(), features='lxml')
# Full page, needed to reach the <meta> tags in <head>
soup1 = BeautifulSoup(res.text, 'html5lib')
text = soup.get_text()
meta_keywords = [item['content'] for item in soup1.select('[name=Keywords][content], [name=keywords][content]')]
for mword in meta_keywords:
    print(mword)
The code above prints the meta keywords as one comma-separated value, like:
best wordpress plugins,wordpress business websites,wordpress plugins for business websites,wordpress tools for businesses
Now I am trying to count how many times each keyword is used in the whole body text of the article. I tried the code below, but it is not working:
for mword in meta_keywords:
    x = text.count(mword)
    print(mword, x)
It prints the result below, with a 0 (zero) at the end. I think it is treating all those comma-separated keywords as one string. I don't know how to solve this.
best wordpress plugins,wordpress business websites,wordpress plugins for business websites,wordpress tools for businesses 0
Answers:
Code:-
# Each item of meta_keywords is one comma-separated string, like this
mword = ["best wordpress plugins,wordpress business websites,wordpress plugins for business websites,wordpress tools for businesses"]
# Whole content
text = "We are often asked by readers for the best wordpress plugins suggestions for SEO, social media, backups, speed, etc."
# Split the comma-separated string into a list of individual keywords
lis = mword[0].split(',')
print(lis)
# Count each keyword separately in the text
for word in lis:
    print("Frequency of [" + word + "] : " + str(text.count(word)))
Output:-
['best wordpress plugins', 'wordpress business websites', 'wordpress plugins for business websites', 'wordpress tools for businesses']
Frequency of [best wordpress plugins] : 1
Frequency of [wordpress business websites] : 0
Frequency of [wordpress plugins for business websites] : 0
Frequency of [wordpress tools for businesses] : 0
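Note that str.count is case-sensitive and only matches the exact spacing given, so a keyword such as "best wordpress plugins" would not be found in "Best WordPress Plugins", or where the page text breaks the phrase across lines. A minimal sketch of a more forgiving count is below. The helper names split_keywords and count_keywords are hypothetical (they are not part of the question's code), and matching case-insensitively on word boundaries is an assumption about what counts as "used" here, not something the question specifies.

```python
import re


def split_keywords(meta_contents):
    """Flatten comma-separated meta 'content' strings into unique keywords."""
    keywords = []
    for content in meta_contents:
        for part in content.split(","):
            part = part.strip()
            # Skip empty pieces (e.g. from ",,") and repeated keywords
            if part and part not in keywords:
                keywords.append(part)
    return keywords


def count_keywords(meta_contents, text):
    """Return {keyword: count}, ignoring case and extra whitespace in text."""
    # Collapse newlines and runs of spaces so multi-word phrases still match
    normalized = " ".join(text.split())
    counts = {}
    for keyword in split_keywords(meta_contents):
        pattern = r"\b" + re.escape(keyword) + r"\b"
        counts[keyword] = len(re.findall(pattern, normalized, flags=re.IGNORECASE))
    return counts


# Sample data in the same shape as meta_keywords and text from the question
meta_keywords = ["best wordpress plugins,wordpress business websites,"
                 "wordpress plugins for business websites,wordpress tools for businesses"]
text = ("We are often asked by readers for the Best WordPress Plugins suggestions "
        "for SEO, social media, backups, speed, etc.")

for keyword, n in count_keywords(meta_keywords, text).items():
    print(keyword, n)
```

In the question's script, the same call would take the meta_keywords list and the text from soup.get_text() directly in place of the sample values.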