How to remove HTML tags to get text
Question:
I have text like this:
text =
<option value="tfa_4472" id="tfa_4472" class="">helo 1</option>
<option value="tfa_4473" id="tfa_4473" class="">helo 2</option>
<option value="tfa_4474" id="tfa_4474" class="">helo 3</option>
<option value="tfa_4475" id="tfa_4475" class="">helo 4</option>
<option value="tfa_4476" id="tfa_4476" class="">helo 5</option>
I want to get a result like this:
my_list = get_text(text)
helo 1
helo 2
helo 3
helo 4
helo 5
Thank you
Answers:
In JavaScript, if your text arrives as a string, you can strip the tags with a regular expression. Searching for "strip HTML tags" turns up one like this from css-tricks, which you can wrap in a function with the name you need:
const get_text = (text) => {
  // Replace anything that looks like <...> with an empty string
  return text.replace(/(<([^>]+)>)/gi, "");
};
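Note that the regex gives back one string with the tags removed, not a list, so the result still has to be split into lines. Here is a minimal sketch of that idea, written in Python to match the BeautifulSoup answer. The pattern `<[^>]+>` is a simplified form of the regex above, and the two-option sample is shortened from the question. Like any regex approach it will likely trip over a `>` inside an attribute value or a comment.

```python
import re

# Simplified tag pattern: "<", then anything that is not ">", then ">"
TAG_RE = re.compile(r"<[^>]+>")

def get_text(text):
    """Strip tags with a regex, then return the non-empty lines as a list."""
    stripped = TAG_RE.sub("", text)
    return [line.strip() for line in stripped.splitlines() if line.strip()]

text = """<option value="tfa_4472" id="tfa_4472" class="">helo 1</option>
<option value="tfa_4473" id="tfa_4473" class="">helo 2</option>"""

print(get_text(text))  # ['helo 1', 'helo 2']
```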
In JavaScript, if the options are already in the page, you can select the option tags with querySelectorAll, loop over the nodes, and push each node's innerText onto myList:
const myList = [];
const nodes = document.querySelectorAll('option');
nodes.forEach((node) => {
  // push() appends to the array; += would coerce it to a string
  myList.push(node.innerText);
});
console.log(myList);
Python:
from bs4 import BeautifulSoup
myhtml = """<option value="tfa_4472" id="tfa_4472" class="">helo 1</option>
<option value="tfa_4473" id="tfa_4473" class="">helo 2</option>
<option value="tfa_4474" id="tfa_4474" class="">helo 3</option>
<option value="tfa_4475" id="tfa_4475" class="">helo 4</option>
<option value="tfa_4476" id="tfa_4476" class="">helo 5</option>"""
soup = BeautifulSoup(myhtml, 'html.parser')
my_text = []
for text_tag in soup.find_all("option"):
    # get_text() returns the text inside the tag
    my_text.append(text_tag.get_text())
my_text
['helo 1', 'helo 2', 'helo 3', 'helo 4', 'helo 5']
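If installing BeautifulSoup is not an option, the standard library's `html.parser` module can do roughly the same job. This is only a sketch: the class name `OptionTextParser` and the helper `get_text` are made up for illustration, and it assumes the input contains plain, non-nested `<option>` tags like the ones in the question.

```python
from html.parser import HTMLParser

class OptionTextParser(HTMLParser):
    """Collects the text found inside each <option> element."""

    def __init__(self):
        super().__init__()
        self.in_option = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self.in_option = True
            self.texts.append("")  # start a new entry for this option

    def handle_endtag(self, tag):
        if tag == "option":
            self.in_option = False

    def handle_data(self, data):
        # Only keep text that appears between <option> and </option>
        if self.in_option:
            self.texts[-1] += data

def get_text(html):
    parser = OptionTextParser()
    parser.feed(html)
    parser.close()
    return [t.strip() for t in parser.texts]

sample = """<option value="tfa_4472" id="tfa_4472" class="">helo 1</option>
<option value="tfa_4473" id="tfa_4473" class="">helo 2</option>"""

print(get_text(sample))  # ['helo 1', 'helo 2']
```

Unlike a regex, this goes through a real parser, so text outside the `<option>` tags (such as the newlines between them) is ignored rather than left behind.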