Selenium search by tag name option
Question:
I’m trying to get all the data from a website called Correios. On this website I need to handle some dropdowns, and I’m having an issue with them:
my code returns a list with a bunch of empty strings.
from selenium import webdriver
chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()
dropdownEstados = driver.find_elements_by_xpath("""//*[@id="estadoAgencia"]""")
optEstados = driver.find_elements_by_tag_name("option")
for valores in optEstados:
    print(valores.text.encode())
And what I get from that is:
b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
How can I remove the empty b'' entries?
Answers:
If I understand it right, you want to locate all these options:
Try this XPath expression to locate the dropdown elements:
//*[@id="estadoAgencia"]/option
The code sample:
from selenium import webdriver
chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()
dropdownEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']")
# Find only the option elements inside the estadoAgencia dropdown
optEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']/option")
for valores in optEstados:
    print(valores.text.encode())
With this XPath expression you get only the options of this dropdown, so the extra empty strings disappear. The one empty string that remains belongs to the dropdown itself. Output:
b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
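The one empty entry left in this output can be dropped in plain Python once the texts have been collected. A minimal sketch, assuming the option texts were already read into a list of strings: `textos` is sample data (a shortened copy of the output above, plus one whitespace-only entry added for illustration) and `sem_vazios` is a hypothetical name.

```python
# Sample option texts, as element.text would return them (str, not bytes).
# The first entry is the empty option; the "  " entry is made up for illustration.
textos = ["", "ACRE", "ALAGOAS", "AMAPÁ", "  ", "TOCANTINS"]

# Keep only entries that still contain something after stripping whitespace.
sem_vazios = [t.strip() for t in textos if t.strip()]

# Printing the str directly avoids the b'...' form that .encode() produces.
for t in sem_vazios:
    print(t)
```

With Selenium, the list would presumably be built as `[o.text for o in optEstados]` before filtering.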
Note: the first element is an empty string because of this:
There is a small change required in your code:
dropdownEstados = driver.find_element_by_xpath("""//*[@id="estadoAgencia"]""")
optEstados = dropdownEstados.find_elements_by_tag_name("option")
for valores in optEstados:
    print(valores.text.encode())
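Scoping the lookup works because the search for `option` is limited to the `estadoAgencia` dropdown instead of the whole page. The sketch below illustrates the difference without a browser, using Python's standard `xml.etree.ElementTree` on a small made-up page. The markup and the second dropdown's id (`municipioAgencia`) are assumptions for illustration, not the real Correios HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with two dropdowns; only the first one is of interest.
PAGE = """
<form>
  <select id="estadoAgencia">
    <option value=""></option>
    <option value="AC">ACRE</option>
    <option value="AL">ALAGOAS</option>
  </select>
  <select id="municipioAgencia">
    <option value=""></option>
    <option value=""></option>
  </select>
</form>
"""

root = ET.fromstring(PAGE.strip())

# Page-wide search: every <option> in the document, comparable to
# driver.find_elements_by_tag_name("option") in the question.
all_options = [o.text or "" for o in root.iter("option")]

# Scoped search: only the <option> children of the chosen <select>.
scoped = [
    o.text or ""
    for o in root.findall(".//select[@id='estadoAgencia']/option")
]

print(all_options)  # ['', 'ACRE', 'ALAGOAS', '', '']
print(scoped)       # ['', 'ACRE', 'ALAGOAS']
```

The page-wide list picks up the empty options of the other dropdown, while the scoped list keeps only the single empty option that belongs to `estadoAgencia`.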
To retrieve the text from all the <option> elements of the dropdown with id estadoAgencia: since it is a <select> tag, it is easier and more efficient to use the methods associated with the <select> tag. You can use the following solution:
- Code Block:
from selenium.webdriver.support.ui import Select
estado_select = Select(driver.find_element_by_id('estadoAgencia'))
for opt in estado_select.options:
    print(opt.get_attribute('innerHTML'))
- Console Output:
ACRE
ALAGOAS
AMAPÁ
AMAZONAS
BAHIA
CEARÁ
DISTRITO FEDERAL
ESPÍRITO SANTO
GOIÁS
MARANHÃO
MINAS GERAIS
MATO GROSSO DO SUL
MATO GROSSO
PARÁ
PARAÍBA
PERNAMBUCO
PIAUÍ
PARANÁ
RIO DE JANEIRO
RIO GRANDE DO NORTE
RONDÔNIA
RORAIMA
RIO GRANDE DO SUL
SANTA CATARINA
SERGIPE
SÃO PAULO
TOCANTINS
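The snippets in this thread use the `find_element_by_*` helpers, which newer Selenium 4 releases no longer provide. A minimal sketch of the same scoped lookup written with the `find_element(By..., ...)` form follows. `option_texts` is a hypothetical helper name, and the `try/except` around the import is an assumption added only so the helper can be tried without Selenium installed.

```python
try:
    from selenium.webdriver.common.by import By
    BY_ID, BY_TAG = By.ID, By.TAG_NAME
except ImportError:
    # Fallback for environments without Selenium; these are the string
    # values that By.ID and By.TAG_NAME stand for.
    BY_ID, BY_TAG = "id", "tag name"


def option_texts(driver, select_id):
    """Return the non-empty option texts of the <select> with the given id."""
    # Locate the dropdown first, then search for options only inside it.
    dropdown = driver.find_element(BY_ID, select_id)
    options = dropdown.find_elements(BY_TAG, "option")
    # Skip options whose visible text is empty or whitespace-only.
    return [o.text.strip() for o in options if o.text.strip()]
```

With a real driver this would be called as `option_texts(driver, "estadoAgencia")` after `driver.get(...)` has loaded the page.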