Inputting data from a webscraped page using Python
Question:
I have looked through Stack Overflow and cannot find the answer I am looking for, or cannot tell whether the answers in other posts apply to my case.
I would like to load a webpage that has an input box, enter data into that box, and get the returned result.
How can I do this with Python? I believe someone built a similar scraper using JSON or Node, but I would like to use Python if that is doable.
Right now I have the following code:
from bs4 import BeautifulSoup
import requests
source = requests.get('https://somewebsitehere.org').text
soup = BeautifulSoup(source, 'lxml')
receipt_box = soup.find('div', class_='filed-box')
print(receipt_box)
which gives me this:
<div class="filed-box">
<input class="form-control textbox initial-focus" id="receipt_number" maxlength="13" name="appReceiptNum" type="text"/>
</div>
I think I need to target the input by its name, appReceiptNum, and then enter my receipt number into it.
I saw that Postpy2 might help with this, but I am not sure.
Any help is appreciated.
EDIT: Using Selenium, this is my idea for accessing the page and sending the desired info:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
driver = webdriver.Chrome('PATH to my chromedriver.exe')
driver.get("https://egov.uscis.gov/casestatus/landing.do")
elem = driver.find_element(By.NAME, "appReceiptNum")
elem.send_keys("case number")
How does this look? I haven’t gotten to sending the information yet.
Answers:
This is one way of entering the case number into that page and clicking the submit button (titled CHECK STATUS) using Selenium:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--window-size=1280,720")
webdriver_service = Service("chromedriver/chromedriver")  # path to where you saved the chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
url = 'https://egov.uscis.gov/casestatus/landing.do'
browser.get(url)
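# Wait up to 10 seconds for the receipt-number input to be clickable, then type the case number
caseno_input = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[id='receipt_number']")))
caseno_input.send_keys('WAC1234567890')
# Wait for the submit button (title "CHECK STATUS") to be clickable, then click it
WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[title='CHECK STATUS']"))).click()
print('clicked the button')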
The setup above uses Chrome/chromedriver on Linux, but you can adapt it to your own setup; just keep the imports and the code that follows the definition of the browser/driver.
Selenium documentation can be found at: https://www.selenium.dev/documentation/
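If the form turns out to be a plain HTML form (no JavaScript and no hidden tokens involved, which is an assumption I have not verified for this site), a browser may not be needed at all. The sketch below only shows the first half of that idea: it reads the input names out of the HTML snippet from the question and builds the form payload. It uses the standard library so it runs without extra installs; InputCollector and build_payload are hypothetical helper names, and the URL the form submits to is not shown in the question, so no request is sent here.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class InputCollector(HTMLParser):
    """Collect name -> value for every named <input> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> are also routed here by HTMLParser.
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute start out empty.
            self.fields[name] = attributes.get("value") or ""


def build_payload(form_html, receipt_number):
    """Return the form fields with appReceiptNum set to receipt_number."""
    collector = InputCollector()
    collector.feed(form_html)
    payload = dict(collector.fields)
    if "appReceiptNum" not in payload:
        raise ValueError("no input named 'appReceiptNum' in the supplied HTML")
    payload["appReceiptNum"] = receipt_number
    return payload


# The HTML printed in the question, with the wrapped class attribute joined.
FORM_HTML = """
<div class="filed-box">
<input class="form-control textbox initial-focus" id="receipt_number"
       maxlength="13" name="appReceiptNum" type="text"/>
</div>
"""

payload = build_payload(FORM_HTML, "WAC1234567890")
encoded_body = urlencode(payload)
print(encoded_body)
```

From there, requests.post(form_action_url, data=payload) would send the form, where form_action_url stands for whatever the form's action attribute points to on the real page. If the response does not contain the result, the page most likely relies on JavaScript, and the Selenium approach above is the safer route.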