How do I extract a string with Beautiful Soup?

Question:

How do I use BeautifulSoup to get only the number string "611674069.14413534248" from this URL?

https://shopee.co.th/Kawasaki-%E0%B8%A3%E0%B8%AD%E0%B8%87%E0%B9%80%E0%B8%97%E0%B9%89%E0%B8%B2%E0%B8%81%E0%B8%B5%E0%B8%AC%E0%B8%B2%E0%B8%A5%E0%B9%8D%E0%B8%B2%E0%B8%A5%E0%B8%AD%E0%B8%87%E0%B8%A3%E0%B8%B0%E0%B8%9A%E0%B8%9A%E0%B8%9B%E0%B9%89%E0%B8%AD%E0%B8%87%E0%B8%81%E0%B8%B1%E0%B8%99%E0%B8%81%E0%B8%B2%E0%B8%A3%E0%B8%AA%E0%B8%B6%E0%B8%81%E0%B8%AB%E0%B8%A3%E0%B8%AD%E0%B9%81%E0%B8%9A%E0%B8%9A%E0%B9%80%E0%B8%95%E0%B9%87%E0%B8%A1%E0%B8%A3%E0%B8%B9%E0%B8%9B%E0%B9%81%E0%B8%9A%E0%B8%9A-i.611674069.14413534248?sp_atk=639c49c1-a9bf-438f-9f19-26bc401be713&xptdk=639c49c1-a9bf-438f-9f19-26bc401be713

<div class="col-xs-2-4 shopee-search-item-result__item" data-sqe="item">
<a data-sqe="link" href="/Apple-iPhone-11-by-Studio7-i.301786571.4161270915?sp_atk=cc5f3783-013f-4ed4-88cb-8675a212c9d3&amp;xptdk=cc5f3783-013f-4ed4-88cb-8675a212c9d3">
    <div class="tWpFe2"><div class="VTjd7p whIxGK">
        <div style="pointer-events: none;">
        <div class="yvbeD6 KUUypF"><img width="invalid-value" height="invalid-value" alt="Apple iPhone 11 by Studio7" class="_7DTxhh vc8g9F" style="object-fit: contain" src="https://cf.shopee.co.th/file/sg-11134201-22110-xpzrtoej6pjv3f_tn">
            <div class="aLgMTQ"><div class="YeGYFd LIaN-a" style="color: rgb(208, 1, 27);">
            <div class="_0aihnk"></div></div></div>
            <div class="GOgNtl"><div class="NTmuqd _3NQO+7 WVxeBE _2UunVx"><span class="percent">13%</span><span class="Th6IF+">ลด</span></div></div></div></div>
            <div class="KMyn8J"><div class="dpiR4u" data-sqe="name">
                <div class="FDn--+"><div class="ie3A+n bM+7UW Cve6sh">Apple iPhone 11 by Studio7</div></div>
                <div class="FD2XVZ"><div class="_1PWkR nt-medium nt-foot _3nkRL" style="color: rgb(246, 145, 19);"><svg class="_2DRZW _2xFcL" viewBox="-0.5 -0.5 4 16"><path d="M4 0h-3q-1 0 -1 1a1.2 1.5 0 0 1 0 3v0.333a1.2 1.5 0 0 1 0 3v0.333a1.2 1.5 0 0 1 0 3v0.333a1.2 1.5 0 0 1 0 3q0 1 1 1h3" stroke-width="1" transform="" stroke="currentColor" fill="#f69113"></path></svg>
                    <div class="_1FKkT _3Ao0A" style="color: white; background-color: rgb(246, 145, 19);">โค้ดลด ฿300</div><svg class="_2DRZW _2xFcL" viewBox="-0.5 -0.5 4 16"><path d="M4 0h-3q-1 0 -1 1a1.2 1.5 0 0 1 0 3v0.333a1.2 1.5 0 0 1 0 3v0.333a1.2 1.5 0 0 1 0 3v0.333a1.2 1.5 0 0 1 0 3q0 1 1 1h3" stroke-width="1" transform="rotate(180) translate(-3 -15)" stroke="currentColor" fill="#f69113"></path></svg></div></div></div><div class="hpDKMN">
                        <div class="vioxXd rVLWG6"><span class="recFju">฿</span><span class="ZEgDH9">17,000</span> - <span class="recFju">฿</span><span class="ZEgDH9">21,500</span></div></div>
                        <div class="ZnrnMl"><div class="RS7p+X" data-sqe="rating"><div class="shopee-rating-stars"><div class="shopee-rating-stars__stars">
                            <div class="shopee-rating-stars__star-wrapper">
                                <div class="shopee-rating-stars__lit" style="width: 100%;"><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__gold-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__dark-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><div class="shopee-rating-stars__star-wrapper"><div class="shopee-rating-stars__lit" style="width: 100%;"><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__gold-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__dark-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div>
                                <div class="shopee-rating-stars__star-wrapper">
                                    <div class="shopee-rating-stars__lit" style="width: 100%;"><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__gold-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__dark-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><div class="shopee-rating-stars__star-wrapper"><div class="shopee-rating-stars__lit" style="width: 100%;"><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__gold-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__dark-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div>
                                    <div class="shopee-rating-stars__star-wrapper">
                                    <div class="shopee-rating-stars__lit" style="width: 91.1562%;"><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__gold-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div><svg enable-background="new 0 0 15 15" viewBox="0 0 15 15" x="0" y="0" class="shopee-svg-icon shopee-rating-stars__dark-star icon-rating-solid"><polygon points="7.5 .8 9.7 5.4 14.5 5.9 10.7 9.1 11.8 14.2 7.5 11.6 3.2 14.2 4.3 9.1 .5 5.9 5.3 5.4" stroke-linecap="round" stroke-linejoin="round" stroke-miterlimit="10"></polygon></svg></div></div></div></div>
                                    <div class="r6HknA uEPGHT">ขายแล้ว 11.6พัน ชิ้น</div></div>
                                    <div class="zGGwiV">จังหวัดสมุทรปราการ</div></div>
                                    <div class="shopee-item-card__hover-footer _6o9eaa">ค้นหาสินค้าที่คล้ายกัน</div></div></div></a></div>
get_data = data.find_all('div',class_="col-xs-2-4 shopee-search-item-result__item")

for area in get_data:
    print('process data'+str(i))
    name = area.find('div',class_="ie3A+n bM+7UW Cve6sh").get_text()
    product_images = area.find('img')['src']
    price = area.find('span',class_="ZEgDH9").get_text()
    link = base_url + area.find('a')['href']
    sold = area.find('div',class_="r6HknA uEPGHT")
    realurl = area.find('a', text='-i.')


**For realurl, I am using find with '-i.' to get "611674069.14413534248" from the URL, but it doesn't work.**

[pic](https://i.stack.imgur.com/gYWk5.png)

Asked By: Rinukz


Answers:

Assuming you are using Selenium (Shopee is a dynamic website), here is a complete minimal example that reads a few fields from each product, including the ID portion of the URL:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")

webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)
wait = WebDriverWait(driver, 5)
url='https://shopee.co.th/search?keyword=kawasaki'
driver.get(url)
products = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//div[@data-sqe="item"]/a[@data-sqe="link"]')))
for p in products:
    name = p.find_element(By.XPATH, './/div[@data-sqe="name"]').text.strip()
    some_id = p.get_attribute('href').split('?sp_atk=')[0].split('-i.')[1]
    print(name, some_id)

The Selenium setup shown is for Linux with chromedriver and Chrome. Adapt it to your own setup; just keep the imports and the code that follows the driver definition.

Since you already load the page with Selenium, there is no need to parse it again with BeautifulSoup; just use Selenium locators.
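
If you would rather stay with BeautifulSoup, note that `find('a', text='-i.')` compares against the link's text, not its `href`, which is why it finds nothing. Below is a minimal sketch that filters on `href` with a compiled regex instead. The `html` string is a trimmed copy of the markup from the question, and `ID_PATTERN` and `extract_ids` are names made up for this example.

```python
import re
from bs4 import BeautifulSoup

# Trimmed copy of one search-result card from the question (illustration only).
html = '''
<div class="col-xs-2-4 shopee-search-item-result__item" data-sqe="item">
  <a data-sqe="link" href="/Apple-iPhone-11-by-Studio7-i.301786571.4161270915?sp_atk=cc5f3783-013f-4ed4-88cb-8675a212c9d3&amp;xptdk=cc5f3783-013f-4ed4-88cb-8675a212c9d3">
    <div data-sqe="name">Apple iPhone 11 by Studio7</div>
  </a>
</div>
'''

# "-i." followed by two dot-separated runs of digits; the digits are captured.
ID_PATTERN = re.compile(r'-i\.(\d+\.\d+)')

def extract_ids(markup):
    """Return the ID part of every <a href> that contains the '-i.' marker."""
    soup = BeautifulSoup(markup, 'html.parser')
    ids = []
    # Passing a compiled regex as href= keeps only links whose href matches it.
    for link in soup.find_all('a', href=ID_PATTERN):
        ids.append(ID_PATTERN.search(link['href']).group(1))
    return ids

print(extract_ids(html))
```

In the question's loop, the same idea would be `area.find('a', href=ID_PATTERN)` followed by a search on the returned tag's `href`.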

See Selenium documentation here: https://www.selenium.dev/documentation/

Answered By: Barry the Platipus

Maybe you can try something like this with a regex:

import re

Link = 'https://shopee.co.th/Kawasaki-%E0%B8%A3%E0%B8%AD%E0%B8%87%E0%B9%80%E0%B8%97%E0%B9%89%E0%B8%B2%E0%B8%81%E0%B8%B5%E0%B8%AC%E0%B8%B2%E0%B8%A5%E0%B9%8D%E0%B8%B2%E0%B8%A5%E0%B8%AD%E0%B8%87%E0%B8%A3%E0%B8%B0%E0%B8%9A%E0%B8%9A%E0%B8%9B%E0%B9%89%E0%B8%AD%E0%B8%87%E0%B8%81%E0%B8%B1%E0%B8%99%E0%B8%81%E0%B8%B2%E0%B8%A3%E0%B8%AA%E0%B8%B6%E0%B8%81%E0%B8%AB%E0%B8%A3%E0%B8%AD%E0%B9%81%E0%B8%9A%E0%B8%9A%E0%B9%80%E0%B8%95%E0%B9%87%E0%B8%A1%E0%B8%A3%E0%B8%B9%E0%B8%9B%E0%B9%81%E0%B8%9A%E0%B8%9A-i.611674069.14413534248?sp_atk=639c49c1-a9bf-438f-9f19-26bc401be713&xptdk=639c49c1-a9bf-438f-9f19-26bc401be713'

num = re.search(r'\d{9}\.\d{11}', Link)  # 9 digits, a literal dot, then 11 digits
print(num.group(0))

Output:

611674069.14413534248
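
Note that the pattern above expects exactly 9 and then 11 digits, while the href in the question's HTML (`301786571.4161270915`) has 9 and 10, so it would not match there. A possible variant, sketched under the assumption that the ID always follows `-i.` at the end of the URL path, anchors on that marker instead of counting digits. `extract_id` and `ID_AT_END` are illustrative names, and the second URL is a shortened stand-in for the long one above.

```python
import re
from urllib.parse import urlsplit

# Anchor on the "-i." marker at the end of the path instead of counting digits.
ID_AT_END = re.compile(r'-i\.(\d+\.\d+)$')

def extract_id(url):
    """Return the '<digits>.<digits>' part after '-i.', or None if it is absent."""
    path = urlsplit(url).path  # drops '?sp_atk=...' and everything after it
    match = ID_AT_END.search(path)
    return match.group(1) if match else None

# Relative href copied from the question's HTML.
href = ('/Apple-iPhone-11-by-Studio7-i.301786571.4161270915'
        '?sp_atk=cc5f3783-013f-4ed4-88cb-8675a212c9d3'
        '&xptdk=cc5f3783-013f-4ed4-88cb-8675a212c9d3')
# Shortened, hypothetical stand-in for the long product URL.
url = ('https://shopee.co.th/Kawasaki-i.611674069.14413534248'
       '?sp_atk=639c49c1-a9bf-438f-9f19-26bc401be713')

print(extract_id(href))
print(extract_id(url))
print(extract_id('https://shopee.co.th/search?keyword=kawasaki'))
```

Returning `None` for URLs without the marker avoids the `AttributeError` that calling `.group(0)` on a failed `re.search` would raise.
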
Answered By: Oxykore