How to get exact matched phrases with re in Python

Question:

I am trying to return all phrases that match a pattern using Python’s re module.
Here’s an example of the code:

mlocations = requests.get("https://m.happyfresh.id/supplier/tip-top-hfc?tracking_source=backlink_storehome").text
data = re.findall(r'(?={"id":)(.*?)(?=","address1":)', mlocations)

Here’s a snippet of mlocations:

~50%","seo_details":null,"store_categories_name":["Daily Basic Needs","Supermarket"]}},{"id":11649,"name":"HappyFresh Supermarket Depok","address1":"Jl. Gas Alam Raya No.90, RW.5, Curug, Kec. Cimanggis, Kota Depok, Jawa Barat","city":null,"zipcode":"16454","phone":"","lat":-6.38367246279211,"lon":106.876803265144,"slug":"happyfresh-supermarket-depok","photo":null,"state_name":"Depok","supplier":{"id":3468,"name":"HappyFresh Supermarket - ID","slug":"tip-top-hfc","supplier_type":"warehouse","instant_delivery":true,"delivery_time":"","delivery_price":"","brand_store_image":null,"photo":"https://cdn.happyfresh.com/spree/suppliers/photos/83817f71b5105cea577fdc3b7269004e97e27dcc-medium.jpg?1636468094","square_background":"#ffffff","square_photo":"https://cdn.happyfresh.com/spree/suppliers/square_photos/01c5f038b42d965441eb4ef55793ebb3e16d4213-medium.png?1636468095","store_photo":null,"background_square_photo":{"mini_url":"https://cdn.happyfresh.com/spree/suppliers/background_square_photos/ee855daf2c26df788660693c7c7c4f590fa6b24d-mini.png?1625202943","small_url":"https://cdn.happyfresh.com/spree/suppliers/background_square_photos/068a23faf0395d66a99d3322c52d4b27803ed9c3-small.png?1625202943","medium_url":"https://cdn.happyfresh.com/spree/suppliers/background_square_photos/8e33766f8f2957f2f7df1c6190c4edb700438e6f-medium.png?1625202943","large_url":"https://cdn.happyfresh.com/spree/suppliers/background_square_photos/0b790dd195f28fb2788cc4567e72b1c7e260505e-large.png?1625202943"},"display_promotion_label":"Diskon ~50%","seo_details":null,"store_categories_name":["Daily Basic Needs","Supermarket"]}},{"id":6501,"name":"HappyFresh Supermarket Cilandak","address1":"Pergudangan Perum Peruri, Gudang 7, Jl Lebak Bulus I, Cilandak, Jakarta Selatan ","city":"Jakarta","zipcode":"12430","phone":"","lat":-6.29639070332991,"lon":106.79448776706,"slug":"happyfresh-supermarket-cilandak","photo":null,"state_name":"Jakarta Selatan","supplier":{"id":3468,"name":"HappyFresh Supermarket - 
ID","slug":"tip-top-

It’s supposed to return 2 items:

11649,"name":"HappyFresh Supermarket Depok"
6501,"name":"HappyFresh Supermarket Cilandak"

However, it returns everything that falls between an id and an address1, not just those two short phrases.
How do I return just the items that are between {"id": and "address1"?

Asked By: Hal


Answers:

Basically, don’t parse JSON with regex; use the json module instead:

import re
import json
import requests

mlocations = requests.get(
    "https://m.happyfresh.id/supplier/tip-top-hfc?tracking_source=backlink_storehome"
).text

data = re.search(r"window\.__PRELOADED_STATE__ = (.*})", mlocations).group(1)
data = json.loads(data)  # <-- parse the initial data with the json module

# now you can access the data like a normal Python dict/list
for store in data["supplierReducer"]["supplierLanding"]["stores"]["data"]:
    print(store["id"], store["name"])

Prints:

10457 HappyFresh Supermarket Senayan
11649 HappyFresh Supermarket Depok
6501 HappyFresh Supermarket Cilandak
11184 HappyFresh Supermarket Bintaro
10010 HappyFresh Supermarket Sunter
10456 HappyFresh Supermarket Puri
11323 HappyFresh Supermarket Bekasi
10455 HappyFresh Supermarket BSD
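
To experiment with this approach without hitting the site, it can be sketched on an inline string. The HTML below is a made-up stand-in for the page (only the key path mirrors the one used above; the trailing script content is an assumption), and extract_state is a hypothetical helper. Instead of a greedy capture it uses json.JSONDecoder.raw_decode, which parses one JSON value and ignores whatever follows it, so the regex only has to find where the object starts:

```python
import json
import re

# Made-up stand-in for the downloaded page; only the key path mirrors the real one.
SAMPLE_HTML = (
    '<script>window.__PRELOADED_STATE__ = {"supplierReducer": {"supplierLanding": '
    '{"stores": {"data": ['
    '{"id": 11649, "name": "HappyFresh Supermarket Depok", "address1": "Jl. Gas Alam Raya No.90"}, '
    '{"id": 6501, "name": "HappyFresh Supermarket Cilandak", "address1": "Jl Lebak Bulus I"}'
    ']}}}}; window.other = {"x": 1};</script>'
)


def extract_state(html):
    """Return the object assigned to window.__PRELOADED_STATE__ (hypothetical helper)."""
    match = re.search(r"window\.__PRELOADED_STATE__\s*=\s*", html)
    if match is None:
        raise ValueError("preloaded state not found")
    # raw_decode parses one JSON value starting at the given index and
    # returns (object, end_index); text after the value is ignored.
    state, _end = json.JSONDecoder().raw_decode(html, match.end())
    return state


state = extract_state(SAMPLE_HTML)
stores = state["supplierReducer"]["supplierLanding"]["stores"]["data"]
pairs = [(store["id"], store["name"]) for store in stores]
for store_id, name in pairs:
    print(store_id, name)
```

The explicit None check is there because re.search returns None when the page layout changes, and calling .group(1) on that would raise a less helpful AttributeError.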
Answered By: Andrej Kesely
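
As for why the pattern in the question over-matches: a lazy .*? only keeps the end of a match as early as possible, while the start is still the first {"id": the scan reaches. After the first store, the next {"id": belongs to the nested "supplier" object, so the match runs from there all the way to the next ","address1":. If a regex is still wanted, one possible sketch is to spell out the fields so a match cannot span other keys. The sample text is abbreviated from the snippet in the question, and the tighter pattern assumes that name directly follows id and contains no escaped quotes:

```python
import re

# Abbreviated from the snippet in the question.
text = (
    '{"id":11649,"name":"HappyFresh Supermarket Depok","address1":"Jl. Gas Alam Raya No.90",'
    '"supplier":{"id":3468,"name":"HappyFresh Supermarket - ID","slug":"tip-top-hfc"}},'
    '{"id":6501,"name":"HappyFresh Supermarket Cilandak","address1":"Pergudangan Perum Peruri"'
)

# Pattern from the question: the second match starts at the nested supplier object.
loose = re.findall(r'(?={"id":)(.*?)(?=","address1":)', text)

# Tighter pattern: digits for the id, then a name with no quotes inside it,
# which must be followed directly by the address1 key.
tight = re.findall(r'\{"id":(\d+,"name":"[^"]*")(?=,"address1":)', text)

print(loose[1][:30])  # shows where the second loose match begins
for item in tight:
    print(item)
```

This remains brittle compared with parsing the JSON, since any change in key order or an escaped quote inside a name breaks the pattern.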