Extracting reviews of Android App from Google Play store using Web Scraping method (Python BS4) – index out of range

Question:

The code below fails with a "list index out of range" error.

import bs4
import requests
my_url = requests.get('play.google.com/store/apps/details?id=com.delta.mobile.android&hl=en_US&showAllReviews=true')
uClient = uReq(my_url) 
page_soup = uClient.read() 
uClient.close() 
#Parsing the content 
soup = BeautifulSoup(page_soup, "html.parser") 
txt = soup.find('div', class_='review-body').get_text() 
print(soup.get_text()) 
temp = pd.DataFrame({'Review Text': txt}, index=[0]) 
print('-' * 10) 
#Appending temp values into DataFrame 
reviews_df.append(temp) 
#Printing DataFrame 
print(reviews_df)
Asked By: Sivakumar Prakash

Answers:

Try:

import requests
from bs4 import BeautifulSoup

URL = 'http://play.google.com/store/apps/details?id=com.delta.mobile.android&hl=en_US&showAllReviews=true'
USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
headers = {"user-agent": USER_AGENT}
resp = requests.get(URL, headers=headers)
soup = BeautifulSoup(resp.content, "html.parser")
# print(soup.prettify())

# Collect the inline <script> blocks that mention review IDs ("gp:...")
a = []
for script in soup.find_all('script', string=True):
    if "gp:" in script.text:
        a.append(script.text)

# Take the last matching script and split it on the marker ,null,"
chunks = a[-1].split(',null,"')
del chunks[0]  # the text before the first marker is not a review
for j in chunks:
    if 'http' not in j:
        # Print everything up to the closing quote of the string
        print(j[:j.index('"')])
        print()

It worked for me!
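
The string-splitting step above can be pulled out into a small helper and tried without touching the network. This is only a sketch: the function name `extract_review_texts` and the sample payload are made up for illustration, and the real script content served by Google Play may look different or change at any time.

```python
def extract_review_texts(script_text, marker=',null,"'):
    """Return the quoted string that follows each marker, skipping chunks with URLs."""
    # Everything before the first marker is not a review, so drop it.
    chunks = script_text.split(marker)[1:]
    texts = []
    for chunk in chunks:
        end = chunk.find('"')  # closing quote of the string literal
        if end == -1 or 'http' in chunk:
            continue  # no closing quote, or the chunk contains a URL
        texts.append(chunk[:end])
    return texts


# Invented payload that only imitates the ,null," pattern (an assumption, not real data)
sample = ('["gp:AOq",null,"Great app",5],'
          '["gp:AOr",null,"https://example.com/a.png"],'
          '["gp:AOs",null,"Crashes often",1]')

print(extract_review_texts(sample))
```

Using `find` instead of `index` means a chunk without a closing quote is skipped rather than raising an exception. A review that itself contains an escaped quote would still be cut short, so treat this as a rough heuristic.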

Answered By: Joshua Varghese

Alternatively, you could use a third-party solution like SerpApi to retrieve all the reviews of an app. We handle proxies, solve captchas, and parse all rich structured data for you.

Example Python code for retrieving reviews of the YouTube app (client libraries for other languages are also available):

from serpapi import GoogleSearch

params = {
  "api_key": "SECRET_API_KEY",
  "engine": "google_play_product",
  "store": "apps",
  "gl": "us",
  "product_id": "com.google.android.youtube",
  "all_reviews": "true"
}

search = GoogleSearch(params)
results = search.get_dict()

Example JSON output:

  "reviews": [
    {
      "title": "Qwerty Jones",
      "avatar": "https://play-lh.googleusercontent.com/a/AATXAJwSQC_a0OIQqkAkzuw8nAxt4vrVBgvkmwoSiEZ3=mo",
      "rating": 3,
      "snippet": "Overall a great app. Lots of videos to see, look at shorts, learn hacks, etc. However, every time I want to go on the app, it says I need to update the game and that it's \"not the current version\". I've done it about 3 times now, and it's starting to get ridiculous. It could just be my device, but try to update me if you have any clue how to fix this. Thanks :)",
      "likes": 586,
      "date": "November 26, 2021"
    },
    {
      "title": "matthew baxter",
      "avatar": "https://play-lh.googleusercontent.com/a/AATXAJy9NbOSrGscHXhJu8wmwBvR4iD-BiApImKfD2RN=mo",
      "rating": 1,
      "snippet": "App is broken, every video shows no dislikes even after I hit the button. I've tested this with multiple videos and now my recommended is all messed up because of it. The ads are longer than the videos that I'm trying to watch and there is always a second ad after the first one. This app seriously sucks. I would not recommend this app to anyone.",
      "likes": 352,
      "date": "November 28, 2021"
    },
    {
      "title": "Operation Blackout",
      "avatar": "https://play-lh.googleusercontent.com/a-/AOh14GjMRxVZafTAmwYA5xtamcfQbp0-rUWFRx_JzQML",
      "rating": 2,
      "snippet": "YouTube used to be great, but now theyve made questionable and arguably stupid decisions that have effectively ruined the platform. For instance, you now have the grand chance of getting 30 seconds of unskipable ad time before the start of a video (or even in the middle of it)! This happens so frequently that its actually a feasible option to buy an ad blocker just for YouTube itself... In correlation with this, YouTube is so sensitive twords the public they decided to remove dislikes. Why????",
      "likes": 370,
      "date": "November 24, 2021"
    },
    ...
  ],
  "serpapi_pagination": {
    "next": "https://serpapi.com/search.json?all_reviews=true&engine=google_play_product&gl=us&hl=en&next_page_token=CpEBCo4BKmgKR_8AwEEujFG0VLQA___-9zuazVT_jmsbmJ6WnsXPz8_Pz8_PxsfJx5vJns3Gxc7FiZLFxsrLysnHx8rIx87Mx8nNzsnLyv_-ECghlTCOpBLShpdQAFoLCZiJujt_EovhEANgmOjCATIiCiAKHmFuZHJvaWRfaGVscGZ1bG5lc3NfcXNjb3JlX3YyYQ&product_id=com.google.android.youtube&store=apps",
    "next_page_token": "CpEBCo4BKmgKR_8AwEEujFG0VLQA___-9zuazVT_jmsbmJ6WnsXPz8_Pz8_PxsfJx5vJns3Gxc7FiZLFxsrLysnHx8rIx87Mx8nNzsnLyv_-ECghlTCOpBLShpdQAFoLCZiJujt_EovhEANgmOjCATIiCiAKHmFuZHJvaWRfaGVscGZ1bG5lc3NfcXNjb3JlX3YyYQ"
  }
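
Once you have a result dictionary shaped like the output above, flattening it takes a few lines of plain Python. The sketch below assumes only the keys visible in the sample (`reviews`, `title`, `rating`, `date`, `snippet`). The helper names `reviews_to_rows` and `average_rating` are hypothetical, and the snippets are shortened here for brevity.

```python
def reviews_to_rows(results):
    """Flatten the "reviews" list of a result dict into (title, rating, date, snippet) tuples."""
    rows = []
    for review in results.get("reviews", []):
        rows.append((
            review.get("title"),
            review.get("rating"),
            review.get("date"),
            review.get("snippet", ""),
        ))
    return rows


def average_rating(rows):
    """Mean of the numeric ratings, or None when there are none."""
    ratings = [row[1] for row in rows if isinstance(row[1], (int, float))]
    return sum(ratings) / len(ratings) if ratings else None


# Hand-written stand-in for search.get_dict(), with shortened snippets
sample_results = {
    "reviews": [
        {"title": "Qwerty Jones", "rating": 3, "date": "November 26, 2021",
         "snippet": "Overall a great app."},
        {"title": "matthew baxter", "rating": 1, "date": "November 28, 2021",
         "snippet": "App is broken."},
        {"title": "Operation Blackout", "rating": 2, "date": "November 24, 2021",
         "snippet": "YouTube used to be great."},
    ]
}

rows = reviews_to_rows(sample_results)
for title, rating, date, snippet in rows:
    print(f"{date} | {rating} | {title}: {snippet}")
print("average rating:", average_rating(rows))
```

The same list of tuples can be passed to `pandas.DataFrame(rows, columns=[...])` if you want the reviews in a DataFrame, as in the question.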

Check out the documentation for more details.

Test the search live on the playground.

Disclaimer: I work at SerpApi.

Answered By: Milos Djurdjevic