Cannot scrape Glassdoor rating stars

Question:

I'm trying to find a way to scrape these stars, but the stars look identical in all their attributes, both the green ones and the grey ones. These are the individual user reviews, not the overall company rating. How can I get the number of stars given in each rating, if that is possible at all?

Asked By: Abhishek Rai

Answers:

I think one way to get them is to open the company profile and read the star rating directly from there. It should be shown right after the stars.

Answered By: Mahery Ranaivoson

As Dan explained in the comments, you can find the CSS class (.css-1ihykkv) applied to the rating element. In that rule, the background property holds a linear-gradient, which gives the percentage of green and grey used to draw the rating.

Once you find this CSS rule and its property, you can extract the percentage. Here is an example of how to read a CSS value with Selenium:

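from selenium.webdriver.common.by import By
# Read the computed CSS value of an element (driver is an existing WebDriver instance)
bg_color = driver.find_element(By.XPATH, "//button[contains(@class,'btn-primary')]").value_of_css_property("background-color")
print(bg_color)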

The output should be like this:

rgba(0, 123, 255, 1)

Try extracting the value of the background property in the same way; you can then use the percentage for the different sub-ratings, such as Culture & Values.
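
To illustrate the last step, here is a minimal sketch of turning such a background value into a star count. The function name stars_from_gradient and the sample gradient string are assumptions made for illustration; the real value on Glassdoor may be formatted differently, so check it in the browser's developer tools first.

```python
import re

def stars_from_gradient(background, max_stars=5):
    """Return the rating implied by the first percentage stop of a CSS gradient.

    Returns None when the string contains no percentage at all.
    """
    match = re.search(r"(\d+(?:\.\d+)?)%", background)
    if match is None:
        return None
    filled_fraction = float(match.group(1)) / 100
    return round(filled_fraction * max_stars, 1)

# Hypothetical computed value: green up to 80 %, grey for the rest.
sample = "linear-gradient(90deg, rgb(12, 170, 65) 80%, rgb(222, 224, 227) 80%)"
print(stars_from_gradient(sample))
```

With Selenium, the input string would presumably come from a call such as element.value_of_css_property("background-image"); which property actually carries the gradient is something to verify on the page.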

Answered By: Libin Thomas

I found out that the same CSS class is always used for the same number of stars. For example, the CSS class for four stars is "css-94nhxw" and the one for one star is "css-1mfncox". I use this in my code to work out the rating from the class that is present.
For example, my code for the Work/Life Balance sub-rating looks like this:

from selenium.webdriver.common.by import By

# CSS class -> star rating; Glassdoor changes these class names from time to time.
STAR_CLASSES = {
    "css-11w4osi": "5",
    "css-94nhxw": "4",
    "css-k58126": "3",
    "css-1lp3h8x": "2",
    "css-1mfncox": "1",
}

# XPath to the star container next to the "Work/Life Balance" label; {} is the star class.
SUBRATING_XPATH = (
    './/div[@class="tooltipContainer"]/div[@class="content"]/ul[@class="pl-0"]'
    '/li/div[text()="Work/Life Balance"]'
    '/following-sibling::div[@class="{} e1hd5jg10"]'
)

def scrape_work_life_balance(gdReview):
    # Presumed intent of the original check: a review without the sub-rating
    # dropdown icon has no sub-ratings at all. find_elements returns an empty
    # list instead of raising, so no try/except is needed.
    if not gdReview.find_elements(By.XPATH, './/span[@class="SVGInline d-flex css-hcqxoa"]'):
        return "Null"
    # Try each known star class, from five stars down to one.
    for css_class, stars in STAR_CLASSES.items():
        if gdReview.find_elements(By.XPATH, SUBRATING_XPATH.format(css_class)):
            return stars
    return "Zero"

Just keep in mind that CSS classes can change. Glassdoor changed them within the last few months, and I updated the ones above today.
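
Because the class names change, it may help to keep them in a single lookup table and match on the individual class token rather than on the full class string. The following is only a sketch under that assumption: the helper name stars_from_class_attr is hypothetical, and the class names are the ones quoted in this answer, which may already be out of date.

```python
# Class names as quoted in this answer; update this table when Glassdoor changes them.
STAR_CLASS_TO_RATING = {
    "css-11w4osi": 5,
    "css-94nhxw": 4,
    "css-k58126": 3,
    "css-1lp3h8x": 2,
    "css-1mfncox": 1,
}

def stars_from_class_attr(class_attr, table=STAR_CLASS_TO_RATING):
    """Return the rating for the first known star class in a class attribute.

    Splitting on whitespace makes the lookup independent of extra classes,
    their order, and stray double spaces. Returns None for unknown classes.
    """
    for token in class_attr.split():
        if token in table:
            return table[token]
    return None

print(stars_from_class_attr("css-94nhxw e1hd5jg10"))
```

With Selenium, the argument would typically be element.get_attribute("class") for the star container of a sub-rating.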

Answered By: Soph_183