Scrape with a loop and output each table to a different sheet in the same workbook in Python

Question:

I am trying to write the tables I scrape with this code to different sheets in the same workbook, each with a different name, but I can’t make it work. I am quite new to Python, so I would really appreciate some help. This is the part of the code that seems to work fine:

import requests
from bs4 import BeautifulSoup as bs
from time import sleep

masterlist = []
i = 0

url = "https://cryptopunks.app/cryptopunks/details/"

for cryptopunk in range(0,10): # The range of cryptopunks
    row_data = []
    sleep(2) # sleep time of loop so it doesn't break
    page = requests.get(url + str(i)) #change the address for each punk
    soup = bs(page.text, 'lxml') 
    table_body = soup.find('table')    
    for row in table_body.find_all('tr'): #get the rows of the table
        col = row.find_all('td') #get the cells
        col = [ele.text.strip().encode("utf-8") for ele in col]
        row_data.append(col) #append all in the file 
    masterlist.append (row_data)
    i = i+1
    print(i)

This is the part of the code I tried to use to write the tables out, but it doesn’t work:

    df = pd.DataFrame(masterlist).T
    writer = pd.ExcelWriter('group1.xlsx', engine='xlsxwriter')
    df.to_excel(writer,index=False)
    writer.save()

What I get with this code is the following: [screenshot not available]

I would also like each table to have the following column headers:

header=['Type', 'From', 'To', 'Amount', 'Txn']

Thanks

Asked By: Efthymios


Answers:

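Here is one way to write several DataFrames to separate sheets of the same Excel workbook: build one DataFrame per punk, then write each one to its own sheet through a single ExcelWriter.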

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs
from time import sleep

url = "https://cryptopunks.app/cryptopunks/details/"
header = ['Type', 'From', 'To', 'Amount', 'Txn']
num_cryptopunks = 10
masterlist = []  # one DataFrame per punk

for i in range(num_cryptopunks):  # the range of cryptopunks
    row_data = []
    sleep(2)  # pause between requests
    page = requests.get(url + str(i))  # change the address for each punk
    soup = bs(page.text, 'lxml')
    table_body = soup.find('table')
    for row in table_body.find_all('tr'):  # get the rows of the table
        # keep the cell text as str; encoding to bytes would write b'...' values
        col = [ele.text.strip() for ele in row.find_all('td')]
        if col:  # skip rows without <td> cells (e.g. a header row made of <th>)
            row_data.append(col)

    masterlist.append(pd.DataFrame(row_data))

# The context manager saves the workbook on exit (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter('group1.xlsx') as writer:
    for i, df in enumerate(masterlist):
        df.to_excel(writer, sheet_name=str(i), index=False, header=header)
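
The code above names each sheet with the bare punk number. If more descriptive names are wanted, a small helper can build them. The following is only a sketch: the function name `make_sheet_name` and the default prefix "punk" are made up for illustration. It accounts for two of Excel's restrictions on worksheet names: they may be at most 31 characters long and may not contain any of the characters [ ] : * ? / \.

```python
import re

# Characters Excel does not allow in worksheet names: [ ] : * ? / \
_INVALID_SHEET_CHARS = re.compile(r'[\[\]:*?/\\]')
MAX_SHEET_NAME_LEN = 31  # Excel's limit on worksheet name length


def make_sheet_name(index, prefix="punk"):
    """Return a name like 'punk_7' (hypothetical naming scheme for this sketch).

    Invalid characters in the prefix are replaced with underscores, and the
    prefix is shortened if needed so the numeric suffix is never cut off.
    """
    suffix = f"_{index}"
    clean_prefix = _INVALID_SHEET_CHARS.sub("_", prefix)
    return clean_prefix[:MAX_SHEET_NAME_LEN - len(suffix)] + suffix


sheet_names = [make_sheet_name(i) for i in range(3)]
print(sheet_names)  # ['punk_0', 'punk_1', 'punk_2']
```

The result can be passed as the sheet_name argument of df.to_excel in place of the plain number string. Truncating the prefix rather than the whole name keeps the names unique as long as the indices are unique.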
Answered By: constantstranger