Get multiple Wikipedia tables and save them to CSV

Question:

How can I read the tables from a Wikipedia page and write them to a CSV file?

import pandas as pd 

# 'https://en.wikipedia.org/wiki/List_of_Linux_distributions'

url = 'https://en.wikipedia.org/wiki/List_of_Linux_distributions'
tables = pd.read_html(url)

df = tables

df.to_csv("linux_distros.csv", sep=",")

AttributeError: 'list' object has no attribute 'to_csv'

Asked By: marreco


Answers:

pd.read_html returns a list of DataFrames, one for each table found on the page, so the list itself has no to_csv method.
You need to concatenate the relevant tables into a single DataFrame and then save that to a CSV file.

import pandas as pd

url = 'https://en.wikipedia.org/wiki/List_of_Linux_distributions'
tables = pd.read_html(url)

# Keep only the tables that have both a 'Distribution' and a 'Description' column
tables = [table for table in tables if ('Distribution' in table.columns and 'Description' in table.columns)]
# Concatenate the remaining tables into one DataFrame
df = pd.concat(tables, ignore_index=True)
# Save to CSV
df.to_csv("linux_distros.csv", sep=",")
Answered By: Adam Ali
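
The filter-then-concatenate idea can also be tried without a network connection. The following is only a minimal sketch, not part of the original answer: the helper name tables_to_csv is hypothetical, and the three small DataFrames are made-up stand-ins for the list that pd.read_html would return, not real Wikipedia content. It also assumes that an empty match should raise an error, since pd.concat fails on an empty list anyway.

```python
import io

import pandas as pd


def tables_to_csv(tables, required_columns, path_or_buffer):
    """Concatenate the tables that contain all required columns and write them as CSV.

    Hypothetical helper for illustration; `tables` is a list of DataFrames
    such as the one returned by pd.read_html.
    """
    required = set(required_columns)
    # Keep only the tables whose columns include every required name
    matching = [t for t in tables if required.issubset(t.columns)]
    if not matching:
        # pd.concat raises on an empty list, so fail early with a clearer message
        raise ValueError(f"no table contains all of the columns {sorted(required)}")
    combined = pd.concat(matching, ignore_index=True)
    # index=False leaves the row numbers out of the file
    combined.to_csv(path_or_buffer, index=False)
    return combined


# Made-up stand-ins for the list of DataFrames that pd.read_html(url) returns
tables = [
    pd.DataFrame({"Distribution": ["Debian"], "Description": ["Community distribution"]}),
    pd.DataFrame({"Year": [1993]}),
    pd.DataFrame({"Distribution": ["Fedora"], "Description": ["Sponsored by Red Hat"]}),
]

# Write to an in-memory buffer here; a file name such as "linux_distros.csv" works too
buffer = io.StringIO()
combined = tables_to_csv(tables, ["Distribution", "Description"], buffer)
csv_text = buffer.getvalue()
```

With real data you would pass the result of pd.read_html(url) as the first argument and a file name as the last one. Passing index=False is a choice made in this sketch; the answer above keeps the default, which writes the row index as an extra first column.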