How to use "concat" in place of "append" while sticking with the same scraping logic in Python (Pandas)

Question:

When writing data to a csv file with Pandas, I used to use the method below. It still works, but throws this warning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.breuninger.com/de/damen/luxus/bekleidung-jacken-maentel/"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Safari/537.36",
}

res = requests.get(url, headers=headers)
soup = BeautifulSoup(res.text,"lxml")

df = pd.DataFrame(columns=["Marke","Name","Preis"])

for item in soup.select(".suchen-produkt a"):
    marke = item.select_one(".suchen-produkt__marke").get_text()
    name = item.select_one(".suchen-produkt__name").get_text()
    preis = item.select_one(".suchen-produkt__preis").get_text()
    df = df.append({'Marke':marke,'Name':name,'Preis':preis}, ignore_index=True)

print(df)
df.to_csv("products.csv", index=False)

How can I use concat in place of append while keeping the same scraping logic intact?

Asked By: robots.txt

||

Answers:

Here's an example that collects one-row DataFrames in a list and concatenates them once after the loop:

dfs = []
for item in soup.select(".suchen-produkt a"):
    marke = item.select_one(".suchen-produkt__marke").get_text()
    name = item.select_one(".suchen-produkt__name").get_text()
    preis = item.select_one(".suchen-produkt__preis").get_text()
    dfs.append(pd.DataFrame([{'Marke': marke, 'Name': name, 'Preis': preis}]))

final = pd.concat(dfs).reset_index(drop=True)
print(final)

Or you can append each row to a list as a dict and convert the list to a DataFrame at the end:

data = []
for item in soup.select(".suchen-produkt a"):
    marke = item.select_one(".suchen-produkt__marke").get_text()
    name = item.select_one(".suchen-produkt__name").get_text()
    preis = item.select_one(".suchen-produkt__preis").get_text()
    data.append({'Marke': marke, 'Name': name, 'Preis': preis})

final = pd.DataFrame(data)
print(final)
Answered By: Jason Baker
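
For completeness, here is a minimal, self-contained sketch of the list-of-dicts approach carried through to CSV output. The `scraped` list is a hypothetical stand-in for the values pulled out of `soup` in the question, and the CSV is written to an in-memory buffer rather than to products.csv so the sketch runs without network or disk access.

```python
import io

import pandas as pd

# Hypothetical stand-in for the (marke, name, preis) values scraped from the page
scraped = [
    ("Brand A", "Jacket", "199,00 €"),
    ("Brand B", "Coat", "349,00 €"),
]

rows = []
for marke, name, preis in scraped:
    # Plain list.append on a Python list, not DataFrame.append
    rows.append({"Marke": marke, "Name": name, "Preis": preis})

# Passing columns= keeps the header row even if nothing was scraped
df = pd.DataFrame(rows, columns=["Marke", "Name", "Preis"])

# Write to an in-memory buffer; a file name such as "products.csv" works the same way
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Building the DataFrame once at the end also avoids copying the whole frame on every loop iteration, which is what both `append` and a `concat` inside the loop would do.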

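These are only deprecation warnings, so the code keeps working on pandas versions that still ship append (the method was removed in pandas 2.0).
To silence the warnings, use: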

import warnings
warnings.filterwarnings('ignore')
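
Note that the call above hides every warning, not just this one. A narrower filter can be sketched as follows, assuming the deprecation is raised as a FutureWarning (which is the category pandas 1.4 used for it). The two `warnings.warn` calls are only there to demonstrate the effect and are not part of the scraping code.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    # Start from "report everything" so the effect of the filter is visible
    warnings.simplefilter("always")
    # Ignore only FutureWarning; other categories are still reported
    warnings.filterwarnings("ignore", category=FutureWarning)

    warnings.warn("frame.append is deprecated", FutureWarning)  # suppressed
    warnings.warn("unrelated problem", UserWarning)             # still recorded

remaining = [w.category.__name__ for w in caught]
print(remaining)
```

In a real script only the `warnings.filterwarnings("ignore", category=FutureWarning)` line would be needed, placed before the loop.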

I still use append, but here is a somewhat roundabout way to add a row with concat instead:

parameters = ['a', 'b', 'c', 'd', 'e', 'f']
df = pd.DataFrame(columns=parameters)

new_row = pd.DataFrame([1,2,3,4,5,6], columns=['row1'], index=parameters).T
df = pd.concat((df, new_row))
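
A slightly shorter variant of the same idea, offered as a sketch: wrapping the values in an outer list yields a one-row frame directly, so no transpose is needed, and `ignore_index=True` renumbers the rows the way `append(..., ignore_index=True)` did. The starting row of zeros is purely illustrative.

```python
import pandas as pd

parameters = ["a", "b", "c", "d", "e", "f"]

# Illustrative starting frame with a single row of zeros
df = pd.DataFrame([[0, 0, 0, 0, 0, 0]], columns=parameters)

# One inner list per row, so this is a 1 x 6 frame without any transpose
new_row = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=parameters)

# ignore_index=True produces a fresh 0..n-1 index
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```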