Detect the language of text in a pandas column in Python

Question:

I would like to detect the language of the text in a pandas column in Python. After detecting it, I want to write the language code to a new column in the DataFrame. Below is the code I tried, but I got an error. Please help.

Thank you.

  import pandas as pd
  from textblob import TextBlob

  data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}
  # Create DataFrame
  df = pd.DataFrame(data)
  # Get the language
  for i in df['text']:
      # Language detection
      df['lang'] = TextBlob(i)


Asked By: melik


Answers:

You can use the langdetect library for language detection in Python.

Install it with:

pip install langdetect

Then apply detect to each row of the column:

import pandas as pd
from langdetect import detect

data = {'text': ["It is a good option", "Better to have this way", "es un portal informático para geeks", "は、ギーク向けのコンピューターサイエンスポータルです"]}

df = pd.DataFrame(data)

# Detect the language of each row and store its code (e.g. 'en', 'es', 'ja')
df['lang'] = df['text'].apply(detect)
Answered By: İlker Kara
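
On real data, a text column often contains empty strings or missing values, and langdetect raises LangDetectException when a string has no detectable features. The following is a minimal sketch of one way to guard against that, not part of the answer above. The helper name add_language_column, its parameters, and the "unknown" fallback label are hypothetical choices made for illustration; the detector is passed in as a plain callable so the helper does not depend on any particular library.

```python
import pandas as pd


def add_language_column(df, detector, text_col="text", lang_col="lang",
                        fallback="unknown"):
    """Return a copy of df with a language-code column added.

    detector is any callable mapping a string to a language code,
    for example langdetect.detect. The name and signature of this
    helper are illustrative assumptions, not a library API.
    """

    def safe_detect(text):
        # Skip missing values and blank strings instead of calling the detector.
        if not isinstance(text, str) or not text.strip():
            return fallback
        try:
            return detector(text)
        except Exception:
            # langdetect raises LangDetectException for undetectable input;
            # a broad catch keeps the helper detector-agnostic.
            return fallback

    out = df.copy()  # leave the caller's DataFrame untouched
    out[lang_col] = out[text_col].map(safe_detect)
    return out
```

With langdetect installed, calling add_language_column(df, detect) after "from langdetect import detect" would fill the lang column. langdetect is non-deterministic by default, so short or ambiguous texts can get different results between runs; setting DetectorFactory.seed = 0 (after "from langdetect import DetectorFactory") before detecting makes the results reproducible.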

I think this will be enough:

# Wrap each row's text in a TextBlob (note: this stores TextBlob objects, not language codes)
df['lang'] = df.apply(lambda x: TextBlob(x['text']), axis=1)
Answered By: Artyom Akselrod