Cannot write Japanese characters with pandas in Python

Question:

I’m trying to write data containing Japanese characters to a CSV file,
but the Japanese characters in the CSV do not display correctly.

import pandas as pd

def write_csv(columns, data):
    # Build a DataFrame from the rows and write it out as UTF-8
    df = pd.DataFrame(data, columns=columns)
    df.to_csv("..ReportReport.csv", encoding='utf-8')

write_csv(["法人番号", "法人名称", "法人名称カナ"], [])

The resulting CSV looks like this:


æ³•äººç•ªå· æ³•äººå称 法人å称カナ

How can I get the Japanese characters written correctly?

Asked By: Vũ Minh Vương


Answers:

Your code is fine; I just tried it. My guess is that the CSV file itself is correct, but you’re opening it as cp1252 instead of UTF-8.

What software are you using to open this CSV?

  • If you’re using Microsoft Excel, make sure to use “Import” instead of “Open” so that you can choose the encoding.
  • With Google Sheets or LibreOffice it should Just Work.
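
The cp1252 guess can be checked directly: the garbled header in the question is what the UTF-8 bytes of the Japanese text look like when they are decoded as cp1252. Here is a minimal sketch using only the standard library. The helper name as_cp1252 is made up for illustration, and errors="replace" is used because a few byte values have no cp1252 mapping and would otherwise raise an error.

```python
def as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then misread those bytes as cp1252."""
    utf8_bytes = text.encode("utf-8")
    # Bytes that cp1252 does not define become U+FFFD instead of raising.
    return utf8_bytes.decode("cp1252", errors="replace")

garbled = as_cp1252("法人")
print(garbled)  # æ³•äºº
```

The output matches the start of the garbled line in the question, which supports the idea that the file is fine and the viewer is using the wrong encoding.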

Another possible explanation is that there’s something wrong with your data in the first place. Here’s how you can check that (I just took a few random characters from this generator):

import pandas as pd

# A one-cell DataFrame holding known-good Japanese text
df = pd.DataFrame(['勘してろむ説彼ふて惑岐とや尊続セヲ狭題'])
df.to_csv('report.csv', encoding='utf-8')

Try opening that file the same way. If it displays correctly but your original file doesn’t, the problem is in your code or data rather than in how the file is opened.
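
One way to take the viewing program out of the picture is to read the file's raw bytes and decode them strictly as UTF-8. This is only a sketch: is_valid_utf8 is a hypothetical helper, not part of pandas, and it assumes the file is small enough to read in one go.

```python
from pathlib import Path

def is_valid_utf8(path: str) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")  # strict decoding: raises on invalid bytes
    except UnicodeDecodeError:
        return False
    return True
```

If is_valid_utf8("report.csv") returns True, that suggests the file was written correctly and the program used to open it is picking the wrong encoding.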

Answered By: Kos

For me utf_8_sig worked like a charm.

df.to_csv("..ReportReport.csv", encoding='utf_8_sig')
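
A likely reason this helps: the utf_8_sig codec writes a byte order mark (BOM, the bytes EF BB BF) at the start of the file, and Excel uses that mark to recognise UTF-8 when a CSV is opened directly. The sketch below shows the difference with plain Python string encoding rather than pandas, on the assumption that to_csv passes the encoding name to the same codec.

```python
import codecs

header = "法人番号,法人名称,法人名称カナ\n"

plain = header.encode("utf-8")         # no BOM
with_bom = header.encode("utf_8_sig")  # BOM followed by the same UTF-8 bytes

print(plain.startswith(codecs.BOM_UTF8))     # False
print(with_bom.startswith(codecs.BOM_UTF8))  # True
```

Apart from the three leading bytes the two encodings are identical, so tools that already read UTF-8 correctly should be unaffected, although some may show the BOM as a stray character at the start of the first header.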

Answered By: Shilp Thapak