Python: Replace special characters in a CSV with a specified string
Question:
I’m trying to replace multiple strings (containing characters from several languages) in a CSV file.
The following code works, provided I first rename my CSV file to use the .txt extension and then rename it back to .csv afterwards. I’m wondering whether the CSV can be read and written directly.
import io

match = {
    "太好奇了": "First String",
    "धेरै एक्लो": "Second String",
    "심각하게 생명이 필요하다": "Third String"
}

f = io.open("input.txt", mode="r", encoding="utf-16")
data = f.read()

def replace_all(text, dic):
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

data = replace_all(data, match)
w = open("updated.txt", "w", encoding="utf-16")
w.write(data)
Answers:
Use pandas. Here’s some code to help you get started.
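import pandas as pd
filename = 'data.csv'
df = pd.read_csv(filename, encoding='utf-16')
# replace "太好奇了" with "First String"; replace() returns a new DataFrame,
# and regex=True makes it match inside a cell instead of only whole-cell values
df = df.replace(to_replace="太好奇了", value="First String", regex=True)
df.to_csv('update.csv', index=False, encoding='utf-16')  # save result in the same encoding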
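The question asks about several strings at once, so here is a minimal sketch that passes the whole match dictionary from the question to pandas. The function name replace_in_csv and the file names in the usage note are made up for illustration. It assumes every cell should be treated as text and that only cell values, not the header row, need replacing.

```python
import re

import pandas as pd

# Mapping taken from the question
match = {
    "太好奇了": "First String",
    "धेरै एक्लो": "Second String",
    "심각하게 생명이 필요하다": "Third String",
}


def replace_in_csv(src, dst, mapping, encoding="utf-16"):
    """Hypothetical helper: replace every key of `mapping` inside the cells of a CSV."""
    # dtype=str and keep_default_na=False keep each cell as the text that was in the file
    df = pd.read_csv(src, encoding=encoding, dtype=str, keep_default_na=False)
    # regex=True enables substring matching; re.escape makes the keys match literally
    patterns = {re.escape(old): new for old, new in mapping.items()}
    df = df.replace(patterns, regex=True)
    # index=False avoids writing the row index as an extra first column
    df.to_csv(dst, index=False, encoding=encoding)
```

A call such as replace_in_csv("data.csv", "update.csv", match) would read and write the .csv files directly, with no renaming involved.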
A CSV file is nothing more than a plain text file that represents a data table by separating the values with commas. This lets programs parse it into rows and columns using libraries like Python’s csv module.
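As an illustration of the csv module mentioned above, here is a rough sketch that applies the replacements cell by cell. The function name replace_in_cells is hypothetical, and the sketch assumes a comma-delimited file in UTF-16, as in the question.

```python
import csv


def replace_in_cells(src, dst, mapping, encoding="utf-16"):
    """Hypothetical helper: rewrite a CSV, replacing each key of `mapping` in every cell."""
    # newline="" is what the csv module expects for files it reads or writes
    with open(src, encoding=encoding, newline="") as fin, \
            open(dst, "w", encoding=encoding, newline="") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin):
            new_row = []
            for cell in row:
                for old, new in mapping.items():
                    cell = cell.replace(old, new)
                new_row.append(cell)
            # the writer adds quotes if a replacement introduces a comma or a quote
            writer.writerow(new_row)
```

Going through the parser is mainly useful when a replacement string could itself contain a comma or a quote character. A plain text replacement would silently change the number of columns in that case.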
Since it is still just a text file, you can also open it like any other text file with the built-in open function and use it exactly as you would a .txt file.
f = open("myfile.csv", mode="r", encoding="utf-16")
data = f.read()
f.close()
Note that a file extension changes nothing about the file’s contents; it only signals how the file is meant to be used. You could name a text file myfile.kingkong and it would still behave the same with the open function. Likewise, renaming .csv to .txt does absolutely nothing.
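Putting this together with the code from the question, a minimal sketch that works on the .csv files directly might look like the following. The function name rewrite_csv and the file names in the usage note are placeholders, and UTF-16 is assumed because the question uses it.

```python
# Mapping taken from the question
match = {
    "太好奇了": "First String",
    "धेरै एक्लो": "Second String",
    "심각하게 생명이 필요하다": "Third String",
}


def replace_all(text, mapping):
    """Apply every old -> new replacement to the text, in dictionary order."""
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text


def rewrite_csv(src, dst, mapping, encoding="utf-16"):
    """Hypothetical helper: read src as text, apply the replacements, write dst."""
    # newline="" turns off newline translation, so the original line endings are kept
    with open(src, mode="r", encoding=encoding, newline="") as f:
        data = f.read()
    # the with blocks close both files, which also flushes the written data
    with open(dst, mode="w", encoding=encoding, newline="") as w:
        w.write(replace_all(data, mapping))
```

Calling rewrite_csv("input.csv", "updated.csv", match) then does the same job as the original script, without the rename step.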