How to read a file with space-separated values in pandas
Question:
I'm trying to read a file into pandas.
The values in the file are separated by spaces, but the number of spaces between them varies.
I tried:
pd.read_csv('file.csv', delimiter=' ')
but it doesn't work.
Answers:
You can use a regex as the delimiter (note the backslash: \s+ matches one or more whitespace characters):
pd.read_csv("whitespace.csv", header=None, delimiter=r"\s+")
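To see the regex delimiter at work without a file on disk, here is a minimal sketch that reads from an in-memory string instead. The sample data and the variable names are made up for illustration; only the `sep=r"\s+"` and `header=None` arguments come from the answer above.

```python
import io

import pandas as pd

# Hypothetical sample: three rows whose values are separated by a
# varying number of spaces (stands in for the contents of the file).
sample = "1   2 3\n4 5     6\n7  8  9\n"

# r"\s+" treats any run of whitespace as a single delimiter.
df = pd.read_csv(io.StringIO(sample), header=None, sep=r"\s+")
print(df)
```

With a plain `delimiter=' '`, each extra space would be treated as a separator around an empty field, which is why the original attempt produced a misaligned frame.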
Add the delim_whitespace=True argument; it's faster than regex. (Note that delim_whitespace is deprecated as of pandas 2.2 in favor of sep=r"\s+".)
If you can't get text parsing to work using the accepted answer (e.g. if your text file contains non-uniform rows), then it's worth trying Python's csv library. Here's an example using a user-defined Dialect:
import csv

# Register a dialect that skips the extra spaces following each delimiter.
csv.register_dialect('skip_space', skipinitialspace=True)

with open(my_file, 'r', newline='') as f:  # my_file is the path to your text file
    reader = csv.reader(f, delimiter=' ', dialect='skip_space')
    for item in reader:
        print(item)  # each item is a list of strings for one row
Pandas read_fwf for the win:
import pandas as pd
df = pd.read_fwf(file_path)
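read_fwf is meant for fixed-width files, so it fits best when the columns line up vertically. The sketch below uses a made-up, column-aligned sample held in memory (the data and names are assumptions, not from the question) and relies on read_fwf's default behavior of inferring the column boundaries.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample: every column starts at the same
# character position on each line.
sample = (
    "name   age  city\n"
    "Alice   30  Paris\n"
    "Bob     25  Rome\n"
)

# By default read_fwf infers column boundaries from the first rows.
df = pd.read_fwf(io.StringIO(sample))
print(df)
```

If the columns in your file are not aligned, the inferred boundaries may be wrong, and the regex delimiter approach is probably the safer choice.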
You can also pass a regular expression as the delimiter to read_table, and it is fast :).
result = pd.read_table('file', sep=r'\s+')