How do I write a python pandas.DataFrame to a file using aligned space chacarcters?

Question:

I want to store a pandas.DataFrame to a text file that has the columns aligned using whitespace characters. If this is my sample DataFrame:

In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: df = pd.DataFrame(np.linspace(0,1,9).reshape(3,3))
In [4]: df
Out[4]: 
       0      1      2
0  0.000  0.125  0.250
1  0.375  0.500  0.625
2  0.750  0.875  1.000
[3 rows x 3 columns]

I want to do something like this:

In [5]: df.to_csv('test.txt', sep='?')

to get this:

In [6]: more test.txt
  0     1     2
0 0.0   0.125 0.25
1 0.375 0.5   0.625
2 0.75  0.875 1.0

What separator should I use? I want to know if there is a way to do this without using the t character. It looks nice

    0       1       2
0   0.0     0.125   0.25
1   0.375   0.5     0.625
2   0.75    0.875   1.0

but then my text files have tab characters which create other problems.

If I use sep=' ' I get this which is obviously wrong.

 0 1 2
0 0.0 0.125 0.25
1 0.375 0.5 0.625
2 0.75 0.875 1.0

I know pandas can read in files like this so I figure there is a way to write out files like this.

Asked By: Steven C. Howell

||

Answers:

How about this

import numpy as np
import pandas as pd
import csv

df = pd.DataFrame(np.linspace(0,1,9).reshape(3,3))
df.to_csv('test.txt', 
          float_format='%10.3f', sep=" ", 
          quoting=csv.QUOTE_NONE, escapechar=" ")

It produces:

 0 1 2
0           0.000           0.125           0.250
1           0.375           0.500           0.625
2           0.750           0.875           1.000

Number of spaces can be ofc customized by the number of digits of the ‘longest’ number.

Answered By: grechut

Try using hollerith. pypi github. It’s a (very) new python library that deals with fixed width formatting – and open to contribution. Unfortunately trying to use built-in python string formatting (or the C printf for that matter!) for fixed width doesn’t work well for some widths when there are large integers and certain double precision floats.

Answered By: mohamed koubaa

One way is to use the tabulate package.

 % ipython
Python 3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0]
IPython 8.9.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: 
import numpy as np
import pandas as pd
df = pd.DataFrame(np.linspace(0,1,9).reshape(3,3))
df
Out[1]: 
       0      1      2
0  0.000  0.125  0.250
1  0.375  0.500  0.625
2  0.750  0.875  1.000

In [2]: 
from tabulate import tabulate
content = tabulate(
  df.values.tolist(), list(df.columns), showindex=list(df.index), tablefmt="plain"
)

In [3]: 
print(content)
        0      1      2
 0  0      0.125  0.25
 1  0.375  0.5    0.625
 2  0.75   0.875  1

If you do not want to print the row index, do not pass the showindex option.

In [4]: 
content2 = tabulate(
  df.values.tolist(), list(df.columns), tablefmt="plain"
)
print(content2)
    0      1      2
0      0.125  0.25
0.375  0.5    0.625
0.75   0.875  1
Answered By: Kamaraju Kusumanchi
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.