plot specific columns from a text file

Question:

If I have a text file, data.txt, that contains many columns, how can I read it in Python and plot only two chosen columns?
For example:

  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
  15 -22.82359691 0.14s
  20 -22.82464363 0.16s
  25 -22.82615348 0.17s
  30 -22.82641815 0.19s
  35 -22.82649347 0.21s
  40 -22.82655376 0.22s
  50 -22.82661407 0.28s
  60 -22.82663535 0.34s
  70 -22.82664864 0.42s
  80 -22.82665962 0.46s
  90 -22.82666308 0.51s
 100 -22.82666662 0.56s

and I need to plot only the first and second columns.
Note the space before the first column.

Edit
I used the following code:

import matplotlib.pyplot as plt
from matplotlib import rcParamsDefault
import numpy as np
plt.rcParams["figure.dpi"]=150
plt.rcParams["figure.facecolor"]="white"
x, y = np.loadtxt('./calc.dat', delimiter=' ')
plt.plot(x, y, "o-", markersize=5, label='Etot')
plt.xlabel('ecut')
plt.ylabel('Etot')
plt.legend(frameon=False)
plt.savefig("fig.png")

but for this to work I have to modify my data so that it contains only the two columns I need to plot, with no spaces before the first column, as follows:

 10 -22.82215289  
 12 -22.81978265  
 15 -22.82359691  
 20 -22.82464363  
 25 -22.82615348  
 30 -22.82641815  
 35 -22.82649347  
 40 -22.82655376  
 50 -22.82661407  
 60 -22.82663535  
 70 -22.82664864  
 80 -22.82665962  
 90 -22.82666308  
100 -22.82666662 

So, how can I modify the code so that I do not have to modify the data every time?

Asked By: Derive D1

||

Answers:

You could first read your file data.txt and preprocess it by stripping the leading whitespace from each line, then save the result to data_processed.txt. After that, load it with pd.read_csv and plot the two columns of your choice, col1 and col2, against each other with plt.plot, as follows:

import pandas as pd
import matplotlib.pyplot as plt

s = """  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
  15 -22.82359691 0.14s
  20 -22.82464363 0.16s
  25 -22.82615348 0.17s
  30 -22.82641815 0.19s
  35 -22.82649347 0.21s
  40 -22.82655376 0.22s
  50 -22.82661407 0.28s
  60 -22.82663535 0.34s
  70 -22.82664864 0.42s
  80 -22.82665962 0.46s
  90 -22.82666308 0.51s
 100 -22.82666662 0.56s"""


# Write the sample data to data.txt
with open('data.txt', 'w') as f:
    f.write(s)

with open('data.txt', 'r') as f:
    data = f.read()

# Strip the leading whitespace from every line
data_processed = '\n'.join([l.lstrip() for l in data.split('\n')])

with open('data_processed.txt', 'w') as f:
    f.write(data_processed)

df = pd.read_csv('data_processed.txt', sep=' ', header=None)
col1 = 0
col2 = 1
plt.plot(df[col1], df[col2])
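
If the intermediate file is not needed, the same cleanup can probably be done in memory. The following is only a minimal sketch under the assumption that the file looks like the sample in the question; the helper name load_stripped and the file name data_sample.txt are made up for illustration, and the shortened sample is written to disk just to keep the example self-contained.

```python
import io

import pandas as pd

# Shortened copy of the question's data (assumption: same layout as data.txt)
SAMPLE = """  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
 100 -22.82666662 0.56s
"""


def load_stripped(path):
    """Read a space-separated file, dropping surrounding whitespace per line."""
    with open(path) as f:
        # Strip each line and skip lines that are empty after stripping
        lines = [line.strip() for line in f if line.strip()]
    # Hand the cleaned text to pandas through an in-memory buffer
    return pd.read_csv(io.StringIO("\n".join(lines)), sep=" ", header=None)


# Hypothetical file name, used only for this example
with open("data_sample.txt", "w") as f:
    f.write(SAMPLE)

df = load_stripped("data_sample.txt")
```

The resulting df can be plotted the same way as above, with plt.plot(df[0], df[1]).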


Answered By: Michael Hodel

You can create a DataFrame from a text file using pandas read_csv, which can simplify future processing of the data, besides plotting it.

In this case, the tricky part is the whitespace, which can be handled by setting the optional parameter sep to the regular expression '\s+' (one or more whitespace characters):

import pandas as pd

df = pd.read_csv('data.txt', sep=r'\s+', header=None, names=['foo', 'bar', 'baz'])
>>> df
    foo           bar    baz
0    10  -22.82215289  0.11s
1    12  -22.81978265  0.14s
2    15  -22.82359691  0.14s
3    20  -22.82464363  0.16s
4    25  -22.82615348  0.17s
5    30  -22.82641815  0.19s
6    35  -22.82649347  0.21s
7    40  -22.82655376  0.22s
8    50  -22.82661407  0.28s
9    60  -22.82663535  0.34s
10   70  -22.82664864  0.42s
11   80  -22.82665962  0.46s
12   90  -22.82666308  0.51s
13  100  -22.82666662  0.56s

And then just your code:

plt.rcParams["figure.dpi"]=150
plt.rcParams["figure.facecolor"]="white"
plt.plot(df['foo'], df['bar'], "o-", markersize=5, label='Etot')
plt.xlabel('ecut')
plt.ylabel('Etot')
plt.legend(frameon=False)
plt.savefig("fig.png")


I set the names of the columns to arbitrary strings. You can omit the names argument and just refer to the columns as df[0] and df[1].
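
For comparison, NumPy alone may also be enough here, which stays closer to the code in the question. This is a minimal sketch, assuming the three-column layout shown above; the file name calc_sample.dat is hypothetical and the shortened sample is written out only so the example runs on its own. It relies on np.loadtxt splitting on any whitespace when no delimiter is given, on usecols to pick the columns, and on unpack=True to return one array per column.

```python
import numpy as np

# Shortened copy of the question's data (assumption: same layout as the real file)
SAMPLE = """  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
 100 -22.82666662 0.56s
"""

# Hypothetical file name, used only for this example
with open("calc_sample.dat", "w") as f:
    f.write(SAMPLE)

# With no delimiter given, loadtxt splits on runs of whitespace, so the
# leading spaces are ignored. usecols skips the third column, whose
# trailing "s" could not be parsed as a float.
x, y = np.loadtxt("calc_sample.dat", usecols=(0, 1), unpack=True)
```

x and y can then be passed to plt.plot(x, y, "o-", markersize=5, label='Etot') as in the question's code.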

Answered By: Ignatius Reilly