Convert a pandas DataFrame to txt files in Google Colab

Question:

I have a dataset called preprocessed_sample, stored in the following format:

preprocessed_sample.ftr.zstd 

and I am opening it with the following code:

df = pd.read_feather(filepath)

The output looks something like this:

index   text
0   0   i really dont come across how i actually am an...
1   1   music has become the only way i am staying san...
2   2   adults are contradicting
3   3   exo are breathing 553 miles away from me. they...
4   4   im missing people that i met when i was hospit...

Finally, I would like to save this dataset in a folder called ‘examples’ that contains all of these texts in txt format.

Update: @Tsingis I would like each of the lines above to become its own txt file. For example, the first line ‘i really dont come across how i actually am an…’ would be a file named ‘line1.txt’, and in the same way every line would become a txt file inside a folder called ‘examples’.

Asked By: John Angelopoulos

Answers:

You can use the following code:

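import pathlib

# Create the output folder if it does not already exist
data_dir = pathlib.Path('./examples')
data_dir.mkdir(exist_ok=True)

# Write each text to its own file: line1.txt, line2.txt, ...
for i, text in enumerate(df['text'], 1):
    with open(data_dir / f'line{i}.txt', 'w', encoding='utf-8') as fp:
        fp.write(text)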

Output:

examples/
├── line1.txt
├── line2.txt
├── line3.txt
├── line4.txt
└── line5.txt

1 directory, 5 files

line1.txt:

i really dont come across how i actually am an...
Answered By: Corralien

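Another way is to use the pandas built-ins itertuples and to_csv (this assumes the ‘examples’ folder already exists and that df has the ‘index’ column shown in the question):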

import pandas as pd

for row in df.itertuples():
    pd.Series(row.text).to_csv(f"examples/line{row.index+1}.txt",
                               index=False, header=False)
Answered By: Timeless
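
Both answers rely on a df that is already loaded. For trying the one-file-per-line idea in isolation, here is a minimal, self-contained sketch of the first approach. The function name write_lines_to_files, the sample strings, and the temporary output folder are hypothetical and only for illustration. The function accepts any iterable of strings, so with the question's DataFrame you would presumably pass df['text'] and 'examples' instead.

```python
import pathlib
import tempfile


def write_lines_to_files(texts, out_dir):
    """Write each item of `texts` to out_dir/line1.txt, out_dir/line2.txt, ..."""
    out_dir = pathlib.Path(out_dir)
    # Create the folder (and any missing parents) if it is not there yet.
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(texts, start=1):
        path = out_dir / f"line{i}.txt"
        # str() guards against non-string values such as NaN in a column.
        path.write_text(str(text), encoding="utf-8")
        paths.append(path)
    return paths


# Stand-in data (hypothetical); with the question's df, pass df["text"].
sample_texts = ["adults are contradicting", "second example line"]
# A temporary folder keeps the demo from touching the working directory.
demo_dir = pathlib.Path(tempfile.mkdtemp()) / "examples"
written = write_lines_to_files(sample_texts, demo_dir)
print([p.name for p in written])
```

Writing the text directly, rather than going through to_csv, keeps the file contents identical to the original string, since to_csv applies CSV quoting to values that contain commas or quotes.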