Creating a .csv from a combination of strings and pandas DataFrames

Question

My .csv contains multiple blocks and needs to follow this format (sample of one block):

I am trying to build it in pandas and then write it to CSV. The problem is the comment line above each of the two sections, which sits outside the DataFrames. Here is sample code:

import numpy as np
import pandas as pd
h_comment = pd.DataFrame(['#(H) Header'], columns=['name'])

df1 = pd.DataFrame({'name': 'Donald Trump',
                    'state':'FL',
                    'value':'0'},
                   index=[0])


data_comment = pd.DataFrame(['#(S) Schedule'], columns=['A'])
df2 = pd.DataFrame(np.random.rand(3, 4),
                   columns=list('ABCD'))

to_csv1 = pd.concat([h_comment,df1])
to_csv2 = pd.concat([data_comment,df2])

The issue is that those "comments" end up inside my DataFrame columns, for example:

to_csv2
Out[116]: 
               A         B         C         D
0  #(S) Schedule       NaN       NaN       NaN
0       0.521739  0.622079  0.322372  0.687531
1       0.991336  0.297848  0.635697  0.025620
2       0.068900  0.898806  0.562971  0.567817

Creating the .csv with the comments first and then appending each DataFrame to it is not a great solution: there are many blocks like the one above, which hurts performance, so I'd rather write to CSV once at the end.

Asked By: gregV

Answer 1

The image you shared looks more like an Excel spreadsheet than a csv file.

To make a csv that matches the shape you described, one option is to combine open with to_csv, so the comment lines and the DataFrames go through the same file handle:

N = 2 # number of empty lines between both dfs

with open("output.csv", mode="w", newline="") as file:
    file.write("#(H) Header\n")
    df1.to_csv(file, index=False)
    file.write("\n"*N)
    file.write("#(S) Schedule\n")
    df2.to_csv(file, index=False)

Output (.csv in Excel) :
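Since the question mentions many such blocks and a preference for writing once at the end, the same idea can be generalized. The following is only a minimal sketch, not part of the original answer: the helper names blocks_to_csv_text and write_blocks and the blank_lines parameter are hypothetical, and it assumes each block is a (comment, DataFrame) pair. It builds the whole text in an in-memory buffer so the file is touched a single time.

```python
import io

import numpy as np
import pandas as pd


def blocks_to_csv_text(blocks, blank_lines=2):
    """Render (comment, DataFrame) pairs as one CSV-formatted string.

    Hypothetical helper: `blocks` is any iterable of (str, DataFrame) pairs.
    """
    buffer = io.StringIO()
    for i, (comment, df) in enumerate(blocks):
        if i > 0:
            # Separate consecutive blocks with empty lines
            buffer.write("\n" * blank_lines)
        # The comment line sits above the block, outside the DataFrame
        buffer.write(comment + "\n")
        df.to_csv(buffer, index=False)
    return buffer.getvalue()


def write_blocks(path, blocks, blank_lines=2):
    """Write all blocks to `path` in a single file write (hypothetical helper)."""
    with open(path, mode="w", newline="") as file:
        file.write(blocks_to_csv_text(blocks, blank_lines))


# Sample data mirroring the question
df1 = pd.DataFrame({"name": "Donald Trump", "state": "FL", "value": "0"},
                   index=[0])
df2 = pd.DataFrame(np.random.default_rng(0).random((3, 4)),
                   columns=list("ABCD"))

blocks = [("#(H) Header", df1), ("#(S) Schedule", df2)]
text = blocks_to_csv_text(blocks)
print(text)
```

Calling write_blocks("output.csv", blocks) would then produce the file in one pass. Whether this is noticeably faster than writing each block directly to an open file handle depends on the data, so it is worth timing both on a realistic number of blocks.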

If needed, you can use ExcelWriter to build a spreadsheet that can handle sheet/cell formatting:

with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    # add_worksheet() names the sheet "Sheet1", the default sheet_name of to_excel
    worksheet = writer.book.add_worksheet()

    header_format = writer.book.add_format({"border": None})
    title_format = writer.book.add_format({"bold": True,
                                           "italic": True,
                                           "font_size": 11})

    # First block: comment line in row 0, df1 starting at row 1
    worksheet.write(0, 0, "#(H) Header", title_format)
    df1.to_excel(writer, index=False, startrow=1)

    # Two empty rows between the blocks
    worksheet.write(len(df1)+2, 0, "")
    worksheet.write(len(df1)+3, 0, "")

    # Second block: comment line, then df2 right below it
    worksheet.write(len(df1)+4, 0, "#(S) Schedule", title_format)
    df2.to_excel(writer, index=False, startrow=len(df1)+5)

    # Rewrite the column headers with header_format instead of the pandas default style
    for col_num, value in enumerate(df2.columns):
        worksheet.write(len(df1)+5, col_num, value, header_format)

    for col_num, value in enumerate(df1.columns):
        worksheet.write(1, col_num, value, header_format)

    worksheet.autofit()

Output (.xlsx in Excel) :

Answered By: Timeless
