Creating a .csv from a combination of strings and pandas DataFrames

Question:

My .csv contains multiple blocks, and each block needs to follow this format (one sample block):

[image: sample block]

I am trying to build it in pandas and then write it to csv. The problem is the comment lines above each of the two sections, which sit outside the DataFrames. Here is sample code:

import numpy as np
import pandas as pd
h_comment = pd.DataFrame(['#(H) Header'], columns=['name'])

df1 = pd.DataFrame({'name': 'Donald Trump',
                    'state':'FL',
                    'value':'0'},
                   index=[0])


data_comment = pd.DataFrame(['#(S) Schedule'], columns=['A'])
df2 = pd.DataFrame(np.random.rand(3,4),
                   columns=list('ABCD'))

to_csv1 = pd.concat([h_comment,df1])
to_csv2 = pd.concat([data_comment,df2])

The issue is that those "comments" end up inside my DataFrame columns, for example:

to_csv2
Out[116]: 
               A         B         C         D
0  #(S) Schedule       NaN       NaN       NaN
0       0.521739  0.622079  0.322372  0.687531
1       0.991336  0.297848  0.635697  0.025620
2       0.068900  0.898806  0.562971  0.567817

Creating the .csv with the comments first and then appending the DataFrames to it is not great: there are many blocks like the one above, which hurts performance, so I'd rather write to csv once at the end.

Asked By: gregV


Answers:

The image you shared looks more like an Excel spreadsheet than a csv file.

To make a csv that matches the shape you described, one option is to combine open with to_csv:

N = 2  # number of empty lines between both dfs

with open("output.csv", mode="w", newline="") as file:
    file.write("#(H) Header\n")
    df1.to_csv(file, index=False)
    file.write("\n" * N)
    file.write("#(S) Schedule\n")
    df2.to_csv(file, index=False)

Output (.csv in Excel):

[image: output.csv opened in Excel]
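
Since the question mentions many such blocks, the same idea can be wrapped in a small loop. The following is only a sketch: `write_blocks`, `blocks` and `blank_lines` are hypothetical names introduced for illustration, and it writes into an in-memory `io.StringIO` buffer so the result can be inspected before anything touches the disk.

```python
import io

import numpy as np
import pandas as pd


def write_blocks(target, blocks, blank_lines=2):
    """Write (comment, DataFrame) pairs to an open text handle, block by block.

    `write_blocks` is a hypothetical helper, not part of pandas.
    """
    for i, (comment, df) in enumerate(blocks):
        if i > 0:
            target.write("\n" * blank_lines)  # empty lines between blocks
        target.write(comment + "\n")          # comment line above the table
        df.to_csv(target, index=False)        # column names, then the rows


# Sample data mirroring the question
df1 = pd.DataFrame({"name": ["Donald Trump"], "state": ["FL"], "value": ["0"]})
df2 = pd.DataFrame(np.random.rand(3, 4), columns=list("ABCD"))

blocks = [("#(H) Header", df1), ("#(S) Schedule", df2)]

buffer = io.StringIO()
write_blocks(buffer, blocks)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a handle from open("output.csv", mode="w", newline="") instead of the buffer should write the same text to a file in a single pass, with the file opened only once for all blocks.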

If needed, you can use ExcelWriter to make a spreadsheet that can handle sheet/cell formatting:

with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    worksheet = writer.book.add_worksheet()
    
    header_format = writer.book.add_format({"border": None})
    title_format = writer.book.add_format({"bold": True,
                                           "italic": True,
                                           "font_size": 11})

    worksheet.write(0, 0, "#(H) Header", title_format)
    df1.to_excel(writer, index=False, startrow=1)
     
    worksheet.write(len(df1)+2, 0, "")
    worksheet.write(len(df1)+3, 0, "")
    
    worksheet.write(len(df1)+4, 0, "#(S) Schedule", title_format)
    df2.to_excel(writer, index=False, startrow=len(df1)+5)
    
    for col_num, value in enumerate(df2.columns):
        worksheet.write(len(df1)+5, col_num, value, header_format)
    
    for col_num, value in enumerate(df1.columns):
        worksheet.write(1, col_num, value, header_format)
        
    worksheet.autofit()

Output (.xlsx in Excel):

[image: output.xlsx opened in Excel]

Answered By: Timeless