SAX Parser in Python


I am parsing XML files in a folder using the Python SAX parser and writing the output to a CSV file using pandas, but I am getting only the data from the last file in the CSV.

I am new to Python and this is my first time trying SAX parsing.

File read:

for dirpath, dirs, files in os.walk(fp1):
    for filename in files:
        fname = os.path.join(dirpath, filename)
        if fname.endswith('.xml'):
            #for count in files:
 def characters(self, content):
        rows = []
        cols = ["ReporterCite","DecisionDate","CaseName","FileNum","CourtLocation","CourtName","CourtAbbrv","Judge","CaseLength","CourtCite","ParallelCite","CitedCount","UCN"]
        #ReporteCite, DecisionDate, CaseName, FileNum, CourtLocation, CourtName, CourtAbbrv, Judge, CaseLength, CourtCite, ParallelCite, CitedCount, UCN             

                     "DecisionDate": self.dd,
                     "CaseName": self.can,
                     "FileNum": self.fn,
                     "CourtLocation": self.loc,
                     "Judge": self.j,   
                     "ParallelCite": self.pc,
                     "UCN": self.rn})

        df = pd.DataFrame(rows, columns=cols)
Asked By: PythonKS



I assume you are overwriting your previous result on every file, so only the last one survives. This is a pandas question, not a SAX question. You would like to append to the existing CSV, right? If that is the case, you have to use mode='a', like
df.to_csv('filename.csv', mode='a')
For more options, see the documentation:

  • 'w' open for writing, truncating the file first (default)
  • 'x' open for exclusive creation, failing if file already exists
  • 'a' open for writing, appending to the end of file if it exists
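
One thing to watch with mode='a' is that to_csv writes the header row on every call by default, so the column names would be repeated once per file. Here is a minimal sketch of one way to handle that, assuming each parsed file yields a list of dicts. The helper name append_rows and the shortened COLS list are made up for illustration and are not from the question's code.

```python
import os

import pandas as pd

# Hypothetical, shortened column list; the question uses a longer one.
COLS = ["CaseName", "DecisionDate"]


def append_rows(rows, csv_path, cols=COLS):
    """Append a list of row dicts to csv_path.

    The header is written only when the file does not exist yet,
    so repeated calls produce a single header line.
    """
    df = pd.DataFrame(rows, columns=cols)
    write_header = not os.path.exists(csv_path)
    # mode="a" appends instead of truncating the file on each call
    df.to_csv(csv_path, mode="a", header=write_header, index=False)
```

You would call append_rows once per parsed XML file inside the os.walk loop. An alternative that avoids append mode entirely is to keep a single rows list outside the loop, extend it for every file, and call to_csv once at the end.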
Answered By: Paul-ET