Python – Open file in default program and save with default program extension (or the like)
Question:
I’m currently trying to do the following:
- Open up an .xml file that’s already in spreadsheet format with Excel
- Save the .xml file as .xlsx without corrupting the file
Other options that I can take via Python are:
- Convert the .xml to .xlsx
- Copy specific columns (A1:AC6000) to another Excel workbook
- Import an XML file directly in an Excel workbook.
I failed at all of them and can’t think of a different way so here I am asking for help. My latest code is here:
# importing openpyxl module
import openpyxl as xl
# opening the source excel file
# (raw strings stop "\U" in a Windows path from being parsed as an escape)
file = r'C:\Users\ddejean\Desktop\HESKlogin\Downloads\data.xlsx'
wb1 = xl.load_workbook(file)
ws1 = wb1['Sheet1']
# opening the destination excel file
filename1 = r'C:\Users\ddejean\Desktop\HESKlogin\Downloads\updated.xlsx'
wb2 = xl.load_workbook(filename1)
ws2 = wb2['Sheet1']
# calculate total number of rows and
# columns in source excel file
mr = ws1.max_row
mc = ws1.max_column
# copying the cell values from source
# excel file to destination excel file
for i in range(1, mr + 1):
    for j in range(1, mc + 1):
        # reading cell value from source excel file
        c = ws1.cell(row=i, column=j)
        # writing the read value to destination excel file
        ws2.cell(row=i, column=j).value = c.value
# saving the destination excel file
wb2.save(filename1)
I also tried changing the file extension, which ultimately corrupted the file:
import os
A = r"C:\Users\ddejean\Desktop\HESKlogin\Downloads\data.xml"
pre, ext = os.path.splitext(A)
# os.rename only changes the name (and returns None); the contents stay XML
os.rename(A, pre + ".xlsx")
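A likely reason the rename produces a "corrupt" file: an .xlsx workbook is a ZIP container, while the renamed file still holds plain XML text. The sketch below reproduces that on a throwaway file. The helper name looks_like_xlsx is made up for illustration, and the check for an "xl/workbook.xml" entry assumes the layout that Excel typically writes.

```python
import os
import tempfile
import zipfile


def looks_like_xlsx(path):
    """Return True if the file is a ZIP archive containing xl/workbook.xml.

    Hypothetical helper: it assumes the usual layout of an Excel-written .xlsx.
    """
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "xl/workbook.xml" in archive.namelist()


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for data.xml: a plain-text XML file
    xml_path = os.path.join(tmp, "data.xml")
    with open(xml_path, "w", encoding="utf-8") as handle:
        handle.write('<?xml version="1.0"?><Workbook></Workbook>')

    # Same rename as in the question: only the file name changes
    renamed_path = os.path.splitext(xml_path)[0] + ".xlsx"
    os.rename(xml_path, renamed_path)

    renamed_is_xlsx = looks_like_xlsx(renamed_path)
    print(renamed_is_xlsx)
```

So the file has to be re-saved in the new format by something that understands both formats (Excel itself, or a library), not just renamed.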
I tried importing the file into Excel, which went badly because none of the data in the XML has properly named attributes to tell the fields apart. I also tried calling a macro, but every macro throws an error on my network, so I dropped that option.
Any assistance you can give would be much appreciated! I also think it’s important to say that I’m a noob.
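On the import problem: if the file is in the Excel 2003 "XML Spreadsheet" format (an assumption, since the question does not show the file), the cells are positional inside Worksheet/Table/Row/Cell/Data elements, so named attributes are not needed to read them. A minimal sketch using only the standard library follows. SAMPLE is invented data, and spreadsheetml_rows and rows_to_csv are hypothetical helper names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by the Excel 2003 XML Spreadsheet format
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}

# Invented sample data, standing in for the real export
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">id</Data></Cell>
        <Cell><Data ss:Type="String">subject</Data></Cell>
        <Cell><Data ss:Type="String">owner</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="Number">1</Data></Cell>
        <Cell><Data ss:Type="String">Printer</Data></Cell>
        <Cell><Data ss:Type="String">Bob</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="Number">2</Data></Cell>
        <Cell ss:Index="3"><Data ss:Type="String">Ann</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""


def spreadsheetml_rows(xml_text, sheet_index=0):
    """Return the cell texts of one worksheet as a list of rows."""
    root = ET.fromstring(xml_text)
    worksheet = root.findall("ss:Worksheet", NS)[sheet_index]
    rows = []
    for row in worksheet.iterfind("ss:Table/ss:Row", NS):
        values = []
        for cell in row.iterfind("ss:Cell", NS):
            # ss:Index (1-based) means empty cells were skipped; pad to stay aligned
            index = cell.get(f"{{{SS}}}Index")
            if index is not None:
                values.extend([""] * (int(index) - 1 - len(values)))
            data = cell.find("ss:Data", NS)
            values.append((data.text or "") if data is not None else "")
        rows.append(values)
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


rows = spreadsheetml_rows(SAMPLE)
print(rows_to_csv(rows))
```

This sketch ignores merged cells, styles and the ss:Index attribute on rows, so treat it as a starting point rather than a full converter.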
Answers:
This works for me 🙂
import os
import win32com.client as win32
# raw strings so the backslashes in the Windows paths are taken literally
folder = r"C:\Users\ddejean\Desktop\TEST"
xmlfile = os.path.join(folder, "hesk.xml")
csvfile = os.path.join(folder, "output.csv")
# EXCEL COM TO SAVE EXCEL XML AS CSV
# remove any earlier output so SaveAs does not hit an existing file
if os.path.exists(csvfile):
    os.remove(csvfile)
wb = None
excel = None
try:
    excel = win32.gencache.EnsureDispatch('Excel.Application')
    wb = excel.Workbooks.OpenXML(xmlfile)
    wb.SaveAs(csvfile, 6)  # 6 = xlCSV
    wb.Close(True)
except Exception as e:
    print(e)
finally:
    # RELEASES RESOURCES
    wb = None
    excel = None
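The code above writes a CSV, while the question asked for .xlsx. The same SaveAs call should cover that by passing file format 51 (xlOpenXMLWorkbook) instead of 6 (xlCSV). The sketch below wraps this in two hypothetical helpers, file_format_for and convert_with_excel. It assumes Windows with Excel and pywin32 installed, and the COM part has not been run against the asker's file.

```python
import os

# Excel XlFileFormat values accepted by Workbook.SaveAs
XL_CSV = 6
XL_OPEN_XML_WORKBOOK = 51  # .xlsx

FILE_FORMATS = {".csv": XL_CSV, ".xlsx": XL_OPEN_XML_WORKBOOK}


def file_format_for(path):
    """Pick the SaveAs file format code from the output extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in FILE_FORMATS:
        raise ValueError(f"unsupported output extension: {ext!r}")
    return FILE_FORMATS[ext]


def convert_with_excel(xml_path, out_path):
    """Open an Excel XML file through COM and save it as .csv or .xlsx.

    Hypothetical helper; needs Windows, Excel and pywin32.
    """
    # Imported here so the rest of the module loads on machines without pywin32
    import win32com.client as win32

    file_format = file_format_for(out_path)
    if os.path.exists(out_path):
        os.remove(out_path)

    excel = win32.gencache.EnsureDispatch("Excel.Application")
    excel.DisplayAlerts = False  # suppress Excel's confirmation dialogs
    wb = None
    try:
        wb = excel.Workbooks.OpenXML(xml_path)
        wb.SaveAs(out_path, file_format)
    finally:
        if wb is not None:
            wb.Close(False)  # already saved by SaveAs
        # Quit closes this Excel instance; skip it if Excel was already open
        excel.Quit()


# Example call (paths taken from the answer above, output name changed):
# convert_with_excel(r"C:\Users\ddejean\Desktop\TEST\hesk.xml",
#                    r"C:\Users\ddejean\Desktop\TEST\output.xlsx")
```

Keeping the extension-to-format lookup separate from the COM call means a mismatched extension fails early, before Excel is started.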