Invalid file path or buffer object type: <class 'win32com.client.CDispatch'> : Outlook extract csv to python pandas dataframe

Question:

I wrote this code, which goes through some Outlook subfolders, and I want to load the attachments, which are CSV files, into pandas DataFrames and then append them to a list.

But I believe Python isn’t able to read the attachments as CSV? I’m a bit lost here.

import win32com.client
from win32com.client import Dispatch
import pandas as pd


list_dataframes=[]
list_dataframes_names=[]

outlook = Dispatch("Outlook.Application").GetNamespace("MAPI")
root_folder = outlook.Folders.Item(1)
folder = root_folder.Folders['Project'].Folders['CLIENT XXX']
    
for item in root_folder.Folders['Inbox'].Items:
    if 'Client XXX' in str(item.subject):
        for attachment in item.Attachments:
            list_dataframes_names.append(attachment.FileName)
            pandas_data_frame=pd.read_csv(attachment, sep=';',encoding='latin-1', on_bad_lines='skip')
            list_dataframes.append(pandas_data_frame)

The error comes from this line of code:

pandas_data_frame=pd.read_csv(attachment, sep=';',encoding='latin-1', on_bad_lines='skip')
ValueError: Invalid file path or buffer object type: <class 'win32com.client.CDispatch'>

Is there a way to make this work, or some workaround? I don’t want to download the files to a local folder and then load them from there. I believe that would work, but it’s a last resort.

Asked By: Pedro Gomes


Answers:

pd.read_csv expects a file path or a file-like object, but the code passes the Attachment COM object itself, hence the error. Passing Attachment.FileName would not help either: it is just the file name (e.g. "myfile.txt") with no path component, and the file does not exist on the file system, so you cannot read it as a file. You need to save it as a file first (Attachment.SaveAsFile), specifying the full path and the file name, and only after that can you read the file contents.

The Attachment.SaveAsFile method should be used to save the attached file to disk. Only then can you pass the file path to the pandas code.

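import os
import tempfile
save_dir = tempfile.mkdtemp()  # assumed location; any writable folder works

for item in root_folder.Folders['Inbox'].Items:
    if 'Client XXX' in str(item.subject):
        for attachment in item.Attachments:
            # SaveAsFile needs a full path; the numeric prefix keeps each name unique
            file_path = os.path.join(save_dir, f"{len(list_dataframes)}_{attachment.FileName}")
            attachment.SaveAsFile(file_path)
            list_dataframes_names.append(file_path)
            pandas_data_frame = pd.read_csv(file_path, sep=';', encoding='latin-1', on_bad_lines='skip')
            list_dataframes.append(pandas_data_frame)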
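
If the saved files are only needed long enough to load them, one option is to save each attachment into a temporary folder that is deleted again right after reading. The following is a minimal sketch under that assumption; `saved_attachment` is a hypothetical helper name, not part of Outlook or pandas, and it only relies on the attachment exposing `FileName` and `SaveAsFile`.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def saved_attachment(attachment):
    """Save an attachment to a temporary folder and yield its full path.

    Hypothetical helper: the folder and the file are deleted when the
    `with` block ends, so nothing is left behind on disk.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # basename() guards against a FileName that contains path separators
        file_path = os.path.join(tmp_dir, os.path.basename(attachment.FileName))
        attachment.SaveAsFile(file_path)
        yield file_path


# Possible use inside the attachment loop (assumes pandas is imported as pd):
# with saved_attachment(attachment) as path:
#     df = pd.read_csv(path, sep=';', encoding='latin-1', on_bad_lines='skip')
```

The DataFrame has to be created inside the `with` block, because the file no longer exists once the block is left.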

Remember to use a unique file name for each attachment; otherwise every attachment is saved to the same path and each save overwrites the previous one, leaving a single file on disk. For example, the following VBA sample provided by MS shows how to build the destination path from the attachment's name:

Sub SaveAttachment() 
 Dim myInspector As Outlook.Inspector
 Dim myItem As Outlook.MailItem 
 Dim myAttachments As Outlook.Attachments
 Set myInspector = Application.ActiveInspector 
 If Not TypeName(myInspector) = "Nothing" Then
   If TypeName(myInspector.CurrentItem) = "MailItem" Then 
     Set myItem = myInspector.CurrentItem 
     Set myAttachments = myItem.Attachments 
     'Prompt the user for confirmation 
     Dim strPrompt As String 
     strPrompt = "Are you sure you want to save the first attachment in the current item to the Documents folder? If a file with the same name already exists in the destination folder, it will be overwritten with this copy of the file." 
     If MsgBox(strPrompt, vbYesNo + vbQuestion) = vbYes Then 
       myAttachments.Item(1).SaveAsFile Environ("HOMEPATH") & "\My Documents\" & _ 
       myAttachments.Item(1).DisplayName 
     End If 
   Else 
     MsgBox "The item is of the wrong type." 
   End If 
 End If 
End Sub
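
The VBA sample overwrites an existing file with the same name. In Python, one way to avoid that is to append a counter to the name until the path is free. This is only a sketch; `unique_path` is a hypothetical helper, and the `_1`, `_2` suffix scheme is an assumption rather than anything Outlook prescribes.

```python
import os


def unique_path(folder, file_name):
    """Return a path inside `folder` that does not exist yet.

    Hypothetical helper: if `file_name` is already taken, a counter is
    appended before the extension (report.csv, report_1.csv, report_2.csv, ...).
    """
    stem, ext = os.path.splitext(os.path.basename(file_name))
    candidate = os.path.join(folder, stem + ext)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(folder, f"{stem}_{counter}{ext}")
        counter += 1
    return candidate
```

The result can be passed straight to `attachment.SaveAsFile(...)` and then to `pd.read_csv(...)`.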
Answered By: Eugene Astafiev