Download Google Drive attachments of an email using the Gmail API in Python

Question:

I currently use this solution to download attachments from Gmail using Gmail API via python.
However, every time an attachment exceeds 25MB, the attachments automatically get uploaded to Google Drive and the files are linked in the mail. In such cases, there is no attachmentId in the message.
I can only see the file names in the ‘snippet’ section of the message.

Is there any way I can download the Google Drive attachments from the mail?

There is a similar question posted here, but there’s no solution provided to it yet

Asked By: Biswankar Das

||

Answers:

How to download a Drive "attachment"

The "attachment" referred to is actually just a link to a Drive file, so confusingly it is not an attachment at all, but just text or HTML.

The issue here is that since it’s not an attachment as such, you won’t be able to fetch this with the Gmail API by itself. You’ll need to use the Drive API.

To use the Drive API you’ll need the file ID, which appears in the HTML content part of the message, among other parts.

You can use the re module to perform a findall on the HTML content. A pattern like the following recognizes Drive links (the dots and the question mark are escaped so that they match literally):

(?<=https://drive\.google\.com/file/d/)[\w-]+(?=/view\?usp=drive_web)
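As a quick check of the pattern, here is a minimal sketch that runs it against a made-up HTML fragment. The file ID and the markup are invented for illustration; real Gmail HTML may format the link differently, so treat the pattern as a starting point.

```python
import re

# Escaped variant of the pattern: literal dots and a literal "?".
DRIVE_LINK = re.compile(
    r"(?<=https://drive\.google\.com/file/d/)[\w-]+(?=/view\?usp=drive_web)"
)

# Hypothetical HTML fragment; the file ID below is made up.
sample_html = (
    '<a href="https://drive.google.com/file/d/1AbC_dEf-123/view?usp=drive_web">'
    "report.zip</a>"
)

file_ids = DRIVE_LINK.findall(sample_html)
print(file_ids)  # ['1AbC_dEf-123']
```

Without the backslash, "view?" would mean "vie" followed by an optional "w", so the unescaped pattern never matches a real link.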

Here is a sample Python function to get the file IDs. It returns a list (empty if no Drive links are found).

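import base64
import re

DRIVE_LINK = r'(?<=https://drive\.google\.com/file/d/)[\w-]+(?=/view\?usp=drive_web)'

def get_file_ids(service, user_id, msg_id):
    message = service.users().messages().get(userId=user_id, id=msg_id).execute()
    file_ids = []
    # Messages without a "parts" list have nothing to scan here.
    for part in message['payload'].get('parts', []):
        if part["mimeType"] == "text/html":
            b64 = part["body"]["data"].encode('UTF-8')
            # Decode the bytes to text; str(bytes) would give "b'...'" in Python 3.
            html = base64.urlsafe_b64decode(b64).decode('UTF-8')
            file_ids.extend(re.findall(DRIVE_LINK, html))
    return file_ids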
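The function above only looks at the top-level parts. Gmail messages with real attachments or inline images often nest the HTML body one or more levels deeper (for example inside a multipart/alternative part). A minimal sketch of a recursive walk over the payload, assuming the standard `mimeType`, `body.data` and `parts` fields of the Gmail message resource; `iter_html_bodies` is a hypothetical helper name:

```python
import base64

def iter_html_bodies(payload):
    """Yield decoded HTML from a Gmail message payload, including nested parts."""
    if payload.get("mimeType") == "text/html":
        data = payload.get("body", {}).get("data")
        if data:
            # Restore any missing base64 padding before decoding.
            padded = data + "=" * (-len(data) % 4)
            yield base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    # Recurse into child parts, if there are any.
    for part in payload.get("parts", []):
        yield from iter_html_bodies(part)
```

You could then run the regex over every string yielded for `message['payload']` instead of only the top-level parts.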

Once you have the IDs then you will need to make a call to the Drive API.

You could follow the example in the docs:

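import io
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

# Drive needs its own service object; "creds" is assumed to be your authorized credentials.
drive_service = build('drive', 'v3', credentials=creds)
# "service" is the Gmail service used above.
file_ids = get_file_ids(service, "me", "[YOUR_MSG_ID]")

for file_id in file_ids:
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))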
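The loop above leaves each downloaded file in memory, in `fh`. A minimal sketch for writing that buffer to disk, assuming you already know the name to save it under; `save_buffer` is a hypothetical helper, not part of either API:

```python
import io
from pathlib import Path

def save_buffer(fh, filename, directory="."):
    """Write the bytes collected in an in-memory buffer to directory/filename."""
    # Keep only the final path component so the name cannot escape the directory.
    target = Path(directory) / Path(filename).name
    target.write_bytes(fh.getvalue())
    return target
```

Calling `save_buffer(fh, "report.zip")` once the download has finished would write the file into the current directory.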

Remember, seeing as you will now be using the Drive API as well as the Gmail API, you’ll need to change the scopes in your project. Also remember to activate the Drive API in the developers console, update your OAuth consent screen, credentials and delete the local token.pickle file.
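As a rough sketch of the scope change, assuming read-only access is enough for both APIs and that the cached token is stored in token.pickle next to the script; `reset_cached_token` is a hypothetical helper name:

```python
from pathlib import Path

# Read-only scopes for both APIs; request broader scopes only if you need them.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]

def reset_cached_token(token_path="token.pickle"):
    """Delete the cached token so the next run asks for consent with the new scopes."""
    path = Path(token_path)
    if path.exists():
        path.unlink()
        return True
    return False
```

The token has to go because it was issued for the old scope list; the next authorization run will then ask for consent covering both APIs.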

Answered By: iansedano

The Drive API also has a limitation: it can only download 10 MB.

Answered By: Harsh Chaudhary