Converting xlsx to Google Sheets in Jupyter

Question:

I’m trying to open a xlsx file from Google Drive as a Google Sheets file in Jupyter.

from googleapiclient.discovery import build
from google.oauth2 import service_account

SERVICE_ACCOUNT_FILE = 'gs_credentials.json'
SCOPES = ['https://www.googleapis.com/auth/spreadsheets']

creds = None
creds = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES)

SAMPLE_SPREADSHEET_ID = 'someidhere'
RANGE = 'somerangehere'

service = build('sheets', 'v4', credentials=creds)

sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId=SAMPLE_SPREADSHEET_ID,
                            range=RANGE).execute()
values = result.get('values', [])

this script work for reading data from a Google Sheet, but i have one that’s saved as xlsx. Is there a way to convert it? I found a couple answers that went something like:

service = build('drive', 'v3', credentials=creds)
service.files().copy(fileId=ID_OF_THE_EXCEL_FILE, body={"mimeType"="application/vnd.google-apps.spreadsheet"}).execute()

from How do I solve a Google Sheet API error in Python

but this doesn’t seem work for me. (i think the syntax is broken… tried to fix it, but couldn’t figure it out)

Asked By: Raksha

||

Answers:

Syntax error:

You are in the right direction on the converting xlsx to spreadsheet.Try changing the = to : in mimeType.

Correct syntax should be body={"mimeType":"application/vnd.google-apps.spreadsheet"}

After converting the xlsx file to google spreadsheet you can then use the spreadsheet service to obtain the values on the file. Please see sample code below.

Your Code should look like this:

from googleapiclient.discovery import build
from google.oauth2 import service_account
import pandas as pd

SERVICE_ACCOUNT_FILE = 'gs_credentials.json'
SCOPES = ['https://www.googleapis.com/auth/spreadsheets']

creds = None
creds = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES)

SAMPLE_SPREADSHEET_ID = 'someidhere'
RANGE = 'somerangehere'

# Build the services
service_drive = build('drive', 'v3', credentials=creds)
service_sheets = build('sheets', 'v4', credentials=creds)


# Converting xlsx file to google spreadsheet

xlsx_file = service_drive.files().copy(fileId=SAMPLE_SPREADSHEET_ID, body= 
{"mimeType":"application/vnd.google-apps.spreadsheet"}).execute()

# extracting the new converted spreadsheet Id
new_id = xlsx_file.get('id', [])

# Getting the values using the spreadsheet service
result =   service_sheets.spreadsheets().values().get(spreadsheetId=new_id,range = RANGE).execute()
values = result.get('values', [])

#converting values to dataframe using pandas
df = pd.DataFrame(values)
df

My sample Code on Google Colaboratory:

from googleapiclient.discovery import build
from google.colab import auth
from google.auth import default
import pandas as pd

auth.authenticate_user()
creds, _ = default()

# Edits starts here from your existing code
# Build the services
service_drive = build('drive', 'v3', credentials=creds)
service_sheets = build('sheets', 'v4', credentials=creds)

# xlsx file Id
file_id = 'Id here'
range = "Range here"

# Converting xlsx file to google spreadsheet

xlsx_file = service_drive.files().copy(fileId=file_id, body={"mimeType":"application/vnd.google-apps.spreadsheet"}).execute()

# extracting the new converted spreadsheet Id
new_id = xlsx_file.get('id', [])

# Getting the values using the spreadsheet service
result = service_sheets.spreadsheets().values().get(spreadsheetId=new_id,range = range).execute()
values = result.get('values', [])

#converting values to dataframe using pandas
df = pd.DataFrame(values)
df

Note: Authentication may differ from you since I am running this code on google colab

Output:

enter image description here

References:

https://developers.google.com/drive/api/v3/reference/files/copy
https://developers.google.com/sheets/api/guides/values

Answered By: George