Pandas: How to read a public CSV file from Google Drive?

Question:

I searched for similar questions about reading a CSV from a URL, but I could not find a way to read a CSV file hosted on Google Drive.

My attempt:

import pandas as pd

url = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
dfs = pd.read_html(url)

How can we read this file in pandas?

Asked By: BhishanPoudel

Answers:

I would recommend using the following code:

import pandas as pd
import requests
from io import StringIO

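# Direct download link copied from the browser; links of this form may expire
response = requests.get('https://doc-0g-78-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/5otus4mg51j69f99n47jgs0t374r46u3/1560607200000/09837260612050622056/*/0B6GhBwm5vaB2ekdlZW5WZnppb28?e=download')
# Wrap the downloaded text in a file-like object so read_csv can parse it
csv_raw = StringIO(response.text)
dfs = pd.read_csv(csv_raw)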

Hope this helps.

Answered By: Nazim Kerimbekov

This worked for me

import pandas as pd
url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
url='https://drive.google.com/uc?id=' + url.split('/')[-2]
df = pd.read_csv(url)
Answered By: BhishanPoudel

To read a CSV file from Google Drive, you can do the following:

import pandas as pd

url = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)

I think this is the easiest way to read CSV files from Google Drive.
It assumes the file's sharing option is set to "Anyone with the link".

Answered By: Samir Mughal

Simply change the Google Drive URL to the uc?id= form, and then pass it to the read_csv function. In this example:

import pandas as pd

url = 'https://drive.google.com/uc?id=0B6GhBwm5vaB2ekdlZW5WZnppb28'
dfs = pd.read_csv(url)
Answered By: rusiano

Using pandas

import pandas as pd

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
file_id=url.split('/')[-2]
dwn_url='https://drive.google.com/uc?id=' + file_id
df = pd.read_csv(dwn_url)
print(df.head())

Using pandas and requests

import pandas as pd
import requests
from io import StringIO

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'

file_id = url.split('/')[-2]
dwn_url='https://drive.google.com/uc?export=download&id=' + file_id
url2 = requests.get(dwn_url).text
csv_raw = StringIO(url2)
df = pd.read_csv(csv_raw)
print(df.head())

Output:

      sex   age state  cheq_balance  savings_balance  credit_score  special_offer
0  Female  10.0    FL       7342.26          5482.87           774           True
1  Female  14.0    CA        870.39         11823.74           770           True
2    Male   0.0    TX       3282.34          8564.79           605           True
3  Female  37.0    TX       4645.99         12826.76           608           True
4    Male   NaN    FL           NaN          3493.08           551          False
Answered By: BhishanPoudel
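
One caveat with the requests variant above: if the file is not shared publicly, Drive may answer with an HTML page (for example a sign-in page) instead of the CSV, and read_csv will then fail or parse garbage. Below is a minimal sketch of a guard against that. The names looks_like_html and read_drive_csv are hypothetical helpers introduced here, not part of pandas or requests, and the check is a simple heuristic rather than a complete validation.

```python
from io import StringIO


def looks_like_html(body: str) -> bool:
    """Heuristic: True if the response body starts like an HTML page."""
    head = body.lstrip()[:200].lower()
    return head.startswith('<!doctype html') or head.startswith('<html')


def read_drive_csv(file_id: str):
    """Download a publicly shared Drive file and parse it as CSV (sketch)."""
    # Imported lazily so looks_like_html() stays usable without these packages.
    import pandas as pd
    import requests

    dwn_url = 'https://drive.google.com/uc?export=download&id=' + file_id
    response = requests.get(dwn_url, timeout=30)
    response.raise_for_status()  # fail early on 4xx/5xx responses
    if looks_like_html(response.text):
        raise ValueError('Drive returned an HTML page; check the sharing setting')
    return pd.read_csv(StringIO(response.text))
```

With the file ID from the question, this would be called as read_drive_csv('0B6GhBwm5vaB2ekdlZW5WZnppb28'), assuming the file is still shared with "Anyone with the link".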

Here is a similar implementation using R:

library(tidyverse)

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
file_id=nth(strsplit(url, split = "/")[[1]], -2)
dwn_url=paste('https://drive.google.com/uc?id=',file_id,sep = "")
df = read_csv(dwn_url)

head(df)
Answered By: Ben Allen

If you're using Google Colab, you can add the file to your Drive and type the following (default folder names):

df = pd.read_csv('/content/drive/MyDrive/.../your_file.csv')
Answered By: toribicks

If you are using Google Colab as your notebook, you can mount the Drive directly and then copy the path of the file:

    df = pd.read_csv('/content/drive/MyDrive/Dataset/dataset.csv')
    df.head()
Answered By: Vaibhav Gaware
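
The two Colab answers above mention mounting the Drive but do not show that step. A minimal sketch follows, assuming the default mount point /content/drive and the example folder and file names Dataset/dataset.csv from the answer above. The google.colab module is only importable inside a Colab runtime, so the import is guarded here.

```python
from pathlib import PurePosixPath

try:
    # Only available inside a Google Colab runtime.
    from google.colab import drive
except ImportError:
    drive = None  # not running in Colab; nothing will be mounted

MOUNT_POINT = '/content/drive'
if drive is not None:
    # Asks for authorization the first time it runs in a session.
    drive.mount(MOUNT_POINT)

# Folder and file names are the example values from the answer above.
csv_path = str(PurePosixPath(MOUNT_POINT) / 'MyDrive' / 'Dataset' / 'dataset.csv')
print(csv_path)
```

Once the mount has succeeded, the resulting csv_path can be passed to pd.read_csv as in the answers above.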

The other answers are great for reading a publicly accessible file, but if you are trying to read a private file that has been shared with an email account, you may want to consider using PyDrive.

There are many ways to authenticate (OAuth, a GCP service account, etc.). Once authenticated, reading a CSV can be as simple as getting the file ID and fetching its contents:

from io import StringIO

import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

# Assuming authentication has been performed and stored in a variable called gauth,
# and that file_id holds the ID of the CSV file
drive = GoogleDrive(gauth)

# Build a file handle directly from its ID; a list query is not needed for a known ID
gdrive_csv_file = drive.CreateFile({'id': file_id})
input_csv = StringIO(gdrive_csv_file.GetContentString())

df = pd.read_csv(input_csv)
Answered By: arredond

Google has since changed the query string part of the share URL (it is now usp=share_link).
The following code works now:

import pandas as pd
url="https://drive.google.com/file/d/1a7qwzU2mbaJPkFQZMJCkdE37Ne2DbgHA/view?usp=share_link"
reconstructed_url='https://drive.google.com/uc?id=' + url.split('/')[-2]
df = pd.read_csv(reconstructed_url)
df
Answered By: BhaskarT
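
Since the file ID sits in the path of the share link rather than in its query string, the same conversion works for both the usp=sharing and usp=share_link forms. Here is a small sketch of a helper that extracts the ID with a regular expression instead of relying on the position of the path segment. The function name drive_download_url and the pattern are assumptions made for illustration, and only links of the /file/d/<id>/ shape seen in this thread are handled.

```python
import re

# Matches the ID segment that follows "/d/" in a Drive share link.
_FILE_ID_PATTERN = re.compile(r'/d/([A-Za-z0-9_-]+)')


def drive_download_url(share_url: str) -> str:
    """Turn a Drive share link into a uc?export=download link (sketch)."""
    match = _FILE_ID_PATTERN.search(share_url)
    if match is None:
        raise ValueError(f'No Drive file ID found in: {share_url}')
    return 'https://drive.google.com/uc?export=download&id=' + match.group(1)


old_style = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
new_style = 'https://drive.google.com/file/d/1a7qwzU2mbaJPkFQZMJCkdE37Ne2DbgHA/view?usp=share_link'
print(drive_download_url(old_style))
print(drive_download_url(new_style))
```

The returned string can be passed straight to pd.read_csv, as in the answers above.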