How to read csv to dataframe in Google Colab

Question:

I am trying to read a csv file which I stored locally on my machine. (Just for additional reference it is titanic data from Kaggle which is here.)

From this question and its answers I learnt that you can import data using this code, which works well for me.

from google.colab import files
uploaded = files.upload()

Where I am lost is how to convert it to dataframe from here. The sample google notebook page listed in the answer above does not talk about it.

I am trying to convert the uploaded dictionary to a dataframe using the from_dict command, but I am not able to make it work. There is some discussion on converting a dict to a dataframe here, but I don't think those solutions apply to my case.

So summarizing, my question is:

How do I convert a csv file stored locally on my files to pandas
dataframe on Google Colaboratory?

Asked By: PagMax

Answers:

Pandas read_csv should do the trick. The uploaded dictionary maps each file name to its raw bytes, so you’ll want to decode those bytes and wrap the resulting text in an io.StringIO, since read_csv expects a path or a file-like object.

Here’s a full example:
https://colab.research.google.com/notebook#fileId=1JmwtF5OmSghC-y3-BkvxLan0zYXqCJJf

The key snippet is:

import pandas as pd
import io

df = pd.read_csv(io.StringIO(uploaded['train.csv'].decode('utf-8')))
df
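
As a minimal sketch of the same idea without hard-coding the file name: the dictionary returned by files.upload() maps file names to bytes, so it can be looped over directly. The uploaded dictionary below is a stand-in with made-up rows so the snippet runs outside Colab, and frames_from_upload is a hypothetical helper name, not part of any library.

```python
import io

import pandas as pd

# Stand-in for the dict returned by files.upload(): {filename: raw bytes}.
# The rows are invented for illustration, not real Titanic records.
uploaded = {"train.csv": b"PassengerId,Survived,Name\n1,0,Braund\n2,1,Cumings\n"}


def frames_from_upload(uploaded):
    """Return {filename: DataFrame} for every uploaded CSV file."""
    # read_csv accepts a binary buffer, so BytesIO avoids a manual decode.
    return {name: pd.read_csv(io.BytesIO(data)) for name, data in uploaded.items()}


frames = frames_from_upload(uploaded)
df = frames["train.csv"]
print(df.shape)
```

In Colab, the stand-in dictionary would be replaced by the result of files.upload().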
Answered By: Bob Smith

Alternatively, you can also import files from GitHub.
You can take this as an example: https://drive.google.com/file/d/1D6ViUx8_ledfBqcxHCrFPcqBvNZitwCs/view?usp=sharing

Also, Google does not persist the file for long, so you may have to rerun the GitHub snippets from time to time.

Answered By: Diwakar

Google Colab: uploading a csv from your PC
I had the same problem with an Excel file (*.xlsx). I solved it as follows, and I think you could do the same with csv files:
– If you have a file on your PC called file.xlsx, then:
1- Upload it from your hard drive using this simple code:

from google.colab import files
uploaded = files.upload()

Press (Choose Files) and select the file; it is uploaded to your Colab session.

2- Then:

import io
# The key must match the uploaded file name exactly, including case
data = io.BytesIO(uploaded['file.xlsx'])

3- Finally, read your file:

import pandas as pd
# Adjust sheet_name, header, and skiprows to match your own workbook
df = pd.read_excel(data, sheet_name='1min', header=0, skiprows=2)
df.head()

4- Change the parameter values to suit your own file. I think this approach could be generalized to read other file types!
Enjoy!

Answered By: Yasser M

This worked for me:

from google.colab import auth
auth.authenticate_user()

from pydrive.drive import GoogleDrive
from pydrive.auth import GoogleAuth
from oauth2client.client import GoogleCredentials
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

myfile = drive.CreateFile({'id': '!!!YOUR FILE ID!!!'})
myfile.GetContentFile('file.csv')

Replace !!!YOUR FILE ID!!! with the id of the file in google drive (this is the long alphanumeric string that appears when you click on “obtain link to share”). Then you can access file.csv with pandas’ read_csv:

import pandas as pd
frm = pd.read_csv('file.csv', header=None)
Answered By: JARS

step 1- Mount your Google Drive in Colaboratory

from google.colab import drive 
drive.mount('/content/gdrive')

step 2- Now you will see your Google Drive files in the left pane (file explorer). Right-click the file that you need to import and select Copy path. Then import as usual in pandas, using this copied path.

import pandas as pd
# Absolute path under the mount point used in step 1
df = pd.read_csv('/content/gdrive/My Drive/data.csv')

Done!

Answered By: Garima Jain

So, if you were not working on google colab, you would have simply written something like this:

df = pd.read_csv('path_of_the_csv_file')

In Google Colab, the only thing you have to know is the path of the csv file.

If you follow the steps that I have written below, your problem will be solved:

  1. First of all, upload the CSV file to your Google Drive.
  2. Then, open your google colab notebook and click on the ‘Files’ icon on the left
    side of the page.
  3. Then, click on the ‘Google Drive Folder’ icon to mount your Google Drive.
  4. Then, look for the csv file that you uploaded on your google drive (step 1),
    and copy its path.
  5. Once you have the path, treat it as an ordinary path and use it in your code.
    It should look something like this:
   df = pd.read_csv('/content/drive/MyDrive/File.csv')
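
A mistyped path or an unmounted Drive is a common way for step 5 to fail, so checking the path before reading can help. This is only a sketch: read_csv_checked is a hypothetical helper, and a temporary folder with made-up data stands in for the mounted Drive so the code runs anywhere.

```python
import os
import tempfile

import pandas as pd


def read_csv_checked(path):
    """Read a CSV by path, failing early with a hint if the path is missing."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"{path} not found; is Drive mounted and the path copied exactly?"
        )
    return pd.read_csv(path)


# A temporary folder stands in for the mounted Drive (assumption for the demo).
with tempfile.TemporaryDirectory() as folder:
    csv_path = os.path.join(folder, "File.csv")
    with open(csv_path, "w") as handle:
        handle.write("a,b\n1,2\n3,4\n")
    df = read_csv_checked(csv_path)

    # A path that does not exist triggers the early, descriptive error.
    try:
        read_csv_checked(os.path.join(folder, "nope.csv"))
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(df)
```

In a notebook, the path copied in step 4 would be passed to the helper instead of the temporary file.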
Answered By: Yash Vardhan Singh

This worked for me (my file was not UTF-8 encoded):

import pandas as pd
import io

df=pd.read_csv(io.StringIO(uploaded['Filename.CSV'].decode('ISO-8859-1')))
df
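
If the encoding is not known in advance, one option is to try UTF-8 first and fall back to ISO-8859-1. The sketch below assumes those two candidates; decode_with_fallback is a hypothetical helper, and the bytes are made up to stand in for uploaded['Filename.CSV'].

```python
def decode_with_fallback(raw, encodings=("utf-8", "ISO-8859-1")):
    """Return (text, encoding) using the first encoding that decodes raw."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up bytes standing in for uploaded['Filename.CSV'].
raw = "name\ncafé\n".encode("ISO-8859-1")
text, used = decode_with_fallback(raw)
print(used)
```

The returned text can then be passed to pd.read_csv(io.StringIO(text)). ISO-8859-1 accepts any byte sequence, so it belongs last in the list; a successful decode with it does not prove the file really uses that encoding.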
Answered By: Mahsaa M