python in Rmarkdown using reticulate cannot read packages
Question:
I am using R on a MacBook. I have an Rmarkdown document and I’m trying to use reticulate in order to use python within R.
First I load the libraries:
```{r libraries, warning = FALSE, message = FALSE}
library(dplyr)
library(reticulate)
```
Next, in an R chunk, I check my working directory and then write mtcars to my desktop.
```{r chunk, warning = FALSE, message = FALSE}
getwd()
write.csv(mtcars, '/Users/name/Desktop/mtcars.csv', row.names = TRUE)
```
Then I try to use python instead to read in that csv that I just wrote to my desktop.
```{python}
import pandas as pd
mtcars = pd.read_csv('/Users/name/Desktop/mtcars.csv')
```
But I get this error:
ModuleNotFoundError: No module named 'pandas'
NameError: name 'pd' is not defined
So I went to this R documentation website and discovered that with Python you have to install packages differently. I then opened a terminal and ran:
python -m pip install pandas
It seemed to install OK, but when I return to my Rmarkdown document I still can't get the Python code to run and read in the csv. I get the same error message.
I also saw a similar question in this SO post, but I'm certain that my RStudio version is newer than the one in that question, so I don't think the answer addresses exactly the same error.
Answers:
One option is to create a virtualenv, install the package into it, and then tell reticulate to use that virtualenv:
virtualenv_create("py-proj")
py_install("pandas", envname = "py-proj")
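As a side note on why the terminal install probably did not help: `python -m pip install pandas` installs into whichever `python` the terminal finds, and that may not be the interpreter reticulate starts. A minimal sketch for checking this, assuming you run it once in a `{python}` chunk and once with the terminal's `python` and compare the two (`module_location` is just a helper name made up for this example):

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a top-level module would be imported from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter running this code; compare it with what the terminal's python reports.
print("interpreter:", sys.executable)
# None means this interpreter cannot see pandas, wherever else it may be installed.
print("pandas:", module_location("pandas"))
```

If the two runs print different interpreters, pandas was most likely installed for the terminal's Python only, which is the situation the virtualenv approach is meant to avoid.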
In the rmarkdown, we can use
---
title: "Testing"
output:
  pdf_document: default
  html_document: default
---
```{r libraries, warning = FALSE, message = FALSE}
library(reticulate)
use_virtualenv("py-proj")
```
```{r chunk, warning = FALSE, message = FALSE}
write.csv(mtcars, "/Users/name/Desktop/mtcars.csv", row.names = TRUE)
```
```{python}
import pandas as pd
mtcars = pd.read_csv("/Users/name/Desktop/mtcars.csv")
mtcars.head(5)
```
When I use the virtual environment and install pandas through py_install, I can see that it is installed on my computer in the correct folder, but the error "ModuleNotFoundError: No module named 'pandas'" still comes up and I don't know what to do.
These are the chunks and outcomes used:
1st chunk
```{r chunk, warning = FALSE, message = FALSE}
library (reticulate)
virtualenv_create("py-proj")
py_install("pandas", envname = "py-proj")
use_virtualenv("py-proj", required = TRUE)
write.csv(mtcars, "/Users/home/Desktop/mtcars.csv", row.names = TRUE)
```
-output
virtualenv: py-proj
Using virtual environment 'py-proj' ...
Requirement already satisfied: pandas in /Users/home/.virtualenvs/py-proj/lib/python3.10/site-packages (1.5.2)
Requirement already satisfied: python-dateutil>=2.8.1 in /Users/home/.virtualenvs/py-proj/lib/python3.10/site-packages (from pandas) (2.8.2)
Requirement already satisfied: numpy>=1.21.0 in /Users/home/.virtualenvs/py-proj/lib/python3.10/site-packages (from pandas) (1.24.1)
Requirement already satisfied: pytz>=2020.1 in /Users/home/.virtualenvs/py-proj/lib/python3.10/site-packages (from pandas) (2022.7)
Requirement already satisfied: six>=1.5 in /Users/home/.virtualenvs/py-proj/lib/python3.10/site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)
2nd chunk
```{python}
import pandas as pd
mtcars = pd.read_csv("/Users/home/Desktop/mtcars.csv")
mtcars.head(5)
```
-output
ModuleNotFoundError: No module named 'pandas'
NameError: name 'pd' is not defined
NameError: name 'mtcars' is not defined
Thanks in advance
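I can't tell from the output alone what is going wrong, but one thing worth checking is whether the Python chunk actually has the `py-proj` folder on its search path. A rough sketch, assuming the environment lives under `~/.virtualenvs/py-proj` as the pip output above suggests (`paths_under` is a hypothetical helper, not part of reticulate):

```python
import sys
from pathlib import Path


def paths_under(env_dir):
    """Return the sys.path entries that live inside env_dir."""
    root = Path(env_dir).expanduser().resolve()
    hits = []
    for entry in sys.path:
        if not entry:
            continue  # an empty entry stands for the current directory
        path = Path(entry).resolve()
        if path == root or root in path.parents:
            hits.append(entry)
    return hits


# Assumed location, taken from the pip output above; adjust it if yours differs.
env_dir = "~/.virtualenvs/py-proj"
print("interpreter:", sys.executable)
print("search path entries inside py-proj:", paths_under(env_dir))
```

An empty list here would suggest that the chunk is running a different Python than the one `py_install` installed pandas into, so the `use_virtualenv()` call is not taking effect for that session.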