R Reticulate – Moving defined variables programmatically from Python environment to R

Question:

Caue:

I’m creating dataframes programmatically in Python using globals().

In the code below, I’m creating 5 dataframes whose names start with ‘PREFIX’ in caps, followed by a letter, and end with a suffix.

R

library(reticulate)
repl_python()

Python

import os
import pandas as pd

letters = ('a','b','c','d','e')
df_names = []

for ele in letters:
  name = 'PREFIX_{}_suffix'.format(ele)
  # create an empty two-column dataframe under a dynamically built global name
  globals()[name] = pd.DataFrame(columns = ['col_a', 'col_b']).astype(str)
  df_names.append(name)
print(df_names)
# ['PREFIX_a_suffix', 'PREFIX_b_suffix', 'PREFIX_c_suffix', 'PREFIX_d_suffix', 'PREFIX_e_suffix']

Request:

I would like to select the dataframes whose names start with a prefix (ideally with the regular expression ^PREFIX) and move those dataframes from reticulate’s Python environment to the R environment programmatically.

For the sake of the task, I have added the dataframes’ variable names to df_names. However, a regex-based solution is preferred.

I know the variables are stored in the py object and can be accessed with $, but I’m not sure how to select the dataframes iteratively and move them from Python’s environment to R’s environment all at once.


In R, I usually use ls(pattern=<regex>) to select objects in the R environment.

In Python, you can list the variables using locals(), see this thread.
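
One detail worth keeping in mind when listing variables this way: at module level locals() and globals() refer to the same namespace, but inside a function locals() only sees that function's own variables. The following is a minimal sketch of the difference; the helper names_in and the stand-in values are hypothetical and only serve the illustration (in the question the values would be pandas dataframes).

```python
import re

# Stand-in for a dataframe created at module level (hypothetical value).
PREFIX_a_suffix = "stand-in for a DataFrame"


def names_in(namespace, pattern=r"^PREFIX"):
    """Return the sorted names in a namespace dict that match the pattern."""
    return sorted(name for name in namespace if re.match(pattern, name))


def inside_function():
    # This variable exists only in the function's local namespace.
    PREFIX_local = "only visible here"
    return names_in(locals()), names_in(globals())


local_names, global_names = inside_function()
print(local_names)   # names found through locals() inside the function
print(global_names)  # names found through globals()
```

Since the dataframes in the question are created through globals(), filtering globals() (or dir() at the top level) is the lookup that finds them.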

This thread discusses passing Python functions from R to Python.

Answers:

Here is my solution using regex:

In python:

  • Create your regex pattern to fetch desired defined variables
  • Apply your pattern to dir() output, which captures the defined variables in your python’s environment
  • Save selected/fetched variables (dfs) in a list
import re

r = re.compile("^PREFIX")
# dir() lists the names defined in the current Python scope; keep those matching the pattern
py_dfs = list(filter(r.match, dir()))
print(py_dfs)
# ['PREFIX_a_suffix', 'PREFIX_b_suffix', 'PREFIX_c_suffix', 'PREFIX_d_suffix', 'PREFIX_e_suffix']
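
A possible variation on the step above, offered as a sketch rather than as part of the original answer: instead of saving only the names, collect the matching objects themselves in a dict keyed by name. The helper collect_matching and the variable py_dfs_by_name are hypothetical names, and plain dicts stand in for the dataframes so the snippet runs without pandas.

```python
import re

# Stand-ins for the dataframes from the question (hypothetical values).
PREFIX_a_suffix = {"col_a": [], "col_b": []}
PREFIX_b_suffix = {"col_a": [], "col_b": []}
unrelated = 42


def collect_matching(pattern, namespace):
    """Return {name: object} for every name in namespace matching the pattern."""
    rx = re.compile(pattern)
    return {name: obj for name, obj in namespace.items() if rx.match(name)}


# Filter the module-level namespace by the same ^PREFIX pattern.
py_dfs_by_name = collect_matching(r"^PREFIX", globals())
print(sorted(py_dfs_by_name))
```

The idea behind this design is that a single dict carries both the names and the values. Assuming reticulate's default conversion of a Python dict to a named R list, the R side could then read py$py_dfs_by_name once rather than evaluating each name separately.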

In R:

  • Access the Python list that holds the selected variable names
  • Evaluate each name with reticulate::py_eval and convert the result to an R object with reticulate::py_to_r
  • Use assign to create an R variable with the same name as the corresponding Python dataframe
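for (name in py$py_dfs) {
  # evaluate the Python variable by name and convert it to an R data.frame
  r_df <- py_to_r(py_eval(name))
  # bind the result to the same name in the current R environment
  assign(name, r_df)
}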

> ls(pattern="^PREFIX")
[1] "PREFIX_a_suffix" "PREFIX_b_suffix" "PREFIX_c_suffix" "PREFIX_d_suffix" "PREFIX_e_suffix"
> dim(PREFIX_a_suffix)
[1] 0 2
> class(PREFIX_a_suffix)
[1] "data.frame"
> 