How to Run PaCMAP Dimension Reduction in R Using ReductionWrappers?

Question:

I am trying to implement PaCMAP dimension reduction in R, as shown in this documentation: https://rdrr.io/github/milescsmith/dim.reduction.wrappers/man/pacmap.html. The pacmap function is a wrapper around the original Python pacmap at https://github.com/YingfanWang/PaCMAP.

Below is an example of how it is done in Python:

import pandas as pd
import pacmap
fraud_data = pd.read_csv(r'C:/Users/User/Desktop/Open Source Dataset/fraud_data.csv')
embedding = pacmap.PaCMAP(n_dims=2, n_neighbors=None, MN_ratio=0.5, FP_ratio=2.0) 
X_transformed = embedding.fit_transform(fraud_data.values, init="pca")

And here is what I tried in R version 4.1.0, to no avail:

library(ReductionWrappers)
fraud_data <- read.csv("fraud_data")
pacmap_mapping <- pacmap(as.matrix(fraud_data), n_dims=2, n_neighbors=2, MN_ratio=1, FP_ratio=2) 

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  TypeError: 'float' object cannot be interpreted as an integer 

pacmap_mapping <- pacmap(fraud_data, n_dims=2, n_neighbors=2, MN_ratio=1, FP_ratio=2) 

 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  TypeError: '(0, slice(None, None, None))' is an invalid key

I don't understand how to work around these errors, since only minimal examples are available. What input do I need to make it work?

Here is the data used in this example: https://drive.google.com/file/d/1Yt4V1Ir00fm1vQ9futziWbwjUE9VvYK7/view?usp=sharing

Asked By: Jovan


Answers:

I decided to recreate the Python PaCMAP workflow directly in R using reticulate, so that it runs straightforwardly in R:

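library(reticulate)
# Import the Python modules through reticulate
python_pandas <- import("pandas")
python_pacmap <- import("pacmap")
python_numpy <- import("numpy")
# Convert the R data frame to a pandas DataFrame, then to a float NumPy array
fraud_pandas <- reticulate::r_to_py(fraud_data)
nparray <- fraud_pandas$values
nparray <- nparray$astype("float64")  # np.float was removed in NumPy 1.24
# Integer arguments need the L suffix; a plain 2 is passed to Python as a float
embedding <- python_pacmap$PaCMAP(n_dims=2L, n_neighbors=NULL, MN_ratio=0.5, FP_ratio=2.0)
X_transformed <- embedding$fit_transform(nparray, init="pca")
fraud_transformed <- data.frame(X_transformed)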
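
The L suffix in n_dims=2L matters: reticulate converts a plain R numeric such as 2 to a Python float, and an R integer such as 2L to a Python int. Below is a minimal pure-Python sketch of the first error from the question. The helper count_neighbors is a hypothetical stand-in for library code that expects an integer; it is not part of pacmap.

```python
def count_neighbors(n_neighbors):
    """Hypothetical stand-in for library code that needs an integer count."""
    # range() only accepts integers, so a float argument raises TypeError.
    return len(range(n_neighbors))


def try_count(n_neighbors):
    """Return the count, or the error message if the call fails."""
    try:
        return count_neighbors(n_neighbors)
    except TypeError as err:
        return str(err)


as_int = try_count(2)      # what R's 2L becomes in Python
as_float = try_count(2.0)  # what R's plain 2 becomes in Python
print(as_int)
print(as_float)
```

The float call fails with the same "cannot be interpreted as an integer" message that the wrapper reported.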
Answered By: Jovan

Not sure if this was your problem, but I found that the Python pacmap package now uses n_components instead of n_dims. I looked at the pacmap() wrapper function and pulled out the minimal code, which works nicely.

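pacmap_module <- reticulate::import(module = "pacmap", delay_load = TRUE)
# n_components replaces n_dims in newer pacmap releases; 2L passes an integer
pacmap <- pacmap_module$PaCMAP(n_components=2L, n_neighbors=NULL, MN_ratio=0.5, FP_ratio=2.0)

# Pass a numeric matrix; a data frame caused the "invalid key" error in the question
X_transformed <- pacmap$fit_transform(as.matrix(fraud_data))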
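
If you are not sure which parameter name your installed pacmap version expects, one option is to inspect the constructor before calling it. Here is a rough sketch in Python. OldPaCMAP and NewPaCMAP are hypothetical stand-ins for the two versions (so the snippet runs without pacmap installed), and dims_kwarg is an illustrative helper, not part of pacmap.

```python
import inspect


class OldPaCMAP:
    """Hypothetical stand-in for a release that takes n_dims."""
    def __init__(self, n_dims=2, n_neighbors=None):
        self.n_dims = n_dims


class NewPaCMAP:
    """Hypothetical stand-in for a release that takes n_components."""
    def __init__(self, n_components=2, n_neighbors=None):
        self.n_components = n_components


def dims_kwarg(cls):
    """Return the output-dimension keyword that cls's constructor accepts."""
    params = inspect.signature(cls).parameters
    for name in ("n_components", "n_dims"):
        if name in params:
            return name
    raise TypeError(f"{cls.__name__} accepts neither n_components nor n_dims")


old_name = dims_kwarg(OldPaCMAP)
new_name = dims_kwarg(NewPaCMAP)
# Build the keyword arguments dynamically from the detected name.
model = NewPaCMAP(**{new_name: 2})
print(old_name, new_name)
```

With the real package you would presumably pass pacmap.PaCMAP to dims_kwarg instead of the stand-ins.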
Answered By: James