Parameter declaration for vectorized functions

Question:

I’m working on a Python framework for training ML models with different noise functions applied to the training data. Here’s an example of one such noise function.

import numpy as np

def add_gauss(x, a=0, b=0.1):
    return x + np.random.normal(a, b)

I then build a list with several functions like this:

functions = [add_gauss, add_laplace]

This list is then used in the training function, where each function is vectorized and applied to the training data:

data = [1, 2, 3, 4]
modified_data_list = []

for function in functions:
    v_function = np.vectorize(function)
    modified_data_list.append(v_function(data))

This produces a list containing, in this case, two datasets: one with Gaussian noise and one with Laplace noise.
My current problem: this setup only works because I gave default parameters to the functions I made. I am unsure if there is a way to declare them so that I’d get something like:

functions = [add_gauss(*, 0, 0.1), add_laplace(*, 0, 0.1)]

Where the "*" represents the value of each data entry as it gets modified by the vectorized function.

Is this possible, or should I change my approach?

Asked By: jdsdev

||

Answers:

One possibility is to write a function that returns a function:

def add_gauss_func(a, b):
    # f remembers the a and b that were passed to add_gauss_func
    def f(x):
        return x + np.random.normal(a, b)
    return f

This uses a concept called a "closure", if you are interested in learning more: the inner function f keeps access to the a and b it was created with.

Now you can do

functions = [add_gauss_func(0, 0.1), add_gauss_func(0, 0.2)]

for two different functions with different Gaussian noise.

A similar technique can work for the Laplace noise function.
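
To put the pieces together, here is a minimal sketch of the factory approach plugged into the loop from the question. The name add_laplace_func is hypothetical: it is assumed to mirror add_gauss_func and to draw from np.random.laplace, since the question does not show the body of add_laplace.

```python
import numpy as np

def add_gauss_func(a, b):
    # Returns a one-argument function that closes over a and b.
    def f(x):
        return x + np.random.normal(a, b)
    return f

def add_laplace_func(a, b):
    # Hypothetical counterpart, assumed to mirror add_gauss_func.
    def f(x):
        return x + np.random.laplace(a, b)
    return f

functions = [add_gauss_func(0, 0.1), add_laplace_func(0, 0.1)]

data = [1, 2, 3, 4]
modified_data_list = []

for function in functions:
    # Each entry already takes a single argument, so it can be vectorized directly.
    v_function = np.vectorize(function)
    modified_data_list.append(v_function(data))
```

The training loop itself stays unchanged; only the way the list is built differs.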

In fact, this can probably be generalized to def add_noise(f, *params) or something similar if you want to get really fancy. I encourage you to read about decorators to learn about a powerful tool that allows you to add behavior to a function.
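
One possible reading of that generalization is sketched below. The exact signature is an assumption: here the first argument (named sampler for clarity) is taken to be any callable that returns a single random draw, such as np.random.normal or np.random.laplace, and the remaining arguments are forwarded to it.

```python
import numpy as np

def add_noise(sampler, *params):
    # sampler: any callable returning one random draw (assumed interface).
    # params: positional arguments forwarded to sampler on every call.
    def f(x):
        return x + sampler(*params)
    return f

functions = [
    add_noise(np.random.normal, 0, 0.1),
    add_noise(np.random.laplace, 0, 0.1),
]

data = [1, 2, 3, 4]
noisy = [np.vectorize(fn)(data) for fn in functions]
```

With this shape, supporting a new noise distribution only requires passing a different sampler, not writing another factory.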


This approach works particularly well if you need several very similar functions that differ only in their parameters. However, if you don’t need that kind of generalization, a function-returning function is overkill; when all you need is a very simple expression, you can use a lambda expression instead:

functions = [
    lambda x: add_gauss(x, 0, 0.1),
    lambda x: add_laplace(x, 0, 0.1),
]
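
A closely related option, not mentioned above, is functools.partial from the standard library, which pre-fills some arguments of an existing function. The sketch below assumes add_laplace has the same signature as add_gauss, since the question does not show its body.

```python
from functools import partial

import numpy as np

def add_gauss(x, a=0, b=0.1):
    return x + np.random.normal(a, b)

def add_laplace(x, a=0, b=0.1):
    # Assumed to mirror add_gauss; the question does not show this function.
    return x + np.random.laplace(a, b)

# partial fixes a and b by keyword and leaves x as the remaining argument.
functions = [
    partial(add_gauss, a=0, b=0.1),
    partial(add_laplace, a=0, b=0.2),
]

data = [1, 2, 3, 4]
modified_data_list = [np.vectorize(f)(data) for f in functions]
```

Compared with a lambda, a partial object keeps the fixed values inspectable (for example via functions[0].keywords), which can help when logging which noise settings were used.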
Answered By: Code-Apprentice