Equivalent of numpy.digitize in tensorflow
Question:
I am working on a customised loss function that uses numpy.digitize() internally. The loss is minimised over a set of parameters that are the bin edges passed to digitize. In order to use the TensorFlow optimisers, I would like to know whether there is an equivalent implementation of digitize in TensorFlow. If not, is there a good way to implement a workaround?
Here is a numpy version:
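import numpy as np

def fom_func(b, n):
    # Figure of merit for one bin; defined as 0 when either count is not positive.
    return np.where((b > 0) & (n > 0), np.sqrt(2*(n*np.log(np.divide(n, b)) + b - n)), 0)

def loss(param, X, y):
    # The bin edges must be sorted before they are passed to digitize.
    param = np.sort(np.asarray(param))
    nbins = param.shape[0]
    score = 0
    # Bin index of each sample.
    y_pred = np.digitize(X, param)
    for c in np.arange(nbins):
        # b: samples with y == 0 in bin c; n: all samples in bin c.
        b = np.where((y == 0) & (y_pred == c), 1, 0).sum()
        n = np.where((y_pred == c), 1, 0).sum()
        score += fom_func(b, n)**2
    return -np.sqrt(score)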
Answers:
The equivalent of np.digitize is called bucketize in TensorFlow. Quoting from the API doc:
Bucketizes ‘input’ based on ‘boundaries’.
Summary
For example, if the inputs are
    boundaries = [0, 10, 100]
    input = [[-5, 10000] [150, 10] [5, 100]]
then the output will be
    output = [[0, 3] [3, 2] [1, 3]]
Arguments:
scope: A Scope object
input: Any shape of Tensor contains with int or float type.
boundaries: A sorted list of floats gives the boundary of the buckets.
Returns:
Output: Same shape with ‘input’, each value of input replaced with bucket index.
(numpy) Equivalent to np.digitize.
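To make the bucket semantics quoted above concrete, here is a minimal pure-Python sketch that reproduces the documented example. It is not TensorFlow's implementation: the helper name bucketize_sketch and the use of the standard-library bisect module are assumptions made for illustration only.

```python
from bisect import bisect_right

def bucketize_sketch(values, boundaries):
    """Return the bucket index of each value (hypothetical helper).

    Assumes `boundaries` is sorted in increasing order. A value equal to a
    boundary is placed in the bucket to the right of that boundary, which is
    also what np.digitize does by default (right=False) for increasing bins.
    """
    return [bisect_right(boundaries, v) for v in values]

boundaries = [0, 10, 100]
rows = [[-5, 10000], [150, 10], [5, 100]]

# Apply the helper row by row so the output keeps the shape of the input.
result = [bucketize_sketch(row, boundaries) for row in rows]
print(result)  # [[0, 3], [3, 2], [1, 3]]
```

Note that three boundaries produce four possible bucket indices (0 to 3), so any loop over the buckets has to cover one more index than there are boundaries.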
I’m not sure why, but this method is hidden in TensorFlow (see the hidden_ops.txt file), so I wouldn’t count on it even though you can import it by doing:
from tensorflow.python.ops import math_ops
math_ops._bucketize
This has helped me. You only have to keep in mind that values are not assigned to the left or to the right of an edge, as with digitize, but to the intervals in between the bin edges:
import tensorflow_probability as tfp
tfp.stats.find_bins()