Tensorflow: How to write op with gradient in python?

Yes, as mentionned in @Yaroslav’s answer, it is possible and the key is the links he references: here and here. I want to elaborate on this answer by giving a concret example.

Modulo opperation: Let’s implement the element-wise modulo operation in tensorflow (it already exists but its gradient is not defined, but for the example we will implement it from scratch).

Numpy function: The first step is to define the opperation we want for numpy arrays. The element-wise modulo opperation is already implemented in numpy so it is easy:

import numpy as np
def np_mod(x,y):
    return (x % y).astype(np.float32)

The reason for the .astype(np.float32) is because by default tensorflow takes float32 types and if you give it float64 (the numpy default) it will complain.

Gradient Function: Next we need to define the gradient function for our opperation for each input of the opperation as tensorflow function. The function needs to take a very specific form. It need to take the tensorflow representation of the opperation op and the gradient of the output grad and say how to propagate the gradients. In our case, the gradients of the mod opperation are easy, the derivative is 1 with respect to the first argument and
with respect to the second (almost everywhere, and infinite at a finite number of spots, but let’s ignore that, see https://math.stackexchange.com/questions/1849280/derivative-of-remainder-function-wrt-denominator for details). So we have

def modgrad(op, grad):
    x = op.inputs[0] # the first argument (normally you need those to calculate the gradient, like the gradient of x^2 is 2x. )
    y = op.inputs[1] # the second argument

    return grad * 1, grad * tf.neg(tf.floordiv(x, y)) #the propagated gradient with respect to the first and second argument respectively

The grad function needs to return an n-tuple where n is the number of arguments of the operation. Notice that we need to return tensorflow functions of the input.

Making a TF function with gradients: As explained in the sources mentioned above, there is a hack to define gradients of a function using tf.RegisterGradient [doc] and tf.Graph.gradient_override_map [doc].

Copying the code from harpone we can modify the tf.py_func function to make it define the gradient at the same time:

import tensorflow as tf

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):

    # Need to generate a unique name to avoid duplicates:
    rnd_name="PyFuncGrad" + str(np.random.randint(0, 1E+8))

    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

The stateful option is to tell tensorflow whether the function always gives the same output for the same input (stateful = False) in which case tensorflow can simply the tensorflow graph, this is our case and will probably be the case in most situations.

Combining it all together: Now that we have all the pieces, we can combine them all together:

from tensorflow.python.framework import ops

def tf_mod(x,y, name=None):

    with ops.op_scope([x,y], name, "mod") as name:
        z = py_func(np_mod,
                        [x,y],
                        [tf.float32],
                        name=name,
                        grad=modgrad)  # <-- here's the call to the gradient
        return z[0]

tf.py_func acts on lists of tensors (and returns a list of tensors), that is why we have [x,y] (and return z[0]).
And now we are done. And we can test it.

Test:

with tf.Session() as sess:

    x = tf.constant([0.3,0.7,1.2,1.7])
    y = tf.constant([0.2,0.5,1.0,2.9])
    z = tf_mod(x,y)
    gr = tf.gradients(z, [x,y])
    tf.initialize_all_variables().run()

    print(x.eval(), y.eval(),z.eval(), gr[0].eval(), gr[1].eval())

[ 0.30000001 0.69999999 1.20000005 1.70000005] [ 0.2 0.5 1. 2.9000001] [ 0.10000001 0.19999999 0.20000005 1.70000005] [ 1. 1. 1. 1.] [ -1. -1. -1. 0.]

Success!

More Related Contents:

Leave a Comment Cancel reply