When do the gradients exist in a Reinforcement Learning setting?

Problem Detail:

I am getting properly stuck into reinforcement learning and I am currently reading the review paper by Kober et al. (2013).

There is one recurring idea that I cannot get my head around, and it is mentioned a lot, not just in this paper but in others too: the existence of gradients.

In section 2.2.2 they say:

The approach is very straightforward and even applicable to policies that are not differentiable.

What does it mean to say that the gradients exist and indeed, how do we know that they exist? When wouldn't they exist?

The gradient doesn't exist (isn't well-defined) for non-differentiable functions. What the authors mean by that statement is that there are analogous objects that can be used in place of the gradient.
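Concretely, the finite-difference approach Kober et al. describe only ever *evaluates* the objective at perturbed parameter values, so it never needs the objective to be differentiable. Here is a minimal sketch; the function `J` below is a made-up toy stand-in for an expected return estimated from rollouts (note it has kinks and jumps, so its true gradient does not exist everywhere):

```python
import numpy as np

def J(theta):
    # Toy stand-in for an expected return estimated from rollouts.
    # Deliberately non-differentiable: np.abs has a kink, np.floor has jumps.
    return -np.sum(np.abs(theta)) - np.floor(theta[0])

def finite_difference_gradient(J, theta, eps=1e-2):
    """Estimate a 'gradient' of J at theta by central differences.

    This only requires evaluating J, never differentiating it, so it
    is applicable even when J (or the policy inside it) is not
    differentiable.
    """
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

theta = np.array([0.5, -1.3])
print(finite_difference_gradient(J, theta))  # ≈ [-1., 1.] at this smooth point
```

Away from the kinks and jumps this estimate matches the ordinary gradient; at a non-differentiable point it still returns *some* descent-direction estimate, which is exactly why the approach is described as applicable to non-differentiable policies.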

Discrete functions

In the discrete case, finite differences are the discrete analogue of derivatives. The derivative of a single-variable continuous function $f:\mathbb{R} \to \mathbb{R}$ is $df/dx$; the partial difference (a discrete derivative) of a single-variable discrete function $f:\mathbb{Z} \to \mathbb{Z}$ is the function $\Delta f : \mathbb{Z} \to \mathbb{Z}$ given by

$$\Delta f(x) = f(x+1)-f(x).$$
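As a quick sanity check, for $f(x) = x^2$ the partial difference is $\Delta f(x) = (x+1)^2 - x^2 = 2x+1$, which mirrors the continuous derivative $2x$. A small sketch:

```python
def delta(f):
    """Forward difference of a function on the integers: Δf(x) = f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

square = lambda x: x * x
df = delta(square)
print([df(x) for x in range(4)])  # Δ(x^2) = 2x + 1, i.e. [1, 3, 5, 7]
```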

There's a similar analogue of the gradient. If we have a function $f(x,y)$ of two continuous variables, the gradient consists of the partial derivatives $\partial f / \partial x$ and $\partial f / \partial y$. In the discrete case, for $f(x,y)$ defined on pairs of integers, we have the partial differences $\Delta_x f$ and $\Delta_y f$, where

$$\Delta_x f(x,y) = f(x+1,y) - f(x,y)$$

and similarly for $\Delta_y f$.
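Putting the two partial differences together gives a discrete "gradient". A sketch, using an arbitrary illustrative function `g`:

```python
def discrete_gradient(f, x, y):
    """Return (Δ_x f, Δ_y f) at (x, y) for a function on pairs of integers."""
    dx = f(x + 1, y) - f(x, y)  # partial difference in x
    dy = f(x, y + 1) - f(x, y)  # partial difference in y
    return dx, dy

g = lambda x, y: x * x + 3 * y
print(discrete_gradient(g, 2, 5))  # (3^2 - 2^2, 3·6 - 3·5) = (5, 3)
```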

Continuous functions

If you have a continuous function that is not differentiable at some points (so the gradient does not exist there), you can sometimes use a subgradient in place of the gradient. For a convex function a subgradient exists at every point, and subgradient methods can be used much like gradient descent.
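For instance, $f(x) = |x|$ is not differentiable at $0$, but any value in $[-1, 1]$ is a valid subgradient there, and subgradient descent simply plugs one such choice in where the gradient would go. A sketch with a diminishing step size (a standard choice for subgradient methods):

```python
def subgradient_abs(x):
    """A subgradient of f(x) = |x|: the sign away from 0, and 0
    (one valid choice from the interval [-1, 1]) at the kink itself."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

# Subgradient descent on f(x) = |x| with step size 1/k.
x = 3.0
for k in range(1, 200):
    x -= (1.0 / k) * subgradient_abs(x)
print(x)  # ends up close to the minimizer 0
```

Unlike gradient descent, the iterates are not guaranteed to decrease $f$ at every step; they oscillate around the kink, with the oscillation shrinking along with the step size.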