# Q1 – Activation Functions

Contributed by Louis-Guillaume Gagnon.

1. Using the definition of the derivative:
$\frac{d}{dx}f(x) = \lim_{\epsilon\rightarrow 0}\ \frac{f(x + \epsilon) - f(x)}{\epsilon}$
show that the derivative of the rectified linear unit ($relu(x) = \max(0, x)$) is given by the Heaviside step function:
$H(x)=\begin{cases} 1 & \text{if } x > 0\\ 0 & \text{otherwise} \end{cases}$
2. Give an alternative definition of relu using $H(x)$. Can you think of a second one?
3. Give an asymptotic expression for $H(x)$ using the sigmoid function, $\sigma(x)=1/(1+e^{-x})$.
4. Using the same technique as in (1), show that the derivative of $H(x)$ is given by the Dirac delta function $\delta(x)$, defined in section 3.9.5 of the Deep Learning textbook.
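For concreteness, the three functions appearing in the questions can be written as a minimal Python sketch (function names are mine; the convention $H(0) = 0$ follows the definition above):

```python
import math

def relu(x):
    # Rectified linear unit: max(0, x)
    return max(0.0, x)

def H(x):
    # Heaviside step function: 1 if x > 0, 0 otherwise (so H(0) = 0)
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + math.exp(-x))
```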

## 3 thoughts on “Q1 – Activation Functions”

1. Answer – Q1:

1. Derivative of relu(x) is H(x)
$\frac{d}{dx}f(x) = \lim_{\epsilon\to0} \frac{f(x+\epsilon) - f(x)}{\epsilon}$

if x > 0: $\lim_{\epsilon\to0} \frac{\max(0, x + \epsilon) - \max(0, x)}{\epsilon} = \lim_{\epsilon\to0} \frac{x + \epsilon - x}{\epsilon} = \lim_{\epsilon\to0} \frac{\epsilon}{\epsilon} = 1$

if x < 0: $\lim_{\epsilon\to0} \frac{\max(0, x + \epsilon) - \max(0, x)}{\epsilon} = \lim_{\epsilon\to0} \frac{0 - 0}{\epsilon} = 0$

(At x = 0 the two one-sided limits disagree, so relu is not differentiable there; the convention $H(0) = 0$ simply assigns a value at that point.)
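The limits above can be spot-checked with a forward finite difference (a sketch; the helper name and eps are mine):

```python
def relu(x):
    return max(0.0, x)

def H(x):
    return 1.0 if x > 0 else 0.0

def forward_diff(f, x, eps=1e-8):
    # The difference quotient (f(x + eps) - f(x)) / eps from the definition
    return (f(x + eps) - f(x)) / eps

# Away from x = 0 the quotient matches H(x)
for x in [-3.0, -0.5, 0.5, 3.0]:
    assert abs(forward_diff(relu, x) - H(x)) < 1e-6
```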

2. Two alternate definitions of relu(x):
a) $relu(x) = \max(0, x) = x \cdot H(x)$
b) $relu(x) = \max(0, x) = (|x| + x)/2$
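Both identities are easy to verify pointwise (a quick numeric check; names are mine):

```python
def relu(x):
    return max(0.0, x)

def H(x):
    return 1.0 if x > 0 else 0.0

for x in [-2.0, -0.1, 0.0, 0.1, 2.0]:
    assert relu(x) == x * H(x)           # definition (a)
    assert relu(x) == (abs(x) + x) / 2   # definition (b)
```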

3. Asymptotic expression for H(x)
$H(x) = \begin{cases} \lim_{x\to+\infty} \sigma(x), & \text{if } x > 0 \\ \lim_{x\to-\infty} \sigma(x), & \text{if } x < 0 \end{cases}$

4. Derivative of H(x) is the Dirac delta
Using the symmetric (centered) form of the derivative:
$\frac{d}{dx}f(x) = \lim_{\epsilon\to0} \frac{f(x+\epsilon) - f(x-\epsilon)}{2\epsilon}$

if x > 0: $\lim_{\epsilon\to0} \frac{H(x + \epsilon) - H(x - \epsilon)}{2\epsilon} = \lim_{\epsilon\to0} \frac{1 - 1}{2\epsilon} = 0$

if x < 0: $\lim_{\epsilon\to0} \frac{H(x + \epsilon) - H(x - \epsilon)}{2\epsilon} = \lim_{\epsilon\to0} \frac{0 - 0}{2\epsilon} = 0$

if x = 0: $\lim_{\epsilon\to0^+} \frac{H(x + \epsilon) - H(x - \epsilon)}{2\epsilon} = \lim_{\epsilon\to0^+} \frac{1 - 0}{2\epsilon} = \lim_{\epsilon\to0^+} \frac{1}{2\epsilon} = \infty$

This matches the Dirac delta function, which is zero everywhere except at 0, where it is infinite in such a way that its integral over the real line equals 1.
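One way to make the divergence at x = 0 concrete is to replace H with the smooth approximation $\sigma(kx)$ (sigmoid with a sharpness parameter k, my notation) and differentiate it: the derivative $k\,\sigma(kx)(1 - \sigma(kx))$ is a bump of height $k/4$ and width $\sim 1/k$ whose area stays 1, which is exactly the nascent-delta behaviour. A numeric sketch (names and grid parameters are mine):

```python
import math

def sigmoid(x):
    # Overflow-safe logistic sigmoid
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def delta_k(x, k):
    # d/dx sigmoid(k * x) = k * s * (1 - s): peak k/4 at x = 0, width ~ 1/k
    s = sigmoid(k * x)
    return k * s * (1.0 - s)

def integral(k, a=-20.0, b=20.0, n=100001):
    # Trapezoid rule over a wide window; stays close to 1 for any k
    h = (b - a) / (n - 1)
    total = 0.5 * (delta_k(a, k) + delta_k(b, k))
    for i in range(1, n - 1):
        total += delta_k(a + i * h, k)
    return total * h
```

Here `integral(1.0)` and `integral(10.0)` both come out near 1, while the peak `delta_k(0.0, k)` $= k/4$ grows without bound.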


2. gagnonlg says:

For 2, you can get another definition of ReLU using the step function: $relu(x) = \int_{-\infty}^{x} H(t)\, dt$
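This can be sanity-checked with a crude Riemann sum (a sketch; the finite lower limit and names are mine):

```python
def H(x):
    return 1.0 if x > 0 else 0.0

def relu_via_integral(x, a=-10.0, n=100000):
    # Left Riemann sum of H from a up to x; since H vanishes for t < 0,
    # the finite lower limit a stands in for -infinity
    h = (x - a) / n
    return sum(H(a + i * h) for i in range(n)) * h

assert abs(relu_via_integral(3.0) - 3.0) < 1e-2
assert relu_via_integral(-2.0) == 0.0
```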

For 3, obtain a smoothed version of the step function using the sigmoid, $\sigma(kx) = \frac{1}{1 + e^{-kx}}$; taking the limit sharpens it into the step:
$H(x) = \lim_{k\rightarrow\infty}\sigma(kx)$ for $x \neq 0$


3. sebyjacob says:

I have an alternate answer for part 2.

$relu(x) = x(1 - H(-x))$
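This variant checks out numerically too: for x > 0, $H(-x) = 0$ so the factor is 1, and for x ≤ 0 the factor is 0 (quick check, names mine):

```python
def relu(x):
    return max(0.0, x)

def H(x):
    return 1.0 if x > 0 else 0.0

for x in [-2.0, 0.0, 1.5]:
    assert x * (1.0 - H(-x)) == relu(x)
```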
