Q1 – Activation Functions

Contributed by Louis-Guillaume Gagnon.

  1. Using the definition of the derivative:
    \frac{d}{dx}f(x) = \lim_{\epsilon\rightarrow 0}\ \frac{f(x + \epsilon) - f(x)}{\epsilon}
    show that the derivative of the rectified linear unit ($\mathrm{relu}(x) = \max(0, x)$) is given by the Heaviside step function:
    H(x)=\begin{cases} 1 & \text{if } x > 0\\ 0 & \text{otherwise} \end{cases}
  2. Give an alternative definition of relu using H(x). Can you think of a second one?
  3. Give an asymptotic expression for H(x) using the sigmoid function, \sigma(x)=1/(1+e^{-x}).
  4. Using the same technique as in (1), show that the derivative of H(x) is given by the Dirac delta function \delta(x), defined in section 3.9.5 of the Deep Learning textbook.
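For concreteness, here is a minimal NumPy sketch of the three functions the questions refer to (the helper names relu, heaviside, and sigmoid are my own, not from the exercise):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def heaviside(x):
    """Heaviside step function: 1 where x > 0, 0 otherwise."""
    return np.where(x > 0, 1.0, 0.0)

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))
```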

3 thoughts on "Q1 – Activation Functions"

  1. Answer – Q1:

    1. Derivation of H(x) as the derivative of relu(x)
    \frac{d}{dx}f(x) = \lim_{\epsilon\to0} \frac{f(x+\epsilon) - f(x)}{\epsilon}

    if x > 0: \lim_{\epsilon\to0} \frac{\max(0, x + \epsilon) - \max(0, x)}{\epsilon} = \lim_{\epsilon\to0} \frac{x + \epsilon - x}{\epsilon} = \lim_{\epsilon\to0} \frac{\epsilon}{\epsilon} = 1

    if x < 0: \lim_{\epsilon\to0} \frac{\max(0, x + \epsilon) - \max(0, x)}{\epsilon} = \lim_{\epsilon\to0} \frac{0 - 0}{\epsilon} = 0

    At x = 0 the one-sided limits disagree (1 from the right, 0 from the left), so relu is not differentiable there; the convention H(0) = 0 assigns that point the left-hand value.

    2. Two alternative definitions of relu(x):
    a) relu(x) = \max(0, x) = x \cdot H(x)
    b) relu(x) = \max(0, x) = (|x| + x)/2

    3. Asymptotic expression for H(x)
    H(x) = \begin{cases} \lim_{x\to+\infty} \sigma(x), & \text{if } x > 0 \\ \lim_{x\to-\infty} \sigma(x), & \text{if } x < 0 \end{cases}

    4. Derivative of H(x) is the Dirac delta
    Using the symmetric form of the difference quotient:
    \frac{d}{dx}f(x) = \lim_{\epsilon\to0} \frac{f(x+\epsilon) - f(x-\epsilon)}{2\epsilon}

    if x > 0: \lim_{\epsilon\to0} \frac{H(x + \epsilon) - H(x - \epsilon)}{2\epsilon} = \lim_{\epsilon\to0} \frac{1 - 1}{2\epsilon} = 0

    if x < 0: \lim_{\epsilon\to0} \frac{H(x + \epsilon) - H(x - \epsilon)}{2\epsilon} = \lim_{\epsilon\to0} \frac{0 - 0}{2\epsilon} = 0

    if x = 0: \lim_{\epsilon\to0} \frac{H(x + \epsilon) - H(x - \epsilon)}{2\epsilon} = \lim_{\epsilon\to0^+} \frac{1 - 0}{2\epsilon} = \lim_{\epsilon\to0^+} \frac{1}{2\epsilon} = \infty

    This matches the Dirac delta function, which is 0 everywhere except at 0, where it diverges so that it integrates to 1.
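    A quick numerical sanity check of parts 1, 2 and 4 (a sketch: the helper definitions, the step size eps, and the test points are my own choices, not part of the original answer):

    ```python
    import numpy as np

    relu = lambda x: np.maximum(0.0, x)       # max(0, x)
    H = lambda x: np.where(x > 0, 1.0, 0.0)   # Heaviside step

    eps = 1e-6
    x = np.array([-2.0, -0.5, 0.5, 2.0])      # avoid x = 0, where relu is not differentiable

    # Part 1: the forward difference quotient of relu matches H(x)
    assert np.allclose((relu(x + eps) - relu(x)) / eps, H(x), atol=1e-5)

    # Part 2: both alternative definitions agree with max(0, x)
    assert np.allclose(relu(x), x * H(x))
    assert np.allclose(relu(x), (np.abs(x) + x) / 2)

    # Part 4: the centered difference quotient of H vanishes away from the origin...
    assert np.allclose((H(x + eps) - H(x - eps)) / (2 * eps), 0.0)
    # ...but diverges like 1/(2*eps) at x = 0
    print((H(eps) - H(-eps)) / (2 * eps))     # prints 500000.0, i.e. 1/(2*eps)
    ```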


  2. For 2, you can get another definition of ReLU using the step function: $relu(x) = \int_{-\infty}^{x} H(t)\, dt$

    For 3, you can obtain a smoothed version of the step function using the sigmoid, which sharpens into the step as $k$ grows:
    $\lim_{k\rightarrow\infty}\sigma(kx) = \lim_{k\rightarrow\infty}\frac{1}{1 + e^{-kx}} = H(x)$ for $x \neq 0$
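    Both claims are easy to check numerically (a sketch; the helper definitions, the grid, and the values of k are my own choices):

    ```python
    import numpy as np

    relu = lambda x: np.maximum(0.0, x)
    H = lambda x: np.where(x > 0, 1.0, 0.0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    # sigma(k*x) sharpens toward the step function as k grows (for x != 0)
    x = np.array([-1.0, -0.1, 0.1, 1.0])
    for k in (1, 10, 100):
        print(k, np.round(sigmoid(k * x), 4))  # rows approach [0, 0, 1, 1]

    # Integrating H from the left up to x recovers relu(x) on a grid
    t = np.linspace(-5.0, 5.0, 10001)
    dt = t[1] - t[0]
    assert np.allclose(np.cumsum(H(t)) * dt, relu(t), atol=dt)
    ```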

