# Q5 – Softmax and Sigmoid

Contributed by Antoine Lefebvre-Brossard

1. Show that a 2-class softmax function can be rewritten as a sigmoid function. In
other words, that $\text{softmax}(\boldsymbol{x}) = \sigma(z)$, where the softmax function is defined by $\text{softmax}(\boldsymbol{x}) = (\frac{e^{x_1}}{e^{x_1}+e^{x_2}}, \frac{e^{x_2}}{e^{x_1}+e^{x_2}})^T$ and the sigmoid function is defined by $\sigma(z) = \frac{1}{1 + e^{-z}}$.
2. Use the previous result to show that it’s possible to write a $k$-class softmax
function as a function of $k-1$ variables.

## 4 thoughts on “Q5 – Softmax and Sigmoid”

1. I’m having trouble showing that a 2 class softmax function can be rewritten as a sigmoid.

Anyone have an idea?

• Didier N says:

1. For the first coordinate, divide both the numerator and the denominator by exp(x1); for the second coordinate, do the same with exp(x2). This gives 1 / (1 + exp(-(x1-x2))) and 1 / (1 + exp(-(x2-x1))). So, with z1 = x1-x2 and z2 = x2-x1, for x = (x1, x2) we obtain softmax(x) = (sigma(z1), sigma(z2)).
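
The identity in this answer can be checked numerically. The following is a minimal sketch, not part of the original comment: the helper names `softmax2` and `sigmoid` and the sample values are assumptions chosen for illustration.

```python
import math

def softmax2(x1, x2):
    """2-class softmax; subtracting the max only improves numerical stability."""
    m = max(x1, x2)
    e1, e2 = math.exp(x1 - m), math.exp(x2 - m)
    total = e1 + e2
    return e1 / total, e2 / total

def sigmoid(z):
    """Logistic sigmoid, sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary sample inputs (hypothetical values).
x1, x2 = 1.5, -0.3
p1, p2 = softmax2(x1, x2)
z = x1 - x2

# First coordinate is sigma(x1 - x2), second is sigma(x2 - x1) = 1 - sigma(z).
assert math.isclose(p1, sigmoid(z))
assert math.isclose(p2, sigmoid(-z))
assert math.isclose(p2, 1.0 - sigmoid(z))
```

Both coordinates are therefore determined by the single number z = x1-x2.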

2. Didier N says:

2. Notice in answer (1.) that z1 = -z2. Writing z = x1-x2, we have x2-x1 = -z, or equivalently x1 = z + x2. So softmax(x) = (sigma(z), sigma(-z)) depends only on the single variable z, and we can eliminate the variable x1 (or x2).

Suppose k = 3:
If we apply the same divisions as in question (1.) to each of the 3 coordinates produced by softmax, the denominators of the three coordinates become

1 + exp (x2-x1) + exp (x3-x1),
1 + exp (x1-x2) + exp (x3-x2),
1 + exp (x1-x3) + exp (x2-x3)

Let z1 = x1-x2 and z2 = x3-x1 then z1 + z2 = x3-x2. So the denominators of the three coordinates respectively will be:

1 + exp (x2-x1) + exp (x3-x1) = 1 + exp (-z1) + exp (z2)
1 + exp (x1-x2) + exp (x3-x2) = 1 + exp (z1) + exp (z1 + z2)
1 + exp (x1-x3) + exp (x2-x3) = 1 + exp (-z2) + exp (-z1-z2)

We see that the softmax depends only on the k-1 variables z1 and z2 instead of on x1, x2 and x3 (here k = 3). The same operations can be performed for k > 3.
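
The k = 3 reduction above can be sanity-checked numerically. This is a rough sketch under stated assumptions: `softmax3_from_z` uses the z1 = x1-x2, z2 = x3-x1 substitution from this comment, while `softmax_from_differences` is a different, hypothetical parametrization for general k (z_i = x_i - x_k, with the last logit fixed at 0) that is not taken from the thread.

```python
import math

def softmax(x):
    """Standard k-class softmax with max-subtraction for numerical stability."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def softmax3_from_z(z1, z2):
    """3-class softmax written with z1 = x1 - x2 and z2 = x3 - x1 only."""
    d1 = 1 + math.exp(-z1) + math.exp(z2)
    d2 = 1 + math.exp(z1) + math.exp(z1 + z2)
    d3 = 1 + math.exp(-z2) + math.exp(-z1 - z2)
    return [1 / d1, 1 / d2, 1 / d3]

def softmax_from_differences(z):
    """General k (assumed parametrization): z holds the k-1 values x_i - x_k."""
    return softmax(list(z) + [0.0])

# Arbitrary sample logits (hypothetical values).
x = [2.0, -1.0, 0.5]
full = softmax(x)

# The commenter's two variables reproduce all three probabilities.
reduced3 = softmax3_from_z(x[0] - x[1], x[2] - x[0])
assert all(math.isclose(a, b) for a, b in zip(full, reduced3))

# The same holds for the k-1 differences against the last logit.
reduced = softmax_from_differences([xi - x[-1] for xi in x[:-1]])
assert all(math.isclose(a, b) for a, b in zip(full, reduced))
```

Both checks rely on the same fact: adding a constant to every x_i leaves the softmax unchanged, so only k-1 differences matter.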

3. Didier N says:

I’m sorry for the last coordinate, it is 1+exp(x1-x3)+exp(x2-x3) = 1+exp(-z2)+exp(-z1-z2)
