Q15 – Softmax and Cross Entropy

The softmax function for m classes is given by

p_i = \frac{e^{x_i}}{\sum_{j=1}^m e^{x_j}} \text{ for } i = 1\ldots m.

It transforms a vector (x_i) of real values into the probability mass vector of a categorical distribution. It is often used in conjunction with the cross-entropy loss

L(x, y) = - \sum_{i=1}^m y_i \log p_i,

where y is a target vector of class probabilities (typically one-hot).
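The two definitions above can be sketched in NumPy as follows. This is a minimal illustration, not part of the original question; the max-subtraction trick is a standard device for numerical stability and does not change the result, since e^{x_i - c} / \sum_j e^{x_j - c} = p_i for any constant c.

```python
import numpy as np

def softmax(x):
    # Subtract the max before exponentiating for numerical stability;
    # the probabilities are unchanged by a constant shift of x.
    z = np.exp(x - np.max(x))
    return z / z.sum()

def cross_entropy(x, y):
    # L(x, y) = -sum_i y_i * log p_i, with p = softmax(x)
    return -np.sum(y * np.log(softmax(x)))

x = np.array([2.0, 1.0, 0.1])
p = softmax(x)
print(p)                  # a probability vector: entries in (0, 1)
print(p.sum())            # sums to 1 (up to floating-point rounding)
print(cross_entropy(x, np.array([1.0, 0.0, 0.0])))
```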

  1. Find a simplified expression for p_i when m = 2.
  2. Differentiate p_i with respect to x_k.
  3. Differentiate L with respect to x_k.
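For readers who want to check their answers, the standard results are: for m = 2, p_1 reduces to the logistic sigmoid of the difference x_1 - x_2; the Jacobian of softmax is \partial p_i / \partial x_k = p_i(\delta_{ik} - p_k); and, when the y_i sum to 1, \partial L / \partial x_k = p_k - y_k. The sketch below (an illustration, not part of the original exercise) verifies all three numerically against central finite differences.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()

# 1. For m = 2: p_1 = 1 / (1 + e^{x_2 - x_1}), the logistic sigmoid.
x2 = np.array([0.7, -0.3])
assert np.isclose(softmax(x2)[0], 1.0 / (1.0 + np.exp(x2[1] - x2[0])))

# 2. dp_i/dx_k = p_i (delta_{ik} - p_k): compare the analytic Jacobian
#    with a central-difference estimate.
x = np.array([2.0, 1.0, 0.1])
p = softmax(x)
jac = np.diag(p) - np.outer(p, p)          # analytic Jacobian
eps = 1e-6
num = np.zeros((3, 3))
for k in range(3):
    e = np.zeros(3); e[k] = eps
    num[:, k] = (softmax(x + e) - softmax(x - e)) / (2 * eps)
assert np.allclose(jac, num, atol=1e-8)

# 3. dL/dx_k = p_k - y_k for a target y summing to 1 (here one-hot).
y = np.array([0.0, 1.0, 0.0])
L = lambda x: -np.sum(y * np.log(softmax(x)))
grad = p - y                                # analytic gradient
num_grad = np.array([
    (L(x + eps * np.eye(3)[k]) - L(x - eps * np.eye(3)[k])) / (2 * eps)
    for k in range(3)
])
assert np.allclose(grad, num_grad, atol=1e-8)
print("all checks pass")
```

The gradient in part 3 is the reason this pairing is so common in practice: the softmax and the log in the loss cancel, leaving the simple residual p - y.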
