Consider a linear regression problem with input data $X \in \mathbb{R}^{n \times d}$, weights $w \in \mathbb{R}^{d \times 1}$, and targets $y \in \mathbb{R}^{n \times 1}$. Now suppose that dropout is applied to the input units with probability $p$.

1) Rewrite the input data matrix to account for the probability that each unit is dropped (Hint: whether a given unit is dropped is a Bernoulli random variable with probability $p$).

2) What is the cost function of linear regression with dropout?

3) Show that applying dropout to the linear regression problem above can be viewed as adding L2 regularization to the loss function.
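The claim in part 3 can be checked numerically on a tiny example. The sketch below is illustrative only and rests on assumptions not stated in the problem: writing $X$, $w$, $y$, and $p$ for the inputs, weights, targets, and drop probability, it models dropout with a hypothetical keep mask $R$ whose entries are independently 1 with probability $1-p$, forms the masked input $R \odot X$, applies no rescaling of the kept units, and averages the squared-error loss over the mask. Under those assumptions the expected loss works out to $\|y - (1-p)Xw\|^2 + p(1-p)\sum_j w_j^2 \sum_i X_{ij}^2$, a squared-error term plus a weighted L2 penalty on $w$. The function names are made up for this example.

```python
import itertools
import numpy as np


def expected_dropout_loss(X, w, y, p):
    """Exact E_R ||y - (R * X) w||^2, enumerating every binary mask R.

    Assumption: R[i, j] = 1 (unit kept) with probability 1 - p, independently.
    Only feasible for tiny n * d, since there are 2 ** (n * d) masks.
    """
    n, d = X.shape
    keep = 1.0 - p
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n * d):
        R = np.array(bits, dtype=float).reshape(n, d)
        # Probability of this particular mask under independent Bernoulli entries.
        prob = np.prod(np.where(R == 1, keep, p))
        residual = y - (R * X) @ w
        total += prob * (residual @ residual)
    return total


def regularized_loss(X, w, y, p):
    """Squared error with inputs scaled by (1 - p), plus a weighted L2 penalty."""
    keep = 1.0 - p
    residual = y - keep * (X @ w)
    # Each weight w_j is penalized in proportion to the squared norm of column j of X.
    penalty = p * keep * np.sum((X ** 2).sum(axis=0) * w ** 2)
    return residual @ residual + penalty


rng = np.random.default_rng(0)
X = rng.normal(size=(3, 2))  # n = 3 samples, d = 2 input units
w = rng.normal(size=2)
y = rng.normal(size=3)
p = 0.3  # drop probability

lhs = expected_dropout_loss(X, w, y, p)
rhs = regularized_loss(X, w, y, p)
print(lhs, rhs, np.isclose(lhs, rhs))
```

Enumerating all masks keeps the comparison exact rather than relying on a Monte Carlo estimate; for realistic sizes one would sample masks instead. If the kept units were rescaled by $1/(1-p)$ (inverted dropout), the factors in both terms would change, so the closed form above applies only to the unscaled variant assumed here.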