In this homework we are going to revisit some concepts that we covered in class.

**Question 1**: Suppose we toss a fair coin until we get
exactly two heads. Describe the sample space \(\Omega\). What is the probability that
exactly \(k\) tosses are required?

**Question 2**: Let \(X_1,\dots,X_n \sim \text{Uniform}(0,1)\) and
let \(Y_n = \max\{X_1,\dots,X_n\}\). What
is \(E(Y_n)\)?

**Question 3**: Two players go into two different rooms, where
they each toss a coin simultaneously. Each player then has to guess the
other player's toss outcome. The players can discuss their strategy
beforehand, but cannot communicate once they enter their designated
rooms. If both guess wrong, they both get 0; if one or both correctly
guess the other player's toss outcome, then both win a payoff of 1.
Find the optimal strategy and the associated expected payoff per player
(hint: it is strictly higher than \(0.75\)).

Get the prepared data constructed from the PSID from Blundell,
Pistaferri and Saporta. To load this data you will need to install
the package `readstata13`. You can do that by running
`install.packages('readstata13')`.

Then you can load the data:

` require(readstata13)`

`## Loading required package: readstata13`

` require(data.table)`

`## Loading required package: data.table`

` data = data.table(read.dta13("../data/data4estimation.dta"))`

We start by computing the wage residuals:

```
# We use raw log wages; we could also use the residual from a regression on age, gender, etc.
data[, lwr := log_y]
# Extract one wage per individual (the first non-missing observation)
Y = data[!is.na(lwr), list(lwr = lwr[1]), by = person][, lwr]
hist(Y)
```

` qqnorm(Y)`

At this point we notice that the distribution of log wages is not a normal distribution. Nevertheless, we want to form a confidence interval for the mean log wage in the population. The central limit theorem tells us how the sample mean is distributed asymptotically.
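For reference, here is the standard statement of the central limit theorem for independent draws with finite variance. The symbols \(\mu\) and \(\sigma^2\) (population mean and variance of log wages), \(\bar{Y}_n\) (the mean of \(n\) draws) and \(\hat{\sigma}\) (the sample standard deviation) are notation introduced here for illustration; they are not used elsewhere in this assignment.

\[
\sqrt{n}\,\frac{\bar{Y}_n - \mu}{\sigma} \xrightarrow{d} \mathcal{N}(0,1),
\qquad \text{so an approximate 95\% confidence interval for } \mu \text{ is} \qquad
\bar{Y}_n \pm 1.96\,\frac{\hat{\sigma}}{\sqrt{n}} .
\]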

Call \(F(Y)\) the distribution of log wages we recovered from the data. We are interested in constructing the distribution of the average of \(n\) draws from this distribution. For each \(n \leq N\) we will construct random samples from \(F(Y)\) simply by drawing with replacement from the wages we have in our data.
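As a minimal sketch of what "drawing with replacement" means in R, the snippet below resamples from a toy vector. The vector `wages_toy` is a hypothetical stand-in made of simulated values, not the PSID data; with the real data you would presumably resample from `Y` instead.

```r
# Toy stand-in for the wage vector Y (simulated values, NOT the PSID data)
set.seed(1)
wages_toy <- rexp(300, rate = 1)   # deliberately skewed, i.e. clearly non-normal

# One resample of size n, drawn with replacement from the empirical distribution
n <- 50
resample <- sample(wages_toy, size = n, replace = TRUE)

# Every resampled value is one of the observed values
stopifnot(length(resample) == n, all(resample %in% wages_toy))

# Because we draw with replacement, the resample may be larger than the data
big <- sample(wages_toy, size = 1000, replace = TRUE)
length(big)
```

Sampling with `replace = TRUE` treats the observed values as the whole population, which is what allows a resample to be larger than the original data.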

**Question 4**: Write a function that draws \(n\) wages from the data, and then computes
their mean. For each value of \(n\) in
\((2,5,10,25,50,100,150,200,500,1000,2000)\),
use your function to generate 500 replications.

**Question 5**: For each value of \(n\), compute the quantiles across
replications and compare them to the quantiles of the normal distribution.
Plot these quantile functions on the same graph using ggplot, and show that as
\(n\) grows, the distribution approaches
the normal distribution (recenter and rescale the means within each
\(n\)).
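One possible reading of "recenter and rescale" is to subtract the mean across replications and divide by the standard deviation across replications, so that the result is comparable to a standard normal. The sketch below illustrates that reading on simulated means for a single sample size. The names `means_toy`, `z` and `comparison` are hypothetical, and the data are simulated rather than taken from the PSID.

```r
set.seed(2)
# Hypothetical replicated means: 500 means, each over 30 skewed toy draws
means_toy <- replicate(500, mean(rexp(30)))

# Recenter and rescale across replications (assumed interpretation)
z <- (means_toy - mean(means_toy)) / sd(means_toy)

# Empirical quantiles next to standard normal quantiles at the same probabilities
probs <- seq(0.05, 0.95, by = 0.05)
comparison <- data.frame(
  prob      = probs,
  empirical = quantile(z, probs, names = FALSE),
  normal    = qnorm(probs)
)
head(comparison)
```

If the standardized means are close to normal, the `empirical` and `normal` columns should roughly agree. Repeating this for each \(n\) gives the series to plot.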