1. (16 points) You are asked to estimate the proportion of students at the University of Chicago who

have ever cheated. Let this proportion be denoted by $\theta$. In order to avoid complications due to some

students’ reluctance to tell the truth, you collect an i.i.d. sample of responses from n students where

you give each student the following instructions:

(i) Flip a fair coin secretly.

(ii) If the coin comes up “heads,” then answer the following question: “Have you ever cheated?”

Otherwise, answer the following question: “Were you born in Illinois?”

Let $X_i$ denote the response given by the $i$th student sampled, where $X_i = 1$ if the response is “yes” and

$X_i = 0$ if the response is “no.” In particular, note that you do not observe which of the two questions

the student answered.

(a) (4 points) From admissions records, you determine that the proportion of students at the University of Chicago who were born in Illinois is 0.1. Show that

$$P\{X_i = 1\} = \frac{\theta}{2} + 0.05 .$$

(Hint: You can do the rest of this question even if you can’t do this part!)

(b) (5 points) Propose an estimator $\hat\theta_n = \hat\theta_n(X_1, \ldots, X_n)$ of $\theta$. Show that your estimator is consistent.

(c) (7 points) Construct a level $\alpha$ confidence region $C_n$ for $\theta$. Please be sure to describe completely how

to compute $C_n$. (Hint: How would you construct a confidence region for $P\{X_i = 1\}$?)
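
As an illustration of the survey design in Question 1 (not a model answer), the scheme can be simulated and the relation $P\{X_i = 1\} = \theta/2 + 0.05$ inverted to recover $\theta$. The sketch below is one possible approach under stated assumptions: the function names, the seed, and the value of $\theta$ used to generate data are hypothetical, and the interval is a normal-approximation interval with nominal coverage $1 - \alpha$, which is only one reading of "level $\alpha$".

```python
import math
import random
from statistics import NormalDist


def simulate_responses(n, theta, p_illinois=0.1, seed=0):
    """Simulate n randomized responses (hypothetical data generator)."""
    rng = random.Random(seed)
    responses = []
    for _ in range(n):
        heads = rng.random() < 0.5
        if heads:
            # Heads: the student answers "Have you ever cheated?"
            answer = rng.random() < theta
        else:
            # Tails: the student answers "Were you born in Illinois?"
            answer = rng.random() < p_illinois
        responses.append(1 if answer else 0)
    return responses


def estimate_theta(responses, p_illinois=0.1):
    """Invert P{X = 1} = theta / 2 + p_illinois / 2 at the sample mean."""
    x_bar = sum(responses) / len(responses)
    return 2.0 * (x_bar - 0.5 * p_illinois)


def confidence_interval(responses, alpha=0.05, p_illinois=0.1):
    """Normal-approximation interval for P{X = 1}, mapped to theta.

    Assumption: nominal coverage is 1 - alpha. The endpoints are clipped
    to [0, 1] because theta is a proportion.
    """
    n = len(responses)
    x_bar = sum(responses) / n
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(x_bar * (1.0 - x_bar) / n)
    lower = 2.0 * (x_bar - half_width - 0.5 * p_illinois)
    upper = 2.0 * (x_bar + half_width - 0.5 * p_illinois)
    return max(0.0, lower), min(1.0, upper)
```

For example, `estimate_theta(simulate_responses(200000, 0.3))` should land close to 0.3, and `confidence_interval` returns the corresponding interval for $\theta$.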

2. (20 points) Let $Y$, $X$ and $U$ be random variables such that

$$Y = \beta_0 + \beta_1 X + U .$$

Interpret this regression as the best linear predictor of Y given X.

(a) (4 points) Explain briefly why we should not expect $E[U \mid X] = 0$.

(b) (5 points) Explain briefly why we are assured that $E[U \mid X] = 0$ when $X \in \{0, 1\}$.

(c) For the remainder of the question, assume $X \in \{0, 1, 2\}$.

i. (5 points) Explain briefly why we should not expect $E[U \mid X] = 0$.

ii. (6 points) Can you modify the linear regression so that we are assured that $E[U \mid X] = 0$?

(Hint: You may need to define new variables using indicator functions.)
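
The contrast in part (c) can be checked numerically on a small example. The sketch below is an illustration rather than a model answer, and it assumes a hypothetical population in which $X$ is uniform on $\{0, 1, 2\}$ and $E[Y \mid X = x] = x^2$. Both choices and all function names are made up for this example. It computes the population best linear predictor exactly and compares the conditional mean of the residual under the linear specification with that under a specification using the indicators $1\{X = 1\}$ and $1\{X = 2\}$.

```python
from fractions import Fraction

# Hypothetical population: X uniform on {0, 1, 2} and E[Y | X = x] = x^2.
PROB = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
COND_MEAN = {0: Fraction(0), 1: Fraction(1), 2: Fraction(4)}


def blp_coefficients(prob, cond_mean):
    """Population BLP of Y on (1, X): returns (beta_0, beta_1)."""
    ex = sum(p * x for x, p in prob.items())
    ey = sum(p * cond_mean[x] for x, p in prob.items())
    exx = sum(p * x * x for x, p in prob.items())
    # E[XY] = E[X E[Y | X]] by the law of iterated expectations.
    exy = sum(p * x * cond_mean[x] for x, p in prob.items())
    beta1 = (exy - ex * ey) / (exx - ex * ex)
    beta0 = ey - beta1 * ex
    return beta0, beta1


def linear_residual_means(prob, cond_mean):
    """E[U | X = x] for the linear specification Y = beta_0 + beta_1 X + U."""
    beta0, beta1 = blp_coefficients(prob, cond_mean)
    return {x: cond_mean[x] - beta0 - beta1 * x for x in prob}


def saturated_residual_means(cond_mean):
    """E[V | X = x] for Y = g0 + g1 1{X = 1} + g2 1{X = 2} + V.

    With one free coefficient per support point, the coefficients below
    reproduce the conditional mean exactly, so they are also the BLP
    coefficients of this specification.
    """
    g0 = cond_mean[0]
    g1 = cond_mean[1] - cond_mean[0]
    g2 = cond_mean[2] - cond_mean[0]
    fitted = {
        x: g0 + g1 * (1 if x == 1 else 0) + g2 * (1 if x == 2 else 0)
        for x in cond_mean
    }
    return {x: cond_mean[x] - fitted[x] for x in cond_mean}
```

In this example `linear_residual_means(PROB, COND_MEAN)` is nonzero at every support point even though it averages to zero, while `saturated_residual_means(COND_MEAN)` is zero everywhere.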

3. (12 points) Let $(Y, X_1, X_2, X_3, U)$ be a random vector such that

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + U ,$$

where $E[U] = E[X_1 U] = E[X_2 U] = E[X_3 U] = 0$. Assume further that $X_1$ depends on $X_3$ only through $X_2$ in the sense that

$$X_1 - \mathrm{BLP}(X_1 \mid X_2) \perp\!\!\!\perp X_3 .$$

Consider

$$Y = \beta_0^* + \beta_1^* X_1 + \beta_2^* X_2 + U^* ,$$

where $E[U^*] = E[X_1 U^*] = E[X_2 U^*] = 0$. Show that $\beta_1^* = \beta_1$. (Hint: Use Frisch-Waugh-Lovell.)

4. (26 points) Suppose

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + U = X'\beta + U$$

is a model of the determinants of $Y$. A researcher is interested in $\beta_1$ and $\beta_2$.

(a) (3 points) Suppose that X3 is endogenous while X1 and X2 are exogenous. What does this mean?

Be as precise as possible.

(b) (5 points) Because the researcher suspects the endogeneity of $X_3$, she runs a linear regression of $Y$

only on $(1, X_1, X_2)$. Is the OLS estimator $(\hat\beta_1^{OLS}, \hat\beta_2^{OLS})$ consistent for $(\beta_1, \beta_2)$?

(c) For the remainder of the question, suppose that the researcher has access to two valid instruments for $X_3$, $Z_1$ and $Z_2$. She observes an i.i.d. sample of size $n$ from $(Y, X_1, X_2, X_3, Z_1, Z_2)$.

i. (4 points) Provide a formula for the TSLS estimator $\hat\beta^{TSLS}$. Define every quantity in your

expression for the estimator.

ii. (4 points) Is $\hat\beta^{TSLS}$ consistent for $\beta$?

iii. (4 points) Is it necessarily true that $\frac{1}{n} \sum_{i=1}^{n} X_{i,3} \hat U_i = 0$? (Hint: Recall that $\hat U_i = Y_i - X_i' \hat\beta^{TSLS}$.)

iv. (6 points) Describe how you would test the null hypothesis $H_0 : \beta_1 = \beta_2 = 0$ against the

alternative $H_1 : \beta_1 \neq 0$ or $\beta_2 \neq 0$ at the 5% significance level. In particular, describe your

test statistic, your critical value, and the rule you would use to determine whether or not to

reject the null hypothesis.
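
The setup of part (c) can be illustrated with a small simulation. The sketch below is not a model answer. It uses the standard matrix form of two-stage least squares, with regressors $(1, X_1, X_2, X_3)$ and instruments $(1, X_1, X_2, Z_1, Z_2)$, written in plain Python. The data-generating process, the coefficient values, and the helper names `gram`, `solve`, `tsls_estimate`, and `simulate_endogenous` are all hypothetical choices made for this example.

```python
import random


def gram(A, B):
    """Return A'B for two data matrices stored as lists of rows."""
    ka, kb = len(A[0]), len(B[0])
    out = [[0.0] * kb for _ in range(ka)]
    for a, b in zip(A, B):
        for i in range(ka):
            for j in range(kb):
                out[i][j] += a[i] * b[j]
    return out


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def tsls_estimate(y, X, Z):
    """Two-stage least squares: project X on Z, then regress y on the fit."""
    first_stage = solve(gram(Z, Z), gram(Z, X))  # (Z'Z)^{-1} Z'X
    k_x = len(X[0])
    X_hat = [
        [sum(z[j] * first_stage[j][c] for j in range(len(z))) for c in range(k_x)]
        for z in Z
    ]
    # (X_hat'X)^{-1} X_hat'y, which equals (X_hat'X_hat)^{-1} X_hat'y.
    beta = solve(gram(X_hat, X), gram(X_hat, [[yi] for yi in y]))
    return [b[0] for b in beta]


def simulate_endogenous(n, beta=(1.0, 2.0, -1.0, 0.5), seed=0):
    """Hypothetical design in which X3 is correlated with U through V."""
    rng = random.Random(seed)
    y, X, Z = [], [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        x3 = 0.5 * z1 + 0.5 * z2 + v  # instruments shift X3
        u = 0.8 * v + rng.gauss(0.0, 1.0)  # U is correlated with X3 via V
        row = [1.0, x1, x2, x3]
        y.append(sum(b * r for b, r in zip(beta, row)) + u)
        X.append(row)
        Z.append([1.0, x1, x2, z1, z2])
    return y, X, Z
```

Calling `tsls_estimate(*simulate_endogenous(20000))` returns four coefficients that can be compared with the values passed as `beta`.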

5. (26 points) As an econometrician, you are hired by the University of Chicago to assess the effect of

an experiment using the flipped classroom approach. For the experiment, the university randomly

selected students and gave them the option to take courses in a flipped classroom setup. To make

the comparison, every student took a standardized test after the quarter. Let us use $Y_1, Y_0, X_1, X_0$ to

denote the potential outcomes and the potential treatments:

$Y_1$ : test score if taught in a flipped classroom

$Y_0$ : test score if taught in a traditional classroom

$X_1$ : whether the student is taught in the flipped classroom if selected in the experiment

$X_0$ : whether the student is taught in the flipped classroom if not selected in the experiment.

If not selected in the experiment, a student takes courses in a traditional classroom setup. The

university gives you access to a confidential dataset $(Y_1, X_1, Z_1), \ldots, (Y_n, X_n, Z_n)$, which is an

i.i.d. sample from $(Y, X, Z)$ where

Y : the test score

X : whether taught in the flipped classroom

Z : whether selected in the experiment.

(a) (5 points) You first consider estimating

$$Y = \beta_0 + \beta_1 X + U$$

using ordinary least squares. Do you expect the limit in probability of the OLS estimator of $\beta_1$

to equal the ATE? Explain briefly. (Hint: Do we expect $(Y_1, Y_0) \perp\!\!\!\perp X$?)

(b) (8 points) You next consider estimating

$$Y = \beta_0^* + \beta_1^* X + U^*$$

using two-stage least squares. Provide the conditions under which $\beta_1^*$ can be interpreted as a

LATE as well as an expression for the LATE. Comment briefly on the plausibility of each of these

conditions in this context. (Hint: There should be three conditions!)

(c) (8 points) Provide a formula for the IV estimator and the TSLS estimator of $\beta_1^*$. How are these

expressions related?

(d) (5 points) Suppose that you conclude using the data that the LATE is significantly positive.

Would you recommend that the university require the flipped classroom for every student? Why

or why not?
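
The design in Question 5 can be mimicked in a small simulation, offered as an illustration and not as a model answer. The sketch assumes a hypothetical population in which 60% of students would take the flipped classroom if selected, nobody is treated without being selected (as stated in the question), and the treatment effect for those who take it up is 4 points. These numbers and all function names are made up for this example. The IV estimate is computed as a ratio of sample covariances, and the TSLS estimate by running the two stages explicitly.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance with divisor n."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def simulate_experiment(n, late=4.0, seed=0):
    """Hypothetical experiment: Z randomized, X = 1 only if selected and willing."""
    rng = random.Random(seed)
    ys, xs, zs = [], [], []
    for _ in range(n):
        takes_up_if_selected = rng.random() < 0.6
        z = 1 if rng.random() < 0.5 else 0
        x = 1 if (z == 1 and takes_up_if_selected) else 0
        # Untreated score differs by type, so X is not independent of (Y1, Y0).
        y0 = rng.gauss(60.0, 5.0) + (5.0 if takes_up_if_selected else 0.0)
        ys.append(y0 + late * x)
        xs.append(x)
        zs.append(z)
    return ys, xs, zs


def iv_estimate(ys, xs, zs):
    """IV estimate of the slope: Cov(Z, Y) / Cov(Z, X) in the sample."""
    return cov(zs, ys) / cov(zs, xs)


def tsls_estimate(ys, xs, zs):
    """TSLS: regress X on (1, Z), then regress Y on (1, fitted X)."""
    slope = cov(zs, xs) / cov(zs, zs)
    intercept = mean(xs) - slope * mean(zs)
    fitted = [intercept + slope * z for z in zs]
    return cov(fitted, ys) / cov(fitted, fitted)


def ols_estimate(ys, xs):
    """OLS slope from regressing Y on (1, X), for comparison."""
    return cov(xs, ys) / cov(xs, xs)
```

With one instrument and one endogenous regressor, `iv_estimate` and `tsls_estimate` agree up to floating-point error on any sample, while `ols_estimate` in this design also picks up the difference in untreated scores between the two types of student.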
