Answers to odd-numbered problems and review concept questions for Introduction to Econometrics, 4th Edition, by Stock and Watson
(This version September 14, 2018)
Stock/Watson - Introduction to Econometrics – 4th Edition – Review the Concepts
Chapter 1
1.1 The experiment that you design should have one or more treatment groups and a control group; for example, one treatment could be studying for four hours, and the control would be not studying (no treatment). Students would be randomly assigned to the treatment and control groups, and the causal effect of hours of study on midterm performance would be estimated by comparing the average midterm grades for each of the treatment groups to that of the control group. The largest impediment is to ensure that the students in the different treatment groups spend the correct number of hours studying. How can you make sure that the students in the control group do not study at all, since that might jeopardize their grade? How can you make sure that all students in the treatment group actually study for four hours?
1.2 This experiment needs the same ingredients as the experiment in the previous question: treatment and control groups, random assignment, and a procedure for analyzing the resulting experimental data. Here there are two treatment levels: not wearing a seat belt (the control group) and wearing a seat belt (the treated group). These treatments should be applied over a specified period of time, such as the next year. The effect of seat belt use on traffic fatalities could be estimated as the difference between fatality rates in the control and treatment groups. One impediment to this study is ensuring that participants follow the treatment (do or do not wear a seat belt). More importantly, this study raises serious ethical concerns because it instructs participants to engage in known unsafe behavior (not wearing a seat belt).
1.3 a. You will need to specify the treatment(s) and randomization method, as in Questions 1.1 and 1.2.
b. One such cross-sectional data set would consist of a number of different firms with the observations collected at the same point in time. For example, the data set might contain data on training levels and average labor productivity for 100 different firms
Chapter 2
2.1 These outcomes are random because they are not known with certainty until they actually occur. You do not know with certainty the gender of the next person you will meet, the time that it will take to commute to school, and so forth.
2.2 If X and Y are independent, then Pr( Y ≤ y | X = x ) = Pr( Y ≤ y ) for all values of y and x. That is, independence means that the conditional and marginal distributions of Y are identical so that learning the value of X does not change the probability distribution of Y : Knowing the value of X says nothing about the probability that Y will take on different values.
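The definition can be checked numerically for discrete variables. The sketch below is only an illustration: the two joint distributions are hypothetical numbers chosen for the example, and `is_independent` is an assumed helper name. It tests whether every conditional probability of Y given X equals the corresponding marginal probability, which for discrete variables is equivalent to the statement above.

```python
from fractions import Fraction as F


def is_independent(joint):
    """joint maps (x, y) -> Pr(X = x, Y = y).

    Returns True when Pr(Y = y | X = x) equals Pr(Y = y) for every x and y.
    """
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    # Marginal distributions of X and Y.
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(
        joint.get((x, y), 0) / p_x[x] == p_y[y]
        for x in xs
        for y in ys
        if p_x[x] > 0
    )


# Hypothetical joint distributions (illustrative numbers only).
independent = {(0, 0): F(3, 10), (0, 1): F(3, 10), (1, 0): F(1, 5), (1, 1): F(1, 5)}
dependent = {(0, 0): F(2, 5), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(2, 5)}

print(is_independent(independent))
print(is_independent(dependent))
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding.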
2.3 Although there is no apparent causal link between rainfall and the number of children born, rainfall could tell you something about the number of children born. Knowing the amount of monthly rainfall tells you something about the season, and births are seasonal. Thus, knowing rainfall tells you something about the month, which tells you something about the number of children born. Thus, rainfall and the number of children born are not independently distributed.
2.4 The average weight of four randomly selected students is unlikely to be exactly 145 lbs. Different groups of four students will have different sample average weights, sometimes greater than 145 lbs. and sometimes less. Because the four students were selected at random, their sample average weight is also random.
2.5 All of the distributions will have a normal shape and will be centered at 1, the mean of Y. However, they will have different spreads because they have different variances. The variance of Ȳ is 4/n, so the variance shrinks as n gets larger. In your plots, the spread of the normal density when n = 2 should be wider than when n = 10, which should be wider than when n = 100. As n gets very large, the variance approaches zero, and the normal density collapses around the mean of Y. That is, the distribution of Ȳ becomes
tends to 1), which is just what the law of large numbers says.
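The shrinking spread in answer 2.5 is easy to tabulate. This is a minimal sketch that takes the variance of Y to be 4, as stated in the answer; the function name `sampling_sd` is an assumption made for the example.

```python
import math

VAR_Y = 4.0  # variance of Y, as stated in answer 2.5


def sampling_sd(n, var_y=VAR_Y):
    """Standard deviation of the sample mean of n i.i.d. draws of Y."""
    return math.sqrt(var_y / n)


for n in (2, 10, 100):
    print(f"n = {n:>3}: variance = {VAR_Y / n:.2f}, sd = {sampling_sd(n):.3f}")
```

The standard deviation falls with the square root of n, so a hundredfold increase in the sample size narrows the density by a factor of ten.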
2.6 The normal approximation does not look good when n = 5, but looks good for n = 25 and n = 100. Thus Pr( Ȳ ≤ 0.1) is approximately equal to the value computed by the normal approximation when n is 25 or 100, but is not well approximated by the normal distribution when n = 5.
2.7 The probability distribution looks like Figure 2.3b, but with more mass concentrated in
and, because there is substantial mass in the tails of the distribution, Pr( Y > c ) remains significantly greater than zero even for large values of c.
3.7 The treatment (or causal) effect is the difference between the mean outcomes of treatment and control groups when individuals in the population are randomly assigned to the two groups. The differences-of-means estimator is the difference between the mean outcomes of treatment and control groups for a randomly selected sample of individuals in the population, who are then randomly assigned to the two groups.
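As a rough illustration of the estimator in 3.7, the sketch below randomly assigns a simulated sample to treatment and control and compares the group means. Everything numeric here is an assumption for the example: the sample size, the baseline outcome of 10, the noise, and the hypothetical true effect of 2.

```python
import random
from statistics import mean


def difference_of_means(outcomes, treated):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated_y = [y for y, d in zip(outcomes, treated) if d]
    control_y = [y for y, d in zip(outcomes, treated) if not d]
    return mean(treated_y) - mean(control_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000
treated = [i < n // 2 for i in range(n)]
rng.shuffle(treated)  # random assignment to the two groups

TRUE_EFFECT = 2.0  # hypothetical causal effect used to simulate the data
outcomes = [10 + TRUE_EFFECT * d + rng.gauss(0, 1) for d in treated]

estimate = difference_of_means(outcomes, treated)
print(f"estimated effect: {estimate:.3f}")
```

Because assignment is random, the two groups differ on average only in the treatment, so the difference in sample means should land close to the simulated effect.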
3.8 The plot for (a) is upward sloping, and the points lie exactly on a line. The plot for (b) is downward sloping, and the points lie exactly on a line. The plot for (c) should show a positive relation, and the points should be close to, but not exactly on, an upward-sloping line. The plot for (d) shows a generally negative relation between the variables, and the points are scattered around a downward-sloping line. The plot for (e) has no apparent linear relation between the variables.
Chapter 4
Similarly, u_i is the value of the regression error for the ith observation; u_i is the
value Ŷ = β̂_0 + β̂_1 X.
4.2 There are many examples. Here is one for each assumption. If the value of X is assigned in a randomized controlled experiment, then (1) is satisfied. For the class size regression, if X = class size is correlated with other factors that affect test scores, then u and X are correlated and (1) is violated. If entities (for example, workers or schools) are randomly selected from the population, then (2) is satisfied. For the class size regression, if only rural schools are included in the sample while the population of interest is all schools, then (2) is violated. If u is normally distributed, then (3) is satisfied. For the class size regression, if some test scores are misreported as 100 (out of a possible 1000), then large outliers are possible and (3) is violated.
4.3 SER is an estimate of the standard deviation of the error term in the regression. The error term summarizes the effect of factors other than X for explaining Y. If the standard deviation of the error term is large, these omitted factors have a large effect on Y. The units of SER are the same as the units of Y. R^2 measures the fraction of the variability of Y explained by X, and 1 − R^2 measures the fraction of the variability of Y explained by the factors comprising the regression’s error term. If R^2 is large, most of the variability in Y is explained by X. R^2 is “unit free” and takes on values between zero and one.
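The two measures in 4.3 can be computed by hand for a single-regressor OLS fit. The sketch below uses a small made-up data set (the numbers are hypothetical) and an assumed helper name, `ols_fit_stats`; the SER uses the n − 2 degrees-of-freedom correction. Rescaling Y shows that SER carries the units of Y while R^2 does not.

```python
import math


def ols_fit_stats(x, y):
    """Return (intercept, slope, SER, R^2) for an OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    ssr = sum(u ** 2 for u in residuals)      # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    ser = math.sqrt(ssr / (n - 2))            # degrees-of-freedom correction
    r2 = 1 - ssr / tss
    return intercept, slope, ser, r2


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

_, _, ser, r2 = ols_fit_stats(x, y)
_, _, ser_scaled, r2_scaled = ols_fit_stats(x, [100 * yi for yi in y])
print(ser, r2)
print(ser_scaled, r2_scaled)
```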
Chapter 5
1, ..., n can be constructed in three steps: (1) compute the sample mean Ȳ and the standard error SE(Ȳ); (2) compute the t-statistic for this sample, t^act = Ȳ / SE(Ȳ); (3) using the standard normal table, compute the p-value = Pr(|Z| > |t^act|) = 2Φ(−|t^act|). A similar three-step procedure is used to construct the p-value for a two-sided test of
SE(β̂_1); (2) compute the t-statistic for this sample, t^act = β̂_1 / SE(β̂_1); (3) using the standard normal table, compute the p-value = Pr(|Z| > |t^act|) = 2Φ(−|t^act|).
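The three steps for the sample mean can be sketched in a few lines. This assumes the null hypothesis is that the population mean equals zero (which is what a t-statistic of the form mean over standard error implies); the sample values and the function names are hypothetical, and the standard normal table is replaced by `statistics.NormalDist`.

```python
import math
from statistics import NormalDist, mean, stdev


def p_from_t(t_act):
    """Two-sided p-value from the standard normal: 2 * Phi(-|t_act|)."""
    return 2 * NormalDist().cdf(-abs(t_act))


def two_sided_p_value(sample, mu_0=0.0):
    """Return (t_act, p_value) for a two-sided test that the mean is mu_0."""
    n = len(sample)
    y_bar = mean(sample)                  # step 1: sample mean ...
    se = stdev(sample) / math.sqrt(n)     # ... and its standard error
    t_act = (y_bar - mu_0) / se           # step 2: t-statistic
    return t_act, p_from_t(t_act)         # step 3: normal p-value


# Hypothetical observations, for illustration only.
sample = [1.2, -0.4, 0.8, 2.1, 0.3, 1.5, -0.2, 0.9]
t_act, p_value = two_sided_p_value(sample)
print(f"t = {t_act:.3f}, p = {p_value:.4f}")
```

The same `p_from_t` step applies unchanged to a regression slope once its estimate and standard error are available.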
5.2 The wage gender gap for 2015 can be estimated using the regression in Equation (5.19) (page 148) and the data summarized in the 2015 row of Table 3.1 (page 80). The dependent variable is the hourly earnings of the ith person in the sample. The independent variable is a binary variable that equals 1 if the person is a male and equals 0 if the person is a female. The wage gender gap in the population is the population
for the other years can be estimated in a similar fashion.
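When the only regressor is binary, the OLS slope equals the difference between the two group sample means, and the intercept equals the mean of the group coded 0. The sketch below checks this on a tiny made-up sample; the earnings figures are hypothetical and are not the values from Table 3.1.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """OLS intercept and slope for a regression of y on a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical hourly earnings; male = 1 for men and 0 for women.
earnings = [20.0, 25.0, 30.0, 18.0, 22.0, 26.0]
male = [1, 1, 1, 0, 0, 0]

intercept, slope = ols_slope_intercept(male, earnings)
mean_men = mean(e for e, m in zip(earnings, male) if m == 1)
mean_women = mean(e for e, m in zip(earnings, male) if m == 0)
print(slope, mean_men - mean_women)
```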
5.3 Homoskedasticity means that the variance of u is unrelated to the value of X. Heteroskedasticity means that the variance of u is related to the value of X. If the value of X is chosen using a randomized controlled experiment, then u is homoskedastic. In a regression of a worker's earnings ( Y ) on years of education ( X ), u would be heteroskedastic if the variance of earnings is higher for college graduates than for non-college graduates. Figure 5.3 (page 151) suggests that this is indeed the case.
hour than non-college graduates.
6.5 If X_1 and X_2 are highly correlated, most of the variation in X_1 coincides with the variation in X_2. Thus there is little variation in X_1, holding X_2 constant, that can be used to estimate the partial effect of X_1 on Y.
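One way to quantify the point in 6.5: in the two-regressor model with homoskedastic errors, the large-sample variance of the OLS estimator of the coefficient on X_1 is proportional to 1 / (1 − ρ^2), where ρ is the correlation between X_1 and X_2. The sketch below simply evaluates that factor; the function name `variance_inflation` is an assumption made for the example.

```python
def variance_inflation(rho):
    """Factor 1 / (1 - rho^2) by which correlation between the two
    regressors inflates the variance of the coefficient estimator."""
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 1.0 / (1.0 - rho ** 2)


for rho in (0.0, 0.5, 0.9, 0.99):
    print(f"rho = {rho:.2f}: variance multiplied by {variance_inflation(rho):.1f}")
```

As ρ approaches 1 in absolute value the factor grows without bound, which is the imprecision the answer describes.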
Chapter 7
statistic from Section 7.2. The F-statistic is necessary to test a joint hypothesis because the test will be based on both β̂_1 and β̂_2, and this means that the testing procedure must use properties of their joint distribution.
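For two restrictions, the F-statistic can be written in terms of the two individual t-statistics and an estimate of the correlation between them, which makes the role of the joint distribution visible. The sketch below evaluates that formula; the input values are hypothetical, and `f_stat_two_restrictions` is an assumed name.

```python
def f_stat_two_restrictions(t1, t2, rho):
    """F-statistic for a joint test of two restrictions, computed from the
    two t-statistics and the estimated correlation rho between them."""
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 0.5 * (t1 ** 2 + t2 ** 2 - 2 * rho * t1 * t2) / (1 - rho ** 2)


# Same t-statistics, different correlations (hypothetical values).
for rho in (-0.5, 0.0, 0.5):
    print(f"rho = {rho:+.1f}: F = {f_stat_two_restrictions(2.0, 2.0, rho):.3f}")
```

With identical t-statistics the F-statistic still changes with the correlation, so looking at the two t-statistics one at a time cannot substitute for the joint test.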
7.2 Here is one example. Using data from several years of her econometrics class, a professor regresses students’ scores on the final exam ( Y ) on their score from the midterm exam ( X ). This regression will have a high R^2, because people who do well on the midterm tend to do well on the final. However, this regression produces a biased estimate of the causal effect of midterm scores on the final. Students who do well on the midterm tend to be students who attend class regularly, study hard, and have an aptitude for the subject. These variables are correlated with the midterm score and are also determinants of the final exam score, so omitting them leads to omitted variable bias.
7.3 Control variables are regressors that capture the effects of omitted variables in a regression. These variables can eliminate or attenuate omitted variable bias for the coefficient on the variable of interest. Coefficients on the control variables will, in general, be biased estimates of causal effects because (by design) they capture the effect of omitted variables. In Table 7.1, student-teacher ratio is the variable of interest and the other variables are control variables.
8.6 You want to compare the fit of your linear regression to the fit of a nonlinear regression. Your answer will depend on the nonlinear regression that you choose for the comparison. You might test your linear regression against a quadratic regression by adding X^2 to the linear regression. If the coefficient on X^2 is significantly different from zero, then you can reject the null hypothesis that the relationship is linear in favor of the alternative that it is quadratic.
Chapter 9
9.1 See Key Concept 9.1 (page 316) and item (1) in the chapter summary.
9.2 Including an additional variable that belongs in the regression will eliminate or reduce omitted variable bias. However, including an additional variable that does not belong in the regression will, in general, reduce the precision (increase the variance) of the estimator of the other coefficients.
9.3 It is important to distinguish between measurement error in Y and measurement error in X. If Y is measured with error, then the measurement error becomes part of the regression error term, u. If the assumptions of Key Concept 6.4 (page 201) continue to hold, this will not affect the internal validity of OLS regression, although by making the variance of the regression error term larger, it increases the variance of the OLS estimator. If X is measured with error, however, this can result in correlation between the regressor and regression error, leading to inconsistency of the OLS estimator. As suggested by Equation (9.2), this inconsistency becomes more severe the larger is the measurement error [that is, the larger is σ_w^2 in Equation (9.2)].
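Under the classical measurement error model (the measured regressor equals the true X plus noise w that is uncorrelated with X and with the regression error), the OLS slope converges in probability to the true slope multiplied by σ_X^2 / (σ_X^2 + σ_w^2). The sketch below evaluates that limit for a few hypothetical variances; it assumes the classical model and is not a general statement about every form of measurement error.

```python
def attenuated_slope(beta_1, var_x, var_w):
    """Probability limit of the OLS slope when X is observed with
    classical measurement error of variance var_w."""
    if var_x <= 0 or var_w < 0:
        raise ValueError("var_x must be positive and var_w non-negative")
    return beta_1 * var_x / (var_x + var_w)


# Hypothetical true slope of 2 and variance of X equal to 1.
for var_w in (0.0, 0.5, 1.0, 3.0):
    print(f"var_w = {var_w}: slope limit = {attenuated_slope(2.0, 1.0, var_w):.3f}")
```

The limit moves toward zero as the measurement error variance grows, which is the sense in which the inconsistency becomes more severe.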
9.4 Schools with higher-achieving students could be more likely to volunteer to take the test, so that the schools volunteering to take the test are not representative of the population of schools, and sample selection bias will result. For example, if all schools with a low student–teacher ratio take the test, but only the best-performing schools with a high student–teacher ratio do, the estimated class size effect will be biased.
9.5 Cities with high crime rates may decide that they need more police protection and spend more on police, but if police do their job then more police spending reduces crime. Thus, there are causal links from crime rates to police spending and from police spending to crime rates, leading to simultaneous causality bias.
Chapter 10
10.1 Panel data (also called longitudinal data) refers to data for n different entities observed at T different time periods. One of the subscripts, i , identifies the entity, and the other subscript, t , identifies the time period.
10.2 A person’s ability or motivation might affect both education and earnings. More able individuals tend to complete more years of schooling, and, for a given level of education, they tend to have higher earnings. The same is true for highly motivated people. The state of the macroeconomy is a time-specific variable that affects both earnings and education. During recessions, unemployment is high, earnings are low, and enrollment in colleges increases. Person-specific and time-specific fixed effects can be included in the regression to control for person-specific and time-specific variables. In this case, the effect of education on earnings is estimated using the variation in earnings for individuals whose education changed during the 10-year sample.
10.3 When person-specific fixed effects are included in a regression, they capture all features of the individual that do not vary over the sample period. Since sex does not vary over the sample period, its effect on earnings cannot be determined separately from the person-specific fixed effect. Similarly, time fixed effects capture all features of the time period that do not vary across individuals. The national unemployment rate is the same for all individuals in the sample at a given point in time, and thus its effect on earnings cannot be determined separately from the time-specific fixed effect.
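The mechanics behind 10.3 can be seen from entity demeaning, which is one standard way to estimate a regression with person-specific fixed effects. In the sketch below the two individuals, their values, and the name `entity_demean` are all hypothetical. A variable that never changes for a person becomes identically zero after demeaning, so it carries no information for estimating its coefficient.

```python
from statistics import mean


def entity_demean(values_by_entity):
    """Subtract each entity's own time average from its observations."""
    return {
        entity: [v - mean(values) for v in values]
        for entity, values in values_by_entity.items()
    }


# Hypothetical panel: two people observed in three periods.
female = {"A": [1, 1, 1], "B": [0, 0, 0]}           # constant within person
education = {"A": [12, 12, 16], "B": [14, 14, 14]}  # changes only for A

print(entity_demean(female))
print(entity_demean(education))
```

Only person A contributes variation in education after demeaning, which matches the remark in 10.2 that the effect is estimated from individuals whose education changed during the sample.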
10.4 There are several factors that will lead to serial correlation. For example, the economic conditions in a particular individual’s city or industry might be different from the economy-wide average that is captured by the regression’s time fixed effect. If these conditions vary slowly over time, they will lead to serial correlation in the error term. As another example, suppose that in 2009 the individual is lucky and finds a particularly
high-paying job that she keeps through 2014. Other things equal, this will lead to negative values of u_it before 2009 (when the individual’s earnings are lower than her average earnings over 2008-2017), and positive values from 2009 through 2014 (when the individual’s earnings are higher than her average earnings over 2008-2017).