









ISYE 6501 Exam Questions and Answers (Solved Papers)
Typology: Exams
Classification problems are commonly solved using what model(s)? - Correct Answers: Support Vector Machine
Clustering problems are commonly solved using what model(s)? - Correct Answers: k-means
Response Prediction questions are commonly solved using what model(s)? - Correct Answers: ARIMA, CART, Exponential smoothing, linear regression, logistic regression, Random Forest
Validation questions are commonly solved using what model(s)? - Correct Answers: Cross Validation
Variance Estimation questions are commonly solved using what model(s)? - Correct Answers: GARCH
Examples of models that are designed for use with time series data - Correct Answers: ARIMA, CUSUM, Exponential Smoothing, GARCH
In the soft classification SVM model where we select coefficients a_0 ... a_m to minimize sum(max(0, 1 - (sum(a_i * x_ij) + a_0 ) * y_j )
True or False: When using a random forest model, it's easy to interpret how its results are determined. - Correct Answers: False. Explanation: Unlike a model like regression, where we can show the result as a simple linear combination of each attribute times its regression coefficient, a random forest model uses so many different trees simultaneously that it's difficult to interpret exactly how any factor or factors affect the result. Lesson
A common rule of thumb is to stop branching if a leaf would contain less than 5% of the data points. Why not keep branching and allow models to find very close fits to each very small subset of data? - Correct Answers: Fitting to very small subsets of data will cause overfitting. Explanation: With too few data points, the models will fit to random patterns as well as real ones. Lesson 10.
True or false: In a regression tree, every leaf of the tree has a different regression model that might use different attributes, have different coefficients, etc. - Correct Answers: True. Explanation: Each leaf's individual model is tailored to the subset of data points that follow all of the branches leading to the leaf. Lesson 10.
True or false: Tree-based approaches can be used for other models besides regression. - Correct Answers: True. Explanation: For example, a classification tree might have a different SVM or KNN model at each leaf. It might even use SVM at some leaves and KNN at others (though that's probably rare). Lesson 10.
What does "heteroscedasticity" mean? - Correct Answers: The variance is different in different ranges of the data. Lesson 9.
You might want to de-trend data before... - Correct Answers: ...using time-series data in a regression model.
When would regression be used instead of a time series model? - Correct Answers: When there are other factors or predictors that affect the response. Explanation: Regression helps show the relationships between factors and a response. Lesson 8.
If two models are approximately equally good, measures like AIC and BIC will favor the simpler model. Simpler models are often better because... - Correct Answers: 1. Simple models are easier to explain and "sell" to managers and executives
Explanation: Regression is often good for describing and predicting, but is not as helpful for suggesting a course of action. Lesson 8.
True or false: regression is a way to determine whether one thing causes another. - Correct Answers: False. Explanation: Regression can show relationships between observations, but it doesn't show whether one thing causes another. Lesson 8.
Suppose our regression model to estimate how tall a 2-year-old will be as an adult has the following coefficients: 0.56 × FatherHeight + 0.51 × MotherHeight - 0.02 × FatherHeight × MotherHeight. The negative sign on the coefficient of FatherHeight × MotherHeight means: - Correct Answers: People with two taller-than-average parents won't be as tall as the individual effects of father's height and mother's height add up to. Explanation: The negative coefficient for the interaction term brings down the overall estimate. Lesson 8.
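The interaction-term answer above can be checked numerically. The sketch below is only illustrative: the question gives no intercept or units, so it assumes (hypothetically) that both heights are entered as inches above average. The function name `height_terms` is made up for this example.

```python
def height_terms(father_height, mother_height):
    """Return (sum of individual effects, interaction effect).

    Uses only the three coefficients quoted in the question.
    Assumption: heights are deviations from average, in inches.
    """
    individual = 0.56 * father_height + 0.51 * mother_height
    interaction = -0.02 * father_height * mother_height
    return individual, interaction


# Two taller-than-average parents (hypothetical values: +4 and +3 inches).
individual, interaction = height_terms(4.0, 3.0)
print("individual effects:", individual)
print("interaction effect:", interaction)
print("combined estimate: ", individual + interaction)
```

With both inputs positive, the interaction effect is negative, so the combined estimate falls below the sum of the individual effects, which is what the answer states.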
True or False: In the exponential smoothing equation $S_t = \alpha x_t + (1 - \alpha) S_{t-1}$, only the current observation $x_t$ is considered in calculating the estimate $S_t$. - Correct Answers: False. Explanation: Plugging in for $S_{t-1}$, and then for $S_{t-2}$, etc., shows that $S_t = \alpha x_t + (1 - \alpha)\alpha x_{t-1} + (1 - \alpha)^2 \alpha x_{t-2} + (1 - \alpha)^3 \alpha x_{t-3} + \dots$ Lesson
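The expansion in the explanation above can be verified with a short sketch. It assumes a finite series with a starting estimate `s0` (not specified in the question), so the expanded form carries a final term for `s0`. The function names are hypothetical.

```python
def smooth_recursive(xs, alpha, s0):
    """Apply the exponential smoothing recursion once per observation."""
    s = s0
    for x in xs:
        s = alpha * x + (1 - alpha) * s
    return s


def smooth_expanded(xs, alpha, s0):
    """Same estimate as a weighted sum over every past observation.

    The observation k steps back gets weight alpha * (1 - alpha)**k.
    """
    t = len(xs)
    total = (1 - alpha) ** t * s0  # leftover weight on the starting estimate
    for k in range(t):
        total += alpha * (1 - alpha) ** k * xs[t - 1 - k]
    return total


observations = [10.0, 12.0, 11.0, 15.0, 14.0]  # made-up data
print(smooth_recursive(observations, 0.3, 10.0))
print(smooth_expanded(observations, 0.3, 10.0))
```

Both functions give the same value up to floating-point rounding, showing that every earlier observation contributes, with geometrically shrinking weight.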
Is exponential smoothing better for short-term forecasting or long-term forecasting? - Correct Answers: Short-term. Explanation: Exponential smoothing bases its forecast primarily on the most-recent data points. For forecasts of the longer-term future, there aren't data points close to the time being forecasted. Lesson 7.
What does autoregression mean? - Correct Answers: Previous values of the thing being estimated are used to calculate the estimate. Explanation: Its own previous values are used in the estimate. Lesson 7.
Why is GARCH different from ARIMA and exponential smoothing? - Correct Answers: GARCH estimates variance. Explanation: ARIMA and exponential smoothing both estimate the value of an attribute; GARCH estimates the variance. Lesson 7.
In the CUSUM model, having a higher threshold T makes it... - Correct Answers: ...detect changes slower, and less likely to falsely detect changes. Explanation: A higher threshold makes it slower to detect both true and false changes. Lesson 6.
Why are hypothesis tests often not sufficient for change detection? - Correct Answers: They are often slow to detect changes. Explanation: Hypothesis tests generally have high threshold levels, which makes them slow to detect changes. Lesson 6.
Which of these is generally a good reason to remove an outlier from your data set?
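Returning to the CUSUM threshold question above, the effect of T can be seen in a small sketch. It assumes the common increase-detection form S_t = max(0, S_{t-1} + (x_t - mu - C)), which the questions here do not spell out; `mu`, `c`, and the data are hypothetical.

```python
def cusum_first_detection(xs, mu, c, threshold):
    """Return the first index where the CUSUM statistic reaches the threshold.

    Assumed form: S_t = max(0, S_{t-1} + (x_t - mu - c)); None if never reached.
    """
    s = 0.0
    for t, x in enumerate(xs):
        s = max(0.0, s + (x - mu - c))
        if s >= threshold:
            return t
    return None


# Made-up series: a random blip at index 2, then a real shift from index 5 on.
series = [0.0, 0.0, 1.5, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0, 2.0]

print(cusum_first_detection(series, mu=0.0, c=0.5, threshold=1.0))  # fires on the blip
print(cusum_first_detection(series, mu=0.0, c=0.5, threshold=4.0))  # fires after the shift
```

In this toy series, the low threshold raises a false alarm at the blip, while the high threshold ignores the blip but only flags the real shift two steps after it begins.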
Explanation: Because you know the correct classification for hundreds of images, you can build a model to classify the rest (supervised learning). Lesson 4.
The k-means algorithm for clustering is a "heuristic" because... - Correct Answers: ...it isn't guaranteed to get the best answer. Explanation: Heuristic algorithms are not guaranteed to find the best answer. Lesson 4.
Straight-line distance $\sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ corresponds to which distance metric? - Correct Answers: 2-norm. Explanation: The power and root are the same as the norm. Lesson 4.
In k-fold cross-validation, how many times is each part of the data used for training, and for validation? - Correct Answers: k-1 times for training, and 1 time for validation
Explanation: Each of the k times the model is fit, a different part of the data is used for validation and the rest is used for training. Lesson 3.
Which should we use most of the data for: training, validation, or test? - Correct Answers: Training. Explanation: Most experts recommend using 50-70% of the data for training, and splitting the rest equally between validation and test. Lesson 3.
When comparing models, if we use the same data to pick the best model as we do to estimate how good the best one is, what is likely to happen? - Correct Answers: The model will appear to be better than it really is. Explanation: The model with the highest measured performance is likely to be both good and lucky in its fit to random patterns. Lesson 3.
If we use the same data to fit a model as we do to estimate how good it is, what is likely to happen? - Correct Answers: The model will appear to be better than it really is.
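For the k-fold cross-validation question above, the counts can be confirmed by simply tallying how each data point is used. This is a minimal sketch with a hypothetical helper `k_fold_usage` and a simple round-robin split; real libraries may shuffle or split differently.

```python
def k_fold_usage(n_points, k):
    """Count how often each point is used for training and for validation."""
    # Round-robin assignment of point indices to k folds (an assumption).
    folds = [list(range(i, n_points, k)) for i in range(k)]
    train_count = [0] * n_points
    valid_count = [0] * n_points
    for held_out in range(k):  # one model fit per fold
        for fold_id, fold in enumerate(folds):
            for idx in fold:
                if fold_id == held_out:
                    valid_count[idx] += 1
                else:
                    train_count[idx] += 1
    return train_count, valid_count


train, valid = k_fold_usage(n_points=20, k=5)
print(set(train), set(valid))
```

With k = 5, every point is used for training in 4 of the fits and for validation in exactly 1.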
Explanation: The multiplier for classification errors is 200 for data points 21-50, much more than 5 for data points 1-20. Lesson 2.
Which of these two terms measures the error in classifying all of the data points? A. $\sum_{i=1}^{m} (a_i)$ B. $\sum_{j=1}^{n} \max\{0,\ 1 - (\sum_{i=1}^{m} a_i x_{ij} + a_0)\, y_j\}$ - Correct Answers: B. Explanation: This term measures classification error. Lesson 2.
A survey of 25 people recorded each person's family size and type of car. Which of these is a data point? A. The 14th person's family size and car type B. The 14th person's family size C. The car type of each person - Correct Answers: A. Explanation: A data point is all the information about one observation. Lesson 2.
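As an illustration of the classification-error term (option B in the SVM question above), the sketch below evaluates it for a tiny made-up data set. The coefficients, points, and function name are hypothetical; labels are assumed to be +1 or -1.

```python
def classification_error_term(points, labels, a, a0):
    """Sum over data points j of max(0, 1 - (sum_i a_i * x_ij + a_0) * y_j)."""
    total = 0.0
    for x, y in zip(points, labels):
        score = sum(a_i * x_i for a_i, x_i in zip(a, x)) + a0
        total += max(0.0, 1.0 - score * y)
    return total


# Made-up example: the first two points are well classified,
# the third lies on the wrong side of the separator.
points = [(2.0, 0.0), (0.0, 2.0), (0.5, 0.0)]
labels = [1, -1, -1]
print(classification_error_term(points, labels, a=[1.0, -1.0], a0=0.0))
```

Points classified correctly and outside the margin add nothing to the sum; only the misclassified third point contributes.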
A survey of 25 people recorded each person's family size and type of car. Which of these is structured data? A. The contents of a person's Twitter feed B. The amount of money in a person's bank account - Correct Answers: B. Explanation: Every entry will be a number of dollars and cents. Lesson 2.
A survey of 25 people recorded each person's family size and type of car. Which of these is time series data? A. The average cost of a house in the United States every year since 1820. B. The height of each professional basketball player in the NBA at the start of the season - Correct Answers: A. Explanation: The same thing is measured at yearly time intervals. Lesson 2.