The investigation detailed in the previous post taught us at least two things:
- A sitting president's approval rating can be highly correlated with the national unemployment rate over the course of his/her term.
- There seems to be a link between how long a sitting president has been in office and his/her approval rating. Often, approval rating trends downward as time progresses. Could we say that familiarity breeds contempt?
I decided to take these two variables (unemployment rate and time a president has been in office) and do a crude projection of how the new president-elect's approval rating might fare over the next four years. I did this using OLS (ordinary least squares), i.e. a straightforward linear regression of approval rating on these two variables.
To get future values of unemployment to feed into that regression, I used an ARIMA process to model and project the path of unemployment over the next four years. I took advantage of the Auto ARIMA feature in R, which automatically chooses the ARIMA model that best fits the given data.
(A fuller introduction to what an ARIMA process is can be found here: https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average )
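Basically, ARIMA is a model you can use to forecast future values of a time series (e.g. stock prices, rainfall, frequency of crimes) that is stationary, or that can be made stationary by differencing (the "integrated" part of the name). The series is autoregressive, i.e. it depends on its own past values, and contains a random, stochastic component called "white noise", whose recent values feed into the series through the moving-average terms.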
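In symbols, a general ARIMA(p, d, q) model can be written as below. This is the standard textbook form, included for orientation rather than taken from the R output: B is the backshift operator (B y_t = y_{t-1}), d is the number of times the series is differenced, the \phi_i are the autoregressive coefficients, the \theta_j are the moving-average coefficients, c is an optional constant, and \varepsilon_t is the white noise term.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right) (1 - B)^d \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right) \varepsilon_t
```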
This is an outline of the steps I followed to come up with predicted values for the sitting president's approval rating from 2016 Q4 through 2020 Q4 (this of course includes the approval rating for President Obama's final quarter, 2016 Q4, which is still unknown, so I grouped it into the forecasting period):
1) Use Auto ARIMA feature in R to pick a time series model and forecast unemployment rate for the next four years (2016Q4 through 2020Q4).
The model chosen by R is an ARIMA(1, 1, 2) model, which corresponds to a damped trend linear exponential smoothing model. It has the following AR and MA terms, along with standard error values:
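To make the roles of those AR and MA terms concrete, here is a minimal Python sketch of how an ARIMA(1, 1, 2) model with no constant produces its mean forecast. The analysis in this post was done in R, so this is only an illustration: the function name and every number below (`phi`, `theta1`, `theta2`, the last two observations and the last two residuals) are made-up placeholders, not the fitted values. The model works on quarterly changes, where change_t = phi * change_{t-1} + e_t + theta1 * e_{t-1} + theta2 * e_{t-2}, and future shocks are replaced by their expected value of zero.

```python
def arima_112_mean_forecast(y_last, y_prev, e_last, e_prev,
                            phi, theta1, theta2, steps):
    """Mean forecast path for an ARIMA(1,1,2) model without a constant.

    y_last, y_prev: the two most recent observations (hypothetical inputs).
    e_last, e_prev: the two most recent residuals (hypothetical inputs).
    """
    change_prev = y_last - y_prev  # last observed quarter-over-quarter change
    level = y_last
    path = []
    for h in range(1, steps + 1):
        # Known past shocks only influence the first two forecast steps;
        # all future shocks have expected value zero.
        if h == 1:
            ma_part = theta1 * e_last + theta2 * e_prev
        elif h == 2:
            ma_part = theta2 * e_last
        else:
            ma_part = 0.0
        change = phi * change_prev + ma_part
        level += change            # undo the differencing
        path.append(level)
        change_prev = change
    return path


# Placeholder inputs, chosen only to show the shape of the forecast.
path = arima_112_mean_forecast(y_last=4.9, y_prev=5.0,
                               e_last=-0.05, e_prev=0.02,
                               phi=0.6, theta1=-0.3, theta2=0.1,
                               steps=17)
for h, value in enumerate(path, start=1):
    print(h, round(value, 3))
```

Because the AR coefficient is smaller than 1 in absolute value, each forecasted change is a shrinking fraction of the previous one, so the mean forecast settles onto a flat line after a few quarters.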
Using this ARIMA model, the forecasted unemployment rates by quarter for the next 17 quarters, i.e. from 2016 Q4 to 2020 Q4, came out as:
Forecasted Unemployment Rate (2017-2020)
Could the unemployment rate really change so little over the next four years? Maybe, but keep in mind that this is only the mean predicted value from the ARIMA model. The confidence interval around it is much wider - it extends from below 0% (which we know isn't possible) to above 10%. For simplicity, I am using the mean predicted value in the final forecast of approval rating.
In a way, it makes sense that unemployment would level off at an asymptote around 5%, because that's close to the frictional, or "minimum", unemployment rate seen historically.
For further illustration, here is the unemployment rate "forecast" with its massive 95% confidence intervals shaded in light blue. The mean predicted value I am using is the dark blue line:
2) Fit a linear (OLS) regression model to the data, with Approval Rating as the dependent variable and Unemployment Rate and Quarters as President as the independent variables.
I created the time variable Quarters as President (QAP) to denote how many quarters the sitting president will have been in office as of a given quarter (for the forecast period, this assumes he remains in office for his full term). The QAP for Obama's last quarter in office, 2016 Q4, is 32.
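As a small worked example of that bookkeeping, here is one way the QAP count could be computed in Python. The function name and the convention of counting the first quarter in office as 1 are my assumptions for illustration; they are chosen so that Obama, who took office in 2009 Q1, comes out at 32 in 2016 Q4.

```python
def quarters_as_president(year, quarter, start_year, start_quarter=1):
    """Quarters in office as of (year, quarter), counting the first quarter as 1."""
    return (year - start_year) * 4 + (quarter - start_quarter) + 1


# Obama took office in 2009 Q1.
print(quarters_as_president(2016, 4, start_year=2009))

# A president taking office in 2017 Q1, at the start and end of the term.
print(quarters_as_president(2017, 1, start_year=2017))
print(quarters_as_president(2020, 4, start_year=2017))
```

Under this convention the 17-quarter forecast window consists of one quarter at the end of the outgoing term followed by the 16 quarters of the new one.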
The statistics summary for the linear regression model Approval Rating ~ Unemployment + QAP indicates that while both independent variables are significant (i.e. p-value < 0.05), the R-squared value is only 0.1932. In other words, the model explains only about 19% of the variation in approval rating, so approval rating clearly depends on factors other than the two we are considering here. Nonetheless, this is an exercise in crude and simple estimation, so I pushed forward.
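Written out, the regression and its R-squared look like this. The symbols \beta_0, \beta_1, \beta_2 and \varepsilon_t are generic notation I am introducing here, not values from the R output; hats denote fitted values and the bar denotes the sample mean.

```latex
\text{Approval}_t = \beta_0 + \beta_1 \,\text{Unemployment}_t + \beta_2 \,\text{QAP}_t + \varepsilon_t,
\qquad
R^2 = 1 - \frac{\sum_t \bigl(\text{Approval}_t - \widehat{\text{Approval}}_t\bigr)^2}
               {\sum_t \bigl(\text{Approval}_t - \overline{\text{Approval}}\bigr)^2} = 0.1932
```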
Linear model summary:
3) Use the linear model from step 2 with the forecasted unemployment rate from step 1 to generate predicted mean values and a confidence interval for Approval Rating in the forecast period (2016 Q4 - 2020 Q4).
This is the resulting output from the linear regression model in step 2 after plugging in the forecasted unemployment data and QAP (the "fit" column is the actual prediction; "lwr" and "upr" are the lower and upper bounds of the default 95% confidence interval):
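For readers who want to see the mechanics behind a table like this, here is a rough Python sketch of fitting the regression and producing "fit", "lwr" and "upr" columns. It is not the code used for this post, which was done in R. The data are synthetic, the function names are hypothetical, the flat 5.0% unemployment path is a placeholder, and the interval uses a large-sample normal quantile where an exact small-sample interval would use a t quantile.

```python
import numpy as np
from statistics import NormalDist


def fit_ols(X, y):
    """Fit y ~ 1 + X by least squares; return coefficients and pieces for intervals."""
    A = np.column_stack([np.ones(len(y)), X])       # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]                       # residual degrees of freedom
    sigma2 = float(resid @ resid) / dof             # residual variance estimate
    xtx_inv = np.linalg.inv(A.T @ A)
    return beta, sigma2, xtx_inv


def predict_mean_interval(beta, sigma2, xtx_inv, X_new, level=0.95):
    """Columns: fit, lwr, upr for the mean response (normal approximation)."""
    A_new = np.column_stack([np.ones(len(X_new)), X_new])
    fit = A_new @ beta
    # Standard error of the fitted mean: sqrt(sigma2 * a' (X'X)^-1 a) per row.
    se = np.sqrt(sigma2 * np.einsum("ij,jk,ik->i", A_new, xtx_inv, A_new))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return np.column_stack([fit, fit - z * se, fit + z * se])


# Synthetic stand-in for the historical data (NOT the real approval series).
rng = np.random.default_rng(0)
unemployment = rng.uniform(4, 10, 200)
qap = rng.integers(1, 33, 200)
approval = 60 - 1.5 * unemployment - 0.3 * qap + rng.normal(0, 8, 200)

beta, sigma2, xtx_inv = fit_ols(np.column_stack([unemployment, qap]), approval)

# Forecast window: QAP 32 for 2016 Q4, then 1..16 for the new term;
# unemployment held at a placeholder 5.0% in every quarter.
qap_future = np.array([32] + list(range(1, 17)))
X_future = np.column_stack([np.full(17, 5.0), qap_future])
table = predict_mean_interval(beta, sigma2, xtx_inv, X_future)
print(np.round(table, 2))
```

Note that an interval of this kind describes uncertainty about the average approval rating at those predictor values, not the range within which any single quarter's rating is expected to fall.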
Here is the same forecast shown graphically, with the default upper and lower 95% confidence bounds, appended to the actual historical presidential approval rating data from 2010 onwards:
Forecasted Approval Rating (2017-2020)
Conclusion and Interpretations:
A couple of observations:
- The QAP (quarters as president) variable is responsible for predicting the initial spike in approval rating that comes at the beginning of a new president's term, as the country anticipates the change to a fresh new administration, often with a certain degree of optimism and hopefulness. But because Donald Trump is known to not be terribly popular as an incoming president, it's unlikely that this would be a very pronounced spike.
- The current unemployment rate (about 5%) is not likely to get much lower, since it is already near the floor set by frictional unemployment. Also, the QAP variable captures the fact that approval rating tends to go down over time, rather than up, as the nation grows weary of a president. In other words, after the initial "spike" in approval, there's not much room for Trump's approval rating to increase significantly - at least not on the basis of the unemployment rate. However, as I noted earlier, the R-squared value for this model is only 0.1932, so there are many other potential explanatory variables that could determine how the new president's popularity plays out.
Data sources:
1. The American Presidency Project, UCSB (http://www.presidency.ucsb.edu/)
2. FRED, Civilian Unemployment Rate (https://fred.stlouisfed.org/)







