Friday, December 2, 2016

Predicting Trump's First Term Approval Rating

If we had to model a president's approval rating, what independent (explanatory) variables might we choose for our model?

The investigation detailed in the previous post taught us at least two things: 
  1. A sitting president's approval rating can be highly correlated with the national unemployment rate over the course of his/her term.
  2. There seems to be a link between how long a sitting president has been in office and his/her approval rating. Often, approval rating trends downward as time progresses. Could we say that familiarity breeds contempt?
I decided to take these two variables (unemployment rate and time a president has been in office) and do a crude projection to see how the new president-elect might fare over the next four years. I did this using OLS (ordinary least squares), i.e. a straight-up, simple linear regression of approval rating on these two variables. 

However, I used an ARIMA process to model and project the path of unemployment over the next four years. I took advantage of the Auto ARIMA feature in R, which automatically chooses an ARIMA model to best fit the given data.

(A fuller introduction to what an ARIMA process is can be found here: https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average )

Basically, ARIMA is a model you can use to forecast future values of a time series (e.g. stock prices, rainfall, frequency of crimes) that is stationary, or can be made stationary by differencing (the "integrated" part of the name). The series is autoregressive, i.e. it depends on its own past values, and contains a random, stochastic component, called "white noise".

This is an outline of the steps I followed to come up with predicted values for the sitting president's approval rating from 2016 Q4 through 2020 Q4 (this of course includes President Obama's final-quarter approval rating, which is still unknown, so I grouped it in the forecasting period):


1) Use Auto ARIMA feature in R to pick a time series model and forecast unemployment rate for the next four years (2016Q4 through 2020Q4).

The model chosen by R is an ARIMA(1, 1, 2) model, or a damped trend linear exponential smoothing model, that has the following AR and MA terms, along with standard error values:


Using this model, the forecasted unemployment rates by quarter for the next 17 quarters, i.e. from 2016 Q4 to 2020 Q4, came out as:


                        Forecasted Unemployment Rate (2017-2020)

Could the unemployment rate really change so little over the next four years? Maybe, but this is only the mean predicted value from the ARIMA model. The 95% interval around it is much wider: it extends from below 0% (which we know isn't possible) to above 10%. For simplicity, I am using the mean predicted value in the final forecast of approval rating. 

In a way, it makes sense that unemployment would hit that asymptotic line around 5%, because that's close to the frictional, or "minimum" unemployment rate seen historically.
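To see why the mean forecast flattens out, here is a minimal sketch in plain Python (the post itself used R's Auto ARIMA) of how point forecasts from an ARIMA(1, 1, 2) model without a drift term are built up. The function name and every input value below are hypothetical, chosen only for illustration; they are not the coefficients R estimated. Future shocks have an expected value of zero, so the MA terms drop out after two steps and the forecasted change shrinks by a factor of phi each quarter.

```python
def arima112_mean_forecast(last_level, last_diff, last_resids,
                           phi, theta1, theta2, steps):
    """Mean (point) forecasts for an ARIMA(1,1,2) model with no drift.

    The differenced series d[t] = y[t] - y[t-1] is assumed to follow
        d[t] = phi*d[t-1] + e[t] + theta1*e[t-1] + theta2*e[t-2]
    and all future shocks e are replaced by their expected value, zero.
    last_resids = (e[T], e[T-1]) are the two most recent residuals.
    """
    e_t, e_tm1 = last_resids
    level, diff = last_level, last_diff
    forecasts = []
    for h in range(1, steps + 1):
        # Known residuals only influence the first two steps ahead.
        if h == 1:
            ma_part = theta1 * e_t + theta2 * e_tm1
        elif h == 2:
            ma_part = theta2 * e_t
        else:
            ma_part = 0.0
        diff = phi * diff + ma_part   # forecasted quarterly change
        level += diff                 # forecasted unemployment level
        forecasts.append(level)
    return forecasts


# Hypothetical inputs, NOT the values estimated in the post.
path = arima112_mean_forecast(
    last_level=4.9, last_diff=-0.1, last_resids=(0.05, -0.02),
    phi=0.6, theta1=-0.3, theta2=0.1, steps=17,
)
print([round(x, 2) for x in path])
```

With |phi| < 1 the quarterly changes decay geometrically, so the mean path settles toward a constant level, while the uncertainty around it keeps growing with the horizon.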

For further illustration, here is the unemployment rate "forecast" with massive 95% confidence intervals in light shaded blue. The mean predicted value I am using is the dark blue line:


2) Fit a linear (OLS) regression model to the data for the dependent variable Approval Rating in terms of the independent variables (Unemployment Rate, Quarters as President).

I created the time variable Quarters as President (QAP) to simply denote how many quarters the sitting president will have been in office as of that quarter during the forecast period (assuming he remains in office for his full term). The QAP for Obama's last quarter, 2016Q4, is 32.
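As a small illustration of how the QAP variable could be laid out over the forecast period, here is a sketch in plain Python (not the R code used for the post). The helper name is hypothetical, and it assumes the count restarts at 1 in 2017Q1, which matches the convention that puts Obama's 2016Q4 at 32.

```python
def quarter_labels(start_year, start_quarter, n):
    """Return n consecutive quarter labels such as '2016Q4', '2017Q1', ..."""
    labels = []
    year, quarter = start_year, start_quarter
    for _ in range(n):
        labels.append(f"{year}Q{quarter}")
        quarter += 1
        if quarter > 4:
            year, quarter = year + 1, 1
    return labels


# 17 forecast quarters: 2016Q4 through 2020Q4.
quarters = quarter_labels(2016, 4, 17)

# 2016Q4 is the outgoing president's 32nd quarter; the count then
# restarts at 1 for the incoming president (assumed convention).
qap = [32] + list(range(1, 17))

forecast_frame = list(zip(quarters, qap))
for label, value in forecast_frame:
    print(label, value)
```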


The statistics summary for the linear regression model Approval Rating ~ Unemployment + QAP indicates that while both independent variables are significant (i.e. p-value < 0.05), the R-squared value is only 0.1932, which means that approval rating clearly depends on factors other than the two we are considering here. Nonetheless, this is an exercise in crude and simple estimation, so I pushed forward.
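For readers who want to see what a regression of this form involves, here is a minimal plain-Python sketch (the post used R) that fits Approval ~ Unemployment + QAP by solving the normal equations and reports R-squared. The function names are hypothetical and the eight data points are made up purely for illustration; they are not the data behind the 0.1932 figure.

```python
def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, x1, x2):
    """Fit y = b0 + b1*x1 + b2*x2 by least squares; return (betas, R^2)."""
    rows = [[1.0, u, q] for u, q in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Made-up illustrative data, NOT the series used in the post.
unemployment = [5.0, 6.2, 7.5, 8.1, 7.4, 6.3, 5.6, 5.1]
qap = [1, 2, 3, 4, 5, 6, 7, 8]
approval = [62, 58, 50, 47, 49, 48, 50, 46]

beta, r_squared = ols(approval, unemployment, qap)
print("intercept, unemployment, QAP:", [round(b, 3) for b in beta])
print("R-squared:", round(r_squared, 4))

# Point prediction for a hypothetical quarter (unemployment 4.9%, QAP 1).
print("predicted approval:", round(beta[0] + beta[1] * 4.9 + beta[2] * 1, 1))
```

A low R-squared, as in the post, simply means that most of the variation in approval is left unexplained by these two predictors, even if both coefficients are individually significant.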

Linear model summary:



3) Use the linear model from step 2 with the forecasted unemployment rate from step 1 to generate predicted mean values and a confidence interval for Approval Rating in the forecast period (2016 Q4 - 2020 Q4).

This is the resulting output from the linear regression model in step 2 after plugging in the forecasted unemployment data and QAP (the "fit" column is the actual prediction; "lwr" and "upr" are the lower and upper bounds of the default 95% confidence interval):


Shown graphically with upper and lower 95% default confidence intervals, and appended to the actual historical presidential approval rating data from 2010 onwards:

          Forecasted Approval Rating (2017-2020)

Conclusion and Interpretations:

A couple of observations:
  • The QAP (quarters as president) variable is responsible for predicting the initial spike in approval rating that comes at the beginning of a new president's term, as the country anticipates the change to a fresh new administration, often with a certain degree of optimism and hopefulness. But because Donald Trump is known to not be terribly popular as an incoming president, it's unlikely that this would be a very pronounced spike.
  • The current unemployment rate (about 5%) is not likely to get much lower, since this is near the lower boundary due to frictional unemployment. Also, the QAP variable captures the fact that approval rating tends to go down over time, rather than up, as the nation grows weary of a president. In other words, after the initial "spike" in approval, there's not much room for Trump's approval rating to increase significantly - at least not on the basis of unemployment rate. However, as I noted earlier, the R-squared value for this model is only 0.1932, so there are many other potential explanatory variables that could determine how the new president's popularity plays out.

Data sources:

1. The American Presidency Project, UCSB (http://www.presidency.ucsb.edu/)
2. FRED, Civilian Unemployment Rate (https://fred.stlouisfed.org/)

Sunday, November 27, 2016

The Link Between Unemployment and Presidential Approval Rating

The last post attempted to answer the question, does it matter who the president is for the purpose of determining national economic performance (growth)? The answer was, at least statistically speaking, no, probably not. 

But does the president get the blame for a poorly performing economy (and the credit for a strongly performing one)? 

In this investigation, I gathered and aggregated data on national unemployment rate and presidential approval rating by quarter, from 1977Q1 to 2016Q2. This time, I selected unemployment rate as the economic performance metric because the availability of jobs and hiring conditions are probably the factors most directly related to how the average Joe feels about the state of the economy (more so than inflation, GDP, etc). 

The below graphs are scatter plots for each of the six presidencies from Carter to Obama of average quarterly presidential approval rating against average quarterly civilian unemployment rate. Each dot on the graph represents one quarter during that president's term.










Conclusions and Interpretations:

When the economy was the main theme of a particular presidency, the poll numbers showed that Americans' satisfaction with the sitting president was strongly correlated with how the economy was performing.

There are three presidencies (Reagan, Bush Sr, Clinton) where the unemployment rate was highly negatively correlated with approval rating - that is, the president's approval rating was higher when unemployment was low, and vice versa. Logically, this is what one might expect. These correlations are statistically significant, as seen from the p-value on the right hand side of the above table, which tells us the probability of observing a correlation at least this strong if the null hypothesis were true (i.e. if there were actually no correlation between the two variables whatsoever). All three of these presidencies have p-values under 0.05, so we can reject the hypothesis of no correlation between approval rating and unemployment at the 5% significance level, and all of the correlations are negative, as shown by the correlation coefficients. 

Interestingly, the other three presidencies (Carter, George W. Bush, Obama) showed slightly positive correlations between the two variables. However, one should not conclude that these presidents benefited in terms of popularity from a sinking economy, because these are all weak correlations, and none of them is statistically significant at the 5% level. What this suggests is that factors other than unemployment or economic performance played a more significant role in determining these presidents' approval ratings.
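The link between a correlation coefficient and its significance can be sketched in a few lines of plain Python (the function names are hypothetical). Under the null hypothesis of no correlation, the statistic t = r * sqrt((n - 2) / (1 - r^2)) follows a Student's t distribution with n - 2 degrees of freedom. The example plugs in Clinton's coefficient of -0.762 and assumes n = 32 quarterly observations (eight years of four quarters); the exact sample size used in the analysis is an assumption here.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing H0: true correlation is zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Clinton's coefficient from the post; n = 32 quarters is an assumption.
t_clinton = t_statistic(-0.762, 32)

# Two-sided 5% critical value of Student's t with 30 degrees of freedom,
# taken from a standard t table.
T_CRIT_30 = 2.042

print("t statistic:", round(t_clinton, 2))
print("significant at 5%:", abs(t_clinton) > T_CRIT_30)
```

A weak coefficient over a similar number of quarters gives a |t| well below the critical value, which is why the three slightly positive correlations cannot be distinguished from zero.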

Here's a brief look at the presidencies on a case-by-case basis:

1) Jimmy Carter (weak positive/no correlation): He came into office with a high approval rating (almost 70%) even though unemployment was about 7% at the time. This has a lot to do with Republicans falling out of favor following Watergate and Richard Nixon's resignation. Carter's approval did surely decline as the economy worsened and inflation skyrocketed. But this effect is not fully apparent from the above data because of his initially high rating and the fact that his presidency lasted only four years.

2) Ronald Reagan (strong negative correlation): The 1980's began with a recession, which was followed by a long period of expansion and prosperity. Reagan left office with a high approval rating, no doubt in large part because of the country's economic growth under his presidency.

3) George H.W. Bush (strong negative correlation): A souring economy in the early 1990's ended up being the dominant factor in driving Bush Sr's approval ratings down, and a key reason for his defeat in the 1992 presidential election.

4) Bill Clinton (strong negative correlation): "The economy, stupid" was the focus of Clinton's presidential campaign in 1992. His presidency, which shows the strongest correlation of the six by far (-0.762), coincided with a long boom in economic growth and technological innovation. Generally speaking, Americans responded to these good economic times with high regard for their leader, even amidst the clamor of an impeachment. 

5) George W. Bush (weak positive/no correlation): Never mind how the economy was doing - the country was widely upset over the state of the wars in Iraq and Afghanistan, and this drowned out the "economy" correlation effect. Of course the 2008 financial crisis didn't help Bush's ratings, but that was already near the end of his term.

6) Barack Obama (weak positive/no correlation): He came in highly popular and with great anticipation in the midst of the economic disaster. Unemployment decreased by half during his time in office. However, it seems that Americans did not give Obama proportionately greater props, as gripes over the president's approach to healthcare reform, partisan deadlock in Congress, and handling of other foreign and domestic issues prevented his approval rating from breaking 50% for most of his two terms. 



Data sources:
1. Presidential Approval Rating: The American Presidency Project, University of California, Santa Barbara (http://www.presidency.ucsb.edu/)
2. Unemployment by quarter: Federal Reserve Bank of St. Louis, Civilian Unemployment Rate 
(https://fred.stlouisfed.org/)


Monday, November 21, 2016

No Evidence that Economic Growth Differs Based on Who is President

If you divide the last 40 years into the six respective presidencies since 1977 (Carter, Reagan, Bush Sr, Clinton, W. Bush, Obama) and treat Real GDP Growth as a normally distributed random variable which produced 40 independent observations, is there a statistically significant difference in the mean GDP growth parameter (μ) between the six presidencies?

Below is the list of years analyzed, from 1977 to 2015 (actually only 39 observations, not 40, because the full year GDP growth for 2016 is not yet determined).

The one-way ANOVA test for equality of means essentially compares inter-group variation to intra-group variation to determine whether the difference in group means is statistically significant.

The assumption that economic growth for the 39 past years is approximately normal appears reasonable, judging from the histogram of GDP Growth.

GDP Growth by Year and President
(Source: US Bureau of Economic Analysis)

Obs  President  Party       Year  Real Annual GDP Growth (%)
1    Carter     Democrat    1977   4.90
2    Carter     Democrat    1978   6.68
3    Carter     Democrat    1979   1.30
4    Carter     Democrat    1980   0.00
5    Reagan     Republican  1981   1.29
6    Reagan     Republican  1982  -1.40
7    Reagan     Republican  1983   7.83
8    Reagan     Republican  1984   5.63
9    Reagan     Republican  1985   4.28
10   Reagan     Republican  1986   2.94
11   Reagan     Republican  1987   4.45
12   Reagan     Republican  1988   3.84
13   Bush       Republican  1989   2.78
14   Bush       Republican  1990   0.65
15   Bush       Republican  1991   1.22
16   Bush       Republican  1992   4.33
17   Clinton    Democrat    1993   2.63
18   Clinton    Democrat    1994   4.13
19   Clinton    Democrat    1995   2.28
20   Clinton    Democrat    1996   3.80
21   Clinton    Democrat    1997   4.45
22   Clinton    Democrat    1998   5.00
23   Clinton    Democrat    1999   4.69
24   Clinton    Democrat    2000   2.89
25   W. Bush    Republican  2001   0.21
26   W. Bush    Republican  2002   2.04
27   W. Bush    Republican  2003   4.36
28   W. Bush    Republican  2004   3.12
29   W. Bush    Republican  2005   3.03
30   W. Bush    Republican  2006   2.39
31   W. Bush    Republican  2007   1.87
32   W. Bush    Republican  2008  -2.70
33   Obama      Democrat    2009  -0.20
34   Obama      Democrat    2010   2.73
35   Obama      Democrat    2011   1.68
36   Obama      Democrat    2012   1.28
37   Obama      Democrat    2013   2.66
38   Obama      Democrat    2014   2.49
39   Obama      Democrat    2015   1.88













Conclusion: 


The F-Value from the ANOVA test indicates that there is not enough evidence (at the 5% significance level) to conclude that the different presidencies had inherently different mean real GDP growth parameters.

In other words, we cannot conclude that the observed yearly historical differences in economic growth level by presidency are anything other than the expected level of variation caused by other factors completely separate from the identity of the sitting president.
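The F statistic can be recomputed by hand from the 39 growth figures listed in this post. The sketch below is plain Python (not the tool used for the original analysis) and the function name is hypothetical. It compares the between-presidency mean square with the within-presidency mean square; the 5% critical value of roughly 2.5 for 5 and 33 degrees of freedom is an approximate figure read from standard F tables.

```python
# Real annual GDP growth (%) by presidency, 1977-2015, as listed in the post.
growth = {
    "Carter":  [4.90, 6.68, 1.30, 0.00],
    "Reagan":  [1.29, -1.40, 7.83, 5.63, 4.28, 2.94, 4.45, 3.84],
    "Bush":    [2.78, 0.65, 1.22, 4.33],
    "Clinton": [2.63, 4.13, 2.28, 3.80, 4.45, 5.00, 4.69, 2.89],
    "W. Bush": [0.21, 2.04, 4.36, 3.12, 3.03, 2.39, 1.87, -2.70],
    "Obama":   [-0.20, 2.73, 1.68, 1.28, 2.66, 2.49, 1.88],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Variation of individual years around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_value, df_between, df_within = one_way_anova(list(growth.values()))

# Approximate 5% critical value for F(5, 33) from standard tables.
F_CRIT_APPROX = 2.5

print("F =", round(f_value, 3), "with df =", (df_between, df_within))
print("reject equal means at 5%:", f_value > F_CRIT_APPROX)
```

An F value below the critical value means the spread between presidential averages is no larger than what the year-to-year variation within presidencies would produce by chance.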

Why this is true, in my view:

1) The Federal Reserve has a more direct and immediate role than the President of the United States in influencing GDP growth through monetary policy. For example, by sharply increasing interest rates, the Fed can curb economic growth and even cause a recession (as happened in the early 1980's).

2) Policies of the current presidential administration may not have an impact until many years (and many presidencies) later. Just as investments in infrastructure, education, and the like will boost economic growth somewhere down the road, bad economic policies can cause financial disasters many years after they are implemented.

3) The business cycle is inherent to a capitalist economy, and periods of growth and contraction are influenced by a wide variety of factors that are out of the president's immediate control - these include consumer spending and confidence, business confidence and uncertainty, world political events, oil shocks, etc.

What the one-way ANOVA test cleverly captures is that GDP growth has on average varied enough within presidencies, no doubt in large part due to the natural boom-and-bust cycles, that we cannot reliably conclude that variance between presidencies is significant.






