The Influence of Reference Group Contributions on a Family's Charitable Giving

This paper began as my thesis for a Master's in Economics at the University of North Carolina at Charlotte and was then revised under the supervision of Amihai Glazer at the University of California, Irvine. It was presented at the 2016 Associated Graduate Student Symposium at UC Irvine, where it was named the Judges' Winner in Economics.

Notes on Macroeconomics

These notes were written during 2015 and 2016 and incorporate the lessons of Guillaume Rocheteau, William Branch, and Eric Swanson. A revised and complete version will be posted once it is finished.

Notes on Microeconomics

These notes were written during 2015 and 2016 and incorporate the lessons of Stergios Skaperdas, Jean-Paul Carvalho, and Igor Kopylov. A revised and complete version will be posted once it is finished.

Conditional Logit Model

Estimation of a conditional logit begins with the likelihood function $$\mathscr{L}(\beta)=\prod_{n=1}^N\prod_{j=1}^J(P_{nj})^{y_{nj}},$$ which has log-likelihood $$\ln\mathscr{L}(\beta)=\sum_{n=1}^N\sum_{j=1}^Jy_{nj}\ln(P_{nj}),$$ where $$y_{nj} =\begin{cases}1, & \text{if } n \text{ chooses }j \\ 0, & \text{otherwise,}\end{cases}$$ and the probability that $n$ chooses $j$ is $$P_{nj} = \frac{\exp(x_{nj}'\beta)}{\sum_{k=1}^J\exp(x_{nk}'\beta)}.$$ For the conditional logit in particular, the first-order condition is $$\frac{\partial\ln\mathscr{L}(\beta)}{\partial\beta} = \sum_{n=1}^N\sum_{j=1}^J(y_{nj}-P_{nj})x_{nj} = 0,$$ and the Hessian, which is negative semi-definite, is given by $$\frac{\partial^2\ln\mathscr{L}(\beta)}{\partial\beta\partial\beta'} = -\sum_{n=1}^N\sum_{j=1}^JP_{nj}(x_{nj}-\bar{x}_n)(x_{nj}-\bar{x}_n)',$$ where $$\bar{x}_n = \sum_{j=1}^JP_{nj}x_{nj}.$$ The asymptotic variance of the conditional logit estimator is $$I^{-1}= \bigg(-E\bigg[\frac{\partial^2\ln\mathscr{L}(\beta)}{\partial \beta \partial \beta'}\bigg]\bigg)^{-1}.$$ The model is estimated by maximum likelihood using the Newton-Raphson algorithm. Below is the algorithm written in Python 2.7. Attached is an example that uses data from a stated-preference experiment gauging demand for a proposed mag-lev rail system in Switzerland called Swiss Metro. The example fits a conditional logit model of the choice between car, rail, and Swiss Metro, controlling for travel time, travel cost, alternative-specific constants, and whether the respondent has purchased an annual pass for Swiss trains. In general, the program can be run with $N$ individuals and $J$ choices. The results are identical to those produced in Stata 14.1 and MATLAB 9.0.

import numpy as np

def Cond_Logit(Y, X, N, J):
    """
    Estimates a conditional logit given stacked choice
    indicators Y, regressors X, the number of individuals N,
    and the number of alternatives J.
    """
    # OLS coefficients serve as starting values.
    initial_beta = np.dot(np.linalg.inv(np.dot(X.T, X)),
                          np.dot(X.T, Y))
    beta_MLE = newton_method(cond_score, cond_hessian,
                             initial_beta, Y, X, N, J)
    # The covariance matrix is the negative inverse Hessian.
    cov = -np.linalg.inv(cond_hessian(beta_MLE, Y, X, N, J))
    se = np.sqrt(np.diag(cov))
    return beta_MLE, se, cov

def cond_Pr(x, beta):
    """
    Conditional logit choice probabilities for one individual.
    """
    J = len(x)
    utilities = [np.exp(np.dot(x[j], beta)) for j in range(J)]
    denom = sum(utilities)
    return [1.0 * u / denom for u in utilities]

def cond_score(beta, Y, X, N, J):
    """Conditional logit score (gradient of the log-likelihood)."""
    score = 0.0
    for n in range(N):
        X_n = X[n * J : (n + 1) * J]
        y_n = Y[n * J : (n + 1) * J]
        P_n = cond_Pr(X_n, beta)
        for j in range(J):
            score += (y_n[j] - P_n[j]) * X_n[[j], :].T
    return score

def cond_hessian(beta, Y, X, N, J):
    """
    Conditional logit Hessian of the log-likelihood.
    """
    hessian = 0.0
    for n in range(N):
        X_n = X[n * J : (n + 1) * J]
        P_n = cond_Pr(X_n, beta)
        # Probability-weighted average of the regressors.
        x_bar_n = 0.0
        for j in range(J):
            x_bar_n += X_n[[j], :] * P_n[j]
        for j in range(J):
            hessian -= (np.dot((X_n[[j], :].T - x_bar_n.T),
                               (X_n[[j], :] - x_bar_n)) * P_n[j])
    return hessian

def newton_method(gradient, hessian, beta0, Y, X, N, J,
                  tol=1e-8, maxIter=100):
    """
    The Newton-Raphson algorithm.
    """
    beta_old = beta0
    for i in range(maxIter):
        beta_new = (beta_old -
                    np.dot(np.linalg.inv(hessian(beta_old, Y, X, N, J)),
                           gradient(beta_old, Y, X, N, J)))
        if np.linalg.norm(beta_new - beta_old) < tol:
            break
        beta_old = beta_new
    return beta_new
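
The Swiss Metro example itself is attached rather than reproduced here. As a minimal sketch of how the estimator might be called, the snippet below simulates data for N hypothetical individuals choosing among J alternatives and passes the stacked arrays to Cond_Logit. The design (two regressors, true coefficients of 1.0 and -0.5) is an illustrative assumption, not part of the attached example.

# Minimal usage sketch with simulated data (not the attached
# Swiss Metro example): N individuals, J alternatives, 2 regressors.
np.random.seed(0)
N, J, k = 500, 3, 2
true_beta = np.array([[1.0], [-0.5]])
X = np.random.randn(N * J, k)
Y = np.zeros((N * J, 1))
for n in range(N):
    X_n = X[n * J : (n + 1) * J]
    P_n = cond_Pr(X_n, true_beta)
    # Draw one choice per individual from the implied probabilities.
    choice = np.argmax(np.random.multinomial(1, np.hstack(P_n)))
    Y[n * J + choice] = 1.0

beta_MLE, se, cov = Cond_Logit(Y, X, N, J)
print(beta_MLE)   # should be close to the true coefficients
print(se)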

Using Gibbs Sampling with Data Augmentation to Fit an Ordered Probit Model

Given observed choices among three ranked alternatives, you can use Gibbs sampling with data augmentation to fit an ordered probit model where $y_i \in \{1,2,3\}$. The latent variable is $$y_i^* = x_i'\beta + \epsilon_i,$$ where $\epsilon_i \sim \mathcal{N}(0,\sigma^2)$, but you observe only $$y_i = \begin{cases} 1, & \text{if} \hspace{1ex}\gamma_0 < y_i^*\leq\gamma_1 \\ 2, & \text{if}\hspace{1ex}\gamma_1 < y_i^*\leq\gamma_2 \\ 3, & \text{if}\hspace{1ex}\gamma_2 < y_i^*\leq\gamma_3.\end{cases}$$ You can impose location and scale restrictions on the model parameters such that $\gamma_1=0$ and $\gamma_2=1$, and it is standard to let $\gamma_0=-\infty$ and $\gamma_3 = \infty$. This allows for estimation of the posterior using the following Gibbs recursion.

First, draw each element of $\boldsymbol{y}^*$ from a truncated normal, $$y_i^*\vert \beta,\sigma^2,\boldsymbol{y} \sim TN_{(\gamma_{y_i-1},\,\gamma_{y_i})}(x_i'\beta,\sigma^2).$$ Second, draw $$\sigma^2\vert \boldsymbol{y}^*,\beta,\boldsymbol{y} \sim IG\big(\tfrac{\nu_0+N}{2},\tfrac{\delta_0 + (\boldsymbol{y}^* - \boldsymbol{x}\beta)'(\boldsymbol{y}^* - \boldsymbol{x}\beta)}{2}\big).$$ Third, draw $$\beta\vert \boldsymbol{y}^*, \sigma^2, \boldsymbol{y} \sim N(\hat{\beta},\hat{B}),$$ where $$\hat{B}= \Big(B_0^{-1}+\frac{\boldsymbol{x}'\boldsymbol{x}}{\sigma^2}\Big)^{-1},$$ $$\hat{\beta}= \hat{B}\Big(B_0^{-1}\beta_0+\frac{\boldsymbol{x}'\boldsymbol{y}^*}{\sigma^2}\Big).$$ This recursion simulates draws from the posterior distribution $\pi(\beta\vert\boldsymbol{y},\boldsymbol{x})$. An example that fits an ordered probit model to sample data using Gibbs sampling is attached. The example reports the means and standard deviations of 10,000 draws of the model parameters and plots the draws.

import numpy as np
import scipy.stats as ss

def gibbs(beta0, B0, v0, d0, y, X, iters):
    """
    Inputs:
        Priors - beta0, B0, v0, d0.
        Data - y and X.
        Desired iterations - iters.
    Returns:
        Draws of beta and sigma^2 from the posterior.
    """
    beta = np.zeros((iters, len(beta0)))
    sigma_sq = np.ones(iters)
    B0_inv = np.linalg.inv(B0)
    # Cutpoints fixed at gamma_1 = 0 and gamma_2 = 1.
    cutoffs = {1: (-np.inf, 0.0), 2: (0.0, 1.0), 3: (1.0, np.inf)}
    for i in range(iters):
        # Step 1: draw the latent y* from truncated normals.
        # (At i = 0 the "previous" draws are beta = 0, sigma^2 = 1.)
        y_star = np.zeros(len(y))
        XB = np.dot(X, beta[i - 1])
        sd = np.sqrt(sigma_sq[i - 1])
        for n in range(len(y)):
            lower, upper = cutoffs[int(y[n])]
            # scipy's truncnorm takes bounds in standardized units.
            a = (lower - XB[n]) / sd
            b = (upper - XB[n]) / sd
            y_star[n] = ss.truncnorm.rvs(a, b, loc=XB[n], scale=sd)
        # Step 2: draw sigma^2 from its inverse-gamma conditional.
        v_hat = (v0 + len(y)) / 2.
        d_hat = (d0 + np.dot((y_star - XB).T, y_star - XB)) / 2.
        sigma_sq[i] = ss.invgamma.rvs(v_hat, scale=d_hat)
        # Step 3: draw beta from its normal conditional.
        B_hat = np.linalg.inv(np.dot(X.T, X) / sigma_sq[i] + B0_inv)
        beta_hat = np.dot(B_hat,
                          np.dot(B0_inv, beta0) +
                          np.dot(X.T, y_star) / sigma_sq[i])
        beta[i] = np.random.multivariate_normal(beta_hat, B_hat)
    return beta, sigma_sq
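
The attached example is not reproduced here. The sketch below shows one way the sampler might be called on simulated ordered-probit data; the priors (beta0, B0, v0, d0), sample size, and true coefficients are illustrative assumptions, not values from the attached example.

# Minimal usage sketch with simulated data; priors and sample size
# are illustrative, not those of the attached example.
np.random.seed(0)
N, k = 500, 2
X = np.random.randn(N, k)
true_beta = np.array([0.5, -0.25])
y_latent = np.dot(X, true_beta) + np.random.normal(0, 0.75, N)
# Map the latent variable into the three observed categories.
y = np.where(y_latent <= 0.0, 1, np.where(y_latent <= 1.0, 2, 3))

beta0 = np.zeros(k)          # prior mean for beta
B0 = 10.0 * np.eye(k)        # prior covariance for beta
v0, d0 = 3.0, 3.0            # inverse-gamma prior parameters
beta_draws, sigma_sq_draws = gibbs(beta0, B0, v0, d0, y, X, 10000)

# Discard the first 1,000 draws as burn-in before summarizing.
print(beta_draws[1000:].mean(axis=0))
print(sigma_sq_draws[1000:].mean())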
						
Tailored Metropolis-Hastings Algorithm

The Metropolis-Hastings algorithm simulates draws from a posterior distribution $$\pi(\beta\vert\boldsymbol{y},\boldsymbol{x}).$$ The algorithm iterates the following steps.

First, draw $$\beta^*\sim q(\beta \vert y).$$ Second, accept the draw with probability $$\alpha=\text{min}\big\{1,\frac{f(y\vert\beta^*)\pi(\beta^*)q(\beta\vert y)}{f(y\vert\beta)\pi(\beta)q(\beta^*\vert y)}\big\}.$$ Third, if the draw is accepted, set $\beta=\beta^*$ and repeat; otherwise, keep $\beta$. The Tailored Metropolis-Hastings algorithm uses a multivariate $t$ proposal density $$q(\beta \vert y) = T_{\nu}(\hat{\beta}_{MLE},\hat{\Sigma}_{MLE}).$$ Below is the algorithm written in Python 2.7. Attached is an example that fits a logit model to sample data and returns the realized posterior distribution after the specified number of iterations.

import numpy as np
from math import gamma, pi

def MV_t_pdf(x, mu, Sigma, df):
    """
    Inputs:
        x - kx1 random variables.
        mu - kx1 mean values.
        Sigma - kxk scale matrix.
        df - degrees of freedom.
    Returns:
        The multivariate-t density at x.
    """
    p = len(x)
    numerator = gamma(1. * (p + df) / 2)
    denominator = (gamma(1. * df / 2) *
                   pow(df * pi, 1. * p / 2) *
                   pow(np.linalg.det(Sigma), 1. / 2) *
                   pow(1 + (1. / df) *
                       np.dot(np.dot((x - mu), np.linalg.inv(Sigma)),
                              (x - mu)),
                       1. * (p + df) / 2))
    return 1. * numerator / denominator

def MV_t_random_var(mu, Sigma, df):
    """
    Inputs:
        mu - kx1 means of the random variables.
        Sigma - kxk scale matrix.
        df - degrees of freedom.
    Returns:
        One draw from a multivariate-t distribution.
    """
    mu = np.asarray(mu)
    p = len(mu)
    # A multivariate t is a normal draw rescaled by a chi-square draw.
    u = np.random.chisquare(df)
    y = np.random.multivariate_normal(np.zeros(p), Sigma)
    return mu + y * np.sqrt(df / u)

def Tailored_MH(f, prior, beta0, Sigma0, y, X, iters, df):
    """
    Inputs:
        f - a likelihood function.
        prior - the prior density.
        beta0 - kx1 proposal location (e.g., the MLE).
        Sigma0 - kxk proposal scale (e.g., the MLE covariance).
        y - nx1 dependent variable.
        X - nxk independent variables.
        iters - iterations.
        df - degrees of freedom.
    Returns:
        Draws of beta from the posterior and the acceptance rate.
    """
    beta = np.zeros((iters, len(beta0)))
    beta[0] = beta0
    acceptance = np.zeros(iters)
    for i in range(1, iters):
        # Propose from the multivariate t centered at beta0.
        beta_star = MV_t_random_var(beta0, Sigma0, df)
        numerator = (f(beta_star, y, X) *
                     prior(beta_star, beta0, Sigma0) *
                     MV_t_pdf(beta[i - 1], beta0, Sigma0, df))
        denominator = (f(beta[i - 1], y, X) *
                       prior(beta[i - 1], beta0, Sigma0) *
                       MV_t_pdf(beta_star, beta0, Sigma0, df))
        alpha = min(1., 1. * numerator / denominator)
        # Accept the proposal with probability alpha.
        acceptance[i] = np.random.binomial(1, alpha)
        if acceptance[i] == 1:
            beta[i] = beta_star
        else:
            beta[i] = beta[i - 1]
    acceptance_rate = np.mean(acceptance[1:])
    return beta, acceptance_rate
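
The attached logit example is not reproduced here. As a minimal sketch of how the sampler might be called, the snippet below defines a logit likelihood and a normal prior on simulated data, obtains the MLE and an approximate covariance with scipy.optimize, and passes them to Tailored_MH as the proposal location and scale. The sample size, degrees of freedom, and prior choice are illustrative assumptions; note that Tailored_MH hands beta0 and Sigma0 to the prior, so in this sketch the prior reuses those values as its hyperparameters.

import numpy as np
import scipy.stats as ss
from scipy.optimize import minimize

def logit_likelihood(beta, y, X):
    """Logit likelihood (product over observations, not the log)."""
    p = 1. / (1. + np.exp(-np.dot(X, beta)))
    return np.prod(p ** y * (1. - p) ** (1. - y))

def normal_prior(beta, mu, Sigma):
    """Multivariate normal prior density, evaluated at beta."""
    return ss.multivariate_normal.pdf(beta, mean=mu, cov=Sigma)

# Simulated logit data (illustrative only).
np.random.seed(0)
N, k = 200, 2
X = np.random.randn(N, k)
true_beta = np.array([1.0, -0.5])
y = np.random.binomial(1, 1. / (1. + np.exp(-np.dot(X, true_beta))))

def neg_loglik(b):
    """Negative logit log-likelihood, minimized to tailor the proposal."""
    Xb = np.dot(X, b)
    return -np.sum(y * Xb - np.log(1. + np.exp(Xb)))

# The MLE and its approximate covariance tailor the t proposal.
res = minimize(neg_loglik, np.zeros(k), method='BFGS')
beta_MLE, Sigma_MLE = res.x, res.hess_inv

draws, rate = Tailored_MH(logit_likelihood, normal_prior,
                          beta_MLE, Sigma_MLE, y, X, 10000, 5)
print(draws[1000:].mean(axis=0))   # posterior means after burn-in
print(rate)                        # acceptance rate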