Problem Set 1 Suggested Solutions

4. Estimating misspecified models with market-level data

4.1 Derive the logit estimation equation from the choice probabilities.

Suppose we assume that the data come from the following misspecified indirect utility

\[\begin{align*} u_{ijt} = \alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \xi_{jt} + \epsilon_{ijt} \end{align*}\]

Denote the mean utility \[\delta_{jt} = \alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \xi_{jt} \] Then we have

\[\begin{align*} u_{ijt} = \delta_{jt} + \epsilon_{ijt} \end{align*}\]

From lecture 2 slide 14, the probability of consumer i choosing product j in market t is

\[\begin{align*} Pr(y_{ijt} = 1) &= Pr(u_{ijt} \ge u_{ikt}, \; \forall k, \; j \neq k)\\ &= Pr(\epsilon_{ikt}-\epsilon_{ijt} \leq \delta_{jt} - \delta_{kt}, \; \forall k, \; j \neq k) \end{align*}\]

We’re assuming that \(\epsilon_{ijt}\) is i.i.d. type 1 extreme value and that the mean utility of the outside good is normalized to zero. Then the probability becomes

\[\begin{align*} P(y_{ijt}=1) = P_{ijt} = \frac{\text{exp}(\delta_{jt})}{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})} \end{align*}\]

For details of how to derive this, see Train (2009) Chapter 3. Note that this probability is the same for all consumers, since \(\delta_{jt}\) does not depend on \(i\). The empirical counterpart of the choice probability \(P_{ijt}\) is therefore the market share \(s_{jt}\).

a)

\[\begin{align*} s_{jt} = \frac{\text{exp}(\delta_{jt})}{\sum_{k=0}^J \text{exp}(\delta_{kt})} = \frac{\text{exp}(\delta_{jt})}{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})} \end{align*}\]

With the mean utility of the outside good normalized to zero, \(\delta_{0t} = 0\), its market share is

\[\begin{align*} s_{0t} = \frac{\text{exp}(0)}{\sum_{k=0}^J \text{exp}(\delta_{kt})} = \frac{1}{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})} \end{align*}\]

b)

Take natural logarithms of the choice probabilities

\[\begin{align*} \text{ln}(s_{jt}) &= \text{ln}(\text{exp}(\delta_{jt})) - \text{ln}(\sum_{k=0}^J \text{exp}(\delta_{kt})) = \delta_{jt} - \text{ln}(\sum_{k=0}^J \text{exp}(\delta_{kt})) \\ \text{ln}(s_{0t}) &= \text{ln}(1) -\text{ln}(\sum_{k=0}^{J} \text{exp}(\delta_{kt})) = -\text{ln}(\sum_{k=0}^{J} \text{exp}(\delta_{kt}))\\ \end{align*}\]

Now, subtract the log of the outside good’s market share from the log of product j’s market share. Note that the number of inside goods is 4, so we can substitute \(J=4\)

\[\begin{align*} \text{ln}(s_{jt}) - \text{ln}(s_{0t}) &= \delta_{jt} - \text{ln}(\sum_{k=0}^4 exp(\delta_{kt})) -[- \text{ln}(\sum_{k=0}^{4} exp(\delta_{kt}))]\\ \text{ln}(s_{jt}/s_{0t}) &= \delta_{jt} =\alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \xi_{jt} \end{align*}\]

The purpose of deriving this equation is that now we have a nice linear expression that we can estimate using OLS and standard linear IV methods. Moreover, we only need to observe market level data.
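The inversion can be checked numerically. The sketch below uses made-up mean utilities for one market (the values in `delta` are hypothetical and not taken from the data) and confirms that \(\text{ln}(s_{jt}/s_{0t})\) returns \(\delta_{jt}\).

```r
# Hypothetical mean utilities for J = 4 inside goods in one market
delta <- c(1.2, -0.3, 0.5, 0.0)

# Logit shares with the outside good's mean utility normalized to zero
denom <- 1 + sum(exp(delta))
s_j <- exp(delta) / denom
s_0 <- 1 / denom

# Inverting the shares recovers the mean utilities
delta_hat <- log(s_j / s_0)
all.equal(delta_hat, delta)
```

Only the shares enter the left-hand side, which is why market-level data are enough.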

4.2 Estimate logit demand without any instruments and explain what is the endogeneity issue at hand.

Note that the Excel file has the market shares of the inside goods (taking into account the outside good) but not the market share of the outside good. Thus, in order to run the regression on the estimation equation derived in 4.1 b), we must

- calculate the market share of the outside good for each market
- generate the dependent variable \(\text{ln}(s_{jt}) - \text{ln}(s_{0t}) = \text{ln}(s_{jt}/s_{0t})\)

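# Packages assumed by this script: readxl (read_xlsx), magrittr (%>%), fixest (feols, etable)
library(readxl)
library(magrittr)
library(data.table)
library(fixest)
library(ggplot2)

# Load the data and convert it to a data.table; the pipe %>% passes the object
# on its left to the function on its right. data.table is an enhanced data.frame.
boat_dt <-
  read_xlsx("boat_data.xlsx") %>%
  data.table(.)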

head(boat_dt)

##      prices      shares   length   quality cost_shifter  roof firm_ids
##       <num>       <num>    <num>     <num>        <num> <num>    <num>
## 1: 5.858942 0.498415102 9.336710 1.2291334    0.8022839     1        1
## 2: 4.600115 0.102175147 7.092352 0.7524592    1.5204211     1        2
## 3: 1.000218 0.004850799 6.159682 0.3415478    0.3349984     0        3
## 4: 3.233179 0.216493489 5.780847 0.7570901    0.7599571     0        4
## 5: 3.827983 0.354432180 9.336710 0.5739109    0.1647224     1        1
## 6: 2.976261 0.136502418 7.092352 0.8794283    0.7626693     1        2
##    market_ids
##         <num>
## 1:          1
## 2:          1
## 3:          1
## 4:          1
## 5:          2
## 6:          2

Calculate the outside good’s market share for each market and create the dependent variable.

# Outside good market share = 1 - sum of the inside goods' market shares within 
# the market.
boat_dt[, outside_good_ms := 1 - sum(shares), by = market_ids]

# Dependent variable = ln(market share/outside good market share)
# note that in R, log is the natural logarithm
boat_dt[, ln_sj_s0 := log(shares / outside_good_ms)]

head(boat_dt[, .(shares, outside_good_ms, market_ids, ln_sj_s0)])

##         shares outside_good_ms market_ids     ln_sj_s0
##          <num>           <num>      <num>        <num>
## 1: 0.498415102       0.1780655          1  1.029282011
## 2: 0.102175147       0.1780655          1 -0.555462791
## 3: 0.004850799       0.1780655          1 -3.603007839
## 4: 0.216493489       0.1780655          1  0.195409217
## 5: 0.354432180       0.3555943          2 -0.003273548
## 6: 0.136502418       0.3555943          2 -0.957448236

Estimate the logit model. Note that there is no constant in the estimation equation!

# You might as well use lm. I use feols to get a clean output using etable
# Note that R includes an intercept by default, so we have to include -1 in the equation to run the regression without a constant.
logit <-
  feols(ln_sj_s0 ~ -1 + quality + prices + length + roof, data = boat_dt)

etable(logit)

##                               logit
## Dependent Var.:            ln_sj_s0
##                                    
## quality           1.092*** (0.0329)
## prices          -0.1248*** (0.0254)
## length          -0.2148*** (0.0120)
## roof              1.758*** (0.0526)
## _______________ ___________________
## S.E. type                       IID
## Observations                  4,000
## R2                          0.32147
## Adj. R2                     0.32096
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

All the estimates are badly biased. The true parameter values are quality = 2, prices = -2, length = 0.5 and (mean of) roof = 4, so the price coefficient is far too close to zero (-0.125 against -2) and the length coefficient even has the wrong sign.

The endogeneity problem arises because part of product quality, \(\xi_{jt}\), is observed by firms and consumers but not by the econometrician. The higher the unobserved quality, the higher the price consumers are willing to pay, all else equal, so firms charge more for such products. Price is therefore positively correlated with \(\xi_{jt}\), which is the error term of the estimation equation, and OLS attributes part of the positive effect of unobserved quality to price: the estimated price coefficient is biased upward, toward zero.
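The direction of the bias can be made explicit in a simplified case. As a sketch, assume that price is the only regressor and that a constant \(c\) is included, so that the equation is \(\text{ln}(s_{jt}/s_{0t}) = c + \alpha p_{jt} + \xi_{jt}\). This one-regressor setup is an assumption made for illustration, not the specification estimated above. The OLS estimator then converges to

\[\begin{align*} \hat{\alpha}_{OLS} \xrightarrow{p} \alpha + \frac{\text{Cov}(p_{jt}, \xi_{jt})}{\text{Var}(p_{jt})} \end{align*}\]

Since \(\text{Cov}(p_{jt}, \xi_{jt}) > 0\), the second term is positive and pushes the estimate of a negative \(\alpha\) toward zero.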

4.3 Estimate logit using 2SLS. Can you use the observed quality and cost shifter as instruments? Why?

A firm’s own cost shifter and the observed quality of other firms in the market are useful instruments:

- The cost shifter affects prices through the equilibrium conditions: it changes marginal cost, which makes the firm change its price to maximize profits (relevance). It is uncorrelated with unobserved quality by assumption, so it affects utility only through prices (exclusion restriction).
  - Here we must think of the cost shifter as something that does not correlate with unobserved quality. Examples could be the distance from the factory to the store, the price of some necessary component, or labor costs.
- A firm must price lower if its competitors sell high-quality goods (relevance). Other products’ observed quality does not affect the utility from product \(j\) other than through price (exclusion restriction).
  - This is motivated by the assumption that the characteristics of other goods are exogenous: we do not allow firms to adjust their product characteristics in response to competitors’ products.
- The firm’s own observed quality cannot be used as an excluded instrument because it has a direct effect on utility.
- You could also use other firms’ cost shifters: they affect the prices of other firms, and if the other firms change their prices, the firm wants to change its own price too (relevance). They are also uncorrelated with the firm’s own unobserved quality (exclusion).
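The logic of instrumenting price with a cost shifter can be sketched on simulated data. Everything below is hypothetical: the pricing rule, the sample size and the single-regressor utility are assumptions chosen for illustration, not the data-generating process of the boat data. The second stage is run by hand with `lm` only to show the mechanics; its standard errors are not valid, which is one reason to use `feols` for the actual estimation.

```r
set.seed(1)
n <- 5000
alpha_true <- -2                      # assumed true price coefficient

xi   <- rnorm(n)                      # unobserved quality
cost <- runif(n)                      # cost shifter, independent of xi
# Hypothetical pricing rule: price increases in cost and unobserved quality
price <- 1 + cost + 0.5 * xi + rnorm(n, sd = 0.1)
# Dependent variable ln(s_j / s_0) with price as the only characteristic
y <- alpha_true * price + xi

# OLS ignores the correlation between price and xi
alpha_ols <- coef(lm(y ~ price))[["price"]]

# 2SLS by hand: first stage, then regress y on the fitted price
price_hat  <- fitted(lm(price ~ cost))
alpha_2sls <- coef(lm(y ~ price_hat))[["price_hat"]]

c(true = alpha_true, ols = alpha_ols, tsls = alpha_2sls)
```

In this setup the OLS coefficient lands well above the true value, while the 2SLS coefficient is close to it, because the cost shifter moves price without moving unobserved quality.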

Generate instruments from competitors’ qualities and cost shifters:

# Sum of competitors' qualities
boat_dt[, sum_obs_quality_iv := sum(quality) - quality, by = market_ids]
# Sum of competitors' cost shifters
boat_dt[, sum_cost_shifter_iv := sum(cost_shifter) - cost_shifter, by = market_ids]

Estimate 2SLS:

#Use all instruments
logit_2sls <-
  feols(ln_sj_s0 ~ -1 + quality + length + roof | 
        prices ~ cost_shifter + sum_obs_quality_iv + sum_cost_shifter_iv,
      data = boat_dt)

#Use only own cost shifter and sum of the competitors' observed qualities
logit_2sls_2<-
  feols(ln_sj_s0 ~ -1 + quality + length + roof | 
        prices ~ cost_shifter + sum_obs_quality_iv,
      data = boat_dt)

etable(logit, logit_2sls, logit_2sls_2)

##                               logit         logit_2sls       logit_2sls_2
## Dependent Var.:            ln_sj_s0           ln_sj_s0           ln_sj_s0
##                                                                          
## quality           1.092*** (0.0329)  1.616*** (0.0483)  1.614*** (0.0483)
## prices          -0.1248*** (0.0254) -1.712*** (0.0559) -1.707*** (0.0563)
## length          -0.2148*** (0.0120) 0.4306*** (0.0243) 0.4287*** (0.0245)
## roof              1.758*** (0.0526)  3.111*** (0.0825)  3.107*** (0.0827)
## _______________ ___________________ __________________ __________________
## S.E. type                       IID                IID                IID
## Observations                  4,000              4,000              4,000
## R2                          0.32147           -0.33965           -0.33572
## Adj. R2                     0.32096           -0.34065           -0.33672
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

All estimates look better, but they are still not very close to the true parameter values.

- True parameter values: quality = 2, prices = -2, length = 0.5 and (mean of) roof = 4.
- Our model is still misspecified, as we omit the consumer-specific heterogeneity in how much consumers like boats with roofs.
- Still, this is much better than the model with no instruments and very easy to implement.
- The results are nearly identical if we drop the competitors’ cost shifters and use only the own cost shifter and the sum of the competitors’ observed qualities as instruments (logit_2sls_2).

4.4 Calculate the price elasticity in each market for each firm.

a)

Price elasticity can be expressed using the derivative of market share w.r.t. price

\[\eta_{jj}= \frac{\Delta s_{jt}/s_{jt}}{\Delta p_{jt}/p_{jt}}= \frac{\Delta s_{jt}}{\Delta p_{jt}}\frac{p_{jt}}{s_{jt}}= \frac{\partial s_{jt}}{\partial p_{jt}}\frac{p_{jt}}{s_{jt}}\]

Let us derive the formula for the price elasticity. First substitute

\[ s_{jt} = \frac{\text{exp}(\delta_{jt})}{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})} \] We get

\[\begin{align*} \eta_{jj}&=\frac{\partial s_{jt}}{\partial p_{jt}}\frac{p_{jt}}{s_{jt}} =\frac{\partial \left( \frac{\text{exp}(\delta_{jt})}{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})}\right)}{\partial p_{jt}}\frac{p_{jt}}{s_{jt}} \\ &=\frac{\frac{\partial \text{exp}(\delta_{jt})}{\partial p_{jt}} (1 +\sum_{k=1}^J \text{exp}(\delta_{kt})) - \text{exp}(\delta_{jt}) \frac{\partial}{\partial p_{jt}}(1 +\sum_{k=1}^J \text{exp}(\delta_{kt}))}{(1 + \sum_{k=1}^J \text{exp}(\delta_{kt}))^2} \frac{p_{jt}}{s_{jt}} \end{align*}\]

Substitute \(\frac{\partial \text{exp}(\delta_{jt})}{\partial p_{jt}} = \alpha \cdot \text{exp}(\delta_{jt})\) and \(\frac{\partial}{\partial p_{jt}}(1 +\sum_{k=1}^J \text{exp}(\delta_{kt}))= \alpha \cdot \text{exp}(\delta_{jt})\), since the derivatives of the other products’ mean utilities w.r.t. \(p_{jt}\) are zero.

\[\begin{align*} \eta_{jj}&= \frac{\alpha \cdot\text{exp}(\delta_{jt}) (1 +\sum_{k=1}^J \text{exp}(\delta_{kt})) - \text{exp}(\delta_{jt}) \cdot \alpha \cdot \text{exp}(\delta_{jt})}{(1 + \sum_{k=1}^J \text{exp}(\delta_{kt}))^2} \frac{p_{jt}}{s_{jt}}\\ &= \left[ \alpha \frac{\text{exp}(\delta_{jt}) }{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})} - \alpha \left( \frac{\text{exp}(\delta_{jt})}{1 + \sum_{k=1}^J \text{exp}(\delta_{kt})}\right)^2 \right]\frac{p_{jt}}{s_{jt}}\\ &= \left( \alpha s_{jt} - \alpha s_{jt}^2 \right) \frac{p_{jt}}{s_{jt}}\\ &= \alpha(1-s_{jt})s_{jt} \frac{p_{jt}}{s_{jt}} \\ &= \alpha(1-s_{jt}) p_{jt} \end{align*}\]
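The closed form can be checked against a numerical derivative. The sketch below uses hypothetical values for \(\alpha\), prices and the non-price part of mean utility (the vector `other` is an illustrative stand-in, not a variable in the data) and compares \(\alpha(1-s_{jt})p_{jt}\) with a finite-difference elasticity.

```r
# Hypothetical values for one market with J = 4 inside goods
alpha <- -1.7
p     <- c(5.9, 4.6, 1.0, 3.2)
other <- c(9.0, 7.5, 1.5, 5.0)   # everything in delta except alpha * p

logit_shares <- function(p) {
  delta <- alpha * p + other
  exp(delta) / (1 + sum(exp(delta)))
}

s <- logit_shares(p)
eta_formula <- alpha * (1 - s) * p

# Numerical own-price elasticity: bump each price separately
h <- 1e-6
eta_numeric <- sapply(seq_along(p), function(j) {
  p_up <- p
  p_up[j] <- p_up[j] + h
  (logit_shares(p_up)[j] - s[j]) / h * p[j] / s[j]
})

all.equal(eta_formula, eta_numeric, tolerance = 1e-4)
```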

b)

The elasticity depends only on firm \(j\)’s price and market share as well as \(\alpha\). I use the estimate of \(\alpha\) from the logit_2sls_2 specification to calculate this.

alpha <- logit_2sls_2$coefficients["fit_prices"]

# calculates price elasticity
boat_dt[, elasticity_jt := alpha * (1 - shares) * prices]

# plots
logit_plot <-
  ggplot(data = boat_dt,
         aes(
           x = elasticity_jt,
           group = as.factor(firm_ids),
           fill = as.factor(firm_ids)
         )) + 
  geom_density(alpha = 0.4) +
  ggtitle("Logit Price Elasticities Across Markets")

logit_plot

Notice that the distributions of the price elasticities are very similar for firms 1 and 2 (boats with a roof) and for firms 3 and 4 (boats without a roof).

The average elasticities by firm are

boat_dt[, .("Average elasticity" = mean(elasticity_jt)), by = firm_ids]

##    firm_ids Average elasticity
##       <num>              <num>
## 1:        1          -5.600759
## 2:        2          -5.289546
## 3:        3          -4.072071
## 4:        4          -3.934288

4.5 Calculate Logit Diversion Ratios

Conlon and Mortimer (2021) define the diversion ratio as follows: “As the price of j increases, some consumers leave product j, and a subset of these consumers switch to a substitute product k. The diversion ratio, \(D_{jk}\) , is defined as the ratio of the switchers to the leavers.”

Note that the diversion ratio from product j to product k only depends on the market shares of those two products.

\[\begin{align*} D_{jk} = \frac{s_{kt}}{1 - s_{jt}} \end{align*}\]
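This expression follows from the logit share derivatives. As a short sketch, it uses \(\partial s_{jt}/\partial p_{jt} = \alpha s_{jt}(1 - s_{jt})\) from 4.4 together with the cross derivative \(\partial s_{kt}/\partial p_{jt} = -\alpha s_{jt} s_{kt}\), which is not derived in the text above but follows from the same quotient rule:

\[\begin{align*} D_{jk} = -\frac{\partial s_{kt}/\partial p_{jt}}{\partial s_{jt}/\partial p_{jt}} = \frac{\alpha s_{jt} s_{kt}}{\alpha s_{jt}(1 - s_{jt})} = \frac{s_{kt}}{1 - s_{jt}} \end{align*}\]

The price coefficient cancels, which is why no estimate is needed to compute the logit diversion ratios.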

logit_div_13 <- 
  boat_dt[firm_ids == 3, shares] / boat_dt[firm_ids == 1, 1 - shares]

logit_div_31 <- 
  boat_dt[firm_ids == 1, shares] / boat_dt[firm_ids == 3, 1 - shares]


data.table("D_13" = mean(logit_div_13),
           "D_31" = mean(logit_div_31))

##         D_13      D_31
##        <num>     <num>
## 1: 0.1921416 0.3445739

These substitution patterns are unrealistic because they do not take into account that firm 1’s boat has a roof while firm 3’s does not. The logit model imposes IIA, which does not hold in the true model. With random coefficients, the ratio of the choice probabilities of two goods depends on the attributes of all other goods in the market, so IIA fails. As a result, the logit model gives incorrect substitution patterns.

4.6 Write down the nested logit estimation equation.

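Our estimation equation is \[\begin{align*} \text{ln}(s_{jt}) - \text{ln}(s_{0t}) &= \delta_{jt} + \sigma \text{ln}(s_{jt/g})\\ \text{ln}(s_{jt}/s_{0t}) &= \alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \sigma \text{ln}(s_{jt/g}) + \xi_{jt} \end{align*}\] where \(s_{jt/g}\) is product \(j\)’s market share within its nest \(g\) and \(\delta_{jt}\) already contains \(\xi_{jt}\).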

- Exogenous variables: observed quality, length (and roof).
  - Note that roof was excluded from the mean utility in the assignment, but there is motivation to include it in order to estimate the mean effect of roof. I show the results with and without roof.
- Endogenous variables: price and within-group market share \(s_{jt/g}\).
  - The within-group market share is endogenous since unobserved quality is correlated with the within-nest market share.
- Take the nest structure into account when instrumenting.
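For reference, the estimation equation can be obtained from the nested logit shares in a few lines. The sketch follows the standard derivation (Berry, 1994) and introduces the nest-level sum \(D_{gt} = \sum_{k \in g} \text{exp}(\delta_{kt}/(1-\sigma))\), a symbol not used elsewhere in these solutions:

\[\begin{align*} s_{jt/g} &= \frac{\text{exp}(\delta_{jt}/(1-\sigma))}{D_{gt}}, \qquad s_{gt} = \frac{D_{gt}^{1-\sigma}}{1 + \sum_{h} D_{ht}^{1-\sigma}}, \qquad s_{0t} = \frac{1}{1 + \sum_{h} D_{ht}^{1-\sigma}} \\ \text{ln}(s_{jt}) - \text{ln}(s_{0t}) &= \text{ln}(s_{jt/g}) + \text{ln}(s_{gt}) - \text{ln}(s_{0t}) = \frac{\delta_{jt}}{1-\sigma} - \sigma \text{ln}(D_{gt}) \\ \text{ln}(D_{gt}) &= \frac{\delta_{jt}}{1-\sigma} - \text{ln}(s_{jt/g}) \\ \text{ln}(s_{jt}) - \text{ln}(s_{0t}) &= \delta_{jt} + \sigma \text{ln}(s_{jt/g}) \end{align*}\]

The third line rearranges the definition of \(s_{jt/g}\), and substituting it into the second line gives the last line. With \(\sigma = 0\) the equation collapses to the logit equation from 4.1.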

4.7 Estimate the nested logit model by 2SLS

There are multiple options for instruments. Some possible instruments for price are
- own cost shifter
- within-nest competitor’s cost shifter
- other nest’s competitors’ summed cost shifters
- within-nest competitor’s observed quality
- other nest’s competitors’ summed quality.

The within-nest market share is instrumented with the within-nest competitor’s observed quality
- the observed quality of my competitor in the same nest affects my market share within the nest but has no direct effect on the utility from my product and isn’t correlated with my unobserved quality

boat_dt[, group := fifelse(firm_ids %in% c(1, 2), 1, 2)]
head(boat_dt[, .(firm_ids, group)])

##    firm_ids group
##       <num> <num>
## 1:        1     1
## 2:        2     1
## 3:        3     2
## 4:        4     2
## 5:        1     1
## 6:        2     1

boat_dt[, within_nest_ms := shares / sum(shares), by = .(group, market_ids)]

# Generate own nest's competitor's cost shifter IV
boat_dt[,
        own_n_comp_costs := sum(cost_shifter) - cost_shifter,
        by = .(market_ids, group)]

# Generate own nest's summed cost shifter, is used below
boat_dt[,
        n_cost_shifter := sum(cost_shifter),
        by = .(group, market_ids)]

# Generate other group's competitors' summed cost shifter IV
other_groups_cost_shift <- boat_dt[, .(n_cost_shifter, group, market_ids)]
other_groups_cost_shift[, merge_group :=  fifelse(group == 1, 2, 1)] 
other_groups_cost_shift[, group := NULL]
other_groups_cost_shift[, group := merge_group] #switch groups
other_groups_cost_shift[, other_n_cost_s := n_cost_shifter]
# The i. prefix takes the column from the joined table, so the line is safe to re-run
boat_dt[other_groups_cost_shift, on = c("group", "market_ids"),
        other_n_cost_s := i.other_n_cost_s]


# Within nest competitor's observed quality IV for within nest market share
boat_dt[,
        own_n_obs_quality := sum(quality) - quality,
        by = .(group, market_ids)]

# Sum of observed quality within nest, is used below
boat_dt[,
        n_obs_quality := sum(quality),
        by = .(group, market_ids)]

# Generate other group's competitors' summed observed quality IV
other_groups_quality <- boat_dt[, .(n_obs_quality, group, market_ids)]
other_groups_quality[, merge_group :=  fifelse(group == 1, 2, 1)]
other_groups_quality[, group := NULL]
other_groups_quality[, group := merge_group] #switch groups
other_groups_quality[, other_n_qual := n_obs_quality]
boat_dt[other_groups_quality, on = c("group", "market_ids"),
        other_n_qual := i.other_n_qual]

#Remember our instruments are:
# 1. own cost shifter
# 2. within nest competitor's cost shifter
# 3. other nest's competitors' summed cost shifters
# 4. within nest competitor's observed quality
# 5. other nest's competitors' summed quality.

#Riku used 1. 2. and 4.
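The other-nest sums built above by swapping group labels and joining can also be computed without a join. The sketch below is an alternative, not the approach used in these solutions, and it assumes exactly two nests per market: the other nest's sum is then the market total minus the own nest's total. The toy table and the helper columns `market_total` and `nest_total` are hypothetical.

```r
library(data.table)

# Toy data: two markets, four firms, nests {1, 2} and {3, 4}
toy <- data.table(
  market_ids   = rep(1:2, each = 4),
  firm_ids     = rep(1:4, times = 2),
  cost_shifter = c(0.8, 1.5, 0.3, 0.7, 0.2, 0.9, 0.4, 0.6)
)
toy[, group := fifelse(firm_ids %in% c(1, 2), 1, 2)]

# Other nest's summed cost shifter = market total - own nest total
toy[, market_total := sum(cost_shifter), by = market_ids]
toy[, nest_total   := sum(cost_shifter), by = .(market_ids, group)]
toy[, other_n_cost_s := market_total - nest_total]

toy[market_ids == 1, .(firm_ids, group, cost_shifter, other_n_cost_s)]
```

The same three lines work for observed quality by replacing `cost_shifter` with `quality`.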

Estimate specifications with and without IVs. Tuomas found that the model works better if he didn’t use other firms’ cost shifters as IVs.

#with roof in mean quality
nl<-
  feols(ln_sj_s0 ~ -1 + quality + length + roof + prices + log(within_nest_ms),
      data = boat_dt)

#without roof in mean quality
nl_nr<-
  feols(ln_sj_s0 ~ -1 + quality + length + prices + log(within_nest_ms),
      data = boat_dt)

#instrument using own cost shifter,
# observed quality of within nest competitor and 
# observed quality of other nest's competitors
nl_2sls<-
  feols(ln_sj_s0 ~ -1 + quality + length + roof | 
        prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality +
          other_n_qual,
      data = boat_dt)

#same without roof in mean quality
nl_2sls_nr<-
  feols(ln_sj_s0 ~ -1 + quality + length | 
        prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality +
          other_n_qual,
      data = boat_dt)

#instrument using all generated instruments:
# 1. own cost shifter
# 2. within nest competitor's cost shifter
# 3. other nest's competitors' summed cost shifters
# 4. within nest competitor's observed quality
# 5. other nest's competitors' summed quality.

nl_2sls_all <-
  feols(ln_sj_s0 ~ -1 + quality + length + roof | 
        prices + log(within_nest_ms) ~ cost_shifter 
        + own_n_comp_costs 
        + own_n_obs_quality 
        + other_n_cost_s
        + other_n_qual,
      data = boat_dt)

#Same without roof
nl_2sls_all_nr <-
  feols(ln_sj_s0 ~ -1 + quality + length | 
        prices + log(within_nest_ms) ~ cost_shifter 
        + own_n_comp_costs 
        + own_n_obs_quality 
        + other_n_cost_s
        + other_n_qual,
      data = boat_dt)

Below are results without roof in mean quality compared with the logit results. The results are very similar whether we use the three instruments (3rd column) or the full set of instruments (last column).

etable(logit_2sls, nl_nr, nl_2sls_nr, nl_2sls_all_nr)

##                             logit_2sls              nl_nr         nl_2sls_nr
## Dependent Var.:               ln_sj_s0           ln_sj_s0           ln_sj_s0
##                                                                             
## prices              -1.712*** (0.0559) 0.1313*** (0.0185) -1.289*** (0.0689)
## quality              1.616*** (0.0483) 0.4716*** (0.0259) 0.9330*** (0.0507)
## length              0.4306*** (0.0243)   -0.0083 (0.0101) 0.6249*** (0.0300)
## roof                 3.111*** (0.0825)                                      
## log(within_nest_ms)                    0.9727*** (0.0152) 0.6690*** (0.0450)
## ___________________ __________________ __________________ __________________
## S.E. type                          IID                IID                IID
## Observations                     4,000              4,000              4,000
## R2                            -0.33965            0.57183           -0.13569
## Adj. R2                       -0.34065            0.57150           -0.13654
## 
##                         nl_2sls_all_nr
## Dependent Var.:               ln_sj_s0
##                                       
## prices              -1.285*** (0.0636)
## quality             0.9621*** (0.0489)
## length              0.6091*** (0.0286)
## roof                                  
## log(within_nest_ms) 0.5912*** (0.0396)
## ___________________ __________________
## S.E. type                          IID
## Observations                     4,000
## R2                            -0.16355
## Adj. R2                       -0.16442
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Below are results with roof in mean quality compared with the logit results. Remember the true parameter values are quality = 2, prices = -2, length = 0.5 and (mean of) roof = 4. We see that the value of the nesting parameter is between 0 and 1, suggesting that products within a nest are closer substitutes for each other than products in different nests.

etable(logit_2sls, nl, nl_2sls, nl_2sls_all)

##                             logit_2sls                  nl            nl_2sls
## Dependent Var.:               ln_sj_s0            ln_sj_s0           ln_sj_s0
##                                                                              
## prices              -1.712*** (0.0559) -0.0799*** (0.0185) -1.442*** (0.0692)
## quality              1.616*** (0.0483)  0.6804*** (0.0248)  1.415*** (0.0582)
## length              0.4306*** (0.0243)  -0.0259** (0.0093) 0.3791*** (0.0222)
## roof                 3.111*** (0.0825)   1.115*** (0.0397)  2.710*** (0.1066)
## log(within_nest_ms)                     0.8631*** (0.0144) 0.2430*** (0.0530)
## ___________________ __________________ ___________________ __________________
## S.E. type                          IID                 IID                IID
## Observations                     4,000               4,000              4,000
## R2                            -0.33965             0.64239            0.01259
## Adj. R2                       -0.34065             0.64204            0.01160
## 
##                            nl_2sls_all
## Dependent Var.:               ln_sj_s0
##                                       
## prices              -1.382*** (0.0546)
## quality              1.370*** (0.0488)
## length              0.3682*** (0.0200)
## roof                 2.621*** (0.0864)
## log(within_nest_ms) 0.2984*** (0.0392)
## ___________________ __________________
## S.E. type                          IID
## Observations                     4,000
## R2                             0.07960
## Adj. R2                        0.07868
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

4.8 Calculate elasticities for nested logit.

I use the estimates from nl_2sls (with roof in the mean utility and only three instruments).
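The expression evaluated in the code below is the nested logit counterpart of the formula in 4.4. As a sketch, differentiating the nested logit share with respect to the product's own price gives

\[\begin{align*} \eta_{jj} = \frac{\partial s_{jt}}{\partial p_{jt}}\frac{p_{jt}}{s_{jt}} = \alpha p_{jt}\left(\frac{1}{1-\sigma} - \frac{\sigma}{1-\sigma}s_{jt/g} - s_{jt}\right) \end{align*}\]

With \(\sigma = 0\) this collapses to the logit expression \(\alpha(1-s_{jt})p_{jt}\).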

alpha <- nl_2sls$coefficients["fit_prices"]
names(alpha) <- NULL

sigma <- nl_2sls$coefficients["fit_log(within_nest_ms)"]
names(sigma) <- NULL

boat_dt[, n_elasticity_jt := alpha * prices *
          ((1 / (1 - sigma)) - (sigma / (1 - sigma) * within_nest_ms) - shares)
        ]

# plots
n_logit_plot <-
  ggplot(data = boat_dt,
         aes(
           x = n_elasticity_jt,
           group = as.factor(firm_ids),
           fill = as.factor(firm_ids)
         )) + 
  geom_density(alpha = 0.4) +
  ggtitle("Nested Logit Price Elasticities Across Markets")

par(mar = c(4, 4, .1, .1))
logit_plot
n_logit_plot

boat_dt[, .("Average elasticity" = mean(n_elasticity_jt)), by = firm_ids]

##    firm_ids Average elasticity
##       <num>              <num>
## 1:        1          -5.695437
## 2:        2          -5.484764
## 3:        3          -4.065844
## 4:        4          -3.933168

These are quite close to the average logit price elasticities: a bit further from 0 for firms 1 and 2 (boats with a roof) and practically unchanged for firms 3 and 4.

4.9 Calculate diversion ratios for nested logit.

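# Cross-nest diversion ratio: (1 - sigma) * s_k / (1 - sigma * s_{j|g} - (1 - sigma) * s_j).
# With sigma = 0 this reduces to the logit ratio s_k / (1 - s_j).
D_13_n <-
  boat_dt[firm_ids == 3, shares * (1 - sigma)] /
  boat_dt[firm_ids == 1, 1 - sigma * within_nest_ms - (1 - sigma) * shares]

D_31_n <-
  boat_dt[firm_ids == 1, shares * (1 - sigma)] /
  boat_dt[firm_ids == 3, 1 - sigma * within_nest_ms - (1 - sigma) * shares]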


data.table("D_13" = mean(logit_div_13),
           "D_31" = mean(logit_div_31),
           "D_13_n" = mean(D_13_n),
           "D_31_n" = mean(D_31_n))

##         D_13      D_31     D_13_n    D_31_n
##        <num>     <num>      <num>     <num>
## 1: 0.1921416 0.3445739 0.09510307 0.2321172

The nested logit diversion ratios between products 1 and 3 are smaller than the logit ones. This makes sense since products 1 and 3 are in different nests, and the nested logit model is able to capture the fact that there is less substitution between boats with and without a roof. Therefore, we expect the nested logit diversion ratios to be closer to the true substitution patterns.
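The cross-nest diversion ratio can be verified numerically. For products \(j\) and \(k\) in different nests, the nested logit share derivatives imply \(D_{jk} = (1-\sigma)s_{kt} / (1 - \sigma s_{jt/g} - (1-\sigma)s_{jt})\). The sketch below checks this against finite differences; all parameter values, prices and the vector `other` are hypothetical and not taken from the estimates above.

```r
# Hypothetical values: one market, nests {1, 2} and {3, 4}
alpha <- -1.4
sigma <- 0.25
p     <- c(5.9, 4.6, 1.0, 3.2)
other <- c(8.0, 6.5, 1.5, 4.5)   # everything in delta except alpha * p
nest  <- c(1, 1, 2, 2)

nl_shares <- function(p) {
  e <- exp((alpha * p + other) / (1 - sigma))
  D <- ave(e, nest, FUN = sum)                        # nest sum, repeated per product
  denom <- 1 + sum(tapply(e, nest, sum)^(1 - sigma))  # includes the outside good
  (e / D) * D^(1 - sigma) / denom
}

s <- nl_shares(p)
s_within <- s / ave(s, nest, FUN = sum)

# Formula for diversion from product 1 to product 3 (different nests)
D_13_formula <- (1 - sigma) * s[3] /
  (1 - sigma * s_within[1] - (1 - sigma) * s[1])

# Numerical diversion: bump p_1, compare the gain of 3 with the loss of 1
h <- 1e-6
p_up <- p
p_up[1] <- p_up[1] + h
ds <- (nl_shares(p_up) - s) / h
D_13_numeric <- -ds[3] / ds[1]

all.equal(D_13_formula, D_13_numeric, tolerance = 1e-4)
D_13_formula <= s[3] / (1 - s[1])   # not above the logit diversion ratio
```

Because \(s_{jt/g} \geq s_{jt}\), the cross-nest ratio cannot exceed the logit ratio \(s_{kt}/(1-s_{jt})\) when \(0 \leq \sigma < 1\), and the two coincide at \(\sigma = 0\).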