Suppose we assume that the data comes from the following misspesified indirect utility
\[\begin{align*} u_{ijt} = \alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \xi_{jt} + \epsilon_{ijt} \end{align*}\]
Utility with mean utility \(\delta_{jt}\) is
\[\begin{align*} u_{ijt} = \delta_{jt} + \epsilon_{ijt} \end{align*}\]
From the lecture 2
\[\begin{align*} Pr(y_{ijt} = 1) &= Pr(u_{ijt} \ge u_{ikt}, \; \forall k, \; j \neq k)\\ &= \int I(\epsilon_{ikt}-\epsilon_{ijt} \leq \delta_{jt} - \delta_{kt})f(\epsilon)d\epsilon \end{align*}\]
Assuming \(\epsilon\) is type 1 extreme value we have
\[\begin{align*} P(y_{ijt}=1) = P_{ijt} = \frac{\text{exp}(\delta_{jt})}{1 + \sum_k^J \text{exp}(\delta_{kt})} \end{align*}\]
What is the empirical counterpart of the \(P_{ijt}\)?
\[\begin{align*} s_{jt} = \frac{\text{exp}(\delta_{jt})}{1 + \sum_k^J \text{exp}(\delta_{kt})} \end{align*}\]
And after normalizing the outside good to zero
\[\begin{align*} s_{0t} = \frac{1}{1 + \sum_k^J \text{exp}(\delta_{kt})} \end{align*}\]
Take natural logarithms of the choice probabilities (\(ln(1)=0\))
\[\begin{align*} \text{ln}(s_{jt}) &= \delta_{jt} - \text{ln}(\sum_k^4 exp(\delta_{kt})) \\ \text{ln}(s_{0t}) &= -\text{ln}(\sum_k^{4} exp(\delta_{kt}))\\ \text{ln}(s_{jt}) - \text{ln}(s_{0t}) &= \delta_{jt} - \text{ln}(\sum_k^4 exp(\delta_{kt})) + \text{ln}(\sum_k^{4} exp(\delta_{kt}))\\ \text{ln}(s_{jt}/s_{0t}) &= \delta_{jt} =\alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \xi_{jt} \end{align*}\]
We have a nice linear expression that we can estimate using OLS and standard linear IV methods. Moreover, we only need to observe market level data.
# Loads data and converts it into a data.table object using a pipe %>%.
# %>% passes an object to the next line.
boat_dt <-
readxl::read_xlsx("../boat_data.xlsx") %>%
data.table(.)
head(boat_dt)
## prices shares lenght quality cost_shifter roof firm_ids
## 1: 5.858942 0.498415102 9.336710 1.2291334 0.8022839 1 1
## 2: 4.600115 0.102175147 7.092352 0.7524592 1.5204211 1 2
## 3: 1.000218 0.004850799 6.159682 0.3415478 0.3349984 0 3
## 4: 3.233179 0.216493489 5.780847 0.7570901 0.7599571 0 4
## 5: 3.827983 0.354432180 9.336710 0.5739109 0.1647224 1 1
## 6: 2.976261 0.136502418 7.092352 0.8794283 0.7626693 1 2
## market_ids
## 1: 1
## 2: 1
## 3: 1
## 4: 1
## 5: 2
## 6: 2
Calculate outside good’s market share and create logged dependent variable.
boat_dt[, outside_good_ms := 1 - sum(shares), by = market_ids]
boat_dt[, ln_sj_s0 := log(shares / outside_good_ms)]
head(boat_dt[, .(outside_good_ms, market_ids, ln_sj_s0)])
## outside_good_ms market_ids ln_sj_s0
## 1: 0.1780655 1 1.029282011
## 2: 0.1780655 1 -0.555462791
## 3: 0.1780655 1 -3.603007839
## 4: 0.1780655 1 0.195409217
## 5: 0.3555943 2 -0.003273548
## 6: 0.3555943 2 -0.957448236
Estimate the logit model.
logit <-
feols(ln_sj_s0 ~ -1 + quality + prices + lenght + roof, data = boat_dt)
etable(logit)
## logit
## Dependent Var.: ln_sj_s0
##
## quality 1.092*** (0.0329)
## prices -0.1248*** (0.0254)
## lenght -0.2148*** (0.0120)
## roof 1.758*** (0.0526)
## _______________ ___________________
## S.E. type IID
## Observations 4,000
## R2 0.32147
## Adj. R2 0.32096
boat_dt[, sum_obs_quality_iv := sum(quality) - quality, by = market_ids]
boat_dt[, sum_cost_shifter_iv := sum(cost_shifter) - cost_shifter, by = market_ids]
logit_2sls <-
feols(ln_sj_s0 ~ -1 + quality + lenght + roof |
prices ~ sum_obs_quality_iv + cost_shifter + sum_cost_shifter_iv,
data = boat_dt)
logit_2sls_no_sum_cs <-
feols(ln_sj_s0 ~ -1 + quality + lenght + roof |
prices ~ sum_obs_quality_iv + cost_shifter,
data = boat_dt)
etable(logit, logit_2sls, logit_2sls_no_sum_cs)
## logit logit_2sls logit_2sls_no_su..
## Dependent Var.: ln_sj_s0 ln_sj_s0 ln_sj_s0
##
## quality 1.092*** (0.0329) 1.616*** (0.0483) 1.614*** (0.0483)
## prices -0.1248*** (0.0254) -1.712*** (0.0559) -1.707*** (0.0563)
## lenght -0.2148*** (0.0120) 0.4306*** (0.0243) 0.4287*** (0.0245)
## roof 1.758*** (0.0526) 3.111*** (0.0825) 3.107*** (0.0827)
## _______________ ___________________ __________________ __________________
## S.E. type IID IID IID
## Observations 4,000 4,000 4,000
## R2 0.32147 -0.33965 -0.33572
## Adj. R2 0.32096 -0.34065 -0.33672
Let us derive the formula for the price elasticity.
\[\begin{align*} \frac{\partial s_{jt}}{\partial p_{jt}} \frac{p_{jt}}{s_{jt}} &= \frac{\frac{\partial}{\partial p_{jt}} e^{V_{jt}} (1 +\sum_k^4 e^{V_{kt}}) - \frac{\partial}{\partial p_{jt}}(1 +\sum_k^4 e^{V_{kt}}) e^{V_{jt}}}{(1 + \sum_k^4 e^{V_{kt}})^2} \frac{p_{jt}}{s_{jt}} \\ &= \frac{e^{V_{jt}} \frac{\partial V_{jt}}{\partial p_{jt}} (1 +\sum_k^4 e^{V_{kt}}) - e^{2V_{jt}} \frac{\partial V_{jt}}{\partial p_{jt}}}{(1 +\sum_k^4 e^{V_{kt}})^2} \frac{p_{jt}}{s_{jt}} \\ &= \alpha s_{jt} - \alpha s_{jt}^2 = \alpha(1 - s_{jt})s_{jt} \frac{p_{jt}}{s_{jt}} = \alpha(1 - s_{jt})p_{jt} \end{align*}\]
The elasticity only depends on firm \(j\)’s price and market share as well as \(\alpha\). I use estimate for alpha from logit_2sls spesification.
alpha <- logit_2sls$coefficients["fit_prices"]
names(alpha) <- NULL# remove column name
# calcluates price elasticity
boat_dt[, elasticity_jt := alpha * (1 - shares) * prices]
# plots
logit_plot <-
ggplot(data = boat_dt,
aes(
x = elasticity_jt,
group = as.factor(firm_ids),
fill = as.factor(firm_ids)
)) +
geom_density(alpha = 0.4) +
ggtitle("Logit Price Elasticities Across Markets")
logit_plot
boat_dt[, .("Average elasticity" = mean(elasticity_jt)), by = firm_ids]
## firm_ids Average elasticity
## 1: 1 -5.616266
## 2: 2 -5.304192
## 3: 3 -4.083346
## 4: 4 -3.945181
\[\begin{align*} D_{jk} = \frac{s_{kt}}{1 - s_{jt}} \end{align*}\]
Conlon and Mortimer (2021) define the diversion ratio as follows: “Diversion ratio \(D_{jk}\) is the fraction of consumers who leave product \(j\) after a price increase and switch to product \(k\).”
logit_div_13 <-
boat_dt[firm_ids == 3, shares] / boat_dt[firm_ids == 1, 1 - shares]
logit_div_31 <-
boat_dt[firm_ids == 1, shares] / boat_dt[firm_ids == 3, 1 - shares]
logit_div_12 <-
boat_dt[firm_ids == 2, shares] / boat_dt[firm_ids == 1, 1 - shares]
logit_div_43 <-
boat_dt[firm_ids == 3, shares] / boat_dt[firm_ids == 4, 1 - shares]
data.table("D_13" = mean(logit_div_13),
"D_31" = mean(logit_div_31),
"D_12" = mean(logit_div_12),
"D_43" = mean(logit_div_43))
## D_13 D_31 D_12 D_43
## 1: 0.1921416 0.3445739 0.3438125 0.1513089
These substitution patterns are lacking as I do not take into account that firm 3 has roof in it’s boat, while 2 does not.
Our estimation equation is \[\begin{align*} ln(s_{jtg}) - ln(s_{0t}) &= \delta_{jt} + \sigma ln(\bar{s_{jt/g}}) + \xi_{jt}\\ ln(s_{jtg}/s_{0t}) &= \alpha p_{jt} + \beta^{(1)}x_{jt} + \beta^{(2)}l_j + \beta^{(3)} roof_{j} + \sigma ln(\bar{s_{jt/g}}) + \xi_{jt} \end{align*}\]
exogenous variables: observed quality and length.
endogenous variables: price and within group market share \(s_{jt/g}\)
take the nest structure into account when instrumenting.
We instrument price with own cost shifter, within nets competitors cost shifter and other nest’s competitors’ summed cost shifters, as well as withing nest competitor’s observed quality and other nests competitors summed quality.
Market share is instrumented with withing nest competitor’s observed quality and other nests competitors’ summed observed quality.
boat_dt[, group := fifelse(firm_ids %in% c(1, 2), 1, 2)]
head(boat_dt[, .(firm_ids, group)])
## firm_ids group
## 1: 1 1
## 2: 2 1
## 3: 3 2
## 4: 4 2
## 5: 1 1
## 6: 2 1
boat_dt[, within_nest_ms := shares / sum(shares), by = .(group, market_ids)]
# cost shifter IV for within nest market share
boat_dt[,
own_n_comp_costs := sum(cost_shifter) - cost_shifter,
by = .(market_ids, group)]
boat_dt[,
n_cost_shifter := sum(cost_shifter),
by = .(group, market_ids)]
other_groups_cost_shift <- boat_dt[, .(n_cost_shifter, group, market_ids)]
other_groups_cost_shift[, merge_group := fifelse(group == 1, 2, 1)]
other_groups_cost_shift[, group := NULL]
other_groups_cost_shift[, group := merge_group]
other_groups_cost_shift[, other_n_cost_s := n_cost_shifter]
boat_dt[other_groups_cost_shift, on = c("group", "market_ids"),
other_n_cost_s := other_n_cost_s]
# obs quality shifter IV for within nest market share
boat_dt[,
own_n_obs_quality := sum(quality) - quality,
by = .(group, market_ids)]
boat_dt[,
n_obs_quality := sum(quality),
by = .(group, market_ids)]
other_groups_quality <- boat_dt[, .(n_obs_quality, group, market_ids)]
other_groups_quality[, merge_group := fifelse(group == 1, 2, 1)]
other_groups_quality[, group := NULL]
other_groups_quality[, group := merge_group]
other_groups_quality[, other_n_qual := n_obs_quality]
boat_dt[other_groups_quality, on = c("group", "market_ids"),
other_n_qual := i.other_n_qual]
head(boat_dt[,
.(firm_ids,
within_nest_ms,
market_ids,
own_n_obs_quality,
other_n_qual,
other_n_cost_s)]
)
## firm_ids within_nest_ms market_ids own_n_obs_quality other_n_qual
## 1: 1 0.82987545 1 0.7524592 1.098638
## 2: 2 0.17012455 1 1.2291334 1.098638
## 3: 3 0.02191518 1 0.7570901 1.981593
## 4: 4 0.97808482 1 0.3415478 1.981593
## 5: 1 0.72195397 2 0.8794283 2.171100
## 6: 2 0.27804603 2 0.5739109 2.171100
## other_n_cost_s
## 1: 1.094955
## 2: 1.094955
## 3: 2.322705
## 4: 2.322705
## 5: 1.483692
## 6: 1.483692
Estimate spesifications with and without IVs. I find that the model works better if I do not use other firms’ cost shifters as IVs.
n_logit<-
feols(ln_sj_s0 ~ -1 + quality + lenght + roof + prices + log(within_nest_ms),
data = boat_dt)
n_logit_2sls<-
feols(ln_sj_s0 ~ -1 + quality + lenght + roof |
prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality +
other_n_qual,
data = boat_dt)
n_logit_2sls_no_r<-
feols(ln_sj_s0 ~ -1 + quality + lenght |
prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality +
other_n_qual,
data = boat_dt)
n_logit_2sls_all_ivs <-
feols(ln_sj_s0 ~ -1 + quality + lenght + roof |
prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality +
other_n_qual + other_n_qual + own_n_comp_costs + other_n_cost_s,
data = boat_dt)
n_logit_2sls_all_ivs_no_r <-
feols(ln_sj_s0 ~ -1 + quality + lenght |
prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality +
other_n_qual + other_n_qual + own_n_comp_costs + other_n_cost_s,
data = boat_dt)
etable(n_logit, n_logit_2sls, n_logit_2sls_no_r, n_logit_2sls_all_ivs, logit_2sls, n_logit_2sls_all_ivs_no_r)
## n_logit n_logit_2sls n_logit_2sls_no_r
## Dependent Var.: ln_sj_s0 ln_sj_s0 ln_sj_s0
##
## quality 0.6804*** (0.0248) 1.415*** (0.0582) 0.9330*** (0.0507)
## lenght -0.0259** (0.0093) 0.3791*** (0.0222) 0.6249*** (0.0300)
## roof 1.115*** (0.0397) 2.710*** (0.1066)
## prices -0.0799*** (0.0185) -1.442*** (0.0692) -1.289*** (0.0689)
## log(within_nest_ms) 0.8631*** (0.0144) 0.2430*** (0.0530) 0.6690*** (0.0450)
## ___________________ ___________________ __________________ __________________
## S.E. type IID IID IID
## Observations 4,000 4,000 4,000
## R2 0.64239 0.01259 -0.13569
## Adj. R2 0.64204 0.01160 -0.13654
## n_logit_2sls_all.. logit_2sls n_logit_2sls_all..
## Dependent Var.: ln_sj_s0 ln_sj_s0 ln_sj_s0
##
## quality 1.370*** (0.0488) 1.616*** (0.0483) 0.9621*** (0.0489)
## lenght 0.3682*** (0.0200) 0.4306*** (0.0243) 0.6091*** (0.0286)
## roof 2.621*** (0.0864) 3.111*** (0.0825)
## prices -1.382*** (0.0546) -1.712*** (0.0559) -1.285*** (0.0636)
## log(within_nest_ms) 0.2984*** (0.0392) 0.5912*** (0.0396)
## ___________________ __________________ __________________ __________________
## S.E. type IID IID IID
## Observations 4,000 4,000 4,000
## R2 0.07960 -0.33965 -0.16355
## Adj. R2 0.07868 -0.34065 -0.16442
I use estimates from the n_logit_2sls.
alpha <- n_logit_2sls$coefficients["fit_prices"]
names(alpha) <- NULL
sigma <- n_logit_2sls$coefficients["fit_log(within_nest_ms)"]
names(sigma) <- NULL
boat_dt[, n_elasticity_jt := alpha * prices *
((1 / (1 - sigma)) - (sigma / (1 - sigma) * within_nest_ms) - shares)
]
# plots
n_logit_plot <-
ggplot(data = boat_dt,
aes(
x = n_elasticity_jt,
group = as.factor(firm_ids),
fill = as.factor(firm_ids)
)) +
geom_density(alpha = 0.4) +
ggtitle("Nested Logit Price Elasticities Across Markets")
par(mar = c(4, 4, .1, .1))
logit_plot
n_logit_plot
boat_dt[, .("Average elasticity" = mean(n_elasticity_jt)), by = firm_ids]
## firm_ids Average elasticity
## 1: 1 -5.695437
## 2: 2 -5.484764
## 3: 3 -4.065844
## 4: 4 -3.933168
D_13_n <-
boat_dt[firm_ids == 3, shares * (1 - sigma)] /
boat_dt[firm_ids == 1, 1 - sigma * within_nest_ms + (1 - sigma) * shares]
D_31_n <-
boat_dt[firm_ids == 1, shares * (1 - sigma)] /
boat_dt[firm_ids == 3, 1 - sigma * within_nest_ms + (1 - sigma) * shares]
D_12_n <-
boat_dt[firm_ids == 2, shares * (1 - sigma)] /
boat_dt[firm_ids == 1, 1 - sigma * within_nest_ms + (1 - sigma) * shares]
data.table("D_13" = mean(logit_div_13),
"D_31" = mean(logit_div_31),
"D_13_n" = mean(D_13_n),
"D_31_n" = mean(D_31_n))
## D_13 D_31 D_13_n D_31_n
## 1: 0.1921416 0.3445739 0.09510307 0.2321172
These are the better diversion ratios as they take into account whether some boats have roofs or not.
Conlon, C., & Mortimer, J. H. (2021). Empirical properties of diversion ratios. The RAND Journal of Economics, 52(4), 693-726.
2SLS nested logit without roof.
n_logit_estimates <-
feols(ln_sj_s0 ~ -1 + quality + lenght |
prices + log(within_nest_ms) ~ cost_shifter + own_n_obs_quality + other_n_qual + own_n_comp_costs,
data = boat_dt)
etable(n_logit_estimates)
## n_logit_estimates
## Dependent Var.: ln_sj_s0
##
## prices -1.485*** (0.0722)
## log(within_nest_ms) 0.4943*** (0.0444)
## quality 1.048*** (0.0543)
## lenght 0.6886*** (0.0323)
## ___________________ __________________
## S.E. type IID
## Observations 4,000
## R2 -0.40837
## Adj. R2 -0.40943