Instructions

Delmar and Wiklund (2008) study the relationship between small business managers' growth motivation and firm growth. One of their hypotheses is:

     Hypothesis 2: Growth motivation at T1 has a positive effect on growth at T2.

Your task is to do a replication study using the provided data. The data are from the Orbis database, available to all Aalto students (https://primo.aalto.fi/permalink/358AALTO_INST/ngpgq9/alma997722844406526), and from a longitudinal survey of software companies in Finland. The data from the Orbis database are self-explanatory. The survey data are from question number 4 of the survey form, which is provided as a part of the data package: "How well do the following statements describe the growth of your firm?"

(The data are anonymized by shuffling the company identifiers in both datasets in the same way.)

Download the data files here

Your task is to combine the data, assess the dimensionality of the survey scale using exploratory factor analysis, construct one or more summed scales from the survey items and assess their reliabilities, and then finally do a regression analysis of realized growth on growth motivation and control variables. This exercise will familiarize you with simple data management tasks as well as arguing reliability and construct validity.

If you want, you can extend the analysis in a few different ways:

  1. In addition to exploratory factor analysis, do a confirmatory factor analysis. If you choose to do this analysis, pay attention to the chi2 statistic and diagnose the model by inspecting the residuals.
  2. The assumption of independent observations is violated in the data because they are repeated observations of the same firms over time. You can optionally take this into consideration by using cluster robust standard errors. Moreover, regression gives the population average effect, which is rarely of interest because it does not have a clear causal interpretation. As an alternative, you can apply a model that produces the within effect.

Both these extensions are demonstrated in the model answer.
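
To see why the population average effect and the within effect can differ, consider the following minimal sketch in plain Python. It is not the model answer: the firms and numbers are made up for illustration, and the within effect is obtained by subtracting each firm's own mean from its observations before fitting, which is one common way to compute it. Cluster robust standard errors are not shown.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(groups, values):
    """Subtract each group's mean from that group's observations."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


# Hypothetical toy panel: (firm, growth motivation, realized growth).
rows = [
    ("A", 1.0, 5.0), ("A", 2.0, 4.0), ("A", 3.0, 3.0),
    ("B", 4.0, 9.0), ("B", 5.0, 8.0), ("B", 6.0, 7.0),
]
firms = [r[0] for r in rows]
motivation = [r[1] for r in rows]
growth = [r[2] for r in rows]

# Pooled regression: mixes differences between firms with changes within firms.
pooled_slope = ols_slope(motivation, growth)

# Within regression: uses only each firm's deviations from its own mean.
within_slope = ols_slope(
    demean_by_group(firms, motivation),
    demean_by_group(firms, growth),
)

print(f"pooled slope: {pooled_slope:.2f}")  # pooled slope: 0.80
print(f"within slope: {within_slope:.2f}")  # within slope: -1.00
```

In this toy panel firm B is higher on both variables, so the pooled slope is positive even though growth falls as motivation rises inside each firm. Real data will rarely be this extreme, but the example shows that the two estimates answer different questions.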

Document your analysis: what was the purpose of each analysis step and how did you interpret the results. The submitted report should be prepared according to instructions that you can find here.

Suggested outline of the analysis process and commands

The table below lists the subtasks and the Stata and R commands that you can use to complete the assignment. This is just one possible way to do the assignment, and you are of course free to do it in any other way.

Prepare the Orbis data

| Subtask | Stata commands and links | R commands and links |
| --- | --- | --- |
| Load the data | insheet | read.csv |
| Explore the data | UCLA website on data exploration | stem, pairs, summary, head, cor |
| Create new identifier variable. The data need to be set up as a panel a bit later and this requires numerical ID variables, but the raw data have text identifiers (e.g. FI12345678). | seq function in the egen command | Not applicable to R |
| Reshape from wide to long | reshape | melt, cast (reshape library), str_sub (stringr library), as.numeric |
| Set up the data as panel | xtset | Not applicable to R |
| Ensure that all variables that contain numeric data are stored as numeric and not as text | describe, destring | as.numeric, gsub, as.character |
| Generate new company level variables and transform existing variables if needed. You need to define at least one new variable for growth. Use the relative change of revenue over one or more years. | generate, replace; Stata documentation on lags and leads | R does not have a convenient built-in function for lagged variables. You need to either sort the data and shift the observation vectors yourself, or you can use the slide command in the DataCombine package. |
| Drop unnecessary variables | drop, keep | subset, Extract ([]) |
| Save the data on disk | save | Not needed in R because you can have multiple datasets in memory |
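
The two least obvious Orbis steps are reshaping from wide to long and computing growth from a lagged value. The sketch below shows the logic in plain Python so that it does not depend on either Stata or R. The column names (`revenue_2010` and so on), the firm identifiers, and the `n.a.` marker for missing values are assumptions made for illustration; check the actual file for the real names and missing-value codes.

```python
# Hypothetical wide-format data: one row per firm, one revenue column per year.
wide = [
    {"id": "FI00000001", "revenue_2010": "100", "revenue_2011": "110", "revenue_2012": "121"},
    {"id": "FI00000002", "revenue_2010": "200", "revenue_2011": "n.a.", "revenue_2012": "150"},
]


def to_number(text):
    """Convert text to float; return None when the value is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def wide_to_long(rows, prefix="revenue_"):
    """Turn one row per firm into one row per firm-year."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.startswith(prefix):
                year = int(column[len(prefix):])
                long_rows.append(
                    {"id": row["id"], "year": year, "revenue": to_number(value)}
                )
    return long_rows


def add_growth(long_rows, lag=1):
    """Add the relative change of revenue versus the same firm `lag` years earlier."""
    revenue = {(r["id"], r["year"]): r["revenue"] for r in long_rows}
    for r in long_rows:
        previous = revenue.get((r["id"], r["year"] - lag))
        current = r["revenue"]
        if previous is None or current is None or previous == 0:
            r["growth"] = None  # growth is undefined without a valid base year
        else:
            r["growth"] = (current - previous) / previous
    return long_rows


panel = add_growth(wide_to_long(wide))
for row in panel:
    print(row)
```

Looking up the lagged value by firm and year, instead of sorting and shifting the rows, means that a missing year or a firm boundary can never pair an observation with the wrong base year.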

Prepare the survey data

| Subtask | Stata commands and links | R commands and links |
| --- | --- | --- |
| Load the data | insheet | read.csv |
| Explore the data | UCLA website on data exploration | stem, pairs, summary, head, cor |
| Do a factor analysis of the survey data | factor, rotate | fa (from the psych package; you also need the GPArotation package) |
| Calculate one or more summed scales and assess their reliabilities | alpha | alpha (from the psych package) |
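
As a reminder of what the alpha commands compute, here is a minimal sketch of a summed scale and Cronbach's alpha in plain Python. The responses are invented, and the sketch assumes complete data with no missing answers, which the real survey data may not satisfy.

```python
from statistics import variance


def summed_scale(items):
    """Sum the item responses of each respondent.

    items: list of item columns, each a list of responses given by the
    same respondents in the same order.
    """
    return [sum(responses) for responses in zip(*items)]


def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the sum)."""
    k = len(items)
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance(summed_scale(items))
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses of five firms to three items on a 1-5 scale.
item1 = [1, 2, 3, 4, 5]
item2 = [2, 1, 4, 3, 5]
item3 = [1, 3, 2, 5, 4]

scale = summed_scale([item1, item2, item3])
alpha = cronbach_alpha([item1, item2, item3])

print(scale)            # [4, 6, 9, 12, 14]
print(round(alpha, 3))  # 0.838
```

Alpha is only meaningful for items that the factor analysis suggests belong to the same dimension, so decide which items go into each scale before computing it.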

Merge the datasets

| Subtask | Stata commands and links | R commands and links |
| --- | --- | --- |
| Prepare the datasets for merge. You need to merge the two datasets by company identifier and year. The variables on which you merge the two datasets need to have identical names in both datasets. Also, you need to make sure that there are no duplicate observations in the data on the identifying variables. | rename, duplicates list, duplicates drop | names, duplicated |
| Merge the datasets | merge | merge |
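
The requirement that the identifying variables are unique can be made concrete with a short sketch in plain Python. The variable names `id`, `year`, `growth`, and `motivation` and all values are hypothetical. The sketch keeps only the firm-years found in both datasets; whether unmatched observations are kept or dropped in Stata or R depends on the options you give to merge.

```python
from collections import Counter


def find_duplicate_keys(rows, keys=("id", "year")):
    """Return the key combinations that occur more than once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def merge_on_keys(left, right, keys=("id", "year")):
    """Join two datasets on the key variables, keeping only matched rows.

    Raises ValueError if either dataset has duplicate keys, because the
    match would then be ambiguous.
    """
    for name, rows in (("left", left), ("right", right)):
        duplicates = find_duplicate_keys(rows, keys)
        if duplicates:
            raise ValueError(f"duplicate keys in {name} dataset: {duplicates}")
    right_index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = right_index.get(tuple(row[k] for k in keys))
        if match is not None:
            merged.append({**row, **match})
    return merged


# Hypothetical prepared datasets with identically named key variables.
orbis = [
    {"id": 1, "year": 2010, "growth": 0.10},
    {"id": 1, "year": 2011, "growth": 0.05},
    {"id": 2, "year": 2010, "growth": -0.02},
]
survey = [
    {"id": 1, "year": 2010, "motivation": 4.0},
    {"id": 2, "year": 2010, "motivation": 2.5},
    {"id": 3, "year": 2010, "motivation": 3.0},
]

merged = merge_on_keys(orbis, survey)
print(len(merged))  # 2
```

After merging, compare the number of matched observations with the sizes of the two original datasets to confirm that the identifiers and years lined up as expected.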

Analyze the full data

| Subtask | Stata commands and links | R commands and links |
| --- | --- | --- |
| Descriptive statistics and correlations | correlate, summarize | summary, cor |
| Run regression models and compare the results | regress, estimates store, estimates table, estimates clear | lm, screenreg (from the texreg package) |
| Post-estimation diagnostics | Stata documentation for regression postestimation and regression postestimation plots | plot.lm, plot, residuals, avPlots (from the car package) |
| Data exclusions (e.g. outliers) and transformations, if needed | replace, drop | subset, Extract ([]) |

Other issues

| Subtask | Stata commands and links | R commands and links |
| --- | --- | --- |
| The commands for reshaping and merging require that there are no duplicate observations. | duplicates list, duplicates drop | |

Delmar, F., & Wiklund, J. (2008). The Effect of Small Business Managers' Growth Motivation on Firm Growth: A Longitudinal Study. Entrepreneurship Theory and Practice, 32(3), 437-457. doi:10.1111/j.1540-6520.2008.00235.x
