TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
Basic statistical tests (9:48)
This video presents basic statistical tests. The video goes through a null hypothesis significance test, the t test and z test, and tests of multiple statistics at once (the chi-square and Wald tests).
Transcript
In this video, I will explain a couple of basic statistical tests. We have addressed statistical testing before in the context of this example. In 2005, there was a finding that among the 500 largest Finnish companies, the companies led by a woman CEO were 4.7 percentage points more profitable than the companies led by male CEOs, and the question that we want to answer is whether that kind of difference could arise by chance alone. To assess whether chance provides a plausible explanation, we need to understand a couple of different things: most importantly, how many women-led companies there are, which is 22, and also how the return on assets is distributed. In this example, we knew that the mean return on assets was about 10. So could it be by chance only?
To formally address that question, we did a null hypothesis significance test. The idea of null hypothesis significance testing is that we first define a test statistic and then derive a sampling distribution for the test statistic under the null hypothesis that there is no difference. When we repeatedly drew samples from this population, taking 22 companies and comparing them against the 478 remaining companies, we found that the mean difference between these two groups follows a normal distribution. Sometimes we get a mean ROA for the smaller sample that is more than 10 points larger than for the larger sample, and sometimes the opposite result. The probability of obtaining a 4.7 point or greater difference in this direction is 0.04, so a 4% probability. Normally we will be using a two-tailed test, where we also include the corresponding probability area in the other tail, and that gives us an 8% probability. So it is close to the 5% threshold that we normally use for statistical significance.
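As a rough Python sketch of this resampling logic: the ROA values below are simulated, because the actual 2005 data is not shown here, so the exact p-values that come out depend on the assumed spread and are only illustrative. The group size of 22 women-led companies out of 500 and the observed 4.7-point difference are taken from the example.

```python
# A minimal sketch of the resampling idea, with simulated (not real) ROA data.
import numpy as np

rng = np.random.default_rng(1)
roa = rng.normal(loc=10, scale=15, size=500)  # hypothetical ROA values, mean about 10
observed_diff = 4.7                           # observed difference from the example

# Under the null hypothesis CEO gender is unrelated to ROA, so any random set
# of 22 companies should differ from the remaining 478 only by chance.
n_reps = 20_000
diffs = np.empty(n_reps)
for i in range(n_reps):
    perm = rng.permutation(roa)
    diffs[i] = perm[:22].mean() - perm[22:].mean()

p_one_tailed = np.mean(diffs >= observed_diff)
p_two_tailed = np.mean(np.abs(diffs) >= observed_diff)
print(f"one-tailed p = {p_one_tailed:.3f}, two-tailed p = {p_two_tailed:.3f}")
```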
So how do we generalize this idea? What is the logic behind this comparison? The reason this comparison is difficult to do is that we have to specify this distribution separately for each of our research studies or research problems, because the scale of the dependent variable varies. If we have ROA, then the scale of the statistic is somewhere between -10 and +10. If we have the number of personnel, then the scale of the statistic is in the hundreds or thousands, and if we have revenues, then it is on the scale of millions or billions of euros. This is not practical, because we would have to define the distribution for each problem separately. So we want to use one standardized approach for every problem of this kind, and to do that we use the t test.
The idea of a t test is that instead of looking at the raw statistic, the raw estimate of 4.7 here, we look at the estimate divided by its standard error, and that gives us a standardized metric that we can compare against the null distribution. The estimate divided by its standard error is distributed as Student's t, if it is a t test, and the idea is that instead of looking at the raw estimate we standardize the estimate. Remember that standardizing something means subtracting the mean first and then dividing by the standard deviation. The mean here is the null hypothesis value, so subtracting zero does not do anything, and then we divide by the standard error, which is the estimate of the standard deviation of this estimate if we actually did the study over and over many times. So that is the logic: we standardize the estimate based on its standard error, and then we compare it against the t distribution.
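A minimal sketch of that calculation: the 4.7 estimate is from the example, but the standard error and degrees of freedom below are hypothetical placeholders, since the video does not report them.

```python
# t test sketch: standardize the estimate by its standard error and compare
# against Student's t. The standard error and degrees of freedom are
# hypothetical placeholders, not values from the actual study.
from scipy import stats

estimate = 4.7   # observed difference from the example
se = 2.6         # hypothetical standard error
df = 498         # hypothetical degrees of freedom

t_stat = (estimate - 0) / se                    # subtract the null value (0), divide by SE
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df)  # two-tailed p-value
print(f"t = {t_stat:.2f}, two-tailed p = {p_two_tailed:.3f}")
```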
If our sample size is large, or we are using large-sample statistics, then this same test goes by the name z test. The z test statistic is defined exactly the same way as the t statistic; the difference is that in the z test we compare against the standard normal distribution. In large samples the Student's t distribution approaches the standard normal distribution, which has a mean of zero and a standard deviation of 1.
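One way to see this convergence is to take the same standardized statistic and compare the two-tailed probability from Student's t at increasing degrees of freedom with the one from the standard normal; the value 1.8 below is just an arbitrary example statistic.

```python
# For the same standardized statistic, the Student's t tail probability
# approaches the standard normal (z) tail probability as the degrees of
# freedom grow. The statistic value 1.8 is an arbitrary illustration.
from scipy import stats

stat = 1.8
for df in (5, 30, 100, 1000):
    print(f"df = {df:>4}: two-tailed t p = {2 * stats.t.sf(stat, df):.4f}")
print(f"standard normal z: two-tailed p = {2 * stats.norm.sf(stat):.4f}")
```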
So what is the difference, and why do we need two tests? The reason we need these two tests is that for some, or even most, statistical estimators, we do not actually know how they behave in small samples; their behavior is known only in large samples. For example, for maximum likelihood estimates, which we will cover later, we know how they work in very large samples, but in small samples their behavior is generally not known. In that case we rely on the assumption that our sample size is large enough, and then we use the z statistic.
In regression analysis, we know how the estimates are distributed even in small samples, and therefore we can use the t distribution. Whenever you see a t or z statistic, it just quantifies the estimate divided by its standard error, and your statistical software will pick the proper distribution for you automatically, so you do not have to know anything else except that these are basically the same thing, only compared against different distributions. So that is simple: two tests, t and z, depending on whether you know the small-sample distribution of the statistic. If you do not, then you use a z test and assume that the sample size is large enough.
So what if you have multiple statistics? Sometimes we want to test whether two regression coefficients, for example, are different from zero at the same time. The null hypothesis is that both regression coefficients are exactly zero, and if either one of them is nonzero, we reject the null. In t and z tests, we basically assess how far from the zero point the actual estimate is on a standardized metric. We have a line, and we go along that line and see how far we get. When we have two statistics, we have a plane: we have y here and x here, and we have an estimate of both regression coefficients, 1 and 2, or y and x, whatever, and we want to know how far again we are from the zero point.
High school math tells you that to get this distance, you raise the distance on the x-axis to the second power, you raise the distance on the y-axis to the second power, you take the sum, and then you take the square root of that sum; that gives you the distance. So basic geometry. In practice, we do exactly that, except we do not take the square root, because we just want some reference value, and it does not matter whether we use this distance or the square of this distance as the reference value. It is quicker if we do not take the square root.
So what we actually do for this kind of multiple hypothesis testing is that we compare the sum of squared variables against the chi-square distribution. The chi-square distribution is defined as the distribution of a sum of squared standard normal variables. So if you have two such variables, we take the square of each, then sum the squares, and that sum follows a chi-square distribution with two degrees of freedom.
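This definition is easy to check by simulation: squaring two independent standard normal variables and summing them produces draws that match the chi-square distribution with two degrees of freedom. The cutoff 5.99 below is approximately the 95th percentile of that distribution.

```python
# Check by simulation that the sum of two squared standard normal variables
# follows a chi-square distribution with two degrees of freedom.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
z1 = rng.standard_normal(1_000_000)
z2 = rng.standard_normal(1_000_000)
s = z1**2 + z2**2

cutoff = 5.99  # approximately the 95th percentile of chi-square with 2 df
print("simulated   P(S > 5.99):", np.mean(s > cutoff))
print("theoretical P(S > 5.99):", stats.chi2.sf(cutoff, df=2))
```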
So that has an obvious parallel to minimizing the sum of squared residuals; quite often we take sums of squares in statistics. How does this work in the context of the t test, or how can this chi-square test be seen as an extension of the z test? We just take the square of the estimate, divide it by the square of the standard deviation, or standard error in this case, and that provides us with something called a Wald test, which can be applied to multiple hypotheses.
So whenever you divide an estimate by its standard error, you are doing a z test or a t test. If you divide the squared estimate by the squared standard error, that is called a Wald test. The advantage of the Wald test is that it can be applied to these multiple, two-dimensional or higher-dimensional problems, and the reference distribution is the chi-square distribution. If you have two variables, then the sum of the squared estimates, each divided by its squared standard error, will follow a chi-square distribution with two degrees of freedom. If you have three variables, the corresponding sum of squares follows a chi-square distribution with three degrees of freedom. So that is the basics of chi-square testing: you are basically assessing how far, on a standardized metric, the combination of two or more estimates is from the zero point.
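The sketch below follows the simplified presentation above, in which the two estimates are treated as uncorrelated, so the Wald statistic is just the sum of the squared z statistics; in general the Wald test uses the full covariance matrix of the estimates, but the idea is the same. The coefficient and standard-error values are made up for illustration.

```python
# Wald test sketch for the joint null hypothesis that two regression
# coefficients are both zero, assuming the estimates are uncorrelated.
# The coefficients and standard errors below are hypothetical placeholders.
from scipy import stats

estimates = [0.42, -0.15]   # hypothetical coefficient estimates
std_errors = [0.20, 0.11]   # hypothetical standard errors

wald = sum((b / se) ** 2 for b, se in zip(estimates, std_errors))
df = len(estimates)         # one degree of freedom per tested coefficient
p_value = stats.chi2.sf(wald, df)
print(f"Wald = {wald:.2f}, df = {df}, p = {p_value:.3f}")
```

If the p-value is below the chosen threshold, the joint null that both coefficients are zero is rejected, which is exactly the interpretation described above.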