TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
Bootstrapping (15:57)
This video goes through bootstrapping. It explains what bootstrapping is and when it works. The video also explains bootstrapping a regression coefficient, confidence intervals, and normal-approximation and empirical confidence intervals.
Transcript
When we do statistical analysis, we always get a point estimate of the effect, just one regression coefficient or one number. We also need to know how certain we are about that number, and that certainty is quantified by the standard error. The standard error quantifies the precision, and we use the standard error and the actual estimate to calculate a statistic that gives us the p-value. In some scenarios calculating the standard error is hard, or it requires assumptions that we are not willing to make or that we know are not true for our particular data and analysis.

Bootstrapping provides an alternative way of calculating standard errors, or estimating how much a statistic would vary from one sample to another. Bootstrapping is a computational approach to the problem of calculating a standard error.
How bootstrapping works is that we have an original sample. Here we have a sample of 10 observations from a normally distributed population with a mean of 0 and a standard deviation of 1. That is our original sample, and its mean is 0.13. If we take multiple samples from the same population, this is the sampling distribution of the sample mean when the sample size is 10 from this population. Most of the time we get values close to 0, which is the population value of the mean, and sometimes we get estimates that are far from the actual population value.

The idea of bootstrapping is that if we don't know how to estimate the width or the shape of this sampling distribution using statistical theory or a closed-form equation, then we can do it empirically. Instead of calculating it with an equation, we take repeated samples from our original sample, which forms the population for the bootstrap. We draw one observation at random, say 0.31, and then we put it back, so every observation is allowed to be included in the sample multiple times. Then we randomly draw another one, 0.83, and put it back, then yet another number, and so on; here we happen to draw −0.84 a second time. Each of these numbers is chosen at random and does not depend on any previous choices. From this bootstrap sample we get a sample mean of 0.34, and we repeat the calculation many times, typically one hundred, five hundred, a thousand, or even ten thousand times, depending on the complexity of the calculation. A thousand repetitions is quite normal nowadays.

We can see that the sample mean varies from one bootstrap sample to another. The distribution of the sample mean calculated from our thousand bootstrap replications has about the same shape as the distribution we would get if we took the samples from the actual population. These two distributions are quite similar, and they approach each other when the sample size increases. We can use that knowledge to say that the bootstrap distribution is a good representation of the sampling distribution, and if we want to estimate the standard deviation of the sampling distribution, which is what the standard error quantifies, then we can just use the standard deviation of the bootstrap distribution.
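As a rough illustration of this resampling procedure, here is a minimal Python sketch using plain NumPy (not part of the lecture materials; the sample values and seed are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Made-up "original sample" of 10 observations from N(0, 1)
original = rng.normal(loc=0.0, scale=1.0, size=10)

# 1000 bootstrap replications: resample with replacement from the
# original sample and recompute the statistic (here the mean) each time.
boot_means = np.array([
    rng.choice(original, size=original.size, replace=True).mean()
    for _ in range(1000)
])

# The bootstrap standard error is the standard deviation of the
# statistic over the bootstrap replications.
print("sample mean:", original.mean())
print("bootstrap SE of the mean:", boot_means.std(ddof=1))
```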
Here we can see that the mean of the bootstrap distribution is slightly off; that is called the bootstrap bias. The mean of the bootstrap distribution sits roughly at the mean of this particular sample, not at the population mean. Also, the width of the bootstrap distribution is in this case slightly smaller, so its dispersion is slightly smaller than the dispersion of the actual sampling distribution, and that is also something that we sometimes need to take into consideration. The key thing in bootstrapping is that when the sample size increases, the mean and the standard deviation of the bootstrap distribution get closer to the mean and the standard deviation of the sampling distribution.
Let's take a look at a demonstration of how bootstrapping works. This is a video from the Department of Statistics at the University of Auckland. They demonstrate that you have your original sample with two variables, an X variable and a Y variable, and a regression coefficient. We calculate the regression coefficient, and we are interested in how much this regression coefficient, the slope, would vary if we were to take this sample over and over from the same population. That is what the standard error quantifies. For some reason we don't want to use the normal formula that our statistical software uses to calculate the standard error; we want to do it by bootstrapping. So we take samples from the original data. You can see that each observation can be included multiple times, and sometimes an observation is not included in the sample at all. We then get a regression coefficient that is slightly different from the original one. We take another bootstrap sample and get another regression coefficient, again slightly different from the original one. We take yet another bootstrap sample, get a slightly different one, and we go on a hundred or a thousand times. Ultimately we get an estimate of how much this regression coefficient would really vary if we were to take multiple different samples.

When you have a thousand samples, or a hundred samples, you can see how much the regression coefficient varies between the bootstrap samples. If the sample size is large enough, this variation over the bootstrap samples is a good approximation of how much the regression coefficient would vary if we were to draw independent samples from the same population and calculate the regression analysis again and again from those independent samples. Bootstrapping can be used to calculate the standard error, in which case we just take the standard deviation of these regression slopes, and that is our standard error estimate.
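A similar hedged sketch for the regression case, again with made-up data: whole (x, y) pairs are resampled with replacement and the slope is refitted each time.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Made-up data: one predictor x and an outcome y
n = 50
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

def slope(x, y):
    """OLS slope of y on x (model with an intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Resample whole (x, y) pairs with replacement and refit each time.
boot_slopes = np.empty(1000)
for b in range(boot_slopes.size):
    idx = rng.integers(0, n, size=n)   # row indices drawn with replacement
    boot_slopes[b] = slope(x[idx], y[idx])

print("slope estimate:", slope(x, y))
print("bootstrap SE of the slope:", boot_slopes.std(ddof=1))
```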
We can also use bootstrapping to calculate confidence intervals. The idea of a confidence interval is that instead of estimating a standard error and a p-value, we estimate a point estimate, for example one single value of a correlation, and then we estimate an interval, let's say a 95% interval, which has an upper limit and a lower limit. If we repeat the calculation many times from independent samples, then the population value will be within the interval 95% of the time, if it is a valid interval.

This is an example with a correlation. When there is a zero correlation in the population and we have a small sample size, the correlation estimates vary between about −0.2 and +0.2, and most of the time the confidence interval, which is drawn as a line here, includes the population value. For about two and a half percent of the replications the interval does not include the population value because the population value falls above the upper limit. Here we have extremely large correlations, and for about two and a half percent of the replications the population value falls below the lower limit. In 95% of the cases the population value is within the interval. That is the idea of a confidence interval. We can also see here that the width of the confidence interval depends on the correlation estimate: when the correlation estimate is very high, the confidence interval is narrow, and when the correlation estimate is very low, the confidence interval is a lot wider. So the confidence interval depends on the value of the statistic and also on the estimated standard error of the statistic.
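As a rough check of this coverage idea, here is a small simulation sketch for the mean of a normal population (an illustrative stand-in for the correlation example, not the lecture's own code):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Draw many independent samples from a population with mean 0, compute a
# 95% interval for the mean in each, and count how often the interval
# contains the true population value.
n, reps, hits = 50, 10_000, 0
for _ in range(reps):
    s = rng.normal(loc=0.0, scale=1.0, size=n)
    se = s.std(ddof=1) / np.sqrt(n)
    hits += (s.mean() - 1.96 * se) <= 0.0 <= (s.mean() + 1.96 * se)

print("coverage:", hits / reps)   # a valid 95% interval gives roughly 0.95
```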
Now there are a couple of ways that bootstrapping can be used for calculating a confidence interval. Normally when we do confidence intervals we use the normal approximation. The idea is that we assume that the estimate is normally distributed over repeated samples. Then we calculate the confidence interval as the estimate plus or minus 1.96, which covers 95% of the normal distribution, multiplied by the standard error. So if we have an estimate of a correlation here, then we multiply the standard error by 1.96: the estimate minus 1.96 times the standard error is the lower limit, and the estimate plus 1.96 times the standard error is the upper limit. In this example that gives limits of 1 percent and 13 percent when the actual estimate is about 5 percent. The way we use bootstrapping in this calculation is that the standard error is simply the standard deviation of the bootstrap estimates. So if we bootstrap a correlation, we calculate how much the correlation varies between the bootstrap samples using the standard deviation, plug that into the formula, and the formula gives us the confidence interval. This works when we can assume that the estimate is normally distributed.
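A minimal sketch of the normal-approximation bootstrap interval for a correlation, with made-up data (the 1.96 multiplier is the 95% normal quantile from the formula above):

```python
import numpy as np

rng = np.random.default_rng(seed=4)

# Made-up data with a modest positive correlation
n = 100
x = rng.normal(size=n)
y = 0.3 * x + rng.normal(size=n)
r_hat = np.corrcoef(x, y)[0, 1]          # point estimate of the correlation

# Bootstrap the correlation by resampling (x, y) pairs with replacement.
boot_r = np.empty(1000)
for b in range(boot_r.size):
    idx = rng.integers(0, n, size=n)
    boot_r[b] = np.corrcoef(x[idx], y[idx])[0, 1]

# Normal-approximation interval: estimate ± 1.96 × bootstrap standard error
se = boot_r.std(ddof=1)
print("95% CI:", (r_hat - 1.96 * se, r_hat + 1.96 * se))
```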
What if we can't assume that the estimate is normally distributed? That is when we can use empirical confidence intervals based on bootstrapping. The normal approximation interval assumes that the estimate is normally distributed, and then we can use the equation above; otherwise we can use empirical confidence intervals. The idea of an empirical confidence interval is that we do the bootstrapping and take our thousand bootstrap replications, ordered from smallest to largest. We take the 25th value of the bootstrap replicates, and that is the lower limit of the confidence interval, and we take the 975th value, and that is the upper limit. Those correspond to the 2.5% and 97.5% points, and this is called a percentile interval. So when we have this kind of bootstrap distribution, we take the 25th replication as our lower limit and the 975th replication as our upper limit, and that gives us the confidence interval for the mean that is estimated here.
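A minimal sketch of the percentile interval, again with a made-up sample: sort the thousand bootstrap replications and take the 25th and 975th values.

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Made-up original sample; the statistic of interest is the mean.
sample = rng.normal(loc=0.0, scale=1.0, size=10)

boot_means = np.sort([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(1000)
])

# Percentile interval: the 25th and 975th of the 1000 sorted replications
# (the 2.5% and 97.5% points of the bootstrap distribution).
lower, upper = boot_means[24], boot_means[974]   # 0-based indexing
print("95% percentile interval:", (lower, upper))
# Equivalently: np.percentile(boot_means, [2.5, 97.5])
```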
This approach has two problems. First, the bootstrap distribution is biased: the mean of these bootstrap replications is about 0.15, while the actual sample value of the mean is zero. To account for that bias we have bias-corrected confidence intervals. The idea of bias-corrected confidence intervals is that instead of taking the 25th and 975th bootstrap replicates as the endpoints, we first estimate how large the bootstrap bias is, and based on that estimate we take, for example, the 40th and 980th replications. So instead of taking the fixed 25th and 975th replicates, we adjust which replicates we take as the endpoints.

There is also the problem that the variance, the standard deviation of the bootstrap distribution, is not always the same as the standard deviation of the actual sampling distribution. In the correlation example you saw that the confidence interval got narrower as the correlation estimate went up, so the width of the interval depends on the value of the estimate. To take that into account we have bias-corrected and accelerated (BCa) confidence intervals, which apply the same idea as the bias-corrected ones, but instead of taking only the bias into account, they also take the estimated differences in variance of these two distributions into account when choosing the endpoints for the confidence intervals.
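In practice these interval types are usually implemented in statistical software. For example, a hedged sketch using SciPy's scipy.stats.bootstrap, which offers percentile, basic, and BCa methods (this assumes a reasonably recent SciPy and is not part of the lecture materials):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=6)
sample = rng.normal(loc=0.0, scale=1.0, size=50)

# BCa interval for the mean; method="percentile" or "basic" also work.
res = stats.bootstrap((sample,), np.mean, n_resamples=1000,
                      confidence_level=0.95, method="BCa",
                      random_state=rng)
print("bootstrap SE:", res.standard_error)
print("95% BCa interval:", res.confidence_interval)
```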
Now the question is: this looks really good, so can we estimate the variance of any statistic empirically without having to know the math? That is basically true, with some qualifications. The main qualification is that bootstrapping requires a large sample size. There is a good book chapter by Koopman and co-authors, in the book about statistical myths and urban legends edited by Vandenberg, and they point out three things. There is a claim in the literature that bootstrapping works well in small samples. However, bootstrapping assumes that the sample is representative of the population: if our sample is very different from the population, then the bootstrap samples that we take from our original sample cannot approximate how samples from the real population would actually behave. And sampling error, which means how different the sample is from the population, is troublesome in small samples, so in small samples the sample may not be a very accurate representation of the population. If small samples are not representative of the population, and we require that the sample must be representative of the population, then bootstrapping cannot work well in small samples. So bootstrapping generally requires a large sample size.

There are also some boundary conditions under which bootstrapping doesn't work even if you have a large sample. Such scenarios exist, but for most practical applications the sample size is the only thing that you need to be concerned about. The problem is that it is very hard to say when your sample size is large enough.