Wednesday, September 24, 2014

Understanding the distribution of the sample mean (x_bar)

Cool, say we have a huge population with mean Mu and variance Sigma^2. When doing a study by sampling, we take a random sample of n items, perform the study on the sample, and then generalize the results back to the population.

From the Central Limit Theorem, we know that for a sufficiently large sample size n the sample mean approximately follows a normal distribution, whatever the population distribution is (as long as the population variance is finite), such that:

x_bar ~ N (Mu, Sigma^2/n)
or say:
Expected (x_bar) = Mu
Variance (x_bar) = Sigma^2/n
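
To make this concrete, here is a small simulation sketch. The population is my own assumption (an exponential distribution with rate 1, so Mu = 1 and Sigma^2 = 1), chosen because it is clearly not normal; the sample size and number of repeats are also arbitrary choices. It draws many samples, computes x_bar for each, and compares the mean and variance of those x_bar values with Mu and Sigma^2/n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed skewed population: exponential with rate 1, so Mu = 1 and Sigma^2 = 1.
MU, SIGMA2 = 1.0, 1.0
n = 30            # sample size (arbitrary choice)
repeats = 20000   # number of independent samples (arbitrary choice)


def sample_mean(size):
    """Draw `size` items from the population and return their mean (x_bar)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(size))


# One x_bar per sample.
x_bars = [sample_mean(n) for _ in range(repeats)]

mean_of_x_bars = statistics.fmean(x_bars)
var_of_x_bars = statistics.pvariance(x_bars)

print("mean of x_bar    :", round(mean_of_x_bars, 4), " vs Mu        =", MU)
print("variance of x_bar:", round(var_of_x_bars, 4), " vs Sigma^2/n =", round(SIGMA2 / n, 4))
```

The two printed pairs should come out close to each other, even though the population itself is far from normal.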

Well, let's see a simple illustrative example. Suppose we have a population with mean Mu = 100.
Now, we take a sample and compute the sample mean, x_bar. We will most likely get an x_bar near 100, but not exactly 100. OK, let's take another 9 separate samples, and suppose we get these results:

First sample --> x_bar = 99.8
Second sample --> x_bar = 100.1
10th sample --> x_bar = 100.3

What we see is that the sample mean is usually close to the real population mean, sometimes a bit below it and sometimes a bit above. That is the meaning of Expected (x_bar) = Mu: averaged over many samples, x_bar equals Mu.
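
The ten-sample experiment can be mimicked with a few lines of code. This is only a sketch: the post gives Mu = 100 but no population shape, spread, or sample size, so the normal population, SIGMA = 2 and n = 25 below are assumptions of mine.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU = 100.0     # population mean from the example
SIGMA = 2.0    # assumed population standard deviation (not given in the post)
n = 25         # assumed sample size (not given in the post)

x_bars = []
for i in range(1, 11):
    # Draw one sample of n items and compute its mean.
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    x_bars.append(x_bar)
    print(f"Sample {i:2d} --> x_bar = {x_bar:.1f}")

print(f"Average of the 10 sample means = {statistics.fmean(x_bars):.2f}")
```

Each printed x_bar should land near 100 without being exactly 100, which is the behavior described above.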

Regarding the variance of the sample mean (x_bar): it always decreases as the sample size increases (Variance (x_bar) = Sigma^2/n), which is natural behavior. We may think of it this way: the larger the sample we use, the more precise our estimate of the population mean tends to be.
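
A quick way to see the Sigma^2/n behavior is to estimate the variance of x_bar for several sample sizes. The sketch below assumes a normal population with Mu = 100 and Sigma = 2 (my own choice of numbers) and compares the simulated variance with Sigma^2/n.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

MU, SIGMA = 100.0, 2.0   # assumed population parameters
repeats = 2000           # samples drawn per sample size (arbitrary choice)


def variance_of_sample_mean(n):
    """Estimate Variance(x_bar) from `repeats` independent samples of size n."""
    x_bars = [
        statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.pvariance(x_bars)


results = {n: variance_of_sample_mean(n) for n in (5, 50, 500)}

for n, simulated in results.items():
    theory = SIGMA ** 2 / n
    print(f"n = {n:3d}: simulated variance = {simulated:.4f}, Sigma^2/n = {theory:.4f}")
```

Each tenfold increase in n should cut the variance of x_bar by roughly a factor of ten.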
When the sample size goes to infinity (theoretically), the variance of x_bar goes to zero. The intuition here is that the sample becomes the same as the population (all items). Thus, the sample mean gives the real exact value of the population mean. There is no variability in the sample mean because it fully represents the population mean.
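
The "sample equals the population" idea is easiest to check on a finite population. The sketch below builds a hypothetical population of 1000 items (the numbers are assumptions of mine) and samples from it without replacement. Note that in this setting the variance of x_bar is Sigma^2/n multiplied by a finite population correction, (N - n)/(N - 1), so it reaches exactly zero once n equals the population size N.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

# Hypothetical finite population of N = 1000 items.
population = [random.gauss(100.0, 2.0) for _ in range(1000)]
mu = statistics.fmean(population)


def spread_of_x_bar(n, repeats=500):
    """Standard deviation of x_bar over repeated samples of size n,
    each drawn without replacement from the finite population."""
    x_bars = [statistics.fmean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.pstdev(x_bars)


spreads = {n: spread_of_x_bar(n) for n in (10, 100, 900, 1000)}
for n, s in spreads.items():
    print(f"n = {n:4d}: standard deviation of x_bar = {s:.4f}")

# A sample containing every item reproduces the population mean.
full_sample_mean = statistics.fmean(random.sample(population, len(population)))
print("population mean:", mu, " full-sample mean:", full_sample_mean)
```

With n = 1000 every sample is just the whole population in a different order, so x_bar no longer varies from sample to sample.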
