Bayesian Distribution of Sample Mean

This Demonstration provides Bayesian estimates of the posterior distributions of the mean $\mu$ and the standard deviation $\sigma$ of a normally distributed random variable $x$. These posterior distributions are based on $n$ independent observations of the random variable that have sample mean $\bar{x}$ and sample standard deviation $s$. Prior knowledge about statistical parameters is an important part of Bayesian statistics. In this case, it is initially assumed that the unknown mean is uniformly distributed on the interval $[\mu_{\min}, \mu_{\max}]$ and that the unknown standard deviation follows a Jeffreys prior distribution on the interval $[\sigma_{\min}, \sigma_{\max}]$. Bayes's theorem provides a convenient way of combining the prior information and the observed information into a posterior probabilistic characterization of the unknown parameters $\mu$ and $\sigma$.

Contributed by: Marshall Bradley (July 2011)
Open content licensed under CC BY-NC-SA



The Bayesian approach to statistics provides an intuitive way of making probabilistic statements about statistical parameters of interest. Consider a situation in which $n$ independent samples $D = \{x_1, \ldots, x_n\}$ of a random variable $x$ have been obtained. Based upon prior information, the random variable is known to be normally distributed with unknown mean $\mu$ and standard deviation $\sigma$. Additional initial information $I$ consists of a priori knowledge about the distribution of the unknown mean and the unknown standard deviation of the random variable $x$. In the Bayesian approach, this a priori information is formally represented by the prior probability density functions $p(\mu \mid I)$ defined on the interval $[\mu_{\min}, \mu_{\max}]$ and $p(\sigma \mid I)$ defined on the interval $[\sigma_{\min}, \sigma_{\max}]$.

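Bayes's theorem can be used to compute the joint posterior probability distribution of $\mu$ and $\sigma$ given the data $D$ and the initial information $I$ about $\mu$ and $\sigma$. Formally the result for the posterior distribution is

$$p(\mu, \sigma \mid D, I) = \frac{p(\mu \mid I)\, p(\sigma \mid I)\, p(D \mid \mu, \sigma, I)}{\int_{\mu_{\min}}^{\mu_{\max}} \int_{\sigma_{\min}}^{\sigma_{\max}} p(\mu' \mid I)\, p(\sigma' \mid I)\, p(D \mid \mu', \sigma', I)\, d\sigma'\, d\mu'},$$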

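where $p(D \mid \mu, \sigma, I)$ is the likelihood of the data $D$, given specific values of the parameters $\mu$ and $\sigma$. Since the data samples are independent and drawn from a normal distribution, the likelihood of the data can be written

$$p(D \mid \mu, \sigma, I) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[-\frac{(x_i - \mu)^2}{2\sigma^2}\right] = (2\pi\sigma^2)^{-n/2} \exp\!\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2\right].$$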

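The likelihood of the data can be expressed in terms of the data sample mean $\bar{x}$ and standard deviation $s$. The result is

$$p(D \mid \mu, \sigma, I) = (2\pi\sigma^2)^{-n/2} \exp\!\left\{-\frac{n}{2\sigma^2} \left[(\mu - \bar{x})^2 + s^2\right]\right\},$$

where

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$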

This last result demonstrates that knowledge about the data $D$ is captured by the sample statistics $\bar{x}$ and $s$ together with the sample size $n$.
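As a quick numerical check of this claim, the following Python sketch compares the log-likelihood computed sample by sample with a version that uses only the sample size, the sample mean, and the sample standard deviation. The data values and function names are illustrative assumptions. The sample standard deviation is assumed to use the divisor n rather than n - 1, which is the convention under which the summary form used here is exact.

```python
import math

def log_likelihood_direct(xs, mu, sigma):
    """Sum of log normal densities over the individual samples."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in xs
    )

def log_likelihood_summary(n, xbar, s, mu, sigma):
    """Same quantity written with only n, the sample mean, and s (divisor n)."""
    return (-0.5 * n * math.log(2 * math.pi * sigma**2)
            - n * ((mu - xbar) ** 2 + s**2) / (2 * sigma**2))

xs = [4.1, 5.3, 3.8, 6.0, 5.2]  # illustrative data, not from the source
n = len(xs)
xbar = sum(xs) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)  # assumption: divide by n

direct = log_likelihood_direct(xs, mu=5.0, sigma=1.2)
summary = log_likelihood_summary(n, xbar, s, mu=5.0, sigma=1.2)
print(direct, summary)  # the two values should agree up to rounding error
```

Any other trial values of the mean and standard deviation should give the same agreement, since the identity behind it is purely algebraic.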

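The posterior distribution can now be written

$$p(\mu, \sigma \mid D, I) = \frac{p(\mu \mid I)\, p(\sigma \mid I)\, \sigma^{-n} \exp\!\left\{-\frac{n}{2\sigma^2} \left[(\mu - \bar{x})^2 + s^2\right]\right\}}{\int_{\mu_{\min}}^{\mu_{\max}} \int_{\sigma_{\min}}^{\sigma_{\max}} p(\mu' \mid I)\, p(\sigma' \mid I)\, \sigma'^{-n} \exp\!\left\{-\frac{n}{2\sigma'^2} \left[(\mu' - \bar{x})^2 + s^2\right]\right\} d\sigma'\, d\mu'}.$$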

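The posterior distribution of $\mu$ can be found by marginalizing across $\sigma$. Formally the result is

$$p(\mu \mid D, I) = \int_{\sigma_{\min}}^{\sigma_{\max}} p(\mu, \sigma \mid D, I)\, d\sigma.$$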

For a normal distribution, the mean $\mu$ and standard deviation $\sigma$ are respectively position and scale parameters. A normal distribution is centered on $\mu$ and has a width that is related to $\sigma$. In situations where there is significant prior uncertainty about $\mu$ and $\sigma$, it is appropriate to assume that the mean is a priori uniformly distributed with probability density function

$$p(\mu \mid I) = \frac{1}{\mu_{\max} - \mu_{\min}}, \quad \mu_{\min} \le \mu \le \mu_{\max}.$$

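Since the standard deviation is a scale parameter and not a position parameter, it is appropriate to assume that the prior probability distribution for $\sigma$ is given by the Jeffreys prior

$$p(\sigma \mid I) = \frac{1}{\sigma \ln(\sigma_{\max} / \sigma_{\min})}, \quad \sigma_{\min} \le \sigma \le \sigma_{\max}.$$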
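One way to see why a prior proportional to 1/σ suits a scale parameter is that it assigns the same probability to every interval spanning the same ratio of σ values. The sketch below checks this numerically, using illustrative bounds and hypothetical helper names that are not part of the Demonstration.

```python
import math

def uniform_prior(mu, mu_min, mu_max):
    """Flat density on [mu_min, mu_max], zero elsewhere."""
    return 1.0 / (mu_max - mu_min) if mu_min <= mu <= mu_max else 0.0

def jeffreys_prior(sigma, sigma_min, sigma_max):
    """Density proportional to 1/sigma on [sigma_min, sigma_max], zero elsewhere."""
    if not (sigma_min <= sigma <= sigma_max):
        return 0.0
    return 1.0 / (sigma * math.log(sigma_max / sigma_min))

def midpoint_integral(f, a, b, steps=20000):
    """Simple midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

sigma_min, sigma_max = 0.5, 8.0  # illustrative bounds (assumption)
density = lambda sig: jeffreys_prior(sig, sigma_min, sigma_max)

total_mass = midpoint_integral(density, sigma_min, sigma_max)
mass_low_octave = midpoint_integral(density, 0.5, 1.0)   # sigma doubles from 0.5 to 1
mass_high_octave = midpoint_integral(density, 4.0, 8.0)  # sigma doubles from 4 to 8

# Each doubling of sigma carries probability ln 2 / ln 16 = 0.25.
print(total_mass, mass_low_octave, mass_high_octave)
```

A uniform density in σ would instead put far more mass on the interval from 4 to 8 than on the interval from 0.5 to 1, which amounts to a built-in preference for large scales.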

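With these specific representations for $p(\mu \mid I)$ and $p(\sigma \mid I)$, the posterior distribution of $\mu$ can be written in terms of the initial and observed information:

$$p(\mu \mid D, I) = \frac{\int_{\sigma_{\min}}^{\sigma_{\max}} \sigma^{-(n+1)} \exp\!\left\{-\frac{n}{2\sigma^2} \left[(\mu - \bar{x})^2 + s^2\right]\right\} d\sigma}{\int_{\mu_{\min}}^{\mu_{\max}} \int_{\sigma_{\min}}^{\sigma_{\max}} \sigma'^{-(n+1)} \exp\!\left\{-\frac{n}{2\sigma'^2} \left[(\mu' - \bar{x})^2 + s^2\right]\right\} d\sigma'\, d\mu'}, \quad \mu_{\min} \le \mu \le \mu_{\max}.$$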

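The posterior probability distribution of $\sigma$ is

$$p(\sigma \mid D, I) = \frac{\sigma^{-(n+1)} \int_{\mu_{\min}}^{\mu_{\max}} \exp\!\left\{-\frac{n}{2\sigma^2} \left[(\mu - \bar{x})^2 + s^2\right]\right\} d\mu}{\int_{\mu_{\min}}^{\mu_{\max}} \int_{\sigma_{\min}}^{\sigma_{\max}} \sigma'^{-(n+1)} \exp\!\left\{-\frac{n}{2\sigma'^2} \left[(\mu' - \bar{x})^2 + s^2\right]\right\} d\sigma'\, d\mu'}, \quad \sigma_{\min} \le \sigma \le \sigma_{\max}.$$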

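In the special case of a single observation of the random variable $x$, that is, $n = 1$, and complete uncertainty in the mean $\mu$, that is, $\mu_{\min} \to -\infty$ and $\mu_{\max} \to \infty$, then $s = 0$, and the posterior distribution of the mean is given by

$$p(\mu \mid D, I) = \frac{1}{\sqrt{2\pi}\, \ln(\sigma_{\max} / \sigma_{\min})} \int_{\sigma_{\min}}^{\sigma_{\max}} \sigma^{-2} \exp\!\left[-\frac{(\mu - \bar{x})^2}{2\sigma^2}\right] d\sigma,$$

and the posterior distribution of the standard deviation is

$$p(\sigma \mid D, I) = \frac{1}{\sigma \ln(\sigma_{\max} / \sigma_{\min})}, \quad \sigma_{\min} \le \sigma \le \sigma_{\max}.$$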

This is identical to the prior distribution $p(\sigma \mid I)$. Thus in the special case $n = 1$ and complete uncertainty regarding $\mu$, Bayes's theorem says that a single observation tells us nothing about its own uncertainty.
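The single-observation result can be checked numerically. In the sketch below, the function name, the observation value, and the integration settings are illustrative assumptions. For one observation with a very wide range for the mean, the unnormalized posterior of the standard deviation is taken as σ^-2 times the integral over the mean of the normal kernel. If that posterior is proportional to 1/σ, multiplying it by σ should give the same constant for every σ.

```python
import math

def sigma_posterior_unnormalized(sigma, x1, half_width=400.0, steps=40000):
    """n = 1, s = 0: sigma^-2 times the integral over mu of exp(-(mu - x1)^2 / (2 sigma^2)).

    The integral runs over [x1 - half_width, x1 + half_width], which stands in
    for an unbounded range of the mean (assumption: half_width >> sigma).
    """
    h = 2 * half_width / steps
    integral = 0.0
    for k in range(steps):
        mu = x1 - half_width + (k + 0.5) * h  # midpoint of the k-th cell
        integral += math.exp(-((mu - x1) ** 2) / (2 * sigma**2))
    return h * integral / sigma**2

x1 = 3.7  # the single observation (illustrative value)
scaled = [s * sigma_posterior_unnormalized(s, x1) for s in (0.5, 1.0, 2.0, 4.0, 8.0)]
print(scaled)  # every entry should be close to sqrt(2 * pi)
```

Because the scaled values do not depend on σ, normalizing over the allowed range of σ reproduces the Jeffreys prior, in line with the statement above.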

In this Demonstration, you can choose values of the initial data $\mu_{\min}$, $\mu_{\max}$, $\sigma_{\min}$, $\sigma_{\max}$ and the observed data $n$, $\bar{x}$, $s$, and investigate the effects of these choices on the posterior distributions $p(\mu \mid D, I)$ and $p(\sigma \mid D, I)$. The black dots on the horizontal axes in the lower two plots indicate the locations of the observed data $\bar{x}$ and $s$ with respect to their respective posterior distributions.
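For readers without access to the Demonstration, the calculation behind the two posterior plots can be approximated on a grid. The following Python sketch is not the author's implementation. The function name, grid size, and input values are assumptions, and it relies on a flat prior for the mean, a 1/σ prior for the standard deviation, and a sample standard deviation defined with divisor n.

```python
import math

def grid_posteriors(n, xbar, s, mu_min, mu_max, sigma_min, sigma_max, grid=200):
    """Return (mus, p_mu, sigmas, p_sigma): midpoint grids and normalized marginals."""
    d_mu = (mu_max - mu_min) / grid
    d_sigma = (sigma_max - sigma_min) / grid
    mus = [mu_min + (i + 0.5) * d_mu for i in range(grid)]
    sigmas = [sigma_min + (j + 0.5) * d_sigma for j in range(grid)]

    # Unnormalized joint posterior: the flat prior in mu is a constant, the
    # 1/sigma prior combines with the likelihood factor sigma^-n, and the
    # exponent depends on the data only through n, xbar, and s.
    joint = [
        [sig ** -(n + 1) * math.exp(-n * ((mu - xbar) ** 2 + s**2) / (2 * sig**2))
         for sig in sigmas]
        for mu in mus
    ]
    total = sum(sum(row) for row in joint) * d_mu * d_sigma

    # Marginalize: sum over sigma for p(mu), sum over mu for p(sigma).
    p_mu = [sum(row) * d_sigma / total for row in joint]
    p_sigma = [sum(joint[i][j] for i in range(grid)) * d_mu / total
               for j in range(grid)]
    return mus, p_mu, sigmas, p_sigma

# Illustrative inputs (not taken from the Demonstration).
mus, p_mu, sigmas, p_sigma = grid_posteriors(
    n=10, xbar=5.0, s=1.0, mu_min=0.0, mu_max=10.0, sigma_min=0.2, sigma_max=5.0
)
mu_mode = mus[p_mu.index(max(p_mu))]
sigma_mode = sigmas[p_sigma.index(max(p_sigma))]
print(mu_mode, sigma_mode)  # grid points nearest the peaks of the two marginals
```

With these inputs the marginal for the mean should peak near the sample mean and the marginal for the standard deviation near the sample standard deviation. Narrowing the prior intervals or reducing the sample size changes the shapes in the same way the Demonstration's controls do.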


