Snapshot 1: adjusting the number of bins
Snapshot 2: estimating a density
Snapshot 3: comparing two estimated densities
Snapshot 4: estimating a mixture density
The Demonstration can be used to estimate both ordinary distributions and two-component mixture distributions. (Mixture distributions are also called compound distributions.)
For the estimation of an ordinary distribution, a typical use of this Demonstration could proceed as follows:
1) Define the data as a list of values; give it the name "data".
2) Adjust the number of bins of the histogram (Snapshot 1).
3) Try several densities from "density 1". Once a promising candidate is found, keep it in "density 1" (Snapshot 2).
4) Choose other densities from "density 2", comparing them with "density 1" (Snapshot 3).
By doing these pairwise visual comparisons, try to find the best density, that is, the density that best fits the data as summarized by the histogram and gives the best quantile plot.
The quantile plot shows points whose first component is a component of the vector of sorted data and whose second component is the corresponding quantile calculated from the estimated density. The estimated density is better the closer the points of the quantile plot are to the diagonal line. The RMSE shown is the root mean square error between the sorted data and the calculated quantiles; the fit is better the closer the RMSE is to zero.
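The quantile plot and RMSE described above can be sketched in a few lines. This is a Python analogue using scipy, not the Demonstration's Mathematica code; the Gumbel density and the synthetic sample are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = np.sort(rng.gumbel(loc=10.0, scale=3.0, size=200))  # synthetic sample

# Fit a candidate density and compute the quantile for each plotting position.
loc, scale = stats.gumbel_r.fit(data)
n = len(data)
probs = (np.arange(1, n + 1) - 0.5) / n           # plotting positions
quantiles = stats.gumbel_r.ppf(probs, loc, scale)

# RMSE between the sorted data and the calculated quantiles; points
# (data[i], quantiles[i]) close to the diagonal indicate a good fit.
rmse = np.sqrt(np.mean((data - quantiles) ** 2))
```

Plotting `data` against `quantiles` together with the diagonal reproduces the visual check the Demonstration performs.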
Now consider the estimation of a two-component mixture distribution. In a typical use of the Demonstration, steps 1) and 2) are the same as above. Then proceed as follows:
3) Check the "mixture density" checkbox. Choose two densities, say f1 and f2, from "density 1" and "density 2". The mixture is then of the form w f1(x) + (1 − w) f2(x), where w is an unknown constant, 0 ≤ w ≤ 1.
4) Investigate various pairs of densities to find the best mixture (Snapshot 4).
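In code, the two-component mixture is just a weighted sum of two densities. A minimal Python sketch, with scipy's Gumbel (extreme value) and inverse Gaussian densities as a stand-in pair; the parameter values are illustrative, not fitted:

```python
import numpy as np
from scipy import stats

def mixture_pdf(x, w, params1, params2):
    """Mixture density w*f1(x) + (1 - w)*f2(x), with 0 <= w <= 1.

    Here f1 is the extreme value (Gumbel) density and f2 the inverse
    Gaussian density, as an example pair of components.
    """
    f1 = stats.gumbel_r.pdf(x, *params1)   # (loc, scale)
    f2 = stats.invgauss.pdf(x, *params2)   # (mu, loc, scale)
    return w * f1 + (1 - w) * f2

x = np.linspace(0.1, 30.0, 5)
y = mixture_pdf(x, 0.6, (10.0, 3.0), (1.0, 0.0, 8.0))
```

Because each component integrates to 1 and the weights sum to 1, the mixture is itself a valid density.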
A mixture density may be particularly valuable for data whose histogram does not easily suggest a good ordinary density. In particular, a mixture density may be useful when the histogram appears to be bimodal (i.e., contains two local maxima).
In the example shown in the snapshots, we consider monthly maximum wind speeds in Turku, Finland, from 1973 to 2009. For this data, the extreme value distribution gives a good fit (RMSE = 2.9, Snapshot 2). However, a mixture of the extreme value distribution and the inverse Gaussian distribution gives a still better density, with a smaller RMSE (Snapshot 4; note that to get this result we have to adjust the initial value of the weight).
To get an example of bimodal data, replace "MaxWindSpeed" in our example with "MaxTemperature". In this case, a mixture of the extreme value distribution and the Gumbel distribution gives a very good fit.
Some technical details follow. The parameters of the densities are estimated with the maximum likelihood method using Mathematica's built-in function NMaximize. Some densities require certain parameters to be positive, and this is taken into account in the optimization as constraints.
To help NMaximize find the global maximum, we start the maximization of the log-likelihood function from a point that corresponds to estimates of the parameters calculated with the method of moments. These estimates are deviated by ±0.1 to get two starting values.
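The starting-value strategy just described can be sketched in Python, with scipy's optimizer in place of NMaximize. The moment formulas are the standard ones for the Gumbel case; the exact setup of the Demonstration is an assumption:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
data = rng.gumbel(loc=10.0, scale=3.0, size=300)

# Method-of-moments estimates for a Gumbel density:
# scale = sqrt(6)*sd/pi, loc = mean - EulerGamma*scale.
sd = data.std(ddof=1)
scale0 = np.sqrt(6.0) * sd / np.pi
loc0 = data.mean() - 0.5772156649 * scale0

def neg_loglik(theta):
    loc, scale = theta
    return -np.sum(stats.gumbel_r.logpdf(data, loc, scale))

# Start the local maximization from the moment estimates deviated by ±0.1,
# keeping the scale positive via a bound (the "constraint" in the text),
# and keep the better of the two runs.
best = min(
    (optimize.minimize(neg_loglik, [loc0 + d, scale0 + d],
                       bounds=[(None, None), (1e-8, None)])
     for d in (-0.1, 0.1)),
    key=lambda r: r.fun,
)
```

`best.x` then holds the maximum likelihood estimates of (loc, scale).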
The default starting values for the weighting constant w lie symmetrically around a default center, and this center can be changed with a slider. A change in the starting value of w may help if we do not get the global optimum. For example, if the histogram suggests a bimodal mixture density but the default value of w only gives rise to a unimodal density, changing the value of w may help in getting a bimodal density.
Estimating the mixture density is a demanding task, and in some cases the optimization may not succeed (in this case, the Demonstration only shows the histogram and the sorted data). In such situations the problem should be investigated in more detail, outside of the Demonstration. We could, for example,
1) try other methods for NMaximize, such as differential evolution, simulated annealing, or random search (the default method is Nelder–Mead);
2) try various seed numbers for random number generation in NMaximize;
3) try to give better starting values for the parameters in NMaximize.
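Fallback strategy 1) has a direct counterpart in other systems. A hedged Python sketch using scipy's differential evolution with an explicit seed, playing the role of NMaximize's alternative method and random-seed options; the parameter box is an illustrative assumption:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(2)
data = rng.gumbel(loc=10.0, scale=3.0, size=200)

def neg_loglik(theta):
    loc, scale = theta
    return -np.sum(stats.gumbel_r.logpdf(data, loc, scale))

# Global, stochastic search over a box of plausible parameter values;
# 'seed' makes the stochastic search reproducible, like a seed in NMaximize.
result = optimize.differential_evolution(
    neg_loglik,
    bounds=[(data.min(), data.max()), (1e-3, 3.0 * data.std())],
    seed=42,
)
```

A global method like this is slower than a local search from good starting values, but it is less likely to get stuck in a local optimum of the mixture likelihood.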
In adjusting the number of bins, note that too low a value gives a histogram that is too coarse to reveal the true distribution of the data. On the other hand, too high a value results in an overly detailed histogram whose bar heights vary wildly. Try to choose a number of bins that gives a histogram smooth enough to be representative of the distribution of the data.
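The effect of the bin count can be explored outside the Demonstration as well. A small Python sketch (numpy; the sample and bin counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.gumbel(10.0, 3.0, 300)

# Compare a coarse, a moderate, and an overly fine histogram of the same data.
hists = {}
for bins in (5, 20, 100):
    heights, edges = np.histogram(data, bins=bins, density=True)
    hists[bins] = (heights, edges)

# With density=True every histogram integrates to 1, so bar heights for the
# different bin counts are directly comparable to an estimated density curve.
```

With 5 bins the shape of the distribution is hidden; with 100 bins the bar heights fluctuate from bar to bar; a moderate count sits between the two.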
The densities in the dropdown menus are shown in three groups: densities defined on the whole real line, densities defined on the positive real axis, and the beta density, which is defined on the interval (0, 1).
The quantiles of estimated ordinary densities are calculated with Mathematica's built-in function Quantile. To calculate the quantiles of a mixture density, we use Mathematica's built-in function FindRoot to numerically solve the equation that defines the quantile.
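Numerically inverting the mixture CDF, as described, can be sketched with scipy's `brentq` in place of FindRoot. The density pair, parameter values, and bracketing interval are illustrative assumptions:

```python
from scipy import stats, optimize

def mixture_cdf(x, w=0.6):
    # Two-component mixture CDF: w*F1 + (1 - w)*F2
    # (Gumbel and inverse Gaussian components, as in the wind-speed example).
    return (w * stats.gumbel_r.cdf(x, 10.0, 3.0)
            + (1 - w) * stats.invgauss.cdf(x, 1.0, 0.0, 8.0))

def mixture_quantile(p, w=0.6):
    # The quantile q solves mixture_cdf(q) = p; the bracket [1e-6, 500] is
    # chosen wide enough to contain the root for the probabilities used here.
    return optimize.brentq(lambda x: mixture_cdf(x, w) - p, 1e-6, 500.0)

q = mixture_quantile(0.5)  # median of the mixture
```

Root bracketing works here because a CDF is monotone, so the defining equation has exactly one solution in the support.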
For finite mixture distributions, see, for example, [1], [2], or [3]. See also "Mixture density" at Wikipedia.
[1] B. S. Everitt and D. J. Hand, Finite Mixture Distributions, London: Chapman and Hall, 1981.
[2] G. McLachlan and D. Peel, Finite Mixture Models, New York: Wiley, 2000.
[3] D. M. Titterington, A. F. M. Smith, and U. E. Makov, Statistical Analysis of Finite Mixture Distributions, Chichester: Wiley, 1985.