I was recently reading about probability and came across the central limit theorem. It's a really powerful concept in mathematics, and in probability in particular.

I'm no mathematician, but when I come across interesting stuff like this I like to dive deeper and understand it better. I find it helpful to write software that lets me fully grasp the concept and visualise it in my own head. Writing a piece of software not only improves my coding skills but also forces me to step through the concept piece by piece.

I thought it would be nice to put up a quick web page with a graph to demonstrate the theorem and let me test my code with different parameters in a visual way.

The central limit theorem is useful for understanding distributions. Put simply, it says that if you have a population of random values (say the heights of adults in the UK) and you repeatedly take samples of N values from it, then the averages of those samples will tend towards a normal distribution (a bell-shaped curve) as N increases.

Wikipedia gives a more formal definition than I can:

"In probability theory, the central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed, regardless of the underlying distribution."
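
In symbols, the classical form of the theorem can be sketched as follows. The notation is shorthand that is not part of the quoted definition: it assumes the values X_1, ..., X_N are independent and identically distributed with mean μ and finite variance σ², and writes their average as X̄_N.

```latex
\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} X_i,
\qquad
\sqrt{N}\,\frac{\bar{X}_N - \mu}{\sigma}
\;\xrightarrow{d}\; \mathcal{N}(0, 1)
\quad \text{as } N \to \infty
```

In other words, once the sample mean is centred and rescaled, its distribution approaches the standard normal distribution whatever shape the original distribution had.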

I have deployed this web application to http://mathdemos.azurewebsites.net/

## The graphs

As an example, we can plot the frequency of the average heights of adults in the UK by taking 1,000 samples of values between 4ft and 8ft. We can then specify the size of each sample (N).

As N gets larger we should see the graph start to resemble a normal distribution.
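
To get a rough feel for how much the graph should narrow, here is a small worked calculation. It assumes the heights are drawn uniformly and continuously between 4ft and 8ft, which is an assumption made for illustration; if the demo draws whole numbers the figures will differ slightly.

```latex
\mu = \frac{4 + 8}{2} = 6,
\qquad
\sigma^2 = \frac{(8 - 4)^2}{12} = \frac{4}{3},
\qquad
\operatorname{sd}\!\left(\bar{X}_N\right) = \frac{\sigma}{\sqrt{N}} = \sqrt{\frac{4}{3N}}

N = 3:\ \sqrt{4/9} \approx 0.667
\qquad
N = 10:\ \sqrt{4/30} \approx 0.365
\qquad
N = 100:\ \sqrt{4/300} \approx 0.115
```

Under that assumption the sample means stay centred on 6ft while their spread shrinks in proportion to 1/√N.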

Below is the graph when N = 3, a rather small sample size.

As you can see, the graph isn't very smooth and doesn't look much like a normal distribution.

Next we increase the sample size to N = 10 and see what happens.

We can start to see the graph smoothing out a bit as the value of N increases. Next we increase the value to N = 100.

Now we can see that with a large sample size of N = 100 the graph definitely resembles a normal distribution, with a nice smooth bell-shaped curve.

You can play around yourself on the web application for different values of N.

## The Code

This is no algorithmic masterpiece by any means but it does demonstrate how to generate the graphs above.

The full code is up on GitHub if you want to take a look https://github.com/leedale1981/ATT.Maths

The model just consists of a simple class with four properties and a private Random field.

    public class CentralLimitTheorem
    {
        public int SampleSize { get; private set; }
        public int NumberOfSamples { get; set; }
        public int MinValue { get; private set; }
        public int MaxValue { get; private set; }
        private readonly Random random = new Random();

I'm using the pseudo-random number generator built into .NET, which is good enough for what I wanted to do.

My constructor takes four parameters.

    public CentralLimitTheorem(int sampleSize, int numberOfSamples, int minValue, int maxValue)
    {
        this.SampleSize = sampleSize;
        this.NumberOfSamples = numberOfSamples;
        this.MinValue = minValue;
        this.MaxValue = maxValue;
    }

We first have to generate the random values for each sample. For every sample we loop SampleSize times to fill it with random values, then add the mean of that sample (rounded to one decimal place) to the array of mean values.

    // Requires: using System.Linq; (for Sum)
    public double[] GetMeanSamples()
    {
        double[] sample = new double[this.SampleSize];
        double[] samples = new double[this.NumberOfSamples];

        for (int samplesIndex = 0; samplesIndex < this.NumberOfSamples; samplesIndex++)
        {
            // Fill the current sample with SampleSize random values.
            for (int sampleIndex = 0; sampleIndex < this.SampleSize; sampleIndex++)
            {
                sample[sampleIndex] = this.GetRandomNumber();
            }

            // Store the mean of the sample, rounded to one decimal place.
            samples[samplesIndex] = Math.Round(sample.Sum() / this.SampleSize, 1);
        }

        return samples;
    }
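
The GetRandomNumber method is not shown in this post. As a minimal sketch of what such a helper might look like, the block below assumes values are spread uniformly between the minimum and maximum. The UniformSampler class and NextInRange method are hypothetical names used only for illustration; they are not taken from the repository.

```csharp
using System;

public static class UniformSampler
{
    // Hypothetical stand-in for GetRandomNumber, which is not shown in the post.
    // Assumption: values are spread uniformly over [minValue, maxValue).
    public static double NextInRange(Random random, int minValue, int maxValue)
    {
        // NextDouble returns a value in [0.0, 1.0); scale and shift it into the range.
        return minValue + (random.NextDouble() * (maxValue - minValue));
    }
}
```

Calling UniformSampler.NextInRange(new Random(), 4, 8) would then give a height of at least 4 and less than 8.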

Next we need to create a dictionary containing the mean values and their corresponding frequency. The important thing here is to use a SortedDictionary so that the mean values are sorted from lowest to highest. These mean values will be used as the labels for the X axis of our graph.

    public SortedDictionary<double, double> GetFrequencyOfMeans()
    {
        double[] means = this.GetMeanSamples();
        SortedDictionary<double, double> frequencies = new SortedDictionary<double, double>();

        for (int meanIndex = 0; meanIndex < means.Length; meanIndex++)
        {
            double mean = means[meanIndex];

            // Count how many times each rounded mean occurs.
            if (frequencies.ContainsKey(mean))
            {
                frequencies[mean] = frequencies[mean] + 1;
            }
            else
            {
                frequencies.Add(mean, 1);
            }
        }

        return frequencies;
    }
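
To try the idea without the web application, the two steps can be condensed into one self-contained sketch that prints a crude text histogram to the console. This is an illustration rather than the code from the repository: the CentralLimitSketch class, its method names and the seed parameter are all hypothetical, and it assumes values are drawn uniformly between the minimum and maximum.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CentralLimitSketch
{
    // Builds a sorted frequency table of sample means.
    // Assumptions: values are uniform on [minValue, maxValue); the seed parameter
    // is added only to make runs repeatable and is not part of the original class.
    public static SortedDictionary<double, int> FrequencyOfMeans(
        int sampleSize, int numberOfSamples, int minValue, int maxValue, int seed)
    {
        var random = new Random(seed);
        var frequencies = new SortedDictionary<double, int>();

        for (int s = 0; s < numberOfSamples; s++)
        {
            // Average sampleSize random values for this sample.
            double mean = Enumerable.Range(0, sampleSize)
                .Select(_ => minValue + (random.NextDouble() * (maxValue - minValue)))
                .Average();

            // Round to one decimal place so nearby means share a bucket.
            double bucket = Math.Round(mean, 1);

            // count is 0 when the bucket has not been seen yet.
            frequencies.TryGetValue(bucket, out int count);
            frequencies[bucket] = count + 1;
        }

        return frequencies;
    }

    // Prints one row per bucket, with one star for every five samples in it.
    public static void PrintHistogram(SortedDictionary<double, int> frequencies)
    {
        foreach (var pair in frequencies)
        {
            Console.WriteLine($"{pair.Key:F1} {new string('*', pair.Value / 5)}");
        }
    }
}
```

For example, CentralLimitSketch.PrintHistogram(CentralLimitSketch.FrequencyOfMeans(100, 1000, 4, 8, 42)) should print rows clustered around 6.0, and rerunning it with a sample size of 3 should give a much wider spread.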

That is basically all the logic needed to create the graph. The rest is an MVC 6 application that renders a view containing the inputs and the graph. I used a library called Chart.Mvc, which is a .NET wrapper for the Chart.js JavaScript library. I had to contribute code to this library to make it support MVC 6.

I created a tag helper for the various charts which can be seen in the code base at https://github.com/leedale1981/Chart.Mvc