
# Fitting a Triangular Distribution



Sometimes you only need a rough fit to some data and a triangular distribution will do. As the name implies, this is a distribution whose density function graph is a triangle. The triangle is determined by its base, running between points a and b, and a point c somewhere in between where the altitude intersects the base. (c is called the foot of the altitude.) The height of the triangle is whatever it needs to be for the area to equal 1 since we want the triangle to be a probability density.
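To make the geometry concrete, here is a minimal sketch of the density in Python. The function name `triangular_pdf` is made up for illustration. The only fact it relies on is the one above: the area must equal 1, so the height at c is 2/(b - a).

```python
def triangular_pdf(x, a, b, c):
    """Density of a triangular distribution with base [a, b] and peak at c.

    Hypothetical helper for illustration; assumes a < b and a <= c <= b.
    """
    if not (a < b and a <= c <= b):
        raise ValueError("need a < b and a <= c <= b")
    if x < a or x > b:
        return 0.0
    # Area = (1/2) * base * height = 1, so height = 2 / (b - a).
    height = 2.0 / (b - a)
    if x < c:
        return height * (x - a) / (c - a)  # rising side
    if x > c:
        return height * (b - x) / (b - c)  # falling side
    return height  # x == c, the peak
```

For example, with a = 0, b = 1, and c = 0.5 the peak height is 2, and the density at 0.25 is half of that.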

One way to fit a triangular distribution to data would be to set a to the minimum value and b to the maximum value. You could pick a and b to be the smallest and largest possible values, if these values are known. Otherwise you could use the smallest and largest values in the data, or make the interval a little larger if you want the density to be positive at the extreme data values.

How do you pick c? One approach would be to pick it so the resulting distribution has the same mean as the data. The triangular distribution has mean

(a + b + c)/3

so you could simply solve for c to match the sample mean.
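Solving the mean formula for c gives c = 3·mean - a - b. The sketch below does this in Python; `peak_from_mean` is a hypothetical name, and the sample data is invented purely to show the call. One caveat: since c has to land in [a, b], the sample mean has to lie between (2a + b)/3 and (a + 2b)/3, so the sketch raises an error otherwise.

```python
import statistics

def peak_from_mean(a, b, sample_mean):
    """Pick c so the triangular distribution on [a, b] has the given mean.

    Hypothetical helper: solves (a + b + c)/3 = sample_mean for c.
    """
    c = 3.0 * sample_mean - a - b
    if not (a <= c <= b):
        raise ValueError("no c in [a, b] reproduces this mean")
    return c

# Made-up data, with the base chosen a little wider than the data range.
data = [1, 2, 2, 3, 4, 6]
a, b = 0, 8
c = peak_from_mean(a, b, statistics.mean(data))
print(c)
```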

Another approach would be to pick c so that the resulting distribution has the same median as the data. This approach is more interesting because it cannot always be done.

Suppose your sample median is m. You can always find a point c so that half the area of the triangle lies to the left of a vertical line drawn through m. However, this might require the foot c to lie to the left or the right of the base [a, b]. In that case the resulting triangle is obtuse, and so the sides of the triangle do not form the graph of a function.

For the triangle to give us the graph of a density function, c must be in the interval [a, b]. Such a density has a median in the range

[b − (b − a)/√2, a + (b − a)/√2].
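A quick way to check feasibility is to compute the two endpoints of that range directly. The sketch below assumes Python and uses a made-up function name, `median_range`. The lower endpoint corresponds to putting the peak at c = a, and the upper endpoint to putting it at c = b.

```python
import math

def median_range(a, b):
    """Range of medians attainable by a triangular density on [a, b].

    Hypothetical helper: the lower endpoint is the median when c = a,
    the upper endpoint is the median when c = b.
    """
    if not a < b:
        raise ValueError("need a < b")
    half_diag = (b - a) / math.sqrt(2)
    return b - half_diag, a + half_diag

lo, hi = median_range(0, 1)
print(lo, hi)  # roughly 0.293 and 0.707
```

So on the unit interval, only sample medians within about 0.207 of the midpoint can be matched.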

If the sample median m is in this range, then we can solve for c so that the distribution has median m. The solution is

c = b − 2(b − m)² / (b − a)

if m < (a + b)/2 and

c = a + 2(a − m)² / (b − a)

otherwise.
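The median-matching recipe can be sketched in a few lines of Python. The names `peak_from_median` and `triangular_median` are hypothetical. The second function is only there as a round-trip check: it computes the median of a triangular distribution from its CDF, which is not spelled out in the text above.

```python
import math

def peak_from_median(a, b, m):
    """Pick c in [a, b] so the triangular distribution has median m."""
    width = b - a
    lo = b - width / math.sqrt(2)
    hi = a + width / math.sqrt(2)
    if not (lo <= m <= hi):
        raise ValueError("median outside the attainable range")
    if m < (a + b) / 2:
        c = b - 2 * (b - m) ** 2 / width
    else:
        c = a + 2 * (m - a) ** 2 / width
    # Guard against rounding pushing c slightly outside [a, b].
    return min(max(c, a), b)

def triangular_median(a, b, c):
    """Median of the triangular distribution, used here as a sanity check."""
    if c >= (a + b) / 2:
        return a + math.sqrt((b - a) * (c - a) / 2)
    return b - math.sqrt((b - a) * (b - c) / 2)

c = peak_from_median(0, 10, 4)     # about 2.8
print(c, triangular_median(0, 10, c))
```

With a = 0, b = 10, and m = 4, the first branch applies and the recovered median agrees with m up to floating-point error.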



Published at DZone with permission of
