Distributions

Raw data is seldom in a form ready for use in a simulation model. Some analysis and conversion is usually needed before the data can serve as an input parameter to the simulation. Random phenomena must either be fitted to a standard theoretical distribution, such as a normal or exponential distribution (Law and Kelton, 1991), or be input as a frequency distribution.

Defining a variable with a theoretical distribution requires that the data, if available, be fitted to the distribution that best describes the variable. The alternative is to summarize the data as a frequency distribution that can be used directly in the model. A frequency distribution is sometimes called an empirical or user-defined distribution.
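As a rough illustration of how an empirical distribution can drive a model directly, the Python sketch below draws random values from a set of interval counts. The counts are borrowed from the delivery-time frequency table in this section. The function name `make_empirical_sampler` is hypothetical, and spreading values evenly inside each interval is an assumption made for the example, not a description of how Process Simulator samples a user-defined distribution.

```python
import bisect
import itertools
import random

# (lower bound, upper bound, number of observations) for each interval.
INTERVALS = [
    (0, 1, 25), (1, 2, 33), (2, 3, 30), (3, 4, 22), (4, 5, 14),
    (5, 6, 10), (6, 7, 7), (7, 8, 5), (8, 9, 4), (9, 10, 2),
]


def make_empirical_sampler(intervals, rng=random):
    """Return a function that draws one value from the frequency distribution."""
    cumulative = list(itertools.accumulate(count for _, _, count in intervals))
    total = cumulative[-1]

    def sample():
        # Pick an interval with probability proportional to its count.
        target = rng.random() * total
        index = min(bisect.bisect_right(cumulative, target), len(intervals) - 1)
        low, high, _ = intervals[index]
        # Assumption: values are spread evenly inside the chosen interval.
        return rng.uniform(low, high)

    return sample


draw_delivery_time = make_empirical_sampler(INTERVALS)
print([round(draw_delivery_time(), 2) for _ in range(5)])
```

Because the sampler works from the raw counts, no percentages need to be entered or rounded by hand.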

 

Whether fitting data to a theoretical distribution or using an empirical distribution, it is often useful to organize the data into a frequency distribution table. A frequency distribution is defined by grouping the data into intervals and stating the frequency of occurrence for each interval. To illustrate, the following frequency table shows, for each range of delivery times, the number and percentage of observations that fell within it.

Delivery Time (days)   Number of Observations   Percentage   Cumulative Percentage
0-1                    25                       16.5         16.5
1-2                    33                       21.7         38.2
2-3                    30                       19.7         57.9
3-4                    22                       14.5         72.4
4-5                    14                       9.2          81.6
5-6                    10                       6.6          88.2
6-7                    7                        4.6          92.8
7-8                    5                        3.3          96.1
8-9                    4                        2.6          98.7
9-10                   2                        1.3          100.0

Total Number of Observations = 152

While rules have been proposed for determining the interval (cell) size, the best approach is to define enough cells to show a gradual transition in values, yet not so many that the groupings become obscured.
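The text does not name a particular rule. One commonly cited example is Sturges' rule, which suggests about 1 + log2(n) cells for n observations. The sketch below applies it as a starting point only; choosing Sturges' rule and the helper names `sturges_cell_count` and `cell_width` are assumptions made for illustration.

```python
import math


def sturges_cell_count(n_observations):
    """Suggested number of cells by Sturges' rule: ceil(1 + log2(n))."""
    if n_observations < 1:
        raise ValueError("need at least one observation")
    return math.ceil(1 + math.log2(n_observations))


def cell_width(minimum, maximum, n_cells):
    """Equal-width cell size that covers the observed range."""
    return (maximum - minimum) / n_cells


cells = sturges_cell_count(152)  # 152 observations, as in the table above
print(cells, cell_width(0, 10, cells))
```

For 152 observations the rule suggests 9 cells, close to the 10 one-day cells used in the table. Rounding the cell width to a convenient unit, as the table does, is usually preferable to using the raw result.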

 

The last column of the frequency table optionally expresses the percentage for each interval as a cumulative percentage. The final cumulative value should be 100%, which helps verify that all observations are accounted for.
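The percentage and cumulative percentage columns can be derived from the observation counts alone. The following Python sketch does this for the counts in the table above; the function name `frequency_table` is hypothetical. The cumulative column is computed from a running count rather than by adding rounded percentages, so the final value is exactly 100.

```python
# Interval labels and observation counts from the delivery-time table.
COUNTS = {
    "0-1": 25, "1-2": 33, "2-3": 30, "3-4": 22, "4-5": 14,
    "5-6": 10, "6-7": 7, "7-8": 5, "8-9": 4, "9-10": 2,
}


def frequency_table(counts):
    """Return rows of (interval, count, percentage, cumulative percentage)."""
    total = sum(counts.values())
    rows = []
    running = 0
    for interval, count in counts.items():
        running += count
        rows.append((interval, count, 100 * count / total, 100 * running / total))
    return rows


rows = frequency_table(COUNTS)
for interval, count, pct, cum in rows:
    print(f"{interval:>5} {count:>4} {pct:6.1f} {cum:6.1f}")
```

Percentages printed this way are rounded from the raw counts, so a row may differ in the last digit from a table whose entries were adjusted to sum to exactly 100.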

 

When gathering samples from a static population, one can apply descriptive statistics and draw reasonable inferences about the population. When gathering data from a dynamic and possibly time-varying system, however, one must be sensitive to trends, patterns, and cycles that may occur over time. The samples drawn may not be homogeneous and may therefore be unsuitable for simple descriptive techniques.

Process Simulator Distributions

Distribution       Syntax        Individual Components
Beta               B(a,b,c,d)    a=shape value 1, b=shape value 2, c=lower boundary, d=upper boundary
Binomial           B(a,b)        a=batch size, b=probability of success
Erlang             ER(a,b)       a=batch size, b=parameter
Exponential        E(a)          a=mean
Gamma              G(a,b)        a=shape value, b=scale value
Geometric          GEO(a)        a=probability of success
Inverse Gaussian   IG(a,b)       a=shape value, b=scale value
Lognormal          L(a,b)        a=mean, b=standard deviation
Normal             N(a,b)        a=mean, b=standard deviation
Pearson5           P5(a,b)       a=shape value, b=scale value
Pearson6           P6(a,b,c)     a=shape value, b=shape value, c=scale value
Poisson            P(a)          a=quantity
Triangular         T(a,b,c)      a=minimum, b=mode, c=maximum
Uniform            U(a,b)        a=mean, b=half range
Weibull            W(a,b)        a=shape value, b=scale value
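Some of these parameterizations differ from those used by general-purpose libraries, which matters when checking a model's inputs outside the tool. The Python sketch below shows how four of the distributions might be reproduced with the standard `random` module. The function names are hypothetical, and the code is only an analogue of the parameter meanings listed in the table, not Process Simulator's own random number generation.

```python
import random


def uniform_pm(mean, half_range, rng=random):
    """U(a,b): a=mean, b=half range, so values span [a-b, a+b]."""
    return rng.uniform(mean - half_range, mean + half_range)


def triangular_pm(minimum, mode, maximum, rng=random):
    """T(a,b,c): Python's argument order is (low, high, mode)."""
    return rng.triangular(minimum, maximum, mode)


def exponential_pm(mean, rng=random):
    """E(a): a=mean; expovariate expects the rate, which is 1/mean."""
    return rng.expovariate(1.0 / mean)


def normal_pm(mean, std_dev, rng=random):
    """N(a,b): a=mean, b=standard deviation."""
    return rng.gauss(mean, std_dev)


print(uniform_pm(10, 2))       # analogue of U(10,2): between 8 and 12
print(triangular_pm(1, 2, 5))  # analogue of T(1,2,5): between 1 and 5
print(exponential_pm(4))       # analogue of E(4): mean of 4
print(normal_pm(20, 3))        # analogue of N(20,3)
```

The uniform case is the easiest to get wrong: U(10,2) describes the range 8 to 12, not 10 to 2.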

General Components

Any negative value returned by a distribution that is used for a time expression will be automatically converted to zero.
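The effect of this rule can be imitated outside the tool with a simple clamp. The sketch below uses Python's standard `random` module and a hypothetical helper, `time_from_normal`; it illustrates the stated behavior and is not the product's implementation.

```python
import random


def time_from_normal(mean, std_dev, rng=random):
    """Draw from a normal distribution and convert a negative result to zero."""
    return max(0.0, rng.gauss(mean, std_dev))


rng = random.Random(7)
times = [time_from_normal(1.0, 2.0, rng) for _ in range(10000)]
zero_share = sum(t == 0.0 for t in times) / len(times)
print(f"share of draws converted to zero: {zero_share:.2f}")
print(f"mean after conversion: {sum(times) / len(times):.2f}")
```

When the standard deviation is large relative to the mean, a noticeable share of draws becomes zero and the average time ends up above the nominal mean, so a distribution that cannot go negative may describe such a time better.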


© 2014 ProModel Corporation • 556 East Technology Avenue • Orem, UT 84097 • Support: 888-776-6633 • www.promodel.com