# Statistical Modeling Constructs

#### Random Number Generation

SIMPROCESS contains 215 random number streams, each having a different random number seed. You can view the seeds by selecting Seeds on the View menu.

SIMPROCESS offers two types of random number generation: Legacy and Mersenne Twister. The Legacy generator is the default and is the same random number generator used in CACI’s Simscript product. Mersenne Twister is a random number generator developed in 1997 by Makoto Matsumoto and Takuji Nishimura.
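
As a rough illustration of separately seeded streams, the sketch below uses Python's `random.Random`, whose core generator is the Mersenne Twister. The seed values and the `make_streams` helper are hypothetical; they are not SIMPROCESS's seeds or API.

```python
import random

def make_streams(seeds):
    """Create one independent Mersenne Twister generator per seed."""
    return [random.Random(seed) for seed in seeds]

# Hypothetical seeds, chosen only for illustration.
streams = make_streams([101, 202, 303])

# Drawing from stream 0 does not disturb the state of the other streams.
first = [streams[0].random() for _ in range(3)]

# Re-creating a stream with the same seed reproduces the same sequence.
replay = random.Random(101)
assert first == [replay.random() for _ in range(3)]
```

Giving each model element its own stream keeps its random inputs reproducible even when other parts of the model change how many numbers they draw.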

#### Standard Distributions

SIMPROCESS includes many standard probability distributions. Some of the probability distributions have more than one implementation. SIMPROCESS includes 11 distributions from the Apache Commons Math library. These distributions include ACM in the distribution name.

Beta, Binomial, Erlang, Exponential, Gamma, Geometric, Hyper Exponential, Inverse Gaussian, Inverted Weibull, Johnson SB, Johnson SU, Log-Logistic, Log-Laplace, Lognormal, Negative Binomial, Normal, Pareto, PertBeta, Pearson Type V, Pearson Type VI, Poisson, Random Walk, Triangular, Uniform, Uniform Integer, Weibull

Probability density function and cumulative distribution function for a beta distribution with shape1 = 1.5, shape2 = 5.0, minimum = 0.0, and maximum = 1.0.

Syntax: Bet(shape1, shape2, minimum, maximum, stream) or BetACM(shape1, shape2, minimum, maximum, stream)

The beta distribution (continuous) could be used to model the time required to perform some task when the possible values are restricted to the finite interval [minimum, maximum] (minimum >= 0.0, maximum >= 1.0, and maximum > minimum). Parameter restrictions for shape1 and shape2 are shape1 >= 0.0 and shape2 > 0.0. The density function is skewed to the left, symmetric, or skewed to the right if shape1 > shape2, shape1 = shape2, or shape1 < shape2, respectively. A beta distribution with shape1 = shape2 = 1, minimum = 0, and maximum = 1 is a uniform distribution on the interval [0, 1].

Probability mass function and cumulative distribution function for a binomial distribution with trials = 50 and probability = 0.5.

Syntax: Bin(trials, probability, stream) or BinACM(trials, probability, stream)

The binomial distribution (discrete) with parameters trials = t (a positive integer) and probability = p can be thought of as the distribution of the number of successes in t independent Bernoulli trials, where success occurs on each trial with a probability of p and failure occurs on each trial with a probability of 1 – p. A binomial distribution with trials = 1 is called a Bernoulli distribution with probability = p.

Probability density function and cumulative distribution function for an Erlang distribution with mean = 5 and shape = 1.

Syntax: Erl(mean, shape, stream)

The Erlang distribution (continuous) could be used to model the time required to perform some task. If an Erlang distribution has parameters mean = m and shape = a, then b = m/a is a scale parameter. An Erlang distribution is a gamma distribution whose shape parameter is a positive integer. The sum of k independent exponential random variables, each with mean = m, has an Erlang distribution with mean = km and shape = k. Parameter restrictions are mean >= 0.0 and 0.0 < shape < 100.0.

Probability density function and cumulative distribution function for an exponential distribution with mean = 1.
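
The Erlang relationships described above (scale b = m/a, and the sum of exponentials) can be sketched with Python's standard `random` module. The function names are hypothetical, and nothing here is meant to describe how SIMPROCESS implements `Erl`.

```python
import random

def erlang_by_sum(rng, mean, shape):
    """Sum `shape` independent exponentials, each with mean mean/shape."""
    scale = mean / shape  # b = m/a from the text
    return sum(rng.expovariate(1.0 / scale) for _ in range(shape))

def erlang_by_gamma(rng, mean, shape):
    """Draw the same distribution as a gamma variate with integer shape."""
    return rng.gammavariate(shape, mean / shape)

rng = random.Random(12345)  # arbitrary seed
n = 20000
avg_sum = sum(erlang_by_sum(rng, 5.0, 3) for _ in range(n)) / n
avg_gamma = sum(erlang_by_gamma(rng, 5.0, 3) for _ in range(n)) / n
print(avg_sum, avg_gamma)  # both should be close to the mean of 5
```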

Syntax: Exp(mean, stream) or ExpACM(mean, stream)

The exponential distribution (continuous) is commonly used to model interarrival times of customers to some system when the arrival rate is approximately constant over the time period of interest. It is also sometimes used to model the time to failure of a piece of equipment. The mean of an exponential distribution is a scale parameter and must be greater than 0.0. An exponential distribution with mean = m is a gamma distribution with mean = m and shape = 1. An exponential distribution with mean = m is a Weibull distribution with shape = 1 and scale = m. If interarrival times of customers have an exponential distribution with mean = m, then the number of arrivals in any time interval of length t has a Poisson distribution (discrete) with mean = t/m.

Probability density function and cumulative distribution function for a gamma distribution with mean = 2 and shape = 1.
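
The link between exponential interarrival times and Poisson counts stated above can be checked with a small simulation. This is a sketch using Python's `random` module; `count_arrivals` and the chosen values of m and t are illustrative assumptions.

```python
import random

def count_arrivals(rng, mean_interarrival, horizon):
    """Count arrivals in [0, horizon] with exponential interarrival times."""
    clock = rng.expovariate(1.0 / mean_interarrival)
    count = 0
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(1.0 / mean_interarrival)
    return count

rng = random.Random(2024)  # arbitrary seed
m, t = 2.0, 10.0           # mean interarrival time and interval length
runs = 20000
counts = [count_arrivals(rng, m, t) for _ in range(runs)]
avg = sum(counts) / runs
var = sum((c - avg) ** 2 for c in counts) / (runs - 1)
print(avg, var)  # a Poisson count has mean and variance both near t/m = 5
```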

Syntax: Gam(mean, shape, stream) or GamACM(mean, shape, stream)

The gamma distribution (continuous) could be used to model the time required to perform some task. If a gamma distribution has parameters mean = m and shape = a, then b = m/a is a scale parameter. A gamma distribution with mean = m and shape = 1 is an exponential distribution with mean = m. When shape is a positive integer, the gamma distribution is an Erlang distribution. Parameter restrictions are mean >= 0.0 and 0.0 < shape < 100.0.

Probability mass function and cumulative distribution function for a geometric distribution with probability = 0.5.

Syntax: Geo(probability, stream)

The geometric distribution (discrete) with probability = p can be thought of as the distribution of the number of failures before the first success in a sequence of independent Bernoulli trials, where success occurs on each trial with a probability of p and failure occurs on each trial with a probability of 1 – p.

Hyper Exponential distribution with mean1 = 5, mean2 = 10, and probability1 = 0.5.

Syntax: Hex(mean1, mean2, probability1, stream)

The hyper exponential distribution (continuous) is a mixture of two exponential distributions. Specifically, a hyper exponential distribution with parameters mean1, mean2, and probability1 takes on values from an exponential distribution with parameter mean1 with a probability of probability1 and takes on values from an exponential distribution with parameter mean2 with a probability of 1 – probability1. A hyper exponential distribution with probability1 = 1 is an exponential distribution with parameter mean1. Parameter restrictions are mean1 >= 0.0, mean2 > 0.0, and 0.0 <= probability1 <= 1.0.

Probability density function and cumulative distribution function for an inverse Gaussian distribution with location = 0, scale = 1, and shape = 4.
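
The hyper exponential mixture described above can be sketched in a few lines of Python. The `hyper_exponential` function is a hypothetical helper that mirrors the argument order of `Hex`; it is not SIMPROCESS code.

```python
import random

def hyper_exponential(rng, mean1, mean2, probability1):
    """Pick one of two exponentials, then sample from the chosen one."""
    mean = mean1 if rng.random() < probability1 else mean2
    return rng.expovariate(1.0 / mean)

rng = random.Random(7)  # arbitrary seed
n = 40000
samples = [hyper_exponential(rng, 5.0, 10.0, 0.5) for _ in range(n)]
avg = sum(samples) / n
expected = 0.5 * 5.0 + (1 - 0.5) * 10.0  # mixture mean = 7.5
print(avg, expected)
```

With probability1 = 1 the first branch is always taken, so the sampler reduces to a plain exponential with mean1, matching the special case in the text.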

Syntax: InG(location, scale, shape, stream)

The Inverse Gaussian distribution (continuous) could be used to model the time required to perform some task. Parameter restrictions are scale > 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for an Inverted Weibull distribution with location = 0, scale = 1, and shape = 2.

Syntax: InW(location, scale, shape, stream)

The Inverted Weibull distribution (continuous) could be used to model the time required to perform some task. The mean and variance are finite only if shape > 2. If the random variable X has an inverted Weibull distribution with location = 0, scale = b, and shape = a, then Y = 1/X has a Weibull distribution with scale = 1/b and shape = a. (The location parameter is 0.) Parameter restrictions are scale > 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for a Johnson SB distribution with minimum = 0, maximum = 1, shape1 = 2, and shape2 = 2.

Syntax: JSB(minimum, maximum, shape1, shape2, stream)

The Johnson SB distribution (continuous) could be used to model the time required to perform some task when the possible values are restricted to the finite interval [minimum, maximum]. The density function is skewed to the left, symmetric, or skewed to the right if shape1 > 0, shape1 = 0, or shape1 < 0, respectively. The Johnson SB distribution is closely related to the classical normal distribution. Parameter restrictions are shape2 > 0.0 and maximum > minimum.

Probability density function and cumulative distribution function for a Johnson SU distribution with location = 0, scale = 1, shape1 = -2, and shape2 = 2.

Syntax: JSU(location, scale, shape1, shape2, stream)

The Johnson SU distribution (continuous) could be used to model a random variable that can take on any value between minus infinity and plus infinity. The density function is skewed to the left, symmetric, or skewed to the right if shape1 > 0, shape1 = 0, or shape1 < 0, respectively. The Johnson SU distribution is closely related to the classical normal distribution. Parameter restrictions are scale > 0.0 and shape2 > 0.0.

Probability density function and cumulative distribution function for a Log-Logistic distribution with location = 0, scale = 1, and shape = 3.

Syntax: LLg(location, scale, shape, stream)

The log-logistic distribution (continuous) could be used to model the time required to perform some task. The mean and variance are finite only if shape > 2. Parameter restrictions are scale > 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for a Log-Laplace distribution with location = 0, scale = 1, and shape = 2.

Syntax: LLp(location, scale, shape, stream)

The Log-Laplace distribution (continuous) could be used to model the time required to perform some task. The mean and variance are finite only if shape > 2. Parameter restrictions are scale > 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for a lognormal distribution with mean = 2 and standard deviation = 0.5.

Syntax: Log(mean, standard deviation, stream) or LogACM(mean, standard deviation, stream)

The lognormal distribution could be used to model the time required to perform some task when “large” values sometimes occur. It is always skewed to the right and it has a longer right tail than the gamma or Weibull distributions. The lognormal distribution is closely related to the classical normal distribution. Note that the parameters mean and standard deviation are those of the lognormal distribution itself, not the mean and standard deviation of the corresponding normal distribution. Parameter restrictions are mean > 0.0 and standard deviation > 0.0.

Probability mass function and cumulative distribution function for a negative binomial distribution with s = 5 and probability = 0.5.
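
Because the lognormal parameters above describe the lognormal itself, a library that expects the parameters of the underlying normal needs a conversion first. The sketch below applies the standard moment-matching formulas and checks them with Python's `random.lognormvariate`; `underlying_normal_params` is a hypothetical helper, and whether SIMPROCESS performs the same conversion internally is not stated in the source.

```python
import math
import random

def underlying_normal_params(mean, std_dev):
    """Convert a lognormal's own mean and standard deviation to the
    mu and sigma of the normal distribution of its logarithm."""
    sigma_sq = math.log(1.0 + (std_dev / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)

mu, sigma = underlying_normal_params(2.0, 0.5)
rng = random.Random(99)  # arbitrary seed
n = 50000
samples = [rng.lognormvariate(mu, sigma) for _ in range(n)]
avg = sum(samples) / n
print(mu, sigma, avg)  # avg should be close to the lognormal mean of 2
```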

Syntax: NgB(s, probability, stream)

The negative binomial distribution (discrete) with parameters s (> 0.0) and probability = p can be thought of as the distribution of the number of failures before the sth success in a sequence of independent Bernoulli trials, where success occurs on each trial with a probability of p and failure occurs on each trial with a probability of 1 – p. A negative binomial distribution with parameters s = 1 and probability = p is a geometric distribution with probability = p.

Probability density function and cumulative distribution function for a normal distribution with mean = 10 and standard deviation = 1.

There are two normal distributions in SIMPROCESS. One returns only non-negative values (zero or higher), and the other can return negative values. Both require standard deviation > 0.0.

Syntax: Nor(mean, standard deviation, stream) or NorACM(mean, standard deviation, stream) – returns only non-negative values

This distribution is similar to the classical normal distribution, but if a negative value is generated, it is rejected and new values are generated until a non-negative value is generated. In general, this distribution will not be a good model for the time required to perform some task, since task-time distributions are almost always skewed to the right.
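
The rejection rule described above can be sketched as follows. `non_negative_normal` is a hypothetical name, and the mean and standard deviation are chosen only to make the effect of rejection visible.

```python
import random

def non_negative_normal(rng, mean, std_dev):
    """Redraw from a normal distribution until the value is non-negative."""
    while True:
        value = rng.gauss(mean, std_dev)
        if value >= 0.0:
            return value

rng = random.Random(42)  # arbitrary seed
# With mean = 1 and standard deviation = 1, roughly 16% of raw draws are
# negative and get rejected, so the sample mean ends up above 1.
samples = [non_negative_normal(rng, 1.0, 1.0) for _ in range(20000)]
avg = sum(samples) / len(samples)
print(min(samples) >= 0.0, avg)
```

The mean of the values returned is therefore not the mean parameter whenever negative draws are likely, which is worth keeping in mind when the mean is small relative to the standard deviation.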

Syntax: Nrm(mean, standard deviation, stream) – unbounded

This is the classical normal distribution, which is found in most statistics books. It takes on real values between minus infinity and plus infinity. The density function is the familiar “bell-shaped” curve, which is symmetric about the mean. The probability that a value is between the mean minus two standard deviations and the mean plus two standard deviations is approximately 0.95. This distribution should not be used to model the time required to perform some task, since the normal distribution can take on negative values. Furthermore, as stated above, the distribution of the time to perform some task is almost always skewed to the right, rather than being symmetric.

Probability density function and cumulative distribution function for a Pareto distribution with location = 1 and shape = 2.
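
The two numerical points made about the unbounded normal distribution (the two-standard-deviation probability and the chance of a negative value) can be checked with `statistics.NormalDist` from the Python standard library. The parameter values are illustrative.

```python
from statistics import NormalDist

dist = NormalDist(mu=10.0, sigma=1.0)

# Probability of landing within two standard deviations of the mean.
within_two_sd = dist.cdf(12.0) - dist.cdf(8.0)

# Probability of a negative value; negligible here, but it grows as the
# mean gets closer to zero relative to the standard deviation.
p_negative = dist.cdf(0.0)
p_negative_close = NormalDist(mu=1.0, sigma=1.0).cdf(0.0)

print(round(within_two_sd, 4), p_negative_close)
```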

Syntax: Par(location, shape, stream)

The Pareto distribution (continuous) could be used to model the interarrival times of customers (e.g., messages) when the traffic occurs in bursts. The mean and variance are finite only if shape > 2. Parameter restrictions are location >= 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for a PertBeta distribution with minimum = 1, mode = 5, maximum = 10, and lambda = 4.

Syntax: Per(minimum, mode, maximum, lambda, stream)

The PertBeta distribution (continuous) can be used instead of the triangular distribution as a model for the time required to perform some task. The distribution produces a smooth curve and takes on values in the finite interval [minimum, maximum] (minimum >= 0.0, mode > minimum, and maximum >= mode), with values near the mode being most likely to occur. Subjective estimates of the three parameters are obtained from subject-matter experts. The mean of a PertBeta distribution is only equal to the mode when the distribution is symmetric. The lambda parameter is optional and defaults to 4.

Probability density function and cumulative distribution function for a Pearson type V distribution with location = 0, scale = 1, and shape = 2.
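
To make the PertBeta statement about mean and mode concrete, the sketch below uses the common modified-PERT parameterization, in which the distribution is a beta distribution rescaled to [minimum, maximum]. That SIMPROCESS uses exactly this parameterization is an assumption, and `pert_beta` and `pert_mean` are hypothetical helpers.

```python
import random

def pert_beta(rng, minimum, mode, maximum, lam=4.0):
    """Sample a PERT variate as a beta variate rescaled to [minimum, maximum]."""
    span = maximum - minimum
    alpha = 1.0 + lam * (mode - minimum) / span
    beta = 1.0 + lam * (maximum - mode) / span
    return minimum + span * rng.betavariate(alpha, beta)

def pert_mean(minimum, mode, maximum, lam=4.0):
    """Mean of the PERT distribution under the same parameterization."""
    return (minimum + lam * mode + maximum) / (lam + 2.0)

rng = random.Random(5)  # arbitrary seed
n = 40000
avg = sum(pert_beta(rng, 1.0, 5.0, 10.0) for _ in range(n)) / n
print(avg, pert_mean(1.0, 5.0, 10.0))  # mean is 31/6, not the mode of 5
```

Under this parameterization a larger lambda pulls the mean toward the mode and concentrates more probability around it.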

Syntax: PT5(location, scale, shape, stream)

The Pearson type V distribution (continuous) could be used to model the time required to perform some task. The mean and variance are finite only if shape > 2. The Pearson type V distribution is closely related to the gamma distribution. Parameter restrictions are scale > 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for a Pearson Type VI distribution with location = 0, scale = 1, shape1 = 3, and shape2 = 4.

Syntax: PT6(location, scale, shape1, shape2, stream)

The Pearson Type VI distribution (continuous) could be used to model the time required to perform some task. The density function can take on a wide variety of shapes because it has two shape parameters, shape1 and shape2. The mean and variance are finite only if shape2 > 2. The Pearson Type VI distribution is closely related to the beta distribution. Parameter restrictions are scale > 0.0 and shape1 > 0.0.

Poisson distribution with mean = 10.

Syntax: Poi(mean, stream) or PoiACM(mean, stream)

The Poisson distribution (discrete) with mean = m is the distribution of the number of customers that arrive to some system in any time interval of length 1 when the interarrival times have an exponential distribution (continuous) with mean = 1/m. The mean must be greater than 0.0.

Probability density function and cumulative distribution function for a random walk distribution with location = 0, scale = 1, and shape = 3.

Syntax: RnW(location, scale, shape, stream)

The random walk distribution (continuous) could be used to model the time required to perform some task. Parameter restrictions are scale > 0.0 and shape > 0.0.

Probability density function and cumulative distribution function for a Triangular distribution with minimum = 2, mode = 5, and maximum = 10.

Syntax: Tri(minimum, mode, maximum, stream) or TriACM(minimum, mode, maximum, stream)

The Triangular distribution (continuous) is typically used as a rough model for the time required to perform some task when no real-world data are available. A Triangular distribution takes on values in the finite interval [minimum, maximum] (minimum >= 0.0, mode > minimum, and maximum > mode), with values near the mode being most likely to occur. Subjective estimates of the three parameters are obtained from subject-matter experts. The mean of a Triangular distribution is only equal to the mode when the distribution is symmetric.

Probability density function and cumulative distribution function for a Uniform distribution with minimum = 0 and maximum = 10.
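
For the Triangular distribution, the gap between mean and mode follows from the mean formula (minimum + mode + maximum) / 3. The sketch below compares that formula with samples from Python's `random.triangular`; `triangular_mean` is a hypothetical helper.

```python
import random

def triangular_mean(minimum, mode, maximum):
    """Mean of a triangular distribution."""
    return (minimum + mode + maximum) / 3.0

rng = random.Random(11)  # arbitrary seed
n = 40000
# Note the argument order of random.triangular: low, high, then mode.
samples = [rng.triangular(2.0, 10.0, 5.0) for _ in range(n)]
avg = sum(samples) / n
print(avg, triangular_mean(2.0, 5.0, 10.0))  # mean 17/3 exceeds the mode of 5
```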

Syntax: Uni(minimum, maximum, stream) or UniACM(minimum, maximum, stream)

The Uniform distribution (continuous) is equally likely to take on any real number in the finite interval [minimum, maximum] (minimum >= 0.0 and maximum > minimum). The real numbers produced by a random-number generator (appear to) have a Uniform distribution on the interval [0, 1].

Probability mass function and cumulative distribution function for a Uniform Integer distribution with minimum = 0 and maximum = 10.

Syntax: Int(minimum, maximum, stream) or IntACM(minimum, maximum, stream)

A Uniform Integer distribution (discrete) is equally likely to take on any integer in the finite interval [minimum, maximum], where minimum and maximum are integers with minimum >= 0 and minimum < maximum.

Probability density function and cumulative distribution function for a Weibull distribution with shape = 1 and scale = 1.

Syntax: Wei(shape, scale, stream) or WeiACM(shape, scale, stream)

The Weibull distribution (continuous) could be used to model the time required to perform some task. It is also sometimes used to model the time to failure of a piece of equipment. A Weibull distribution with parameters shape = 1 and scale = b is an exponential distribution with mean = b. The Weibull distribution is skewed to the left when shape > 3.6. Parameter restrictions are shape >= 0.0 and scale > 0.0.
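
The Weibull special case and the skewness threshold mentioned above can be checked numerically. This sketch uses the standard closed-form moments of the Weibull distribution together with Python's `random.weibullvariate`; the helper names are hypothetical.

```python
import math
import random

def weibull_mean(shape, scale):
    """Mean of a Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def weibull_skewness(shape):
    """Skewness of a Weibull distribution (it does not depend on scale)."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    g3 = math.gamma(1.0 + 3.0 / shape)
    variance = g2 - g1 ** 2
    return (g3 - 3.0 * g1 * variance - g1 ** 3) / variance ** 1.5

rng = random.Random(3)  # arbitrary seed
n = 40000
# random.weibullvariate takes the scale first, then the shape.
samples = [rng.weibullvariate(2.0, 1.0) for _ in range(n)]
avg = sum(samples) / n
print(avg, weibull_mean(1.0, 2.0))  # shape = 1 gives an exponential with mean 2
print(weibull_skewness(2.0), weibull_skewness(5.0))  # positive, then negative
```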

#### User-Defined Distributions

There are three types of user-defined distributions. A Standard distribution is a customization of an existing SIMPROCESS distribution. A Tabular distribution is a statistical distribution created from discrete data points using a table format. Auto Fit distributions are created from sample data using ModelFit. The sample data can be in an ASCII file, spreadsheet, or database. The fit can be performed on demand or automatically at the beginning of a simulation. See Input Data Analysis for more information on Auto Fit distributions. Also, see Chapter 5 of Part A of the SIMPROCESS User’s Manual.
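
As a loose illustration of the idea behind a Tabular distribution, the sketch below samples from a small table of values and probabilities with Python's `random.choices`. The table contents and the `sample_tabular` helper are hypothetical, and the sketch covers only the discrete case; it does not reproduce the SIMPROCESS table format.

```python
import random

# Hypothetical table of (value, probability) rows; not taken from SIMPROCESS.
table = [(1.0, 0.2), (2.5, 0.5), (4.0, 0.3)]

def sample_tabular(rng, rows, k):
    """Draw k values, each row chosen with its listed probability."""
    values = [value for value, _ in rows]
    weights = [prob for _, prob in rows]
    return rng.choices(values, weights=weights, k=k)

rng = random.Random(8)  # arbitrary seed
draws = sample_tabular(rng, table, 30000)
share_of_2_5 = draws.count(2.5) / len(draws)
print(share_of_2_5)  # should be close to the tabulated 0.5
```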