#1
01-04-2008
Vice President Technical Services
Explanation of empirical, cempirical and dempirical distributions
We receive several questions regarding the use of empirical distributions in Flexsim. I would like to review some important points in hopes that it will clear up some confusion.
1. Flexsim uses standard global tables to hold the list of values and their associated probability percentages. Probability percentages are entered in column 1 starting with row 1 of the table, and the associated values are entered in column 2. The table may have as many rows as needed to define all values. The percents are entered as numbers between 0 and 100.
2. There are three commands in Flexsim that can be used to generate random samples from the empirical distributions defined in these global tables: dempirical(), empirical() and cempirical(). The first is discrete and the last two are continuous, meaning a command returns either one of the explicit values in the table or a continuous number within a range.
3. Let's assume I make a global table with 4 rows and 2 columns. In column one I enter the percentages 10, 20, 30 and 40 adding up to 100 percent. In column two I enter the numbers 0.1, 0.2, 0.3 and 0.4. Now let's see the difference between how the discrete empirical distribution command works and how the continuous empirical distributions work. The discrete command will return the exact values that were entered into column 2 of the table. The continuous commands will return real values uniformly distributed between the values listed in column 2 of the table. The difference between the two continuous commands is in how the bounds of the uniform ranges are defined.
4. The command dempirical("mytable") will return the number 0.1 for ten percent of the samples, the number 0.2 for twenty percent of the samples, the number 0.3 for thirty percent of the samples, and the number 0.4 for forty percent of the samples.
5. The command empirical("mytable") will return a number uniformly distributed between 0.1 and 0.2 for ten percent of the samples, a number uniformly distributed between 0.2 and 0.3 for twenty percent of the samples, a number uniformly distributed between 0.3 and 0.4 for thirty percent of the samples, and a number uniformly distributed between 0.4 and 0.4 (that is, exactly 0.4, because no row follows to supply an upper bound) for forty percent of the samples.
6. The command cempirical("mytable") will return a number uniformly distributed between 0.0 and 0.1 for ten percent of the samples, a number uniformly distributed between 0.1 and 0.2 for twenty percent of the samples, a number uniformly distributed between 0.2 and 0.3 for thirty percent of the samples, and a number uniformly distributed between 0.3 and 0.4 for forty percent of the samples.
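7. Here is a summary in tabular form showing the possible return values ("x") for each of the three distribution functions:

Percent   dempirical()   empirical()              cempirical()
10        x = 0.1        x between 0.1 and 0.2    x between 0.0 and 0.1
20        x = 0.2        x between 0.2 and 0.3    x between 0.1 and 0.2
30        x = 0.3        x between 0.3 and 0.4    x between 0.2 and 0.3
40        x = 0.4        x between 0.4 and 0.4    x between 0.3 and 0.4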
8. Here is an example of adding a "dummy" first row so that cempirical() starts at 0.05 instead of 0.0 (notice that row 1 has a probability of 0 percent):
9. Here is an example of adding a "dummy" last row so that empirical() has a range between each value, including the last. Any number can be entered for the percent of the dummy row: the percents already add up to 100% at the previous row, so the dummy row's percent has no effect.
10. When using ExpertFit to determine an empirical distribution that matches your data, be aware that if your data has been defined as integers, ExpertFit will fit it for use with dempirical(), and if your data has been defined as real numbers, ExpertFit will fit it for use with empirical(). When your data set is composed of real numbers, ExpertFit will show the same percentage for the last value as for the second-to-last value, but you'll notice that the percents already add up to 100% at the second-to-last entry.
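
To make points 4 through 9 concrete, here is a minimal Python sketch of the sampling rules as they are described in this post. It is only a model for illustration, not Flexsim's actual implementation and not FlexScript. The Python functions take a list of (percent, value) rows instead of a table name, the helper pick_row and the table constants are hypothetical names, the value 0.5 in the dummy last row is an assumed number, and the behavior when the percents add up to less than 100 is a guess.

```python
import random

# Rows are (percent, value), mirroring columns 1 and 2 of the global table.
TABLE = [(10, 0.1), (20, 0.2), (30, 0.3), (40, 0.4)]

# Point 8: dummy first row with 0 percent, so cempirical() starts at 0.05.
CEMP_TABLE = [(0, 0.05)] + TABLE

# Point 9: dummy last row so empirical() has a range above 0.4.
# The value 0.5 is an assumption; the percent of this row is never used.
EMP_TABLE = TABLE + [(0, 0.5)]


def pick_row(table, rng):
    """Pick a row index by walking the cumulative percents against a draw in [0, 100)."""
    u = rng.random() * 100.0
    cumulative = 0.0
    for i, (percent, _) in enumerate(table):
        cumulative += percent
        if u < cumulative:
            return i
    # Assumption: if the percents sum to less than 100, fall back to the last row.
    return len(table) - 1


def dempirical(table, rng):
    """Discrete: return the exact value stored in the chosen row."""
    return table[pick_row(table, rng)][1]


def empirical(table, rng):
    """Continuous: uniform between the chosen row's value and the next row's value."""
    i = pick_row(table, rng)
    low = table[i][1]
    # The last row has no next row, so its range collapses to a single value.
    high = table[i + 1][1] if i + 1 < len(table) else low
    return rng.uniform(low, high)


def cempirical(table, rng):
    """Continuous: uniform between the previous row's value (0.0 for row 1) and the chosen row's value."""
    i = pick_row(table, rng)
    low = table[i - 1][1] if i > 0 else 0.0
    high = table[i][1]
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so repeated runs draw the same samples
    for name, sampler, table in [
        ("dempirical", dempirical, TABLE),
        ("empirical", empirical, TABLE),
        ("cempirical", cempirical, TABLE),
        ("cempirical with dummy first row", cempirical, CEMP_TABLE),
        ("empirical with dummy last row", empirical, EMP_TABLE),
    ]:
        samples = [sampler(table, rng) for _ in range(10000)]
        print(name, "min:", min(samples), "max:", max(samples))
```

Running the script prints the smallest and largest sample for each case, which makes it easy to see where each command's range starts and ends. Because the row is chosen by comparing a draw below 100 against the running total of percents, a zero-percent first row and any row placed after the total reaches 100 are never selected; they only contribute a bound to the neighboring range.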