We discuss type-I error, p-values, type-II error, effect sizes, and statistical power. We review input modeling in general and then briefly review the fundamentals of hypothesis testing. In this lecture, we (nearly) finish our coverage of Input Modeling, with a focus on parameter estimation and assessing goodness of fit. Our goal was to discuss the difference between point estimation and interval estimation for simulation, but we will hold off on that topic until the next lecture. We will pick up next time with details related to performance measures (and methods) for transient simulations, and for steady-state simulations after that. We then switch our focus to simulations and their outputs, starting with the definitions of terminating and non-terminating systems as well as the related transient and steady-state simulations. We also introduce a few more advanced statistical topics, such as non-parametric methods and special high-power tests for normality. Most of this lecture is a review of statistics and of the reasons for the assumptions behind various parametric and non-exact non-parametric methods.

In this lecture, we introduce the estimation of absolute performance measures in simulation, effectively shifting our focus from validating input models to validating and making inferences about simulation outputs. We then generalize to the case of more than two systems, particularly for "ranking and selection (R&S)." This lets us review the multiple-comparisons problem (and the Bonferroni correction) and how post hoc tests (after an ANOVA) are more statistically powerful ways to make comparisons. Each of these different experimental conditions sets up a different standard-error-of-the-mean formula and degrees-of-freedom formula, which together define the confidence interval half widths (centered on the difference in sample means in the pairwise comparison of systems).
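As a minimal sketch of how those half-width formulas differ, the snippet below computes all three for two hypothetical systems. The data are synthetic (the sample sizes, means, and variances are made up for illustration); this is not code from the lecture.

```python
# Sketch: CI half widths for the difference in mean performance of two systems,
# under three designs. All replication data here are hypothetical/synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 20                                      # replications per system (hypothetical)
y1 = rng.normal(10.0, 2.0, size=n)          # outputs of system 1
y2 = rng.normal(11.0, 3.0, size=n)          # outputs of system 2
alpha = 0.05
v1, v2 = y1.var(ddof=1) / n, y2.var(ddof=1) / n   # squared standard errors

# Welch's unequal-variance t: Satterthwaite's approximate degrees of freedom.
df_welch = (v1 + v2) ** 2 / (v1**2 / (n - 1) + v2**2 / (n - 1))
hw_welch = stats.t.ppf(1 - alpha / 2, df_welch) * np.sqrt(v1 + v2)

# Pooled (equal-variance) t: df = n1 + n2 - 2.
sp2 = (y1.var(ddof=1) + y2.var(ddof=1)) / 2        # pooled variance (equal n)
hw_pooled = stats.t.ppf(1 - alpha / 2, 2 * n - 2) * np.sqrt(sp2 * 2 / n)

# Paired-difference t: assumes replication i of each system is paired
# (e.g., via common random numbers); df = n - 1.
d = y1 - y2
hw_paired = stats.t.ppf(1 - alpha / 2, n - 1) * d.std(ddof=1) / np.sqrt(n)

diff = y1.mean() - y2.mean()
print(f"difference in sample means: {diff:+.3f}")
print(f"half widths  Welch: {hw_welch:.3f}  pooled: {hw_pooled:.3f}  paired: {hw_paired:.3f}")
```

In each case the interval is the difference in sample means plus or minus the half width, and the null hypothesis of no difference is rejected when the interval excludes 0.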
This means covering confidence interval half widths for the paired-difference t-test, the equal-variance (pooled) t-test, and Welch's unequal-variance t-test.

We introduce two-sample confidence intervals (i.e., confidence intervals on DIFFERENCES based on different two-sample t-tests) that are tested against a null hypothesis of 0. In this lecture, we review what we have learned about one-sample confidence intervals (i.e., how to use them as graphical versions of one-sample t-tests) for absolute performance estimation in order to motivate the problem of relative performance estimation. We start to discuss control variates (CVs), but that discussion will be picked up at the start of the next lecture. We discuss Common Random Numbers (CRNs), which use a paired/blocked design to reduce the variance caused by different random-number streams. After that review, we move on to introducing variance reduction techniques (VRTs), which reduce the size of confidence intervals by experimentally controlling for, and accounting for, alternative sources of variance (and thus reducing the observed variance in response variables).

We then move to the ranking and selection problem for three or more different simulation models, which allows us to talk about analysis of variance (ANOVA) and post hoc tests (like the Tukey HSD or Fisher's LSD). We then move to different ways to use confidence intervals on mean DIFFERENCES to compare two different simulation models. This begins with a reminder of the use of confidence intervals to estimate the performance of a single simulation model. In this lecture, we start by reviewing approaches for absolute and relative performance estimation in stochastic simulation.
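The CRN idea above can be illustrated with a small sketch. The "systems" here are hypothetical (just exponential service-time models with different rates, not models from the lecture): generating both with the same uniform stream via inverse-transform sampling induces positive correlation between paired outputs, which shrinks the variance of their difference relative to independent streams.

```python
# Sketch of common random numbers (CRN) as a variance reduction technique.
# The two "systems" are hypothetical exponential service-time models.
import numpy as np

rng = np.random.default_rng(7)
n = 1000
u = rng.random(n)                        # ONE shared random-number stream

# Inverse-transform sampling from the same uniforms (CRN design):
y1_crn = -np.log(1 - u) / 1.0            # system 1: service rate 1.0
y2_crn = -np.log(1 - u) / 1.2            # system 2: service rate 1.2

# Versus an independent stream for each system:
y1_ind = -np.log(1 - rng.random(n)) / 1.0
y2_ind = -np.log(1 - rng.random(n)) / 1.2

# CRN leaves the mean difference unbiased but reduces Var(Y1 - Y2),
# because Var(Y1 - Y2) = Var(Y1) + Var(Y2) - 2*Cov(Y1, Y2) and CRN
# makes Cov(Y1, Y2) positive.
var_crn = np.var(y1_crn - y2_crn, ddof=1)
var_ind = np.var(y1_ind - y2_ind, ddof=1)
print(f"Var of difference  CRN: {var_crn:.4f}  independent: {var_ind:.4f}")
```

The smaller variance under CRN translates directly into a narrower paired-difference confidence interval for the same number of replications.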