Evaluate the Design
Kelly Avery

A key question across test & evaluation is “How much testing is enough?” Design of experiments (DOE) can help decision makers “right-size” a test by providing a framework for building and evaluating test plans and for assessing risk. Test designs should be realistic and executable. To avoid reporting artificially inflated metrics, design evaluations should always account for test structure and randomization. This lesson covers methods and considerations for design evaluation, including power & confidence, factor identifiability, and executability. Applying statistical techniques in both test design and test analysis helps decision makers reach correct conclusions about system performance.
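
As a rough illustration of how power and confidence bear on "how much testing is enough," the sketch below estimates the power of a simple two-condition comparison as the number of runs grows. It is a minimal example under stated assumptions, not the method used in the lesson or in any particular tool: it assumes a two-sided test, a normal approximation, equal runs per condition, and an effect size expressed in standard-deviation units. The function names `approx_power` and `runs_for_power` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def approx_power(effect_size: float, runs_per_condition: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-condition comparison.

    Assumptions (illustrative only): normal approximation, equal runs per
    condition, effect_size = difference in means divided by the run-to-run
    standard deviation. Confidence is 1 - alpha.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the effect is real.
    shift = effect_size * sqrt(runs_per_condition / 2)
    # Probability of landing in either rejection region.
    return (1 - _STD_NORMAL.cdf(z_crit - shift)) + _STD_NORMAL.cdf(-z_crit - shift)


def runs_for_power(effect_size: float, target_power: float = 0.80,
                   alpha: float = 0.05, max_runs: int = 10_000) -> int:
    """Smallest number of runs per condition that reaches the target power."""
    for n in range(2, max_runs + 1):
        if approx_power(effect_size, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within max_runs")


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, round(approx_power(1.0, n), 3))
    print("runs needed:", runs_for_power(1.0))
```

Under these assumptions, power rises with the number of runs and falls as the effect to be detected shrinks, which is the trade-off a design evaluation makes explicit. A real test design with several factors, blocking, or restricted randomization calls for a design-specific calculation rather than this two-condition approximation.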


  1. Freeman, L. J., Johnson, T., Avery, M., Lillard, V. B., & Clutter, J. (2018). Testing Defense Systems. Analytic Methods in Systems and Software Testing, 441.
  2. Montgomery, D. C. (2017). Design and Analysis of Experiments (9th ed.). Hoboken, NJ: John Wiley & Sons.
  3. Director, Operational Test & Evaluation (2010). Guidance on the use of Design of Experiments (DOE) in Operational Test and Evaluation.
  4. Director, Operational Test & Evaluation (2013). Best Practices for Assessing the Statistical Adequacy of Experimental Designs Used in Operational Test and Evaluation.
  5. Office of the Director, Operational Test & Evaluation, Action Officer Training (2016). Experimental Designs. https://www.dote.osd.mil/Publications/Training/
  6. Morgan-Wall, T. (2020). skpr R package. https://www.rdocumentation.org/packages/skpr/versions/0.64.2