Session Title | Speaker | Type | Recording | Materials | Year |
---|---|---|---|---|---|
Breakout Panelist 2 |
Laura Freeman Director, Intelligent Systems Division Virginia Tech (bio)
Dr. Laura Freeman is a Research Associate Professor of Statistics and is dual-hatted as the Director of the Intelligent Systems Lab, Virginia Tech National Security Institute, and the Director of the Information Sciences and Analytics Division, Virginia Tech Applied Research Corporation (VT-ARC). Her research leverages experimental methods for conducting research that brings together cyber-physical systems, data science, artificial intelligence (AI), and machine learning to address critical challenges in national security. She develops new methods for test and evaluation focusing on emerging system technology. In her role with VT-ARC she focuses on transitioning emerging research in these areas to solve challenges in Defense and Homeland Security. She is also a hub faculty member in the Commonwealth Cyber Initiative and leads research in AI Assurance. She is the Assistant Dean for Research for the College of Science; in that capacity she works to shape research directions and collaborations across the College of Science in the Greater Washington, D.C., area. Previously, Dr. Freeman was the Assistant Director of the Operational Evaluation Division at the Institute for Defense Analyses. Dr. Freeman has a B.S. in Aerospace Engineering, an M.S. in Statistics, and a Ph.D. in Statistics, all from Virginia Tech. Her Ph.D. research was on design and analysis of experiments for reliability data. |
Breakout | Session Recording |
Recording | 2022 |
Breakout Waste Not, Want Not: A Methodological Illustration of Quantitative Text Analysis (Abstract)
“The wise use of one’s resources will keep one from poverty.” This is the definition of the proverbial saying “waste not, want not” according to www.dictionary.com. Indeed, one of the most common resources analysts encounter is free-form text. This text might come from survey comments, feedback, websites, transcriptions of interviews, videos, etcetera. Notably, researchers have wisely used the information conveyed in text for many years. However, in many instances, the qualitative methods employed require numerous hours of reading, training, coding, and validating, among other tasks. As technology continues to evolve, access to text data keeps expanding. For example, analysts conducting online studies can have thousands of text entries from participants’ comments. Even without recent advances in technology, analysts have had access to text in books, letters, and other archival data for centuries. One important challenge, however, is figuring out how to make sense of text data without investing the large amounts of resources, time, and effort involved in qualitative methodology or “old-school” quantitative approaches (such as reading a collection of 200 books and counting the occurrence of important terms in the text). This challenge has been solved in the information retrieval field (a branch of computer science) with the implementation of a technique called latent semantic analysis (LSA; Manning, Raghavan, & Schütze, 2008) and a closely related technique called topic analysis (TA; SAS Institute Inc., 2018). Undoubtedly, other quantitative methods for text analysis, such as latent Dirichlet allocation (Blei, Ng, & Jordan, 2003), are also apt for the task of unveiling knowledge from text data, but we restrict the discussion in this presentation to LSA and TA because these deal exclusively with the underlying structure of the text rather than identifying clusters. In this presentation, we aim to make quantitative text analysis, specifically LSA and TA, accessible to researchers and analysts from a variety of disciplines. We do this by leveraging an understanding of a popular multivariate technique: principal components analysis (PCA). We start by describing LSA and TA by drawing comparisons and equivalencies to PCA. We make these comparisons in an intuitive, user-friendly manner and then through a technical description of mathematical statements, which rely on the singular value decomposition of a document-term matrix. Moreover, we explain the implementation of LSA and TA using statistical software to enable simple application of these techniques. Finally, we show a practical application of LSA and TA with empirical data on aircraft incidents. |
Laura Castro-Schilo | Breakout |
| 2019 |
|
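A minimal sketch of the LSA idea described in the abstract above, using scikit-learn as an assumed toolchain (the presentation itself demonstrates other statistical software): build a weighted document-term matrix, take its truncated singular value decomposition, and inspect term loadings much as one would inspect PCA loadings.

```python
# Minimal LSA sketch: SVD of a document-term matrix (analogous to PCA on text).
# Assumes scikit-learn is available; the talk itself uses other statistical software.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "pilot reported engine vibration during climb",
    "engine oil pressure dropped during descent",
    "crew noted landing gear warning light on approach",
    "landing gear failed to retract after takeoff",
]

# Document-term matrix (rows = documents, columns = weighted term counts)
vectorizer = TfidfVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(docs)

# Truncated SVD of the document-term matrix = latent semantic analysis
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_scores = lsa.fit_transform(dtm)      # document coordinates in the latent space
term_loadings = lsa.components_          # term loadings, analogous to PCA loadings

terms = vectorizer.get_feature_names_out()
for k, component in enumerate(term_loadings):
    top = component.argsort()[::-1][:3]
    print(f"Latent dimension {k}: {[terms[i] for i in top]}")
```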
Tutorial Introduction to Structural Equation Modeling: Implications for Human-System Interactions (Abstract)
Structural Equation Modeling (SEM) is an analytical framework that offers unique opportunities for investigating human-system interactions. SEM is used heavily in the social and behavioral sciences, where emphasis is placed on (1) explanation rather than prediction, and (2) measuring variables that are not observed directly (e.g., perceived performance, satisfaction, quality, trust, etcetera). The framework facilitates modeling of survey data through confirmatory factor analysis and latent (i.e., unobserved) variable regression models. We provide a general introduction to SEM by describing what it is, the unique features it offers to analysts and researchers, and how it is easily implemented in JMP Pro 16.0. Attendees will learn how to perform path analysis and confirmatory factor analysis, assess model fit, compare alternative models, and interpret results provided in SEM. The presentation relies on a real-data example everyone can relate to. Finally, we shed light on a few published studies that have used SEM to unveil insights on human performance factors and the mechanisms by which performance is affected. The key goal of this presentation is to provide general exposure to a modeling tool that is likely new to most in the fields of defense and aerospace. |
Laura Castro-Schilo Sr. Research Statistician Developer SAS Institute (bio)
Laura Castro-Schilo works on structural equation models in JMP. She is interested in multivariate analysis and its application to different kinds of data: continuous, discrete, ordinal, nominal, and even text. Previously, she was an Assistant Professor at the L. L. Thurstone Psychometric Laboratory at the University of North Carolina at Chapel Hill. Dr. Castro-Schilo obtained her PhD in quantitative psychology from the University of California, Davis. |
Tutorial | Session Recording |
Recording | 2021 |
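The measurement-model idea at the heart of confirmatory factor analysis can be illustrated numerically without any SEM software. The sketch below uses a hypothetical one-factor example (not taken from the tutorial, which demonstrates SEM in JMP Pro): three survey indicators are driven by a single latent variable, and the sample covariance matrix is compared against the model-implied structure lambda * psi * lambda' + theta.

```python
# One-factor CFA illustration: observed covariance ~ lambda * psi * lambda' + theta.
# Hypothetical example; the tutorial itself demonstrates SEM in JMP Pro.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

psi = 1.0                                   # latent factor variance
lam = np.array([0.8, 0.7, 0.9])             # factor loadings
theta = np.array([0.4, 0.5, 0.3])           # residual (unique) variances

eta = rng.normal(0.0, np.sqrt(psi), n)                  # latent variable
eps = rng.normal(0.0, np.sqrt(theta), (n, 3))           # unique factors
y = np.outer(eta, lam) + eps                            # observed indicators

implied = psi * np.outer(lam, lam) + np.diag(theta)     # model-implied covariance
observed = np.cov(y, rowvar=False)                      # sample covariance

print("Model-implied covariance:\n", implied.round(2))
print("Sample covariance:\n", observed.round(2))
```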
Breakout Constructing Designs for Fault Location (Abstract)
While fault testing a system with many factors each appearing at some number of levels, it may not be possible to test all combinations of factor levels. Most faults are caused by interactions of only a few factors, so testing interactions up to size t will often find all faults in the system without executing an exhaustive test suite. Call an assignment of levels to t of the factors a t-way interaction. A covering array is a collection of tests that ensures that every t-way interaction is covered by at least one test in the test suite. Locating arrays extend covering arrays with the additional feature that they not only indicate the presence of faults but locate the faulty interactions when there are no more than d faults in the system. If an array is (d, t)-locating, for every pair of sets of t-way interactions of size d, the interactions do not appear in exactly the same tests. This ensures that the faulty interactions can be differentiated from non-faulty interactions by the results of some test in which interactions from one set or the other, but not both, are tested. When the property holds for t-way interaction sets of size up to d, the notation (d, t̄) is used. In addition to fault location, locating arrays have also been used to identify significant effects in screening experiments. Locating arrays are fairly new, and few techniques have been explored for their construction. Most of the available work is limited to finding only one fault (d = 1). Known general methods require a covering array of strength t + d and produce many more tests than are needed. In this talk, we present Partitioned Search with Column Resampling (PSCR), a computational search algorithm to verify whether an array is (d, t̄)-locating by partitioning the search space to decrease the number of comparisons. If a candidate array is not locating, random resampling is performed until a locating array is constructed or an iteration limit is reached. Algorithmic parameters determine which factor columns to resample and when to add additional tests to the candidate array. We use a 5 × 5 × 3 × 2 × 2 full factorial design to analyze the performance of the algorithmic parameters and provide guidance on how to tune parameters to prioritize speed, accuracy, or a combination of both. Last, we compare our results to the number of tests in locating arrays constructed for the factors and levels of real-world systems produced by other methods. |
Erin Lanus | Breakout |
| 2019 |
|
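As a concrete illustration of the coverage property defined in the abstract above (not of the PSCR algorithm itself), the hypothetical sketch below checks whether a test suite covers every t-way interaction of factor levels.

```python
# Check whether a test suite is a covering array of strength t:
# every assignment of levels to every set of t factors must appear in some test.
# Illustrative sketch only; it does not implement the PSCR construction algorithm.
from itertools import combinations, product

def is_covering_array(tests, levels, t):
    """tests: list of tuples of factor levels; levels: list of level counts per factor."""
    k = len(levels)
    for factors in combinations(range(k), t):
        needed = set(product(*(range(levels[f]) for f in factors)))
        seen = {tuple(test[f] for f in factors) for test in tests}
        if needed - seen:
            return False
    return True

# Four binary factors; this 5-test suite covers all 2-way interactions.
tests = [
    (0, 0, 0, 0),
    (0, 1, 1, 1),
    (1, 0, 1, 1),
    (1, 1, 0, 1),
    (1, 1, 1, 0),
]
print(is_covering_array(tests, [2, 2, 2, 2], t=2))   # True
```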
Short Course Combinatorial Interaction Testing (Abstract)
This mini-tutorial provides an introduction to combinatorial interaction testing (CIT). The main idea behind CIT is to pseudo-exhaustively test software and hardware systems by covering combinations of components in order to detect faults. In 90 minutes, we provide an overview of this domain that includes the following topics: the role of CIT in software and hardware testing, how it complements and differs from design of experiments, considerations such as variable strength and constraints, the typical combinatorial arrays used for constructing test suites, and existing tools for test suite construction. Last, defense systems are increasingly relying on software with embedded machine learning (ML), yet ML poses unique challenges to applying conventional software testing due to characteristics such as the large input space, effort required for white box testing, and emergent behaviors apparent only at integration or system levels. As a well-studied black box approach to testing integrated systems with a pseudo-exhaustive strategy for handling large input spaces, CIT provides a good foundation for testing ML. In closing, we present recent research adapting concepts of combinatorial coverage to test design for ML. |
Erin Lanus Research Assistant Professor Virginia Tech (bio)
Erin Lanus is a Research Assistant Professor at the Hume Center for National Security and Technology at Virginia Tech. She has a Ph.D. in Computer Science with a concentration in cybersecurity from Arizona State University. Her experience includes work as a Research Fellow at University of Maryland Baltimore County and as a High Confidence Software and Systems Researcher with the Department of Defense. Her current interests are software and combinatorial testing, machine learning in cybersecurity, and artificial intelligence assurance. |
Short Course | Session Recording |
Materials
Recording | 2021 |
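The course's closing topic, adapting combinatorial coverage to test design for ML, can be illustrated with a simplified metric: the fraction of t-way value combinations that appear at least once in a data set. This sketch is a generic illustration, not one of the tools covered in the course.

```python
# Fraction of t-way value combinations appearing at least once in a data set
# (a simplified notion of combinatorial coverage for discrete features).
from itertools import combinations, product

def t_way_coverage(rows, levels, t):
    covered, total = 0, 0
    for factors in combinations(range(len(levels)), t):
        all_combos = set(product(*(range(levels[f]) for f in factors)))
        seen = {tuple(r[f] for f in factors) for r in rows}
        covered += len(all_combos & seen)
        total += len(all_combos)
    return covered / total

rows = [(0, 0, 1), (1, 1, 0), (0, 1, 1)]     # e.g., discretized ML training inputs
print(f"2-way coverage: {t_way_coverage(rows, levels=[2, 2, 2], t=2):.2f}")
```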
Tutorial Statistics Boot Camp (Abstract)
In the test community, we frequently use statistics to extract meaning from data. These inferences may be drawn with respect to topics ranging from system performance to human factors. In this mini-tutorial, we will begin by discussing the use of descriptive and inferential statistics. We will continue by discussing commonly used parametric and nonparametric statistics within the defense community, ranging from comparisons of distributions to comparisons of means. We will conclude with a brief discussion of how to present your statistical findings graphically for maximum impact. |
Stephanie Lane Research Staff Member IDA |
Tutorial |
| 2018 |
|
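A brief example of the kind of comparison discussed in the tutorial, using SciPy as an assumed toolchain and hypothetical data: descriptive statistics for two groups, followed by a parametric test of means and a nonparametric counterpart.

```python
# Comparing two groups with a parametric test (t-test) and a
# nonparametric counterpart (Mann-Whitney U). Hypothetical data.
import numpy as np
from scipy import stats

baseline = np.array([12.1, 11.8, 13.0, 12.5, 12.9, 11.5, 12.2, 12.8])
upgraded = np.array([11.2, 10.9, 11.8, 11.5, 10.7, 11.1, 11.9, 11.3])

# Descriptive statistics
for name, x in [("baseline", baseline), ("upgraded", upgraded)]:
    print(f"{name}: mean={x.mean():.2f}, sd={x.std(ddof=1):.2f}, median={np.median(x):.2f}")

# Parametric: two-sample t-test (Welch's, no equal-variance assumption)
t_stat, t_p = stats.ttest_ind(baseline, upgraded, equal_var=False)

# Nonparametric: Mann-Whitney U test on the same data
u_stat, u_p = stats.mannwhitneyu(baseline, upgraded, alternative="two-sided")

print(f"Welch t-test: t={t_stat:.2f}, p={t_p:.4f}")
print(f"Mann-Whitney U: U={u_stat:.1f}, p={u_p:.4f}")
```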
Tutorial Evolving Statistical Tools (Abstract)
In this session, researchers from the Institute for Defense Analyses (IDA) present a collection of statistical tools designed to meet ongoing and emerging needs for planning, designing, and evaluating operational tests. We first present a suite of interactive applications hosted on test.testscience.testscience.org that are designed to address common analytic needs in the operational test community. These freely available resources include tools for constructing confidence intervals, computing statistical power, comparing distributions, and computing Bayesian reliability. Next, we discuss four dedicated software tools:
- JEDIS – a JMP Add-In for automating power calculations for designed experiments
- skpr – an R package for generating optimal experimental designs and easily evaluating power for normal and non-normal response variables
- ciTools – an R package for quickly and simply generating confidence intervals and quantifying uncertainty for simple and complex linear models
- nautilus – an R package for visualizing and analyzing aspects of sensor performance, such as detection range and track completeness |
Stephanie Lane Research Staff Member IDA |
Tutorial | Materials | 2018 |
|
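One of the web tools mentioned above constructs confidence intervals. As a rough Python analogue (the resources described in the abstract are interactive web applications and R/JMP tools), here is an exact Clopper-Pearson interval for a success probability such as a detection rate.

```python
# Exact (Clopper-Pearson) confidence interval for a binomial proportion,
# e.g., k successful detections out of n trials. Python analogue only;
# the tools described in the abstract are web apps and R/JMP software.
from scipy import stats

def clopper_pearson(k, n, conf=0.90):
    alpha = 1.0 - conf
    lower = stats.beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    upper = stats.beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lower, upper

k, n = 17, 20
lo, hi = clopper_pearson(k, n, conf=0.90)
print(f"Observed rate {k/n:.2f}, 90% CI ({lo:.2f}, {hi:.2f})")
```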
Breakout A Survey of Statistical Methods in Aeronautical Ground Testing |
Drew Landman | Breakout |
| 2019 |
|
Breakout Statistically Defensible Experiment Design for Wind Tunnel Characterization of Subscale Parachutes for Mission to Mars (Abstract)
https://s3.amazonaws.com/workshop-archives-2016/IDA+Workshop+2016/testsciencemeeting.ida.org/pdfs/1b-ExperimentalDesignMethodsandApplications.pdf |
Drew Landman Old Dominion University |
Breakout | Materials | 2016 |
|
Breakout DOE Case Studies in Aerospace Research and Development (Abstract)
This presentation will provide a high-level view of recent DOE applications to aerospace research. Two broad categories are defined: aerodynamic force measurement system calibrations and aircraft model wind tunnel aerodynamic characterization. Each case study will outline the application of DOE principles including design choices, accommodations for deviations from classical DOE approaches, discoveries, and practical lessons learned.
Case Studies
- Aerodynamic Force Measurement System Calibrations
  - Large External Wind Tunnel Balance Calibration: fractional factorial; working with non-ideal factor settings; customer-driven uncertainty assessment
  - Internal Balance Calibration Including Temperature: restrictions to randomization (split-plot design requirements); constraints to basic force model; crossed design approach
- Aircraft Model Wind Tunnel Aerodynamic Characterization
  - The NASA/Boeing X-48B Blended Wing Body Low-Speed Wind Tunnel Test: overcoming a culture of OFAT; general approach to investigating a new aircraft configuration; use of automated wind tunnel models and randomization; power of residual analysis in detecting problems
  - NASA GL-10 UAV Aerodynamic Characterization: use of the Nested-Face Centered Design for aerodynamic characterization; issues working with over 20 factors; discoveries |
Drew Landman Old Dominion University |
Breakout | Materials | 2017 |
|
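As a small illustration of the design type named in the first case study, the sketch below constructs a generic two-level half-fraction (a 2^(4-1) design) by aliasing the fourth factor with the three-factor interaction via the generator D = ABC. This is not the balance calibration design from the talk.

```python
# Half-fraction of a 2^4 design (2^(4-1), resolution IV) built from the
# generator D = ABC. Generic illustration, not the balance calibration design.
from itertools import product

runs = []
for a, b, c in product((-1, 1), repeat=3):   # full 2^3 in factors A, B, C
    d = a * b * c                            # generator: D aliased with ABC
    runs.append((a, b, c, d))

print(" A  B  C  D")
for run in runs:
    print(" ".join(f"{x:+d}" for x in run))
```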
Short Course A Practitioner’s Guide to Advanced Topics in DOE (Abstract)
Having completed a first course in DOE and begun to apply these concepts, engineers and scientists quickly learn that test and evaluation often demands knowledge beyond the use of classical designs. This one-day short course, taught by an engineer from a practitioner’s perspective, targets this problem. Three broad areas are covered:
The course format is to introduce relevant background material, discuss case studies, and provide software demonstrations. Case studies and demonstrations are derived from a variety of sources, including aerospace testing and DOD T&E. Learn design approaches, design comparison metrics, best practices, and lessons learned from the instructor’s experience. A first course in Design of Experiments is a prerequisite. |
Drew Landman Professor Old Dominion University (bio)
Drew Landman has 34 years of experience in engineering education as a professor at Old Dominion University. Dr. Landman’s career highlights include 13 years (1996-2009) as chief engineer at the NASA Langley Full-Scale Wind Tunnel in Hampton, VA. Landman was responsible for program management, test design, instrument design and calibration, and served as the lead project engineer for many automotive, heavy truck, aircraft, and unmanned aerial vehicle wind tunnel tests including the Centennial Wright Flyer and the Boeing X-48B and C. His research interests and sponsored programs are focused on wind tunnel force measurement systems and statistically defensible experiment design, primarily to support wind tunnel testing. Dr. Landman has served as a consultant and trainer in the area of statistical engineering to test and evaluation engineers and scientists at AIAA, NASA, AeroVironment, Airbus, Aerion, ATI, USAF, US Navy, US Marines, and the Institute for Defense Analyses. Landman founded a graduate course sequence in statistical engineering within the ODU Department of Mechanical and Aerospace Engineering. He is currently co-authoring a text on wind tunnel test techniques. |
Short Course | Materials | 2022 |
|
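One common design comparison metric of the kind referenced in the course description is D-efficiency. For a design with factors coded to [-1, +1] and an n x p model matrix X, a usual convention is D-efficiency = 100 * det(X'X)^(1/p) / n; the sketch below applies it to an 8-run 2^(4-1) half-fraction like the one in the earlier example. This is a generic formula sketch, not course material.

```python
# D-efficiency of a design for a main-effects model with coded factors:
# D_eff = 100 * det(X'X)^(1/p) / n, where X is the n x p model matrix.
# Generic sketch using a 2^(4-1) half-fraction (D = ABC) plus an intercept column.
import numpy as np

def d_efficiency(X):
    n, p = X.shape
    return 100.0 * np.linalg.det(X.T @ X) ** (1.0 / p) / n

# Model matrix: intercept + four main effects for the 8-run half-fraction
runs = np.array([[a, b, c, a * b * c]
                 for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)])
X = np.column_stack([np.ones(len(runs)), runs])

print(f"D-efficiency: {d_efficiency(X):.1f}%")   # orthogonal design -> 100%
```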
Breakout Software Reliability and Security Assessment: Automation and Frameworks (Abstract)
Software reliability models enable several quantitative predictions such as the number of faults remaining, failure rate, and reliability (probability of failure-free operation for a specified period of time in a specified environment). This talk will describe recent efforts in collaboration with NASA, including (1) the development of an automated script for the SFRAT (Software Failure and Reliability Assessment Tool) to streamline application of software reliability methods to ongoing programs, (2) application to a NASA program, (3) lessons learned, and (4) future directions for model and tool development to support the practical needs of the software reliability and security assessment frameworks. |
Lance Fiondella | Breakout |
| 2019 |
|
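To make the abstract's predicted quantities (faults remaining, failure rate) concrete, here is a hedged sketch that fits one classical software reliability growth model, the Goel-Okumoto NHPP with mean value function m(t) = a(1 - exp(-b t)), to failure times by maximum likelihood. The SFRAT supports a library of such models; this standalone Python example is not the SFRAT itself.

```python
# Fit a Goel-Okumoto NHPP software reliability growth model by maximum likelihood.
# Mean value function m(t) = a * (1 - exp(-b t)); intensity lambda(t) = a*b*exp(-b t).
# Illustrative only; not the SFRAT implementation.
import numpy as np
from scipy.optimize import minimize

failure_times = np.array([12, 31, 55, 70, 108, 145, 200, 290, 410, 600.0])  # hypothetical
T = failure_times.max()          # end of observation period

def neg_log_likelihood(params):
    a, b = params
    if a <= 0 or b <= 0:
        return np.inf
    intensity = a * b * np.exp(-b * failure_times)
    m_T = a * (1 - np.exp(-b * T))
    return -(np.sum(np.log(intensity)) - m_T)

result = minimize(neg_log_likelihood, x0=[20.0, 0.001], method="Nelder-Mead")
a_hat, b_hat = result.x
remaining = a_hat - len(failure_times)             # expected faults not yet found
current_rate = a_hat * b_hat * np.exp(-b_hat * T)  # failure intensity at time T

print(f"a={a_hat:.1f}, b={b_hat:.5f}, expected remaining faults={remaining:.1f}, "
      f"failure rate at T={current_rate:.4f}")
```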
Quantifying the Impact of Staged Rollout Policies on Software Process and Product Metrics (Abstract)
Software processes define specific sequences of activities performed to effectively produce software, whereas tools provide concrete computational artifacts by which these processes are carried out. Tool independent modeling of processes and related practices enable quantitative assessment of software and competing approaches. This talk presents a framework to assess an approach employed in modern software development known as staged rollout, which releases new or updated software features to a fraction of the user base in order to accelerate defect discovery without imposing the possibility of failure on all users. The framework quantifies process metrics such as delivery time and product metrics, including reliability, availability, security, and safety, enabling tradeoff analysis to objectively assess the quality of software produced by vendors, establish baselines, and guide process and product improvement. Failure data collected during software testing is employed to emulate the approach as if the project were ongoing. The underlying problem is to identify a policy that decides when to perform various stages of rollout based on the software’s failure intensity. The illustrations examine how alternative policies impose tradeoffs between two or more of the process and product metrics. |
Lance Fiondella Associate Professor University of Massachusetts Dartmouth (bio)
Lance Fiondella is an associate professor of Electrical and Computer Engineering at the University of Massachusetts Dartmouth and the Director of the UMassD Cybersecurity Center, an NSA/DHS-designated Center of Academic Excellence in Cyber Research. |
Session Recording |
Recording | 2022 |
|
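The policy idea described above, advancing the rollout stage only when the observed failure intensity is acceptably low, can be caricatured in a few lines. This is a loose, hypothetical illustration of the concept, not the authors' framework or its metrics.

```python
# Toy staged-rollout policy: expand the user fraction only when the recent
# failure intensity (failures per unit time over a trailing window) falls
# below a threshold. Loose illustration only, not the framework from the talk.
import numpy as np

failure_times = np.array([5, 9, 20, 34, 60, 95, 150, 230.0])   # hypothetical
stages = [0.01, 0.10, 0.50, 1.00]        # fraction of user base exposed
threshold = 0.03                         # max acceptable failures per unit time
window = 50.0                            # trailing window length

stage = 0
for t in np.arange(window, 300.0, 10.0):
    recent = np.sum((failure_times > t - window) & (failure_times <= t))
    intensity = recent / window
    if intensity < threshold and stage < len(stages) - 1:
        stage += 1
        print(f"t={t:.0f}: intensity={intensity:.3f} -> roll out to {stages[stage]:.0%}")
```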
Breakout Combinatorial Testing (Abstract)
Combinatorial methods have attracted attention as a means of providing strong assurance at reduced cost. Combinatorial testing takes advantage of the interaction rule, which is based on analysis of thousands of software failures. The rule states that most failures are induced by single-factor faults or by the joint combinatorial effect (interaction) of two factors, with progressively fewer failures induced by interactions between three or more factors. Therefore, if all faults in a system can be induced by a combination of t or fewer parameters, then testing all t-way combinations of parameter values is pseudo-exhaustive and provides a high rate of fault detection. The talk explains background, method, and tools available for combinatorial testing. New results on using combinatorial methods for oracle-free testing of certain types of applications will also be introduced. |
Rick Kuhn NIST |
Breakout | Materials | 2017 |
|
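A quick back-of-the-envelope calculation behind the pseudo-exhaustive argument: for 10 parameters with 4 values each, exhaustive testing requires every one of the 4^10 configurations, while a pairwise (t = 2) test suite only needs to cover the far smaller set of 2-way value combinations, and each individual test covers many of those at once.

```python
# Exhaustive configurations vs. distinct 2-way combinations for k parameters
# with v values each: v**k versus C(k, 2) * v**2.
from math import comb

k, v = 10, 4
exhaustive = v ** k
pairwise_combinations = comb(k, 2) * v ** 2
print(f"Exhaustive tests: {exhaustive:,}")                        # 1,048,576
print(f"2-way combinations to cover: {pairwise_combinations:,}")  # 720
```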
Tutorial Introduction to Qualitative Methods – Part 1 (Abstract)
Qualitative data, captured through freeform comment boxes, interviews, focus groups, and activity observation, is heavily employed in testing and evaluation (T&E). The qualitative research approach can offer many benefits, but knowledge of how to implement methods, collect data, and analyze data according to rigorous qualitative research standards is not broadly understood within the T&E community. This tutorial offers insight into the foundational concepts of method and practice that embody defensible approaches to qualitative research. We discuss where qualitative data comes from, how it can be captured, what kind of value it offers, and how to capitalize on that value through methods and best practices. |
Kristina Carter Research Staff Member Institute for Defense Analyses (bio)
Dr. Kristina Carter is a Research Staff Member at the Institute for Defense Analyses in the Operational Evaluation Division where she supports the Director, Operational Test and Evaluation (DOT&E) in the use of statistics and behavioral science in test and evaluation. She joined IDA full time in 2019 and her work focuses on the measurement and evaluation of human-system interaction. Her areas of expertise include design of experiments, statistical analysis, and psychometrics. She has a Ph.D. in Cognitive Psychology from Ohio University, where she specialized in quantitative approaches to judgment and decision making. |
Tutorial | Session Recording |
Recording | 2021 |
Tutorial Evolving Statistical Tools (Abstract)
In this session, researchers from the Institute for Defense Analyses (IDA) present a collection of statistical tools designed to meet ongoing and emerging needs for planning, designing, and evaluating operational tests. We first present a suite of interactive applications hosted on test.testscience.testscience.org that are designed to address common analytic needs in the operational test community. These freely available resources include tools for constructing confidence intervals, computing statistical power, comparing distributions, and computing Bayesian reliability. Next, we discuss four dedicated software tools:
- JEDIS – a JMP Add-In for automating power calculations for designed experiments
- skpr – an R package for generating optimal experimental designs and easily evaluating power for normal and non-normal response variables
- ciTools – an R package for quickly and simply generating confidence intervals and quantifying uncertainty for simple and complex linear models
- nautilus – an R package for visualizing and analyzing aspects of sensor performance, such as detection range and track completeness |
Kevin Kirshenbaum Research Staff Member IDA |
Tutorial | Materials | 2018 |
|
Contributed Model credibility in statistical reliability analysis with limited data (Abstract)
Due to financial and production constraints, it has become increasingly common for analysts and test planners in defense applications to find themselves working with smaller amounts of data than seen in industry. These same analysts are also being asked to make strong statistical statements based on this limited data. For example, a common goal is ‘demonstrating’ a high reliability requirement with sparse data. In such situations, strong modeling assumptions are often used to achieve the desired precision. Such model-driven actions contain levels of risk that customers may not be aware of and may be too high to be considered acceptable. There is a need to articulate and mitigate risk associated with model form error in statistical reliability analysis. In this work, we review different views on model credibility from the statistical literature and discuss how these notions of credibility apply in data-limited settings. Specifically, we consider two different perspectives on model credibility: (1) data-driven credibility metrics for model fit, (2) credibility assessments based on consistency of analysis results with prior beliefs. We explain how these notions of credibility can be used to drive test planning and recommend an approach to presenting analysis results in data-limited settings. We apply this approach to two case studies from reliability analysis: Weibull analysis and Neyer-D optimal test plans. |
Caleb King Sandia National Laboratories |
Contributed | Materials | 2018 |
|
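For the Weibull case study, the basic analysis step can be sketched as follows: fit a two-parameter Weibull distribution to a handful of failure times and report the estimated reliability at a requirement time. The data are hypothetical, and the talk's point about model-form risk is precisely that such estimates lean heavily on the distributional assumption when data are sparse.

```python
# Fit a two-parameter Weibull to a small failure-time sample and estimate
# reliability at a requirement time. Hypothetical data; with samples this
# small the result leans heavily on the Weibull model assumption.
import numpy as np
from scipy import stats

failure_hours = np.array([210.0, 350.0, 515.0, 620.0, 790.0, 1100.0])
t_req = 150.0   # requirement: survive 150 hours

shape, loc, scale = stats.weibull_min.fit(failure_hours, floc=0)   # 2-parameter fit
reliability = np.exp(-(t_req / scale) ** shape)                    # R(t) = exp(-(t/eta)^beta)

print(f"shape={shape:.2f}, scale={scale:.0f} hours, R({t_req:.0f} h)={reliability:.3f}")
```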
Webinar I have the Power! Power Calculation in Complex (and Not So Complex) Modeling Situations Part 1 (Abstract)
Invariably, any analyst who has been in the field long enough has heard the dreaded questions: “Is X number of samples enough? How much data do I need for my experiment?” Ulterior motives aside, any investigation involving data must ultimately answer the question of “How many?” to avoid risking either having insufficient data to detect a scientifically significant effect or having too much data and wasting valuable resources. This can become particularly difficult when the underlying model is complex (e.g., longitudinal designs with hard-to-change factors, time-to-event response with censoring, binary responses with non-uniform test levels, etc.). Even in the supposedly simpler case of categorical factors, where run size is often chosen using a lower-bound power calculation, a simple approach can mask more “powerful” techniques. In this tutorial, we will spend the first half exploring how to use simulation to perform power calculations in complex modeling situations drawn from relevant defense applications. Techniques will be illustrated using both R and JMP Pro. In the second half, we will investigate the case of categorical factors and illustrate how treating the unknown effects as random variables induces a distribution on statistical power, which can then be used as a new way to assess experimental designs. Instructor Bio: Caleb King is a Research Statistician Tester for the DOE platform in the JMP software. He received his MS and PhD in Statistics from Virginia Tech and worked for three years as a statistical scientist at Sandia National Laboratories prior to arriving at JMP. His areas of expertise include optimal design of experiments, accelerated testing, reliability analysis, and small-sample theory. |
Caleb King JMP Division, SAS Institute Inc.
Webinar | Session Recording |
Materials
Recording | 2020 |
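The simulation recipe from the first half of the tutorial is straightforward to sketch: simulate data under an assumed effect size, run the planned analysis, and report the proportion of simulated experiments that reject the null hypothesis. The tutorial demonstrates this in R and JMP Pro; the version below is a Python stand-in with hypothetical settings.

```python
# Simulation-based power: fraction of simulated experiments (with an assumed
# true effect) in which the planned test rejects at alpha = 0.05.
# The tutorial demonstrates this workflow in R and JMP Pro; this is a Python sketch.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, effect, sigma, alpha = 15, 1.0, 2.0, 0.05
n_sims = 5000

rejections = 0
for _ in range(n_sims):
    control = rng.normal(0.0, sigma, n_per_group)
    treated = rng.normal(effect, sigma, n_per_group)
    _, p = stats.ttest_ind(control, treated, equal_var=False)
    rejections += p < alpha

print(f"Estimated power: {rejections / n_sims:.2f}")
```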
Breakout Estimating Pure-Error from Near Replicates in Design of Experiments (Abstract)
In design of experiments, setting exact replicates of factor settings enables estimation of pure-error: a model-independent estimate of experimental error useful in communicating inherent system noise and testing for model lack-of-fit. Often in practice, the factor levels for replicates are precisely measured rather than precisely set, resulting in near-replicates. This can result in inflated estimates of pure-error due to uncompensated set-point variation. In this article, we review previous strategies for estimating pure-error from near-replicates and propose a simple alternative. We derive key analytical properties and investigate them via simulation. Finally, we illustrate the new approach with an application. |
Caleb King Research Statistician Developer SAS Institute
Breakout |
| 2021 |
|
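For reference, pure error from exact replicates is obtained by pooling within-group sums of squares across repeated factor settings, as in the sketch below. The talk's contribution, adjusting for near-replicates whose set points were measured rather than exactly repeated, is not implemented here.

```python
# Pure-error sum of squares from exact replicates: pool within-group variation
# across identical factor settings. (The talk's contribution, handling
# near-replicates, is not implemented here.)
import numpy as np
from collections import defaultdict

settings = [(-1, -1), (-1, -1), (1, -1), (1, 1), (1, 1), (1, 1)]   # factor levels
response = np.array([10.2, 10.8, 14.1, 18.9, 19.4, 18.7])

groups = defaultdict(list)
for x, y in zip(settings, response):
    groups[x].append(y)

ss_pe, df_pe = 0.0, 0
for ys in groups.values():
    ys = np.asarray(ys)
    ss_pe += np.sum((ys - ys.mean()) ** 2)
    df_pe += len(ys) - 1

print(f"SS_pure_error={ss_pe:.3f}, df={df_pe}, MS_pure_error={ss_pe / df_pe:.3f}")
```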
Keynote Closing Remarks (Abstract)
Mr. William (Allen) Kilgore serves as Director, Research Directorate at NASA Langley Research Center. He previously served as Deputy Director of Aerosciences providing executive leadership and oversight for the Center’s Aerosciences fundamental and applied research and technology capabilities with the responsibility over Aeroscience experimental and computational research. After being appointed to the Senior Executive Service (SES) in 2013, Mr. Kilgore served as the Deputy Director, Facilities and Laboratory Operations in the Research Directorate. Prior to this position, Mr. Kilgore spent over twenty years in the operations of NASA Langley’s major aerospace research facilities including budget formulation and execution, maintenance, strategic investments, workforce planning and development, facility advocacy, and integration of facilities’ schedules. During his time at Langley, he has worked in nearly all of the major wind tunnels with a primary focus on process controls, operations and testing techniques supporting aerosciences research. For several years, Mr. Kilgore led the National Transonic Facility, the world’s largest cryogenic wind tunnel. Mr. Kilgore has been at NASA Langley Research Center since 1989, starting as a graduate student. Mr. Kilgore earned a B.S. and M.S. in Mechanical Engineering with concentration in dynamics and controls from Old Dominion University in 1984 and 1989, respectively. He is the recipient of NASA’s Exceptional Engineering Achievement Medal in 2008 and Exceptional Service Medal in 2012. |
William “Allen” Kilgore Director, Research Directorate NASA Langley Research Center
Keynote | Session Recording |
Recording | 2021 |
Breakout The System Usability Scale: A Measurement Instrument Should Suit the Measurement Needs (Abstract)
The System Usability Scale (SUS) was developed by John Brooke in 1986 “to take a quick measurement of how people perceived the usability of (office) computer systems on which they were working.” The SUS is a 10-item, generic usability scale that is assumed to be system agnostic, and it results in a numerical score that ranges from 0-100. It has been widely employed and researched with non-military systems. More recently, it has been strongly recommended for use with military systems in operational test and evaluation, in part because of its widespread commercial use, but largely because it produces a numerical score that makes it amenable to statistical operations. Recent lessons learned with SUS in operational test and evaluation strongly question its use with military systems, most of which differ radically from non-military systems. More specifically, (1) usability measurement attributes need to be tailored to the specific system under test and meet the information needs of system users, and (2) a SUS numerical cutoff score of 70—a common benchmark with non-military systems—does not accurately reflect “system usability” from an operator or test team perspective. These findings will be discussed in a psychological and human factors measurement context, and an example of system-specific usability attributes will be provided as a viable way forward. In the event that the SUS is used in operational test and evaluation, some recommendations for interpreting the outcomes will be provided. |
Keith Kidder AFOTEC |
Breakout | Materials | 2017 |
|
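For context, the standard SUS scoring procedure referenced in the abstract works as follows: each of the 10 items is answered on a 1-5 scale, odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to yield a 0-100 score.

```python
# Standard SUS scoring: 10 items on a 1-5 scale; odd items contribute
# (response - 1), even items contribute (5 - response); sum times 2.5 -> 0-100.
def sus_score(responses):
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# One hypothetical respondent
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 3]))   # 77.5
```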
Tutorial Case Study on Applying Sequential Methods in Operational Testing (Abstract)
Sequential methods concern statistical evaluation in which the number, pattern, or composition of the data is not determined at the start of the investigation but instead depends on the information acquired during the investigation. Although sequential methods originated in ballistics testing for the Department of Defense (DoD), they are underutilized in the DoD. Expanding the use of sequential methods may save money and reduce test time. In this presentation, we introduce sequential methods, describe their potential uses in operational test and evaluation (OT&E), and present a method for applying them to the test and evaluation of defense systems. We evaluate the proposed method by performing simulation studies and applying the method to a case study. Additionally, we discuss some of the challenges we might encounter when using sequential analysis in OT&E. |
Keyla Pagán-Rivera Research Staff Member IDA (bio)
Dr. Keyla Pagán-Rivera has a Ph.D. in Biostatistics from The University of Iowa and serves as a Research Staff Member in the Operational Evaluation Division at the Institute for Defense Analyses. She supports the Director, Operational Test and Evaluation (DOT&E) on training, research and applications of statistical methods. |
Tutorial | Session Recording |
Recording | 2022 |
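One classical sequential method, though not necessarily the one used in this case study, is Wald's sequential probability ratio test: the sample size is not fixed in advance, and after each observation the cumulative log-likelihood ratio is compared against boundaries determined by the desired error rates. A hedged sketch for pass/fail trials:

```python
# Wald's SPRT for Bernoulli data: H0: p = p0 vs H1: p = p1 (> p0).
# Continue testing until the cumulative log-likelihood ratio crosses a boundary.
# Generic illustration; not necessarily the method used in the case study.
import numpy as np

p0, p1 = 0.70, 0.90          # e.g., threshold vs. desired success probability
alpha, beta = 0.05, 0.10     # type I and type II error rates

upper = np.log((1 - beta) / alpha)     # accept H1 when LLR exceeds this
lower = np.log(beta / (1 - alpha))     # accept H0 when LLR falls below this

rng = np.random.default_rng(2)
llr, n = 0.0, 0
while lower < llr < upper:
    x = rng.binomial(1, 0.88)          # simulated trial outcome (true p = 0.88)
    llr += x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))
    n += 1

decision = "accept H1 (meets requirement)" if llr >= upper else "accept H0"
print(f"Stopped after {n} trials: {decision}")
```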
Breakout STAT and UQ Implementation Lessons Learned (Abstract)
David Harrison and Kelsey Cannon from Lockheed Martin Space will present on STAT and UQ implementation lessons learned within Lockheed Martin. Faced with training 60,000 engineers in statistics, David and Kelsey formed a plan to make STAT and UQ processes the standard at Lockheed Martin. The presentation includes a range of information from initial communications plan, to obtaining leader adoption, to training engineers across the corporation. Not all programs initially accepted this process, but implementation lessons have been learned over time as many compounding successes and savings have been recorded. ©2022 Lockheed Martin, all rights reserved |
Kelsey Cannon Materials Engineer Lockheed Martin (bio)
Kelsey Cannon is a Senior Research Scientist at Lockheed Martin Space, previously completing a Specialty Engineering rotation program where she worked in a variety of environments and roles. Kelsey currently works with David Harrison, the statistical engineering SME at LM, to implement technical principles and a communications plan throughout the corporation. Kelsey holds a BS in Metallurgical and Materials Engineering from the Colorado School of Mines and is nearing completion of an MS in Computer Science and Data Science. |
Breakout | Session Recording |
Recording | 2022 |
Tutorial Tutorial: Statistics Boot Camp (Abstract)
In the test community, we frequently use statistics to extract meaning from data. These inferences may be drawn with respect to topics ranging from system performance to human factors. In this mini-tutorial, we will begin by discussing the use of descriptive and inferential statistics, before exploring the basics of interval estimation and hypothesis testing. We will introduce common statistical techniques and when to apply them, and conclude with a brief discussion of how to present your statistical findings graphically for maximum impact. |
Kelly Avery IDA |
Tutorial |
| 2019 |
|
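As a small companion to the interval-estimation and hypothesis-testing portion of the tutorial, the sketch below computes a t-based confidence interval for a mean and the corresponding one-sample test against a requirement value, using hypothetical data and SciPy as an assumed toolchain.

```python
# One-sample t interval and test against a requirement value (hypothetical data).
import numpy as np
from scipy import stats

scores = np.array([78.0, 82.5, 75.0, 88.0, 80.5, 79.0, 84.0, 77.5])
requirement = 75.0

mean, sem = scores.mean(), stats.sem(scores)
ci = stats.t.interval(0.95, df=len(scores) - 1, loc=mean, scale=sem)
t_stat, p_val = stats.ttest_1samp(scores, popmean=requirement)

print(f"mean={mean:.1f}, 95% CI=({ci[0]:.1f}, {ci[1]:.1f})")
print(f"H0: mean={requirement}: t={t_stat:.2f}, p={p_val:.3f}")
```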
Roundtable Test Design and Analysis for Modeling & Simulation Validation (Abstract)
System evaluations increasingly rely on modeling and simulation (M&S) to supplement live testing. It is thus crucial to thoroughly validate these M&S tools using rigorous data collection and analysis strategies. At this roundtable, we will identify and discuss some of the core challenges currently associated with implementing M&S validation for T&E. First, appropriate design of experiments (DOE) for M&S is not universally adopted across the T&E community. This arises in part due to limited knowledge of gold-standard techniques from academic research (e.g., space-filling designs; Gaussian process emulators) as well as lack of expertise with the requisite software tools. Second, T&E poses unique demands in testing, such as extreme constraints in live testing conditions and reliance on binary outcomes. There is no consensus on how to incorporate these needs into the existing academic framework for M&S. Finally, some practical considerations lack clear solutions yet have direct consequences on design choice. In particular, we may discuss the following: (1) sample size determination when calculating power and confidence is not applicable, and (2) non-deterministic M&S output with high levels of noise, which may benefit from replication samples as in classical DOE. |
Kelly Avery Research Staff Member Institute for Defense Analyses (bio)
Kelly M. Avery is a Research Staff Member at the Institute for Defense Analyses. She supports the Director, Operational Test and Evaluation (DOT&E) on the use of statistics in test & evaluation and modeling & simulation, and has designed tests and conducted statistical analyses for several major defense programs including tactical aircraft, missile systems, radars, satellite systems, and computer-based intelligence systems. Her areas of expertise include statistical modeling, design of experiments, modeling & simulation validation, and statistical process control. Dr. Avery has a B.S. in Statistics, a M.S. in Applied Statistics, and a Ph.D. in Statistics, all from Florida State University. |
Roundtable | 2021 |
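Two of the gold-standard ingredients named in the abstract, space-filling designs and Gaussian process emulators, are available in common Python libraries. A minimal sketch (assuming SciPy and scikit-learn, with a cheap stand-in function in place of an expensive M&S run):

```python
# Space-filling design (Latin hypercube) + Gaussian process emulator for a
# toy "simulation". Minimal sketch of the M&S validation ingredients named above.
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulation(x):                     # stand-in for an expensive M&S run
    return np.sin(6 * x[:, 0]) + 0.5 * x[:, 1] ** 2

# Space-filling design over the 2-D input space
design = qmc.LatinHypercube(d=2, seed=0).random(n=30)
y = simulation(design)

# Gaussian process emulator fit to the M&S runs
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), normalize_y=True)
gp.fit(design, y)

# Predict (with uncertainty) at a new input setting
x_new = np.array([[0.4, 0.7]])
mean, sd = gp.predict(x_new, return_std=True)
print(f"Emulator prediction: {mean[0]:.3f} +/- {2 * sd[0]:.3f}")
```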