Session Title | Speaker | Type | Recording | Materials | Year
---|---|---|---|---|---
Webinar A Practical Introduction To Gaussian Process Regression (Abstract)
Abstract: Gaussian process regression is ubiquitous in spatial statistics, machine learning, and the surrogate modeling of computer simulation experiments. Fortunately, its prowess as an accurate predictor, along with an appropriate quantification of uncertainty, does not derive from difficult-to-understand methodology or cumbersome implementation. We will cover the basics and provide a practical tool-set ready to be put to work in diverse applications. The presentation will involve accessible slides authored in Rmarkdown, with reproducible examples spanning bespoke implementation to add-on packages. |
Robert “Bobby” Gramacy Virginia Tech (bio)
Robert Gramacy is a Professor of Statistics in the College of Science at Virginia Polytechnic Institute and State University (Virginia Tech). Previously he was an Associate Professor of Econometrics and Statistics at the Booth School of Business, and a fellow of the Computation Institute at The University of Chicago. His research interests include Bayesian modeling methodology, statistical computing, Monte Carlo inference, nonparametric regression, sequential design, and optimization under uncertainty. Professor Gramacy is a computational statistician. He specializes in areas of real-data analysis where the ideal modeling apparatus is impractical, or where the current solutions are inefficient and thus skimp on fidelity. Such endeavors often require new models, new methods, and new algorithms. His goal is to be impactful in all three areas while remaining grounded in the needs of a motivating application. His aim is to release general-purpose software for consumption by the scientific community at large, not only other statisticians. Professor Gramacy is the primary author on six R packages available on CRAN, two of which (tgp and monomvn) have won awards from statistical and practitioner communities. |
Webinar |
| 2020 |
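For readers who want a taste of the material, here is a minimal bespoke Gaussian process regression sketch in Python (the webinar itself uses R and Rmarkdown). The squared-exponential kernel, lengthscale, nugget, and sine test function are illustrative assumptions, not material from the talk.

```python
# Bespoke GP regression with a squared-exponential kernel (illustrative only).
import numpy as np

def sq_exp(X1, X2, lengthscale=0.2):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    return np.exp(-(X1[:, None] - X2[None, :]) ** 2 / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 8)                                  # training inputs
y = np.sin(2 * np.pi * X) + rng.normal(0, 0.1, X.size)    # noisy responses
Xnew = np.linspace(0, 1, 100)                             # prediction grid

nugget = 0.1 ** 2                                         # noise variance
K = sq_exp(X, X) + nugget * np.eye(X.size)
k_star = sq_exp(Xnew, X)

mean = k_star @ np.linalg.solve(K, y)                     # predictive mean
cov = sq_exp(Xnew, Xnew) - k_star @ np.linalg.solve(K, k_star.T)
sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))            # pointwise sd of the latent surface

print(mean[:5])
print(sd[:5])
```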
Webinar Can AI Predict Human Behavior? (Abstract)
Given the rapid increase of novel machine learning applications in cybersecurity and people analytics, there is significant evidence that these tools can give meaningful and actionable insights. Even so, great care must be taken to ensure that automated decision-making tools are deployed in such a way as to mitigate bias in predictions and promote security of user data. In this talk, Dr. Burns will take a deep dive into an open source data set in the area of people analytics, demonstrating the application of basic machine learning techniques, while discussing limitations and potential pitfalls in using an algorithm to predict human behavior. In the end, he will draw a comparison between predicting a person's behavioral propensity for things such as becoming an insider threat and the way assisted-diagnosis tools are used in medicine to predict the development or recurrence of illness. |
Dustin Burns Senior Scientist Exponent (bio)
Dr. Dustin Burns is a Senior Scientist in the Statistical and Data Sciences practice at Exponent, a multidisciplinary scientific and engineering consulting firm dedicated to responding to the world’s most impactful business problems. Combining his background in laboratory experiments with his expertise in data analytics and machine learning, Dr. Burns works across many industries, including security, consumer electronics, utilities, and health sciences. He supports clients’ goals to modernize data collection and analytics strategies, extract information from unused data such as images and text, and test and validate existing systems. |
Webinar | Session Recording |
Recording | 2020 |
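As a rough illustration of the kind of workflow the talk describes (fitting a basic classifier to people-analytics data, then checking predictions for group-level bias), here is a Python sketch. The synthetic data, column names, and attrition target are hypothetical stand-ins for the open-source data set used in the talk.

```python
# Fit a simple classifier to synthetic "people analytics" data and compare
# predicted flag rates across a (hypothetical) protected group.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "tenure_years": rng.gamma(2.0, 3.0, n),
    "overtime_hours": rng.poisson(5, n),
    "group": rng.integers(0, 2, n),        # hypothetical protected attribute
})
logit = -1.5 + 0.2 * df["overtime_hours"] - 0.1 * df["tenure_years"]
df["left_company"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

features = ["tenure_years", "overtime_hours", "group"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["left_company"], test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = model.predict(X_test)

print("accuracy:", round(model.score(X_test, y_test), 3))
# Simple fairness check: does the predicted flag rate differ by group?
print(pd.Series(pred).groupby(X_test["group"].to_numpy()).mean())
```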
Webinar KC-46A Adaptive Relevant Testing Strategies to Enable Incremental Evaluation (Abstract)
The DoD’s challenge to provide capability at the “Speed of Relevance” has generated many new strategies to adapt to rapid development and acquisition. As a result, Operational Test Agencies (OTA) have had to adjust their test processes to accommodate rapid, but incremental delivery of capability to the warfighter. The Air Force Operational Test and Evaluation Center (AFOTEC) developed the Adaptive Relevant Testing (ART) concept to answer the challenge. In this session, AFOTEC Test Analysts will brief examples and lessons learned from implementing the ART principles on the KC-46A acquisition program to identify problems early and promote the delivery of individual capabilities as they are available to test. The AFOTEC goal is to accomplish these incremental tests while maintaining a rigorous statistical evaluation in a relevant and timely manner. This discussion will explain in detail how the KC-46A Initial Operational Test and Evaluation (IOT&E) was accomplished in a unique way that allowed the test team to discover, report on, and correct major system deficiencies much earlier than traditional methods. |
J. Quinn Stank Lead KC-46 Analyst AFOTEC (bio)
First Lieutenant J. Quinn Stank is the Lead Analyst for the Air Force Operational Test and Evaluation Center Detachment 5 at Operating Location Everett, Washington. The lieutenant serves as the advisor to the Operational Test and Evaluation team for the KC-46A. Lieutenant Stank, originally from Knoxville, Tennessee, received his commission as a second lieutenant upon graduation from the United States Air Force Academy in 2016. |
Webinar | Session Recording |
Recording | 2020 |
Webinar Development and Analytic Process Used to Develop a 3-Dimensional Graphical User Interface System for Baggage Screening (Abstract)
The Transportation Security Administration (TSA) uses several types of screening technologies for the purposes of threat detection at airports and federal facilities across the country. Computed Tomography (CT) systems afford TSA personnel in the Checked Baggage setting a quick and effective method to screen property with less need to physically inspect property due to their advanced imaging capabilities. Recent reductions in size, cost, and processing speed for CT systems spurred an interest in incorporating these advanced imaging systems at the Checkpoint to increase the speed and effectiveness of scanning personal property as well as passenger satisfaction during travel. The increase in speed and effectiveness of scanning personal property with fewer physical property inspections stems from several qualities native to CT imaging that current 2D X-Ray based Advanced Technology 2 (AT2) systems typically found at Checkpoints lack. Specifically, the CT offers rotatable 3D images and advanced identification algorithms that allow TSA personnel to more readily identify items requiring review on-screen without requesting that passengers remove them from their bag. The introduction of CT systems at domestic airports led to the identification of a few key Human Factors issues, however. Several vendors used divergent strategies to produce the CT systems introduced at domestic airport Checkpoints. Each system offered users different 3D visualizations, informational displays, and identification algorithms, offering a range of views, tools, layouts, and material colorization for users to sort through. The disparity in system similarity and potential for multiple systems to operate at a single airport resulted in unnecessarily complex training, testing, certification, and operating procedures. In response, a group of human factors engineers (HFEs) was tasked with creating requirements for a single Common Graphical User Interface (CGUI) for all CT systems that would provide a standard look, feel, and interaction across systems. We will discuss the development and analytic process used to (1) gain an understanding of the tasks that CT systems must accomplish at the Checkpoint (i.e., focus groups), (2) identify what tools Transportation Security Officers (TSOs) tend to use and why (i.e., focus groups and rank-ordered surveys), and (3) determine how changes during iterative testing affect performance (i.e., A/B testing while collecting response time, accuracy, and tool usage). The data collection effort described here resulted in a set of requirements that produced a highly usable CT interface as measured by several valid and reliable objective and subjective measures. Perceptions of the CGUI’s usability (e.g., the System Usability Scale; SUS) were aligned with TSO performance (i.e., Pd, PFA, and Throughput) during use of the CGUI prototype. Iterative testing demonstrated an increase in the SUS score and performance measures for each revision of the requirements used to produce the common CT interface. User perspectives, feedback, and performance data also offered insight toward the determination of necessary future efforts that will increase user acceptance of the redesigned CT interface. Increasing user acceptance offers TSA the opportunity to improve user engagement, reduce errors, and increase the likelihood that the system will stay in service without a mandate. |
Charles McKee President and CEO Taverene Analytics LLC (bio)
Mr. McKee provides Test and Evaluation, Systems Engineering, Human Factors Engineering, Strategic Planning, Capture Planning, and Proposal Development support to companies supporting the Department of Defense and Department of Homeland Security. Recently served as President of the Board of Directors, International Test and Evaluation Association (ITEA), 2013 – 2015. Security Clearance: Secret, previously cleared for TS SCI. Homeland Security Vetted. TSA Professional Engineering Logistics Support Services (PELSS2) (May 2016 – present) for Global System Technologies (GST) and TSA Operational Test & Evaluation Support Services (OTSS) and Test & Evaluation Support Services (TESS) (Aug 2014 – 2016) for Engility. Provides Acquisition Management, Systems Engineering, Operational Test & Evaluation (OT&E), Human Factors Engineering (HFE), test planning, design of experiments, data collection, data analysis, statistics, and evaluation reporting on Transportation Security Equipment (TSE) systems deployed to Airports and Intermodal facilities. Led the design and development of a Common Graphical User Interface (CGUI) for new Checkpoint Computed Tomography systems. The CGUI design maximized the Probability of Detection and minimized the probability of false alarms, while improving throughput time for screening accessible property by Transportation Security Officers (TSOs) at airports. Division Manager, Alion Science and Technology, 2009-2014. Oversaw program management and technical support for the Test and Evaluation Division. Provided analysis support to multiple clients such as: Army Program Executive Office (PEO) Simulation Training Instrumentation (STRI) STARSHIP program and DISA Test and Evaluation (T&E) Mission Support Services contract. Provided Subject Matter Expertise to all clients on program management, test and evaluation, statistical analysis, modeling and simulation, training, human factors engineering / human systems integration, and design of experiments. Operations Manager, SAIC, 2006-2009. Oversaw the program management and technical support for the Test, Evaluation, and Analysis Operation (TEAO). Provided analysis support to multiple clients such as the Director, Operational Test and Evaluation (DOT&E), Joint Test & Evaluation (JT&E), Test Resource Management Center (TRMC), OSD AT&L Systems Engineering, Defense Modeling and Simulation Office (DMSO), Air Force Test and Evaluation (AF/TE), US Joint Forces Command (USJFCOM) Joint Forces Integration and Interoperability Test (JFIIT), and the Air Combat Command (ACC) Red Flag Exercise Support. Provided Subject Matter Expertise to all clients on program management, test and evaluation, statistical analysis, modeling and simulation, training, human factors engineering / human systems integration, and design of experiments. Senior Program Manager, Human Factors Engineer. BDM / TRW / NGC (1989-2000 and 2003-2006). Provided Human Factors Engineering / Manpower Personnel Integration support to the Army Test and Evaluation Command (ATEC) / Army Evaluation Center (AEC), FAA Systems Engineering Integrated Product Team (SEIPT), and TSA Data Collection, Reduction, Analysis, Reporting, and Archiving (DCRARA) Support. Developed Evaluation Plans, Design of Experiments (DOE), requirements analysis, test planning, test execution, data collection, reduction, analysis, statistical analysis, and military assessments of Army programs. Supported HFE / MANPRINT working groups and System MANPRINT Management Plans.
Conducted developmental assessments of System Safety, Manpower, Personnel, Training, and Human Systems Interfaces. MS, Industrial Engineering, NCSU, 1989. Major: Human Factors Engineering. Minor: Occupational Safety and Health. Scholarship from the National Institute of Occupational Safety and Health (NIOSH). Master’s Thesis on Cumulative Trauma Disorders in occupations with repetitive motion. |
Webinar | Session Recording |
Recording | 2020 |
Webinar Adoption Challenges in Artificial Intelligence and Machine Learning for Analytic Work Environments |
Laura McNamara Distinguished Member of Technical Staff Sandia National Laboratories (bio)
Dr. Laura A. McNamara is Distinguished Member of Technical Staff at Sandia National Laboratories. She’s spent her career partnering with computer scientists, software engineers, physicists, human factors experts, organizational psychologists, remote sensing and imagery scientists, and national security analysts in a wide range of settings. She has expertise in user-centered technology design and evaluation, information visualization/visual analytics, and mixed qualitative/quantitative social science research. Most of her projects involve challenges in sensor management, technology usability, and innovation feasibility and adoption. She enjoys working in Agile and Agile-like environments and is a skilled leader of interdisciplinary engineering, scientific, and software teams. She is passionate about ensuring usability, utility, and adaptability of visualization, operational, and analytic software. Dr. McNamara’s current work focuses on operational and analytic workflows in remote sensing environments. She is also an expert on visual cognitive workflows in team environments, focused on the role of user interfaces and analytic technologies to support exploratory data analysis and information creation with large, disparate, unwieldy datasets, from text to remote sensing. Dr. McNamara has longstanding interest in the epistemology and practices of computational modeling and simulation, verification and validation, and uncertainty quantification. She has worked with the National Geospatial-Intelligence Agency, the Missile Defense Agency, the Defense Intelligence Agency, and the nuclear weapons programs at Sandia and Los Alamos National Laboratories to enhance the effective use of modeling and simulation in interdisciplinary R&D projects. |
Webinar |
| 2020 |
Webinar Taking Down a Turret: Introduction to Cyber Operational Test and Evaluation (Abstract)
Cyberattacks are in the news every day, from data breaches of banks and stores to ransomware attacks shutting down city governments and delaying school years. In this mini-tutorial, we introduce key cybersecurity concepts and methods for conducting cybersecurity test and evaluation. We walk you through a live demonstration of a cyberattack and provide real-world examples of each major step we take. The demonstration shows an attacker gaining command and control of a Nerf turret. We leverage tools commonly used by red teams to explore an attack scenario involving phishing, network scanning, password cracking, pivoting, and finally creating a mission effect. We also provide a defensive view and analytics that show artifacts left by the attack path. |
OED Cyber Lab IDA |
Webinar | Session Recording |
Recording | 2020 |
Webinar I have the Power! Power Calculation in Complex (and Not So Complex) Modeling Situations Part 1 (Abstract)
Invariably, any analyst who has been in the field long enough has heard the dreaded questions: “Is X number of samples enough? How much data do I need for my experiment?” Ulterior motives aside, any investigation involving data must ultimately answer the question of “How many?” to avoid either collecting insufficient data to detect a scientifically significant effect or collecting too much data and wasting valuable resources. This can become particularly difficult when the underlying model is complex (e.g., longitudinal designs with hard-to-change factors, time-to-event response with censoring, binary responses with non-uniform test levels, etc.). Even in the supposedly simpler case of categorical factors, where run size is often chosen using a lower bound power calculation, a simple approach can mask more “powerful” techniques. In this tutorial, we will spend the first half exploring how to use simulation to perform power calculations in complex modeling situations drawn from relevant defense applications. Techniques will be illustrated using both R and JMP Pro. In the second half, we will investigate the case of categorical factors and illustrate how treating the unknown effects as random variables induces a distribution on statistical power, which can then be used as a new way to assess experimental designs. Instructor Bio: Caleb King is a Research Statistician Tester for the DOE platform in the JMP software. He received his MS and PhD in Statistics from Virginia Tech and worked for three years as a statistical scientist at Sandia National Laboratories prior to arriving at JMP. His areas of expertise include optimal design of experiments, accelerated testing, reliability analysis, and small-sample theory. |
Caleb King JMP Division, SAS Institute Inc. |
Webinar | Session Recording |
Materials
Recording | 2020 |
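To make the simulation-based power idea from the first half of the tutorial concrete, here is a minimal Python sketch (the tutorial itself uses R and JMP Pro). The logistic model, effect sizes, significance level, and sample sizes are illustrative assumptions.

```python
# Simulation-based power for detecting a slope in a logistic-regression model.
import numpy as np
import statsmodels.api as sm

def simulated_power(n, beta0=-1.0, beta1=0.8, alpha=0.05, n_sim=1000, seed=1):
    """Fraction of simulated experiments in which the factor effect is detected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        x = rng.uniform(-1, 1, n)                     # factor settings
        p = 1 / (1 + np.exp(-(beta0 + beta1 * x)))    # true response curve
        y = rng.binomial(1, p)
        fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
        rejections += fit.pvalues[1] < alpha
    return rejections / n_sim

for n in (50, 100, 200):
    print(n, round(simulated_power(n), 3))
```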
Webinar D-Optimally Based Sequential Test Method for Ballistic Limit Testing (Abstract)
Ballistic limit testing of armor is testing in which a kinetic energy threat is shot at armor at varying velocities. The striking velocity and whether the threat completely penetrated or partially penetrated the armor is recorded. The probability of penetration is modeled as a function of velocity using a generalized linear model. The parameters of the model serve as inputs to MUVES, a DoD software tool used to analyze weapon system vulnerability and munition lethality. Generally, the probability of penetration is assumed to be monotonically increasing with velocity. However, in cases in which there is a change in penetration mechanism, such as the shatter gap phenomenon, the probability of penetration can no longer be assumed to be monotonically increasing and a more complex model is necessary. One such model was developed by Chang and Bodt to model the probability of penetration as a function of velocity over a velocity range in which there are two penetration mechanisms. This paper proposes a D-optimally based sequential shot selection method to efficiently select threat velocities during testing. Two cases are presented: the case in which the penetration mechanism for each shot is known (via high-speed or post-shot x-ray) and the case in which the penetration mechanism is not known. This method may be used to support an improved evaluation of armor performance for cases in which there is a change in penetration mechanism. |
Leonard Lombardo Mathematician U.S. Army Aberdeen Test Center (bio)
Leonard currently serves as an analyst for the RAM/ILS Engineering and Analysis Division at the U.S. Army Aberdeen Test Center (ATC). At ATC, he is the lead analyst for both ballistic testing of helmets and fragmentation analysis. Previously, while on a developmental assignment at the U.S. Army Evaluation Center, he worked towards increasing the use of generalized linear models in ballistic limit testing. Since then, he has contributed towards the implementation of generalized linear models within the test center through test design and analysis. |
Webinar | Session Recording |
Recording | 2020 |
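A simplified sketch of the sequential idea described in the abstract above: fit a logistic model of penetration probability versus striking velocity to the shots taken so far, then choose the next shot velocity to maximize the determinant of the updated Fisher information. The velocities, outcomes, and single-mechanism model below are illustrative assumptions; the paper's method addresses the two-mechanism (shatter gap) case.

```python
# Fit a logistic model of penetration vs. velocity, then pick the next shot
# velocity that maximizes the determinant of the updated Fisher information.
import numpy as np
import statsmodels.api as sm

velocities = np.array([0.80, 0.85, 0.90, 0.95, 1.00, 1.05])  # shots so far (km/s, hypothetical)
penetrated = np.array([0, 0, 1, 0, 1, 1])                     # 1 = complete penetration

def design(v):
    return np.column_stack([np.ones_like(v), v])              # intercept + velocity

def info_matrix(v, beta):
    X = design(v)
    p = 1 / (1 + np.exp(-X @ beta))
    return (X * (p * (1 - p))[:, None]).T @ X                 # X' W X

beta = np.asarray(sm.Logit(penetrated, design(velocities)).fit(disp=0).params)

candidates = np.linspace(0.75, 1.10, 71)                      # candidate next-shot velocities
base = info_matrix(velocities, beta)
scores = [np.linalg.det(base + info_matrix(np.array([v]), beta)) for v in candidates]
print("next shot velocity:", candidates[int(np.argmax(scores))])
```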
Webinar Introduction to Uncertainty Quantification for Practitioners and Engineers (Abstract)
Uncertainty is an inescapable reality that can be found in nearly all types of engineering analyses. It arises from sources like measurement inaccuracies, material properties, boundary and initial conditions, and modeling approximations. Uncertainty Quantification (UQ) is a systematic process that puts error bands on results by incorporating real-world variability and probabilistic behavior into engineering and systems analysis. UQ answers the question: what is likely to happen when the system is subjected to uncertain and variable inputs? Answering this question facilitates significant risk reduction, robust design, and greater confidence in engineering decisions. Modern UQ techniques use powerful statistical models to map the input-output relationships of the system, significantly reducing the number of simulations or tests required to get accurate answers. This tutorial will present common UQ processes that operate within a probabilistic framework. These include statistical Design of Experiments, statistical emulation methods used to model the simulation input-to-response relationship, and statistical calibration for model validation and tuning to better represent test results. Examples from different industries will be presented to illustrate how the covered processes can be applied to engineering scenarios. This is purely an educational tutorial and will focus on the concepts, methods, and applications of probabilistic analysis and uncertainty quantification. SmartUQ software will only be used for illustration of the methods and examples presented. This is an introductory tutorial designed for practitioners and engineers with little to no formal statistical training. However, statisticians and data scientists may also benefit from seeing the material presented from a practical-use perspective rather than a purely technical one. There are no prerequisites other than an interest in UQ. Attendees will gain an introductory understanding of Probabilistic Methods and Uncertainty Quantification, basic UQ processes used to quantify uncertainties, and the value UQ can provide in maximizing insight, improving design, and reducing time and resources. Instructor Bio: Gavin Jones, Sr. SmartUQ Application Engineer, is responsible for performing simulation and statistical work for clients in aerospace, defense, automotive, gas turbine, and other industries. He is also a key contributor in SmartUQ’s Digital Twin/Digital Thread initiative. Mr. Jones received a B.S. in Engineering Mechanics and Astronautics and a B.S. in Mathematics from the University of Wisconsin-Madison. |
Gavin Jones Sr. Application Engineer SmartUQ |
Webinar | 2020 |
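As a concrete taste of forward uncertainty propagation, one of the basic UQ processes the tutorial covers, here is a Monte Carlo sketch in Python. The cantilever-beam deflection formula and the input distributions are illustrative assumptions, not examples from the tutorial or SmartUQ.

```python
# Forward uncertainty propagation: sample uncertain inputs, evaluate the model,
# and report an error band on the output (cantilever tip deflection).
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

P = rng.normal(1000.0, 50.0, n)                             # load (N)
E = rng.lognormal(mean=np.log(2.0e11), sigma=0.05, size=n)  # Young's modulus (Pa)
L = rng.normal(2.0, 0.01, n)                                # length (m)
inertia = rng.normal(8.0e-6, 2.0e-7, n)                     # second moment of area (m^4)

deflection = P * L**3 / (3.0 * E * inertia)                 # delta = P L^3 / (3 E I)

lo, med, hi = np.percentile(deflection, [2.5, 50, 97.5])
print(f"median deflection {med:.3e} m, 95% band [{lo:.3e}, {hi:.3e}] m")
```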
Webinar A Validation Case Study: The Environment Centric Weapons Analysis Facility (Abstract)
Reliable modeling and simulation (M&S) allows the undersea warfare community to understand torpedo performance in scenarios that could never be created in live testing, and do so for a fraction of the cost of an in-water test. The Navy hopes to use the Environment Centric Weapons Analysis Facility (ECWAF), a hardware-in-the-loop simulation, to predict torpedo effectiveness and supplement live operational testing. In order to trust the model’s results, the T&E community has applied rigorous statistical design of experiments techniques to both live and simulation testing. As part of ECWAF’s two-phased validation approach, we ran the M&S experiment with the legacy torpedo and developed an empirical emulator of the ECWAF using logistic regression. Comparing the emulator’s predictions to actual outcomes from live test events supported the test design for the upgraded torpedo. This talk overviews the ECWAF’s validation strategy, decisions that have put the ECWAF on a promising path, and the metrics used to quantify uncertainty. |
Elliot Bartis Research Staff Member IDA (bio)
Elliot Bartis is a research staff member at the Institute for Defense Analyses where he works on test and evaluation of undersea warfare systems such as torpedoes and torpedo countermeasures. Prior to coming to IDA, Elliot received his B.A. in physics from Carleton College and his Ph.D. in materials science and engineering from the University of Maryland, College Park. For his doctoral dissertation, he studied how cold plasma interacts with biomolecules and polymers. Elliot was introduced to model validation through his work on a torpedo simulation called the Environment Centric Weapons Analysis Facility. In 2019, Elliot and others involved in the MK 48 torpedo program received a Special Achievement Award from the International Test and Evaluation Association in part for their work on this simulation. Elliot lives in Falls Church, VA with his wife Jacqueline and their cat Lily. |
Webinar | Session Recording |
Recording | 2020 |
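A rough sketch of the emulator-validation idea described in the abstract above: fit a logistic-regression emulator to binary simulation outcomes over a designed factor space, then score its predictions against scarce live-test outcomes. The factors, synthetic data, and Brier-score metric are illustrative assumptions, not ECWAF details.

```python
# Fit a logistic-regression emulator to simulation outcomes, then score its
# predicted probabilities against live-test outcomes with the Brier score.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(7)

def outcomes(X):
    """Hypothetical true hit/miss process over two scenario factors."""
    logit = 1.5 * X[:, 0] - 1.0 * X[:, 1]
    return rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_sim = rng.uniform(-1, 1, size=(400, 2))     # designed simulation runs
y_sim = outcomes(X_sim)
emulator = LogisticRegression().fit(X_sim, y_sim)

X_live = rng.uniform(-1, 1, size=(30, 2))     # scarce live test events
y_live = outcomes(X_live)
p_hat = emulator.predict_proba(X_live)[:, 1]

print("Brier score on live events:", round(brier_score_loss(y_live, p_hat), 3))
```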
Webinar The Role of Uncertainty Quantification in Machine Learning (Abstract)
Uncertainty is an inherent, yet often under-appreciated, component of machine learning and statistical modeling. Data-driven modeling often begins with noisy data from error-prone sensors collected under conditions for which no ground-truth can be ascertained. Analysis then continues with modeling techniques that rely on a myriad of design decisions and tunable parameters. The resulting models often provide demonstrably good performance, yet they illustrate just one of many plausible representations of the data – each of which may make somewhat different predictions on new data. This talk provides an overview of recent, application-driven research at Sandia Labs that considers methods for (1) estimating the uncertainty in the predictions made by machine learning and statistical models, and (2) using the uncertainty information to improve both the model and downstream decision making. We begin by clarifying the data-driven uncertainty estimation task and identifying sources of uncertainty in machine learning. We then present results from applications in both supervised and unsupervised settings. Finally, we conclude with a summary of lessons learned and critical directions for future work. |
David Stracuzzi Research Scientist Sandia National Laboratories |
Webinar |
| 2020 |
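One common way to attach uncertainty to machine-learning predictions, in the spirit of the talk above, is to refit the model on bootstrap resamples and use the spread of the ensemble's predictions as an uncertainty estimate. The data, model, and ensemble size in this sketch are illustrative assumptions; the talk surveys several approaches.

```python
# Bootstrap ensemble of regression trees; the spread of ensemble predictions
# serves as a simple uncertainty estimate for each new input.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(200, 1))
y = np.sin(4 * X[:, 0]) + rng.normal(0, 0.2, 200)      # noisy training data

X_new = np.linspace(0, 1, 5).reshape(-1, 1)
preds = []
for _ in range(200):                                   # bootstrap resamples
    idx = rng.integers(0, len(y), len(y))
    model = DecisionTreeRegressor(max_depth=4).fit(X[idx], y[idx])
    preds.append(model.predict(X_new))

preds = np.array(preds)
for x, m, s in zip(X_new[:, 0], preds.mean(axis=0), preds.std(axis=0)):
    print(f"x={x:.2f}  prediction={m:+.2f}  +/- {2 * s:.2f}")
```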
Webinar Statistical Engineering for Service Life Prediction of Polymers (Abstract)
Economically efficient selection of materials depends on knowledge of not just the immediate properties, but the durability of those properties. For example, when selecting building joint sealant, the initial properties are critical to successful design. These properties change over time and can result in failure in the application (buildings leak, glass falls). A NIST-led industry consortium has a research focus on developing new measurement science to determine how the properties of the sealant change with environmental exposure. In this talk, the two-decade history of the NIST-led effort will be examined through the lens of Statistical Engineering, specifically its six phases: (1) Identify the problem. (2) Provide structure. (3) Understand the context. (4) Develop a strategy. (5) Develop and execute tactics. (6) Identify and deploy a solution. Phases 5 and 6 will be the primary focus of this talk, but all of the phases will be discussed. The tactics of phase 5 were often themselves multi-month or multi-year research problems. Our approach to predicting outdoor degradation based only on accelerated weathering in the laboratory has been revised and improved many times over several years. In phase 6, because of NIST’s unique mission of promoting U.S. innovation and industrial competitiveness, the focus has been outward on technology transfer and the advancement of test standards. This may differ from industry and other government agencies where the focus may be improvement of processes inside of the organization. |
Adam Pintar Mathematical Statistician National Institute of Standards and Technology (bio)
Adam Pintar is a Mathematical Statistician at the National Institute of Standards and Technology. He applies statistical methods and thinking to diverse application areas including Physics, Chemistry, Biology, Engineering, and more recently Social Science. He received a PhD in Statistics from Iowa State University. |
Webinar | Session Recording |
Recording | 2020 |
Webinar The Science of Trust of Autonomous Unmanned Systems (Abstract)
The world today is witnessing a significant investment in autonomy and artificial intelligence that most certainly will result in ever-increasing capabilities of unmanned systems. Driverless vehicles are a great example of systems that can make decisions and perform very complex actions. The reality, though, is that while it is well understood what these systems are doing, it is not well understood how the intelligence engines generate the decisions that accomplish those actions. Therein lies the underlying challenge of accomplishing formal test and evaluation of these systems and, related, how to engender trust in their performance. This presentation will outline and define the problem space, discuss those challenges, and offer solution constructs. |
Reed Young Program Manager for Robotics and Autonomy Johns Hopkins University Applied Physics Laboratory |
Webinar | Session Recording |
Recording | 2020 |
Webinar Sequential Testing and Simulation Validation for Autonomous Systems (Abstract)
Autonomous systems are expected to play a significant role in the next generation of DoD acquisition programs. New methods need to be developed and vetted, particularly for two groups we know well that will be facing the complexities of autonomy: a) test and evaluation, and b) modeling and simulation. For test and evaluation, statistical methods that are routinely and successfully applied throughout DoD need to be adapted to be most effective for autonomy, and some of our practices will be stressed. One is sequential testing and analysis, which we illustrate as a way for testers to learn and improve incrementally. The other group needing to rethink its practices for autonomy is the modeling and simulation community. Proposed are some statistical methods appropriate for modeling and simulation validation of autonomous systems. We look forward to your comments and suggestions. |
Jim Simpson Principal JK Analytics |
Webinar | Session Recording |
Recording | 2020 |
Webinar The Role of Statistical Engineering in Creating Solutions for Complex Opportunities (Abstract)
Statistical engineering is the art and science of addressing complex organizational opportunities with data. The span of statistical engineering ranges from the “problems that keep CEOs awake at night” to the analysts dealing with the results of the experimentation necessary for the success of their most current project. This talk introduces statistical engineering and its full spectrum of approaches to complex opportunities with data. The purpose of this talk is to set the stage for the two specific case studies that follow it. Too often, people lose sight of the big picture of statistical engineering through too narrow a focus on the specific case studies. Too many people walk away thinking “This is what I have been doing for years. It is simply good applied statistics.” These people fail to see what we can learn from each other by sharing our experiences, teaching other people how to create solutions more efficiently and effectively. It is this big picture that is the focus of this talk. |
Geoff Vining Professor Virginia Tech (bio)
Geoff Vining is a Professor of Statistics at Virginia Tech, where from 1999 – 2006, he also was the department head. He holds an Honorary Doctor of Technology from Luleå University of Technology. He is an Honorary Member of the ASQ (the highest lifetime achievement award in the field of Quality), an Academician of the International Academy for Quality, a Fellow of the American Statistical Association (ASA), and an Elected Member of the International Statistical Institute. He is the Founding and Current Past-Chair of the International Statistical Engineering Association (ISEA). He is a founding member of the US DoD Science of Test Research Consortium. Dr. Vining won the 2010 Shewhart Medal, the ASQ career award given to the person who has demonstrated the most outstanding technical leadership in the field of modern quality control. He also received the 2015 Box Medal from the European Network for Business and Industrial Statistics (ENBIS). This medal recognizes a statistician who has remarkably contributed to the development and the application of statistical methods in European business and industry. In 2013, he received an Engineering Excellence Award from the NASA Engineering and Safety Center. He received the 2011 William G. Hunter Award from the ASQ Statistics Division for excellence in statistics as a communicator, consultant, educator, innovator, and integrator of statistics with other disciplines and an implementer who obtains meaningful results. Dr. Vining is the author of three textbooks. He is an internationally recognized expert in the use of experimental design for quality, productivity, and reliability improvement and in the application of statistical process control. He has extensive consulting experience, most recently with the U.S. Department of Defense through the Science of Test Research Consortium and with NASA. |
Webinar | Session Recording |
Recording | 2020 |
Webinar Connecting Software Reliability Growth Models to Software Defect Tracking (Abstract)
Co-Author: Melanie Luperon. Most software reliability growth models only track defect discovery. However, a practical concern is the removal of high-severity defects, yet defect removal is often assumed to occur instantaneously. More recently, several defect removal models have been formulated as differential equations in terms of the number of defects discovered but not yet resolved and the rate of resolution. The limitation of this approach is that it does not take into consideration data contained in a defect tracking database. This talk describes our recent efforts to analyze data from a NASA program. Two methods to model defect resolution are developed, namely (i) distributional and (ii) Markovian approaches. The distributional approach employs times between defect discovery and resolution to characterize the mean resolution time and derives a software defect resolution model from the corresponding software reliability growth model to track defect discovery. The Markovian approach develops a state model from the stages of the software defect lifecycle as well as a transition probability matrix and the distributions for each transition, providing a semi-Markov model. Both the distributional and Markovian approaches employ a censored estimation technique to identify the maximum likelihood estimates, in order to handle the case where some but not all of the defects discovered have been resolved. Furthermore, we apply a hypothesis test to determine whether a first- or second-order Markov chain best characterizes the defect lifecycle. Our results indicate that a first-order Markov chain was sufficient to describe the data considered and that the Markovian approach achieves modest improvements in predictive accuracy, suggesting that the simpler distributional approach may be sufficient to characterize the software defect resolution process during test. The practical inferences of such models include an estimate of the time required to discover and remove all defects. |
Lance Fiondella Associate Professor University of Massachusetts (bio)
Lance Fiondella is an associate professor of Electrical and Computer Engineering at the University of Massachusetts Dartmouth. He received his PhD (2012) in Computer Science and Engineering from the University of Connecticut. Dr. Fiondella’s papers have received eleven conference paper awards, including six with his students. His software and system reliability and security research has been funded by the DHS, NASA, Army Research Laboratory, Naval Air Warfare Center, and National Science Foundation, including a CAREER Award. |
Webinar | Session Recording |
Recording | 2020 |
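To illustrate the Markovian piece of the approach described above, here is a sketch that estimates a first-order transition probability matrix for the defect lifecycle from defect-tracking records. The lifecycle states and example paths are hypothetical, not the NASA program's data.

```python
# Maximum likelihood estimate of a first-order transition matrix for the
# defect lifecycle, from (hypothetical) defect-tracking records.
import numpy as np

states = ["Open", "In Progress", "In Review", "Closed"]
idx = {s: i for i, s in enumerate(states)}

defect_paths = [                                  # one observed path per defect
    ["Open", "In Progress", "In Review", "Closed"],
    ["Open", "In Progress", "In Review", "In Progress", "In Review", "Closed"],
    ["Open", "In Progress", "Closed"],
    ["Open", "In Progress", "In Review"],         # not yet resolved (censored)
]

counts = np.zeros((len(states), len(states)))
for path in defect_paths:
    for a, b in zip(path[:-1], path[1:]):
        counts[idx[a], idx[b]] += 1

row_totals = counts.sum(axis=1, keepdims=True)
P = np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)
print(np.round(P, 2))                             # row i = Pr(next state | current state i)
```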
Webinar A HellerVVA Problem: The Catch-22 for Simulated Testing of Fully Autonomous Systems (Abstract)
In order to verify, validate, and accredit (VV&A) a simulation environment for testing the performance of an autonomous system, testers must examine more than just sensor physics—they must also provide evidence that the environmental features which drive system decision making are represented at all. When systems are black boxes, though, these features are fundamentally unknown, necessitating that we first test to discover these features. An umbrella known as “model induction” provides approaches for demystifying black boxes and obtaining models of their decision making, but the current state of the art assumes testers can input large quantities of operationally relevant data. When systems only make passive perceptual decisions or operate in purely virtual environments, these assumptions are typically met. However, this will not be the case for black-box, fully autonomous systems. These systems can make decisions about the information they acquire—which cannot be changed in pre-recorded passive inputs—and a major reason to obtain a decision model is to VV&A the simulation environment—preventing the valid use of a virtual environment to obtain a model. Furthermore, the current consensus is that simulation will be used to get limited safety releases for live testing. This creates a catch-22 of needing data to obtain the decision model, but needing the decision model to validly obtain the data. In this talk, we provide a brief overview of this challenge and possible solutions. |
Daniel Porter Research Staff Member IDA |
Webinar | Session Recording |
Recording | 2020 |
Webinar I have the Power! Power Calculation in Complex (and Not So Complex) Modeling Situations Part 2 (Abstract)
Instructor Bio: Ryan Lekivetz is a Senior Research Statistician Developer for the JMP Division of SAS where he implements features for the Design of Experiments platforms in JMP software. |
Ryan Lekivetz JMP Division, SAS Institute Inc. |
Webinar | Session Recording |
Recording | 2020 |
Tutorial Tutorial: Combinatorial Methods for Testing and Analysis of Critical Software and Security Systems (Abstract)
Combinatorial methods have attracted attention as a means of providing strong assurance at reduced cost, but when are these methods practical and cost-effective? This tutorial includes two sections on the basis and application of combinatorial test methods: The first section explains the background, process, and tools available for combinatorial testing, with illustrations from industry experience with the method. The focus is on practical applications, including an industrial example of testing to meet FAA-required standards for life-critical software for commercial aviation. Other example applications include modeling and simulation, mobile devices, network configuration, and testing for a NASA spacecraft. The discussion will also include examples of measured resource and cost reduction in case studies from a variety of application domains. The second part explains combinatorial testing-based techniques for effective security testing of software components and large-scale software systems. It will develop quality assurance and effective re-verification for security testing of web applications and testing of operating systems. It will further address how combinatorial testing can be applied to ensure proper error-handling of network security protocols and provide the theoretical guarantees for detecting Trojans injected in cryptographic hardware. Procedures and techniques, as well as workarounds will be presented and captured as guidelines for a broader audience. |
Rick Kuhn, Dimitris Simos, and Raghu Kacker National Institute of Standards & Technology |
Tutorial |
| 2019 |
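As a small taste of the combinatorial idea behind the tutorial above, the sketch below measures the pairwise (2-way) coverage achieved by a given test suite. The configuration parameters and the tiny test suite are hypothetical; in practice a covering-array generator such as NIST's ACTS tool produces the test suite.

```python
# Measure 2-way (pairwise) coverage of a small, hand-written test suite.
from itertools import combinations, product

parameters = {
    "os": ["linux", "windows"],
    "browser": ["chrome", "firefox", "safari"],
    "network": ["wifi", "wired"],
}
names = list(parameters)

test_suite = [
    {"os": "linux", "browser": "chrome", "network": "wifi"},
    {"os": "windows", "browser": "firefox", "network": "wired"},
    {"os": "linux", "browser": "safari", "network": "wired"},
    {"os": "windows", "browser": "chrome", "network": "wired"},
]

required = {((p1, v1), (p2, v2))
            for p1, p2 in combinations(names, 2)
            for v1, v2 in product(parameters[p1], parameters[p2])}

covered = {((p1, t[p1]), (p2, t[p2]))
           for t in test_suite
           for p1, p2 in combinations(names, 2)}

pct = 100 * len(covered) / len(required)
print(f"2-way coverage: {len(covered)}/{len(required)} pairs ({pct:.0f}%)")
```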
Tutorial Tutorial: Cyber Attack Resilient Weapon Systems (Abstract)
This tutorial is an abbreviated version of a 36-hour short course recently provided by UVA to a class composed of engineers working at the Defense Intelligence Agency. The tutorial provides a definition for cyber attack resilience that is an extension of earlier definitions of system resilience that were not focused on cyber attacks. Based upon research results derived by the University of Virginia over an eight-year period through DoD/Army/AF/Industry funding, the tutorial will illuminate the following topics: 1) a Resilience Design Requirements methodology and the need for supporting analysis tools, 2) a System Architecture approach for achieving resilience, 3) example resilience design patterns and example prototype implementations, 4) experimental results regarding resilience-related roles and readiness of system operators, and 5) Test and Evaluation issues. The tutorial will be presented by UVA Munster Professor Barry Horowitz. |
Barry Horowitz Professor, Systems Engineering University of Virginia |
Tutorial |
| 2019 |
Tutorial Tutorial: Learning Python and Julia (Abstract)
In recent years, the programming language Python, with its supporting ecosystem, has established itself as a significant capability to support the activities of the typical data scientist. Recently, version 1.0 of the programming language Julia has been released; from a software engineering perspective, it can be viewed as a modern alternative. This tutorial presents both Python and Julia from a user and a developer point of view. From a user’s point of view, the basic syntax of each, along with fundamental prerequisite knowledge, is presented. From a developer’s point of view, the underlying infrastructure of the programming language / interpreter / compiler is discussed. |
Douglas Hodson Associate Professor Air Force Institute of Technology |
Tutorial | 2019 |
Tutorial Tutorial: Statistics Boot Camp (Abstract)
In the test community, we frequently use statistics to extract meaning from data. These inferences may be drawn with respect to topics ranging from system performance to human factors. In this mini-tutorial, we will begin by discussing the use of descriptive and inferential statistics, before exploring the basics of interval estimation and hypothesis testing. We will introduce common statistical techniques and when to apply them, and conclude with a brief discussion of how to present your statistical findings graphically for maximum impact. |
Kelly Avery IDA |
Tutorial |
| 2019 |
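To preview the interval-estimation and hypothesis-testing ideas the boot camp covers, here is a short Python sketch of a one-sample t test and 95% confidence interval against a notional requirement. The detection-range data and the 10 km threshold are fabricated for illustration.

```python
# One-sample t test and 95% confidence interval for a detection-range requirement.
import numpy as np
from scipy import stats

ranges_km = np.array([10.2, 9.7, 11.1, 10.5, 9.9, 10.8, 10.3, 9.6, 10.9, 10.4])
threshold = 10.0

mean = ranges_km.mean()
sem = stats.sem(ranges_km)
ci = stats.t.interval(0.95, df=len(ranges_km) - 1, loc=mean, scale=sem)
t_stat, p_value = stats.ttest_1samp(ranges_km, popmean=threshold)

print(f"mean = {mean:.2f} km, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}) km")
print(f"t = {t_stat:.2f}, p = {p_value:.3f} (two-sided, against {threshold} km)")
```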
Tutorial Tutorial: Reproducible Research (Abstract)
Analyses are “reproducible” if the same methods applied to the same data produce identical results when run again by another researcher (or you in the future). Reproducible analyses are transparent and easy for reviewers to verify, as results and figures can be traced directly to the data and methods that produced them. There are also direct benefits to the researcher. Real-world analysis workflows inevitably require changes to incorporate new or additional data, or to address feedback from collaborators, reviewers, or sponsors. These changes are easier to make when reproducible research best practices have been considered from the start. Poor reproducibility habits result in analyses that are difficult or impossible to review, are prone to compounded mistakes, and are inefficient to re-run in the future. They can lead to duplication of effort or even loss of accumulated knowledge when a researcher leaves your organization. With larger and more complex datasets, along with more complex analysis techniques, reproducibility is more important than ever. Although reproducibility is critical, it is often not prioritized either due to a lack of time or an incomplete understanding of end-to-end opportunities to improve reproducibility. This tutorial will discuss the benefits of reproducible research and will demonstrate ways that analysts can introduce reproducible research practices during each phase of the analysis workflow: preparing for an analysis, performing the analysis, and presenting results. A motivating example will be carried throughout to demonstrate specific techniques, useful tools, and other tips and tricks where appropriate. The discussion of specific techniques and tools is non-exhaustive; we focus on things that are accessible and immediately useful for someone new to reproducible research. The methods will focus mainly on work performed using R, but the general concepts underlying reproducible research techniques can be implemented in other analysis environments, such as JMP and Excel, and are briefly discussed. By implementing the approaches and concepts discussed during this tutorial, analysts in defense and aerospace will be equipped to produce more credible and defensible analyses of T&E data. |
Andrew Flack, Kevin Kirshenbaum, and John Haman IDA |
Tutorial |
| 2019 |
Tutorial Tutorial: Developing Valid and Reliable Scales (Abstract)
The DoD uses psychological measurement to aid in decision-making about a variety of issues including the mental health of military personnel before and after combat, and the quality of human-systems interactions. To develop quality survey instruments (scales) and interpret the data obtained from these instruments appropriately, analysts and decision-makers must understand the factors that affect the reliability and validity of psychological measurement. This tutorial covers the basics of scale development and validation and discusses current efforts by IDA, DOT&E, ATEC, and JITC to develop validated scales for use in operational test and evaluation. |
Heather Wojton & Shane Hall IDA / USARMY ATEC |
Tutorial |
| 2019 |
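One routine step in the scale-validation work described above is checking internal consistency; the sketch below computes Cronbach's alpha from item-level responses. The 5-point survey responses are fabricated for illustration and are not from the DoD scales discussed.

```python
# Internal-consistency reliability (Cronbach's alpha) from item-level responses.
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents, n_items) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

responses = np.array([          # fabricated 5-point responses, 6 people x 4 items
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [1, 2, 2, 1],
])
print("Cronbach's alpha:", round(cronbach_alpha(responses), 2))
```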
Tutorial The Bootstrap World (Abstract)
Bootstrapping is a powerful tool for statistical estimation and inference. In this tutorial, we will use operational test scenarios to provide context when exploring examples ranging from the simple (estimating a sample mean) to the complex (estimating a confidence interval for system availability). Areas of focus will include point estimates, confidence intervals, parametric bootstrapping and hypothesis testing with the bootstrap. The strengths and weaknesses of bootstrapping will also be discussed. |
Matt Avery Research Staff Member IDA |
Tutorial | Materials | 2016 |
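A minimal percentile-bootstrap sketch in the spirit of this tutorial: resample with replacement to put a confidence interval on a sample mean. The miss-distance data are fabricated for illustration.

```python
# Percentile bootstrap: resample with replacement to get a 95% CI on the mean.
import numpy as np

rng = np.random.default_rng(0)
miss_distance = np.array([1.2, 0.8, 2.5, 1.9, 0.4, 3.1, 1.1, 0.9, 2.2, 1.6])

boot_means = np.array([
    rng.choice(miss_distance, size=miss_distance.size, replace=True).mean()
    for _ in range(10_000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {miss_distance.mean():.2f}, 95% bootstrap CI = ({lo:.2f}, {hi:.2f})")
```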