Detection Time as a System Response Variable
Many military systems involve software and hardware that enable the detection of an object or event. Common examples include sensors (e.g., radar, sonar, infrared, electro-optical), which enable detection of objects in a complex battlespace. Some systems are meant to fully automate detection, while others facilitate detection by human users. The time it takes to detect an event or object is a critical characteristic of performance. Variables such as time to detect or detection accuracy provide a great deal more information than a binary outcome. Thus, detection time and accuracy are common response variables that must be analyzed for many system tests. In this case study, evaluators examined whether a new software version improved detection timeliness in sonar testing. Whereas this example focuses on timeliness, accuracy analyses often use the same statistical methods because the nature of the data is similar.
A-RCI Analysis Case Study
Acoustic Rapid Commercial-Off-The-Shelf Insertion (A-RCI) is a sonar system upgrade installed across the entire U.S. submarine fleet. It is a sonar processing system that functions as a filter between operators and the constant barrage of incoming sonar data. The system’s purpose is to display processed sonar data and perform automations that help operators quickly detect submarines.
Real-time at-sea testing of the A-RCI involves two submarines simultaneously searching for one another within a given area. The response variable of interest is detection time: the time it takes an operator to detect a submarine once it has become visible on the A-RCI system screens. Because at-sea testing is limited, A-RCI testing has been augmented with lab testing in which hydrophone recordings from real at-sea tests are played back on land through a sonar system replica equipped with A-RCI. The laboratory-based testing allows tests to examine more than one operational environment, submarine type, and operating profile. Additionally, the more controlled test environment allows a direct comparison between software versions, so evaluators can compare the old software to the new software under the same scenarios.
Experimental Design Approach
The laboratory setting allowed for the investigation of several factors’ impact on detection time.
|Factor||Levels||Rationale|
|Submarine Target Type||SSN, SSK||SSNs and SSKs exhibit different acoustic signatures. SSNs typically have more discrete tonal information because of the machinery associated with the nuclear reactor.|
|Array Type||Type A, Type B||Array type A typically detects targets at longer ranges, which would be expected to generate longer detection times.|
|Target Noise||Loud, Quiet||Loud targets are detected at longer ranges, which could lead to longer detection times. Conversely, loud targets typically have more discrete tonal information and are easier to identify, which could result in shorter detection times.|
|Software Build||APB-09, APB-11||The primary goal of the test was to compare the latest version of the sonar system - Advanced Processor Build-11 (APB-11) - with the previous version - APB-09.|
|Operator Proficiency||Experience Rating: 1-20||More proficient operators will detect a submarine more quickly. The numeric scale was developed by the Naval Undersea Warfare Center and is based on an operator's experience with the A-RCI system.|
Detection Time Data Analysis
The goals of the A-RCI analyses were to determine whether the updated system outperformed the legacy system and to screen for other important factors that influenced detection time. Detection time data require extra attention because they do not follow the normal distribution that many standard analyses assume. In fact, most responses involving time are not normally distributed: they are bounded below by zero and have no upper bound, which typically produces a long right tail. This demonstrates the importance of examining data before jumping into any modeling or inferential analyses. The figure below shows that the time data collected here are well fit by a lognormal distribution.
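One quick way to examine the shape of time data before modeling is to compare the skewness of the raw values with the skewness of their logarithms. The sketch below is only an illustration: it uses simulated times rather than A-RCI data, and the lognormal parameters (5.0 and 0.8) and the helper `sample_skewness` are assumptions made for the example.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Return the moment-based sample skewness (near 0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Simulated detection times in seconds (assumption: lognormal, not real data).
rng = random.Random(1)
times = [rng.lognormvariate(5.0, 0.8) for _ in range(500)]
log_times = [math.log(t) for t in times]

skew_raw = sample_skewness(times)      # strongly positive: long right tail
skew_log = sample_skewness(log_times)  # near zero if a lognormal fit is reasonable

print(f"skewness of raw times: {skew_raw:.2f}")
print(f"skewness of log times: {skew_log:.2f}")
```

A large positive skewness that shrinks toward zero after taking logs is consistent with a lognormal shape; a probability plot or formal goodness-of-fit test would be the natural next check.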
When presented with non-normally distributed data, analysts have choices as to how to proceed. One option is to analyze the data using a method that does not assume a normal distribution and is able to fit a number of non-normal distributions. For example, A-RCI data could be modeled by way of a lognormal regression analysis. Another option is to transform the data. In the case of the A-RCI data, the analysts chose to transform the response variable data. After taking the log of each data point, the resulting data met the assumptions of more standard analyses.
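As a sketch of what the log transformation implies, suppose the log of detection time $T_i$ for run $i$ is modeled as a linear function of factors $x_{i1}, \dots, x_{ip}$ with normally distributed error. The symbols here are illustrative and are not taken from the test report.

```latex
\log T_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)

\operatorname{median}(T_i) = \exp\left(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}\right)
```

Under this assumed model, a one-unit change in $x_j$ multiplies the median detection time by $\exp(\beta_j)$, which is why effects estimated on the log scale are naturally reported as percentage changes in median time.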
The analysis of A-RCI data also illustrates another advanced technique: the handling of censored data. In each test run, not every system display of a signal resulted in an operator’s detection of that signal. That is, it is not uncommon for the system to detect and display a submarine, but for the operator to miss it. This creates a problem for the analysis of time because there is no value recorded for missed targets. If these misses are removed from the analysis, the test loses power and the analyst throws away useful information. Right censoring treats each miss as a detection time that is known only to exceed the time at which the tape ended, which helps the analyst preserve information and test power. The analysis assumes the operator would eventually have detected the submarine if given more time.
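The effect of right censoring can be sketched with a small maximum-likelihood example. This is not the analysts' code: it uses simulated lognormal times, an intercept-only model with no factors, a hypothetical 600-second observation window (`WINDOW_S`), and a crude hand-written optimizer (`coordinate_search`) chosen only to keep the example dependency-free. Observed detections contribute a density term to the likelihood, while misses contribute the probability that detection would have occurred after the window closed.

```python
import math
import random


def censored_loglik(params, log_times, detected):
    """Log-likelihood of a normal model for log time with right censoring.

    params is (mu, log_sigma). detected[i] is False when run i ended without
    a detection; log_times[i] is then the log of the time observation stopped.
    """
    mu, log_sigma = params
    sigma = math.exp(log_sigma)
    total = 0.0
    for y, seen in zip(log_times, detected):
        z = (y - mu) / sigma
        if seen:
            # Density contribution for an observed detection time.
            total += -log_sigma - 0.5 * z * z - 0.5 * math.log(2 * math.pi)
        else:
            # Survival contribution: detection would occur after time y.
            survival = 0.5 * math.erfc(z / math.sqrt(2))
            total += math.log(max(survival, 1e-300))
    return total


def coordinate_search(objective, start, step=0.5, tol=1e-6):
    """Crude derivative-free maximizer, adequate for two parameters."""
    best = list(start)
    best_value = objective(best)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                value = objective(trial)
                if value > best_value:
                    best, best_value, improved = trial, value, True
        if not improved:
            step /= 2
    return best


# Simulated data (assumption: lognormal times, fixed observation window).
rng = random.Random(7)
TRUE_MU, TRUE_SIGMA, WINDOW_S = 5.5, 0.8, 600.0
raw = [rng.lognormvariate(TRUE_MU, TRUE_SIGMA) for _ in range(400)]
detected = [t <= WINDOW_S for t in raw]
log_times = [math.log(min(t, WINDOW_S)) for t in raw]

# Naive approach: drop the missed detections entirely.
seen_logs = [y for y, seen in zip(log_times, detected) if seen]
mu_naive = sum(seen_logs) / len(seen_logs)

# Censored approach: keep every run and maximize the likelihood above.
mu_hat, log_sigma_hat = coordinate_search(
    lambda p: censored_loglik(p, log_times, detected), [mu_naive, 0.0]
)

print(f"median, misses dropped: {math.exp(mu_naive):.0f} s")
print(f"median, right-censored: {math.exp(mu_hat):.0f} s")
```

Dropping the misses biases the estimated median downward, because only the faster detections remain in the sample; the censored fit uses the misses as evidence that some detection times are long. In practice a survival-analysis routine from a statistics package would replace the hand-written optimizer and would include the test factors as regressors.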
After transforming and censoring the data, the analysts fit a linear model to determine the impact of several factors on detection time. Several factors emerged as significant (summarized below).
|Term (Factor)||Value and CI†||Effect Description|
|β1 (Software Build)||0.307 ± 0.129||Detection time is shorter for APB-11, by 46%.|
|β2 (Operator Experience Level)||-0.074 ± 0.041||Increased operator proficiency results in shorter detection times. An increase in proficiency of one unit reduces median detection time by 7%.|
|β3 (Submarine Type)||0.359 ± 0.126||Detection time is shorter for SSN targets.|
|β4 (Target Noise)||-0.324 ± 0.125||Detection time is shorter for loud targets.|
|β5 (Hydrophone Array Type)||0.347 ± 0.125||Detection time is shorter for the Type B array.|
|†: Confidence Interval is an 80% Wald interval.|
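The percentages in the effect descriptions follow from back-transforming the log-scale coefficients. The sketch below assumes natural logs and assumes that the two-level software build factor was coded as -1 and +1, so that the two builds differ by 2β1 on the log scale. That coding is not stated in the source; it is an assumption that happens to reproduce the reported 46% figure. The helper `percent_change` is illustrative.

```python
import math


def percent_change(log_difference):
    """Percent change in median time for a given difference on the log scale."""
    return (math.exp(log_difference) - 1.0) * 100.0


# Software build: assumed -1/+1 coding, so the builds differ by 2 * 0.307.
# The negative sign expresses the change when moving from APB-09 to APB-11.
build_change = percent_change(-2 * 0.307)

# Operator experience: continuous factor, effect of a one-unit increase.
experience_change = percent_change(-0.074)

print(f"APB-11 vs APB-09: {build_change:.0f}%")                # prints -46%
print(f"per unit of experience: {experience_change:.0f}%")     # prints -7%
```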
The figure below shows the predicted median detection times based on a regression model fit to the log transformed detection times (blue dots, with 80% confidence intervals shown as vertical lines), along with the median detection times in each group, which are simple summary statistics not based on a regression model (black), and the raw detection times (light blue and red). The model predictions generally agree with the median in each bin, indicating that the relatively simple model provides a good fit to the data. There is, however, notable disagreement between the data median and the model prediction for one bin: quiet SSK targets with array type B in APB-09. The difference is due to sparse data, rather than a poorly fitting model. The data median in this case is based on only three data points and is therefore highly variable, making it a poor estimator of the true performance in that bin. The analysts believe the model estimate predicts the performance that would be observed if additional runs were conducted with APB-09.
This case study of A-RCI testing and analysis illustrates several important points that evaluators should consider. The first is that the precise and detailed information these data provide would not have been available had only a binary response of detect vs. no-detect been recorded. Continuous responses (e.g., detection time) are substantially more informative.
Similarly, information can be retained by censoring data where appropriate. It is helpful to consider whether information is lost by removing non-detects, or events that one can safely assume would eventually occur. Censoring can add power to the analysis and contribute to insightful analyses in other ways as well.
Another key point is that a test’s data may have particular characteristics by default, rather than as a function of the test design. Time is a good example of a measurement that is commonly non-normally distributed. Thus, it is important to check that data meet the assumptions of the planned analysis.
Finally, there are often multiple reasonable approaches to data analysis. The A-RCI data could be analyzed as described above or by fitting a generalized linear model. The two approaches lead to similar results, which strengthens confidence in the conclusions drawn from the data.