Expert Probability Elicitation Tools

Principal Investigator: Stephen C. Hora

Abstract:

This project has focused on the development of elicitation methods for subject matter experts (SMEs) that support quantification of risk models for terrorist activities. The work for year six of CREATE entails the creation of elicitation methods for split fractions, the analysis of aggregation methods for the judgments of multiple experts, and the development of optimal weighting for linear combinations of expert judgments. SMEs have traditionally played a key role in the quantification of risk models. They are used for this purpose when data are sparse, when results from analogs must be interpreted, when there is conflicting evidence, and when conditions are changing. All of these circumstances apply to the analysis of terrorist threats. Developing tools to support probability elicitation can lead to more efficient use of experts, to more accurate estimates, and to a better understanding of the rationales underpinning judgments. Three major studies were completed this year. The first study was inspired by a project for NBACC and has gained considerable interest at RMA. In this study, we examine how we might elicit uncertainty distributions over the fraction of times each of m mutually exclusive categories appears. For example, the categories may be biological agents and the fractions may be the relative frequencies of terrorist biological attacks that employ each of the m agents. Of course, the relative frequencies must add to one, which complicates the probability elicitation. Four distinct elicitation strategies were developed and demonstrated. A second major study examined the aggregation of multiple expert judgments given as density functions. The expert judgments were modeled as being well calibrated and were combined using both arithmetic averaging and geometric averaging as well as an analogue of a likelihood function.
The effects of changes in expertise, number of experts, and dependence among experts were evaluated by examining their impact on calibration, sharpness, and the expected Brier score. It was found that under ideal circumstances, the geometric likelihood model outperforms arithmetic aggregation. However, small departures from these ideal conditions, such as dependence among experts, quickly degrade the quality of the aggregation. The conclusion is that arithmetic averaging may be a safer method of aggregation. The third study extends the results of the second study and examines the possibility of optimally weighting experts based on their expertise and calibration. A significant finding is that optimal weights for linear combinations of experts can be found through the solution of a quadratic program. Implementation of these findings was shown to have a significant impact on the quality of aggregated judgments. However, the information requirements to implement such a scheme are also significant and will require a rethinking of the elicitation process.
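
The contrast between arithmetic and geometric averaging of expert densities can be illustrated with a minimal sketch. The abstract does not give the formulas used in the study, so the code below assumes the common textbook definitions: a weighted arithmetic mean of the densities (a linear opinion pool) and a renormalized weighted geometric mean (a logarithmic opinion pool). The two normal experts, the equal weights, the grid, and all function names are hypothetical choices made for illustration only. The likelihood-based model and the quadratic program for optimal weights mentioned in the abstract are not reproduced here.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def make_grid(lo, hi, n):
    """Return n equally spaced points on [lo, hi] and the spacing."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)], step

def normalize(values, step):
    """Rescale values so their Riemann sum over the grid equals one."""
    total = sum(values) * step
    return [v / total for v in values]

def arithmetic_pool(densities, weights, step):
    """Weighted arithmetic mean of densities (assumed linear opinion pool)."""
    n = len(densities[0])
    pooled = [sum(w * d[i] for w, d in zip(weights, densities)) for i in range(n)]
    return normalize(pooled, step)

def geometric_pool(densities, weights, step):
    """Renormalized weighted geometric mean (assumed logarithmic opinion pool)."""
    n = len(densities[0])
    pooled = [
        # The floor of 1e-300 guards against log(0) in the far tails.
        math.exp(sum(w * math.log(max(d[i], 1e-300)) for w, d in zip(weights, densities)))
        for i in range(n)
    ]
    return normalize(pooled, step)

def std_dev(grid, density, step):
    """Standard deviation of a gridded density, used here as a sharpness proxy."""
    mean = sum(x * p for x, p in zip(grid, density)) * step
    var = sum((x - mean) ** 2 * p for x, p in zip(grid, density)) * step
    return math.sqrt(var)

# Two hypothetical experts who disagree on location but not on spread.
grid, step = make_grid(-10.0, 10.0, 2001)
expert_a = [normal_pdf(x, -1.0, 1.0) for x in grid]
expert_b = [normal_pdf(x, 1.0, 1.0) for x in grid]
weights = [0.5, 0.5]  # equal weights, an illustrative assumption

arith = arithmetic_pool([expert_a, expert_b], weights, step)
geo = geometric_pool([expert_a, expert_b], weights, step)

sd_arith = std_dev(grid, arith, step)
sd_geo = std_dev(grid, geo, step)
print("arithmetic pool std dev:", round(sd_arith, 3))
print("geometric pool std dev:", round(sd_geo, 3))
```

For these two hypothetical experts the geometric pool is the narrower (sharper) of the two combined densities, while the arithmetic pool spreads its mass across both experts' opinions. This sketch shows only that difference in sharpness. It does not simulate calibration, the expected Brier score, or the effect of dependence among experts that the study evaluated.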