The Art of Misdirection: Abbott's Shaky Take on the VRAG-R
Vernon L. Quinsey
Psychology Department, Queen's University
[Sexual Offender Treatment, Volume 12 (2017), Issue 2]
Abbott (2017) makes a variety of criticisms of the use of the Violence Risk Appraisal Guide-Revised (VRAG-R; Harris, Rice, Quinsey, & Cormier, 2015) in sexually violent predator (SVP) hearings. His article uses misdirection to obfuscate and conceal two issues involved in the application of the VRAG-R in SVP cases: the nature of the decision that is to be made and the characteristics of the offenders who are involved.
Keywords: actuarial prediction, violence, VRAG-R, sexual predators
At a time when many scientists worry about the replicability of their results (Open Science Collaboration, 2015), the many individual-level correlates of crime, such as sex and age at first arrest, are routinely found to predict reoffending across time, countries, and various populations, such as psychiatric patients and offenders incarcerated in the criminal justice system (e.g., Skeem, Winter, Kennealy, Louden, & Tatar II, 2014; Witt, van Dorn, & Fazel, 2013). So frequent and ubiquitous are these findings that, should an investigator fail to obtain them, the most parsimonious explanation is that an error has been made.
The robustness of these correlates of reoffending has encouraged their combined use in actuarial risk assessment instruments (e.g., Bonta et al., 1996). These instruments estimate the probability that an assessee will commit at least one re-offense (or re-offense of a particular kind) within a specific time period. All assessments of risk necessarily involve the estimation of a probability (BusinessDictionary, 2017, offers six definitions of risk in various contexts, and all involve probabilities of adverse outcomes). To illustrate: If it is asserted that an offender has a high risk of reoffending, there is no sensible answer to the question "How high?" or "What is 'high?'" without the specification of a probability. Thus, any risk assessment instrument involves a probability estimate, explicit in actuarial instruments and implicit (relying on consumers' intuitive and untutored understanding of the implied probability) in non-actuarial ones.
Risk communication in the VRAG-R is accomplished by dividing the total range of scores into nine equal-sized bins or ranges. Each range is associated with an empirically derived probability of violent (including hands-on sexual) reoffending. This information can be used in titrating the amount of treatment or supervision an offender receives, although percentile ranks or risk ratios may also be used for these purposes in some situations (see Harris, Lowenkamp, & Hilton, 2015).
Examinations of how faithfully the probabilities associated with each bin are replicated in new samples are termed "calibration" studies (Helmus & Babchishin, 2017). Abbott's article (2017) focuses in large part on issues involved in calibration studies of a variety of actuarial instruments. Although calibration studies are relevant to issues of accuracy in risk communication, they are not closely linked to binary decision situations such as SVP adjudications. Because of this, Abbott's calibration discussion is an elaborate misdirection from the task at hand in SVP trials. When a binary decision is to be made, such as when an offender is classified as a sexually violent predator or not, a cut-off score or probability is established that reduces the number of classes, in the case of the VRAG-R, from nine bins to two. The resulting type of decision matrix (in this case, two classes of offender by two classes of outcome) has been studied extensively in psychology for decades (e.g., Tanner & Swets, 1954).
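The collapse from nine bins to a two-by-two decision matrix can be made concrete with a minimal sketch. The per-bin counts below are invented purely for illustration and are not VRAG-R norms; the cut-off bin, the `DecisionMatrix` type, and the function name `decision_matrix` are likewise assumptions introduced for this example.

```python
from typing import NamedTuple


class DecisionMatrix(NamedTuple):
    true_pos: int   # at or above the cut-off, reoffended
    false_pos: int  # at or above the cut-off, did not reoffend
    false_neg: int  # below the cut-off, reoffended
    true_neg: int   # below the cut-off, did not reoffend


# Hypothetical (made-up) counts per bin: (reoffended, did_not_reoffend).
# Bin 1 is the lowest-scoring bin, bin 9 the highest.
HYPOTHETICAL_BINS = [
    (1, 19), (2, 18), (4, 16), (6, 14), (8, 12),
    (10, 10), (13, 7), (16, 4), (18, 2),
]


def decision_matrix(bins, cutoff_bin):
    """Collapse per-bin counts into a 2x2 table.

    Bins numbered >= cutoff_bin are treated as 'above the cut-off'.
    """
    tp = fp = fn = tn = 0
    for number, (reoffended, not_reoffended) in enumerate(bins, start=1):
        if number >= cutoff_bin:
            tp += reoffended
            fp += not_reoffended
        else:
            fn += reoffended
            tn += not_reoffended
    return DecisionMatrix(tp, fp, fn, tn)


# An arbitrary illustrative cut-off: bins 7 to 9 count as "above".
matrix = decision_matrix(HYPOTHETICAL_BINS, cutoff_bin=7)
sensitivity = matrix.true_pos / (matrix.true_pos + matrix.false_neg)
specificity = matrix.true_neg / (matrix.true_neg + matrix.false_pos)
print(matrix, round(sensitivity, 2), round(specificity, 2))
```

Moving `cutoff_bin` up or down trades false positives against false negatives, which is the trade-off that signal detection analyses of such matrices are designed to describe.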
Although not directly relevant to SVP hearings, it should be noted for the sake of completeness that Abbott's (2017) suggestion that VRAG-R bin probabilities are unstable has not been empirically supported by initial investigations - see Gregório Hertz, Rettenberger, and Eher (2016) and Olver (2017) for calibration among Austrian and Canadian sex offenders, respectively.
3. Base Rates
Abbott's (2017) abstract states that "Clinicians find it increasingly difficult to affirm the likelihood threshold in the face of decreasing base rates…" but does not cite the evidence upon which this statement is based or in which population this decline has occurred. Although population crime rates have decreased substantially throughout North America in recent decades, remaining much higher in the U.S. than in Canada (Lalumière, Harris, Quinsey, & Rice, 2005; Mishra & Lalumière, 2009; Zimring, 2006), this decrease may or may not be related to the base rate of reoffending among identified offenders - base rates of recidivism among released offenders can rise, fall, or remain the same despite a falling base rate of offending in the general population. Analyses of follow-up data from temporal cohorts in the VRAG-R data set found no decrease in the rate of sexual or violent reoffending across time (Harris, Rice et al., 2015).
Abbott (2017) spends considerable space arguing that the base rate of sexual recidivism among SVP assessees is likely to be lower than the base rate of sexual or violent reoffending in the VRAG-R construction sample. Of course, the VRAG-R was not designed for SVP hearings, and the base rates of SVP populations may indeed differ from those of the VRAG-R construction sample. The question is whether the VRAG-R base rates are the best available estimate (because an estimate must be made) and whether the SVP base rates are likely to be higher or lower. The outcome of concern in SVP legislation in Washington State, for example (statute 71.09.020), is narrower than the operational definition of violent re-offenses in the VRAG-R research in that the former excludes non-sexually motivated crimes of violence (with several exceptions, including murder and kidnapping). It is broader in including such crimes as sexually motivated residential burglary. More importantly, the relevant time period in SVP legislation is the lifetime of the offender, not a specific time interval as given in the VRAG-R actuarial tables. These differences serve to make the VRAG-R 5- and 12-year probabilities underestimates of the probability of sexual reoffending among those considered for SVP status.
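The time-window part of this argument can be stated as a one-line inequality. The notation is introduced here for illustration only: T is the time to a first re-offense under a fixed outcome definition, F is its cumulative distribution function, and t_life is the offender's remaining lifetime in years, assumed to exceed 12. The sketch addresses only the length of the follow-up window, not the differences in outcome definition discussed above.

```latex
\[
  F(t) = \Pr(T \le t), \qquad
  F(5) \;\le\; F(12) \;\le\; F(t_{\text{life}})
  \quad \text{whenever } 12 \le t_{\text{life}} .
\]
```

Because a cumulative probability cannot decrease as the window lengthens, a fixed-interval estimate can at most equal, and will generally fall below, the lifetime probability for the same outcome.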
4. The VRAG-R Development Sample
The goal of the VRAG-R sampling strategy was to obtain a heterogeneous sample of the kind of offenders who are often assessed for risk of violent (including sexual) reoffending and to show that the VRAG-R worked similarly for the various subsamples. Because of this, very low-risk offenders (despite inclusion of a sample of sex offenders referred from the community) are likely underrepresented. In contrast to Abbott's assertion (2017, p. 5), not all the VRAG-R subjects came from Oak Ridge; sex offenders from a community sample and two samples from federal corrections were also included. Similarly, even the Oak Ridge sample was not composed exclusively of mentally ill offenders: a substantial number were only assessed briefly at Oak Ridge before going directly to federal or provincial corrections.
In contradiction to the statements in Abbott's Section 2.4 (2017), the controversial methods of treatment described in that section, such as "the capsule," were applied to a small number of volunteer patients who were present in the late sixties and early seventies; these offenders make up a minuscule part of the VRAG-R sample. Many of the sex offenders who were treated at Oak Ridge received a cognitive behavioural type treatment. Most of the prison sample of sex offenders received similar treatment. It would make no difference, however, whether the sex offenders received cognitive behavioural treatment or not, because the evaluative literature, including a well-designed random assignment study of cognitive-behavioral treatment of sex offenders (Marques, Wiederanders, Day, Nelson, & Van Ommeren, 2005), does not show that treatment reduces sexual recidivism (Rice & Harris, 2013). Moreover, retrospective evaluations of cognitive behavioral programs for the sex offenders included in the VRAG-R sample (Quinsey, Khanna, & Malcolm, 1998; Rice, Harris, & Quinsey, 1991) found no treatment effect. Finally, treatment variables were included as candidate predictors in the development of the VRAG but none of them made the cut. The assertions concerning treatment in Abbott's article are therefore essentially irrelevant.
Abbott's (2017) arguments purporting to show that the VRAG-R development sample is "special" would only work to support his thesis if they demonstrate that probability estimates based on it are lower than what would be expected in the SVP-eligible assessees. The focus on the alleged weirdness of the VRAG-R sample is another instance of misdirection but a far more spectacular one than that involved in the discussion of calibration. It is not the VRAG-R development sample that is special; it is the SVP-eligible assessees who are very, very special. Each year, about 1000 sex offenders are released in the state of Washington, but only a few are selected for application of the SVP legislation. In fiscal year 2014, for example, only 14 cases (1.4%) were tried in court (Washington State Office of the Attorney General, 2017). These "forensic superstars" are selected according to risk-related criteria such as previous sex crime and other criminal convictions, predatory characteristics (such as seeking out victims unknown to the offender), having used violence in their crime, and having refused or failed to complete sex offender treatment (King County Sexual Assault Resource Center, 2017).
Thus, when the VRAG-R is employed in Washington State SVP hearings, norms developed on a heterogeneous sample of offenders selected to be roughly representative of offenders who are ordinarily assessed for risk of sexual or violent reoffending are applied to an elite group of sex offenders who have been highly pre-selected for risk. One doesn't have to be a statistician to understand the consequences: The anticipated base rate of sexual reoffending in the Sexually Violent Predator assessees is likely much higher than that indicated by the VRAG-R. Even with much less rigorous pre-selection on risk, the score-wise recidivism rates on actuarial scales are much higher than the norms for an unselected sample (Hanson, Thornton, Helmus, & Babchishin, 2016). In a more directly relevant investigation, Milloy (2007) followed 135 Washington State sex offenders who were referred for civil commitment, but not committed, for a six-year period. Milloy concluded that, in comparison to released Washington State sex offenders in general, "this population of released sex offenders who were referred for civil commitment is a unique subgroup with much higher recidivism rates" (p. 7).
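The selection argument can be sketched with simple arithmetic. All counts below are invented for illustration and are not drawn from the VRAG-R data or from Washington State records; the split into "flagged" and "unflagged" offenders is a hypothetical stand-in for any risk-related selection criterion that the actuarial score does not capture.

```python
# Hypothetical strata within a single score bin. Each entry is
# (number_of_offenders, number_who_reoffended). "flagged" stands in for
# offenders who meet additional risk-related selection criteria that the
# actuarial score does not capture. All counts are made up.
STRATA_IN_ONE_BIN = {
    "unflagged": (900, 180),
    "flagged": (100, 50),
}


def rate(strata, keys):
    """Observed reoffending rate pooled over the named strata."""
    total = sum(strata[k][0] for k in keys)
    reoffended = sum(strata[k][1] for k in keys)
    return reoffended / total


# Norm for the bin, as it would be estimated from the whole heterogeneous sample.
bin_norm = rate(STRATA_IN_ONE_BIN, ["unflagged", "flagged"])
# Rate among the pre-selected subgroup that falls in the same score bin.
selected_rate = rate(STRATA_IN_ONE_BIN, ["flagged"])

print(f"bin norm: {bin_norm:.2f}, pre-selected subgroup: {selected_rate:.2f}")
```

In this toy example the bin norm understates the subgroup's rate only because the flag was constructed to carry risk information beyond the score; whether that holds for offenders referred for civil commitment is the empirical question addressed by the follow-up studies cited above.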
5. Allegorical Summary
To make the issues explicated in the foregoing more vivid, I have summarized them using an allegory.
Imagine that you have been captured by an evil overlord of the new corporate order and forced to participate in his hobby reality TV show - "Nouveau Serfs". There is a carnival strength test apparatus on the set. Contestants strike a plate at the bottom with a sledgehammer, attempting to send a metal ball to ring the bell at the top. The strength test apparatus has been imported from Canada and the numbers running from bottom to top are divided into nine equal-sized ranges or bins. The bottom range of numbers is labelled "girly-man", the penultimate, "John Henry", and the top "Thor".
The gamble you have been forced to accept is to bet on which bin a randomly selected local yokel will score in. If you lose, you and your family will labour in the dungeons of a Trumpian tower for less than minimum wage in perpetuity. If you win, you and your family will receive free health care and coupons redeemable for plastic trinkets from popular on-line retailers.
A search of the dark web reveals an actuarial scale that has been secretly developed by Canadian carnival gamblers; it contains variables such as height, weight, and number of arms (scored zero, one, or two). Given the high stakes, you consider the scale carefully. What if the calibration of the apparatus is different for Canadian men than men in Washington State? Your mind conjures up visions of burly Canadian lumberjacks driven near mad by the incessant biting of the Northern Ontario blackfly and effete man-bunned hippies sipping kale lattes in upscale Seattle coffee houses.
Just before you place your bet, you learn that the customary crowd of local yokels has been replaced by the weight-lifting team from the University of Washington and you are now to bet whether the contestant will score in the top third or bottom two-thirds of the bins. You start wondering how many plastic baubles you can fit into your trailer.
- Abbott, B.R. (2017). Sexually violent predator risk assessments with the Violence Risk Appraisal Guide-Revised: A shaky practice. International Journal of Law and Psychiatry, 52, 62-73. doi:10.1016/j.ijlp.2017.03.003
- Bonta, J., Harman, W.G., Hann, R.G., & Cormier, R.B. (1996). The prediction of recidivism among federally sentenced offenders: A re-validation of the SIR Scale. Canadian Journal of Criminology, 38, 61-79.
- BusinessDictionary. (2017). "Risk" definitions. Retrieved from www.businessdictionary.com/definition/risk.html.
- Gregório Hertz, P., Rettenberger, M., & Eher, R. (2016). A cross-validation of the VRAG-R using a sexual offender sample from Austria. Presented at the International Association for the Treatment of Sex Offenders Conference, Copenhagen.
- Hanson, R. K., Thornton, D., Helmus, L. M., & Babchishin, K. M. (2016). What sexual recidivism rates are associated with Static-99R and Static-2002R scores? Sexual Abuse, 28, 218-252. doi:10.1177/1079063215574710
- Harris, G.T., Lowenkamp, C.T., & Hilton, N.Z. (2015). Evidence for risk estimate precision: Implications for individual risk communication. Behavioral Sciences and the Law, 33, 111-127. doi: 10.1002/bsl.2158
- Harris, G.T., Rice, M.E., Quinsey, V.L., & Cormier, C. (2015). Violent offenders: Appraising and managing risk (3rd ed.). Washington, DC: American Psychological Association.
- Helmus, L. M., & Babchishin, K. (2017). Primer on risk assessment and the statistics used to evaluate its accuracy. Criminal Justice and Behavior, 44, 8-25. doi:10.1177/0093854816678898
- King County Sexual Assault Resource Center. (2017). Classification of sex offenders: Frequently asked questions. Retrieved from www.k12.wa.us/safetycenter/Offenders/pubdocs/FAQonClassification.pdf.
- Lalumière, M. L., Harris, G. T., Quinsey, V. L., & Rice, M. E. (2005). The causes of rape: Understanding individual differences in the male propensity for sexual aggression. Washington, DC: American Psychological Association.
- Marques, J. K., Wiederanders, M., Day, D. M., Nelson, C., & Van Ommeren, A. (2005). Effects of a relapse prevention program on sexual recidivism: Final results from California's Sex Offender Treatment and Evaluation Project (SOTEP). Sexual Abuse: A Journal of Research and Treatment, 17, 79-107. doi:10.1177/107906320501700108
- Mishra, S., & Lalumière, M. L. (2009). Is the crime drop of the 1990s in Canada and the USA associated with a general decline in risky and health-related behavior? Social Science and Medicine, 68, 39-48. doi:10.1016/j.socscimed.2008.09.060
- Milloy, C. (2007). Six-year follow-up of 135 released sex offenders recommended for commitment under Washington's Sexually Violent Predator Law, where no petition was filed. Olympia, WA: Washington State Institute for Public Policy.
- Olver, M. (2017). Cross-validation, calibration, and risk communication using the VRAG-R. To be presented at the meeting of the Association for the Treatment of Sex Offenders Conference. Kansas City.
- Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349, 1-8. doi:10.1126/science.aac4716
- Quinsey, V.L., Khanna, A., & Malcolm, B. (1998). A retrospective evaluation of the Regional Treatment Centre Sex Offender Treatment Program. Journal of Interpersonal Violence, 13, 621-644. doi:10.1177/088626098013005005
- Rice, M.E. & Harris, G.T. (2013). Treatment for adult sex offenders: May we reject the null hypothesis? In K. Harrison & B. Rainey (Eds.), The Wiley-Blackwell handbook of legal and ethical aspects of sex offender treatment and management (pp. 217-235). New York, NY: Wiley Blackwell.
- Rice, M.E., Harris, G.T., & Quinsey, V.L. (1991). Evaluation of an institution-based treatment program for child molesters. Canadian Journal of Program Evaluation, 6, 111-129.
- Skeem, J. L., Winter, E., Kennealy, P. J., Louden, J. E., & Tatar II, J. R. (2014). Offenders with mental illness have criminogenic needs, too: Toward recidivism reduction. Law and Human Behavior, 38, 212-224. doi:10.1037/lhb0000054
- Tanner, W.P. & Swets, J.A. (1954). A decision-making theory of visual detection. Psychological Review, 61, 401-409. doi:10.1037/h0058700
- Washington State Office of the Attorney General. (2017). Sexually violent predators. Retrieved from www.atg.wa.gov/sexually-violent-predators.
- Witt, K., van Dorn, R., & Fazel, S. (2013). Risk factors for violence in psychosis: Systematic review and meta-regression analysis of 110 studies. PLoS ONE, 8, e55942. doi:10.1371/journal.pone.0055942
- Zimring, F. E. (2006). The great American crime decline. Oxford: Oxford University Press.
Maaike Helmus, Brian Judd, and Martin Lalumière provided helpful comments and encouragement in the preparation of this article.
I examine and provide examples of the persuasion techniques employed in Abbott's (2017) opinion piece. Misdirection is the most common of these, although misinformation also plays a role.
Vernon L. Quinsey
Kingston, Ontario, Canada, K7L 3N6