1 F. Supp. 3d 1124 (D. Haw. 2014)

United States v. Williams

United States District Court, D. Hawai‘i.

Mar 6, 2014

Summary

finding that, “in the end, the exact wording of the various standards makes little substantive difference,” and that “with the recent release of the [DSM-V] ... the Court need not decide which definition of prong two is preferable or correct, because the differences between them are mostly theoretical”

Summary of this case from United States v. Wilson

Opinion

Criminal No. 06–00079 JMS–KSC.

2014-03-06

UNITED STATES of America, Plaintiff, v. Naeem J. WILLIAMS, Defendant.

Steven D. Mellin, U.S. Department of Justice, Washington, DC, Darren W.K. Ching, Office of the United States Attorney, Honolulu, HI, for Plaintiff. Barry D. Edwards, Kaneohe, HI, Michael N. Burt, Law Office of Michael Burt, John Timothy Philipsborn, San Francisco, CA, for Defendant.

J. MICHAEL SEABRIGHT

ORDER DENYING DEFENDANT NAEEM WILLIAMS' MOTION FOR PRETRIAL DETERMINATION THAT THE DEATH PENALTY CANNOT BE CARRIED OUT AGAINST NAEEM WILLIAMS BECAUSE OF A DISQUALIFYING MENTAL CAPACITY WITHIN THE MEANING OF 18 U.S.C. § 3596(c) AND ATKINS v. VIRGINIA, 536 U.S. 304 (2002)

J. MICHAEL SEABRIGHT, District Judge.

I. INTRODUCTION

The United States has charged Defendant Naeem Williams (“Defendant” or “Williams”) with crimes that qualify him for possible imposition of the death penalty under 18 U.S.C. §§ 3591 & 3592. Defendant has moved pursuant to 18 U.S.C. § 3596(c) and Atkins v. Virginia, 536 U.S. 304, 122 S.Ct. 2242, 153 L.Ed.2d 335 (2002), for a pretrial determination that the death penalty cannot be carried out against him because of a disqualifying mental capacity, i.e., that he is “intellectually disabled.” Doc. No. 2064. The court has analyzed the extensive evidence taken during nine days of testimony in September and December 2013 (as well as other evidence in the record specifically proffered by the parties), and has carefully considered the written arguments filed by both sides. Based on the following, the court concludes that Defendant has failed to prove by a preponderance of the evidence that he has such a disqualifying condition. Accordingly, Defendant's Motion is DENIED.

Both 18 U.S.C. § 3596(c) and Atkins preclude imposing the death penalty on the “mentally retarded.” The clinical term for “mental retardation” is now “intellectual disability.” See, e.g., Pizzuto v. Blades, 729 F.3d 1211, 1214 n. 1 (9th Cir.2013) (citing Robert L. Schalock et al., The Renaming of Mental Retardation: Understanding the Change to the Term Intellectual Disability, 45 Intell. & Dev. Disabilities 116, 116–17 (2007)); see also Rosa's Law, Pub. L. No. 111–256, 124 Stat. 2643 (2010) (“An Act to change references in Federal law to mental retardation to references to an intellectual disability, and change references to a mentally retarded individual to references to an individual with an intellectual disability.”). Given the language of § 3596(c) and Atkins, however, many authorities continue to use the term “mental retardation” for consistency in relation to Atkins issues. Nevertheless, to conform to current clinical practice, the court here primarily refers to the relevant condition as “intellectual disability.” For present purposes, the terms are synonymous, referring to precisely the same condition.

As described below, some of the evidence relevant to this Atkins proceeding was developed in other proceedings in this case before Judge David Alan Ezra (who previously presided over this case, but who now serves by designation in the Western District of Texas upon taking Senior status). In particular, Judge Ezra held hearings under Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993), in 2008 and 2012 regarding expert testimony for use at a guilt phase of trial and in assessing Defendant's competency to stand trial. (This action was reassigned to this court, Judge J. Michael Seabright, on March 2, 2013. Doc. No. 1966.) And as part of the current Atkins proceeding, the parties have designated transcripts of certain testimony from those 2008 and 2012 proceedings. At a January 17, 2014 status conference, the parties confirmed that this court should consider only the portions of those transcripts that they specifically refer to in their written argument. That is, the evidentiary record for this Atkins proceeding is confined to testimony and exhibits admitted at the September and December 2013 hearings, as well as prior testimony specifically cited in the parties' memoranda. Of course, this does not preclude the court from reviewing other portions of the docket as necessary to put matters into the proper context.

The court recognizes that other proceedings in this case concern “borderline intellectual functioning” (“BIF”), and that some expert witnesses have opined in various degrees on both BIF and intellectual disability. This Order concerns only the Atkins question; any issues regarding BIF are controlled by other Orders.

The court first explains the relevant procedural background leading to the Atkins hearings, and summarizes the witnesses who testified in September and December 2013 (and in prior related proceedings in this case). The substance of the evidence, however, is best understood in light of the applicable legal and clinical standards. The court thus analyzes the specific testimony and evidence in the Analysis section of this Order, after examining the relevant standards in the Discussion section.

II. PROCEDURAL BACKGROUND

A. Charges Against Defendant

The Second Superseding Indictment (“Indictment”) charges Defendant with two capital-eligible Counts arising out of his role in allegedly beating and killing his five-year-old daughter. Specifically, Count One charges Defendant with first degree felony murder, in violation of 18 U.S.C. §§ 7 & 1111. Doc. No. 1004, Indictment at 2. It alleges that on July 16, 2005, Defendant, with malice aforethought, unlawfully killed a child, in the perpetration of child abuse, at Wheeler Army Airfield. Id. Count Two charges Defendant with first degree felony murder, and aiding and abetting first degree felony murder, in violation of 18 U.S.C. §§ 7 & 1111. It alleges that sometime after December 13, 2004, and culminating on July 16, 2005, Defendant and his wife, Delilah Williams, with malice aforethought, unlawfully killed, and aided and abetted each other in the killing of, a child in the perpetration of a pattern and practice of assault and torture against a child. Id. at 3.

The Indictment contains a Notice of Special Findings section, alleging mental state eligibility factors and statutory aggravating factors under 18 U.S.C. §§ 3591(a) & 3592(c). In particular, it alleges that Defendant:

a. intentionally inflicted serious bodily injury that resulted in the death of Talia Williams (18 U.S.C. § 3591(a)(2)(B));

b. intentionally and specifically engaged in an act of violence, knowing that the act created a grave risk of death to a person, other than one of the participants in the offense, such that participation in the act constituted a reckless disregard for human life and the victim, Talia Williams, died as a direct result of the act (18 U.S.C. § 3591(a)(2)(D));

c. committed the offense charged in the indictment in an especially heinous, cruel, and depraved manner in that it involved torture and serious physical abuse to the victim, Talia Williams (18 U.S.C. § 3592(c)(6)); and

d. committed the offense charged in the indictment against a victim, Talia Williams, who was particularly vulnerable due to her youth (18 U.S.C. § 3592(c)(11)).
Id. at 4.

At the time of the alleged crimes, Defendant was a Specialist (enlisted rank of E–4) on active duty in the United States Army, stationed at Schofield Barracks in Wahiawa, Hawaii. See, e.g., Gov't's Ex. 3, Denney Rpt. at 6. Federal jurisdiction arises because the alleged crimes occurred “within the special maritime and territorial jurisdiction of the United States, to wit, Wheeler Army Airfield[.]” Doc. No. 1004, Indictment at 2.

B. Prior Expert Witness Testimony and Evidence

On November 9, 2007 and April 12, 2008, Defendant filed Notices of Expert Evidence of a Mental Condition pursuant to Federal Rule of Criminal Procedure 12.2(b). Doc. Nos. 416 & 554. By these Notices, Defendant indicated that he “intends [to] introduce expert evidence relating to mental condition bearing on (1) the issue of guilt during the guilt trial and (2) on the issue of punishment during any penalty hearing in this capital case [.]” Doc. No. 554, Def.'s Notice at 1. In this regard, several expert witnesses—clinical and social psychologists, neuropsychologists, and psychiatrists—had previously been or were later retained and proffered opinions as to (among other matters) Defendant's mental condition as related to his capacity to form the requisite intent charged in the Indictment. Specifically, Defendant has claimed he is or was suffering from “borderline intellectual functioning” (“BIF”) (a distinct, although perhaps related, issue from the Atkins question presently before the court). As described in a prior Order, BIF is a condition (or description of a condition) the existence of which might be relevant in understanding whether Defendant had the necessary “mens rea” as charged in the Indictment. See, e.g., Doc. No. 780, Order Denying Government's Amended Motion To Exclude the Defendant's Mental Health Expert Witnesses at the Guilt–Phase (“Guilt Phase Order”) at 18–19 (Feb. 20, 2009) (Ezra J.).

Rule 12.2(b) provides in pertinent part:
If a defendant intends to introduce expert evidence relating to a mental disease or defect or any other mental condition of the defendant bearing on either (1) the issue of guilt or (2) the issue of punishment in a capital case, the defendant must—within the time provided for filing a pretrial motion or at any later time the court sets—notify an attorney for the government in writing of this intention and file a copy of the notice with the clerk.

In July and October 2008, the government filed Motions seeking to exclude Defendant's mental health expert witnesses at the guilt phase, and requested hearings under Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993). See Doc. Nos. 639 & 715. Accordingly, Judge Ezra conducted a Daubert evidentiary proceeding from November 3 through 6, 2008, and ultimately denied the government's request to exclude Defendant's witnesses at the guilt phase. See Doc. No. 780, Guilt Phase Order at 47. As part of these 2008 Daubert proceedings, Defendant proffered testimony and opinions from Dr. Myla Young (a neuropsychologist) and Dr. Pablo Stewart (a clinical psychiatrist). They opined, among other things, that Defendant suffers from BIF and brain damage which impairs his ability to understand and adapt to stressful situations. Id. at 4. The government responded with Dr. Phillip Resnick (a forensic psychiatrist) and Dr. Harold Hall (a psychologist and forensic neuropsychologist), who critiqued Drs. Young and Stewart's diagnoses and methodology. Id. Some of the evidence from this 2008 proceeding is relevant to Atkins issues, as explained further below.

As a result of testimony during the November 2008 hearings, a question arose regarding Defendant's competency to stand trial. On March 9, 2009, Judge Ezra issued an Amended Order Granting the Government's Motion for: (1) a Hearing to Determine the Mental Competency of the Defendant to Stand Trial; and (2) a Psychiatric and Psychological Examination of the Defendant. Doc. No. 796. That Order led to examinations of Defendant in 2009 by United States Bureau of Prisons Drs. Elizabeth Tyner and Lea Ann Preston Baecht, and a June 2009 Forensic Report by Dr. Preston Baecht. See, e.g., Doc. No. 826, Order Re. Competency Rpt.; Doc. No. 2065–5, Tr. June 25, 2012 (Dr. Tyner) at 27–28. On August 31, 2009, Defendant was found competent to stand trial. See Doc. Nos. 859 (oral ruling), 865 (written order).

For various reasons, the competency proceedings eventually led to Defendant's May 29, 2012 Motion to Exclude or Limit Testimony of Dr. Preston Baecht or Other Competence–Related Examiners. Doc. No. 1853. As a result, Judge Ezra held Daubert evidentiary hearings in June and August 2012, during which the court heard testimony from Drs. Tyner, Preston Baecht, and defense witness Dr. Kyle Boone. These 2012 Daubert hearings, like the 2008 Daubert hearings, resulted in some evidence also relevant to Atkins issues—and the parties have also proffered specific testimony from these 2012 proceedings for the court's consideration here.

Meanwhile, the government retained a neuropsychologist, Dr. Diana Goldstein, as a rebuttal witness as to Defendant's BIF theory. As part of her duties, Dr. Goldstein evaluated Defendant and conducted various neuropsychological tests in 2010. Dr. Goldstein's opinions were not proffered in this Atkins proceeding, but the parties have agreed that “raw test scores reported by Dr. Diana Goldstein, including Dr. Goldstein's WAIS–IV scores, and her neuropsychological assessment scores, can be used by the parties' mental health experts as bases for opinions on Mr. Williams's intellectual functioning and mental condition at the time of testing.” Doc. No. 2176, Stipulation Concerning Dr. Diana Goldstein's Data and Scoring Opinions at 2. The parties also stipulated that Dr. Goldstein's actual opinions as reflected in her reports shall not be the basis for any opinion by another expert; that is, the parties were permitted to consider Dr. Goldstein's test results but not rely on her opinions, interpretations, or analysis of those results. Id. at 2–3. Accordingly, some of Dr. Goldstein's 2010 test results are in this Atkins record, and were considered by witnesses from both sides.

C. Atkins Evidentiary Hearings

On July 3, 2013, Defendant filed his “Motion for a Pretrial Determination That the Death Penalty Cannot Be Carried out Against Naeem Williams Because of a Disqualifying Mental Capacity Within the Meaning of 18 U.S.C. § 3596(c), and Atkins v. Virginia.” Doc. No. 2064. The government filed its initial Opposition on October 4, 2013. Doc. No. 2188.

Meanwhile, on September 9, 2013, this court heard testimony from Dr. Thomas Oakland (a proposed government rebuttal witness) during further Daubert proceedings as to the propriety of his (and other witnesses') possible testimony during the guilt phase of the trial. The parties agreed that Dr. Oakland's testimony is also relevant towards Atkins issues, and subsequently stipulated to submit Dr. Oakland's September 9, 2013 testimony as it was given in that Daubert hearing as part of the government's case in the Atkins proceeding. See Doc. No. 2246, Tr. Dec. 13, 2013 (Dr. Oakland) at 172. Dr. Oakland's September 9, 2013 testimony was thus designated as part of the Atkins record. See Def.'s Ex. F; Gov't's Ex. 47.

The court held Atkins hearings on December 3, 2013; on December 11–13, 2013; and on December 16–19, 2013. On those days, the court admitted evidence and heard testimony for Defendant from Drs. Joette James, Kyle Boone, and George Woods. And for the government, the court heard testimony from Drs. Robert Denney and Linda Gottfredson. Post-hearing Briefs were filed on January 13, 2014, Doc. Nos. 2279, 2281, and Replies were filed on January 17, 2014. Doc. Nos. 2285, 2286.

As set forth above, Dr. Boone also testified in 2012 in Daubert proceedings before Judge Ezra.

To summarize, in all, the court considered evidence and testimony from the following (including the witnesses who testified in 2008 and 2012):

1. Dr. Myla Young

Dr. Young did not testify during the Atkins proceedings, and the court understands that she passed away in late 2013.

Dr. Young was retained as an expert witness for Defendant. As described in the 2009 Guilt Phase Order:

Dr. Young earned her Ph.D. from what was formerly known as the California School of Professional Psychology in 1988. She was licensed in the State of California as a psychologist in 1990, and is certified by the American Board of Professional Neuropsychology. Dr. Young focuses on neuropsychological assessment and has spent a significant period of her career on the staff of the California Department of Mental Health Program at the Correctional Medical Facility in Vacaville, California. As a result, Dr. Young has experience in assessment of individuals within the correctional setting and has participated as a principal investigator in several studies conducted by the California Department of Health.
Doc. No. 780, Guilt Phase Order at 10. She was accepted by the court as an expert in neuropsychology. Doc. No. 2065–2, Tr. Nov. 4, 2008 (Dr. Young) at 30.

Over a period of approximately five days beginning in January 2006, Dr. Young administered a variety of neuropsychological and psychological tests and assessment tools to assess Defendant's neural functioning.... These procedures tested, among other things, Defendant's intellectual functioning, his motor, attention, memory and learning skills, and his executive functioning. Among these tests were the [WAIS–III] and the Test of Non–Verbal Intelligence (“TONI–3”), which have been generally accepted as reliable and valid measures of intelligence. [Atkins, 536 U.S. at 309 n. 5, ––– S.Ct. at ––––] (describing WAIS–III as “the standard instrument in the United States for assessing intellectual functioning”); [Doc. No.2065–2,] Nov. 4[, 2008] Tr. at 64:8–13 (Dr. Young stating “[t]he most frequently used test of nonverbal IQ is the TONI–3”).
Doc. No. 780, Guilt Phase Order at 11.

2. Dr. Pablo Stewart

Defendant presented Dr. Stewart in 2008, also to opine regarding BIF and its effect, if any, on Defendant's capacity to form the mens rea charged in the Indictment. See, e.g., id. at 33. As set forth in the Guilt Phase Order, Dr. Stewart is:

a physician licensed to practice medicine in California and Hawaii who is board certified in psychiatry by the American Board of Psychiatry and Neurology. Dr. Stewart completed his medical and psychiatric training at the University of California in 1986 and has practiced in a number of settings, including jails, jail psychiatric hospitals, and Veterans' Administration hospitals. Dr. Stewart has qualified as an expert in seven federal courts and several state courts.
Id. at 31–32. In November 2008, the court qualified Dr. Stewart as an expert in psychiatry. Doc. No. 2065–1, Tr. Nov. 3, 2008 (Dr. Stewart) at 36.

Dr. Stewart interviewed Defendant for a total of 13 to 14 hours over a period of more than two years and reviewed anecdotal records, including interviews of family members, social history, and an interview with Delilah [Williams]. Dr. Stewart also reviewed some of the testing conducted by Dr. Young.
Doc. No. 780, Guilt Phase Order at 32–33. Dr. Stewart did not testify at the Atkins proceeding, although he was present in court for many of the hearings—Defendant notified the court on December 11, 2013 of Dr. Stewart's presence so as to observe testimony of various witnesses (and the court recognized his attendance on that day and on other days), and the government did not object to his presence. Doc. No. 2259, Tr. Dec. 11, 2013 at 5.

3. Dr. Phillip Resnick

Dr. Resnick is a board-certified psychiatrist and professor of psychiatry at Case Western Reserve University. Doc. No. 2065–3, Tr. Nov. 5, 2008 (Dr. Resnick) at 104–05. He was retained by the government, and qualified by the court in November 2008 as an expert in forensic psychiatry. Id. at 107. He offered an opinion during the 2008 Daubert hearings on Drs. Young and Stewart's diagnosis and methodology. Doc. No. 780, Guilt Phase Order at 4.

4. Dr. Harold Hall

Dr. Hall was retained by the government as a rebuttal witness, and opined in 2008 on testimony or procedures of Drs. Young and Stewart as discussed above. He has a doctorate in clinical psychology, and is board-certified in three areas: clinical psychology, forensic psychology, and neuropsychology. Doc. No. 2065–4, Tr. Nov. 6, 2008 (Dr. Hall) at 5–6. He has been qualified as an expert witness “several hundred times since the late 1970's” in state, military, and federal courts. Id. at 7. The court qualified him in November 2008 as an expert in forensic neuropsychology. Id.

5. Dr. Lea Ann Preston Baecht

Dr. Preston Baecht performed a comprehensive competency examination (assisted by Dr. Tyner), and prepared a forensic report for the court for competency purposes in 2009. She has a doctorate degree in clinical psychology, and is a licensed psychologist in the State of Indiana. See Doc. No. 1928, Order Denying Def.'s Mot. to Exclude or Limit Testimony at 10–11. She is board certified in forensic psychology, and has been employed full-time as a clinical psychologist in the forensic evaluation unit at the United States Medical Center for Federal Prisoners (“MCFP”) of the Bureau of Prisons (“BOP”) in Springfield, Missouri since 2000. Id. Among other duties, she conducts court-ordered forensic evaluations to address competency, responsibility, the need for mental health treatment, and dangerousness of defendants or prisoners. Id. at 11. The court qualified her in August 2012 as an expert in forensic psychology. Doc. No. 2065–6, Tr. Aug. 7, 2012 (Dr. Preston Baecht) at 14–15.

6. Dr. Elizabeth Tyner

Dr. Tyner obtained her doctorate in clinical psychology, with a focus in forensic psychology, in 2008. See Doc. No. 1928, Order Denying Def.'s Mot. to Exclude or Limit Testimony at 10. She was licensed in the State of Washington as a psychologist in October 2009. Id. She completed a clinical internship and postdoctoral residency at the MCFP in 2009, and has worked there since 2009 as a clinical psychologist. During her postdoctoral residency, Dr. Tyner conducted (with assistance of, or supervision by, Dr. Preston Baecht) a WAIS–III intelligence test of Defendant. See Doc. No. 2065–5, Tr. June 25, 2012 (Dr. Tyner) at 27–28. The court qualified Dr. Tyner in June 2012 as an expert in clinical psychology. Id. at 26.

7. Dr. Thomas Oakland

As described above, the government submitted Dr. Oakland's September 9, 2013 Daubert testimony as rebuttal in this Atkins proceeding. Dr. Oakland has a doctorate in educational psychology, and has over forty years of experience as, among other positions, a professor of educational psychology at the Universities of Texas and Florida. Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 19. Among other areas, Dr. Oakland has an academic and clinical background working with the intellectually disabled. Id. at 21. He, along with Dr. Patti Harrison, developed a standardized test for assessing adaptive behavior, the Adaptive Behavior Assessment System (“ABAS”). Id. The ABAS–II is now one of the generally accepted methods of measuring adaptive functioning. Id. at 21–22. On September 9, 2013, this court qualified Dr. Oakland as an expert in the assessment of adaptive functioning, id. at 18, and admitted portions of his October 2011 report (as amended on June 22, 2012) that relate to adaptive functioning. Id. at 20–21; Gov't's Ex. 59.

8. Dr. Joette James

Dr. James is a board-certified (April 2013) clinical neuropsychologist with a doctorate degree in clinical psychology. Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 6, 8, 11, 112. She is employed by Children's National Medical Center in Washington D.C., and is an assistant professor in the Departments of Pediatrics and Psychiatry and Behavioral Sciences, at the George Washington University Medical Center. Def.'s Ex. 1003. She has testified in several capital cases, including opining on Atkins matters. Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 113. On December 11, 2013, this court permitted Dr. James to testify as an expert neuropsychologist, and in the field of assessment of neuropsychology and intelligence. Id. at 17.

Among other matters, Dr. James administered a formal neuropsychological evaluation of Defendant in February and April 2013, which included a Stanford–Binet Intelligence Scales (Fifth Edition) (“SB–V”) intelligence test. See Def.'s Exs. 1005, 1006. During the Atkins hearings, Defendant submitted Dr. James' “Summary Report of Neuropsychological Evaluation—Revised,” Def.'s Ex. 1004; a “Neuropsychological Evaluation Test Summary,” Def.'s Ex. 1005; and a November 2013 “Supplement Report of Evaluations,” Def.'s Ex. 1006.

9. Dr. George Woods

Dr. Woods is a board-certified psychiatrist who also teaches at the University of California Berkeley School of Law and at Morehouse School of Medicine. Doc. No. 2262, Tr. Dec. 16, 2013 (Dr. Woods) at 6–7. He has over thirty years of experience as a clinical psychiatrist, neuropsychiatrist, and forensic psychiatrist. Id. at 6, 8–10. He has performed over forty Atkins examinations, and testified as an expert witness in approximately seven cases. Id. at 12. On December 16, 2013, Dr. Woods was permitted to testify as an expert in forensic neuropsychiatry and intellectual disabilities, id. at 19, and Defendant submitted Dr. Woods' July 29, 2013 Report opining on Defendant's mental capacity. Def.'s Ex. 1002.

10. Dr. Kyle Boone

Dr. Boone is a board-certified clinical neuropsychologist and a professor. Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. Boone) at 130, 133. She was formerly affiliated with Harbor–UCLA Medical Center in California, and is now a clinical professor in the Department of Psychiatry at UCLA and a professor at the California School of Forensic Studies at Alliant International University. Id. at 134; Def.'s Ex. 1007. She has published in the area (among others) of symptom or “performance validity” in neuropsychological assessment, Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. Boone) at 142, which she describes as “measur[ing] whether or not someone is in fact performing to true ability.” Id. at 144.

On December 12, 2013, Dr. Boone was accepted as an expert on “neuropsychological assessment, including the standards of practice applicable to intelligence testing, neuropsychological assessment, the assessment of effort, performance validity and malingering.” Id. at 163. During the Atkins hearings, Defendant submitted her July 20, 2012 Report, and a November 21, 2013 Supplemental Report. Def.'s Exs. 1008, 1009.

11. Dr. Robert Denney

Dr. Denney is a clinical neuropsychologist, forensic psychologist, and professor. Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Denney) at 187. He obtained his doctorate in psychology in 1991, and worked as a forensic psychologist at the BOP's Medical Center from 1990 to 2000. Id. at 189. He “performed pretrial criminal forensic evaluations for the U.S. District Courts from ... January of 1992 through February of 2000.” Id. These evaluations included “competency to stand trial and sanity-related evaluations,” and determinations of intellectual disability. Id. at 190. From 2000 until his retirement from the BOP in 2011, he continued to work at “the medical and surgical sides” of the MCFP, “provid[ing] mental health services, [and] neuropsychological diagnostic services.” Id. at 191. Since his retirement, he has practiced as a forensic neuropsychologist, id. at 202, and is an associate professor and coordinator of clinical neuropsychology at the Forest Institute of Professional Psychology. Id. at 192; Gov't's Ex. 4.

On December 17, 2013, the court qualified Dr. Denney as an expert in neuropsychology and forensic psychology. Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Denney) at 204. During the Atkins hearings, the government submitted Dr. Denney's September 23, 2013 Neuropsychological Report of Defendant. Gov't's Ex. 3.

12. Dr. Linda Gottfredson

Dr. Gottfredson is a professor in the School of Education at the University of Delaware. Doc. No. 2246, Tr. Dec. 3, 2013 (Dr. Gottfredson) at 18. She has a doctorate degree in sociology from the Johns Hopkins University. Id. She teaches and has published in the general area of intelligence and its assessment, see, e.g., Gov't's Ex. 1–8, and she is cited in clinical sources (as discussed below) as having at least some responsibility for an accepted definition of “intelligence,” i.e., that there is a single “general factor of intelligence,” known as “g”. See, e.g., Doc. No. 2246, Tr. Dec. 3, 2013 (Dr. Gottfredson) at 32; Def.'s Ex. 1015. She is, however, not a psychologist or psychiatrist, and has never previously testified or been qualified as an expert witness in any court. See, e.g., Doc. No. 2246, Tr. Dec. 3, 2013 at 19 & 34.

Dr. Gottfredson testified on December 3, 2013, as a government rebuttal witness (both as to Atkins issues and as a potential government witness at the guilt phase regarding BIF). The government also proffered her September 19, 2013 Report, along with six appendices to that Report. See Gov't's Exs. 1, 1–1 to 1–10, and 2. At the hearing, the court permitted her to testify for Atkins purposes as an expert in the field of human intelligence, and subject to a renewed Daubert motion as to the use of her testimony. Doc. No. 2246, Tr. Dec. 3, 2013 (Dr. Gottfredson) at 38. On December 27, 2013, Defendant renewed his challenge to the use of her testimony by filing a “Motion to Exclude or Widely Limit Use of Dr. Linda Gottfredson's September 2013 Report and December 3, 2013 Testimony on Atkins and Cognitive Functioning Assessment,” Doc. No. 2255, which the court rules upon in a separate Order.

Ultimately, in this Atkins context, the court considers Dr. Gottfredson's testimony and report in a very limited manner—solely as background information on the general concept of human intelligence. See Def.'s Ex. 1015, 2010 AAIDD Manual at 32, 34 (citing Dr. Gottfredson for the idea that intelligence is best conceptualized by a single general factor known as “g”). Because (as set forth below) the court is guided by clinical standards, her opinions as coming from a sociologist—not a psychologist or psychiatrist—are of no clinical value in this proceeding, and the court does not rely at all on her opinions regarding Defendant's intelligence testing (including her opinions regarding obtaining equivalent IQ scores from tests such as an SAT or ASVAB, as explained below). That is, the court agrees with Defendant to “widely limit” the use of her testimony to background information only.

III. DISCUSSION

Two provisions (statutory and constitutional) forbid federal courts from imposing the death penalty on the intellectually disabled—the Federal Death Penalty Act of 1994 (“FDPA”) and the Eighth Amendment. In particular, the FDPA specifically provides that a “sentence of death shall not be carried out upon a person who is mentally retarded.” 18 U.S.C. § 3596(c). Atkins later held, in addressing state law, that the execution of the mentally retarded is excessive and violates the Eighth Amendment (“Excessive bail shall not be required, nor excessive fines imposed, nor cruel and unusual punishments inflicted.”). Thus, under Atkins, “the federal policy embodied in the [FDPA] became a constitutional imperative[.]” United States v. Davis, 611 F.Supp.2d 472, 473 (D.Md.2009). Applying these provisions gives rise to several procedural and substantive issues, which the court addresses next.

A. Procedural Standards

Whether an individual is intellectually disabled “is a question of fact[.]” Clark v. Quarterman, 457 F.3d 441, 444 (5th Cir.2006); see also, e.g., Walker v. Kelly, 593 F.3d 319, 323 (4th Cir.2010) (reviewing finding that defendant was not intellectually disabled, stating that “the determination of mental retardation involves a question of fact”). “[I]t is a condition, the existence of which disqualifies a person from capital punishment[.]” Davis, 611 F.Supp.2d at 474 (citing Walker v. True, 399 F.3d 315, 326 (4th Cir.2005)). But “[t]he standard for whether someone is [intellectually disabled] and ineligible for the death penalty ... is a legal matter[.]” United States v. Wilson, 922 F.Supp.2d 334, 342 (E.D.N.Y.2013).

In this regard, Williams (in his original filing on this Motion) asks the court to require the government to prove that he is eligible for the death penalty—that is, to prove that he is not intellectually disabled. For this proposition, he cites Ring v. Arizona, 536 U.S. 584, 122 S.Ct. 2428, 153 L.Ed.2d 556 (2002) (holding in a capital case that the government has the burden to prove aggravating factors before a jury that are necessary for imposition of the death penalty, consistent with Apprendi v. New Jersey, 530 U.S. 466, 120 S.Ct. 2348, 147 L.Ed.2d 435 (2000)).

The court, however, agrees with and follows the ample case law holding precisely the opposite—in a federal case, the defendant bears the burden of proving an Atkins claim by a preponderance of the evidence. See, e.g., In re Johnson, 334 F.3d 403, 405 (5th Cir.2003) (“[N]either Ring and Apprendi nor Atkins render the absence of mental retardation the functional equivalent of an element of capital murder which the [prosecution] must prove beyond a reasonable doubt.”); United States v. Candelario–Santana, 916 F.Supp.2d 191, 193 (D.P.R.2013) (“Every district court that has addressed the issue that we are aware of has held [that the defendant bears the burden of proof on this issue by a preponderance of the evidence.]”); Davis, 611 F.Supp.2d at 474 (“[B]ecause [intellectual disability] is a disqualifying condition, the Court ... assigned to [defendant] the burden of establishing, by a preponderance of the evidence, that he is [intellectually disabled.]”); United States v. Sablan, 461 F.Supp.2d 1239, 1242–43 (D.Colo.2006) (same).

The court applies the “more probably true than not true” formulation of the preponderance of the evidence burden. See, e.g., Ninth Cir.Crim. Jury Instr. 6.6 (2013) (defining “preponderance of the evidence” as “you must be persuaded that the things the defendant seeks to prove are more probably true than not true”).

Similarly, a defendant has no constitutional right to a jury trial on an Atkins claim, a point that the parties have not disputed. See, e.g., Walker, 399 F.3d at 324–27 (rejecting argument that defendant was entitled to a jury on an Atkins claim); Smith v. Ryan, 2012 WL 6019055, at *10–11 (D.Ariz. Dec. 3, 2012) (refusing to find a right to a jury determination on intellectual disability, in part because “state and federal courts have rejected the argument that [intellectual disability] is an element of the offense which must be proven to a jury pursuant to [Apprendi] and its progeny”); Maldonado v. Thaler, 662 F.Supp.2d 684, 706 (S.D.Tex.2009) (“[T]he factfinder with respect to a determination of [intellectual disability] need not be a jury[.]”) (quoting In re Woods, 155 Fed.Appx. 132, 135–36 (5th Cir.2005)); Walker, 399 F.3d at 326 (“[A] finding of mental retardation ... is analogous to the question of competency to be executed in death penalty cases, which need not be decided by a jury.”) (quoting Walton v. Johnson, 269 F.Supp.2d 692, 698 n. 3 (W.D.Va.2003)).

In contrast, some states provide for a jury trial on an Atkins claim. See, e.g., Hooks v. Workman, 689 F.3d 1148 (10th Cir.2012) (reviewing, under 28 U.S.C. § 2254, an Oklahoma jury verdict in an “Atkins trial”); see also Pennsylvania v. Sanchez, 614 Pa. 1, 36 A.3d 24, 61 (2011) (“[I]n nine states, the primary adjudicator of an Atkins claim is the trier of fact, i.e., the jury, or the trial judge if the right to a jury is waived.”) (citing cases).

Although an intellectual disability “is not a defense,” Davis, 611 F.Supp.2d at 474, the requirement for a defendant to prove an intellectual disability fits completely within the framework explained in Dixon v. United States, 548 U.S. 1, 7–8, 126 S.Ct. 2437, 165 L.Ed.2d 299 (2006) (finding it constitutional to place the burden on the defendant to establish duress by a preponderance of the evidence). Among other reasoning, Dixon reaffirmed that “at common law, the burden of proving ‘affirmative defenses—indeed, “all ... circumstances of justification, excuse or alleviation”—rested on the defendant.’ ” Id. at 8, 126 S.Ct. 2437 (quoting Patterson v. New York, 432 U.S. 197, 202, 97 S.Ct. 2319, 53 L.Ed.2d 281 (1977)).

In short, the “ultimate issue” of whether a defendant is, in fact, intellectually disabled is for the court to decide (and for Williams to prove by a preponderance of the evidence), “based upon all of the evidence and determinations of credibility.” Wilson, 922 F.Supp.2d at 343 (citation omitted).

B. Substantive Standards—A Definition of “Intellectual Disability” Informed by Established Clinical Standards

Neither the FDPA nor Atkins adopted a precise standard for determining whether a person has an intellectual disability. The FDPA provides no parameters for the term “mentally retarded,” and—as the Ninth Circuit recently reiterated—“Atkins did not define mental retardation as a matter of federal law.... [Rather,] the Supreme Court left to the states ‘the task of developing appropriate ways to enforce the constitutional restriction upon [their] execution of sentences.’ ” Pizzuto, 729 F.3d at 1216 (quoting Moormann v. Schriro, 672 F.3d 644, 648 (9th Cir.2012) (in turn, quoting Atkins)).

The court and parties are aware that the Supreme Court has granted certiorari in Hall v. Florida, ––– U.S. ––––, 134 S.Ct. 471, 187 L.Ed.2d 316 (2013) (No. 12–10882), which presents a question regarding the Atkins standard. Accordingly, the parties have stipulated that if Hall provides a basis for reconsideration of this Order, then Defendant may raise such a challenge “within 30 days after a ruling in the Hall case, or prior to the entry of any judgment and commitment order in this case, whichever occurs first.” Doc. No. 2230, Stipulation at 2.

Nevertheless, Atkins referred to clinical definitions as set forth by the American Association on Intellectual and Developmental Disabilities (“AAIDD”) (formerly the American Association on Mental Retardation (“AAMR”)) and the American Psychiatric Association (“APA”). 536 U.S. at 308 n. 3, 122 S.Ct. 2242; see also id. at 318, 122 S.Ct. 2242. And thus, in the Atkins context, federal courts have been guided primarily by such clinical standards in determining whether a defendant is intellectually disabled when facing federal charges. See, e.g., Wilson, 922 F.Supp.2d at 339 (citing numerous district court cases); see also, e.g., Ortiz v. United States, 664 F.3d 1151, 1157 (8th Cir.2011) (discussing the APA's and AAIDD's definitions of intellectual disability in reviewing a federal district court's 28 U.S.C. § 2255 Atkins decision). In that light, both parties here have structured their arguments and proffered evidence based mostly on clinical standards.

Of course, federal courts reviewing under 28 U.S.C. § 2254 state-court determinations of Atkins issues must analyze state-law definitions—many of which also incorporate clinical standards. See, e.g., Pizzuto, 729 F.3d at 1217 (reviewing Idaho law).

Wilson observed that “[f]ederal courts that have decided cases involving both Atkins and FDPA claims have taken inconsistent approaches” in whether to apply a forum state's definition of intellectual disability. 922 F.Supp.2d at 338 (citing cases taking different approaches). Although Hawaii does not have the death penalty, Hawaii law defines “intellectual disability” in other contexts as “significantly subaverage general intellectual functioning resulting in or associated with concurrent moderate, severe, or profound impairments in adaptive behavior and manifested during the developmental period.” Haw.Rev.Stat. (“HRS”) § 333F–1. As discussed below, Hawaii's definition conforms to established clinical standards such that the court need not specifically decide whether to apply Hawaii law or a federal common law definition of intellectual disability.

Accordingly, as a framework, this court relies chiefly on the professional clinical standards established by the APA and AAIDD in assessing whether Defendant is intellectually disabled within the meaning of Atkins. The court, however, also emphasizes that it is making a legal (not a medical or psychological) determination. See, e.g., Ortiz, 664 F.3d at 1168 (rejecting an argument that “established science ... dictates a mental retardation diagnosis” because it “incorrectly assumes the Atkins decision delegates to the scientific community the finding of whether an individual is mentally retarded”); Hooks, 689 F.3d at 1172 (emphasizing that “a clinical standard is not a constitutional command”). “Atkins ‘did not delegate to psychologists the determination of whether an inmate should not face execution.’ ” Wilson, 922 F.Supp.2d at 339 (quoting United States v. Bourgeois, 2011 WL 1930684, at *24 (S.D.Tex. May 19, 2011)). Rather, “ ‘psychology informs, but does not determinatively decide, whether an inmate is exempt from execution,’ leaving the ‘contours of the constitutional protection to the courts.’ ” Ortiz, 664 F.3d at 1168 (quoting Bourgeois, 2011 WL 1930684, at *24). And so—informed by clinical definitions—the court ultimately “will apply its own judgment as to the ‘appropriate ways’ to enforce the ultimately legal prohibition on executing [intellectually disabled] offenders,” Wilson, 922 F.Supp.2d at 339 (quoting Atkins, 536 U.S. at 317, 122 S.Ct. 2242), particularly when facing ambiguous or conflicting definitions and testimony. Id.

Indeed, clinical authorities also recognize this distinction. See Def.'s Ex. 1018, Cautionary Statement for Forensic Use of [the APA's 2013 Diagnostic and Statistical Manual–5 (“DSM–5”) ] at 25 (recognizing “the imperfect fit between the questions of ultimate concern to the law and the information contained in a clinical diagnosis,” explaining that “[i]n most situations, the clinical diagnosis of a DSM–5 mental disorder such as intellectual disability ... does not imply that an individual with such a condition meets legal criteria for the presence of a mental disorder or a specified legal standard”).

The court looks to all the recent clinical standards of the APA and AAIDD, focusing on the latest versions. But the court does not limit its analysis strictly to the 2013 and 2010 versions of the APA and AAIDD clinical definitions, described in detail below. Defendant was evaluated as early as 2006 on guilt-phase and competency-related issues, and other Atkins specific evidence also pre-dates the latest standards. This is the same approach taken by various witnesses. For example, Dr. Woods, testifying for the defense, was specifically asked “[d]id you use both the DSM–IV–TR to make that diagnosis [of intellectual disability] as well as the AAID[D] green book and DSM–V?,” and he answered, “I used all three.” Tr. Dec. 16, 2013 (Dr. Woods) at 39.

C. Clinical Standards

1. A Three–Prong Test

The APA and AAIDD have promulgated evolving definitions of intellectual disability. All these clinical definitions, however, require three basic elements or criteria: (1) significant deficits in intellectual functioning; (2) deficits or impairments in adaptive functioning or behavioral skills; and (3) onset of the condition before age eighteen (or the “developmental period”).

In particular, until 2013, the APA gave the following three “diagnostic criteria for mental retardation:”

A. Significantly subaverage intellectual functioning: an [intelligence quotient (“IQ”) ] of approximately 70 or below on an individually administered IQ test....

B. Concurrent deficits or impairments in present adaptive functioning ( i.e., a person's effectiveness in meeting the standards expected for his or her age by his or her cultural group) in at least two of the following areas: communication, self-care, home living, social/interpersonal skills, use of community resources, self-direction, functional academic skills, work, leisure, health and safety.

C. The onset is before age 18 years.
APA, Diagnostic and Statistical Manual of Mental Disorders (4th ed. 2000) (Text Revision) (“DSM–IV–TR”) at 49. The APA instructed users of the DSM–IV–TR to:

Code based on degree of severity reflecting level of intellectual impairment:

317	Mild Mental Retardation:	IQ level 50–55 to approximately 70
318.0	Moderate Mental Retardation:	IQ level 35–40 to 50–55
318.1	Severe Mental Retardation:	IQ level 20–25 to 35–40
318.2	Profound Mental Retardation:	IQ level below 20 or 25
319	Mental Retardation, Severity Unspecified: when there is strong presumption of Mental Retardation but the person's intelligence is untestable by standard tests.

Id.

The APA revised the DSM–IV–TR with the 2013 publication of the Diagnostic and Statistical Manual of Mental Disorders (5th ed. 2013) (“DSM–5”). The DSM–5 provides the following “diagnostic criteria” for “Intellectual Disability (Intellectual Developmental Disorder):”

Intellectual disability (intellectual developmental disorder) is a disorder with onset during the developmental period that includes both intellectual and adaptive functioning deficits in conceptual, social, and practical domains. The following three criteria must be met:

A. Deficits in intellectual functions, such as reasoning, problem solving, planning, abstract thinking, judgment, academic learning, and learning from experience, confirmed by both clinical assessment and individualized, standardized intelligence testing.

B. Deficits in adaptive functioning that result in failure to meet developmental and sociocultural standards for personal independence and social responsibility. Without ongoing support, the adaptive deficits limit functioning in one or more activities of daily life such as communication, social participation, and independent living, across multiple environments, such as home, school, work, and community.

C. Onset of intellectual and adaptive deficits during the developmental period.
Def.'s Ex. 1019, DSM–5 at 33. Under the DSM–5, the level of severity of intellectual disability is no longer classified specifically in terms of IQ. Rather, “[t]he various levels of severity [in the DSM–5] are defined on the basis of adaptive functioning, and not IQ scores, because it is adaptive functioning that determines the level of supports required. Moreover, IQ measures are less valid in the lower end of the IQ range.” Id. That is, although it certainly still refers to consideration of IQ scores, the DSM–5 “de-emphasizes IQ scores as determinants of [intellectual disability].” Hernandez v. Stephens, 537 Fed.Appx. 531, 533 n. 1 (5th Cir.2013).

The AAIDD's standards are similar to the APA's. Indeed, prior to the 2013 release of the DSM–5, courts characterized the standards as “essentially identical.” Wilson, 922 F.Supp.2d at 341 (citing cases). In its 2010 Manual Intellectual Disability: Definition, Classification, and Systems of Supports (11th ed. 2010) (“2010 AAIDD Manual” or “the Green Book” as referred to during the Atkins hearings), the AAIDD defines intellectual disability as “characterized by significant limitations both in intellectual functioning and in adaptive behavior as expressed in conceptual, social, and practical adaptive skills. This disability originates before age 18.” Wilson, 922 F.Supp.2d at 341 (quoting 2010 AAIDD Manual at 1). This is the same general definition given in the prior (2002) version of the AAMR/AAIDD Manual. See Def.'s Ex. 1014, AAMR, Mental Retardation: Definition, Classification, and Systems of Supports (10th ed. 2002) at 93 (“2002 AAMR Manual”) (“Mental retardation is a disability characterized by significant limitations both in intellectual functioning and in adaptive behavior as expressed in conceptual, social, and practical skills. This disability originates before age 18.”).

Atkins noted a similar definition from a 1992 version of the AAMR Manual:
Mental retardation refers to substantial limitations in present functioning. It is characterized by significantly subaverage intellectual functioning, existing concurrently with related limitations in two or more of the following applicable adaptive skill areas: communication, self-care, home living, social skills, community use, self-direction, health and safety, functional academics, leisure, and work. Mental retardation manifests before age 18.

536 U.S. at 308 n. 3, 122 S.Ct. 2242 (quoting Mental Retardation: Definition, Classification, and Systems of Supports at 5 (9th ed. 1992)).

2. Clinical Judgment and a Comprehensive Analysis

The APA and AAIDD clinical manuals—both (1) the APA's DSM–IV–TR, and DSM–5; and (2) the 2002 AAMR Manual, and 2010 AAIDD Manual—all have significant diagnostic features, explanations, and qualifiers for forensic use (many of which are discussed below when detailing each prong). The standards stress the importance of (1) clinical judgment, and (2) a comprehensive view that considers multiple sources of information. See, e.g., Def.'s Ex. 1016, AAIDD User's Guide at 9 (“Clinical judgment is a special type of judgment rooted in a high level of clinical expertise and experience; it emerges directly from extensive data.”); Def.'s Ex. 1019, DSM–5 at 37 (“Clinical training and judgment are required to interpret test results and assess intellectual performance.”); Def.'s Ex. 1017, Excerpt from DSM–IV–TR re. Use of Clinical Judgment (“In addition to the need for clinical training and judgment, the method of data collection is also important. The valid application of the diagnostic criteria ... necessitates an evaluation that directly accesses the information contained in the criteria sets[.]”); Def.'s Ex. 1014, 2002 AAMR Manual at 66 (“The assessment of intellectual functioning must rely on sound procedures and may, at times, require information from multiple sources.”); Def.'s Ex. 1015, 2010 AAIDD Manual at 41 (same); Def.'s Ex. 1016, AAIDD User's Guide at 7–8 (setting forth “best practices,” including “[r]ecognizing the multifactorial nature of the etiology of [intellectual disability],” and “[u]sing a multidimensional approach to classification that is based on the specific purpose for classification and incorporates the factors that impact human functioning”); Thomas v. Allen, 614 F.Supp.2d 1257, 1283 (N.D.Ala.2009) (“[I]t is crucial that clinicians conduct a thorough social history and align data and data collection to the critical question(s) at hand.”), aff'd, 607 F.3d 749 (11th Cir.2010) (quoting a previous version of the AAIDD User's Guide).

See, e.g., Def.'s Exs. 1016 (“User's Guide To Accompany the 11th Edition of Intellectual Disability: Definition, Classification, and Systems of Supports”) (“AAIDD User's Guide”); 1017 (section titled “Use of DSM–IV in Forensic Settings”); and 1018 (“Cautionary Statement for Forensic Use of DSM–5”).

See also Sasser v. Hobbs, 735 F.3d 833, 847 (8th Cir.2013) ( “[T]he district court should have considered all evidence of [defendant's] intellectual functioning rather than relying solely on his IQ test scores.”) (emphasis added); United States v. Hardy, 762 F.Supp.2d 849, 876 (E.D.La.2010) (describing use of “[o]ther sources of information” such as school records and family data in assessing subaverage intelligence as “both instructive and necessary” because such information may be “consistent with” another expert's testimony and “testimony on these other sources informs the Court's assessment of [an expert's] overall credibility”).

In this regard, various witnesses confirmed the importance of clinical judgment, and the use of multiple sources of information, in assessing a person's intelligence and whether someone is intellectually disabled. See, e.g., Tr. Dec. 11, 2013 (Dr. James) at 45 (“[I]t's [about] having additional information about the integrity of the neural system ... for whatever purpose it might be, whether it's making a decision in an Atkins case or ... an intervention or treatment.”); id. at 69 (“[C]linical judgment is a place where you can consider other kinds of factors that might lead to a person's cognitive impairment.... As a clinician, you would be looking at all of these different sources of data, not just the IQ score but other data about adaptive functioning to understand that, and weigh ... different pieces of information.”); Tr. Dec. 16, 2013 (Dr. Woods) at 52 (emphasizing the relevance of clinical judgment with “all of the instruments with which we determine intellectual disability”); Tr. Dec. 12, 2013 (Dr. Boone) at 194 (“[I]t comes down to clinical judgment. If you have evidence that someone performed poorly in school, tested out poorly on standardized testing, then you would make the case that they were low functioning at that point in time.... It really comes down to the clinical judgment going through those records.”); id. at 208 (testifying that it is generally accepted practice for a neuropsychologist to “take the information that you get from testing instruments other than IQ tests and look at that in conjunction with what you've gotten in the IQ tests to see how an individual is performing cognitively”).

With these clinical standards firmly in mind, the court next details the relevant parameters of each of the three prongs.

3. Prong One: “Significantly Subaverage Intellectual Functioning”

a. The relative importance of IQ scores

Psychologists and others in the clinical community consistently discuss human “intelligence” in terms of “g,” a general factor of intelligence. See, e.g., Def.'s Ex. 1015, 2010 AAIDD Manual at 34 (“[I]ntellectual functioning ... is best conceptualized and captured by a general factor of intelligence (g).”). “Intelligence is a general mental ability. It includes reasoning, planning, solving problems, thinking abstractly, comprehending complex ideas, learning quickly, and learning from experience.” Id. at 31. And “[m]ost of the more commonly used individual tests of intelligence ... provide metrics of this g factor.” Id. at 32.

Although the DSM–5 “de-emphasizes IQ scores as determinants” of intellectual disability, Hernandez, 537 Fed.Appx. at 533 n. 1, it nevertheless remains accepted that “IQ tests are the best available tools for measuring intellectual functioning” such that “both the AAIDD and the APA frame prong one criteria in terms of IQ scores.” United States v. Salad, 959 F.Supp.2d 865, 870 (E.D.Va.2013). In this regard, the DSM–5 describes prong one in part as follows:

Intellectual functioning is typically measured with individually administered and psychometrically valid, comprehensive, culturally appropriate, psychometrically sound tests of intelligence. Individuals with intellectual disability have scores of approximately two standard deviations or more below the population mean, including a margin for measurement error (generally +5 points). On tests with a standard deviation of 15 and a mean of 100, this involves a score of 65–75 (70 ± 5). Clinical training and judgment are required to interpret test results and assess intellectual performance.
Def.'s Ex. 1019, DSM–5 at 37.
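
The arithmetic in the quoted passage can be sketched as follows. This is an illustrative aid only, not a clinical tool: the function name `cutoff_band` and its parameter names are hypothetical, and the default values simply restate the figures quoted above (a mean of 100, a standard deviation of 15, and a margin of 5 points).

```python
def cutoff_band(mean=100.0, sd=15.0, n_sd=2.0, margin=5.0):
    """Return (low, center, high) for a score n_sd standard deviations
    below the mean, widened by a margin for measurement error.

    All names and defaults are illustrative assumptions that restate
    the figures in the quoted DSM-5 passage.
    """
    center = mean - n_sd * sd          # 100 - 2 * 15 = 70
    return (center - margin, center, center + margin)


low, center, high = cutoff_band()
print(f"approximate range: {low:.0f} to {high:.0f} (centered on {center:.0f})")
```

As the authorities quoted in this section stress, such a band is a starting point for clinical judgment rather than a fixed cutoff.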

Likewise, the 2010 AAIDD Manual recognizes that “[a]lthough far from perfect, intellectual functioning is currently best represented by IQ scores when they are obtained from appropriate, standardized and individually administered assessment instruments.” Def.'s Ex. 1015, 2010 AAIDD Manual at 31. Under this standard, “[t]he ‘significant limitations in intellectual functioning’ criterion for a diagnosis of intellectual disability is an IQ score that is approximately two standard deviations below the mean, considering the standard error of measurement for the specific instruments and the instruments' strengths and limitations.” Id. The AAIDD emphasizes that “[t]he intent of this definition is not to specify a hard and fast cutoff point/score for meeting the significant limitations in intellectual functioning criteria of [intellectual disability].” Id. at 35. “The use of ‘approximately’ reflects the role of clinical judgment in weighing the factors that contribute to the validity and precision of a decision. The term also addresses statistical error and uncertainty inherent in any assessment of human behavior.” Id.

In addition to the role of clinical judgment, “the court must examine the reliability and validity of IQ scores, and consider the credibility of witnesses that proffer expert opinions on those scores.” Salad, 959 F.Supp.2d at 871 (citing Thomas, 614 F.Supp.2d at 1264 n. 16, and United States v. Nelson, 419 F.Supp.2d 891, 903 (E.D.La.2006)).

Thus, courts properly recognize that “[t]he psychiatric and psychological communities, including those specializing in the treatment of [intellectual disability], agree [that] ‘[a] fixed point cutoff score for [intellectual disability] is not psychometrically justifiable.’ ” Sasser, 735 F.3d at 843 (quoting 2010 AAIDD Manual at 40). “It is possible to diagnose [intellectual disability] in individuals with IQs between 70 and 75 who exhibit significant deficits in adaptive behavior because there is ‘a measurement error of approximately 5 points [in assessing IQ], depending on the testing instrument.’ ” Id. (quoting Jackson v. Norris, 615 F.3d 959, 965 n. 7 (8th Cir.2010) (in turn quoting DSM–IV–TR at 41–42)). “Conversely, [intellectual disability] would not be diagnosed in an individual with an IQ lower than 70 if there are no significant deficits or impairments in adaptive functioning.” Def.'s Ex. 1020, DSM–IV–TR at 42. “Simply put, an IQ test score alone is inconclusive of ‘significantly subaverage general intellectual functioning.’ ” Sasser, 735 F.3d at 843 (emphasis added).

“The most widely-accepted IQ tests in the United States are the Wechsler Intelligence Scales, which include ... the Wechsler Adult Intelligence Scale (‘WAIS’).” Wilson, 922 F.Supp.2d at 344. Another “widely recognized and utilized” IQ instrument is the Stanford–Binet Intelligence Scales (“SB”). Thomas, 607 F.3d at 753.

b. Measurement errors and confidence intervals

Clinical authorities also agree that “[a]ll IQ tests ... contain at least some possibility of error, making it impossible to state a test subject's ‘true’ IQ score with certainty.” Wilson, 922 F.Supp.2d at 345 (citing Thomas, 614 F.Supp.2d at 1269). “An IQ score is subject to variability as a function of a number of potential sources of error, including variations in test performance, examiner's behavior, cooperation of the test taker, and other personal and environmental factors.” Def.'s Ex. 1015, 2010 AAIDD Manual at 36. And so, accepted IQ tests take into account a “standard error of measurement” (“SEM”), which “varies by test, subgroup, and age group,” and “is used to quantify this variability and provide a stated statistical confidence interval within which the person's true score falls.” Id. “The confidence interval refers to a percentage corresponding to [a] degree of confidence that an interval around the obtained IQ score contains the true IQ score.” Wilson, 922 F.Supp.2d at 345 (citing Wiley v. Epps, 668 F.Supp.2d 848, 893–94 (N.D.Miss.2009)).

According to the AAIDD, “[f]or well-standardized measures of general intellectual functioning, the [SEM] is approximately 3 to 5 points.” Def.'s Ex. 1015, 2010 AAIDD Manual at 36. In terms of confidence intervals (with a normal curve), the AAIDD describes a 66% confidence interval as the range from one SEM below to one SEM above a given score (“scores of about 66 to 74”) and a 95% confidence interval as the range from two SEMs below to two SEMs above a given score (“scores of about 62 to 78”). Id. For example, “the 95% confidence interval for a given IQ score would show the range of scores within which we can be 95% confident that a person's true IQ score falls.” Wilson, 922 F.Supp.2d at 345.
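
The ranges the AAIDD describes follow from simple arithmetic: the obtained score plus or minus one or two SEMs. The sketch below assumes an obtained score of 70 and an SEM of 4 points. Those values are inferred from the quoted ranges (66 to 74 and 62 to 78) rather than stated in the cited excerpt, and the function name `score_range` is hypothetical.

```python
def score_range(obtained_score, sem, n_sem):
    """Return (low, high): the obtained score minus and plus n_sem SEMs.

    Illustrative only. The SEM differs by test, subgroup, and age group,
    so the value passed in is an assumption, not a property of any
    particular instrument.
    """
    spread = n_sem * sem
    return (obtained_score - spread, obtained_score + spread)


# Assumed inputs: score of 70, SEM of 4 (inferred from the quoted ranges).
print(score_range(70, 4, 1))   # one SEM either side of the score
print(score_range(70, 4, 2))   # two SEMs either side of the score
```

With a different SEM within the 3-to-5-point span the AAIDD describes, the same calculation yields a narrower or wider range around the same obtained score.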

“Understanding and addressing [an IQ] test's [SEM] is a critical consideration that must be part of any decision concerning a diagnosis of [intellectual disability.]” Def.'s Ex. 1015, 2010 AAIDD Manual at 36. And so “[b]oth [the] AAIDD and the [APA] support the best practice of reporting an IQ score with an associated confidence interval.” Id. “Reporting an IQ score with an associated confidence interval is a critical consideration underlying the appropriate use of intelligence tests and best practices; such reporting must be a part of any decision concerning the diagnosis of [intellectual disability].” Id.

The testimony at the hearings confirmed these clinical standards. See, e.g., Tr. Dec. 11, 2013 (Dr. James) at 56 (“[W]e always have to be thinking of [a] person's IQ in terms of a range of scores, not as a single IQ score.... And that's what's reflected in the standard of error of measurement.”); Tr. Dec. 16, 2013 (Dr. Woods) at 41 (testifying that “the idea of being fixed by an IQ score is not appropriate” and opining based upon a range of scores “particularly when taking the confidence interval into consideration”).

Accordingly, the court will not consider Defendant's various reported IQ scores in isolation—rather, the court will consider them as part of a range (above and below) in relation to the reported (if given) confidence interval and SEM. And the court does not apply a “hard and fast cutoff point/score for meeting the significant limitations in intellectual functioning” prong. Def.'s Ex. 1015, 2010 AAIDD Manual at 35. Rather, the court will look to a reported range of IQ scores, as part of a comprehensive analysis of all the relevant evidence for this prong (and the other prongs) of the clinical definition.

c. The “Flynn Effect”

The court will also consider the “Flynn Effect,” which is “a theory that IQ scores increase over time, so that a person who takes an IQ test that has not recently been ‘normed’ may have an artificially inflated IQ score.” Pizzuto, 729 F.3d at 1223 (citing James R. Flynn, Tethering the Elephant: Capital Cases, IQ, and the Flynn Effect, 12 Psychol. Pub. Pol'y & L. 170, 173 (2006)). “The standard practice is to deduct 0.3 IQ points per year (3 points per decade) to cover the period between the year the test was normed and the year in which the subject took the test.” Id.

Pizzuto stated that “the Flynn Effect is not uniformly accepted as scientifically valid.” Pizzuto, 729 F.3d at 1223 (citing Maldonado v. Thaler, 625 F.3d 229, 238 (5th Cir.2010)). But it is important to view this statement in context—the Ninth Circuit was reviewing a state court proceeding under a § 2254 standard of review ( i.e., whether the state court's application of clearly established law was reasonable). That is, Pizzuto was not deciding whether a federal court can or should consider a Flynn Effect, but rather it was deciding whether the state court's application of the theory—as a matter of state law on the particular evidentiary record before it—was “unreasonable” and violated “clearly established” law. Pizzuto reasoned that “[w]ithout more evidence in the record on the need to include an adjustment such as the Flynn Effect in considering the relationship between past IQ tests and a person's true IQ, the Idaho Supreme Court's refusal to apply it is not grounds for reversal here.” Id. at 1223–24. In apparent contrast to Pizzuto, the record before this court after nine days of testimony—with much evidence on factors such as a Flynn effect and potential practice effects—is much different.

The Flynn effect acknowledges that as an intelligence test ages, or moves farther from the date on which it was standardized, or normed, the mean score of the population as a whole on that assessment instrument increases, thereby artificially inflating the IQ scores of individual test subjects. Therefore, the IQ test scores must be recalibrated to keep all test subjects on a level playing field.
Thomas, 607 F.3d at 753. See Def.'s Ex. 1015, 2010 AAIDD Manual at 37 (“In cases where a test with aging norms is used, a correction for the age of the norms is warranted.”). By doing so, however, the court will consider both “Flynn-adjusted” and non-adjusted scores—the court will not automatically discard or ignore non-adjusted scores, but will keep all relevant data in mind in making its assessment.

The 2010 AAIDD Manual refers to a 0.33 adjustment per year from the date a test is normed. See Def.'s Ex. 1015, 2010 AAIDD Manual at 37 (referring to a 1984 article by Flynn). Other sources indicate a 0.3 adjustment per year, based on more recent articles. See Def.'s Ex. 1027, Flynn, Tethering the Elephant (2006) at 173; see also Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 159 (“Flynn describes the .33 in his 1984 paper. He describes the .30 in more recent publications.”).
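
The practical difference between the two rates can be illustrated with a short sketch. The score and years used here are hypothetical and are not drawn from Defendant's testing, and the function name `flynn_adjusted` is likewise an assumption. The sketch deducts the per-year rate for each year between the norming of the test and its administration, which is the practice Pizzuto describes.

```python
def flynn_adjusted(obtained_score, year_normed, year_taken, rate_per_year=0.3):
    """Deduct rate_per_year IQ points for each year between the year a
    test was normed and the year it was taken.

    Illustrative only: the default rate of 0.3 is one of the two rates
    discussed in the text; 0.33 is the other.
    """
    years_elapsed = year_taken - year_normed
    if years_elapsed < 0:
        raise ValueError("test cannot be taken before it was normed")
    return round(obtained_score - rate_per_year * years_elapsed, 2)


# Hypothetical inputs: a score of 75 on a test normed in 1995, taken in 2005.
for rate in (0.3, 0.33):
    print(rate, flynn_adjusted(75, 1995, 2005, rate_per_year=rate))
```

Over a ten-year gap the two rates differ by about three-tenths of a point, so the choice between them matters far less than whether an adjustment is applied at all.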

d. The “practice effect”

Much testimony focused on a phenomenon called the “practice effect,” which clinical sources recommend taking into account. See id. at 35, 38, 102. Specifically, the AAIDD describes the practice effect as follows:

The practice effect refers to gains in IQ scores on tests of intelligence that result from a person being retested on the same instrument. Kaufman (1994) noted that practice effect can occur when the same individual is retested on a similar instrument. For example, the WAIS–III Manual presents data showing the artificial increase in IQ scores when the same instrument is readministered within a short time interval. The WAIS–III Manual also reports average increases between administrations with intervals of 2 to 12 weeks. For this reason, established clinical practice is to avoid administering the same intelligence test within the same year to the same individual because it will often lead to an overestimate of the examinee's true intelligence.
Id. at 38.

“The theory behind the practice effect ‘is that because IQ assessments rely upon novel tasks and instructions to assess ability and performance, an instruction given on a test will be more familiar to the examinee and more quickly implemented on subsequent presentations.’ ” Wilson, 922 F.Supp.2d at 352 (quoting Wiley, 668 F.Supp.2d at 896). In general, a practice effect is larger for a “performance IQ” as compared to a “verbal IQ” (both of which are components of a “full scale IQ” on some tests). See, e.g., Def.'s Ex. 1033, Alan S. Kaufman, excerpt from R.J. Sternberg, 2 Encyclopedia of Human Intelligence 828 (1994) (“Kaufman (1994)”) at 2; Tr. Dec. 18, 2013 (Dr. Denney) at 168, 176; Tr. Dec. 11, 2013 (Dr. James) at 101 (acknowledging a greater practice effect for a “performance scale” than for “knowledge tasks” on an IQ test).

But, “[u]nlike with the Flynn Effect, there does not appear to be an accepted method in the psychological community for adjusting IQ scores to account for the practice effect.” Wilson, 922 F.Supp.2d at 352. For example, in rejecting a suggestion that an individual's IQ scores should be adjusted downward by five to eight points for all retests because of a practice effect, Wilson emphasized that authorities state “only that ‘[c]linicians should understand the average practice effect gains,’ ” and do “not recommend adjusting an individual's IQ scores” for all retests. Id. at 352–53; see Def.'s Ex. 1033, Kaufman (1994) at 4 (“Clinicians should understand the average practice effect gains in intelligence scores[.]”). Indeed, the AAIDD recommends that clinicians “avoid administering the same intelligence test within the same year to the same individual[.]” Def.'s Ex. 1015, 2010 AAIDD Manual at 38 (emphasis added).

In general, a practice effect depends upon the length of time between the original test and the retest. See, e.g., Def.'s Ex. 1033, Kaufman (1994) at 4 (indicating that a practice effect overestimates a person's intellectual functioning “especially if the retest is given within about six months of the original test, or ... several times in the course of a few years”); Tr. Dec. 18, 2013 (Dr. Denney) at 47 (“Available research shows that the effect of practice enhancement of the score drops with time.... As you get further away, the increase drops each time.”); id. at 185 (“[I]t's clear there's more retest gain early on, but as time goes by we see the scores decreasing and decreasing.”). Several courts have recognized this proposition based on expert testimony. See, e.g., Wilson, 922 F.Supp.2d at 352; Blue v. Thaler, 2010 WL 8742423, at *13 (S.D.Tex. Aug. 19, 2010) (“[T]he practice effect only applies when there is a short interval between tests. The nine-month period here should have dispelled any lingering effect from the first test.”); Green v. Johnson, 2006 WL 3746138, at *44 (E.D.Va. Dec. 15, 2006) (“The practice effect refers to an increase in a person's score on an IQ test when it is administered within a short time after taking the same or [a] similar test.... [T]he effect is more pronounced the closer in time the tests are given.”), Report and Recommendation Adopted as Modified, 2007 WL 951686 (E.D.Va. Mar. 26, 2007), aff'd, 515 F.3d 290 (4th Cir.2008).

And clinical sources ( i.e., the 2010 AAIDD Manual and DSM–5) do not quantify such a practice effect, particularly for intervals over a year—although some research or literature suggests that some type of a practice effect is possible over longer periods, and can result in “cumulative effects” or “progressive error” with repeated administration of intelligence tests. See, e.g., Def.'s Ex. 1033, Kaufman (1994) at 4; Tr. Dec. 11, 2013 (Dr. James) at 66–67. Moreover, the court accepts the testimony of Dr. Denney that, generally, a practice effect is more likely to occur with a more intelligent person with a higher IQ, than with a less intelligent person:

Q. (by the court): Is there any literature on the practice effect and intelligence, meaning the smarter you are, the more likely the practice effect would accelerate that difference in the scores versus someone [at] a lower end is less likely to see the same degree of practice effect?

A. Yes, there is. And that's exactly what you see. .... [T]he band of IQs do not have the same complete practice effect, and what you're going to see is a greater practice effect for the higher scores than the lower scores.
See Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 50–51.

The court, having reviewed the evidence and considered caselaw and clinical sources, follows an approach similar to Wilson. “The court will—as clinicians recommend—take into account the practice effect in interpreting [Defendant's] IQ scores.” Wilson, 922 F.Supp.2d at 353 (citing 2010 AAIDD Manual at 35, 38, 102). But the court will not apply any specific downward point adjustment to Defendant's IQ scores because of a practice effect. And when considering a practice effect, “the court will be mindful that the practice effect diminishes significantly (although perhaps without disappearing entirely) as the length of time between test administrations increases.” Id. at 354 (citing Blue, 2010 WL 8742423, at *13). That is, the court will consider a possible practice effect as a factor, among others, in assessing the reliability or uncertainty of particular scores on particular tests. See Def.'s Ex. 1015, 2010 AAIDD Manual at 40 (“[I]n evaluating the role that an IQ score plays in making a diagnosis of ID, clinicians should ... (b) interpret the obtained score in reference to the test's [SEM], the assessment instruments' strengths and limitations, and other factors (such as practice effect, fatigue effects, and age or norms used) that determine the size of the error involved in estimating the person's true score[.]”).

4. Prong Two: “Adaptive Functioning”

“Adaptive functioning refers to how effectively individuals cope with common life demands and how well they meet the standards of personal independence expected of someone in their particular age group, sociocultural background, and community setting.” Salad, 959 F.Supp.2d at 877 (quoting DSM–IV–TR at 42). The DSM–5 articulates deficits in adaptive function as follows:

Deficits in adaptive functioning (Criterion B) refer to how well a person meets community standards of personal independence and social responsibility, in comparison to others of similar age and sociocultural background. Adaptive functioning involves adaptive reasoning in three domains: conceptual, social, and practical. The conceptual (academic) domain involves competence in memory, language, reading, writing, math reasoning, acquisition of practical knowledge, problem solving, and judgment in novel situations, among others. The social domain involves awareness of others' thoughts, feelings, and experiences; empathy; interpersonal communication skills; friendship abilities; and social judgment, among others. The practical domain involves learning and self-management across life settings, including personal care, job responsibilities, money management, recreation, self-management of behavior, and school and work task organization, among others. Intellectual capacity, education, motivation, socialization, personality features, vocational opportunity, cultural experience, and coexisting general medical conditions or mental disorders influence adaptive functioning.

....

Criterion B is met when at least one domain of adaptive functioning—conceptual, social or practical—is sufficiently impaired that ongoing support is needed in order for the person to work adequately in one or more life settings at school, at work, at home, or in the community.
Def.'s Ex. 1019, DSM–5 at 37–38.

Caselaw describes this adaptive functioning prong as consisting of general factors, regardless of the wording of the recent manuals, i.e., the AAIDD/AAMR's 2002 AAMR Manual, and 2010 AAIDD Manual; and the APA's DSM–IV–TR and DSM–5.

Prong two generally requires a more expansive investigation of a defendant's life history and skill levels than could be fully evaluated through use of a normed instrument. See United States v. Davis, 611 F.Supp.2d 472, 491 (D.Md.2009) (describing prong two analysis as “amorphous”). Thus, an evaluation of adaptive functioning “involves significantly more subjective clinical judgment.” United States v. Hardy, 762 F.Supp.2d 849, 883 (E.D.La.2010); see also [2010] AAIDD Manual at 29 (defining “clinical judgment” as “a special type of judgment rooted in a high level of clinical expertise and experience ... to enhance the quality, validity, and precision of the clinician's decision”).
Salad, 959 F.Supp.2d at 878. Caselaw summarizes these guidelines:

The [2010] AAIDD Manual provides several important guidelines for analyzing adaptive behavior. First, the analysis is often retrospective, in that it examines past behavior for evidence of conformity or non-conformity to the baseline standards for the subject's age and background. [2010] AAIDD Manual at 46; see also Hardy, 762 F.Supp.2d at 881 (noting that, in the context of an Atkins claim, the analysis is always retrospective). Second, in the absence of standardized measurements, analysts should examine multiple sources of information for “convergence”; exercise “reasonable caution” in resolving conflicting reports; and avoid drawing conclusions from isolated performances. [2010] AAIDD Manual at 48. That is, an evaluation should not rely primarily on an individual's self-report of his skill level, but rather should rely on information gathered from third parties who are “very familiar with the person and have known him/her for some time and have had the opportunity to observe the person function across community settings and times.” Id. at 47. Third, the analysis should focus on average ability, not peak functioning. Id. (describing this broader focus as a “critical distinction” between prongs one and two). And finally, clinicians should be mindful that subjects with mild intellectual disability present a complex picture of strengths and weaknesses, and analysts should not evaluate a subject's performance based on inaccurate stereotypes of disabled individuals. See id. at 7 (“[L]imitations often coexist with strengths.”).
Id. And Hardy describes the second prong similarly as follows:

For adaptive behavior, the current version of the APA's guidance requires concurrent deficits in at least two of eleven relatively specific areas of adaptive functioning. See DSM–IV–TR at 49. The AAMR/AAIDD takes a more holistic approach and treats adaptive behavior as a global characteristic that finds expression in three relatively abstract areas of functioning—conceptual, social, and practical—and requires deficits in just one of these three general domains. See [2002 AAMR Manual] at 13; [2010 AAIDD Manual] at 6. That is, “the three broad domains of adaptive behavior in [the AAMR's] definition represent a shift from the requirement ... that a person have limitations in at least 2 of the 10 specific skill areas listed in [the AAMR's] 1992 definition,” which was the model for the approach still used by the APA.... The AAIDD moved away from that model because “[t]he three broader domains of conceptual, social, and practical skills ... are more consistent with the structure of existing measures and with the body of research on adaptive behavior.” [2002 AAMR Manual] at 73, 78.
762 F.Supp.2d at 879. Hardy observed that, in the end, the exact wording of the various standards makes little substantive difference in this Atkins context:

While these differences in definition deserve note, they are ultimately of no consequence to the Court's task. Just as in Wiley [v. Epps, 625 F.3d 199 (5th Cir.2010) ], where the Fifth Circuit noted that the AAMR's definition has diverged from that of the APA, 625 F.3d at 216 n. 3, the Court need not decide which is preferable or correct, because the differences between them are mostly theoretical. Both the APA and the AAMR direct clinicians to the same standardized measures of adaptive behavior, such as the Vineland Adaptive Behavior Scales–II (VABS–II) and the AAMR's Adaptive Behavior Scale (ABS). See DSM–IV–TR at 42; [2002 AAMR Manual] at 76–78.
Id. at 879–80.

And with the recent release of the DSM–5, the same general statement— i.e., “the Court need not decide which [definition of prong two] is preferable or correct, because the differences between them are mostly theoretical,” id. at 880—certainly still applies here. See, e.g., Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 15 (testifying regarding the DSM–5 and 2010 AAIDD Manual that “they're right on point with each other. They're really describing the same thing.... [T]hey both point to the exact same type of criteria.”). Even after release of the DSM–5, prong two still “generally requires a more expansive investigation of a defendant's life history and skill levels than could be fully evaluated through use of a normed instrument.” Salad, 959 F.Supp.2d at 878. The adaptive functioning prong remains “amorphous,” Davis, 611 F.Supp.2d at 491, involving “significantly more subjective clinical judgment.” Hardy, 762 F.Supp.2d at 883.

Under the DSM–5, in contrast to the DSM–IV–TR, adaptive functioning deficits must be the direct result of intellectual disability. Doc. No. 2265, Tr. Dec. 19, 2013 (Dr. Denney) at 59; see Def.'s Ex. 1019, DSM–5 at 38 (“To meet diagnostic criteria for intellectual disability, the deficits in adaptive functioning must be directly related to the intellectual impairments described in [prong one]”).

Nevertheless, it also remains true that “[b]oth the APA and the [AAIDD] direct clinicians to the same standardized measures of adaptive behavior, such as the Vineland Adaptive Behavior Scales–II (VABS–II) and the [AAIDD's] Adaptive Behavior Scale.” Id. at 880. In this regard, the court heard evidence confirming that the ABAS–II is such a clinically-accepted metric of prong two, developed in part by Dr. Oakland. Doc. No. 2171, Tr. Sept. 9, 2013 (Dr. Oakland) at 21. “The purpose of the ABAS–II is to provide a reliable, valid, comprehensive, norm-based measure of adaptive behavior skills for children and adults from birth to age 89 years.” Def.'s Ex. 24, Test Review, 22 J. Psychoeducational Assessment 367 (2004). Here, the ABAS–II was administered by different experts to Defendant, Delilah Williams, and a former co-worker, and the court will place some weight in reviewing this standardized assessment of adaptive functioning.

5. Prong Three: Onset Before Eighteen (or “the Age of Development”)

“The third prong—onset before the age of eighteen—bears clarification because it is essentially a prerequisite to satisfying the first two prongs.” Wilson, 922 F.Supp.2d at 342. Although Wilson (and other courts) explain that “in deciding an Atkins claim, the court must determine whether the defendant ‘was mentally retarded at the time of the crime,’ ” id. (quoting Hardy, 762 F.Supp.2d at 881), the analysis is not necessarily so focused. Because, by definition, an intellectual disability manifests itself before the age of development (and does not change appreciably over time), the court is assessing a defendant's intellectual condition per se—and considers available evidence both at the time of the crime, as well as before and after.

A court's assessment, however, is “retrospective” in that the court must decide—often many years later—whether evidence proves that a condition has manifested itself by the age of development. See Holladay v. Allen, 555 F.3d 1346, 1353 (11th Cir.2009) (“Though the factors state that the problems had to have manifested themselves before the defendant reached the age of eighteen, it is implicit that the problems also existed at the time of the crime.”). “Thus, mental retardation must ‘be diagnosed, if it is to be diagnosed at all, retrospectively in every sense of the word.’ ” Wilson, 922 F.Supp.2d at 342 (quoting Hardy, 762 F.Supp.2d at 881).

“This does not mean that a defendant must be diagnosed with mental retardation before the age of eighteen, only that the disability's defining symptoms must have manifested themselves before the age of eighteen.” Id. at 342 n. 7. That is, “[d]isability does not necessarily have to have been formally identified, but it must have originated during the developmental period.” Id. (quoting 2010 AAIDD Manual at 27).

Thus, if Defendant fails to meet either prong one or two, he necessarily fails to meet prong three. See Candelario–Santana, 916 F.Supp.2d at 221 (concluding that “because [defendant] does not show signs of being mentally retarded presently, this third prong becomes moot”); Salad, 959 F.Supp.2d at 887 (“Having concluded that the Defendant does not have significant limitations in adaptive skills or intellectual functioning, the court does not address prong three.”).

IV. ANALYSIS

These Atkins proceedings present the court with a wide and comprehensive spectrum of relevant information about Defendant—multiple IQ tests over a seven-year period taken with different instruments; numerous full-scale neuropsychological examinations with complete test batteries; several standardized tests of adaptive functioning; interviews of Defendant's family members and former co-workers; reviews of Defendant's personal and employment history; reports examining Defendant's educational background, grades, and related test results ( e.g., college and military entrance examinations); and clinical opinions and testimony from psychologists, neuropsychologists, and psychiatrists.

This breadth of evidence enables the court to take a multifactorial approach in assessing whether Defendant has proven by a preponderance of the evidence that he has a “disqualifying mental capacity” under § 3596(c) and Atkins. After extensive and careful review of the record, the court concludes that Defendant has not met his burden of proof. The court explains its reasoning, analyzing each prong of the clinical test, below.

The court uses the three-prong clinical framework to structure its reasoning, and at the first two prongs, cites to particular exhibits or testimony to explain how that evidence factored into its decision. The parties, however, are familiar enough with the extensive factual record such that the court need not further reiterate in this already-lengthy Order all the evidence that was presented to, and considered by, the court at each stage. Suffice it to say, the court recognizes the stakes and seriousness of the Atkins issues, and has attempted to address each major point raised by the parties—even if some of the evidence is not discussed at length.

A. Prong One—Intellectual Functioning

From 2006 to 2013, Defendant was administered five full-scale WAIS or SB IQ tests, which various experts testified are “gold standard.” See, e.g., Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 27. That is, the WAIS and SB instruments are generally accepted in the psychological community as “appropriate, standardized and individually administered assessment instruments” of intellectual functioning. Def.'s Ex. 1015, 2010 AAIDD Manual at 31. These examinations were administered on behalf of Defendant (Drs. Young and James), the government (Drs. Goldstein and Denney), or as part of a court-ordered competency examination (Dr. Tyner, under supervision of Dr. Preston Baecht).

Some testimony indicates that because an SB–V examination does not measure “processing speed,” as does a WAIS instrument, in that sense it is not a “gold standard IQ test.” Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 44. Even if that were so, the test is still an appropriate instrument for assessing intellectual functioning, which the court accepts as similar to a WAIS instrument.

Additionally, Dr. Denney administered a Reynolds Intellectual Assessment Scales (“RIAS”) intelligence test, Gov't's Ex. 3, Sept. 23, 2013 Report of Denney at 2, 36, which is not prominently mentioned (if at all) as a generally-accepted instrument as are the WAIS and SB.

Defendant's full-scale IQ examination scores are summarized in the following table, which is derived from a similar table in Defendant's post-hearing brief (modified by the court to reflect additional information). See Doc. No. 2279, Def.'s Post–Hearing Br. at 24; Def.'s Ex. 1006, James Suppl. Rpt. at 7.

Year Test Was Given	Test Type	Evaluator	Full Scale IQ	Flynn–Adjusted IQ	Range (95% Confidence Interval) ²⁵
2006	WAIS–III	Dr. Young	73 (corrected to 74 per testimony of Drs. Boone and Denney)	70 (corrected to 71 per Drs. Boone and Denney)	67–75 (69–78 as reported by Dr. Young. Gov't's Ex. 36)
2009	WAIS–III	Dr. Tyner (Dr. Preston Baecht)	88 (corrected to 85 or 86 per Drs. Boone and Denney)	84 (corrected to 81 or 82 per Drs. Boone and Denney)	80–88
2010	WAIS–IV	Dr. Goldstein	79 (corrected to 78 per Dr. Boone)	78 (corrected to 77 per Dr. Boone)	74–83 (75–83 as reported by Dr. Goldstein. Gov't's Ex. 24)
2013	SB–V	Dr. James	78	74	71–79
2013	WAIS–IV	Dr. Denney	84	82	78–86

Much testimony during these Atkins proceedings focused on purported uncertainties or unreliability of particular IQ test scores, or ranges of scores—and the court discusses some of that evidence below. But, when all is said and done, the range of Defendant's full scale IQ scores generally falls above the range that clinical sources consider as satisfying prong one ( i.e., a full scale IQ range of approximately 65 to 75). Even Defendant concludes in his post-hearing brief that his Flynn-adjusted full-scale IQ scores “cluster” between “77.6 and 77.3,” when considering all the data. See Doc. No. 2279, Def.'s Post–Evidentiary Hearing Br. at 23 (“With the Denney and Boone corrections, the scores actually cluster, assuming that they are all considered, between 77.6 and 77.3—these are both Flynn adjusted ‘cluster’ calculations.”).

See DSM–IV–TR at 49 (referring to “significantly subaverage intellectual functioning” as an IQ “of approximately 70 or below on an individually administered IQ test”); see also DSM–5 at 37 (“On tests with a standard deviation of 15 and a mean of 100, [intellectual disability] involves a score of 65–75 (70 +/- 5)”).

At best for Defendant, considering a 95 percent confidence interval, the scores cluster at the extreme upper part of an approximately two SEM range recognized by some authorities from an IQ score of 70. See Def.'s Ex. 1015, 2010 AAIDD Manual at 36 (“From the properties of the normal curve, a range of confidence can be established with ... parameters of two standard error of measurement ( i.e., scores of about 62 to 78, 95% probability).”). Indeed, Defendant admits that he “is not claiming to be in the heartland of what used to be called moderate mental retardation, or to possess a severe intellectual disability.” Doc. No. 2064–1, Def.'s Mot. at 46. Rather, “he contends that his intellectual disability, as defined by the DSM–5 and by the current AAIDD classification criteria, places him in a zone of impairment which disqualifies him from the death penalty.” Id.

Some caselaw indicates that a 95 percent confidence interval is too high for Atkins purposes, and that, instead, a narrower 66 percent interval—a range of one SEM below to one SEM above a score—is more appropriate. See Wilson, 922 F.Supp.2d at 347. Ultimately, however (because the court concludes that Defendant is not intellectually disabled with the higher confidence interval), the court need not decide which interval is more appropriate in this Atkins context.
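The relationship between the one-SEM and two-SEM ranges discussed above can be sketched as follows. This is a minimal illustration, not a clinical method: the function name is hypothetical, and the SEM of 4 points is only inferred from the AAIDD's quoted range of “about 62 to 78” around a score of 70; actual SEM values vary by instrument.

```python
def confidence_band(score, sem, n_sem):
    """Return the (low, high) band lying n_sem standard errors around a score."""
    return (score - n_sem * sem, score + n_sem * sem)


# Assumption: an SEM of 4 is implied by the quoted range of 62 to 78,
# since (78 - 62) / 4 = 4. It is used here for illustration only.
ILLUSTRATIVE_SEM = 4

two_sem_band = confidence_band(70, ILLUSTRATIVE_SEM, n_sem=2)  # roughly 95 percent
one_sem_band = confidence_band(70, ILLUSTRATIVE_SEM, n_sem=1)  # roughly 66 percent

print("two-SEM band:", two_sem_band)
print("one-SEM band:", one_sem_band)
```

Under that assumed SEM, the narrower one-SEM band excludes scores that the two-SEM band would still reach, which is why the choice of interval could matter in a closer case.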

Defendant thus appears to recognize that the overall data indicates that, at best, he straddles clinical lines between BIF and intellectual disability. And for good reason—Defendant is hard-pressed to distinguish 2008 testimony from his own experts (Drs. Stewart and Young) explicitly opining that he is not intellectually disabled, given his IQ scores and their review of other information. Instead, both of these experts emphasized on his behalf that he suffers from BIF, and that BIF is distinct from “intellectual disability.”

“A diagnosis of borderline intellectual functioning will not qualify for exemption from the death penalty.” Holladay v. Allen, 555 F.3d 1346, 1353 (11th Cir.2009) (citation omitted).

1. Prior Testimony of Drs. Stewart and Young

Specifically, Dr. Stewart testified unequivocally in November 2008 that, to a reasonable degree of medical certainty, “Mr. Williams has a condition referred to as [BIF].” Doc. No. 2065–1, Tr. Nov. 3, 2008 (Dr. Stewart) at 42. Dr. Stewart described BIF as follows:

[BIF] is a diagnostic category that's listed in the DSM–IV–TR. It's referred to as a V code or other condition of clinical interest that would be listed on Axis II of the five-axis diagnosis. The DSM–IV as well as the supporting literature talks about people that have IQs in the range of 71 to 85. This is a category right above the next category higher, if you will, for mental retardation.
Id. at 43. In the following exchange with the court, Dr. Stewart then testified that Defendant was not intellectually disabled:

The DSM–5 also lists BIF as a “V code” (under “Other Conditions That May Be a Focus of Clinical Attention”), although it no longer contains a specific IQ spread:
V62.89 (R41.83) Borderline Intellectual Functioning. This category can be used when an individual's borderline intellectual function is the focus of clinical attention or has an impact on the individual's treatment or prognosis. Differentiating borderline intellectual functioning and mild intellectual disability (intellectual development disorder) requires careful assessment of intellectual and adaptive functions and their discrepancies[.]

Def.'s Ex. 1022, DSM–5 at 727.

THE COURT: So that I can understand better, it is—it is not your diagnosis that the defendant is mentally retarded; it is your diagnosis that he is borderline—he has borderline intellectual functioning. In other words, those are two different things.

THE WITNESS: They are two different things, Your Honor.

THE COURT: So you are not suggesting that in your opinion he is mentally retarded.

THE WITNESS: No, Your Honor.
Id. at 43. Dr. Stewart explained:

[M]ild mental retardation is the—the term to use when people have IQs in the mid–70 range to significantly lower, it's mild, moderate and severe. And most people that you encounter with mental retardation fall within the mild—mildly mentally retarded range.... But what I think it's important to just note in Mr. Williams' case is that although his IQ, taken in and of itself, doesn't rule out the presence of mental retardation, his overall functioning does.
Id. at 45–46.

On November 4, 2008, Dr. Young testified similarly, and perhaps more adamantly:

[T]he DSM–IV would then categorize Naeem Williams based on his IQ or his intellectual testing in the borderline range.... [T]his [ (referring to a diagram) ] is the average range, and then you have the low average, and then you have the borderline, and then you have the mental retardation, or the intellectually deficient or exceptionally low.
Doc. No. 2065–2, Tr. Nov. 4, 2008 (Dr. Young) at 66. She specifically rejected mental retardation:

Q. (Defense counsel) ... [B]ased on the information that you acquired in the course of your entire assessment, did you consider but reject the notion of Mr. Williams being mentally retarded?

A. I definitely considered it. And I did reject it.
Id. She concluded that Defendant has a “mental condition” of BIF, and again stated that he is not mentally retarded:

Q. ... [I]n connection with your—with your work with Naeem Williams, did you—based on your entire assessment, including your assessment of IQ as you've described it to us, did you arrive at an opinion of whether borderline intellectual functioning or borderline intellectual disability was a mental condition of Naeem Williams?

A. Yes, I did.

Q. And what was your opinion or what is your opinion?

A. My opinion is two prong. First of all, my opinion is that he's not mentally retarded. But my opinion also is that he is intellectually disabled, and that ... needs to be considered as a part of who he is and how he—what his needs are going to be in the world.
Id. at 74–75. She later repeated her opinion that he “is certainly not mentally retarded,” but is “in the borderline range”:

Q. Do you have an opinion concerning his cognitive functioning?

A. Yes, I do.

Q. What is that opinion?

A. That opinion is, is that his general cognitive ability or general intellectual ability is in the borderline range. So it is limited, but it is certainly not mentally retarded, but he has disabilities and that's who he is.
Id. at 114.

Defendant points out that Drs. Stewart and Young testified in 2008— before the current 2010 AAIDD Manual and DSM–5 were promulgated—but Defendant has presented no convincing evidence (if any) that these opinions would be different ( i.e., that Defendant does not have BIF, but is instead intellectually disabled) under newer clinical standards. Indeed, current clinical standards put less emphasis on IQ scores and more emphasis on prong two (adaptive functioning) than do previous versions ( e.g., the 2002 AAIDD Manual and the DSM–IV–TR). And Dr. Stewart testified that Defendant was not intellectually disabled precisely because of prong two. See Doc. No. 2065–1, Tr. Nov. 3, 2008 (Dr. Stewart) at 46 (“But what I think it's important to just note in Mr. Williams' case is that although his IQ, taken in and of itself, doesn't rule out the presence of mental retardation, his overall functioning does.”). This suggests that, if anything, the wording of current standards would be a basis for confirming, not changing, these prior opinions. In any event, the court will consider Dr. Young's and Dr. Stewart's testimony in this regard at face value, and—in the absence of any evidence to the contrary—not assume that their opinions would change under newly-articulated standards.

As indicated above, Dr. Stewart observed several of the December 2013 hearings in person but did not testify to disclaim or clarify his prior opinions, and the government had indicated since at least October 2013 that his prior opinions and testimony would be at issue. See Doc. No. 2188, Gov't's Initial Opp'n at 12 (“Most importantly, Dr. Stewart, who interviewed Defendant years ago, never claimed in his report or his testimony that Defendant was [intellectually disabled].”). The court, of course, does not speculate as to what was not presented, and instead focuses on the evidence before it in assessing whether to accept Defendant's argument attempting to explain his experts' prior testimony—testimony that the court (Judge Ezra) relied upon, at least in part, so as to allow Defendant to introduce evidence of BIF at a guilt phase of trial.

2. Testimony of Drs. James and Boone

Even beyond Dr. Stewart's and Dr. Young's conclusions quoted above, much of Defendant's evidence presented in December 2013 was similar— i.e., indicating that his unadjusted full-scale IQ was above the 65 to 75 range. Defense expert Dr. James administered an SB–V examination to Defendant as part of a formal neuropsychological evaluation in February and April 2013. She reported Defendant's full scale IQ on the SB–V as 78, which she “Flynn adjusted” to 74. Def.'s Ex. 1004, James Rpt. at 2. (Dr. Denney believes a Flynn adjustment, if it applies, should be three points—not four—assuming the SB–V was normed in 2002, and a 0.3 point correction per year. Gov't's Ex. 3, Denney Rpt. at 24. In the larger picture, the court does not ascribe much significance to this professional disagreement.)
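The arithmetic behind a three-point versus a four-point adjustment can be laid out as follows, as an illustration only. It uses the 2002 norming year and 2013 administration stated above. The 0.33 rate in the last line is an assumption offered as one possible source of a four-point figure (the passage does not state which rate Dr. James applied), and rounding to the nearest whole point is likewise assumed.

```latex
\begin{align*}
\text{years since norming} &= 2013 - 2002 = 11 \\
0.30 \times 11 &= 3.3 \approx 3 \quad\Rightarrow\quad 78 - 3 = 75 \\
0.33 \times 11 &= 3.63 \approx 4 \quad\Rightarrow\quad 78 - 4 = 74
\end{align*}
```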

Dr. James wrote an earlier report indicating a full scale IQ of 73 (not 78), which she then “Flynn adjusted” to 69. See Gov't's Ex. 23, James Rpt. at 2. But she later issued a revised report that corrected the score to 78 due to “clerical” or scoring errors. Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 128–142.

Dr. Woods relied on Dr. James' earlier report of a Flynn-adjusted score of 69 in initially opining that Defendant is intellectually disabled. See Def.'s Ex. 1002, Woods July 29, 2013 Rpt. at 26. Both Dr. James and Dr. Woods testified that a five-point increase made no difference in their conclusions that Defendant is, nevertheless, intellectually disabled. See Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 143–46, 149–50; Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Woods) at 116–17.

The court views that testimony (that such a five-point difference is not significant) with some skepticism—even considering the SEM and confidence intervals of the scores, the court is not convinced that a five-point increase in a Flynn-adjusted score of 69 to 74 would not be important information. See Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 54 (testifying that a five point change from 73 to 78 is “meaningful,” would move out of a confidence interval, and change from a possible range of “mild mental retardation” to “borderline”). Although not dispositive, a Flynn-adjusted score below 70 would be worthy of at least some consideration as compared to a score of 74—i.e., it would be significant.

In any event, the court accepts Dr. James' score of 78, and will consider that score, along with a Flynn adjustment and 95 percent confidence interval, in conjunction with all other data.

Although Dr. James ultimately opined that Defendant “fulfills the criteria for a diagnosis of [intellectual disability],” Def.'s Ex. 1006, James Suppl. Rpt. at 10, her written opinion was based heavily on giving the most weight to Dr. Young's 2006 IQ score. See id. at 7. And, in assessing her credibility during the hearing, the court views Dr. James' testimony in this regard as equivocal, otherwise dependent on a host of qualifiers:

Q. [D]oes [Defendant] meet the AAIDD green book definition [of prong one]?

A. Yes, he does. Yes.

Q. And can you explain the basis on which you're expressing that opinion?

A. Yes. He demonstrates—again, when you look at the scores and you look at the scores in terms of the caveats that were mentioned in terms of standard error of measurement and practice effects and other things that need to be considered in the accuracy of the scores, with the AAIDD talking about an approximately two standard deviations below the means where the approximately is there deliberately to allow for the role of clinical judgment in identifying the intellectual deficits, yes, I do.
Doc. No. 2259, Tr. Dec. 11, 2013 (Dr. James) at 105.

Dr. Boone—who has particular expertise in “performance validity,” e.g., assessing effort of or malingering by test subjects, but did not conduct her own examination—reviewed all other IQ testing, and was even more equivocal when asked at the hearing about her view of Defendant's intellectual functioning:

Q. ... [W]hat is your opinion with regard to [whether or not Mr. Williams is suffering from intellectual disability]?

A. That the IQ scores would be consistent with that possibility.

Q. With that possibility?

A. Correct.

....

Q. How sure of that possibility are you?

A. Well, I'm sure this has been testified to, but when we obtained an IQ score for various reasons, it might not be completely accurate, so there's a confidence interval. I believe that when we look at the confidence interval for his IQ scores, that interval would overlap with the mentally retarded range, which raises the very real possibility that he is functioning in the mentally retarded range.

....

... And all we know is that the score falls within a particular range. And his range overlaps with mental retardation. So we certainly can't say he's not within the mental retardation range.... The test results are consistent with that possibility. The confidence interval indicates that his score may fall below 70. So we cannot rule out mental retardation.
Doc. No. 2261, Tr. Dec. 13, 2013 (Dr. Boone) at 62–63.

The court accepts (without having to make an actual finding) Dr. Boone's conclusions that Defendant “passed symptom validity indicators” from his IQ testing in 2006, 2009, and 2010, see Def.'s Ex. 1008, Boone Rpt. at 3–4, and at later times. See Def.'s Ex. 1009, Boone Supp. Rpt. at 3. Her reports as to “symptom validity,” however, were limited to prong one (and not, for example, on the ABAS–II, a measure used to assess prong two). Indeed, she testified that she is not an expert on adaptive functioning and was called to comment on IQ scores. See Doc. No. 2261, Tr. Dec. 13, 2013 (Dr. Boone) at 70–71.

3. IQ Testing by Drs. Young, Tyner, Goldstein, and Denney

As summarized in the table above, the government presented other evidence that Defendant's full-scale IQ was even further above the range traditionally accepted as fulfilling prong one. Dr. Tyner administered a WAIS–III examination in 2009, resulting in a score of 88. See Doc. No. 2065–6, Tr. Aug. 7, 2012 (Dr. Preston Baecht) at 46. Dr. Goldstein administered a WAIS–IV in 2010, resulting in a score of 79. See, e.g., Gov't's Ex. 24; Gov't's Ex. 3, Denney Rpt. at 22. And in 2013, Dr. Denney administered a WAIS–IV, resulting in a score of 84. See Gov't's Ex. 25; Gov't's Ex. 3, Denney Rpt. at 29.

The table also lists Flynn–Effect adjustments, ranges at a 95% confidence interval, and adjustments or corrections by different witnesses.

In response to these scores, Defendant—despite Dr. Young's 2008 testimony that Defendant is not mentally retarded—emphasizes Dr. Young's administration of a WAIS–III in 2006, which resulted in a score of 73, with a 95 percent confidence interval of 69 to 78. See, e.g., Gov't's Ex. 36. Both Dr. James and Dr. Boone, as well, relied primarily on Dr. Young's scores (and discounted other higher scores for various reasons) in opining that Defendant is intellectually disabled. See, e.g., Def.'s Ex. 1006, James Suppl. Rpt. at 7 (“[T]he Full Scale score that should be given the most weight is Dr. Young's, as her assessment was the closest in time to the offense[.]”); Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. Boone) at 183–85 (similar testimony).

To fulfill this first prong, then, Defendant's argument rests on the court accepting Dr. Young's IQ score unconditionally, and simultaneously discounting or disregarding Drs. Tyner's (88), Goldstein's (79), James' (78), and Denney's (84) test results. The government, on the other hand, challenges the reliability of Dr. Young's scores, characterizing them—and not Dr. Tyner's—as the “outlier.” Doc. No. 2281, Gov't's Opp'n at 5. The court addresses some of these concerns below, but ultimately concludes that, although each score has some uncertainties (which is the purpose of a SEM), none is so inherently unreliable that the court will ignore or eliminate any given score or range of scores—especially those at either extreme. Rather, the court considers all the data, recognizing that uncertainties exist.

a. Dr. Young's January 2006 testing

There is indeed some basis to question the reliability of Dr. Young's full scale IQ score of 73 for Defendant. See Gov't's Ex. 36. In particular, others had difficulty confirming her scoring because of a lack of “raw data”—when administering the examination, she did not simultaneously record certain information (such as Defendant's exact verbal answers to some questions) in the test materials. See Gov't's Ex. 37 (portions of Dr. Young's WAIS–III test of Defendant). Such a practice might be contrary to established protocol. See Gov't's Ex. 3, Denney Rpt. at 15 (“Her raw test data for the WAIS–III was completed in an atypical manner ... She also did not place her scorings on either the print out or the protocol, thus making it impossible to verify her scoring.”); Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 21 (“Q. [I]s that standard protocol to not indicate on the form what the score was? A. No.”). Rather, her notes indicate she recorded answers in a computer the next day. Id. at 22.

More curiously, neither side could explain convincingly how or why Dr. Young's raw scoring data indicates some answers by Defendant to nonexistent questions. In particular, a section of the WAIS–III titled “similarities” has 19 questions, and Dr. Young's scoring of that section lists 23 answers (with questions 20 to 23 answered “idk,” presumably for “I don't know”). See Gov't's Exs. 37 and 39; Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 23 (“This is what is really purely astounding ... she continued typing his responses past the end of the subtest.”); Doc. No. 2261, Tr. Dec. 12, 2013 (Dr. Boone) at 137–38 (“I don't understand what happened, but somehow she made a mistake there.”).

On the other hand, Defendant points to testimony from 2008, where government expert Dr. Hall—although disagreeing with some of Dr. Young's conclusions regarding neuropsychological tests—testified that he had “no problems” with Dr. Young's test's “reliability and validity” and he “actually rescored some of [Dr. Young's] measures and she was accurate in the scoring.” Doc. No. 2065–4, Tr. Nov. 6, 2008 (Dr. Hall) at 17.

Ultimately, however, experts for both sides (Dr. Boone for Defendant, and Dr. Denney for the government) were able to review the available data in an attempt to verify the accuracy of Dr. Young's scoring, and both agreed that Dr. Young's full scale IQ score should be raised one point. See Doc. No. 2261, Tr. Dec. 13, 2013 (Dr. Boone) at 9 (opining that Dr. Young's full-scale IQ score of 73 for Defendant should be a 74, given a change in the scoring for a vocabulary subscore); Doc. No. 2265, Tr. Dec. 19, 2013 (Dr. Denney) at 40–41 (indicating that Dr. Denney rescored Dr. Young's examination and also obtained a higher vocabulary subscore, resulting in a full scale score of 74). The WAIS–III was apparently published in 1997, with data normed in 1996 or 1997. See Def.'s Ex. 1029. Thus, assuming a “Flynn adjustment” of approximately 0.3 (or 0.33) per year, Dr. Young's full-scale IQ would “adjust” to 70 or 71.

Despite some questions about Dr. Young's administration of the 2006 WAIS–III test, however, the court is not convinced that (as the government argues) “Dr. Young did not perform the WAIS–III testing with a generally accepted protocol” or that her testing is “replete with errors and is not reliable.” Doc. No. 2281, Gov't's Opp'n at 6. In short, the court will consider Dr. Young's original test results, as well as the opinions of Drs. Boone and Denney that her full-scale IQ score for Defendant should be a 74 (rather than a 73).

b. Dr. Tyner's April 2009 testing

At the other extreme, Defendant challenges the full scale IQ score of 88 obtained in April 2009 on a test administered by Dr. Tyner. In this regard, it is apparently undisputed that Dr. Tyner administered the WAIS–III test while serving in a post-doctorate capacity and had relatively limited experience in IQ test administration. See Gov't's Ex. 40; Doc. No. 2065–5, Tr. June 12, 2012 (Dr. Tyner) at 47. Dr. Tyner, however, was supervised by Dr. Preston Baecht—and Dr. Preston Baecht prepared and signed the Forensic Report regarding competency, as a complete battery of testing (not just an IQ test by Dr. Tyner) had been administered. Doc. No. 825, Forensic Rpt. at 13.

Defendant's scores were a full scale IQ of 88, a performance IQ of 90, and a verbal IQ of 88. Gov't's Ex. 40.

Defendant previously objected to the release or use of these 2009 test results, which were obtained for assessing his competency to stand trial. But the court ultimately allowed mental health experts from both sides to consider the April 2009 results and clinical findings. See Doc. No. 1266, Aug. 11, 2010 Order Granting the Government's Motion to Allow the Mental Health Experts, for Both Sides, to Consider the Test Results and Clinical Findings Contained in the Competency Report (Ezra, J.).

In 2012, Defendant also challenged the reliability and admissibility of this information for trial purposes. See Doc. No. 1853, Def.'s Mot. to Exclude or Limit Testimony of Dr. Preston–Baecht or Other BOP Competence–Related Examiners. But the court (Judge Ezra)—after three days of Daubert hearings—found the test results and proffered testimony to be sufficiently reliable under Federal Rule of Evidence 702. Doc. No. 1928, Order Denying Def.'s Mot. to Exclude or Limit Testimony at 17. The court rejected an argument that the evidence was unreliable as having been developed only for competency purposes. Id. at 15 (“The fact that Drs. Tyner and Preston Baecht examined Defendant to assess his competency does not undermine the reliability of their testimony as long as the ‘principles and methodology’ they employed ‘are grounded in the methods of science.’ ”). The court accepted the use of a WAIS–III, even though the WAIS–IV had been recently released in 2009, whereas the WAIS–III was several years old. Id. at 21–22. The court considered testimony from Dr. Boone that Dr. Tyner made subjective scoring errors, had insufficient experience, and improperly administered the test on different days. The court concluded that any “errors in administration and scoring identified by Defendant are not fatal to the reliability of the [Bureau of Prisons'] experts' testimony.” Id. at 21.

Neither Dr. Tyner nor Dr. Preston Baecht testified again during the December 2013 Atkins proceedings. Similar evidence, however, was presented by other witnesses regarding the April 2009 testing, including that Defendant suffered a major diabetic attack during the battery of tests (only a few days after the WAIS–III testing). See, e.g., Gov't's Ex. 3, Denney Rpt. at 18; Def.'s Ex. 1008, Boone Rpt. at 7. During the Atkins proceedings, Dr. Boone (for Defendant) and Dr. Denney (for the government) both presented evidence that they reviewed Dr. Tyner's raw test data from April 2009, and opined that the full scale IQ score should be lower. In particular, Dr. Boone re-scored the test as an 85. Def.'s Ex. 1008, Boone Rpt. at 7. Dr. Denney rescored it an 86. Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 24 (“I came up with an 86.... I was able to score this protocol rather easily, and I can see which items we disagreed on.”).

Evidence was also presented about the practice effect, particularly because Defendant had been administered the same instrument (a WAIS–III) three years earlier by Dr. Young. As set forth above, the court will consider the existence of a possible practice effect (both in administering the same instruments, and in administering multiple IQ tests over a period of time). The court, however, will not lower this score any particular amount (whether from the original score of 88, or from either of Dr. Boone's or Dr. Denney's lower scores of 85 or 86), especially given the relatively long three-year passage of time between test administrations.

In summary, having considered the record as a whole, and having heard the testimony regarding the April 2009 testing, the court concludes that Dr. Tyner's test data is sufficiently reliable to at least consider both the original full scale score of 88, and related subscores. But the court will also accept the opinions of Drs. Boone and Denney that Dr. Tyner's 2009 full-scale IQ score should be lower by two points (to an 86) or three points (to an 85). Applying a Flynn adjustment to this WAIS–III (normed in 1997) examination would lower scores by approximately four points.

c. Dr. Goldstein's May 2010 testing

Dr. Goldstein administered a WAIS–IV examination on Defendant in May 2010, and obtained a full scale IQ score of 79 (with a range of 75 to 83, at a 95 percent confidence interval). See Gov't's Ex. 24. Dr. Boone reviewed Dr. Goldstein's testing and testified that the scoring was accurate. Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. Boone) at 202. Although this was Defendant's first time taking the WAIS–IV, Dr. Boone testified that there still would be a practice effect from Defendant's having taken the WAIS–III a year earlier. Id. She later testified, however, that she independently reviewed Dr. Goldstein's testing materials and—acknowledging some degree of subjectivity in scoring—rescored the test by lowering the full scale IQ by one point. Doc. No. 2261, Tr. Dec. 13, 2013 (Dr. Boone) at 13. That is, according to Dr. Boone, this full scale score should be a 78 (not a 79). Thus, assuming a “Flynn adjustment” of approximately 0.3 (or 0.33) per year, Dr. Goldstein's full-scale IQ would adjust to 77 or 78 (the WAIS–IV was normed in 2006, Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. James) at 153).
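The arithmetic behind the “Flynn adjustment” figures above can be illustrated with a minimal sketch. It assumes the adjustment is simply the per-year rate multiplied by the whole number of years between the norming year and the test year, then subtracted from the obtained score; the function name and the rounding to whole points are illustrative assumptions, not a method drawn from the record.

```python
def flynn_adjust(score, test_year, norm_year, rate_per_year):
    """Hypothetical helper: subtract rate_per_year points for each
    whole year between the norming year and the test year."""
    years_elapsed = test_year - norm_year
    return score - rate_per_year * years_elapsed


# Figures stated in the text: WAIS-IV normed in 2006, administered in 2010.
for score in (79, 78):        # Dr. Goldstein's score; Dr. Boone's rescoring
    for rate in (0.3, 0.33):  # the two per-year rates mentioned
        adjusted = flynn_adjust(score, 2010, 2006, rate)
        print(score, rate, round(adjusted, 2), round(adjusted))
```

Under these assumptions the original score of 79 rounds to 78 and the rescored 78 rounds to 77 at either rate, consistent with the “77 or 78” range stated above.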

Having considered all the testimony, the court accepts Dr. Goldstein's original full-scale score (79), understanding also that Dr. Boone believes it should be a 78, and that an unspecified practice effect creates some uncertainty in these scores.

d. Dr. Denney's August 2013 testing

Finally, Dr. Denney administered another WAIS–IV in August 2013, as part of his neuropsychological evaluation of Defendant. Gov't's Ex. 3. He obtained a full scale IQ score of 84 (with a range of 80 to 88, at a 95 percent confidence interval). Gov't's Ex. 3, Denney Rpt. at 29.

Dr. Boone, on behalf of Defendant, testified that she reviewed Dr. Denney's data (including his test booklet for the WAIS–IV) and did not find any errors. Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. Boone) at 210. Defendant, however, challenges Dr. Denney's score, again based upon an unspecified practice effect, but emphasizing “progressive error”—pointing out that he had been administered the same test three years earlier, had been administered a WAIS instrument four times over seven years, and took the SB–V given by Dr. James only a year earlier. For example, Dr. Boone was asked “is there literature that actually indicates that there would be a ... practice effect-related problem if you have tests that are spaced as they were in this case: a WAIS–III in 2006; a WAIS–III in 2009; a WAIS–IV in 2010; Stanford–Binet 2013; WAIS–IV 2013?” Id. at 210. Dr. Boone testified that the literature in the field “specifically talk[s] about the practice effect issue, ... that the person learns from the test administration procedures separate from any item content so that they will get inflated scores just because they've been through the similar procedures.” Id. at 211.

Nevertheless, the court accepts Dr. Denney's test results. Again, however, the court will consider that a practice effect may exist with repeated IQ testing, but will not make a specific point deduction in Dr. Denney's score. Rather, the court is convinced by Dr. Denney's general testimony that although “[t]here is no doubt that there is practice effect upon retesting ... [i]t varies upon the length and it depends on what sort of test you're using,” such that “progressive error” is a “possibility.” Id. at 188. Such a possibility, however, is likely within given confidence intervals, and ultimately can become moot depending upon an entire record of data. Id. at 186–87. Dr. Denney emphasized—consistent with the court's review of the clinical standards—that “[w]e cannot get overly focused with blinders on to some little number.” Id. at 13. “[A] clinician has to arrive at some level of confidence regarding these test scores. And to ... inform that level of confidence you look at how it's scored, how it's done ... but you don't get overly focused on, again, small numbers because, frankly, the precision in our assessment is not that precise.” Id. at 14. It is in that context that the court considers a practice effect.

Aside from Defendant's challenges to Dr. Denney's IQ scoring, the court finds credible Dr. Denney's testimony and opinion that if Defendant had a mathematics disorder, but is not intellectually disabled, such a disorder could also contribute to his relatively low IQ scores (whether characterized as being consistent with intellectual disability, or borderline, or subaverage). See Gov't's Ex. 3, Denney Rpt. at 36 (“[Achievement tests] indicate the presence of a specific learning disorder relating to mathematic abilities not [intellectual disability.]”); id. at 37 (“Such a mathematics related disability can affect IQ test results.”); id. (“Extreme deficits in mathematics abilities would clearly affect the Quantitative Reasoning factor [of the SB–V]”) (quoting the SB–V Interpretive Manual); Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 66–67 (acknowledging that the SB–V Interpretive Manual states that “if ... observations and achievement test data show specific patterns of delayed learning, reading or mathematics deficiencies, and low SB–5 working memory scores have deflated the FSIQ, further testing should be conducted,” meaning that “we should look at other achievement-related tests ... to see [if] there is a specific area of deficit or weakness that this person has that would then be accounting for the low [IQ] score overall”). The court assumes that “comorbid” conditions are possible, if not common. See, e.g., Wilson, 922 F.Supp.2d at 363 n. 29 (assuming that “mental retardation can coexist with a learning disability”). But, given the extensive evidence of Defendant's problems with mathematics, the court gives credence to the possibility that such a mathematics disorder—and not intellectual disability—is reflected in Defendant's relatively low IQ scores. See also Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Woods) at 164 (“[I]f they have a math disorder, that impacts their IQ, that's the way it should be.”).

Across the board, experts for both sides concluded that Defendant has always had significant problems with mathematics (e.g., a Scholastic Aptitude Test score of 200—the lowest possible score—on the mathematics section), and some diagnosed him with having a mathematics disorder. See, e.g., Gov't's Ex. 59, Oakland Rpt. at 6 (“chronic math difficulties”); id. at 15 (“Mr. Williams' academic history, including past and current achievement data, support a diagnosis of a Mathematics Disorder in light of standards from the [DSM–IV–TR].”); Tr. Sept. 9, 2013 (Dr. Oakland) at 90 (testifying that Defendant has a mathematics disorder); Tr. Dec. 12, 2013 (Dr. James) at 82 (“[H]e definitely has significant math impairment, but impairment in math can be due to multiple factors.”); Def.'s Ex. 1004, James Rpt. at 4 (“basic math calculation was extremely weak, roughly equivalent to the mid 3rd grade level”); Gov't's Ex. 3, Denney Rpt. at 36. See also Def.'s Ex. 1023 (providing the diagnostic features of Mathematics Disorder in the DSM–IV–TR).

4. Other Indicia of Intellectual Functioning

Aside from the full scale IQ testing discussed above, the record contains other indicia of Defendant's intellectual functioning. The court discusses some of the testimony here, but, overall, such evidence tends to confirm—rather than contradict—that Defendant does not have sufficiently “significant subaverage intellectual functioning” under prong one.

For example, Dr. Denney administered a RIAS examination in 2013, and obtained scores consistent with Dr. Denney's WAIS–IV scores—that is, above a range of intellectual disability. See Gov't's Ex. 3, Denney Rpt. at 29 (RIAS Composite Index score of 91). Dr. Denney acknowledged that a RIAS has no “processing speed” aspect and that sources indicate a person generally scores about five points higher on a RIAS than other clinically-accepted intelligence tests. Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 10. Likewise, Dr. Young administered a Test of Nonverbal Intelligence (“TONI”) in 2006 in conjunction with a WAIS–III, and Defendant scored a 73 on the TONI. Doc. No. 2065–2, Tr. Nov. 4, 2008 (Dr. Young) at 64. Ultimately, then, these other tests of intellectual functioning are consistent with each expert's other full scale IQ test scores—considered in detail above—and thus confirm their respective opinions but neither prove nor disprove the overriding Atkins question before the court.

Similarly, various experts administered many other instruments as part of comprehensive neuropsychological test batteries. For example, the record contains numerous results for tests such as the Test of Variables of Attention, Wechsler Memory Scale, Wide Range Achievement Test, Wisconsin Card Sorting Test, Delis–Kaplan Executive Function System, California Verbal Learning Test, Story Memory Test, Figure Memory Test, Digit Vigilance Test, Green's Word Memory Test, Test of Memory Malingering, the Woodcock–Johnson Achievement Test, and the Minnesota Multiphasic Personality Inventory. See, e.g., Def.'s Ex. 1004, James Rpt. at 1; Gov't's Ex. 3, Denney Rpt. at 2. Further, as noted above with various IQ tests, each administration of an IQ test contains various subscores for areas such as working memory, verbal comprehension, and processing speed.

Some of these tests, such as the Test of Variables of Attention, were useful as standalone “performance validity” tests to assess factors such as effort and possible malingering.

Overall, however, because the clinical sources focus on the full-scale IQ scores, a detailed discussion of these other neuropsychological test results is not necessary. These other test results certainly factor into respective clinical opinions considered by the court, but are less important in specifically assessing prong one. But some of these subtest results (whether considered at prong one, or as part of Defendant's adaptive functioning at prong two), indicate that Defendant is “low average” or “average” in many relevant areas, consistent with him not being intellectually disabled. See, e.g., Def.'s Ex. 1004, James Rpt. at 2; Gov't's Ex. 3, Denney Rpt. at 28.

Several witnesses discussed Defendant's Scholastic Aptitude Test (“SAT”) and Armed Services Vocational Aptitude Battery (“ASVAB”) scores. It is undisputed that when Defendant took the SAT in high school in 1997, he scored a 350 verbal and 200 math (both on a scale of 200 to 800). See, e.g., Gov't's Ex. 3, Denney Rpt. at 10. It is also undisputed that Defendant took the ASVAB in 2000, but failed it on that first attempt. On Defendant's second attempt—taken after he studied for it—he obtained an Armed Forces Qualifying Test (“AFQT”) (which is a composite score made up of four ASVAB subtests) score of 31, which was sufficient for entry as an infantry soldier. See Gov't's Ex. 45, Wall Rpt. at 4, 8. This AFQT score means he scored “as well or better than 31 percent of individuals in the norm group ... which is based on a nationally representative sample of 18–23 year-olds taking the ASVAB in 1980.” Id. at 8.

The government proffered testimony from Dr. Gottfredson who utilized a “Frey–Detterman” formula to “convert” or estimate Defendant's IQ (or “g”) based on Defendant's SAT scores, concluding that his scores “yielded an IQ equivalent of 89.7.” Gov't's Ex. 1–2, Gottfredson App. 2 at 2–6. Similarly, Dr. Gottfredson opined that Defendant's 1999 and 2000 AFQT scores were equivalent to an IQ of 88 and 92.5. Id. at 2–4. If such a methodology were appropriate, such IQ scores would be well above a range considered to be intellectually disabled.

On the other hand, Dr. Boone testified that such formulas are “wildly inaccurate” and that it would “absolutely not” be generally accepted practice to use an IQ score obtained through Frey–Detterman to determine if someone is presently intellectually disabled. Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. Boone) at 196–97. She testified that, although SAT scores can give a “ball park idea” as to how a person was functioning at a given time, using an equation to obtain an IQ score would not be appropriate. Id. at 198. And, in any event, Dr. Boone opined that Dr. Gottfredson used the wrong equation and, if anything, Defendant's “equivalent” IQ score based on his SAT scores would be a 68. Doc. No. 2261, Tr. Dec. 13, 2013 (Dr. Boone) at 73; Def.'s Ex. 1008, Boone Rpt. at 7 (opining that Dr. Goldstein—who also used Frey–Detterman to obtain IQ equivalents from SAT scores—erred, and that Defendant's estimated IQ scores with the proper equation would be 68).

The court, however, ultimately finds it completely unnecessary to consider these “IQ equivalents.” Indeed, both parties point to Defendant's low SAT scores as evidence to support their positions—the government proffers Dr. Gottfredson's opinions, and Defendant has pointed to the scores as evidence of BIF and intellectual disability. See Doc. No. 2065–2, Tr. Nov. 4, 2008 (Dr. Young) at 79; Doc. No. 2260, Tr. Dec. 12, 2013 (Dr. James) at 104; Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Woods) at 36. But even if there is some basis to obtain an “equivalent” IQ score from an SAT and ASVAB score for some purposes, such tests are not “appropriate, standardized and individually administered assessment instruments” of intellectual functioning under AAIDD or APA clinical standards—and so, considering “equivalent” IQ scores would create further uncertainty in assessing intellectual functioning under prong one in this Atkins context. So, to be clear, the court is not considering Dr. Gottfredson's, or any other expert's, “conversion” of SAT or ASVAB/AFQT scores to “equivalent” IQ scores in making its determination here.

To summarize, the court has considered the totality of the evidence regarding Defendant's intellectual functioning, and ultimately agrees with the credible testimony of Dr. Denney, who testified in this regard that “when I look at the Flynn-adjusted IQ scores, even in taking in the confidence interval of each one of them ... it still does not go down into the range that is required for the diagnosis of intellectual disability.” Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 91. That is—after carefully considering the uncertainties inherent in all of Defendant's IQ test scores (as set forth in the table above), and examining other evidence of Defendant's intellectual functioning (including assessing the credibility of different expert witnesses)—the court concludes that Defendant has failed to prove by a preponderance of the evidence that he has significantly subaverage intellectual functioning (prong one of the general clinical standard for intellectual disability).

B. Prong Two—Adaptive Functioning

Where a defendant fails to prove prong one, some courts have found it unnecessary to reach prong two—a successful Atkins claim, when applying clinical standards, requires proving all three prongs. See, e.g., Wilson, 922 F.Supp.2d at 368 (“[T]he court concludes that Wilson has not satisfied his burden of proving that he more likely than not suffers from significantly subaverage intellectual functioning. He has thus failed to satisfy an indispensable prerequisite of the definition of [intellectual disability], and there is no need for the court to address the other requirements of the definition.”) (internal citations omitted).

Following such reasoning, this court could also stop at prong one and conclude that Defendant's Atkins claim necessarily fails. Even so, however, some sources indicate that Defendant might still be classified as intellectually disabled with an IQ score as high as 75, depending upon the level of deficits at prong two. See Def.'s Ex. 1020, DSM–IV–TR at 41–42 (“[I]t is possible to diagnose [intellectual disability] in individuals with IQs between 70 and 75 who exhibit significant deficits in adaptive behavior.”). And, despite the extensive discussion above, the court acknowledges that some of Defendant's scores are close to, or fall within, this range. Thus, given the emphasis on clinical judgment and a comprehensive approach, as well as the DSM–5's de-emphasis of IQ scores (and corresponding emphasis on adaptive functioning), the court will proceed to examine testimony and evidence of Defendant's adaptive functioning.

The court focuses on the three witnesses who testified before the court in September and December 2013 regarding adaptive functioning, and who personally performed clinical examinations of Defendant—Drs. Oakland, Denney, and Woods. (Although Drs. James and Boone opined that Defendant is intellectually disabled, they did so based largely on prong one, relying on assessments of others for the adaptive functioning prong. See Def.'s Ex. 1005, James Suppl. Rpt. at 2 (“I did not do a formal evaluation of adaptive functioning[.]”). And Dr. Boone admits that she is not an expert on adaptive functioning. See Doc. No. 2261, Tr. Dec. 13, 2013 (Dr. Boone) at 68–69, 71.)

Drs. Oakland and Denney both reported and testified, among other matters, that Defendant does not fulfill prong two's requirements. See Gov't's Ex. 59, Oakland Rpt. at 24; Gov't's Ex. 3, Denney Rpt. at 38. On the other hand, Dr. Woods opined that Defendant satisfies prong two, and “meets the AAIDD, DSM–IV [-]TR, and DSM–5 criteria for mild intellectual disability[.]” Def.'s Ex. 1002, Woods Rpt. at 28. The court assesses each opinion below, and is persuaded by Drs. Oakland and Denney, finding Dr. Woods' opinion to be less credible.

1. Dr. Oakland

As explained above, Dr. Oakland and a colleague developed the ABAS–II, which is a peer-reviewed and generally-accepted method of assessing adaptive behavior. See Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 21–22. Clinical and other sources mention the ABAS–II prominently as such a method. See, e.g., Def.'s Ex. 1015, 2010 AAIDD Manual at 5; Def.'s Ex. 1014, 2002 AAMR Manual at 81, 90; United States v. Northington, 2012 WL 4024944, at *5 (E.D.Pa. Sept. 12, 2012) (“The ABAS–II assesses certain adaptive functioning skills that directly correlate with the definition of intellectual disability.... The ABAS–II is often administered to defendants raising Atkins claims.”). Given such a credential, Dr. Oakland was clearly qualified to conduct an assessment of Defendant that focused, in relevant part, on Defendant's adaptive behavior. See Gov't's Ex. 59, Oakland Rpt. at 1.

Dr. Oakland examined Defendant while in custody on May 24, 25, and 26, 2010. Id. at 1–2. In addition to interviewing Defendant about his childhood and academic background, Dr. Oakland administered the ABAS–II, and the Woodcock–Johnson Test of Achievement–III (“WJ–III”). Id. at 3–6. He also interviewed Delilah Williams about Defendant, and she too was administered the ABAS–II as to Defendant's adaptive functioning. Id. at 6. Dr. Oakland concluded that “[Defendant's] adaptive behavior is estimated to be at least in the borderline range and may be in the below average range.” Id. at 24. That is, according to Dr. Oakland, Defendant fails to satisfy prong two's standard for intellectual disability.

Dr. Oakland's ABAS–II numerical scores of Defendant's functioning were in the “extremely low range” (Defendant's self-rating scores) or “extremely low to borderline range” (Delilah Williams' ratings of Defendant). Defendant scored a “Standard Score” of 63 on the ABAS–II's “General Adaptive Composite,” with subscores of 67 for the conceptual domain, 64 for the social domain, and 65 for the practical domain, all in the lowest category of “extremely low range.” Gov't's Ex. 59, Oakland Rpt. at 7. Delilah Williams' ABAS–II for Defendant resulted in a “Standard Score” of 70, with subscores of 75 for the conceptual, social, and practical domains, where a range of 71 to 79 constitutes the “borderline range.” Id. That is, the standard score was “extremely low” and the subscores were borderline.

Dr. Oakland nevertheless concluded that those ratings lack reliability and are of questionable validity. Id. at 8–9; see also id. at 19 (“It is my opinion that scores from the ABAS underestimate and under-represent Mr. Williams' optimal capacity to possess and employ adaptive behavior and skills.”). He explains:

The measurement of adaptive behavior by the ABAS distinguishes between whether a person possesses a skill versus the extent to which the skill is actually used by that person when needed. For example, a person may have the ability to wash dishes either by hand or by placing them in a dishwasher yet not perform this task when needed. When completing the ABAS, a score of 0 is assigned to a test item when a person cannot perform the behavior. A score of 1 is assigned to a test item when a person has this ability to employ the skill yet never or almost never does when needed. A score of 2 is assigned to a test item when the person has the ability to employ the skill and does it sometimes when needed. A score of 3 is assigned to a test item when the person has the ability to employ the skill and does it always or almost always when needed.

....

.... [B]oth [Defendant] and his wife indicate [Defendant] has the capacity to perform between 93% and 98% of all adaptive skills assessed by the ABAS. The discrepancy between this finding and Mr. Williams' low scores on the General Adaptive Composite, the three Domains, and the 10 skill areas is attributable to the fact that he received numerous scores of 1 on the 239 items [in the test]. That is, both Mr. Williams and his wife describe him as possessing numerous adaptive skills yet not reliably employing them when indicated....

....

Mr. Williams' self-report reflects 36 items that are rated as a 1 (possessing a skill but not employing it or rarely employing it when indicated). Mrs. Williams' report similarly has 32 items that are rated as a 1. The amount of “1” ratings in Mr. and Mrs. Williams' test protocols is significantly more than I ever have seen in my many years of evaluating adaptive skills. Importantly, responses from both Mr. and Mrs. Williams reflect their shared perception that Mr. Williams has the ability to use numerous adaptive skills but generally decides not to do so when indicated.
Id. at 9. Dr. Oakland thus disregarded the numbers, reasoning:

.... Although both Mr. and Mrs. Williams indicated the skills/behaviors could have been performed but were not, independent evidence suggests Mr. Williams displayed a number of these behaviors at least sometimes when needed, and thus should receive a score of 2, not 1 on them. Revision of these scores to a 2 would significantly increase his scores.
Id. at 13. He also analyzed and compared the results from Defendant and Delilah Williams, which “reflect considerable inconsistency.” Id. at 19. He noted, for example that “they agreed on only 38% of the items rated a 1.” Id.

That is, after examining the test data, Dr. Oakland used his clinical judgment and looked to other evidence as well to fully assess Defendant's adaptive behavior. At the September 2013 hearing, Dr. Oakland explained that

[W]e need to consider information beyond the ABAS if we're going to make a decision as to the level of adaptive behavior displayed by Mr. Williams. And that, of course, is consistent with [the DSM–5] to consider both the clinical judgment as well as ... an assessment based upon a standardized normed reference measure of adaptive behavior.
Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 36; see also id. at 103 (“So it comes down to clinical judgment.”). His report summarized that “scores from the ABAS underestimate and under-represent Mr. Williams' optimal capacity to possess and employ adaptive behavior and skills.” Gov't's Ex. 59, Oakland Rpt. at 19.

At the September 2013 hearing, Dr. Oakland further explained that his examination of Defendant was not ideal: the ABAS is typically used to assess someone's current adaptive behavior, Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 22, and not retrospectively while a testing subject is incarcerated. Id. at 23; see also Def.'s Ex. 1019, DSM–5 at 38 (“Adaptive functioning may be difficult to assess in a controlled setting (e.g., prisons, detention centers); if possible, corroborative information reflecting functioning outside those settings should be obtained.”). Clinicians also prefer to take adaptive behavior information from multiple sources such as parents, siblings, and teachers (whereas Dr. Oakland, although he read reports of interviews prepared by others, only personally interviewed Defendant and Delilah Williams). Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 23. He also explained that it is inconsistent with the ABAS to take information from people “who may have some personal gain to make from their reports.” Id. at 24. In this regard, the AAIDD also recommends against relying “primarily on an individual's self-report of his skill level.” Salad, 959 F.Supp.2d at 878. Rather, examiners “should rely on information gathered from third parties who are ‘very familiar with the person and have known him/her for some time and have had the opportunity to observe the person function across community settings and times.’ ” Id. (quoting 2010 AAIDD Manual at 47); see also Def.'s Ex. 1014, 2010 AAIDD Manual at 52 (cautioning against “relying heavily only on the information obtained from the individual himself or herself when assessing adaptive behavior for purposes of establishing a diagnosis of [intellectual disability]”).

In discounting Defendant's ABAS–II scores, Dr. Oakland relied upon facts from Defendant's background that are not seriously disputed by Defendant: He began drinking alcohol at home by at least age 13. Gov't's Ex. 59, Oakland Rpt. at 4. He graduated with a regular diploma from “an excellent high school in Heidelberg, Germany [as a military dependent].” Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 38. There is no evidence that he was ever suspected of having an intellectual disability when he entered school or at any level of his education. He had not been retained in grade. Id. He also “passed the ASVAB,” id., and progressed to the rank of E–4 in the U.S. Army. Id. at 38–39, 104. Although Dr. Oakland diagnosed Defendant with a mathematics disorder, his verbal scores on the WJ–III were inconsistent—in Dr. Oakland's view—with intellectual disability. Id. at 48; see also Gov't's Ex. 58 (“[Defendant] demonstrated a significant strength in written expression. He demonstrated a significant weakness in math calculation skills.”) (Dr. Oakland's summary of WJ–III scores). Dr. Oakland nevertheless recognizes that the WJ–III is not a measure of intellectual functioning (prong one). Doc. No. 2172, Tr. Sept. 9, 2013 (Dr. Oakland) at 77. He used that test, however, as part of his overall assessment and to “provide some insights” in reference to adaptive behavior. Id. at 108.

In sum, the court finds credible Dr. Oakland's explanations regarding the reasons he doubts that the ABAS–II scores are accurate. And Dr. Oakland testified credibly that, in his clinical judgment, other information is inconsistent with the ABAS–II test results, and that Defendant's adaptive functioning is “at least in the borderline range and may be in the below average range.” Gov't's Ex. 59, Oakland Rpt. at 24. Moreover, this finding is fully supported by the opinions and credible testimony of Dr. Denney, which the court addresses next.

2. Dr. Denney

Dr. Denney, on behalf of the government, conducted a neuropsychological examination of Defendant in August 2013 that included an extensive adaptive behavior assessment, interviews with Delilah Williams, and administration of the ABAS–II regarding Defendant to Sgt. Eugene Grace (who met Defendant in 2004 and worked with him for about a year). See Gov't's Ex. 3, Denney Rpt. at 31–34. That is, in contrast to Dr. Oakland, Dr. Denney had additional access to third-party information regarding Defendant's “real world functioning.” Id. at 39. Among other information, Dr. Denney reviewed previous intellectual functioning testing; Dr. Oakland's report and raw data; defense and government investigator reports; Defendant's military and educational records; and reports by government and defense expert witnesses. Id. at 2–3, 24–27. Dr. Denney analyzed Defendant's adaptive behavior in all clinical domains, determining that “Mr. Williams' functioning is higher than the range of Intellectual Disability and [BIF],” id. at 39, and concluding that Defendant is not intellectually disabled. Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 12.

The court is convinced by Dr. Denney's credible testimony. At this prong in particular, Dr. Denney based his opinion on the entire record, explaining that he relied on:

The record as it stands regarding his academic success, including the academic weaknesses he's had, but also ... looking at the adaptive function as evidenced by his achievement testing that he's completed, by his success in the workplace, his success in communication skills with friends and people, his ability to manage himself safely ... those things all come together to lead me to conclude that he does not have a pervasive deficit in the area of adaptive function[ing]. .... [Y]ou can't separate that out from the rest of somebody's life.... [T]hey all create a global picture that is not consistent with mental retardation[.]
Id. at 12–13.

More specifically, similar to Dr. Oakland, Dr. Denney relied upon largely undisputed facts from Defendant's background: Defendant graduated from a Department of Defense high school in Germany without any indication of an intellectual disability (although with below average grades), having never been held back since he started school. Defendant indicated he “did not put much effort into his school work ... said he cheated [and] drank and smoked marijuana, drank heavily ... and he believed that that probably impacted his school work.” Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 208.

Dr. Denney relied on information regarding Defendant's employment history. Defendant obtained his driver's license and worked at several fast-food restaurants, and later enlisted in the Army. Id. at 210. After basic training, Defendant attended advanced infantry training, where he learned about weapon systems, and focused on learning to drive a tank (a Bradley Fighting Vehicle). Id. at 213. He was then stationed at Fort Hood, Texas where he was a “Bradley driver” for over three years. Gov't's Ex. 3, Denney Rpt. at 6. He was deployed to Kuwait, and later returned to Fort Hood. Id. After re-enlisting in the Army, and being promoted to the rank of Specialist (E–4), he was stationed at Schofield Barracks, Hawaii. Doc. No. 2264, Tr. Dec. 18, 2013 (Dr. Denney) at 216; Gov't's Ex. 3, Denney Rpt. at 6.

Dr. Denney knew that Defendant “attempted to avoid land navigation” because he was “not too fond of maps,” and that Defendant attributed this to the fact that it “required using a math formula and he could not understand it because of the math required.” Gov't's Ex. 3, Denney Rpt. at 7. Delilah Williams confirmed to Dr. Denney that Defendant had trouble with land navigation, but also said Defendant “knew how to drive long distances from state to state,” and that he drove himself from Tennessee to Texas. Id. at 31. (There is other evidence that Defendant also drove from South Carolina to Texas. See Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Woods) at 93–95.) Dr. Denney concluded that all these facts about Defendant's adaptive functioning are inconsistent with intellectual disability. Gov't's Ex. 3, Denney Rpt. at 39.

Dr. Denney also relied on an ABAS–II completed by Sgt. Grace regarding Defendant's behavior in 2004 and 2005, resulting in an ABAS–II General Adaptive Composite score of 101 (a score in the average range). Id. at 33–34. Sgt. Grace, who was the supply sergeant for Defendant's unit at Schofield Barracks, told Dr. Denney that he chose Defendant to be his assistant supply sergeant. Id. at 21. He reported that Defendant was “on point” regarding his job. Id. at 32. “We pushed him fast. I taught him everything about being a supply sergeant.... I had no concerns about leaving him in charge ... I knew he could handle the job.” Id. Sgt. Grace explained that Defendant “was awesome with paperwork. Compared to others I trained, you just had to show him once and he had it.” Id. Sgt. Grace continued: “[Defendant] was above average compared to others [he has] trained in regard to filling out paperwork, doing everything in that job—he knew it all without hesitation.” Id. He commented: “[Defendant] was ... a natural for the position, and he knew the information the next day. I don't know what he did to memorize it, or if he had a photographic memory, but he had it down instantly.” Id. Sgt. Grace told Dr. Denney that he recommended Defendant three times for promotion from E–4 to E–5, but Defendant's “home life” prevented such a promotion. Id. at 32–33; see also Gov't's Ex. 55 at 2 (“Compared to others from an intellect perspective; he was a smart guy. I think he would have made an excellent [Non-commissioned Officer]”) (quoting Dr. Denney's notes from his September 2013 interview with Sgt. Grace). At the December 2013 hearings, Dr. Denney also testified that, when Defendant was arrested for the incidents in this case, he called Sgt. Grace for assistance, which is indicative of a close relationship between Sgt. Grace and Defendant. See Doc. No. 2265, Tr. Dec. 19, 2013 (Dr. Denney) at 70.

In short, Dr. Denney had ample basis for concluding that Defendant's adaptive behavior was inconsistent with Defendant having an intellectual disability: Defendant was able to study for the ASVAB and gain entrance to the U.S. Army. By many accounts, Defendant functioned “normally” as an adult and as a soldier in the Army. He drove a tank. He was an assistant supply sergeant. He was trusted by a supervisor to handle military duties. An ABAS–II examination by Sgt. Grace indicated Defendant had “average” adaptive behavior. True, the record contains much evidence supporting Delilah Williams' view (as told to Dr. Denney) that “I never thought [Defendant] was the brightest intellectually.” Gov't's Ex. 3, Denney Rpt. at 31. But the weight of the evidence credibly supports Dr. Denney's view that, even if Defendant is not bright, he does not meet prong two's standard for intellectual disability.

3. Dr. Woods

In contrast, Dr. Woods testified on Defendant's behalf and concluded that Defendant's adaptive behavior met the criteria for intellectual disability under prong two. The court accepts Dr. Woods' testimony that many mildly intellectually disabled individuals can function in society, obtain a driver's license, graduate from high school, and, given a proper support system, operate under a “cloak of confidence” whereby they mask their symptoms and obtain employment. Even so, the court ultimately finds Dr. Woods' testimony regarding Defendant's specific adaptive functioning to be less compelling and credible than the opinions of Drs. Denney and Oakland.

First, Dr. Woods relied in large part on Dr. Oakland's ABAS–II test results in opining that Defendant had deficits in adaptive functioning, Doc. No. 2263, Tr. Dec. 17, 2013 (Dr. Woods) at 5–35, but these are test results that Dr. Oakland has (credibly) discredited as unreliable and as understating Defendant's adaptive behavior. Dr. Woods appears not to have seriously considered the ABAS–II administered by Dr. Denney to Sgt. Grace, performed after Dr. Woods issued his June 2013 written report. Id. at 81. Indeed, as a psychiatrist (not a psychologist), Dr. Woods admitted that he does not administer the ABAS, id. at 82, as compared to Dr. Oakland who developed the ABAS–II.

At prong one, Dr. Woods also relied on Dr. James' original IQ test score of 73, Flynn-adjusted to 69, for his initial opinion—and indicated unconvincingly that an increase in five points would make no difference due to deficits in adaptive functioning.

Second, while there is evidence supporting Dr. Woods' views that Defendant could not (or did not) write checks, and relied extensively on others (such as his wife or his mother) for financial matters, Dr. Woods selectively chose to credit such information about Defendant, but somewhat implausibly discounted many other aspects of Defendant's “real world” functioning: his commendation medals received in the Army, ability to drive a Bradley Fighting Vehicle, and ability to drive on his own from Tennessee and/or South Carolina to Fort Hood, Texas. Id. at 93–95. Dr. Woods also apparently disregarded or discounted 2011 information in his own file from Sgt. Grace (well before Dr. Denney's interviews in 2013), wherein Sgt. Grace described Defendant as “an awesome employee,” stated that Defendant could “physically and mentally handle the job and was able to lay out supplies by the book,” and said that he “never had to tell [Defendant] to do something twice.” Id. at 86; see also Gov't's Ex. 54, FBI 302 (Sgt. Grace Interview) at 2.

In short, Dr. Woods' examination was not as thorough as Dr. Denney's, and the court therefore credits Dr. Denney's opinion above Dr. Woods' conclusion (both as to adaptive functioning, and his overall belief regarding Defendant's condition). Defendant has thus failed to prove prong two by a preponderance of the evidence: he has not proven that he has significant deficits or impairments in adaptive functioning or behavioral skills sufficient to meet the applicable definition of intellectual disability.

C. Prong Three—Onset Before the Age of Development

Defendant has failed to prove both prongs one and two—he has not demonstrated that he has significantly subaverage intellectual functioning or significant deficits or impairments in adaptive functioning. As a result, this third prong is moot. See Candelario–Santana, 916 F.Supp.2d at 221 (concluding that “because [defendant] does not show signs of being [intellectually disabled], this third prong becomes moot”); Salad, 959 F.Supp.2d at 887 (“Having concluded that the Defendant does not have significant limitations in adaptive skills or intellectual functioning, the court does not address prong three.”).

V. CONCLUSION

For the foregoing reasons, Defendant has failed to prove that he is intellectually disabled for purposes of the Federal Death Penalty Act and Atkins v. Virginia, 536 U.S. 304, 122 S.Ct. 2242, 153 L.Ed.2d 335 (2002). He remains eligible to face the death penalty. Defendant's Motion for such a pretrial determination is DENIED.

This denial, however, does not preclude Defendant from raising these issues at a punishment phase of the trial as part of an attempt to demonstrate mitigating factors, should he be found guilty of one of the capital counts of the Indictment. See, e.g., Hardy, 644 F.Supp.2d at 752 (rejecting the proposition that the existence of intellectual disability is not a mitigation factor, emphasizing that “the defendant's character, prior criminal history, mental capacity, background, and age are just a few of the many factors ... a jury may consider in fixing punishment”) (quoting Simmons v. South Carolina, 512 U.S. 154, 163, 114 S.Ct. 2187, 129 L.Ed.2d 133 (1994)).

IT IS SO ORDERED.
