The People, Respondent, v. Kaity Marshall, Appellant.
Brief
N.Y.
November 17, 2015

APL-2014-00196
Appellate Term, Second Department Docket No. 2011-999 K CR

Court of Appeals
STATE OF NEW YORK

THE PEOPLE OF THE STATE OF NEW YORK, Respondent,
—against—
KAITY MARSHALL, Defendant-Appellant.

MATERIALS CITED IN BRIEF OF AMICUS CURIAE THE INNOCENCE PROJECT, INC. IN SUPPORT OF DEFENDANT-APPELLANT KAITY MARSHALL'S APPEAL

DAVID S. FRANKEL
SARAH C. WHITE
SHAKED SIVAN (not yet admitted)
KRAMER LEVIN NAFTALIS & FRANKEL LLP
1177 Avenue of the Americas
New York, New York 10036
Telephone: (212) 715-9100
Facsimile: (212) 715-8000
Counsel for Amicus Curiae The Innocence Project, Inc.

BARRY C. SCHECK
KAREN A. NEWIRTH
THE INNOCENCE PROJECT, INC.
40 Worth Street, Suite 701
New York, New York 10013
Telephone: (212) 364-5340

October 19, 2015

RECOVERED MEMORIES: TRUE AND FALSE

The Formation of False Memories

by ELIZABETH F. LOFTUS, PhD; and JACQUELINE E. PICKRELL, BA

For most of this century, experimental psychologists have been interested in how and why memory fails. As Greene1 has aptly noted, memories do not exist in a vacuum. Rather, they continually disrupt each other through a mechanism that we call "interference." Virtually thousands of studies have documented how our memories can be disrupted by things that we experienced earlier (proactive interference) or things that we experienced later (retroactive interference).

Relatively modern research on interference theory has focused primarily on retroactive interference effects. After receipt of new information that is misleading in some ways, people make errors when they report what they saw.2,3 The new post-event information often becomes incorporated into the recollection, supplementing or altering it, sometimes in dramatic ways. New information invades us, like a Trojan horse, precisely because we do not detect its influence.
Understanding how we become tricked by revised data about a witnessed event is a central goal of this research.

The paradigm for this research is simple. Participants first witness a complex event, such as a simulated violent crime or an automobile accident. Subsequently, half of the participants receive new misleading information about the event. The other half do not get any misinformation. Finally, all participants attempt to recall the original event.

Dr. Loftus is from the University of Washington. Ms. Pickrell is from the University of Washington. The authors thank numerous researchers who helped gather data, and Jim Coan for his early participation in prior related research. Address reprint requests and correspondence to Elizabeth F. Loftus, PhD, Psychology Dept., University of Washington, Seattle, WA 98195 or via email: eloftus@u.washington.edu.

In a typical example of a study using this paradigm, participants saw a video depicting a killing in a crowded town square. They then received written information about the killing, but some people were misled about what they saw. A critical blue vehicle, for instance, was referred to as being white. When later asked about their memory for the color of the vehicle, those given the phony information tended to adopt it as their memory; they said the vehicle was white.4 In these and many other experiments, people who had not received the phony information had much more accurate memories. In some experiments, the deficits in memory performance following receipt of misinformation have been dramatic, with performance differences as large as 30% or 40%.5

This degree of distorted reporting has been found in scores of studies involving a wide variety of materials. People have recalled nonexistent broken glass and tape recorders, a clean-shaven man as having a mustache,
straight hair as curly, stop signs as yield signs, hammers as screwdrivers, and even something as large and conspicuous as a barn in a bucolic scene that contained no buildings at all. In short, misleading post-event information can alter a person's recollection in powerful ways, even leading to the creation of false memories of objects that never in fact existed.

Psychiatric Annals 25:12/December 1995

LOST IN A SHOPPING MALL

Most of the experimental research on memory distortion has involved deliberate attempts to change memory for an event that actually was experienced. An important issue is whether it is possible to implant an entire false memory for something that never happened. Could it be done in an ethically permissible way? Several years ago, a method was conceived for exploring this issue: why not see whether people could be led to believe that they had been lost in a shopping mall as a child even if they had not been? (See Loftus and Ketcham6 for a description of the evolution of the idea for the study.)

In one of the first cases of successful false memory implantation,7 a 14-year-old boy named Chris was supplied with descriptions of three true events that supposedly happened in Chris' childhood involving Chris' mother and older brother Jim. Jim also helped construct one false event. Chris was instructed to write about all four events every day for 5 days, offering any facts or descriptions he could remember about each event. If he could not recall any additional details, he was instructed to write "I don't remember."

The false memory was introduced in a short paragraph. It reminded Chris that he was 5 at the time, that he was lost at the University City shopping mall in Spokane, Washington, where the family often went shopping, and that he was crying heavily when he was rescued by an elderly man and reunited with his family.

Over the first 5 days, Chris remembered more and more about getting lost.
He remembered that the man who rescued him was "really cool." He remembered being scared that he would never see his family again. He remembered his mother scolding him.

A few weeks later, Chris was reinterviewed. He rated his memories on a scale from 1 (not clear at all) to 11 (very, very clear). For the three true memories, Chris gave ratings of 1, 10, and 5. For the false shopping mall memory, he assigned his second-highest rating: 8. When asked to describe his memory of getting lost, Chris provided rich details about the toy store where he got lost and his thoughts at the time ("Uh-oh. I'm in trouble now."). He remembered the man who rescued him as wearing a blue flannel shirt, kind of old, kind of bald on top ... "and, he had glasses."

Chris was soon told that one of the memories was false. Could he guess? He selected one of the real memories. When told that the memory of being lost was the false one, he had trouble believing it.

More recently, we have completed a study that uses a procedure similar to that used with Chris. We asked 24 individuals to recall events that were supplied by a close relative. Three of the events were true, and one was a research-crafted false event about getting lost in a shopping mall or other public place. We now describe this study in detail.

LOST AGAIN

Overview

The subjects in this study thought they were participating in a study of "the kinds of things you may be able to remember from your childhood." The subjects were given a brief description of four events that supposedly occurred while the subject and a close family member were together. Three were true events and one was the false "lost" event. Subjects tried to write about these events in detail. Later they were interviewed about the events on two separate occasions.

Method

Subjects. Three males and 21 females, ranging in age from 18 to 53, completed all phases of the study.
They were recruited by University of Washington students; each student provided a pair of individuals, which included both a subject and the subject's relative. The pairs consisted primarily of parent-child pairs or sibling pairs, and the youngest member of the pair was at least 18 years of age. The "relative" member of the pair had to be knowledgeable about the childhood experiences of the "subject," the younger member of the pair.

Materials. Subjects were mailed a five-page booklet containing a cover letter with instructions for completing the booklet and the scheduled interviews. The booklet contained four short stories about events from the subject's childhood provided by the older relative. In actuality, three of the stories were true, and one was the false event about getting lost. The order of events in the booklet and in the subsequent interviews was always the same, with the false event about getting lost always presented in the third position. Each event was described in a single paragraph at the top of the page, with the rest of the page left blank for the subject to record the details of his or her memory.

To exemplify a false memory paragraph, here is one created for a 20-year-old Vietnamese-American woman who grew up in the state of Washington: "You, your mom, Tien, and Tuan all went to the Bremerton K-Mart. You must have been 5 years old at the time. Your mom gave each of you some money to get a blueberry Icee. You ran ahead to get into the line first, and somehow lost your way in the store. Tien found you crying to an elderly Chinese woman. You three then went together to get an Icee."

Procedure. Interviews with the relative for each subject were conducted to obtain three events that happened to the subject between the ages of 4 and 6. The stories were not to be family "folklore" or traumatic events that the subject would either remember easily or find painful to remember.
In addition, the relative provided information about a plausible shopping trip to a mall or large department store in order to construct a false event where the subject could conceivably have gotten lost. The relative was asked to provide the following kinds of information: (1) where the family would have shopped when the subject was about 5 years old; (2) which members of the family usually went along on shopping trips; (3) what kinds of stores might have attracted the subject's interest; and (4) verification that the subject had not been lost in a mall around the age of 5. The false event was then crafted from this information. The false events always included the following elements about the subject: (1) lost for an extended period, (2) crying, (3) lost in a mall or large department store at about the age of 5, (4) found and aided by an elderly woman, and (5) reunited with the family.

Subjects were told that they were participating in a study on childhood memories, and that we were interested in how and why people remembered some things and not others. They were asked to complete the booklets by reading what their relative had told us about each event, and then writing what they remembered about each event. If they did not remember the event, they were told to write "I do not remember this." After completing the booklet, they mailed it back to us in a stamped envelope that we had provided to them.

Upon receipt of the completed booklet, subjects were called and scheduled for two interviews. If it was convenient, the interviews took place at the University; otherwise, they were conducted over the telephone. Initially we had planned to manipulate, as an independent variable, the time intervals between the receipt of the booklet and the two subsequent interviews. However, scheduling difficulties created by subject unavailability prevented us from doing this.
Thus, in the end, all subjects were first interviewed approximately 1 to 2 weeks after receipt of the booklet, and received a second interview approximately 1 to 2 weeks after that. Two interviewers, both female, conducted and recorded the interview sessions.

At the beginning of the first interview, subjects were reminded about each of the four events, one at a time, and asked to recall as much as they could about them. They were instructed to tell us everything they remembered about the event, whether or not they had already written the information in their booklets. We told the subjects we were interested in examining how much detail they could remember, and how their memories compared with those of their relative. The event paragraphs were not read to them verbatim, but rather bits of them were provided as retrieval cues. When the subjects had recalled as much as possible, they were asked to rate the clarity of their memory for the event on a scale of 1 to 10, with 1 being not clear at all and 10 being extremely clear.

Next, subjects rated their confidence on a scale of 1 to 5 that if given more time to think about the event, they would be able to remember more details (1 = not confident and 5 = extremely confident that they would be able to remember more). The interviewers maintained a pleasant and friendly manner, while pressing for details. After the first interview, the subjects were thanked for their time, and were encouraged to think about the events and try to remember more details for the next interview. They were told not to discuss the events at all with their relative or anyone else.

The second interview session, conducted 1 to 2 weeks after the first, was essentially the same: subjects tried to remember the four events, they rated their clarity and confidence, but at the end of this session, they were debriefed.
The debriefing phase explained our attempt to create a memory for something that had not happened, and asked subjects to guess which event may have been the false one. We apologized for the deception and explained why it was necessary for the research.

Results

The 24 subjects were asked to remember a total of 72 true events, and succeeded in remembering something about 49 (68%) of these 72 true events. Figure 1 shows that this percentage held constant from the initial booklet stage through the two subsequent interviews.

The figure also shows the rate of remembering the false event. In the booklet, 7 (29%) of the 24 subjects "remembered" the false event, either fully or partially. The partial memories included remembering parts of the event and speculations about how and when it might have happened. During the first interview, one subject decided she did not remember, leaving 6 (25%) of the 24 claiming to remember, fully or partially. This same percentage held for the second interview.

Subjects used more words when describing their true memories, whether these memories were fully or only partially recalled. For purposes of analysis, we calculated the mean number of words using only the 29% who produced a full or partial false memory in their initial booklets. The mean word length of descriptions of true memories was 138.0, whereas for descriptions of false memories, it was 49.9. Six of the seven subjects used more words to describe their true than false memories, and the seventh used very few words to describe any memories (a mean of 20 for the true memories and 21 for the false one).

During the first interview session, 17 subjects continued to maintain that they had no memory whatsoever of the false event happening to them. One additional subject, who had earlier accepted the event partially, now claimed that she did not remember being lost. Thus,
75% resisted the suggestion about being lost, and they continued to resist during the second interview.

We analyzed the clarity ratings for the subjects who embraced the false event during the first interview, and compared these clarity ratings to the ones given by these particular subjects for their true events. In general, the clarity ratings for the false events tended to be lower than for the true events. For purposes of analysis, we took five individuals who falsely remembered being lost and analyzed their mean and median clarity ratings. (Unfortunately, one subject could not be included in this analysis because clarity ratings were inadvertently not collected during the first interview.) The mean clarity rating for the true memories of these five individuals was 6.3 during the first interview and also 6.3 during the second interview. The mean clarity rating for the false memory was 2.8 during the first interview and 3.6 during the second interview (Figure 2). All five subjects had mean clarity ratings for their true events that exceeded the clarity rating for the false event. Three of the five subjects increased their clarity ratings for the false event, while two gave the same rating. Medians showed a similar pattern: higher ratings for the true events, and a modest rise in clarity from the first to the second interview for the false event only. The subject with missing data gave a median rating of 7.0 to her true memories and a rating of 4.0 to her false memory.

Figure 1. Twenty-four subjects were asked to remember true and false events over three stages: booklet and two interviews. The percentage remembering is shown.

One subject's performance illustrates this pattern.
She was a 20-year-old woman who was convinced that she had been lost at K-Mart when she was about 5 years old. In her booklet, she used 90 words to describe her false memory and a mean of 349 words to describe her true memories. During the two interview sessions, her clarity ratings were mostly higher for the true memories than for the false one, and only the clarity rating for the false memory rose from the first to the second interview. More specifically, her false memory was initially rated 3, then rose to 4. By contrast, her true memories were rated 7 then 2, 9 then 9, and 6 then 6.

Subjects also rated how confident they were that they would be able to recall additional details at a later time, using a scale from 1 to 5. We examined the confidence ratings for the subset of subjects embracing the false event during the first interview and who provided two sets of confidence ratings. In general, the confidence ratings were low, but lower for the false event than for the true ones. The mean confidence rating for the true memories for this set of people was 2.7 during the first interview and 2.2 during the second interview. The mean confidence rating for the false memory was 1.8, then 1.4 (Figure 3). All five subjects had mean confidence ratings for their true events that exceeded the confidence rating for the false one. Most of the subjects gave the same low confidence rating during the two interviews.

Figure 2. Clarity ratings of subjects who believed the false event during the first interview, compared to the clarity ratings they gave to the true events.

At the end of the second session, subjects were debriefed and asked to choose which event may have been the false one. Of the 24 total, 19 subjects correctly chose the getting-lost memory as the false one, while the remaining five incorrectly thought that one of the true events was the false one.
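The percentages quoted in the Results can be checked directly from the reported counts. The short Python sketch below is an illustration only and is not part of the original study: the counts are copied from the text, the figure of 18 subjects resisting the suggestion is derived here as 17 plus 1, and the names `reported` and `percent` are hypothetical.

```python
from fractions import Fraction

# Counts as stated in the Results text; nothing here comes from raw data.
# Each entry is (count, total, percentage quoted in the text).
reported = {
    "true events remembered": (49, 72, 68),
    "false event remembered, booklet": (7, 24, 29),
    "false event remembered, interviews": (6, 24, 25),
    # 17 subjects with no memory plus 1 who withdrew a partial memory (derived).
    "resisted the suggestion": (17 + 1, 24, 75),
}


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * Fraction(count, total))


for label, (count, total, stated) in reported.items():
    computed = percent(count, total)
    verdict = "matches" if computed == stated else "differs from"
    print(f"{label}: {count}/{total} = {computed}% ({verdict} the quoted {stated}%)")
```

Exact fractions are used so that the rounding step is the only place where precision is lost.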
Although subjects sometimes correctly chose the getting-lost memory as the false one, this does not mean that they were not previously misled into genuinely embracing the false event. Sometimes they chose correctly simply by a process of elimination. Here is an example from one subject who was led to believe that she had been lost at the Hillsdale Shopping Mall. She described her getting-lost experience using 66 words (as opposed to a mean of 128 words for her true memories). During the second interview, she said:

"I vaguely, vague, I mean this is very vague, remember the lady helping me and Tim and my mom doing something else, but I don't remember crying. I mean I can remember a hundred times crying ... I just remember bits and pieces of it. I remember being with the lady. I remember going shopping. I don't think I, I don't remember the sunglasses part."

She went on to remember that the elderly lady who helped her was "heavy-set and older. Like my brother said, nice." She gave her false memory a clarity rating of 4. When the subject was debriefed and asked to tell which was the false memory, she said: "Well, it can't be Slasher, 'cause I know that he ran up in the chimney and I know that that car got smashed and I know that we got robbed so it had to be that mall one." Despite the debriefing, she continued to struggle mildly with her persisting memory: "... I totally remember walking around in those dressing rooms and my mom not being in the section she said she'd be in. You know what I mean?"

Figure 3. The subjects who believed the false event at first interview were asked to rate their confidence that they would be able to recall additional details of this event at a later time. They also rated their true memories.

Discussion

These findings reveal that people can be led to believe that entire events happened to them after suggestions to that effect. We make no claims about the percentage of people who might be able to be misled in this way, only that we are providing an "existence proof" for the phenomenon of false memory formation. In addition to the current subjects, and those of Loftus and Coan,6 we have successfully implanted the getting-lost memory in a number of other individuals, some of whom have taught us how fervently subjects will cling to their false memories even after debriefing.

In two demonstration cases, supplied by The MacNeil/Lehrer NewsHour, individuals were successfully led to create a false memory of being lost. The process of memory implantation was filmed, with the subjects' full permission and cooperation, for purposes of demonstrating this scientific methodology to the public. One of the demonstration cases, Becca, was led to believe that she had been lost in the Tacoma Mall while she had been shopping with her mother and father. By her last interview, she thought she may have been looking at puppies at the pet store about the time she got lost. She remembered "the initial panic when you realize that your mom and dad aren't there any more." She remembered the elderly lady who rescued her, and thought she may have been wearing a long skirt. "I do remember her asking me if I was lost, and ... asking my name and then saying something about taking me to security." She remembered that she didn't cry while she was lost, but did cry when she saw her parents again. When we debriefed her at the end of the study, Becca found it so hard to believe that her getting-lost memory was false that she telephoned both of her parents to check. The parents, now divorced, independently confirmed that the episode in the Tacoma Mall had never happened.

A predictable comment about the false memories of getting lost is that people may have actually been lost in their lives, however briefly, and they may be confusing this actual experience with the false memory description. But our subjects were not asked about any experience of being lost. They were asked to remember being lost around the age of 5, in a particular location with particular people present, being frightened, and ultimately being rescued by an elderly person. This is not to say that the actual experience of being lost briefly or of hearing about someone else being lost (even Hansel and Gretel) is not important.

The development of the false memory of being lost may evolve first as the mere suggestion of being lost leaves a memory trace in the brain. Even if the information is originally tagged as a suggestion rather than a historic fact, that suggestion can become linked to other knowledge about being lost (stories of others), as time passes and the tag that indicates that being lost in the mall was merely a suggestion slowly deteriorates. The memory of a real event, visiting a mall, becomes confounded with the suggestion that you were once lost in a mall. Finally, when asked whether you were ever lost in a mall, your brain activates images of malls and those of being lost. The resulting memory can even be embellished with snippets from actual events, such as people once seen in a mall. Now you "remember" being lost in a mall as a child. By this mechanism, the memory errors occur because grains of experienced events or imagined events are integrated with inferences and other elaborations that go beyond direct experience.

FALSE MEMORIES OF HOSPITALIZATIONS AND OTHER EVENTS

It could be argued that getting lost, however briefly, is a common experience, and that fact enabled subjects to construct a false memory about a particular occasion of getting lost. Could false memories be constructed about events that were not so common in childhood experiences?

Hyman et al8 used a similar procedure to explore this issue. In their first experiment, college students were asked to recall actual events that had been reported by their parents and one experimenter-crafted false event: an overnight hospitalization for a high fever with a possible ear infection. They were informed that they would be asked to recall childhood experiences based on information obtained from their parents. They thought the goal of the research was to compare their recall to the information supplied by the parents.

All events, including the false one, were first cued with an event title (family vacation, overnight hospitalization) and an age. If subjects couldn't recall the event, they received brief additional cues, such as location or other people involved. After the first interview, subjects were encouraged to continue thinking about the events, but not to discuss them, and to return for a second interview 1 to 7 days after the first.

In the first interview, subjects recalled and described 62 of the 74 true events (84%), and in the second interview, they provided memories for 3 of the 12 events that had not been remembered during the first interview. As for the false events, no subject recalled these during the first interview, but 4 of 20 subjects (20%) incorporated false information in an event description during the second interview. One subject "remembered" that the doctor was a male, but the nurse was female and also a friend from church.

In a second study, Hyman et al. tried to implant three new false events that were rather unusual. The first was attending a wedding reception and accidentally spilling a punch bowl on the parents of the bride. The second was having to evacuate a grocery store when the overhead sprinkler systems erroneously activated. The third was being left in the car in a parking lot and releasing the parking brake, causing the car to roll into something. While the methodology was basically the same as in the first study, there were some minor variations. Instead of beginning by simply cueing subjects with an event title and an age, they were now given more cues at the start (age,
event, location, actions, and others involved). In subsequent interviews, the researchers provided only the event title and age, and only when subjects failed to recall the event were additional cues provided. Moreover, the experimental demands were intensified somewhat by, for example, pressures for more complete recall. There were three interviews spaced 1 day apart.

In the first interview, subjects recalled and described 182 of the 205 true events (89%). In the second interview, they provided a bit more information, and by the third interview, they had provided some recall for 95% of the events. During the third interview, subjects provided memories for 13 of the 23 true events that had not been remembered during the first interview. As for the false events, again no subject recalled these during the first interview, but 13 (25%) did so by the third interview. For example, one subject had no recall of the wedding "accident," stating "I have no clue. I have never heard that one before." By the second interview, the subject said "... It was an outdoor wedding and I think we were running around and knocked something over like the punch bowl or something and made a big mess and, of course, got yelled at for it."

These results show that people will create false recalls of childhood experiences in response to misleading information and the social demands inherent in repeated interviews.9,10 The process of false recall appeared to depend, in part, on accessing some relevant background information. The authors hypothesized that some form of schematic reconstruction may account for the creation of false memories. What people appear to do, at the time they encounter the false details, is to call up schematic knowledge that is closely related to the false event. Next, they think about the new information in conjunction with the schema, possibly storing the new information with that schema.
Now, when they later try to remember the false event, they recall the false information and the underlying schema. The underlying schema is helpful for supporting the false event: it adds actual background information and provides the skeletal or generic scenes.

Creation of false memories in this manner can be viewed as a form of source confusion as described by Schacter and Curran.11 The false event is assumed to be a personal memory rather than an event presented by the researchers as ostensibly coming from the parent. Schacter and Curran's patient, E.G., came to "remember" words that were never studied, probably because these words were represented in his long-term memory prior to the experiment and this pre-experimental familiarity was wrongly used as evidence that the word had recently appeared. Similarly, some elements of the false memories created by us, and by Hyman and colleagues, are represented in long-term memory prior to the experiment. This pre-experimental familiarity can be wrongly used as evidence that the false event actually happened.

FINAL COMMENT

Nearly two decades of research on memory distortion leaves no doubt that memory can be altered via suggestion. People can be led to remember their past in different ways, and they can even be led to remember entire events that never actually happened to them. When these sorts of distortions occur, people are sometimes confident in their distorted or false memories, and often go on to describe the pseudomemories in substantial detail. These findings shed light on cases in which false memories are fervently held, as in when people remember things that are biologically or geographically impossible. The findings do not, however, give us the ability to reliably distinguish between real and false memories, for without independent corroboration, such distinctions are generally not possible.

REFERENCES

1. Greene RL. Human Memory. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc; 1992.
2. Loftus EF. Eyewitness Testimony. Cambridge, MA: Harvard University Press; 1979.
3. McCloskey M, Zaragoza M. Misleading postevent information and memory for events: arguments and evidence against memory impairment hypotheses. J Exp Psychol Gen. 1985; 114:1-16.
4. Loftus EF, Levidow B, Duensing S. Who remembers best? Individual differences in memory for events that occurred in a science museum. Applied Cognitive Psychology. 1992; 6:93-107.
5. Loftus EF. When a lie becomes memory's truth. Current Directions in Psychological Science. 1992; 1:121-123.
6. Loftus EF, Ketcham K. The Myth of Repressed Memory. New York, NY: St. Martin's Press; 1994.
7. Loftus EF, Coan JA, Pickrell JE. Manufacturing false memories using bits of reality. In: Reder L, ed. Implicit Memory and Metacognition. Hillsdale, NJ: Erlbaum. In press.
8. Hyman IE, Husband TH, Billings FJ. False memories of childhood experiences. Applied Cognitive Psychology. 1995; 9:181-195.
9. Ceci SJ, Loftus EF, Leichtman MD, Bruck M. The possible role of source misattributions in the creation of false beliefs among preschoolers. Int J Clin Exp Hypn. 1994; XLII:304-320.
10. Ceci SJ, Huffman MLC, Smith E, Loftus EF. Repeatedly thinking about non-events. Consciousness and Cognition. 1994; 3:388-407.
11. Schacter DL, Curran T. The cognitive neuroscience of false memories. Psychiatric Annals. 1995; 25:727-731.

Bulletin of the Psychonomic Society
1975, Vol. 5 (1), 86-88

Eyewitness testimony: The influence of the wording of a question

ELIZABETH F. LOFTUS
University of Washington, Seattle, Washington 98195
and
GUIDO ZANNI
New School for Social Research, New York, New York 10011

Two experiments are reported in which subjects viewed a film of an automobile accident and then answered questions about events occurring in the film.
Relative to questions containing an indefinite article (e.g., Did you see a broken headlight?), questions which contained a definite article (e.g., Did you see the broken headlight?) produced (1) fewer uncertain or "I don’t know" responses, and (2) more "recognition" of events that never, in fact, occurred. The results, which are consistent with the view that questions asked subsequent to an event can cause a reconstruction in one’s memory of that event, have important implications for courtroom practices and eyewitness investigations.

This research was supported by the Urban Mass Transportation Administration, Department of Transportation, Grant No. WA-11-0004. Requests for reprints should be sent to Elizabeth F. Loftus, Psychology Department, University of Washington, Seattle, Washington 98195.

An automobile accident is a highly complex and sudden event often lasting only a few seconds. Is our perception, recollection, and verbalization of such an incident an identical copy of the original event? Most researchers in the field of human memory would agree that the answer to this question is "no." There are numerous ways to influence (and often distort drastically) the recollections of a witness. One relatively easy way is to vary the method by which the recollection is elicited or to vary the form in which questions are asked about the recollection. Much of the research in this area has indicated that when people are forced to answer specific questions, rather than to report freely, their reports are more complete but less accurate (Cady, 1924; Gardner, 1933; Marquis, Marshall & Oskamp, 1972; Marston, 1924; Whipple, 1909). Furthermore, the accuracy of an answer to a specific question can be noticeably influenced by the wording of the question itself. By changing one or two words in a question, clear-cut variations have been shown to appear in as diverse areas as a subject’s report of hypnotic experiences (Barber, 1969) and in his estimates of the speed of a moving vehicle (Loftus & Palmer, in press). The wording of a question is such an important matter that a recent book intended to help potential questionnaire designers (Oppenheim, 1966) devotes an entire chapter to the topic of question wording.

The present research demonstrates the influence of very small changes in the wording of a question in a situation in which subjects viewed a film of an automobile accident and then answered questions about events that did and did not occur in the film. For some of the questions, the English article the (the definite article) was used, as in "Did you see the broken headlight?" For other questions the article a (the nondefinite or indefinite article) was used, resulting in questions like "Did you see a broken headlight?" Previous research on the definite and indefinite article has been equivocal as to whether there is a difference in influence between the two. Muscio (1915) concluded that the more reliable form of question was one that did not use the definite article, whereas Burtt (1931) reported that a and the are about equally suggestive.

What is the difference between the and a, and why should use of these articles produce differential behavior on the part of eyewitnesses? On this topic, a number of psychologists have recently had something to say (e.g., Anderson & Bower, 1973; Brown, 1973; Chafe, 1972; Maratsos, 1971; Osgood, 1971); most have made the point that if a speaker has already seen a particular item, and assumes his listener is also familiar with it, he will use the article the. For example, when a young man wants to borrow the family car, he says "Can I have the family car tonight?" These notions have already been successfully embodied into an elaborate model of human memory called HAM (Anderson & Bower, 1973); when HAM sees a noun preceded by the, it looks for the referent of that term in its memory.
In contrast, when HAM sees a noun preceded by a, it assumes that a new member of the noun class is being introduced into its memory.

To return to the example of accidents and the or a broken headlight, consider first the question, "Did you see a broken headlight?" Two questions are implicitly being asked here: (1) Was there a broken headlight? and (2) if there was, did you see it? If a subject decides that the answer to Question 1 is "yes," he can then ask himself Question 2, and he should be fairly certain of his response. The problem that arises for a subject is that filmed accidents occur in the space of seconds, making it nearly impossible to be certain of Question 1, and making it likely that the subject will respond "don’t know" much of the time. In contrast, the second question, "Did you see the broken headlight?" can be translated into the nearly equivalent, "There was a broken headlight. Did you happen to see it?" Thus, a subject who is interrogated with the definite article does not need to answer Question 1. Effectively, the answer is "yes." He need only answer Question 2, and, as was the case with the indefinite article, at this point he can be fairly certain about his response. According to this analysis, fewer "don’t know" responses would be expected. Furthermore, if a subject’s recollections tend to conform, for some reason, to what he believes actually did occur, then the definite article may lead to a greater "recognition" of events, even when they never in fact occurred.

EXPERIMENT I

Method

One hundred graduate students participated in this experiment, in groups of various sizes. All subjects were told that they were participating in an experiment on memory and that they would be shown a short film followed by a questionnaire. The content of the film was not mentioned. The film itself depicted a multiple car accident.
Specifically, a car makes a right-hand turn to enter the main stream of traffic; this turn causes the cars in the oncoming traffic to stop suddenly, causing a five-car, bumper-to-bumper collision. The total time of the film is less than 1 min, and the accident itself occurs within a 4-sec period. At the end of the film, the subjects received a questionnaire asking them to first "give an account of the accident you have just seen." When they had completed their accounts, a series of specific questions was asked. Six critical questions were embedded in a list totaling 22 questions. Half the subjects received critical questions in the form, "Did you see a ...?" and the other half of the subjects received them in the form, "Did you see the ...?" Three of the critical questions pertained to items present in the film and three to items not present. Subjects were urged to report only what they saw, and did so by checking "yes," "no," or "I don’t know." Each subject received a different permutation of the questions.

Results

Table 1 presents the percentage of "yes," "no," and "I don’t know" responses for both the "the" and "a" subjects. Whether an item was actually present or not, subjects interrogated with a were over twice as likely to respond "I don’t know." Subjects interrogated with the tended to commit themselves to a "yes" or "no" response.

Another aspect of these data is worthy of mention. When a subject is queried about an item that was not present in the film, "yes" responses are particularly interesting. A "yes" response indicated a subject reported that he saw something that was not, in fact, present. Using the indefinite article resulted in false "yes" responses 7% of the time. With the definite article, however, false "yes" responses occurred 15% of the time, over twice as often.

To test statistically for the difference between interrogation with a and with the, a single score for each subject was generated.
"Yes" responses were assigned a value of +1, "I don’t know" responses were assigned a value of 0, and "No" responses were assigned a value of -1. A subject’s mean score, then, reflected his confidence that the items were present. The difference between the "confidence scores" for the a and the subjects was significant by a Mann-Whitney U test, z = 2.98, p < .01.

To test for the difference between items which were and were not present, two mean scores for each subject were generated, one for items which were present and the other for items which were not. Again, "yes" responses received a value of +1, "I don’t know" responses a value of 0, and "No" responses a value of -1. As before, a subject’s mean score for a particular type of item expressed his confidence that those items were present. Two Wilcoxon matched-pairs signed-ranks tests revealed that both the subjects who had been interrogated with a and the subjects who had been interrogated with the were more confident about items that had been present, z = 3.97, p < .001, and z = 2.52, p < .01, respectively.

EXPERIMENT II

Method

In order to be sure that the results obtained were not peculiar to the film or subject population used in Experiment I, a second experiment was conducted. Experiment II was identical to Experiment I in all respects except two. First, a different subject population was used; 60 people between the ages of 14 and 20 were recruited from a public library. Second, a different film was used. The film, shown in a small room at the library, depicted a minor collision between a man who was backing out of a narrow space in a supermarket parking lot and a woman pedestrian who was carrying a large bag of groceries. The total time of the film was less than 4 min, and the accident itself occurred within a 2-sec period. Subjects viewed the film and then answered a questionnaire.
Six critical questions were embedded in a list of 22; 3 pertained to items present in the film and 3 to items not present. The definite article was used for 30 subjects; the indefinite article for the other 30.

Results

Table 1 presents the percentage of "yes," "no," and "I don’t know" responses for both the "the" and "a" subjects. The effects of Experiment I are replicated. When an indefinite article was contained in a question about an item that was not present in the film, "yes" responses occurred 6% of the time. When the definite article was used, "yes" responses occurred 20% of the time. "I don’t know" responses occurred, overall, more often when the indefinite article was used (47.5% vs. 15.5% for the definite article).

DISCUSSION

The research reported here required subjects to view a film of a traffic accident and then to answer questions about the film. A major finding was that questions containing an indefinite article led to many more "I don’t know" responses. The phrase "a broken headlight" could refer to any of a number of headlights which a subject might have seen. Since it is impossible to inspect all of the headlights carefully in the time allowed for viewing the accident, it is impossible to be sure that no headlight was broken. In this case, then, the subject must deal with uncertainty about whether a broken headlight actually existed at all, and a larger number of "uncertain" or "don’t know" responses result. On the other hand, "Did you see the broken headlight?" more strongly implies the existence of a specific broken headlight, and the subject need not deal with the uncertainty about whether the broken headlight existed. This finding may also indicate that it is easier to be confident that you have not seen some specific item than to be confident that you have not seen any instance of a general class of items.
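The scoring rule from the Results of Experiment I (+1 for "yes," 0 for "I don’t know," -1 for "no") and the overall "I don’t know" percentages quoted for Experiment II can be checked with a short sketch. The percentages below are transcribed from Table 1. The function and variable names are illustrative and do not come from the article, and the overall figure is taken here to be the unweighted mean of the present and not-present columns, which is an assumption that happens to reproduce the reported 47.5% and 15.5%.

```python
# Percentages transcribed from Table 1 as (yes, no, don't know).
# Keys are (experiment, item type, article); all names are illustrative.
TABLE_1 = {
    ("I", "present", "the"): (17, 60, 23),
    ("I", "present", "a"): (20, 29, 51),
    ("I", "not present", "the"): (15, 72, 13),
    ("I", "not present", "a"): (7, 55, 38),
    ("II", "present", "the"): (18, 62, 20),
    ("II", "present", "a"): (15, 28, 57),
    ("II", "not present", "the"): (20, 69, 11),
    ("II", "not present", "a"): (6, 56, 38),
}

# Coding of individual answers as described in the Results of Experiment I.
SCORE = {"yes": 1, "dont_know": 0, "no": -1}


def subject_score(responses):
    """Mean of the +1 / 0 / -1 codes over one subject's answers."""
    return sum(SCORE[r] for r in responses) / len(responses)


def overall_dont_know(experiment, article):
    """Unweighted mean "I don't know" percentage over both item types.

    Assumption: present and not-present items count equally, which fits
    the design of three critical items of each type.
    """
    cells = [TABLE_1[(experiment, kind, article)]
             for kind in ("present", "not present")]
    return sum(cell[2] for cell in cells) / len(cells)


def false_yes_ratio(experiment):
    """How many times more often "the" than "a" drew a false "yes"."""
    the_yes = TABLE_1[(experiment, "not present", "the")][0]
    a_yes = TABLE_1[(experiment, "not present", "a")][0]
    return the_yes / a_yes
```

Under these assumptions, `overall_dont_know("II", "a")` and `overall_dont_know("II", "the")` give the two overall percentages reported in the text, and `false_yes_ratio("I")` exceeds 2, matching the "over twice as often" remark. The sketch does not attempt the Mann-Whitney or Wilcoxon tests, since those need per-subject scores that the article does not report.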
A second result is that questions containing a definite article resulted in a greater number of false recognitions ("recognition" of events that had never occurred). At least two explanations for this finding are possible. One is that the definite article produces a bias favoring a "yes" or "no" response; in other words, the definite article changes a subject’s criterion for how much objective evidence he needs to say "yes" or "no." The other is that the definite article leads a subject to infer that the object was in fact present, causing for some a reconstruction in their original memory for the event. While the present data cannot differentiate between response bias and reconstructive memory explanations, a recent study (Loftus & Palmer, in press) does indicate that questions asked subsequent to an event can cause a reconstruction in one’s memory of that event. In that study subjects viewed films of automobile accidents and then answered questions about events occurring in the films. The question "About how fast were the cars going when they smashed into each other?" elicited a higher estimate of speed than "About how fast were the cars going when they hit each other?" Furthermore, on a retest 1 week later, those subjects who received the verb smashed were more likely to say "yes" to the question "Did you see any broken glass?" even though broken glass did not exist in the accident.

The implication of these results for courtroom examinations, police interrogations, and accident investigations is fairly clear-cut. The main aim of interrogations conducted by attorneys before the court, for example, is to provide information about events which have actually taken place. Different forms of questions can be consciously used to elicit desired answers from a witness, and also to create a desired influence upon the jury. In the present research, the definite article elicited more false responses. Questions which either by form or content suggest to the witness the answer desired or "lead" him to that desired answer are called "leading questions" in the courtroom, and the existence of rules for excluding them (e.g., Supreme Court Reporter, 1973) is a definite recognition of their power of suggestion. While an attorney can seemingly easily "sense" when to object to a leading question asked by another attorney, the definition of leading is a long way from being precise. Any complete definition must eventually consider the subtle suggestibility that individual words can carry with them.

Table 1
Percentage of "Yes," "No," and "I don’t know" Responses to Items that Were Present and Not Present in the Film

                        Present          Not Present
Response             "the"    "a"      "the"    "a"
Experiment I
  Yes                  17      20        15       7
  No                   60      29        72      55
  I don’t know         23      51        13      38
Experiment II
  Yes                  18      15        20       6
  No                   62      28        69      56
  I don’t know         20      57        11      38

REFERENCES

Anderson, J. R., & Bower, G. H. Human associative memory. Washington, D. C: V. H. Winston & Sons, 1973.
Barber, T. X. Hypnosis: a scientific approach. New York: Van Nostrand Reinhold, 1969.
Brown, R. A first language, the early stages. Cambridge, Mass: Harvard University Press, 1973.
Burtt, H. Legal Psychology, 1931.
Cady, H. M. On the psychology of testimony. American Journal of Psychology, 1924, 35, 110-112.
Chafe, W. L. Discourse structure and human knowledge. In J. B. Carroll & R. R. Freedle (Eds.), Language comprehension and the acquisition of knowledge. Washington, D. C: V. H. Winston & Sons, 1972.
Gardner, D. S. The perception and memory of witnesses. Cornell Law Quarterly, 1933, 8, 391-409.
Loftus, E. F., & Palmer, J. C. Reconstruction of automobile destruction: an example of the interaction between language and memory. Journal of Verbal Learning and Verbal Behavior, in press.
Maratsos, M. P. The use of definite and indefinite reference in young children.
PhD dissertation, Harvard University, 1971.
Marquis, K. H., Marshall, J., & Oskamp, S. Testimony validity as a function of question form, atmosphere, and item difficulty. Journal of Applied Social Psychology, 1972, 2, 167-186.
Marston, W. M. Studies in testimony. Journal of Criminal Law and Criminology, 1924, 15, 5-31.
Muscio, B. The influence of the form of question. British Journal of Psychology, 1915, 8, 351-389.
Oppenheim, A. N. Questionnaire design and attitude measurement. New York: Basic Books, Inc., 1966.
Osgood, C. E. Where do sentences come from? In D. D. Steinberg & L. A. Jakobovits (Eds.), Semantics: An interdisciplinary reader in philosophy, linguistics, and psychology. Cambridge, England: Cambridge University Press, 1971.
Supreme Court Reporter, 1973, 3: Rules of Evidence for United States Courts and Magistrates.
Whipple, G. M. The observer as reporter: A survey of the psychology of testimony. Psychological Bulletin, 1909, 6, 153-170.

(Received for publication October 71, 1974.)

Journal of Applied Psychology 1998, Vol. 83, No. 3, 360-376. Copyright 1998 by the American Psychological Association, Inc. 0021-9010/98/$3.00

"Good, You Identified the Suspect": Feedback to Eyewitnesses Distorts Their Reports of the Witnessing Experience

Gary L. Wells and Amy L. Bradfield
Iowa State University

People viewed a security video and tried to identify the gunman from a photospread. The actual gunman was not in the photospread and all eyewitnesses made false identifications (n = 352). Following the identification, witnesses were given confirming feedback ("Good, you identified the actual suspect"), disconfirming feedback ("Actually, the suspect is number _"), or no feedback. The manipulations produced strong effects on the witnesses' retrospective reports of (a) their certainty, (b) the quality of view they had, (c) the clarity of their memory, (d) the speed with which they identified the person, and (e) several other measures.
Eyewitnesses who were asked about their certainty prior to the feedback manipulation (Experiment 2) were less influenced, but large effects still emerged on some measures. The magnitude of the effect was as strong for those who denied that the feedback influenced them as it was for those who admitted to the influence.

Eyewitness to a crime on viewing a lineup: "Oh, my God ... I don't know ... It's one of those two ... but I don't know ... Oh, man ... the guy a little bit taller than number two ... It's one of those two, but I don't know."
Eyewitness 30 min later, still viewing the lineup and having difficulty making a decision: "I don't know ... number two?"
Officer administering lineup: "Okay."
Months later ... at trial: "You were positive it was number two? It wasn't a maybe?"
Answer from eyewitness: "There was no maybe about it ... I was absolutely positive." (Missouri v. Buehring, 1996, p. 202)

The eyewitness in the above case spent 30 min trying to identify her attacker from a lineup of four people. Her behavior at the time indicated a great deal of uncertainty about which, if any, of the people in the lineup was the attacker. Later, at trial, however, she recalls having been absolutely positive about her identification from the lineup. How could this happen? The current article tests one possibility, namely that giving feedback to eyewitnesses can result in their recalling that they were more confident in the identification than they really were at the time.1

1 We make no distinction between the terms certainty and confidence. Courts of law in the United States commonly use the term certainty, whereas the psychological science literature on eyewitnesses tends to use the term confidence.

Gary L. Wells and Amy L. Bradfield, Department of Psychology, Iowa State University. This research was supported by a grant to Gary L. Wells from the National Science Foundation (SBR 9308275). We thank Rana Alexander, Joy Carr, and Marla Jenkins for their assistance in conducting experimental sessions. We thank Paul Windschitl for comments on an earlier version. Correspondence concerning this article should be addressed to Gary L. Wells, Department of Psychology, Iowa State University, Ames, Iowa 50011. Electronic mail may be sent to glwells@iastate.edu.

Over 75,000 people become criminal suspects each year in the United States based on their being identified by eyewitnesses from lineups and photospreads (Goldstein, Chance, & Schneller, 1989). Judges and juries are then presented with the task of trying to determine whether the identification is of the actual perpetrator or of an innocent person. The task is a daunting one. The identification of innocent persons from lineups and photospreads is the primary cause of wrongful conviction, accounting for more convictions of innocent persons than all other causes combined (see Borchard, 1932; Brandon & Davies, 1973; Frank & Frank, 1957; Huff, Rattner, & Sagarin, 1986). This concern about false identification as the primary cause of wrongful imprisonment has received a new round of support recently owing to the introduction of forensic DNA techniques for analyzing trace evidence. A recent report from the National Institute of Justice, for example, examined the cases of 28 people who were officially released from long prison terms based on definitive exonerations using DNA tests (which were not available at the time of their convictions). Of the 28 false convictions, 24 had been misidentified by eyewitnesses (in some cases multiple eyewitnesses) from lineups and photospreads. Misidentification by highly certain eyewitnesses was the principal evidence leading to the conviction of these innocent persons (Connors, Lundregan, Miller, & McEwan,
1996). Many have suggested that the problem with identification evidence is that jurors are too willing to believe eyewitnesses. We suggest, however, that the problem might rest with the eyewitnesses themselves. Specifically, we suggest that eyewitnesses are too persuasive in the sense that their confidence and other qualities of their identification testimony are exaggerated.

Qualities of Identification Testimony

There is good empirical evidence to indicate that the confidence with which eyewitnesses give identification testimony is the most important single quality of testimony in terms of whether participant-jurors will believe that the eyewitness correctly identified the actual perpetrator (e.g., Cutler, Penrod, & Stuve, 1988; Deffenbacher & Loftus, 1982; Fox & Walters, 1986; Lindsay, Wells, & O'Connor, 1989; Lindsay, Wells, & Rumpel, 1981; Luus & Wells, 1994; Wells, Ferguson, & Lindsay, 1981; Wells, Lindsay, & Ferguson, 1979). In fact, a confident eyewitness tends to make participant-jurors ignore the witnessing conditions themselves and believe the eyewitness at a rate that exceeds the actual rate of accuracy (Lindsay et al., 1981).

Eyewitness confidence tends to be only modestly related to eyewitness identification accuracy (accounting for 15% or so of the variance; see meta-analysis by Sporer, Penrod, Read, & Cutler, 1995). Further compounding the problem, however, is recent evidence that eyewitness identification confidence appears to be malleable. After making false identifications from a photospread, eyewitnesses who were told that a cowitness identified the same person that they identified became highly confident in their false identifications and increased their credibility with participant-jurors (Luus and Wells, 1994).
Confidence inflation effects have also been observed when eyewitnesses are asked the same question repeatedly (Shaw, 1996; Shaw & McClure, 1996), when eyewitnesses are encouraged to prepare themselves for cross examination (Wells et al., 1981), or when there is high perceptual familiarity for a lure stimulus (Clark, 1997). Results of this type are consistent with Leippe's (1980) argument that eyewitness confidence and eyewitness accuracy can be governed by different variables.

Although confidence is a primary determinant of the perceived credibility of eyewitnesses, it is by no means the only quality of testimony that influences perceptions of the credibility of eyewitnesses (Leippe, Manion, & Romanczyk, 1992; Wells & Lindsay, 1983). Eyewitness identification testimony at trial typically involves numerous questions about the witnessed event as well as questions about the identification decision of the eyewitness. At trial, eyewitnesses are generally asked to report on how good of a view they had of the perpetrator, the extent to which they directed their attention to the perpetrator, how long the perpetrator's face was in view, how well they could make out details of the face, how quickly they were able to identify the suspect from the lineup or photospread, how certain they were at the time of the identification, and so on. The U.S. Supreme Court ruling in Neil v. Biggers (1972), for example, set forth five criteria for use by courts in determining the likely accuracy of an eyewitness's identification of a criminal suspect: (a) the eyewitness's opportunity to view, (b) the attention paid by the eyewitness, (c) the accuracy of the witness's prelineup description of the culprit, (d) the certainty of the eyewitness, and (e) the amount of time between the event and the attempt to identify.
These five criteria have been criticized on theoretical and empirical grounds in psychology, including the fact that four of the five criteria rely on memory-based self-reports from the very eyewitnesses whose memory is being called into question (Wells and Murray, 1983). Nevertheless, these are the dominant formal criteria used in American courts today, especially for purposes of assessing motions to suppress the identification evidence.

Prior work on qualities of identification testimony has focused almost exclusively on the confidence of the eyewitness. Our work expands this to other qualities of eyewitness identification testimony, such as eyewitnesses' reports of how good their view was, how long it took them to make an identification, and so on. We are particularly concerned with the prospect that, like confidence, these other qualities of eyewitness identification testimony are malleable as a function of feedback.

Feedback to Eyewitnesses

There are no legal prohibitions against a police investigator telling eyewitnesses that they did or did not choose the actual suspect in the case (Wells, 1993), and such feedback occurs in real cases (Nettles, Nettles, & Wells, 1996). Courts have been concerned primarily with the idea that the person who administers the lineup should not influence the choice of the eyewitnesses, but they have shown no particular concern with the possibility that the investigators' postidentification comments might inflate the confidence of the eyewitnesses. Our concern flows from Wells' (1988, 1993; Wells & Seelau, 1995) argument that the person who administers the lineup or photospread should not indicate anything about the status of the person identified until a clear statement of confidence is taken from the eyewitness.
Immediately following an identification, the eyewitness should be asked to indicate how certain he or she is that the identified person is in fact the perpetrator [and] no clues of any kind should be given to the witness as to whether or not the identified person is the suspect in the case. (Wells, 1988, p. 126)

Up to this point, only one type of manipulation has been used in examining the influence of feedback on eyewitness identification testimony, namely feedback about the identification decision of a cowitness (Luus and Wells, 1994). Eyewitnesses who are given feedback that their cowitness identified the same person as they identified develop high confidence relative to those who are not told this. There are many other possible forms of feedback that could be tested. For example, following their identifications from a lineup, eyewitnesses might be told that the person they identified has committed offenses of this type previously, they might be told that this is a person who was detained close to the scene of the crime, that this is a person who was found with some of the stolen goods, and so on. We chose to use a particularly simple type of feedback manipulation, namely telling eyewitnesses that the person they chose from the lineup was or was not the actual suspect.

Our paradigm for studying feedback effects on eyewitnesses is a variation on the one used by Luus and Wells (1994). In this paradigm, eyewitnesses are given some form of feedback about their identification after they have made their identification. It is important to note that this paradigm randomly assigns eyewitnesses to feedback conditions only after they have already witnessed the event and after they have already made an identification from a lineup. We call this the postidentification feedback paradigm.
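The assignment step of the postidentification feedback paradigm can be illustrated with a minimal sketch. It assumes the three feedback conditions named in the abstract (confirming, disconfirming, none) and deals witnesses into equal-sized groups after shuffling. The article states only that assignment was random, so the balancing, the seed parameter, and every identifier here are hypothetical.

```python
import random

# The three feedback conditions named in the abstract; labels are illustrative.
CONDITIONS = ("confirming", "disconfirming", "none")


def assign_feedback(witness_ids, seed=None):
    """Randomly assign witnesses to feedback conditions.

    Intended to be called only after every witness has viewed the event
    and made an identification, so that the condition cannot influence
    either step. Balanced group sizes are an assumption of this sketch.
    """
    rng = random.Random(seed)
    ids = list(witness_ids)
    rng.shuffle(ids)
    # Deal the shuffled witnesses round-robin so that group sizes
    # differ by at most one.
    return {wid: CONDITIONS[i % len(CONDITIONS)] for i, wid in enumerate(ids)}
```

Because the condition is drawn after the identification has been made, any later difference between groups in reported certainty or quality of view cannot be traced to differences in what the witnesses saw or chose.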
Using this paradigm, we are assured through random assignment that eyewitnesses in each condition had equally good or poor memories of the perpetrator, paid equal amounts of attention at the time, had equally good or poor views of the event, were equally confident at the time of the identification, and so on. This paradigm, therefore, allows us to assume that any differences in eyewitnesses' reports of the witnessing experience as a function of the feedback manipulation are forms of false recall about the witnessing experience.

Broad Effects of Feedback?

In addition to testing the idea that such feedback inflates eyewitness identification confidence, we were interested in the possibility that the effects of feedback would affect other qualities of eyewitnesses' identification testimony. We decided to examine three categories of dependent measures relating to the quality of eyewitness identification testimony. First, there are the eyewitnesses' reports of the qualities of the witnessed event itself. Examples include eyewitnesses' recollections of how good of a view they had of the perpetrator, the amount of attention that they paid at the time, and recollections of how well they could see details of the perpetrator's face during the crime. Second, there are qualities of the identification task, such as the eyewitnesses' recollections of the amount of time that they took to make their identifications from the lineup, recollections of the ease with which they were able to identify the suspect from the lineup, and recollections of how confident they were when they made the identification. Third, there are what we call summative qualities of the witnessing experience. Summative qualities refer to current "bottom line" considerations of the eyewitnesses, such as the eyewitnesses' stated willingness to give testimony at trial or the eyewitnesses' current feelings that they had a good basis for making a positive identification.
We predicted that the effects of feedback would affect not only eyewitnesses' current feelings of willingness to give testimony but also their memory for how confident they were at the time of the identification, how good of a view they recall having had of the perpetrator, and so on. In fact, we predicted that the feedback manipulation would affect witnesses' reports from all three categories of the witnessing experience.

"On-Line" Versus Postcomputed Judgments

Our prediction regarding the effects of feedback stems largely from an assumption that eyewitnesses' impressions of the witnessing experience are not recorded on-line (see Hastie & Park, 1986). That is, eyewitnesses do not form clear impressions at the time of the event about how good or poor their view is, how much attention they are paying, how confident they are in their identification, and so on. Instead, people's memories for cognitive processes operating during an event (in this case the witnessed event as well as the event of making an identification) are, like other memories, reconstructions. Hence, answers to these questions are postcomputed (later) by eyewitnesses when the relevant question is asked of them. When later asked to judge how good their view was, for example, the eyewitness does not recall an impression or judgment but rather forms one.

We hypothesized that, during this formation of an answer, eyewitnesses would not be able to ignore the feedback information. Our reasoning is related to the well-documented "hindsight bias" or "knew it all along" effect in which a person believes that something was obvious all along when in fact it was obvious only after the answer was revealed to them (e.g., Fischhoff, 1975). Having no clear on-line memory for how confident they were at the time, confirming feedback should lead eyewitnesses to recall having been confident all along.
Similarly, in the absence of an on-line impression of how good their view was, confirming feedback should lead eyewitnesses to make the inference that they "must have" had a good view. Disconfirming feedback should have the opposite effect.
Securing False Identifications
In order to avoid the complication of identification accuracy as a factor, we decided to use only eyewitnesses who had made false identifications. Securing a false identification rate at or near 100% in an eyewitness identification experiment is relatively easy. The key to securing high false identification rates is to use a lineup in which the actual perpetrator is not present, include people who match the general description of the perpetrator, and suggest to the eyewitness that the perpetrator is in the lineup and that their task is to select him. Luus and Wells (1994) obtained a 97% false identification rate using this procedure and, as described later, we obtained a 100% false identification rate using this procedure. This allowed us to randomly assign these eyewitnesses to feedback conditions without concern for whether their identifications were accurate or inaccurate because all identifications were inaccurate.
The primary purposes of Experiment 1 were (a) to test the hypothesis that confirming postidentification feedback would lead eyewitnesses to recall having been more confident in their lineup identification than they really were at the time and (b) to test the hypothesis that the effects of such feedback are quite broad, influencing not only eyewitnesses' recollections of how confident they were at the time of identification but also leading them to falsely recall other qualities of the witnessing experience.
Experiment 1
Research participants were shown a grainy security camera video from a Target Store in which a man is shown entering the store. They were told to notice the man as they would be asked questions about him later.
After viewing the brief video, they were informed that the man murdered a security guard moments later (Iowa v. Chidester, 1995). Participants did not see the murder itself on video. They were then asked to identify the gunman from a photospread. The photospread was the same one used in the actual criminal case, except that we removed the gunman's photo. As shown in prior research, absence of the actual target from a lineup or photospread leads to a high rate of misidentification, especially when eyewitnesses are not specifically warned that the actual culprit might not be in the lineup (see Luus and Wells, 1994; Malpass & Devine, 1981). In fact, our procedure was successful in getting every participant to make a false identification. Following the false identification, the experimenter gave confirming feedback ("Good. You identified the actual suspect"), disconfirming feedback ("Actually, the suspect was number _"), or no feedback. A short time later, the participant-eyewitnesses were asked a number of questions, including how certain they were at the time of their identification decision, how good of a view they got of the gunman's face, how long it took them to identify the gunman from the photospread, and so on.
Method
Participants and cover story. Participants were 172 students who received a small amount of credit in their introductory psychology course for participating. The experiment was described simply as "impressions of others" and the sign-up form stated that they would view a short video and be asked some questions about a person or persons in the video. A session allowed for up to 2 participants, but some sessions had only 1 participant. On arrival, participants were told that we were interested in people's abilities to make judgments about other people, such as judgments of occupation and personality, based on a brief examination of the person's physical appearance.
It was explained that they would view a video from a store camera showing people walking in and out of the store and that they were to study some of these people closely as they would be asked questions about them later.
Procedure and materials. Following the initial instructions, participants were placed in individual cubicles that contained a video monitor. It was explained that the video image was grainy, and they were shown a 1.5-min segment of people walking into the store to give them an idea of what the video image would look like. Then, they were cued by a blank screen to pay attention to the next person walking by the camera as "this will be one of the people we will ask you about." At that point, participants saw the portion of the security camera video in which the person who later shot and killed a security guard walked in front of the camera. The video was slowed down at that point so that the gunman was in view for 8 s. Participants did not see the actual shooting. Figure 1 is a still reproduction from the video of the gunman walking in front of the camera. This is a very close reproduction of the actual quality of the video. As the reader can see, the image is rather poor. After the gunman was out of view, the experimenter reentered the cubicle and informed the participant that the person they just viewed had shot and killed a store security guard shortly after the video was taken, and they were, therefore, eyewitnesses to the identity of the gunman. It was explained further that the actual purpose of the study was to see if they could identify the gunman from a photospread. The experimenter then gave the eyewitness a photospread composed of five people. This was identical to the six-person photospread that was used in the actual case except that, unbeknownst to the eyewitness, the gunman's photo had been removed (and not replaced). Figure 2 is a reproduction of the photospread, which all participants saw.
Figure 3 is a photo of the actual gunman, which the participants did not see. The experimenter waited until the eyewitness selected someone from the photospread, which was indicated by checking a box corresponding to the number on the photograph.
Feedback manipulation. Experimenters were kept blind to feedback condition until after the eyewitness had made an identification. At that point, each participant was randomly assigned to one of three conditions, namely confirming feedback, disconfirming feedback, or no feedback. Eyewitnesses in the confirming-feedback condition were told "Good. You identified the actual suspect in the case." Eyewitnesses in the disconfirming-feedback condition were told "Oh. You identified number _. The actual suspect is number _." (If the eyewitness identified Number 1, which 91% did, then the disconfirming-feedback eyewitnesses were told that the actual suspect was Number 5. If the eyewitness identified someone other than Number 1, then the disconfirming-feedback eyewitnesses were told that the actual suspect was Number 1.) Eyewitnesses in the control condition were given no feedback.
Figure 1. Still reproduction of image of gunman from video.
Dependent measures. The experimenter then gave the eyewitness a form that asked a long series of questions, some of which were not central to the hypotheses of the study (e.g., "Do you wear corrective eyeglasses?") and are not reported. The principal questions are given in Table 1. The boldfaced word in each question serves as a shorthand name of the item for purposes of this article. Using the boldfaced words as reference, the order of the questions asked of the eyewitnesses was certain, view, seconds, face, distance, attention, basis, easy, long, willing, and trusted. These were the primary measures.
Then, on a separate page, participants were asked to provide a description of the gunman by using a checklist of features that included sex, race, age, height, build, hair color, hair type, facial hair, eyes, glasses, and complexion. Beside each of these characteristics were options (e.g., hair color had options of white, gray, blond, dark brown, light brown, red, black, other, and "don't know"). The "don't know" option was available for each of the characteristics. Following the checklist was an open-ended question asking about any other appearance characteristics. Following their completion of the dependent measures, participants were fully debriefed and dismissed.
Figure 2. Photospread shown to participant-witnesses (photos numbered 1 through 5).
Figure 3. Photo of actual gunman.
Table 1. Measures of the Witnessing Experience: Experiment 1
Qualities of the witnessed event
"How good of a view did you get of the gunman?" Scale: 1 (very poor) to 7 (very good)
"How many seconds would you estimate that the gunman's face was in view?" Scale: open response
"How well were you able to make out specific features of the gunman's face from the video?" Scale: 1 (not at all) to 7 (very well)
"What would you estimate was the distance between the camera-eye view and the gunman's face?" Scale: 10 feet to 70 feet in 10-foot increments
"How much attention were you paying to the gunman's face while viewing the video?" Scale: 1 (none) to 7 (my total attention)
Qualities of the identification
"At the time that you identified the person in the photospread, how certain were you that the person you identified from the photos was the gunman that you saw in the video?" Scale: 1 (not at all certain) to 7 (totally certain)
"How easy or difficult was it for you to figure out which person in the photos was the gunman?" Scale: 1 (extremely easy) to 7 (extremely difficult)
"After you were first shown the photos, how long do you estimate it took you to make an identification?" Scale: 1 (I needed almost no time to pick him out) to 7 (I had to look at the photos for a long time to pick him out)
Summative qualities
"On the basis of your memory of the gunman, how willing would you be to testify in court that the person you identified was the person in the video?" Scale: 1 (not at all willing) to 7 (totally willing)
"Assume that an eyewitness had about the same view of the gunman that you had from the video. Do you think that an identification by this eyewitness ought to be trusted?" Scale: 1 (definitely should not be trusted) to 7 (definitely should be trusted)
"To what extent do you feel that you had a good basis (enough information) to make an identification?" Scale: 1 (no basis at all) to 7 (a very good basis)
Note. Words in bold are shorthand terms used in the text to refer to questions (e.g., the view question).
Results
Clearly, the primary measures are correlated. Certainty, for example, correlated .51 with the average of the other measures, and the mean intermeasure correlation was .43. This makes sense. The greater the certainty, for example, the greater one's willingness to testify; the less time taken to make an identification, the easier the identification decision is perceived to have been; the better the view, the greater the ability to make out details of the face, and so on. Nevertheless, correlations among the measures do not inform us about whether these measures are each influenced by the manipulation. The feedback manipulation, for example, might affect certainty without affecting the eyewitnesses' statements about how good their view was of the perpetrator. As outlined in our introduction, a principal purpose of this work was to see whether the manipulation produces broad effects on a variety of measures that contribute to the credibility of an eyewitness.
Because of the large number of measures, we first conducted a multivariate analysis of variance (MANOVA) using the 11 primary measures. The MANOVA was highly significant, F(20, 1720) = 17.1, p < .001. Accordingly, the MANOVA was followed by univariate ANOVAs on each of the dependent measures. Significant univariates were obtained for the measures of certainty, view of the culprit, ability to make out details of the face, attention to the event, basis for making an identification, ease of making the identification, time taken to make an identification, willingness to testify, trust in an identification under these conditions, and details provided in the description, all Fs(2, 170) > 5.5, ps < .01. Only two of the primary measures were not significantly affected by the feedback manipulation, namely the number of seconds that the gunman's face was in view and the distance between the camera and the gunman's face.
For each of the significant univariate ANOVAs, we conducted three single-degree-of-freedom planned contrasts to locate the effects, namely contrasts between the no-feedback condition and the confirming-feedback condition, between the disconfirming-feedback condition and the no-feedback condition, and between the confirming-feedback condition and the disconfirming-feedback condition. Table 2 displays the means and standard deviations for all the significant measures. (Means for the ease of identification and time to make an identification measures were subtracted from 7 to make the higher numbers indicate greater ease of identification and a faster identification decision. The description measure was a composite score of the number of descriptor boxes checked, minus the number of "don't know" responses, plus the number of written descriptors.) The results of the single-degree-of-freedom planned contrasts are indicated by the subscripts in Table 2.
It is clear from these data that the confirming-feedback manipulation, when compared with the no-feedback control condition, yielded responses from the eyewitnesses indicating greater certainty in the identification, a better view of the culprit, a greater ability to make out details of the face, greater attention to the event, a stronger basis for making an identification, greater ease of making the identification, less time taken to make the identification, greater willingness to testify, more trust in an identification made under these conditions, and more details provided in the description. Comparisons between the disconfirming-feedback condition and the no-feedback condition yielded far fewer significant effects. Although the control condition yielded means that consistently fell between the confirming- and disconfirming-feedback conditions, only four of the ten contrasts between the disconfirming- and no-feedback conditions were significant, namely the attention measure, basis-for-the-identification measure, ease-of-identification measure, and willingness-to-testify measure. Hence, the confirming feedback clearly had a stronger effect in elevating certainty and the other measures than the disconfirming feedback had in reducing the levels of these measures.
To get a sense of the size of these effects, we calculated the effect-size statistic d (the difference between the means in standard deviation units) comparing the confirming- and disconfirming-feedback conditions for all the significant measures.
The effect sizes were d = 1.56 for certainty in the identification, d = 0.77 for view of the culprit, d = 1.05 for ability to make out details of the face, d = 1.11 for attention to the event, d = 1.55 for basis for making an identification, d = 1.42 for ease of making an identification, d = 0.77 for time taken to make an identification, d = 1.38 for willingness to testify, d = 0.76 for trust in an identification made under these conditions, and d = 0.46 for details provided in the description. The average effect size comparing the confirming-feedback condition with the no-feedback condition across these 10 significant measures was d = 0.75. The average effect size across these 10 measures for the disconfirming-feedback condition compared with the no-feedback condition was only d = 0.27.
Discussion
The results clearly show that the feedback manipulation not only affected the eyewitnesses' reports of how certain they were at the time of the identification but also affected other judgments about the witnessed event. Effects were observed in all three categories, namely their recall of qualities of the witnessed event, qualities of the identification, and summative qualities. The effects were even stronger than we had expected. Using the effect-size statistic d, Cohen (1988) stated that a small effect is 0.2, a medium effect is 0.5, and a large effect is 0.8. With these conventional values, the confirming- versus disconfirming-feedback effect was large or very large on the eyewitnesses' reports of certainty, view, ability to make out features of the face, attention, basis for making an identification, ease of making an identification, the amount of time taken to make an identification, willingness to testify, and trust of an identification made under these conditions.
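The effect-size arithmetic described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the authors' analysis code. The article defines d only as the difference between the means in standard deviation units, so the pooled standard deviation used here is an assumption, as are the function names, the rule of treating Cohen's values as minimum cutoffs, and the made-up ratings. The `reverse_from_seven` helper mirrors the statement in the Results that the ease and time means were subtracted from 7 so that higher numbers indicate greater ease and faster decisions.

```python
import math
from statistics import mean, variance


def reverse_from_seven(score):
    """Subtract a 1-7 rating from 7 so that higher means easier or faster (range becomes 0-6)."""
    return 7 - score


def cohens_d(group_a, group_b):
    """Difference between group means in standard deviation units.

    Assumption: the standard deviation is pooled across the two groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def cohen_label(d):
    """Label |d| using Cohen's (1988) conventional values as minimum cutoffs (an assumption)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Hypothetical raw "ease" ratings (1 = extremely easy, 7 = extremely difficult).
# These numbers are invented for illustration and are not data from the study.
confirming_raw = [2, 3, 4]
disconfirming_raw = [4, 5, 6]

# Reverse-score so that higher numbers indicate greater ease.
confirming = [reverse_from_seven(x) for x in confirming_raw]
disconfirming = [reverse_from_seven(x) for x in disconfirming_raw]

d_ease = cohens_d(confirming, disconfirming)
print(round(d_ease, 2), cohen_label(d_ease))
```

Reverse-scoring both groups flips the sign of d without changing its magnitude, which is why the reversal affects only the direction in which the ease and time effects are read.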
The disconfirming feedback appears to have been much less successful in lowering the eyewitnesses' confidence and their answers on the other nine measures than the confirming feedback was in raising their scores on these measures. We have no definitive explanation for this asymmetry. However, we caution against the assumption that it is generally easier to raise than to lower scores on these variables. We suspect that it depends a great deal on how the feedback is operationalized. Our disconfirming feedback, for instance, was operationalized by telling the eyewitnesses that the actual suspect was another person in the photospread. Some research suggests that this might not be as powerful as telling eyewitnesses that the suspect was not in the photospread at all (because this is perhaps the one possibility that the eyewitnesses had not considered prior to feedback, see Luus and Wells, 1994).
The findings from Experiment 1 are striking because they indicate an inability of the eyewitnesses to recall the witnessing experience and the identification experience in a manner that reflects how they actually experienced these events at the time. Random assignment to conditions allows us to assume that eyewitnesses in all three conditions had, on average, the same qualities of the witnessed event and qualities of the identification. Hence, the robust differences in means over the three conditions are strong evidence consistent with our hypothesis that feedback distorts answers to these questions.
Experiment 1 is important in establishing that feedback leads eyewitnesses to distort their reports of the witnessing experience across a broad array of questions. The practical implications are profound. These findings mean that extramemorial factors, having nothing to do with the actual quality of their view or the uncertainty that they actually felt at the time, can distort the eyewitnesses' judgments.
This, in turn, means that criteria used to evaluate identification evidence (e.g., the Biggers criteria of certainty, opportunity to view) can actually be driven by the behavior of the agent administering the lineup, in particular the agent's decision regarding whether to give feedback to the eyewitness or not. We think that this is not at all what the U.S. Supreme Court had in mind when articulating the Biggers criteria. These results underscore the argument that lineups and photospreads should be administered by someone who is blind as to which person is the suspect so that the lineup administrator cannot give the eyewitness feedback about the identification (Wells, 1988).
Although the applied import of Experiment 1 is clear, it tells us little about the psychological processes involved. To what extent, for example, are the eyewitnesses aware of the way the feedback is influencing their answers? Recall that our hypothesis was that eyewitnesses must construct answers to the questions because they did not form impressions on-line at the time that they experienced the events. In other words, they had no prior memories of what they thought about their view or what they thought about their confidence at the time. Without a prior impression, the eyewitnesses have no ability to compare and contrast their current (postfeedback) impressions with impressions that they had at the time. Hence, our hypothesis implies that these effects are not subject to introspective access and are occurring largely outside of awareness. One of the purposes of Experiment 2 was to examine this issue.
Experiment 2
To what extent can eyewitnesses report accurately on the influence of the feedback manipulation? Can the effects of feedback be prevented somehow? And, are there other variables that are affected by the feedback manipulation? Experiment 2 was designed to replicate Experiment 1 as well as address these three new questions.
These questions are important to our understanding of the effect and to the practical question of how to prevent it in actual cases.
Can Eyewitnesses Report Accurately on the Influence of Feedback?
One of the tenets of the legal system is that the eyewitness will be deposed at trial and can, therefore, testify as to whether the agent tried to influence him or her and whether the agent's actions in fact had any influence. If the eyewitness can articulate that influence, the jury can then take that into account as appropriate. We suspect that most eyewitnesses in our experiment can report accurately on the statements that were made to them by the agent. In Experiment 1, we had the clear impression from debriefing interviews that they were keenly aware of whether they had been told that they identified the actual suspect or identified the wrong person. On the other hand, we are skeptical of the presumption that they would be able to report accurately on the influence that the feedback statement had on their judgments. Hence, Experiment 2 included a series of questions asking participants whether what they were told (i.e., the feedback) influenced how they responded to the key dependent measures. To the extent that some eyewitnesses reported that the feedback influenced their answers and others reported that the feedback did not influence their answers, we can compare these two groups to see whether their reports of influence comport with actual evidence of influence.
Can the Feedback Effect Be Prevented?
Wells (1988) recommended that a clear statement be taken from the eyewitness at the time of the identification by a blind agent as to his or her confidence. Wells reasoned that this should "freeze" the eyewitnesses' confidence and produce resistance to confidence inflation from later feedback. We tested this idea in Experiment 2.
We also speculated that asking eyewitnesses about their confidence prior to the feedback manipulation would help prevent the manipulation from affecting other responses taken later (such as their answers about how good of a view they had of the perpetrator). We call this the confidence-prophylactic hypothesis in reference to the idea that asking eyewitnesses about their confidence before they receive feedback could somehow shield them from the various effects of feedback. This prediction stems from our general assumption that the eyewitnesses in Experiment 1 had no clear premanipulation thoughts regarding matters such as the quality of their view, the ease of the identification task, and so on, until they were asked, which was after the manipulation. We reasoned that asking eyewitnesses about their confidence could lead them to think about various qualities of the witnessed event (e.g., how good was my view?) and the identification task (how easy was it for me to pick the gunman from the video?) in order to answer the confidence question. When the confidence question is asked before the manipulation, therefore, the eyewitness might form premanipulation impressions of these qualities, which could then be used to answer questions about these qualities later. Although these impressions would not have been formed on-line, they would have been formed before the manipulation and, hence, could make the eyewitness less influenced by the manipulation.
We hypothesized that asking eyewitnesses about confidence prior to the manipulation would lead them to think broadly about other confidence-relevant judgments (such as how good their view was) more than asking eyewitnesses about these other judgments (such as their view) would lead them to think about their confidence. Of course, it might be the case that asking the eyewitnesses any question about the event prior to the feedback manipulation might moderate the effects of the feedback.
Hence, for purposes of comparison, some eyewitnesses were asked about their certainty prior to the manipulation (certainty-question-first condition), others were asked about their view prior to the manipulation (view-question-first condition), and others were asked nothing prior to the manipulation (no-prior-question or control condition). Following the manipulation, all eyewitnesses were asked about their certainty (for either the first or second time, depending on condition), their view (for either the first or second time, depending on condition), and the other questions (all for the first time). These three conditions were crossed with confirming versus disconfirming feedback.
Are There Other Variables Affected by the Feedback Manipulation?
Recent empirical evidence indicates that eyewitnesses are able to report on the processes or strategy by which they made their identification decisions and that these reports are statistically related to the chances that the identification was accurate. After their identifications, eyewitnesses who report that the suspect's photo just "popped out" are more likely to have made an accurate identification than are eyewitnesses who report that they used a "process of elimination" (see Dunning & Stern, 1994; Lindsay et al., 1991). The former is the type of absolute judgment that is relatively automatic and is the common manner in which pure recognition memory is presumed to operate; the latter is a relative judgment that implies deliberation and inference processes rather than recognition memory per se (see Wells, 1984). We already noted in Experiment 1 that eyewitnesses who received confirming feedback recalled making their identification decision faster than those given disconfirming feedback. In fact, faster identifications are generally more likely to be accurate (see Sporer, 1993).
Furthermore, there is evidence that eyewitnesses seem to use the amount of time it took them to make an identification to infer their own accuracy (Kassin, 1985; Kassin, Rigby, & Castillo, 1991). Could feedback also distort their recollections of whether they used a deliberated process of elimination versus an automatic process of recognition? We added a dependent measure to test this hypothesis.
We also speculated that feedback might influence the eyewitnesses' perceptions of the extent to which they are generally competent at identifying faces. Hence, we added a question about the extent to which participants thought they were good at recognizing the faces of strangers that they had seen on only one prior occasion. This question is importantly distinct from the other dependent measures in the sense that it asks participants about a general ability (outside of the experiment) rather than something involving the events in the experiment.
Method
Overview. Experiment 2 used the same materials and paradigm developed for Experiment 1 with some key changes. First, we dropped the control condition (no feedback). It is clear from Experiment 1 that mean responses in the no-feedback condition fall in between the confirming and disconfirming conditions. Second, we crossed the feedback manipulation with whether the eyewitness was asked about certainty after the identification but prior to the feedback (certainty-question-first condition), was asked about view after the identification but prior to the feedback (view-question-first condition), or was asked nothing between the time of the identification and the feedback. Third, we added new questions and slightly modified a few questions from Experiment 1.
Participants and cover story. Participants were 198 students who received a small amount of credit in their introductory psychology course for participating. The cover story and instructions were the same as described in Experiment 1.
Procedure and materials.
The procedure and stimulus materials were the same as those used in Experiment 1.
Feedback manipulation. The feedback manipulation was the same as that used in Experiment 1 except that there was no no-feedback condition. Participants received either the positive or the negative feedback.
Question-type manipulation. One third of the participants received the feedback immediately after making their identification, as in Experiment 1. Another third of the participants were given a question about their certainty immediately after their identification (certainty-question-first condition) and then received the feedback. The other third of the participants were given a question about their view immediately after their identification (view-question-first condition) and then received feedback. (See Table 3.) Experimenters were kept blind to the question-type manipulation until immediately prior to the manipulation.
Primary dependent measures. Table 3 lists the questions that were asked of eyewitnesses in Experiment 2 in the order in which they appeared by condition. These are the same questions used in Experiment 1 with the following exceptions: For reasons discussed earlier, we added a question about the eyewitnesses' abilities to identify faces, which was worded as "Generally, how good is your recognition memory for faces of strangers you have encountered on only one prior occasion?" from 1 (very poor) to 7 (excellent). In order to see if asking the certainty question before the manipulation would guard against confidence inflation, we developed a second confidence question to ask after the manipulation, "At the time that you identified the person in the photos, how sure were you that the person you identified was the gunman in the video?" from 1 (totally unsure) to 7 (totally sure). We also developed a second view question to ask after the manipulation, "How well could you see the gunman?" from 1 (very poorly) to 7 (very well).2 In addition, we added a question about the clarity of the eyewitnesses' memorial image of the gunman, "How clear is the image you have in your head of the gunman you saw in the video?" from 1 (not at all clear) to 7 (very clear). Finally, for reasons discussed previously, we added a question about the strategy that the eyewitnesses used in identifying the suspect, "Which one of the following statements best describes how you went about trying to identify the gunman from the five photos?" with alternatives 1 (The gunman's photo just "popped out" at me and I recognized it immediately) or 2 (I used a process of elimination, deciding which photos were not of the gunman before deciding which photo must be that of the gunman). These were the primary measures.
2 We thought that asking exactly the same question about certainty or view both before and after the manipulation would have led to a trivial interpretation had they given the same answer on both occasions, namely "wanting to appear consistent." This could happen in actual cases, of course, but it would still beg the question of what happens if the question is changed at a surface level and yet is still measuring the same concept. Hence, the second time that participants were asked about certainty and view, we changed the question slightly to permit the participant to give a different answer.
Table 3. Order of Manipulation and Measures in Experiment 2
No-question-first condition (a): Feedback manipulation; Certainty question; View question; Time face was in view; Details of face; Distance question; Attention question; Basis question; Ease of identification; Time to identification; Willingness question; Trust question; Memory for faces; Second certainty question; Second view question; Clarity of image question; Strategy question.
Confidence-question-first condition (b): Certainty question; Feedback manipulation; View question; Time face was in view; Details of face; Distance question; Attention question; Basis question; Ease of identification; Time to identification; Willingness question; Trust question; Memory for faces; Second certainty question; Second view question; Clarity of image question; Strategy question.
View-question-first condition (c): View question; Feedback manipulation; Certainty question; Time face was in view; Details of face; Distance question; Attention question; Basis question; Ease of identification; Time to identification; Willingness question; Trust question; Memory for faces; Second certainty question; Second view question; Clarity of image question; Strategy question.
(a) No question before feedback manipulation. (b) Confidence question before feedback manipulation. (c) View question before feedback manipulation.
Measures of awareness of influence. Participants were then given a questionnaire asking whether the experimenter told them anything about the person identified (yes or no) and, if so, to indicate what was said. Then, participants were asked "If the experimenter told you anything about the person you identified, did that have any influence on the way you answered the question about your certainty that you correctly identified the gunman?" and were requested to circle yes or no. This question was repeated for the items view, face, attention, basis, ease, time, willingness, and trust. After answering these questions, participants were debriefed and dismissed.
Results
Because of the large number of dependent measures, we first conducted a 2 (Confirming vs. Disconfirming Feedback) X 3 (Confidence Asked First, No Prior Question, View Asked First) MANOVA using the 16 primary measures. The three-way interaction between these measures and the manipulations was highly significant, F(30, 2550) = 21.5, p < .001.
Given the significant interaction between the manipulations and the measures, we used 2 (Confirming vs. Disconfirming Feedback) × 3 (Confidence Asked First, No Prior Question, View Asked First) ANOVAs on the individual measures along with single-degree-of-freedom tests when the interaction was significant. The general prediction was that there would be a main effect for feedback on nearly all items, thereby replicating the robust effects observed in Experiment 1. In addition, the confidence-prophylactic hypothesis predicted an interaction. This interaction should be of the form in which the effect of feedback in the no-question-first condition is larger than the effect of feedback in the confidence-first condition.

Significant main effects of feedback were observed for 13 of the 16 primary measures. Compared with those in the disconfirming-feedback conditions, those in the confirming-feedback conditions reported greater certainty, a better view on the first view question, a greater ability to make out details of face, a stronger basis for making an identification, greater ease of making the identification, less time to make the identification, a greater willingness to testify, a greater amount of trust the eyewitness would have in an identification like this, a better general ability to remember faces of strangers, greater certainty on the second certainty question, a better view on the second view question, greater clarity of image of gunman in memory, and a greater likelihood of reporting that the culprit's photo just "popped out," all Fs(1, 170) > 6.0, ps < .05. Hence, the robust effects observed in Experiment 1 were replicated and extended to include measures of the strategy that the witnesses reported using, the clarity of the gunman's image in memory, and the eyewitnesses' reports of their general ability to remember faces of strangers. The means and standard deviations for the 13 significant measures are shown in Table 4.
Only three of the primary measures failed to show a main effect for the feedback manipulation, namely attention, distance, and length of time the gunman's face was in view. Recall that the distance and length-of-time measures were also not significantly influenced by the feedback manipulation in Experiment 1.

Table 4
Mean Judgments as Functions of Feedback and First-Question Conditions

                                No question first             View question first           Certainty question first
Measure                         Confirming     Disconfirming  Confirming     Disconfirming  Confirming     Disconfirming
First certainty question        5.0* (1.7)     3.9* (1.5)     4.8* (1.3)     3.4* (1.4)     4.1† (1.4)     4.5† (1.4)
First view question             4.6* (1.4)     3.4* (1.2)     3.9† (1.2)     3.7† (1.3)     4.1  (1.3)     3.6  (1.0)
Details of face                 4.2* (1.4)     3.2* (1.0)     4.2* (1.1)     3.0* (1.1)     3.9* (1.4)     3.2* (1.0)
Basis for making ID             4.8* (1.3)     3.5* (1.3)     4.6* (1.3)     3.4* (1.3)     4.2  (1.4)     3.7  (1.1)
Ease of making ID               3.4* (1.8)     2.2* (1.4)     3.5* (1.6)     1.9* (1.0)     3.0* (1.4)     1.7* (1.0)
Time to make ID                 3.6* (1.7)     2.5* (1.4)     3.8* (1.1)     2.5* (1.1)     3.3* (1.2)     2.3* (1.3)
Willingness to testify          4.7* (1.6)     4.0* (1.3)     4.8* (1.6)     3.9* (1.0)     4.1  (1.8)     3.7  (1.3)
Trust in an ID like this        4.5* (1.5)     3.4* (1.5)     4.5* (1.7)     3.2* (1.4)     4.1  (1.4)     3.8  (1.5)
Memory of faces for strangers   5.4* (1.3)     4.6* (1.5)     4.9  (1.6)     4.5  (1.5)     5.3* (1.2)     4.4* (1.4)
Second certainty question       5.2* (1.4)     3.9* (1.4)     5.3* (1.2)     3.8* (1.5)     4.6  (1.5)     4.5  (1.3)
Second view question            4.5* (1.2)     3.3* (1.3)     4.5* (1.2)     3.0* (1.1)     4.0  (1.4)     3.5  (1.3)
Clarity of image of gunman      5.4* (1.1)     3.6* (1.0)     4.8  (1.4)     3.9  (1.4)     4.8  (1.4)     4.2  (1.1)
Photo "popped out"              19%* (0.4)      1%* (0.1)     33%* (0.5)     10%* (0.3)     15%* (0.3)      3%* (0.2)

Note. Cell entries are means with standard deviations in parentheses. Entries marked † are measures taken prior to the manipulation. Higher means indicate more of the quality.
*p ≤ .01 (one-tailed; means comparing confirming to disconfirming feedback within first-question condition).

Tests of the confidence-prophylactic hypothesis. Did asking the confidence or view question before the manipulation moderate the effect of feedback? We conducted 2 (Feedback) × 2 (No Question First vs. Confidence Question First) ANOVAs to test for the interaction effect on the 12 measures that produced significant feedback main effects.³ Among the 12 measures for which there was a significant main effect for feedback, the question-type manipulation moderated the effect of feedback on seven of these measures, namely the first view question, the basis for making an identification, willingness to testify, amount of trust the eyewitness would have in an identification like this, the second certainty question, the second view question, and perceived clarity of image of gunman in memory, all interaction Fs(1, 170) > 7.0, ps < .01. The confidence-question-first manipulation did not moderate the influence of feedback on measures of ability to make out details of face, perceived ease of making the identification, time to make the identification, the eyewitnesses' general ability to remember faces of strangers, or the tendency for the gunman's face to "pop out" in the photospread. None of the interactions between the feedback manipulation and the view-question-first versus no-question-first conditions were significant.

Table 4 shows the results of pair-wise comparisons between the confirming- and disconfirming-feedback manipulation within question-type conditions for the 13 measures that yielded significant main effects for feedback.
The pattern of results is very clear. Asking eyewitnesses the view question prior to the feedback manipulation did not eliminate the significant effect of feedback on any measure with the possible exception of the items regarding general memory for strangers and clarity of the image of the gunman (recall, however, that the interactions were not significant on these items). In fact, asking the view question before the feedback manipulation did not even serve to eliminate the effect of feedback on the later question about view. This is in sharp contrast to generally null effects of feedback when the certainty question is asked prior to the manipulation. Of the 12 measures that were affected by feedback in the control condition, 7 were rendered nonsignificant in the certainty-question-first condition. Hence, there is good support for the confidence-prophylactic hypothesis, but the effects of feedback are not entirely prevented by asking the eyewitnesses about their confidence prior to their being exposed to feedback.

Awareness of the effects of feedback. When asked whether the experimenter had told them something about the identification that they had made, 90% of the eyewitnesses answered yes and were able to state the basic feedback that they were given. This rate did not vary by condition. When asked if the feedback had influenced the way that they answered the certainty question, 58% said no.
The percentages who denied that feedback affected how they answered each of the other questions were 78% for view, 88% for length of time the face was in view, 70% for face details, 90% for distance, 73% for attention, 57% for basis for making an identification, 70% for ease of identification, 88% for time to make an identification, 55% for willingness to testify, and 52% for trust. In order to examine whether those who said that the manipulation did not influence them were any less influenced than were those who said that the manipulation did influence them, we conducted an internal analysis. We partitioned participants on their yes and no answers to these questions and created a 2 (Yes-No response to the question of influence) × 2 (Confirming vs. Disconfirming Feedback) factorial to test for the interaction. If participants who said no were less influenced than were those who said yes, then there should be an interaction on the relevant measures in which the effect of feedback is smaller for those who said no than for those who said yes. Because such a small proportion of participants said yes on most of the questions, we restricted our analyses to the five items that were closer to equal numbers of yes and no responses, namely certainty, face details, basis for identification, willingness to testify, and trust. There were no significant interactions on these items. Table 5 shows the means for these five items. Clearly, eyewitnesses who said that they were not influenced by the manipulation were no less influenced than were those who said that they were influenced.

Discussion

The feedback effect replicated on the key measures of certainty, view, ease of identification, ability to make out details of the face, willingness to testify, and so on from Experiment 1.
In addition the feedback effect was extended to three other measures, namely clarity of the image of the face in memory, ability to recognize the faces of strangers, and strategy used to make the identification decision. Hence, the feedback effect on reconstructions of the witnessing experience is even broader than suggested in Experiment 1. The confirming feedback not only made the eyewitnesses more certain and made them believe that their view was better but also actually changed their retrospective accounts of the identification experience, such as leading them to be more likely to report that the culprit's face just "popped out" to them when they were shown the photospread.

³ There were 13 significant main effects rather than 12, but for purposes of these interaction analyses the first certainty question was not included because it was a premanipulation question in the certainty-question-first condition.

Table 5
Mean Responses by Eyewitnesses Who Said That Feedback Did Versus Did Not Influence Their Answers

Feedback variable    Certainty    Face details    Basis for identification    Willingness to testify    Trust
Participants who said that feedback influenced their answer
  Disconfirming      3.88         3.09            3.57                        3.73                      3.51
  Confirming         4.85         5.14            4.67                        4.53                      4.45
Participants who said that feedback did not influence their answer
  Disconfirming      3.75         3.32            3.47                        3.88                      3.66
  Confirming         5.12         4.87            4.75                        4.47                      4.40

This is particularly provocative in light of evidence that eyewitnesses who make accurate identifications are more likely to report that the culprit's face just popped out to them than are inaccurate eyewitnesses, who generally report using a process of elimination (see Dunning & Stern, 1994; Lindsay et al., 1991). We also think it important to note that the manipulation led eyewitnesses to change their perceptions of how good they generally are at recognizing strangers.
The significance of this finding owes to the fact that it is asking about a general ability rather than asking about some aspect of the specific episode that was witnessed. Hence, we think that these effects are surprisingly broad and largely unexpected from any existing theory in social or cognitive psychology.

There was clear benefit to asking eyewitnesses about their confidence prior to the feedback manipulation. The confidence-question-first manipulation not only prevented the manipulation from affecting the second confidence measure, it also moderated the effect of feedback on eyewitnesses' answers to questions about their view, the basis for their identification, their willingness to testify, the trust they would place in such an identification by someone else, and the second view question. The confidence-first manipulation did not significantly moderate the effect of the feedback on reports of their ability to make out details of the culprit's face, the ease with which they thought they made the identification, the amount of time they thought it took them to make the identification, the strategy they reported having used in making the identification, or their perceptions of how good they are at recognizing faces. The view-question-first manipulation, on the other hand, appears to have had no prophylactic properties at all. In fact, asking about view prior to the manipulation did not even protect against the effects of the manipulation on their later answers about their view. Hence, we found support for the confidence-prophylactic hypothesis, but the prophylactic properties of asking eyewitnesses about their confidence prior to their getting feedback are clearly incomplete. Further, we are concerned that the prophylactic properties might be short lived.
Future research, for example, should examine what happens if the feedback occurs days or weeks later, when the eyewitnesses might no longer recall how uncertain they were at the time of the identification.

We have no satisfactory explanation at this point as to why the confidence-question-first manipulation successfully moderated the effects of feedback on most of the dependent measures but did not moderate others. There seems to be no clear pattern. We note, for instance, that the confidence-question-first manipulation prevented feedback from influencing eyewitnesses' later statements about how good of a view they had of the gunman, but it did not prevent the feedback from influencing the extent to which they reported being able to make out details of the gunman's face. These judgments seem very related, and this leaves us puzzled as to why a prophylactic effect would occur for one of these measures but not for the other. Future research will have to determine the explanation for the incompleteness of the confidence-prophylactic effect. The practical message, however, is relatively straightforward, namely that a complete prophylactic effect will likely require that several questions be asked of the eyewitness before the eyewitness is exposed to feedback.

Because of the very short period of time between the feedback manipulation and eyewitnesses' recall of the feedback, it is not surprising that almost all eyewitnesses could accurately report that they had received feedback. Furthermore, we are not surprised that most eyewitnesses denied that the manipulation influenced how they answered the questions. More interesting is the fact that those who denied the influence nevertheless showed as much influence from the manipulation as did those who admitted that the feedback manipulation influenced their answers.
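
The comparison just described, between eyewitnesses who denied and those who admitted being influenced, can be illustrated arithmetically from the Table 5 means. The following is a minimal sketch, not the authors' analysis: it computes the feedback effect (confirming minus disconfirming mean) within each self-report group and then the difference between those two effects for each item. The variable and function names are hypothetical, and the cell means are transcribed from Table 5. Because Table 5 reports neither cell sizes nor variances, the sketch is purely descriptive and says nothing about statistical significance.

```python
# Table 5 cell means, keyed by (self-report group, feedback condition).
# Values are transcribed from Table 5; all names here are illustrative.
ITEMS = ["certainty", "face details", "basis", "willingness", "trust"]

TABLE_5 = {
    ("said influenced", "disconfirming"): [3.88, 3.09, 3.57, 3.73, 3.51],
    ("said influenced", "confirming"): [4.85, 5.14, 4.67, 4.53, 4.45],
    ("said not influenced", "disconfirming"): [3.75, 3.32, 3.47, 3.88, 3.66],
    ("said not influenced", "confirming"): [5.12, 4.87, 4.75, 4.47, 4.40],
}


def feedback_effect(group):
    """Confirming minus disconfirming mean for each item within one group."""
    confirming = TABLE_5[(group, "confirming")]
    disconfirming = TABLE_5[(group, "disconfirming")]
    return [round(c - d, 2) for c, d in zip(confirming, disconfirming)]


effect_yes = feedback_effect("said influenced")
effect_no = feedback_effect("said not influenced")

# Descriptive interaction contrast: a positive value means the feedback
# effect was larger among those who said they were influenced.
contrast = [round(y - n, 2) for y, n in zip(effect_yes, effect_no)]

for item, y, n, c in zip(ITEMS, effect_yes, effect_no, contrast):
    print(f"{item:<13} yes: {y:+.2f}  no: {n:+.2f}  yes minus no: {c:+.2f}")
```

Under these assumptions the contrast ranges from -0.40 (certainty) to +0.50 (face details) and changes sign across items, so the means show no consistent tendency for those who denied influence to be less affected, which is in line with the reported absence of significant interactions.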
There are several lines of evidence in social psychology, many first outlined by Nisbett and Wilson (1977), that people do not have introspective access to higher order cognitive processes. Often, as with the eyewitnesses who indicated that they were not influenced by the feedback, people are unaware that a stimulus has influenced their responses. In other cases, people indicate that they were influenced by a stimulus because it seems reasonable that they probably were influenced, but not because they had true introspective access to the cognitive processes. We suspect that the eyewitnesses who said that the manipulation influenced their answers were not relying on true introspective access, but instead simply thought it reasonable that there would be such an influence. This leads us to conclude that self-reports about the influence of the feedback are unreliable indicators of the actual influence of the feedback. These data suggest that the eyewitness who is asked at trial whether the comments of a lineup agent are influencing the way they are answering the questions at trial is in fact unable to accurately report on that influence. Hence, the legal tenet that the defense can depose the eyewitness and determine such influences through the eyewitness's testimony is, we think, untenable.

General Discussion and Implications

This work demonstrates that a casual comment from a lineup administrator following eyewitnesses' identifications can have dramatic effects on their reconstructions of the witnessing and identification experience. A confirming-feedback remark not only inflates eyewitnesses' recollections of how confident they were at the time, it also leads them to report that they had a better view of the culprit, that they could make out details of the face, that they were able to easily and quickly pick him out of a lineup, that his face just "popped out" to them, that their memorial image of the gunman is particularly clear, and that they are adept at recognizing faces of strangers. These effects were very robust, with effect sizes that exceed what are normally considered large effects in psychology (Cohen, 1988).

How might these effects play out in real cases? A reasonable assumption is that the likelihood that an eyewitness's identification will form a basis for prosecution is determined by the eyewitness's ability to pass some threshold of credibility. Let's suppose, for example, that an eyewitness should be at the 6 or 7 level of a 7-point scale on confidence, goodness of view, and so on, in order for the prosecutor to consider the eyewitness's testimony to be strong evidence against the accused. In Figure 4, we have selected critical upper cutoff values (6s and 7s on the 7-point scale) for the measures of certainty, view, face detail, basis, ease, and clarity of the image of the gunman's face in memory. Reported are the percentages of eyewitnesses in Experiment 2 (from the no-question-first conditions only) who answered these questions using the numbers 6 or 7 on these measures as a function of the feedback manipulation. Notice, for example, that 50% of the eyewitnesses in the confirming-feedback condition stated that they were a 6 or 7 on the 7-point certainty scale versus only 15% giving such extreme confidence ratings in the disconfirming-feedback condition. This is a huge effect. Similarly, 47% of the eyewitnesses in the confirming-feedback condition stated that they were a 6 or 7 on the 7-point clarity-of-image scale versus none reporting such a high level of clarity in the disconfirming-feedback condition. In effect, the confirming-feedback manipulation served to manufacture credible witnesses from a pool of inaccurate witnesses who were not particularly credible on their own.

We are alarmed by these findings. Although we had good reason to suspect that there would be robust effects of feedback on retrospective confidence (owing to the earlier findings of Luus and Wells, 1994, involving cowitness effects), we did not expect such large and broad effects on other testimony-relevant judgments of the eyewitnesses.

Figure 4. Percentages of eyewitnesses who responded with high scores on key questions by feedback condition. (Figure not reproduced. Horizontal axis: Measure, with categories Certainty, View, Face, Basis, Ease, and Image; vertical axis: percentage, 0% to 60%.)

Irony in Light of Biggers

In light of the important Biggers criteria (Neil v. Biggers, 1972) used by U.S. courts to assess the accuracy of eyewitness identifications, these findings have dramatic forensic relevance. In effect, these findings indicate that four of the five Biggers criteria (confidence, view, attention, and description of the perpetrator) can be influenced by a suggestive procedure in the form of a casual postidentification comment. This creates a situation that we find ironic. The U.S. Supreme Court developed the Biggers criteria to address the mounting claims that identification evidence should be suppressed when the procedures are suggestive. The Court ruled that suggestive procedures are not in and of themselves grounds for discounting eyewitnesses because the critical issue concerns the likelihood that the identification was accurate rather than whether the procedure was suggestive per se. This is where the Biggers criteria come into play because, as long as the eyewitness comes across strongly on some or most of the five Biggers criteria, it is assumed that the suggestive procedure was not problematic. But these data show that a suggestive procedure itself (the feedback) can cause eyewitnesses to come across strongly on the Biggers criteria. So, for example, an argument that the use of feedback is suggestive would not result in a successful motion to suppress the evidence because the eyewitness is certain, claims to have had a good view, and so on. Of course, the eyewitness is certain, claims to have had a good view, and so on because of the suggestive procedure, but the Biggers criteria do not allow for such an analysis. We are sure that the Court had no idea in 1972 that feedback would have the kinds of effects observed here and probably gave no thought to the feedback problem at all. In light of the current findings, however, arguing that a suggestive procedure is not a problem because of the eyewitness's high standing on the Biggers criteria is a bit like arguing that a forensic DNA procedure that contaminated the suspect's blood with the sample at the crime scene is not a problem because the lab results show that the match is virtually perfect.

Because the agent administering the lineup or photospread routinely knows which person is the suspect in real cases, and because there are no legal prohibitions against the agent telling the eyewitness things about the accused after the identification decision, we believe that the legal policy implications of this work are immense. In particular, we endorse the recommendation that the person who administers the lineup or photospread should be someone who is blind as to which person is the suspect in the case (following Wells, 1988; Wells, 1993; Wells and Seelau, 1995). Without such blind testing, we see no sure way of keeping the agent from influencing the eyewitness.
In addition, this same (blind) lineup administrator should (at a minimum⁴) secure a confidence statement from the eyewitness at the time of the identification. As shown in Experiment 2, obtaining such a statement can serve to help prevent later distortions of the eyewitnesses' confidence, reports of their view, and so on. Even if the confidence-prophylactic effect is short lived, at the very least the confidence statement taken at the time of the identification can then be a matter of record and subject to usual discovery procedures so that any later inflation in confidence can be noted for the trier of fact and perhaps discounted accordingly.

We acknowledge certain limitations to this work. The witnessed event, for example, was not a live crime. Furthermore, we used only perpetrator-absent lineups, leaving us largely ignorant at this point about the malleability of these judgments for eyewitnesses who make accurate identifications. It could also be argued that the "payoff matrix" (i.e., consequences of a false identification vs. a correct identification) is quite different in an experiment than in a real case. Nevertheless, we see no reason why the robust effects observed here would somehow disappear in real cases, and feedback might even have stronger effects in real cases than it had here. Our feedback manipulation was a mere casual comment with no embellishment. There are many other possible comments that can occur in real cases. Consider, for example, such statements as "You got em!"; "That's the guy!"; "Yes, number three is the one who has a prior record for this type of thing"; or "We knew that was the guy, but we couldn't use what we had as evidence. This will allow us to finally get this guy off the street." It strikes us that these statements might have even stronger effects than what we observed with our more pallid "Good. You identified the actual suspect."
In general, we think that there are good reasons to believe that the kinds of effects observed here will and do occur in real cases. We strongly advocate double-blind testing and asking eyewitnesses about their confidence at the time of the identification rather than after there is an opportunity for other events to influence the eyewitnesses' confidence and other judgments.

⁴ We say "at a minimum" because the data in Experiment 2 indicated that several measures were affected by the feedback manipulation even when the confidence question was asked prior to the feedback manipulation. A full prophylactic effect might require that eyewitnesses be asked several questions at the time of identification to prevent later distortions from feedback.

References

Borchard, E. (1932). Convicting the innocent: Errors of criminal justice. New Haven, CT: Yale University Press.
Brandon, R., & Davies, C. (1973). Wrongful imprisonment. London: Allen & Unwin.
Clark, S. E. (1997). A familiarity-based account of confidence-accuracy inversions in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 233-238.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Connors, E., Lundregan, T., Miller, N., & McEwan, T. (1996). Convicted by juries, exonerated by science: Case studies in the use of DNA evidence to establish innocence after trial. Alexandria, VA: National Institute of Justice.
Cutler, B. L., Penrod, S. D., & Stuve, T. E. (1988). Jury decision making in eyewitness identification cases. Law and Human Behavior, 12, 41-56.
Deffenbacher, K., & Loftus, E. F. (1982). Do jurors share a common understanding concerning eyewitness behavior? Law and Human Behavior, 6, 15-30.
Dunning, D., & Stern, L. B. (1994). Distinguishing accurate from inaccurate identifications via inquiries about decision processes. Journal of Personality and Social Psychology, 67, 818-835.
Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288-299.
Fox, S. G., & Walters, H. A. (1986). The impact of general versus specific expert testimony and eyewitness confidence upon mock juror judgment. Law and Human Behavior, 10, 215-228.
Frank, J., & Frank, B. (1957). Not guilty. London: Gollancz.
Goldstein, A. G., Chance, J. E., & Schneller, G. R. (1989). Frequency of eyewitness identification in criminal cases: A survey of prosecutors. Bulletin of the Psychonomic Society, 27, 71-74.
Hastie, R., & Park, B. (1986). The relationship between memory and judgment depends on whether the judgment task is memory-based or on-line. Psychological Review, 93, 258-268.
Huff, R., Rattner, A., & Sagarin, E. (1986). Guilty until proven innocent. Crime and Delinquency, 32, 518-544.
Iowa v. Chidester, 570 N.W. 2d 78 (1996).
Kassin, S. M. (1985). Eyewitness identification: Retrospective self-awareness and the confidence-accuracy correlation. Journal of Personality and Social Psychology, 49, 878-893.
Kassin, S. M., Rigby, S., & Castillo, S. R. (1991). The accuracy-confidence correlation in eyewitness testimony: Limits and extensions of the retrospective self-awareness effect. Journal of Personality and Social Psychology, 61, 698-707.
Leippe, M. R. (1980). Effect of integrative memorial and cognitive processes on the correspondence of eyewitness accuracy and confidence. Law and Human Behavior, 4, 261-274.
Leippe, M. R., Manion, A. P., & Romanczyk, A. (1992). Eyewitness persuasion: How and how well do fact finders judge the accuracy of adults' and children's memory reports? Journal of Personality and Social Psychology, 63, 181-197.
Lindsay, R. C. L., Lea, J. A., Nosworthy, G. J., Fulford, J. A., Hector, J., LeVan, V., & Seabrook, C. (1991). Biased lineups: Sequential presentation reduces the problem. Journal of Applied Psychology, 76, 796-802.
Lindsay, R. C. L., Wells, G. L., & O'Connor, F. (1989). Mock juror belief of accurate and inaccurate eyewitnesses: A replication. Law and Human Behavior, 13, 333-340.
Lindsay, R. C. L., Wells, G. L., & Rumpel, C. (1981). Can people detect eyewitness identification accuracy within and between situations? Journal of Applied Psychology, 66, 79-89.
Luus, C. A. E., & Wells, G. L. (1994). The malleability of eyewitness confidence: Co-witness and perseverance effects. Journal of Applied Psychology, 79, 714-724.
Malpass, R. S., & Devine, P. G. (1981). Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 66, 482-489.
Missouri v. Huchting, 927 S.W. 2d 411 (1996).
Neil v. Biggers, 409 U.S. 188 (1972).
Nettles, W., Nettles, Z., & Wells, G. L. (1996, November). "I noticed you paused on number three": Biased testing in eyewitness identification. Champion, pp. 10-12, 57-59.
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231-259.
Shaw, J. S., III (1996). Increases in eyewitness confidence resulting from postevent questioning. Journal of Experimental Psychology: Applied, 2, 126-146.
Shaw, J. S., III, & McClure, K. A. (1996). Repeated postevent questioning can lead to elevated levels of eyewitness confidence. Law and Human Behavior, 20, 629-654.
Sporer, S. L. (1993). Eyewitness identification accuracy, confidence, and decision times in simultaneous and sequential lineups. Journal of Applied Psychology, 78, 22-33.
Sporer, S., Penrod, S., Read, D., & Cutler, B. L. (1995). Choosing, confidence, and accuracy: A meta-analysis of the confidence-accuracy relation in eyewitness identification studies. Psychological Bulletin, 118, 315-327.
Wells, G. L. (1984). The psychology of lineup identifications. Journal of Applied Social Psychology, 14, 89-103.
Wells, G. L. (1988). Eyewitness identification: A system handbook. Toronto: Carswell Legal Publications.
Wells, G. L. (1993). What do we know about eyewitness identification? American Psychologist, 48, 553-571.
Wells, G. L., Ferguson, T. J., & Lindsay, R. C. L. (1981). The tractability of eyewitness confidence and its implication for triers of fact. Journal of Applied Psychology, 66, 688-696.
Wells, G. L., & Lindsay, R. C. L. (1983). How do people infer the accuracy of memory? Studies of performance and a metamemory analysis. In S. Lloyd-Bostock & B. R. Clifford (Eds.), Witness evidence: Critical and empirical papers. New York: Wiley.
Wells, G. L., Lindsay, R. C. L., & Ferguson, T. J. (1979). Accuracy, confidence, and juror perceptions in eyewitness identification. Journal of Applied Psychology, 64, 440-448.
Wells, G. L., & Murray, D. M. (1983). What can psychology say about the Neil v. Biggers criteria for judging eyewitness identification accuracy? Journal of Applied Psychology, 68, 347-362.
Wells, G. L., & Seelau, E. P. (1995). Eyewitness identification: Psychological research and legal policy on lineups. Psychology, Public Policy, and Law, 1, 765-791.

Received July 3, 1997
Revision received November 3, 1997
Accepted November 3, 1997

Please note that this submission is a chapter in The Handbook of Eyewitness Psychology: Memory For People, Volume II from which we had difficulty producing a high-quality scanned copy. Therefore, we relied upon this already-existing .pdf copy which, it turns out, is a proof of the final draft and not the final copy as it appeared in the book. However, aside from minor editorial notes, the content of the draft and final article are virtually identical.

19
Belief of Eyewitness Identification Evidence

Melissa Boyce
University of Victoria, British Columbia

Jennifer L. Beaudry and R. C. L. Lindsay
Queen’s University, Ontario

Imagine you are a juror in a trial.
An eyewitness testifies that she saw a man walk into a convenience store, point a gun at the cashier, demand all of the money from the register, and then shoot the cashier. She points to the defendant and identifies him as the stickup man. Are you inclined to believe her? Does it matter how certain she is of her decision? What if she only had a glimpse of the man’s face? What if she wasn’t wearing her glasses, and as a result had impaired vision? Does it matter if the defendant is of the same race as the witness? Would it matter if the witness had been a young child rather than an adult? Would the police procedures used to question the witness sway your decision in any way? Would the police procedures used to obtain the identification influence your decision?

By now it is clear that eyewitness evidence often is fallible and that many variables influence the likelihood of accuracy. With respect to belief of eyewitness identification evidence, three issues are examined in this chapter: (1) Do jurors believe eyewitnesses? If eyewitness evidence has no impact, then mistaken identification as an issue is less serious than it would otherwise be. However, we show that eyewitness evidence is believed. This leads to the second issue: (2) Can people discriminate between accurate and inaccurate eyewitnesses? If people can make this distinction, then eyewitnesses will be believed only (or primarily) when they are accurate and disregarded when they are not. The innocent would still suffer the social and financial hardships associated with arrest and trial but would rarely be convicted as a result of mistaken identification. However, we show that people are not able to make this distinction, believing accurate and inaccurate eyewitnesses about equally. This brings us to the final issue: (3) Are cues available that can help people to calibrate their belief of the likelihood that an eyewitness is accurate?
If so, then eyewitnesses will be believed in those situations in which they are most likely to be correct and not believed when they are least likely to be correct, but only to the extent that jurors can be taught these cues and how to look for them. The importance of this issue arises because of people’s inability to discriminate an accurate eyewitness from one who is inaccurate. If people cannot discriminate accuracy, can they effectively increase the probability that their decisions about eyewitness accuracy will be correct by basing their belief on factors that actually affect eyewitness accuracy? The issue is not whether there is evidence that could permit such calibration (see Caputo & Dunning, this volume), but rather whether people in general use the available information to inform their decisions appropriately. Unfortunately, as we will show, the research indicates that this also is not the case, as people tend to base their belief on some factors that do not reflect the likelihood that a correct identification was made and frequently ignore other information that could be useful. We conclude with a discussion of the implications of these findings for the criminal justice system and practitioners within it.

METHODOLOGY USED TO STUDY EYEWITNESS BELIEF

Before examining the issues of belief, discrimination, and calibration, we provide a brief overview of the methodology used to study eyewitness belief: questionnaires, prediction studies, and simulated testimony. For a more detailed discussion of these techniques, see Lindsay (1994).

1. Questionnaires

The use of questionnaires provides an efficient way to study knowledge of eyewitness issues and the perceived importance of relevant variables.
Participants are surveyed about their opinions of factors that can affect eyewitness memory and accuracy (e.g., Deffenbacher & Loftus, 1982; Kassin & Barndollar, 1992; Yarmey & Jones, 1983). This technique addresses the issue of whether laypeople (often undergraduates, potential jurors, and sometimes eyewitness experts) have an understanding of these factors. A typical item in such surveys may present a scenario such as a woman being mugged by two men, one of her own race and one of another race. Respondents would be asked if later she will be more, equally, or less likely to be able to correctly identify the person of her own or the other race. Although this methodology cannot address the level of belief of eyewitnesses per se, it can shed some light on the issue indirectly. Lack of knowledge about issues that have been associated in past research with the likelihood that an eyewitness will be correct indicates that people may not take these factors into consideration when deciding whether to believe an eyewitness. As well, people may believe that factors not strongly associated with accuracy predict whether an eyewitness will be correct, indicating that these irrelevant factors might be considered by people when they are judging the accuracy of an eyewitness. The worst possible outcome would indicate that people mistakenly believe a factor is predictive of accuracy when it is predictive of inaccuracy. Differences in the views of experts versus other populations indicate lack of knowledge by the public and support arguments for the potential value of expert testimony (e.g., Benton, McDonnell, & Ross, this volume; Kassin & Barndollar, 1992).

Questionnaire techniques fail to capture the dynamic aspects of testimony and the interplay of the content of testimony with witness demeanor. Respondents lack information that could put the “case” in context, such as details of the witnessing conditions.
Usually, participants are restricted to selecting from a list of possible responses, none of which may perfectly reflect their opinions. Also, there is little or nothing in these studies to indicate how strongly people hold the expressed views, so these views may or may not matter once deliberation begins. On the other hand, questionnaires provide one means of assessing the starting point from which people view eyewitness issues, their naïve expectations of the impact of many variables of interest on eyewitness accuracy.

2. Prediction Studies

Two types of prediction studies could be conducted to examine eyewitness belief. In the first type of prediction research, witnesses view staged crimes and then attempt to identify the perpetrator. Later, other participants view these witnesses to staged crimes testifying about their experiences and the identifications the witnesses made following those experiences. Participants then make judgments about (among other things) whether they believe the eyewitnesses’ identifications were accurate or not (e.g., Wells, Lindsay, & Ferguson, 1979). In this type of study, participants often have access to both the witnessing conditions (as reported in the testimony) and the behavior of the eyewitness (demeanor evidence). It is possible to ascertain whether people are able to accurately discriminate between correct and incorrect eyewitnesses based on the participants’ judgments, because the witnesses they are viewing actually attempted to identify a previously seen person. Furthermore, prediction research can test whether belief in eyewitnesses is influenced by any witnessing conditions varied in the identification phase of the study.

In the second type of prediction research, participants would read detailed descriptions of past eyewitness identification studies and attempt to predict how accurate the eyewitnesses were (e.g., Brigham & Bothwell, 1983).
The accuracy rates estimated by participants are compared with the actual accuracy rates obtained in the original studies to determine the ability of people to realistically estimate eyewitnesses’ abilities. However, because only a description of the study is provided, rather than actually viewing witness testimony, these studies look at the anticipated effects of witnessing conditions on identification accuracy, rather than the belief of the eyewitness who made an identification decision after exposure to events under the conditions studied. Clearly it is a (questionable) inference that belief of eyewitnesses exposed to a criminal in specific conditions can be predicted by asking people to estimate the probability that a witness would be accurate under such conditions. This technique has not been used frequently.

3. Simulated Testimony

In simulated testimony studies, participants are presented with a case. This case can be presented in a written format (e.g., Leippe, 1985) or as audio or videotape (e.g., Lindsay, Lim, Marando, & Cully, 1986). Participants then attempt to reach a verdict based on eyewitness testimony and/or other evidence. The case is created specifically for the study (though the scripts may be based on real cases), and the experimenters manipulate the behavior of the witness as well as other aspects of the evidence. This is in contrast to prediction studies where the witness’s behavior is genuine; that is, the testimony truly reflects what the witness recalls from the event witnessed. Using either paradigm, researchers can test for the effects of specific factors (e.g., confidence of the witness); however, whereas in simulated testimony studies the behavior in question is manipulated, in prediction studies the behavior would vary in an uncontrolled manner, and causal inferences would be more difficult to make.
The simulated testimony paradigm has frequently been used to examine factors that may contribute to the belief of an eyewitness.

Strengths and Weaknesses. Each of these methodologies has its strengths and its weaknesses. For example, both of the prediction techniques suffer from limitations of the nature of the crimes used to generate eyewitness decisions. Violent, sexual, complex, and lengthy crimes are all unlikely to be studied, for obvious ethical and practical reasons. One advantage of questionnaire and simulated testimony studies is that they do not suffer from such limitations. Another method of studying this phenomenon is the use of field studies (e.g., Devlin, 1976). Results from real cases can be examined to determine whether eyewitness identification evidence has an impact. A single finding from any of these paradigms, although suggestive, will not be conclusive. By comparing the results across studies with the use of multiple methodologies, convergent validity may be achieved regarding the importance of variables influencing belief of eyewitness testimony.

ARE EYEWITNESSES BELIEVED?

In 1974, the Devlin Commission was formed in England to examine eyewitness procedures after several cases of mistaken identification came to light. Devlin (1976) examined all police lineups (or identity parades) conducted in England in 1973. In total, over 2000 parades were analyzed. A suspect was identified in 45% of the parades, and 82% of those identified were subsequently convicted. The identification comprised the only evidence in over 300 cases, and, of these, 75% of the suspects were found guilty. This report (field study) provides compelling real-world evidence that eyewitnesses are believed. Then again, suspect identifications were made in less than half of the cases that the Devlin Report analyzed.
It could be argued that the identified suspects were indeed guilty in these cases and that most (perhaps even all) of the eyewitnesses who identified them did so accurately and thus should have been believed.

With the advent of DNA evidence, it is increasingly possible to prove that innocent people are convicted. To date, innocence projects have helped to exonerate at least 142 wrongfully convicted people (Scheck & Neufeld, 2004). Mistaken identification was a contributing factor and probably the primary reason for conviction in over 80% of these cases. Clearly, mistaken eyewitnesses have been believed and have led to the conviction of innocent people. Once again, this evidence is insufficient on its own to lead to the conclusion of a widespread problem. The number of DNA exonerations is trivially small in comparison with the number of convictions based on eyewitness identification evidence. We must rely on the laboratory for evidence of the ease with which witnesses can be enticed into selecting innocent people from lineups (e.g., see Dupuis & Lindsay, this volume; Meissner & Brigham, this volume). The experimental literature demonstrates beyond doubt that eyewitnesses frequently select innocent people from lineups and thus that mistaken identification is likely often to be presented as evidence in court.

Other laboratory research strongly supports the conclusion that eyewitnesses are frequently believed even in the absence of other evidence. Loftus (1974) used a case involving robbery and murder (simulated testimony paradigm) and found that 72% of participants returned guilty verdicts when eyewitness evidence was presented, but only 22% did so in the absence of identification evidence. Wells et al. (1979) found that approximately 80% of people believed the witness they had seen identify a person following a staged crime (prediction paradigm).
Generally speaking, laboratory research supports the contention that identification by an eyewitness is a highly credible piece of evidence likely to lead jurors to vote guilty.

Other research has demonstrated that eyewitness evidence is one of the most incriminating types of evidence that can be presented in court. Identification evidence has been shown to be comparable to, or to have more impact than, physical evidence (McAllister & Bregman, 1986; Skolnick & Shaw, 2001), character evidence (Kassin & Neumann, 1997), alibis (McAllister & Bregman, 1989), polygraph evidence (Myers & Arbuthnot, 1997), and even sometimes confession evidence (Kassin & Neumann, 1997).

Interestingly, it seems that eyewitness evidence may actually affect the way in which other types of evidence are viewed. The existence of eyewitness identification evidence increases the perceived strength of the other evidence presented (e.g., McAllister & Bregman, 1986; Kassin & Neumann, 1997). Both legally and scientifically, this may seem to make sense. As multiple sources of information converge on the same conclusion, the evidence supporting the conclusion is stronger. However, what may happen need not be such a positive pattern. Consider the following example. Assume that the lack of an alibi is of relatively limited probative value in many cases (Turtle, Burke, & Olsen, volume 1). Assume that the probability that a person lacking an alibi is guilty is (arbitrarily) .1. All that is implied here is that most people could not accurately remember and prove their whereabouts at the time of the crime, say 3 weeks ago at 5:45 P.M. When a witness has identified the suspect, the lack of an alibi itself may be perceived as better evidence; thus, the probability that a person lacking an alibi is guilty increases (arbitrarily) to .6. If this altered probability (.6 vs.
.1) is used in combination with the identification evidence to estimate the overall probability of guilt or innocence, erroneous conclusions are more likely. Of course the influence could be reciprocal as well. The lack of an alibi may reduce the perceived probability that an identification will be erroneous. The impact of various sources of evidence in combination on the credibility of eyewitness identification evidence is an important issue that has received little attention from researchers to date (but for an example see Cutler, Penrod, & Dexter, 1990).

CAN PEOPLE DISCRIMINATE BETWEEN ACCURATE AND INACCURATE EYEWITNESSES?

The findings from the Devlin report indicate that eyewitnesses are believed; however, a quarter of suspects whose cases against them rested solely on identification evidence were not convicted. Similarly, laboratory studies indicate that not all eyewitnesses are believed. Is it possible that people are able to discriminate between accurate and inaccurate eyewitnesses, believing only (or primarily) those who are accurate? DNA exoneration cases prove that some inaccurate eyewitnesses are believed. However, exoneration cases are a very small proportion of all eyewitness cases. As a result, the existence of even 142 wrongful convictions is consistent with an argument that identification errors and wrongful convictions based on such errors are extremely rare. Although logically possible, few, if any, researchers believe identification errors to be rare. Many (including the authors) believe DNA cases are the tip of the iceberg, that most identification errors go undetected for a variety of reasons (e.g., once the case is “solved,” the investigation is terminated; relatively few cases provide suitable material for DNA testing; etc.). However, this does not alter the fact that DNA exoneration cases are few in number.
Wells developed a prediction paradigm to study this issue (Wells, Lindsay, & Ferguson, 1979). Participants viewed the questioning of witnesses who had just observed a staged theft and identified someone from a lineup. Mock jurors were not able to differentiate between accurate and inaccurate witnesses; both were believed by approximately 80% of participants. Other studies have confirmed that people are not able to make this discrimination (e.g., Lindsay, Wells, & Rumpel, 1981; Wells, Ferguson, & Lindsay, 1981; Wells, Lindsay, & Tousignant, 1980). Even when attempts have been made to increase the ecological validity of the research by having practicing lawyers question eyewitnesses in a real courtroom, mock jurors could not distinguish between accurate and inaccurate eyewitnesses (Lindsay, Wells, & O’Connor, 1989). In fact, only one study using the paradigm of a staged crime followed by a mock trial has ever shown a significant ability to discriminate accurate from inaccurate identification by adult mock jurors listening to the testimony of adult witnesses. In that study, Wells and Leippe (1981) demonstrated that when questioning focused on memory for peripheral details of the event, inaccurate eyewitnesses were believed significantly more often than accurate eyewitnesses.

Interestingly, Leippe, Manion, and Romanczyk (1992) found that people were able to discriminate between correct and incorrect child eyewitnesses (ages 6 and 10). However, the study selected only the most and least accurate children for use in the study, so it is possible that the accuracy of these children may have been obvious. In any case, at least with adults, people consistently demonstrate that they are unable to discriminate between eyewitnesses who are accurate and those who are not.

The staged-event mock-jury paradigm generally presents participants with witnesses, all of whom have seen the same crime and criminal under the same circumstances.
Variation in the nature of the crime may significantly influence eyewitness accuracy. If jurors are able to estimate or intuit the impact of such variation, they may be able to calibrate their decisions such that eyewitnesses are most likely to be believed in exactly those situations in which they also are most likely to be accurate. Although this would not eliminate errors, it would provide some reduction in the rate of wrongful conviction compared with random or indiscriminate belief.

IS BELIEF OF EYEWITNESSES WELL CALIBRATED?

People cannot discriminate between individual accurate and inaccurate eyewitnesses. If they could, calibration wouldn’t be a concern, but since they can’t, people’s ability to calibrate their belief to the likelihood that an eyewitness is accurate becomes a useful skill. If people are unable to determine if a particular eyewitness is accurate within a situation, are there factors within situations that can help jurors to determine if the witness is more or less likely to be accurate?

This can be thought of in terms of game theory (McCloskey & Egeth, 1984). In order to maximize correct decisions, people should focus on factors that are related to eyewitness accuracy; that way they will optimize their chances of being accurate when deciding whether to believe an eyewitness. If people use the wrong cues, that is, factors that do not relate to the accuracy of the eyewitness, then their degree of belief will not be optimally calibrated to the rate of correct identification. How well do people calibrate their belief of eyewitnesses?

Lindsay, Wells, and Rumpel (1981) attempted to address the calibration issue with the use of the staged-crime mock-juror paradigm.
Witnessing conditions were manipulated to generate poor, moderate, and good witnessing conditions, as indicated by the percentage of eyewitnesses who were correct when they identified a lineup member (33%, 50%, and 75%). Accurate and inaccurate eyewitnesses from each condition were viewed by mock jurors. Within conditions, the usual result was obtained, indicating no ability to discriminate accurate from inaccurate eyewitnesses based on testimony. However, across conditions, witnesses were more likely to be believed the better the witnessing conditions were (61%, 70%, 78% in the poor, moderate, and good viewing conditions, respectively). This result provides some evidence that jurors may be able to estimate the likely accuracy of eyewitnesses under varying conditions. Unfortunately, the results were strongly influenced by witness confidence. Confident witnesses were likely to be believed regardless of witnessing conditions (76%, 76%, 78%, respectively). Thus, people took witnessing conditions into account only when the witness was not confident (46%, 63%, 78%). The pattern in the data from this study is a serious problem for a calibration approach. Witnesses who are not confident may be less likely to appear in court because they decline to testify or because the prosecutor believes they will not be convincing. If this reasoning is correct, the witnesses who would permit some degree of calibration of belief are unlikely to testify, whereas those producing indiscriminate belief are most likely to testify. Furthermore, witness confidence can be altered between the time of identification and testimony (Wells, Ferguson, & Lindsay, 1981) or may be distorted by interactions with police or other witnesses (Wells & Bradfield, 1998). To the extent that this happens, the evidence that could have been used to assist with calibration is distorted, possibly eliminating its usefulness. For a more extensive discussion of the usefulness of witness confidence as an indication of accuracy, see Leippe and Eisenstadt (this volume).

EVIDENCE FOR OVERBELIEF

Research indicates that people overestimate the abilities of eyewitnesses. The Lindsay et al. (1981) study reveals that the percentage of mock jurors who believed eyewitnesses was greater than the percentage of eyewitnesses who made accurate identifications in all conditions. The discrepancy was trivially small when viewing conditions were good (3%) but larger with moderate (20%) and poor (28%) viewing conditions. For confident witnesses drawn from poor viewing conditions, overbelief was very high, as only 33% of witnesses were correct but the witnesses were believed 76% of the time. If this reflects a general pattern, the worse the viewing conditions, the lower the probability that an identification will be accurate, and the greater the discrepancy between eyewitness accuracy and eyewitness belief is likely to be. In other words, as identification accuracy declines, overbelief increases. Since real-world witnessing conditions can be much worse than those found in this laboratory study, very high rates of overbelief would be predicted.

Wells et al. (1979) also exposed students to a staged crime, obtained identifications, and had witnesses testify about their experiences. Although only 54% of witnesses correctly selected the culprit from the lineup, 80% of mock jurors indicated that they believed the witness they saw testify (26% overbelief).
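The overbelief figures quoted above are simple differences in percentage points between the share of mock jurors who believed a witness and the share of witnesses who were actually correct. The following is only an illustrative sketch of that arithmetic, using the percentages reported in the text for Lindsay et al. (1981) and Wells et al. (1979); the function and variable names are hypothetical and do not come from either study, and the gap for confident witnesses in the poor condition is derived here from the reported 76% and 33% rather than stated in the chapter.

```python
# Illustrative sketch only: names are hypothetical, figures are those quoted in the text.

def overbelief(percent_believed, percent_accurate):
    """Percentage-point gap between juror belief and witness accuracy."""
    return percent_believed - percent_accurate

# Lindsay et al. (1981): (percent of witnesses correct, percent of jurors believing)
lindsay_1981 = {
    "poor": (33, 61),
    "moderate": (50, 70),
    "good": (75, 78),
}

# Gap in each viewing condition, all witnesses combined
gaps = {
    condition: overbelief(believed, accurate)
    for condition, (accurate, believed) in lindsay_1981.items()
}

# Confident witnesses from the poor viewing condition (76% believed, 33% correct)
confident_poor_gap = overbelief(76, 33)

# Wells et al. (1979): 80% believed, 54% correct
wells_1979_gap = overbelief(80, 54)

for condition, gap in gaps.items():
    print(f"{condition}: {gap} percentage points of overbelief")
print(f"confident witnesses, poor conditions: {confident_poor_gap}")
print(f"Wells et al. (1979): {wells_1979_gap}")
```

Laid out this way, the gap widens as accuracy falls (3, 20, and 28 points across the good, moderate, and poor conditions), which is the pattern the authors describe.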
Brigham and Bothwell (1983) conducted a prediction study in which they presented eligible jurors with the procedures from two studies that had previously been conducted to examine eyewitness accuracy, one involving a theft (from Leippe, Wells, & Ostrom, 1978) and the other involving a convenience store interaction (from Brigham, Maass, Snyder, & Spaulding, 1982). Participants were asked to estimate the likelihood of a correct identification.

When presented with information regarding the theft, participants estimated that 71% of witnesses would make a correct identification, when in reality only 12.5% had done so. This supports the earlier speculation that as witnessing conditions deteriorate (12.5% identification accuracy), overbelief escalates (58.5%). Overall, 91% of participants overestimated the percentage of witnesses who would make a correct identification. For the convenience store study, when the target was black, participants thought 51% of witnesses would make an accurate identification when only 32% had (19% overbelief). Over 70% of participants overestimated the percentage of correct identification. It was estimated that 69% of witnesses would make a correct identification when the target was white, although only 31% had (38% overbelief). Over 90% of participants overestimated the percentage of correct identification.

These results indicate that people believe that witnesses are considerably more likely to be accurate than they actually are. Thus, people do not effectively calibrate their belief to the likelihood that an eyewitness is correct; that is, they are not playing the game well. Why might this be? For people to successfully perform this calibration, they must have some understanding of the factors that affect eyewitness memory and accuracy. Past
Past 508 BOYCE, BEAUDRY, LINDSAY ch19_8038_Lindsay_II_LEA 6/28/06 3:36 AM Page 508 research has shown that people’s knowledge of these factors is limited (e.g., Deffenbacher & Loftus, 1982; Benton et al., this volume; Shaw, Garcia, & McClure, 1999). People tend to base their decisions to believe eyewitnesses on at least some factors that are not related to eyewitness accuracy. System versus Estimator Variables Wells (1978) distinguished two categories of factors that can affect the likelihood that an eyewitness will make a correct identification. System variables are factors that are under the control of the system, and, as such, steps can be taken to ensure that these procedures are used in a way that will optimize eyewitness accuracy. Examples of system variables in- clude such things as identification procedures or questioning strategies. Estimator vari- ables, on the other hand, are factors that are not under the control of police officers but which can affect the likelihood that an eyewitness will make a correct identification. Es- timator variables can be further broken down into subcategories pertaining to character- istics of the witness, the culprit, and the situation. Witness characteristics include age, sex, race, and confidence. Culprit characteristics include factors such as race and disguise. Situational variables include such factors as lighting, distance, and exposure time. How much consideration do people give to system versus estimator variables? Shaw et al. (1999) asked people to list factors believed to have important effects on eyewitness accuracy. 
Of the total responses, 42% related to factors describing characteristics of the eyewitness (e.g., vision, age, reputation for honesty), 29% to conditions of the crime scene (e.g., lighting, distance between eyewitness and crime), 26% to characteristics of the eyewitness’s testimony (e.g., quality of description, composure in court), 3% to characteristics of the suspect (e.g., race, actions in the courtroom), and only 1% to police procedures (e.g., questioning tactics, handling of evidence). These results indicate that people consider characteristics of the eyewitness to be the strongest determinant of whether the identification is likely to be accurate. More telling, however, is that when the factors are separated into system and estimator variables, 99% of people’s responses fell under the category of estimator variables, whereas system variables accounted for only 1% of people’s responses. This particular study does not allow us to determine whether people will ignore variance in system variables when deciding whether to believe an eyewitness; however, there is other evidence demonstrating that system variables do not strongly influence juror belief of eyewitness identification evidence (Cutler, Penrod, & Dexter, 1990; Cutler, Penrod, & Stuve, 1988; Lampinen, Judges, Odegard, & Hamilton, 2005).

ESTIMATOR VARIABLES

Characteristics of the Eyewitness

Much of the research on belief of eyewitnesses has focused on estimator variables pertaining to characteristics of the eyewitness, probably because the eyewitness is the most proximal cue, being a primary (and, in legal terms, direct) source of evidence, including identification. Most of the attention is focused on trying to determine witness accuracy by examining the cues provided by the witness.
Not surprisingly, it is these variables that have been shown to have the greatest impact on whether an eyewitness will be believed. That is, people calibrate their belief based on characteristics of the eyewitness.

Confidence of the Eyewitness

The research on the effects of eyewitness confidence shows how powerfully a confident witness can sway juror belief. For example, Cutler, Penrod, and Stuve (1988) provided participants with information related to disguise, weapons, violence, mugshot searches, voice samples, length of retention interval, lineup size, similarity of lineup members, and witness confidence. Only witness confidence was related to ratings of the probability that a correct identification was made (60% versus 69%) and to the proportion of guilty verdicts (39% versus 54%). These findings have been replicated with eligible and experienced jurors (Cutler, Penrod, & Dexter, 1990). Even in courtrooms using actual lawyers, confidence still determined the likelihood that a witness would be believed (Lindsay et al., 1989). In fact, eyewitness confidence is the only variable that consistently predicts belief in every study conducted examining the issue. This is troublesome unless the relationship between confidence and accuracy is strong. Wells et al. (1979) found a small (albeit significant) relationship between confidence and accuracy (r = .29), such that confidence accounted for about 10% of the variance in accuracy. In spite of this, witness confidence accounted for 50% of the variance in jurors’ accuracy judgments in that study. Such results are typical in eyewitness research, though debate about the strength of the relationship between confidence and accuracy is far from over (see Leippe & Eisenstadt, this volume).

Consistency of Eyewitness Testimony

Consistency of eyewitness statements can have an impact on whether an eyewitness is believed (e.g., Berman, Narby, & Cutler, 1995; Berman & Cutler, 1996).
Berman and Cutler found that any type of inconsistency decreased guilty verdicts, whether it was information presented on the stand but originally omitted during pretrial investigations or information presented on the stand that contradicted original statements, or if the testimony on the stand contained contradictions. Lindsay, Lim, Marando, and Cully (1986) manipulated the consistency of a description and the appearance of the perpetrator in court and found it had no effect on belief of the eyewitness. Brewer and Burke (2002) suggest that the effects of witness inconsistency may be confounded with confidence; that is, inconsistency may reduce belief only when it leads to the perception of reduced confidence. This is consistent with the Lindsay et al. data because the witness was portrayed as highly confident in her identification decision regardless of the level of consistency of her other evidence. Fisher and Cutler (1995) examined 612 identification attempts of eight different targets and found that consistency with description was not a powerful indicator of eyewitness accuracy. This finding has important implications because it indicates that consistency (at least with description) is not a factor that people should be using to determine whether they believe an eyewitness, and yet many studies suggest they do.

Level of Detail in Eyewitness Testimony

Very few studies have looked at the level of detail provided by the witness during testimony as a determinant of belief about the accuracy of the testimony. Bell and Loftus (1988) found that the level of detail used by the eyewitness did have an effect on guilty verdicts, indicating that eyewitnesses who provide more detail in their testimonies are more likely to be believed.
In spite of this, some research has shown that better recall of peripheral details may not mean that the witness was more likely to have made a correct identification. Wells and Leippe (1981) found that people who were better able to remember the peripheral details of a crime scene were actually less likely to make a correct identification. However, participants rated the credibility of the witnesses as though the opposite were true. Again, this is problematic, as the research suggests that detail may not be an indicator of eyewitness accuracy, and yet people seem to use or even misuse it when deciding whether to believe an eyewitness.

Age of the Eyewitness

Overall, the literature examining whether age affects witness belief has converged on two main findings. First, the research generally supports the notion that children’s credibility is a multidimensional construct comprising cognitive ability and honesty (e.g., Leippe & Romanczyk, 1989; Ross, Dunning, Toglia, & Ceci, 1990; Ross, Jurden, Lindsay, & Keeney, 2003; Ross, Miller, & Moran, 1987), though there have been some exceptions (e.g., McCauley & Parker, 2001). Although children are believed to be more honest, they are seen to be lacking in cognitive abilities. For this reason, in cases that involve a sexual element, such as child abuse, there is more of a focus on the honesty of the child, and children are typically believed. Alternatively, in cases such as car theft, which focus more on the memory of the child, children are seen as less credible witnesses. The same argument applies to the elderly, as it is believed that memory impairments may compromise their credibility, but, according to stereotypes of the elderly, they are honest. Second, people have a negative stereotype of children as eyewitnesses, most likely because of their credibility issues related to cognitive skill.
Although some studies show that these stereotypes are readily disconfirmed and put aside if people are given the chance to actually view the testimony of the child (e.g., Leippe & Romanczyk, 1989; Ross, Dunning, Toglia, & Ceci, 1990), other studies report the opposite (e.g., Leippe et al., 1992). Research on the elderly has shown that their credibility as witnesses depends on what stereotypes are invoked. Nunez, McCoy, Clark, and Shaw (1999) found that when positive stereotypes of the elderly such as “statesman” were invoked, guilty ratings were significantly higher than when stereotypes such as “senior citizen” or “grandfather” were elicited. It would seem that, based on this evidence, the strongest statement that can be made is that sometimes eyewitness evidence is perceived as less credible if the witness is young or elderly, depending on the extent to which certain stereotypes are invoked and the type of case involved. Some have suggested that expectations of poor performance by children and the elderly may create demand characteristics in the research setting that may not generalize to the courtroom (e.g., Kwong-See, Hoffman, & Wood, 2001). This possibility has not been thoroughly evaluated, but the existence of such expectations itself suggests that they may influence real-world trials. Age of the eyewitness can play a role when people are deciding whether to believe an eyewitness. In light of these findings, does age actually have an effect on eyewitness accuracy? This is one situation where a witness characteristic has been shown to have an effect on eyewitness accuracy. Lindsay, Pozzulo, Craig, Lee, and Corber (1997) found that children did not differ in their correct identification rates from adults, but that in target-absent lineups, because of a propensity to guess, they were less likely to correctly reject a lineup (see also Pozzulo, this volume).
Likewise, Yarmey (1984), in a review of the literature on the accuracy of elderly eyewitnesses, found that on average the elderly are 7% to 20% less accurate than young adults, although exceptions exist (see also Bartlett & Memon, this volume). Wright and Stroud (2002) determined that it may not be the age of eyewitnesses per se that makes them worse at identification, but the age of the eyewitness relative to that of the criminal. People are better at identifying those who are closer to them in age, and the fact that studies typically use young adults as their criminals may account for the findings that children and the elderly are less accurate. What does this mean in terms of accurate calibration? Perhaps people should only use age as a factor in deciding whether to believe an eyewitness if there is a large age difference between the witness and the suspect. However, research does not suggest that this is what people do, meaning that once again, people are using the wrong cues or applying the correct cues incorrectly when deciding whether to believe an eyewitness.

Credibility of the Eyewitness

The credibility of the eyewitness is an interesting witness variable, as it can exist as its own category, but it can also subsume all other witness characteristics. All of the other witness variables discussed in this chapter (confidence, consistency of testimony, level of detail of testimony, and age of the eyewitness) affect whether an eyewitness is believed because they compromise the perceived credibility of the witness. When an eyewitness’s credibility is called into question, his/her impact as an eyewitness is diminished and he/she is less likely to be believed. Whether or not these factors are related to the accuracy of the eyewitness is a separate issue. Attempts have been made to directly test the effects of a witness’s credibility on his/her impact on mock jurors.
For example, a number of studies have looked at whether vision impairment will affect a witness’s perceived credibility. These studies are more concerned with determining how pervasive findings of eyewitness overbelief are, rather than with the particular variable being used. Vision impairment is something that so obviously will affect whether a person can make an accurate identification that if research can show that people are still willing to believe a witness who has impaired vision, this would provide compelling evidence of people’s propensity to believe eyewitnesses. As mentioned previously, Loftus (1974) used a (simulated testimony) case involving robbery and murder and found that 72% of participants returned guilty verdicts when eyewitness evidence was presented, but only 22% in the absence of identification evidence. In a third condition, 68% of participants found the defendant guilty when the eyewitness who testified was described as legally blind. This study sparked a wave of controversy. Some claimed that eyewitnesses were believed indiscriminately, as it seemed jurors’ reliance on eyewitness evidence was so strong that even blind eyewitnesses would be believed! However, since that initial study, very few studies have found any indications that jurors ignore the fact that eyewitness evidence has been discredited (exceptions are Cavoukian, 1981; Hatvany & Strack, 1980; Saunders, Vidmar, & Hewitt, 1983, study 3). The majority of studies report that mock jurors deal with discredited eyewitness evidence either by ignoring it or by slightly overcorrecting (actually returning fewer guilty verdicts than when no eyewitness evidence is presented at all; e.g., Weinberg & Baron, 1982; Saunders et al., 1983, studies 1 & 2; Elliot, Farrington, & Manheimer, 1988; Kennedy & Haygood, 1992).
One clear implication of these results is that for years lawyers have been following a sensible practice when they attempt to win cases by discrediting witnesses for the other side.

Characteristics of the Criminal or the Situation

Many of these factors could logically compromise the accuracy of an eyewitness (to varying degrees). Yet very little research has focused on how characteristics of the criminal or the crime scene affect the belief of eyewitnesses. Cutler, Penrod, and colleagues (Cutler et al., 1988, 1990) conducted a series of studies examining the role of multiple factors in eyewitness belief. They found that neither the presence of a disguise nor the criminal’s use of a weapon or violence was taken into consideration by mock jurors as potentially compromising eyewitness accuracy. Brigham and Bothwell (1983) found that people recognized that the race of the criminal can play a role in correct identifications but underestimated its effects and continued to overestimate the abilities of eyewitnesses in these situations. Lindsay et al. (1986) examined the effects of viewing conditions on eyewitness belief by manipulating the time of day (9:00 A.M. on a sunny day versus 1:00 A.M. and 60 feet from a street light) and exposure that the eyewitness had when viewing the criminal (5 seconds, half an hour, or half an hour with eyewitness interaction with the criminal). Lindsay et al. found that fewer people convicted the defendant when the criminal was seen at night (37% versus 57%), though this difference did not reach significance. There were no significant differences in the proportion of guilty votes between the 5-second, half-hour, or half-hour with interaction exposure times (45%, 40%, and 55%, respectively). Although the evidence is scant, what research has been done indicates that people do not readily consider characteristics of the criminal or the situation when determining whether to believe an eyewitness.
Yet a disguise (Cutler, Penrod, & Martens, 1987), a visible weapon (Steblay, 1992), violence (Clifford & Hollin, 1981), and viewing conditions (MacLin, MacLin, & Malpass, 2001) have all been shown to affect eyewitness accuracy. Thus, people should consider these variables in order to better calibrate their belief of an eyewitness to the likelihood that the witness is actually correct.

SYSTEM VARIABLES

What about system variables? Even less research has examined the effects of system variables on belief of eyewitnesses than has examined the effects of criminal or situational variables. Fortunately, a few studies have been conducted that shed light on the impact of system variables when calibrating belief of eyewitnesses.

Fairness of the Lineup

This category includes any type of lineup bias, such as foil bias (members of the lineup are not sufficiently similar to the suspect) and instruction bias (eyewitnesses are told that the suspect either “is” or “may or may not be” in the lineup). Both biases have been shown to greatly increase false identification rates (e.g., Lindsay & Wells, 1980; Malpass & Devine, 1981). Devenport, Stinson, Cutler, and Kravitz (2002) studied jurors’ perceptions of these biased lineup procedures. They found that although a foil-biased lineup was rated as significantly more suggestive than an unbiased lineup and that the presence of an instruction bias also led to higher ratings of suggestibility, these biases had little effect on verdicts. Although the presence of a foil bias lowered guilty verdicts, this effect attained only marginal significance (p = .06). The presence of an instruction bias was not reflected at all in differential guilt ratings. Even though mock jurors were aware that this bias could influence the suggestiveness of a lineup, they failed to consider it important enough to adjust their verdict decisions. Cutler et al.
(1988) also found that biased lineups failed to influence belief of eyewitnesses. Lindsay and Wells (1980) had witnesses to a staged crime attempt identification from foil-biased versus fair lineups and then testify in a mock-court procedure. The lineup fairness manipulation had a dramatic impact on the rate of false identification (70% versus 31%), but mock jurors “were no less likely to believe a witness making an identification from a low- rather than high-similarity lineup” (p. 307).

Lineup Presentation

System variables related to lineup presentation have been included separately from lineup biases. Although the simultaneous lineup is inferior to a properly conducted sequential lineup, most people do not consider it a biased lineup. Devenport et al. (2002) attempted to determine whether lineup presentation, that is, whether eyewitnesses were presented with a sequential or simultaneous lineup, had an effect on mock jurors’ belief. The sequential lineup has been shown to be superior to the simultaneous lineup in its ability to decrease false identifications with little loss of correct identifications (Lindsay & Bellinger, 1999; Lindsay et al., 1991; Lindsay & Wells, 1985). In spite of this, people did not find the identification evidence more convincing when a sequential rather than a simultaneous lineup had been used by police.

Modality of the Identification

Research indicates that people are much less likely to accurately identify voices than faces (e.g., Bull & Clifford, 1984; McAllister, Dale, & Keay, 1993, study 1). In fact, we’ve found in our laboratory that voice identifications have very little probative value (Pryke, Lindsay, Dysart, & Dupuis, 2004). However, McAllister et al. found that as long as any type of identification was made, witnesses were equally believed and guilty verdicts did not differ by condition. Cutler et al.
(1988) also found that people were no less willing to believe voice identifications than face identifications.

Evidence Obtained After Hypnosis

An issue that hasn’t received much attention in the literature is whether a witness who makes an identification after undergoing hypnosis to facilitate recall is likely to be believed. Although there are proponents of hypnotically induced witness statements, the evidence generally indicates that hypnosis does not aid witnesses in accurate recall (e.g., Smith, 1983). However, Spanos, Gwynn, and Terrade (1989) found that mock jurors believed eyewitness evidence equally regardless of whether the eyewitness had undergone hypnosis in order to “remember” what her attacker had looked like (40% and 39% guilty verdicts for control and hypnosis groups, respectively).

SUMMARY OF THE LITERATURE

The majority of the research has focused on estimator variables rather than system variables. Studies that relate to the characteristics of the eyewitness are particularly common. The findings so far should raise concern. People do perform a type of calibration process when determining whether to believe eyewitnesses, but they base this process on estimator variables relating to characteristics of the witness, which have, at best, a modest relation and, at worst, no relation to eyewitness accuracy. People pay little if any attention to system variables, even though these variables have consistently been shown to have an effect on the accuracy of identification decisions. Although it is encouraging to think that the factors that seem to have the most impact on the likelihood of a correct identification are factors that are under our control, this knowledge is less valuable if it remains limited to researchers in the area, thus having no effect in the courts. Clearly a goal should be to try to educate people about which variables actually affect eyewitness accuracy and which do not, so that people can better calibrate their belief.
This highlights the value of expert testimony in the courtroom (Benton et al., this volume; Van Wallendael, Devenport, Cutler, & Penrod, this volume).

Probative Value of System and Estimator Variables

Some estimator variables definitely have an impact on the likelihood that an eyewitness will be able to make an accurate identification. Consider situational variables. Clearly, a witness to a lengthy crime committed in broad daylight only a few feet from the witness will be more likely to be able to identify the criminal than a witness who caught a brief glimpse of the criminal 30 feet away in the middle of the night from the back of a car window in the pouring rain. Furthermore, characteristics of the criminal can affect how likely he is to be identified. Research has shown that people have difficulty identifying criminals who have changed their appearance, for example, by shaving a beard or cutting their hair (e.g., Cutler et al., 1987). In addition, the own-race bias has received a great deal of attention in the literature for the effects it can have on eyewitness accuracy (Bothwell, Brigham, & Malpass, 1989; Chance & Goldstein, 1996; Meissner & Brigham, this volume). However, some of these variables may suppress choosing rather than alter the ratio of correct to false identifications. Very poor viewing conditions, such as darkness or obstruction of view, may lead witnesses to decline to identify anyone. Since these witnesses will not appear in court, the fact that the viewing conditions reduced accuracy by reducing the proportion of witnesses making correct selections may not be critical. The critical issue is the ratio of accurate to inaccurate “choosers” and “testifiers” and how convincing they are in court.
Furthermore, many witness factors are not reliably related to eyewitness accuracy, yet it appears to be these variables that have the largest impact on whether people are willing to believe an eyewitness. System variables, on the other hand, consistently show a relationship to eyewitness accuracy but are largely ignored by people when they are deciding whether to believe an eyewitness. The experience of one of the authors (RL) when training police, prosecutors, and judges is that the eyewitness is seen as the problem; that is, the eyewitness is seen as the source of error. Police procedures, unless they are obviously biased, are rarely considered a potential problem until people have been exposed to research results documenting the impact on eyewitness accuracy of many standard but poor procedures. Since the public has limited exposure to such information, they do not consider system variables as a concern for evaluating witness testimony. This also is probably due to a general but incorrect expectation that police are aware of and use the best available procedures. Unfortunately, this may mean that jurors are rarely exposed to and thus may not consider system variable issues in court.

DOES BELIEF OF EYEWITNESSES DEPEND ON THE TYPE OF EYEWITNESS EVIDENCE PRESENTED?

Some research has shown that the extent to which eyewitness evidence is utilized compared with other types of evidence may depend on whether mock jurors believe that the defendant is guilty or not. Interestingly, this notion also applies to which types of eyewitness evidence mock jurors focus on (Leippe, 1985; McAllister & Bregman, 1989; Lindsay, 1994). People appear to use a confirmatory strategy to select evidence that will best support their conclusion that the accused person is either guilty or innocent. Evidence supporting this view has been obtained in studies manipulating nonidentification and alibi evidence.
Nonidentification Evidence

A nonidentification occurs whenever an eyewitness selects no one from a lineup or selects a foil rather than the suspect. The research on nonidentification evidence has met with conflicting results. Some studies indicate that nonidentification evidence has a strong impact on mock jurors, whereas other studies have found that this evidence tends to be underutilized. The confirmatory strategy framework can be used to explain these discrepant findings. For example, Leippe (1985) found that nonidentifications had a substantial impact on mock jurors, with guilty verdicts declining from 47% to 14% if nonidentification evidence was included. In a second study, guilty verdicts declined (nonsignificantly) from 29% to 12% if nonidentification evidence was included. Based on the low conviction rates in the control conditions, it seems likely that mock jurors were testing a hypothesis of not guilty based on the other evidence. If mock jurors were using a confirmatory strategy, we would expect them to utilize the nonidentification evidence to support their bias toward a not guilty verdict, which is precisely what they did. McAllister and Bregman (1986, 1989) also found results supporting the use of a confirmatory strategy. They used a case for which baseline guilty rates were indicative of a not-guilty hypothesis. In line with the confirmatory strategy, they found not only that nonidentifications were overutilized in their study, but also that alibi identifications, which would provide evidence of innocence, were overutilized, whereas eyewitness identifications and alibi nonidentifications were underutilized.
Conflicting or Corroborated Eyewitness Evidence

At first glance, the results pertaining to how conflicting or corroborated eyewitness evidence affects juror belief are anything but conclusive, since some researchers found that this evidence has an effect and other researchers found that it does not. However, a second look at the findings in terms of hypothesis confirmation reveals another story. Lindsay et al. (1986) manipulated corroborating identification evidence (multiple witnesses identifying the same suspect) versus conflicting evidence (some witnesses stating that the suspect was not the person who committed the crime, whereas others said he was). They concluded that corroborating identification evidence was underutilized, but that conflicting identification evidence resulted in significantly fewer guilty verdicts. It is possible that (based on the other evidence) mock jurors formed a hypothesis of not guilty, causing them to underutilize the corroborating evidence supporting a verdict of guilty and overutilize conflicting evidence confirming a verdict of not guilty. Leippe (1985) also examined the effects of corroborating evidence and found that the addition of another eyewitness who identified the accused did not significantly increase guilty verdicts (47% and 53%, respectively). It appears from the guilty verdict rate of about 50% that mock jurors in this case were unbiased as to the guilt of the accused, forming no prior hypotheses. McAllister and Bregman’s (1986) finding that neither conflicting nor corroborating eyewitness evidence lowered or raised guilty verdicts compared with when just one eyewitness testified supports the notion that when people have no preconceived notions of a suspect’s guilt, both corroborating and conflicting eyewitness testimony may be underutilized.
The idea of a hypothesis confirmatory strategy playing a role in how identification evidence, nonidentification evidence, and the evidence of multiple witnesses is utilized is interesting and deserving of future research. At the moment, the results are too few and inconsistent to draw conclusions. A clear limitation of the existing studies is that they were not designed to explicitly test the possibility that confirmation biases were at work.

IMPLICATIONS AND DIRECTIONS FOR FUTURE RESEARCH

Eyewitness evidence is very compelling to jurors. However, people are not able to tell if a witness is accurate by watching his or her testimony. Jurors appear to base their decisions on whether to believe an eyewitness mostly on characteristics of the witness, such as confidence, which are not strongly related to eyewitness accuracy. Characteristics of the crime scene and the criminal tend to be overlooked, and system variables, which can have a great impact on eyewitness accuracy, are completely ignored when people are deciding whether a witness should be believed. In addition, jurors may focus on evidence that best supports their preconceived hypotheses about a suspect’s guilt or innocence. This can lead to the underutilization of nonidentification, conflicting, or corroborating eyewitness evidence, all of which clearly affect the likelihood that a suspect is guilty. The problem is clear. Jurors are unable to determine whether eyewitness evidence is accurate. This may occur because there may be no way to successfully make the discrimination (Caputo & Dunning, this volume). Alternatively, jurors may fail to discriminate eyewitness accuracy because they focus on uninformative variables and disregard at least some variables that do speak to the likelihood that an eyewitness is correct. Or the task of evaluating the available cues in order to accurately assess likely accuracy may be too complex (Smith, Lindsay, & Pryke, 2000).
How can we increase jurors’ sensitivity to factors that actually affect eyewitness accuracy? Obviously expert testimony is one option, and there is a great deal of research to be conducted examining that issue. However, based on the resistance expert testimony has received from the courts (Benton et al., this volume), it is important to determine if there are any other options. Once the case is in court, jurors can’t tell if an eyewitness is correct. What about at the time of the identification? It would be interesting to test whether people are able to discriminate between accurate and inaccurate witnesses if provided with a tape of the identification procedure (indeed, videotaping identification procedures is recommended by the National Institute of Justice committee on best practices for eyewitness identification procedures; Technical Working Group, 1999). It is possible that jurors base the majority of their opinions, and in turn their verdicts, on their own judgments. That is, when a witness testifies, jurors observe factors related to the witness, such as age, confidence, and consistency. As previously mentioned, these are the variables that are commonly used by experimental participants to determine the credibility of the witness. Perhaps presenting a videotape of the identification would be informative for jurors, since they would be able to see the witness waver or delay for long periods before making the identification versus quickly and confidently selecting the suspect. This information might then be used to discriminate between accurate and inaccurate witnesses. Are there ways of presenting the case that might cause jurors to focus less on the witness and more on the police procedures in order to better calibrate their belief?
Can a judge’s instructions about eyewitness identification lead jurors to evaluate eyewitness testimony in more appropriate ways (by taking into account more relevant and less irrelevant variables)? More research is needed on all of the issues discussed in this chapter. We know something about how witness variables affect eyewitness belief, but our knowledge of other variables is lacking. We need to explore more thoroughly the effects of characteristics of the criminal and situation on jurors’ perceptions of eyewitness credibility. In addition, more research focused on system variables is needed. In some cases, the conclusions we have come to in this chapter are based on very little data. Hand in hand with expanding the focus of the research is increasing the breadth of the studies conducted. The majority of these studies look only at the effects of one or two variables on belief. However, in the real world, jurors are faced with a rich palette of information on which to base their belief. Cutler, Penrod, and colleagues are noteworthy exceptions in this regard (e.g., Cutler, Penrod, & Dexter, 1990; Cutler, Penrod, & Stuve, 1988). Their efforts to examine the effects of multiple variables within the same study are commendable. Bradfield and Wells (2000) provide another example with their interesting research on the additive effects of confidence and consistency on belief. In terms of methodology, the research on belief of eyewitnesses has typically used one of the three paradigms described at the beginning of the chapter. Questionnaires are an easy method for obtaining information; however, problems can arise because of response biases. Questionnaires often use a multiple-choice or forced-choice format that limits the depth and breadth of the information obtained from them.
Questionnaires also may make certain issues more salient, leading respondents to answer questions based on demand characteristics when those factors would normally not occur to them if they were on a jury (Shaw et al., 1999). Prediction studies are not without their problems either. A prediction study is only as valid as the description of the original study given. If a poor description of what occurred originally is used, people’s estimates of the number of people who would make a correct identification are meaningless. Of course, the reliability of the original effect (in the study being described) also limits the validity of prediction studies. It does not matter if people can estimate the frequency of correct lineup decisions from a study if the results of the study do not reflect a general pattern of identification accuracy. Also, people in prediction studies are making judgments about witnesses on average, rather than a particular witness. Differences in actual versus predicted accuracy rates may reflect people’s inability to make overall estimations rather than their inability to determine whether an individual eyewitness is accurate or not (McAllister et al., 1993). Prediction studies where participants watch witnesses testify provide a measure of whether people can discriminate between accurate and inaccurate eyewitnesses based on all of the cues that would be available to jurors. Of course such studies risk lack of validity and generality unless diverse witnesses are used as stimuli (Wells & Windschitl, 1999). Trial simulation studies are used in an attempt to mimic real life where jurors are making decisions about people’s guilt or innocence. Such studies were thoroughly critiqued in the late 1970s (e.g., Bray & Kerr, 1979; Diamond, 1979). Case descriptions may offer little similarity to real life. Descriptions are often very brief.
Manipulations may be especially salient because they are in writing, drawing participants’ attention to them. Even videotaped trials, which offer an improvement over case descriptions in terms of richness of the stimulus, are limited in length, which may make some information, particularly manipulated variables, more salient than would be the case in a real trial (Ross et al., 1990). The lack of detail may overly simplify the decisions for mock jurors compared with what would be experienced in the real world. Simulated trials often do not include opening statements or judge’s instructions. Very few studies have focused on jury deliberation and its effects on eyewitness belief and verdicts. Unrealistic dependent measures may be used, such as asking mock jurors to assign sentences or provide guilt responses on scales rather than as a binary verdict. It is commendable that a variety of paradigms have been used in the research, as each does have its limitations, and it is reassuring that the results have frequently converged. However, it might be beneficial for researchers to broaden their methods of measuring eyewitness belief in the laboratory in order to decrease the effect of factors such as demand characteristics (e.g., Kwong-See, Hoffman, & Wood, 2001). We encourage researchers to “think outside the box” to come up with new ways to test their hypotheses. Finally, leaving the laboratory and collecting field data is essential. It might be interesting, for example, to ask actual jurors what evidence they based their verdicts on in order to demonstrate the applicability of laboratory studies to the real world.

POLICY IMPLICATIONS

“Postdicting” eyewitness accuracy will never be an exact science. At the moment, we have no reason to believe that people, and thus the courts, are capable of discriminating between cases of correct and mistaken identification based on eyewitness testimony.
Although it may be possible to reduce the error rate in estimating accuracy to some degree, our best available information is likely to lead to about a 25% error (wrongful conviction) rate (Smith, Lindsay, & Pryke, 2000). And even this level of performance can only be achieved by regression analysis, not human judgment! Criminal justice systems need to acknowledge the limitations of eyewitnesses and do a better job of deciding when sufficient evidence exists, not only to convict, but also to arrest, charge, and prosecute suspects.

In Canadian legal circles, one factor believed to lead to wrongful conviction is “tunnel vision.” Once police and prosecutors become convinced of or committed to the guilt of the suspect, they fail to consider or actively dismiss evidence that may exonerate. Tunnel vision also may lead police and prosecutors to exaggerate the strength of the evidence supporting the hypothesis of guilt. These are examples of confirmation biases and have long been understood by psychologists (Nickerson, 1998). How might such biases be counteracted?

Following the Devlin Commission, the British argued that identification evidence alone should be insufficient to lead to conviction. They argued that corroboration of identification with some form of independent evidence should be required to dispel reasonable doubt. This seems like a good idea up to a point. Certainly wrongful convictions would be reduced by such a policy. However, there are too many cases that lack truly independent, corroborating evidence. Many robbery cases, for example, would fall into this category. Police and prosecutors are not willing to abandon these cases. Thus it is understandable that shortly after proposing this policy the British abandoned it. The fact that the accused fit the description provided by the witness was deemed sufficient to corroborate an identification.
Clearly such a trivial level of corroboration will not greatly reduce errors (particularly since research suggests that description accuracy and identification accuracy are not closely related; Wells, 1985).

Better investigative procedures, particularly better identification procedures, are much more likely to reduce wrongful convictions than efforts to improve postdicting. Tunnel vision and many other considerations strongly support the use of blind testing (Charman & Wells, this volume). If officers are unaware of which lineup member is the suspect, their ability and propensity to influence witnesses would be greatly diminished. The sequential lineup reduces misidentification dramatically with relatively small losses of correct identifications (Lindsay & Wells, 1985; Steblay, Dysart, Fulero, & Lindsay, 2001). This is true even if strong lineup biases are present (Lindsay et al., 1991). As still newer and better procedures are developed, the ability to reduce the rate of false identification may be further enhanced (Dupuis & Lindsay, this volume).

Procedures that reduce false identifications have many advantages over attempts to separate accurate from inaccurate identification decisions in court. Innocent people will be less likely to be arrested, charged, prosecuted, and convicted. Police will continue to investigate cases to find the true perpetrators. Until and unless we can develop highly accurate methods for determining the accuracy of eyewitnesses after identification decisions have been made, the best hope for reducing the unacceptably high rate of false identification and wrongful conviction is to develop improved identification procedures and convince, or better yet require, the police to use them.

REFERENCES

Bell, B. E., & Loftus, E. F. (1988). Degree of detail of eyewitness testimony and mock juror judgments. Journal of Applied Social Psychology, 18, 1171–1192.
Berman, G. L., & Cutler, B. L. (1996).
Effects of inconsistencies in eyewitness testimony on mock-juror decision making. Journal of Applied Psychology, 81, 170–177.
Berman, G. L., Narby, D. J., & Cutler, B. L. (1995). Effects of inconsistent eyewitness statements on mock-juror’s evaluations of the eyewitness, perceptions of defendant culpability and verdicts. Law & Human Behavior, 19, 79–88.
Bothwell, R. K., Brigham, J. C., & Malpass, R. S. (1989). Cross-racial identification. Personality & Social Psychology Bulletin, 15, 19–25.
Bradfield, A. L., & Wells, G. L. (2000). The perceived validity of eyewitness identification testimony: A test of the five Biggers criteria. Law & Human Behavior, 24, 581–594.
Bray, R., & Kerr, N. L. (1979). Use of the simulation method in the study of jury behaviour: Some methodological considerations. Law & Human Behavior, 3, 107–119.
Brewer, N., & Burke, A. (2002). Effects of testimonial inconsistencies and eyewitness confidence on mock-juror judgments. Law & Human Behavior, 26, 353–364.
Brigham, J. C., & Bothwell, R. K. (1983). The ability of prospective jurors to estimate the accuracy of eyewitness identifications. Law & Human Behavior, 7, 19–30.
Brigham, J. C., Maass, A., Snyder, L. D., & Spaulding, K. (1982). Accuracy of eyewitness identification in a field setting. Journal of Personality & Social Psychology, 42, 673–681.
Bull, R., & Clifford, B. R. (1984). Earwitness voice recognition accuracy. In G. L. Wells & E. F. Loftus (Eds.), Eyewitness testimony: Psychological perspectives (pp. 92–124). New York: Cambridge University Press.
Cavoukian, A. (1981). The influence of eyewitness identification evidence. Dissertation Abstracts International, 42, 352–353.
Chance, J. E., & Goldstein, A. G. (1996). The other-race effect and eyewitness identification. In S. Sporer & R. Malpass (Eds.), Psychological issues in eyewitness identification (pp. 153–176).
Mahwah, NJ: Lawrence Erlbaum Associates.
Clifford, B. R., & Hollin, C. R. (1981). Effects of the type of incident and the number of perpetrators on eyewitness memory. Journal of Applied Psychology, 66, 364–370.
Cutler, B. L., Penrod, S. D., & Dexter, H. R. (1990). Juror sensitivity to eyewitness identification evidence. Law & Human Behavior, 14, 185–191.
Cutler, B. L., Penrod, S. D., & Martens, T. K. (1987). The reliability of eyewitness identification: The role of system and estimator variables. Law & Human Behavior, 11, 233–258.
Cutler, B. L., Penrod, S. D., & Stuve, T. E. (1988). Juror decision making in eyewitness identification cases. Law & Human Behavior, 12, 41–55.
Deffenbacher, K. A., & Loftus, E. F. (1982). Do jurors share a common understanding concerning eyewitness behavior? Law & Human Behavior, 6, 15–30.
Devenport, J. L., Stinson, V., Cutler, B. L., & Kravitz, D. A. (2002). How effective are the cross-examination and expert testimony safeguards? Jurors’ perceptions of the suggestiveness and fairness of biased lineup procedures. Journal of Applied Psychology, 87, 1042–1054.
Devlin, Hon Lord Patrick. (1976). Report to the Secretary of State for the Home Department of the Departmental Committee on Evidence of Identification in Criminal Cases. HMSO.
Diamond, S. S. (1979). Simulation: Does the microscope lens distort? Law & Human Behavior, 3, 1–4.
Elliott, R., Farrington, B., & Manheimer, H. (1988). Eyewitnesses credible and discredible. Journal of Applied Social Psychology, 18, 1411–1422.
Fisher, R. P., & Cutler, B. L. (1995). The relation between consistency and accuracy of eyewitness testimony. In G. Davies, S. Lloyd-Bostock, et al. (Eds.), Psychology, law, and criminal justice: International developments in research and practice (pp. 21–28). Oxford, England: Walter De Gruyter.
Hatvany, N., & Strack, F. (1980). The impact of a discredited key witness. Journal of Applied Social Psychology, 10, 490–509.
Kassin, S. M., & Barndollar, K. A.
(1992). The psychology of eyewitness testimony: A comparison of experts and prospective jurors. Journal of Applied Social Psychology, 22(16), 1241–1249.
Kassin, S. M., & Neumann, K. (1997). On the power of confession evidence: An experimental test of the fundamental difference hypothesis. Law & Human Behavior, 21, 469–484.
Kennedy, T. D., & Haygood, R. C. (1992). The discrediting effect in eyewitness testimony. Journal of Applied Social Psychology, 22, 70–82.
Kwong-See, S. T., Hoffman, H. G., & Wood, T. L. (2001). Perceptions of an old female eyewitness: Is the older eyewitness believable? Psychology & Aging, 16, 346–350.
Lampinen, J. M., Judges, D. P., Odegard, T. N., & Hamilton, S. (2005). The reactions of mock jurors to the Department of Justice guidelines for the collection and preservation of eyewitness evidence. Basic and Applied Social Psychology, 27, 155–162.
Leippe, M. R. (1985). The influence of eyewitness nonidentifications on mock-jurors’ judgments of a court case. Journal of Applied Social Psychology, 15, 656–672.
Leippe, M. R., Manion, A. P., & Romanczyk, A. (1992). Eyewitness persuasion: How and how well do fact finders judge the accuracy of adults’ and children’s memory reports? Journal of Personality & Social Psychology, 63, 181–197.
Leippe, M. R., & Romanczyk, A. (1989). Reactions to child (versus adult) eyewitnesses: The influence of jurors’ preconceptions and witness behavior. Law & Human Behavior, 13, 103–132.
Leippe, M. R., Wells, G. L., & Ostrom, T. M. (1978). Crime seriousness as a determinant of accuracy in eyewitness identification. Journal of Applied Psychology, 63, 345–351.
Lindsay, R. C. L. (1994). Expectations of eyewitness performance. In D. Ross, D. Read, & M. Toglia (Eds.), Adult eyewitness testimony: Current trends and developments (pp. 362–384). New York: Cambridge University Press.
Lindsay, R. C. L., & Bellinger, K. (1999).
Alternatives to the sequential lineup: The importance of controlling the pictures. Journal of Applied Psychology, 84, 315–321.
Lindsay, R. C. L., Lea, J. A., Nosworthy, G. J., Fulford, J. A., Hector, J., LeVan, V., et al. (1991). Biased lineups: Sequential presentation reduces the problem. Journal of Applied Psychology, 76, 796–802.
Lindsay, R. C. L., Lim, R., Marando, L., & Cully, D. (1986). Mock-juror evaluations of eyewitness testimony: A test of metamemory hypotheses. Journal of Applied Social Psychology, 16, 447–459.
Lindsay, R. C. L., Pozzulo, J. D., Craig, W., Lee, K., & Corber, S. (1997). Simultaneous lineups, sequential lineups, and showups: Eyewitness identification decisions of adults and children. Law & Human Behavior, 21, 391–404.
Lindsay, R. C. L., & Wells, G. L. (1980). What price justice? Exploring the relationship of lineup fairness to identification accuracy. Law & Human Behavior, 4, 303–313.
Lindsay, R. C. L., & Wells, G. L. (1985). Improving eyewitness identifications from lineups: Simultaneous versus sequential lineup presentation. Journal of Applied Psychology, 70, 556–564.
Lindsay, R. C. L., Wells, G. L., & O’Connor, F. J. (1989). Mock-juror belief of accurate and inaccurate eyewitnesses: A replication and extension. Law & Human Behavior, 13, 333–339.
Lindsay, R. C. L., Wells, G. L., & Rumpel, C. M. (1981). Can people detect eyewitness-identification accuracy within and across situations? Journal of Applied Psychology, 66, 79–89.
Loftus, E. (1974). Reconstructing memory: The incredible eyewitness. Psychology Today, 8, 116–119.
Loftus, E. F., & Hoffman, H. G. (1989). Misinformation and memory: The creation of new memories. Journal of Experimental Psychology: General, 118, 100–104.
MacLin, O. H., MacLin, M. K., & Malpass, R. S. (2001). Race, arousal, attention, exposure and delay: An examination of factors moderating face recognition. Psychology, 7, 134–152.
Malpass, R. S., & Devine, P. G. (1981).
Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 66, 482–489.
McAllister, H. A., & Bregman, N. J. (1986). Juror underutilization of eyewitness nonidentifications: Theoretical and practical implications. Journal of Applied Psychology, 71, 168–170.
McAllister, H. A., & Bregman, N. J. (1989). Juror underutilization of eyewitness nonidentifications: A test of the disconfirmed expectancy explanation. Journal of Applied Social Psychology, 19, 20–29.
McAllister, H. A., Dale, R. H., & Keay, C. E. (1993). Effects of lineup modality on witness credibility. Journal of Social Psychology, 133, 365–376.
McCauley, M. R., & Parker, J. F. (2001). When will a child be believed? The impact of the victim’s age and juror’s gender on children’s credibility and verdict in a sexual-abuse case. Child Abuse & Neglect, 25, 523–539.
McCloskey, M., & Egeth, H. E. (1984). Process and outcome considerations in juror evaluation of eyewitness testimony. American Psychologist, 39, 1065–1066.
Myers, B., & Arbuthnot, J. (1997). Polygraph testimony and juror judgements: A comparison of the Guilty Knowledge Test and the Control Question Test. Journal of Applied Social Psychology, 27, 1421–1437.
Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2, 175–220.
Nunez, N., McCoy, M. L., Clark, H. L., & Shaw, L. A. (1999). The testimony of elderly victim/witnesses and their impact on juror decisions: The importance of examining multiple stereotypes. Law & Human Behavior, 23, 413–423.
Pryke, S., Lindsay, R. C. L., Dysart, J. E., & Dupuis, P. (2004). Multiple independent identification decisions: A method of calibrating eyewitness identifications. Journal of Applied Psychology, 89, 73–84.
Ross, D. F., Dunning, D., Toglia, M. P., & Ceci, S. J. (1990).
The child in the eyes of the jury: Assessing mock jurors’ perceptions of the child witness. Law & Human Behavior, 14, 5–23.
Ross, D. F., Jurden, F. H., Lindsay, R. C. L., & Keeney, J. M. (2003). Replications and limitations of a two-factor model of child witness credibility. Journal of Applied Social Psychology, 33, 418–430.
Ross, D. F., Miller, B., & Moran, P. (1987). The child in the eyes of the jury: Assessing mock jurors’ perceptions of the child witness. In S. Ceci, D. Ross, & M. Toglia (Eds.), Children’s eyewitness memory (pp. 121–141). New York: Springer-Verlag.
Saunders, D. M., Vidmar, N., & Hewitt, E. C. (1983). Eyewitness testimony and the discrediting effect. In S. M. A. Lloyd-Bostock & B. R. Clifford (Eds.), Evaluating witness evidence. London: John Wiley.
Scheck, B., & Neufeld, P. (2004). The innocence project. Retrieved March 12, 2004 from The Innocence Project Homepage: http://www.innocenceproject.org/
Shaw, J. S. I., Garcia, L. A., & McClure, K. A. (1999). A lay perspective on the accuracy of eyewitness testimony. Journal of Applied Social Psychology, 29, 52–71.
Skolnick, P., & Shaw, J. I. (2001). A comparison of eyewitness and physical evidence on mock-juror decision making. Criminal Justice & Behavior, 28, 614–630.
Smith, M. C. (1983). Hypnotic memory enhancement of witnesses: Does it work? Psychological Bulletin, 94, 387–407.
Smith, S. M., Lindsay, R. C. L., & Pryke, S. (2000). Postdictors of eyewitness errors: Can false identifications be diagnosed? Journal of Applied Psychology, 85, 542–550.
Spanos, N. P., Gwynn, M. I., & Terrade, K. (1989). Effects on mock jurors of experts favorable and unfavorable toward hypnotically elicited eyewitness testimony. Journal of Applied Psychology, 74, 922–926.
Steblay, N. M. (1992). A meta-analytic review of the weapon focus effect. Law & Human Behavior, 16, 413–424.
Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2001).
Eyewitness accuracy rates in sequential and simultaneous lineup presentations: A meta-analytic comparison. Law & Human Behavior, 25, 459–473.
Technical Working Group for Eyewitness Evidence. (1999). Eyewitness evidence: A guide for law enforcement. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.
Weinberg, H. I., & Baron, R. S. (1982). The discredible eyewitness. Personality & Social Psychology Bulletin, 8, 60–67.
Wells, G. L. (1978). Applied eyewitness-testimony research: System variables and estimator variables. Journal of Personality & Social Psychology, 36, 1546–1557.
Wells, G. L. (1984). The psychology of lineup identifications. Journal of Applied Social Psychology, 14, 89–103.
Wells, G. L. (1985). Verbal descriptions of faces from memory: Are they diagnostic of identification accuracy? Journal of Applied Psychology, 70, 619–626.
Wells, G. L., & Bradfield, A. L. (1998). “Good, you identified the suspect:” Feedback to eyewitnesses distorts their reports of the witnessing experience. Journal of Applied Psychology, 83, 360–376.
Wells, G. L., Ferguson, T. J., & Lindsay, R. C. L. (1981). The tractability of eyewitness confidence and its implications for triers of fact. Journal of Applied Psychology, 66, 688–696.
Wells, G. L., & Leippe, M. R. (1981). How do triers of fact infer the accuracy of eyewitness identifications? Using memory for peripheral detail can be misleading. Journal of Applied Psychology, 66, 682–687.
Wells, G. L., Lindsay, R. C., & Ferguson, T. J. (1979). Accuracy, confidence, and juror perceptions in eyewitness identification. Journal of Applied Psychology, 64, 440–448.
Wells, G. L., Lindsay, R. C., & Tousignant, J. P. (1980). Effects of expert psychological advice on human performance in judging the validity of eyewitness testimony. Law & Human Behavior, 4, 275–285.
Wells, G. L., & Windschitl, P. D.
(1999). Stimulus sampling and social psychological experimentation. Personality and Social Psychology Bulletin, 25, 1115–1125.
Wright, D. B., & Stroud, J. N. (2002). Age differences in lineup identification accuracy: People are better with their own age. Law & Human Behavior, 26, 641–654.
Yarmey, A. D. (1984). Accuracy and credibility of the elderly witness. Canadian Journal on Aging, 3, 79–90.
Yarmey, A. D., & Jones, H. P. (1983). Is the psychology of eyewitness identification a matter of common sense? In S. Lloyd-Bostock & B. R. Clifford (Eds.), Evaluating witness evidence (pp. 13–40). Chichester, England: John Wiley & Sons.

PREFACE

CRIMINAL LAW 2.0

HON. ALEX KOZINSKI1

I

Although we pretend otherwise, much of what we do in the law is guesswork. For example, we like to boast that our criminal justice system is heavily tilted in favor of criminal defendants because we’d rather that ten guilty men go free than an innocent man be convicted.2 There is reason to doubt it, because very few criminal defendants actually go free after trial.3 Does this mean that many guilty men are never charged because the prosecution is daunted by its heavy burden of proof? Or is it because jurors almost always start with a strong presumption that someone wouldn’t be charged with a crime unless the police and the prosecutor were firmly convinced of his guilt? We tell ourselves and the public that it’s the former and not the latter, but we have no way of knowing. They say that any prosecutor worth his salt can get a grand jury to indict a ham sandwich. It may be that a decent prosecutor could get a petit jury to convict a eunuch of rape.

The “ten guilty men” aphorism is just one of many tropes we assimilate long before we become lawyers.
How many of us, the author included, were inspired to go to law school after watching Juror #8 turn his colleagues around by sheer force of reason and careful dissection of the evidence?4 “If that’s what the law’s about, then I want to be a lawyer!” I thought to myself. But is it? We know very little about this because very few judges, lawyers and law professors have spent significant time as jurors.5 In fact, much of the so-called wisdom that has been handed down to us about the workings of the legal system, and the criminal process in particular, has been undermined by experience, legal scholarship and common sense. Here are just a few examples:

1. Eyewitnesses are highly reliable. This belief is so much part of our culture that one often hears talk of a “mere” circumstantial case as contrasted to a solid case based on eyewitness testimony. In fact, research shows that eyewitness identifications are highly unreliable,6 especially where the witness and the perpetrator are of different races.7 Eyewitness reliability is further compromised when the identification occurs under the stress of a violent crime, an accident or catastrophic event—which

1. The author is a judge on the Ninth Circuit. He wishes to acknowledge the extraordinary help provided by his law clerk, Joanna Zhang. © 2015, Alex Kozinski.
2. Actually, as Sasha Volokh points out, the number of guilty men we are willing to free to save an innocent one is somewhat indeterminate. See Alexander “Sasha” Volokh, n Guilty Men, 146 U. PA. L. REV. 173, 187-92 (1997).
3. According to the most recent United States Attorneys’ Annual Statistical Report, out of the 3424 federal criminal cases that went to trial in 2013, only 228, or about 6.7 percent, resulted in acquittals. See Dep’t of Justice, U.S. Attorneys’ Annual Statistical Report: Fiscal Year 2013, at 51-56, Tables 2 & 2A (Sept. 22, 2014), available at http://www.justice.gov/sites/default/files/usao/legacy/2014/09/22/13statrpt.pdf.
4.
12 ANGRY MEN (United Artists 1956).
5. I’ve done it twice, and I can’t say I know much more about the process. One case was so clear-cut that only one verdict was possible. The other one was closer and resulted in a hung jury, but doubtless would have resulted in a swift conviction but for my participation.
6. See Roger B. Handberg, Expert Testimony on Eyewitness Identification: A New Pair of Glasses for the Jury, 32 AM. CRIM. L. REV. 1013, 1018-22 (1995).
7. See John P. Rutledge, They All Look Alike: The Inaccuracy of Cross-Racial Identifications, 28 AM. J. CRIM. L. 207, 211-15 (2001).

44 GEO. L.J. ANN. REV. CRIM. PROC (2015) iii

pretty much covers all situations where identity is in dispute at trial.8 In fact, mistaken eyewitness testimony was a factor in more than a third of wrongful conviction cases.9 Yet, courts have been slow in allowing defendants to present expert evidence on the fallibility of eyewitnesses; many courts still don’t allow it.10 Few, if any, courts instruct juries on the pitfalls of eyewitness identification or caution them to be skeptical of eyewitness testimony.

2. Fingerprint evidence is foolproof. Not so. Identifying prints that are taken by police using fingerprinting equipment and proper technique may be a relatively simple process,11 but latent prints left in the field are often smudged and incomplete, and the identification process becomes more art than science. When tested by rigorous scientific methods, fingerprint examiners turn out to have a significant error rate.12 Perhaps the best-known example of such an error occurred in 2004 when the FBI announced that a latent print found on a plastic bag near a Madrid terrorist bombing was “a 100 percent match” to Oregon attorney Brandon Mayfield.13 The FBI eventually conceded error when Spanish investigators linked the print to someone else.

3. Other types of forensic evidence are scientifically proven and therefore infallible.
With the exception of DNA evidence (which has its own issues), what goes for fingerprints goes double and triple for other types of forensic evidence: Spectrographic voice identification error rates are as high as 63%, depending on the type of voice sample tested. Handwriting error rates average around 40% and

8. See Thomas Dillickrath, Expert Testimony on Eyewitness Identification: Admissibility and Alternatives, 55 U. MIAMI L. REV. 1059, 1063-64 (2001).
9. See The National Registry of Exonerations, % Exonerations by Contributing Factor (last visited Apr. 7, 2015), http://www.law.umich.edu/special/exoneration/Pages/ExonerationsContribFactorsByCrime.aspx (mistaken eyewitness identifications were a contributing factor in 34 percent of all exonerations recorded in the database).
10. The Seventh and Eleventh Circuits have “consistently looked unfavorably upon such testimony,” United States v. Smith, 122 F.3d 1355, 1357 (11th Cir. 1997), with the Eleventh Circuit going as far as to hold that eyewitness expert testimony is per se inadmissible. See United States v. Holloway, 971 F.2d 675, 679 (11th Cir. 1992); United States v. Hall, 165 F.3d 1095, 1104 (7th Cir. 1999) (noting a presumption against admission of eyewitness expert testimony). The Second Circuit takes a similarly skeptical approach, holding that eyewitness expert testimony likely usurps the jury’s role of determining witness credibility. See United States v. Lumpkin, 192 F.3d 280, 289 (2d Cir. 1999). The Third and Sixth Circuits, by contrast, have welcomed the admission of eyewitness expert testimony. See United States v. Smith, 736 F.2d 1103, 1107 (6th Cir. 1984); United States v. Stevens, 935 F.2d 1380, 1397-98 (3d Cir. 1991); see also United States v. Hines, 55 F. Supp. 2d 62, 72 (D. Mass.
1999) (“[w]hile jurors may well be confident that they can draw the appropriate inferences about eyewitness identification directly from their life experiences, their confidence may be misplaced, especially where cross-racial identification is concerned”). And don’t even get me started on the state courts—they’re all over the place.
11. Then, again, maybe not. When the FBI Special Agent came around to take fingerprints for my background check, he brought a fingerprint kit and 10 cards, all of which he insisted on filling—about 120 prints in all. “Why so many?” I asked. “Because sometimes they don’t come out so clear so we like to make backups.” He carefully rolled his ten dozen prints and left . . . then came back a week later with 10 more cards: None of the first set were any good. True story.
12. “[F]orensic fingerprint identification almost never deals in whole fingerprints. Rather, technicians use ‘latent’ fingerprints—invisible impressions that they ‘develop’ using a powder or a chemical developing agent. Latent prints are usually fragmentary, blurred, overlapping, and otherwise distorted. The challenge is to match the latent print to a pristine inked (or, these days, optically scanned) print taken under ideal conditions at the police station.” Simon Cole, The Myth of Fingerprints: A Forensic Science Stands Trial, 10 LINGUA FRANCA, no. 8, 2000; see Simon A. Cole, More Than Zero: Accounting for Error in Latent Fingerprint Identification, 95 J. CRIM. L. & CRIMINOLOGY 985, 994-1029 (2005); Michael J. Saks & Jonathan J. Koehler, The Coming Paradigm Shift in Forensic Identification Science, 309 SCI. 892, 895 (2005); Andy Newman, Fingerprinting’s Reliability Draws Growing Court Challenges, N.Y. TIMES (Apr. 7, 2001), http://www.nytimes.com/2001/04/07/us/fingerprinting-s-reliability-draws-growing-court-challenges.html. In United States v. Llera Plaza, 188 F. Supp. 2d 549, 564 (E.D. Pa.
2002), for example, Judge Louis Pollak rejected fingerprint identification expert testimony after concluding that the field of fingerprint identification has failed to systematically test its underlying assumptions and claims of expertise.
13. Saks & Koehler, supra n.12, at 894.

sometimes approach 100%. False-positive error rates for bite marks run as high as 64%. Those for microscopic hair comparisons are about 12% (using results of mitochondrial DNA testing as the criterion).14

Other fields of forensic expertise, long accepted by the courts as largely infallible, such as bloodstain pattern identification, foot and tire print identification and ballistics have been the subject of considerable doubt.15 Judge Nancy Gertner, for example, has expressed skepticism about admitting expert testimony on handwriting,16 canines,17 ballistics18 and arson.19 She has lamented that while “the Daubert-Kumho standard [for admitting expert witness testimony] does not require the illusory perfection of a television show (CSI, this wasn’t), when liberty hangs in the balance—and, in the case of the defendants facing the death penalty, life itself—the standards should be higher . . . than [those that] have been imposed across the country.”20 Some fields of forensic expertise are built on nothing but guesswork and false common sense.21 Many defendants have been convicted and spent countless years in prison based on evidence by arson experts who were later shown to be little better than witch doctors.22 Cameron Todd Willingham may have lost his life over it.23

14. Id. at 895 (internal citations omitted); see United States v. Starzecpyzel, 880 F. Supp. 1027, 1038 (S.D.N.Y. 1995) (McKenna, J.) (“the testimony at the Daubert hearing firmly established that forensic document examination, despite the existence of a certification program, professional journals and other trappings of science, cannot, after Daubert, be regarded as scientific . . .
knowledge”) (internal quotation marks omitted); see also Radley Balko, How the Flawed “Science” of Bite Mark Analysis Has Sent Innocent People to Prison, WASH. POST (Feb. 13, 2015), http://www.washingtonpost.com/news/the-watch/wp/2015/02/13/how-the-flawed-science-of-bite-mark-analysis-has-sent-innocent-people-to-jail/ (4-part series criticizing the failure of courts to accept the consensus in the scientific community that “bite mark matching isn’t reliable and has no scientific foundation for its underlying premises, and that until and unless further testing indicates otherwise, it shouldn’t be used in the courtroom”).
15. See TERRY LABER ET AL., NATIONAL INSTITUTE OF JUSTICE, FINAL REPORT, RELIABILITY ASSESSMENT OF CURRENT METHODS IN BLOODSTAIN PATTERN ANALYSIS (2014), available at https://www.ncjrs.gov/pdffiles1/nij/grants/247180.pdf; Yaron Shor & Sarena Weisner, A Survey on the Conclusions Drawn on the Same Footwear Marks Obtained in Actual Cases by Several Experts Throughout the World, 44 J. FORENSIC SCI. 380, 383 (1999); COMMITTEE ON IDENTIFYING THE NEEDS OF THE FORENSIC SCIENCES COMMUNITY, STRENGTHENING FORENSIC SCIENCE IN THE UNITED STATES: A PATH FORWARD, NATIONAL RESEARCH COUNCIL 154-55 (2009), available at https://www.ncjrs.gov/pdffiles1/nij/grants/228091.pdf (noting the unknown, and yet to be sufficiently tested, reliability of ballistics analysis).
16. Hines, 55 F. Supp. 2d at 69-71 (ruling that a handwriting expert may not give an ultimate conclusion on the author of a robbery note, and remarking that “[t]here is no academic field known as handwriting analysis,” as “[t]his is a ‘field’ that has little efficacy outside of a courtroom”).
17. United States v. Hebshie, 754 F. Supp. 2d 89, 119 (D. Mass. 2010) (“There are no peer reviewed standardized methods of training detective dogs; their reliability is in fact highly variable”) (citing Michael E. Kurz et al., Effect of Background Interference on Accelerant Detection by Canines, 41 J. FORENSIC SCI.
868 (1996)).
18. United States v. Green, 405 F. Supp. 2d 104, 107-08 (D. Mass. 2005) (finding “serious deficiencies” in the ballistics expert’s proffered testimony, including the expert’s failure to “cite any reliable report describing his error rates, that of his laboratory, or indeed, that of the field”).
19. Hebshie, 754 F. Supp. 2d at 114-15 (summarizing recent “public and professional literature reflect[ing] increasing scrutiny of arson evidence by experts in both the scientific and legal fields as well as by the public at large,” and expressing concerns about arson expert testimony rooted in “bad science” and “unreliable methodologies”).
20. Green, 405 F. Supp. 2d at 109.
21. “[S]ubjective, pattern-based forensic techniques—like hair and bite-mark comparisons” leave much room for error. See Spencer S. Hsu, FBI Admits Flaws in Hair Analysis Over Decades, WASH. POST (Apr. 18, 2015), http://www.washingtonpost.com/local/crime/fbi-overstated-forensic-hair-matches-in-nearly-all-criminal-trials-for-decades/2015/04/18/39c8d8c6-e515-11e4-b510-962fcfabc310_story.html; Roger Koppl, CSI for Real: How to Improve Forensic Science, REASON FOUND., Dec. 2007, at 3-18 (detailing inadequacies in the practice of forensic science).
22. See Paul Bieber, Anatomy of a Wrongful Arson Conviction, THE ARSON PROJECT, http://thearsonproject.org/charm/wp-content/uploads/2014/08/wrongful_convictions.pdf (since 1989, 29 exonerations have involved arson convictions).
23. See Robert Tanner, Science Casts Doubt on Arson Convictions, WASH. POST (Dec. 9, 2006), http://www.washingtonpost.com/wp-dyn/content/article/2006/12/09/AR2006120900357.html; David Grann,

4. DNA evidence is infallible. This is true to a point. DNA comparison, when properly conducted by an honest, trained professional will invariably reach the correct result.
But the integrity of the result depends on a variety of factors that are, unfortunately, not nearly so foolproof: the evidence must be gathered and preserved so as to avoid contamination; the testing itself must be conducted so that the two samples being compared do not contaminate each other; the examiner must be competent and honest.24 As numerous scandals involving DNA testing labs have shown,25 these conditions cannot be taken for granted, and DNA evidence is only as good as the weakest link in the chain. 5. Human memories are reliable. Much of what we do in the courtroom relies on human memory. When a witness is asked to testify about past events, the accuracy of his account depends not only on his initial perception, but on the way the memories are recorded, stored and retrieved. For a very long time, it was believed that stored memories were much like video tape or film—an accurate copy of real-world experience that might fade with the passage of time or other factors, but could not be distorted or embellished. Science now tells us that this view of human memory is fundamentally flawed. The mind not only distorts and embellishes memories, but a variety of external factors can affect how memories are retrieved and described. In an early study by cognitive psychologist Elizabeth Loftus, people were shown videos of car accidents and then questioned about what they saw.26 The group asked how fast the cars were going when they “smashed” into each other estimated 6.5 mph faster than the group asked how fast the cars were going when they “hit” each other.27 A week later, almost a third of those who were asked about the “smash” recalled seeing broken glass, even though there was none.28 Trial by Fire, NEW YORKER (Sept. 7, 2009), http://www.newyorker.com/magazine/2009/09/07/trial-by-fire. 24. The DNA scandals continue to this day. See Jaxon Van Derbeken, DNA Lab Irregularities May Endanger Hundreds of SFPD Cases, SF GATE (Mar.
28, 2015), http://www.sfgate.com/crime/article/DNA-lab-irregularities-may-endanger-hundreds-of-6165643.php (noting that the San Francisco Police Department is facing accusations that its DNA crime lab technicians have been “fill[ing] in the gaps of poor-quality, incomplete genetic evidence” and passing them off as “definitive test results to the state’s offender tracking database, something that would not have been allowed with the original, lower-quality DNA evidence”); see also Kimberly C. Boies, Misuse of DNA Evidence is Not Always a “Harmless Error”: DNA Evidence, Prosecutorial Misconduct, and Wrongful Conviction, 17 TEX. WESLEYAN L. REV. 403, 414-16 (2011); Paul C. Giannelli, Wrongful Convictions and Forensic Science: The Need to Regulate Crime Labs, 86 N.C. L. REV. 172-95 (2007). 25. See Stuart Taylor Jr., Opening Argument–Innocents in Prison, NAT’L J. (Aug. 4, 2007), http://www.nationaljournal.com/magazine/opening-argument-innocents-in-prison-20070804; Mark Hansen, Crime Labs Under the Microscope After a String of Shoddy, Suspect and Fraudulent Results, ABA J. (Sep. 1, 2013), http://www.abajournal.com/magazine/article/crime_labs_under_the_microscope_after_a_string_of_shoddy_suspect_and_fraudu/; Adam Liptak & Ralph Blumenthal, New Doubt Cast on Testing in Houston Police Crime Lab, N.Y. TIMES (Aug. 5, 2004), http://www.nytimes.com/2004/08/05/us/new-doubt-cast-on-testing-in-houston-police-crime-lab.html; Belinda Luscombe, When The Evidence Lies, TIME MAG. (May 13, 2001), http://content.time.com/time/magazine/article/0,9171,109625,00.html. 26. See Elizabeth F. Loftus & John C. Palmer, Reconstruction of Automobile Destruction: An Example of the Interaction Between Language and Memory, 13 J. VERBAL LEARNING & VERBAL BEHAV. 585-89 (1974), available at https://webfiles.uci.edu/eloftus/LoftusPalmer74.pdf. 27. Id. at 586, Table 1. 28. Id. at 587, Table 2. Professor Loftus has shown it is even possible to manufacture false memories. See, e.g., Elizabeth F.
Loftus & Jacqueline E. Pickrell, The Formation of False Memories, 25 PSYCHIATRIC ANNALS 720-25 (1995), https://webfiles.uci.edu/eloftus/Loftus_Pickrell_PA_95.pdf. For example, she gave students each a packet describing three real childhood memories and a false one, and told the students that all four memories were real and took place with a close family member. In follow-up interviews asking the students to describe their memories, 7 of 24 students remembered the false event in their packet and some added their own details to that false memory. Id. at 722. Loftus was also able to convince participants in another experiment that they’d experienced traumatic events that never happened, such as witnessing drug busts and breaking windows with their hands. See Elizabeth F. Loftus, Illusions of Memory, 142 PROCEEDINGS OF THE AM. PHIL. SOC., 60-73 (1998). Judge Mark Bennett provides a comprehensive overview of the existing cognitive psychological research on memory. See Mark W. Bennett, Unspringing The Witness Memory and Demeanor Trap: What Every Judge And Juror Needs to Know About Cognitive Psychology And Witness Credibility, AM. U. L. REV. (forthcoming 2015), available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id2581650.
This finding has troubling implications for criminal trials where witnesses are questioned long and hard by police and prosecutors before the defense gets to do so—if ever. There is thus plenty of opportunity to shape and augment a witness’s memory to bring it into line with the prosecutor’s theory of what happened. Yet with rare exceptions, courts do not permit expert testimony on human memory.29 For example, the district judge in the Scooter Libby case denied a defense motion for a memory expert, even though the key issue at trial was whose recollection of a 4-year-old telephone conversation should be believed.30 At least one member of the jury that convicted Libby lamented the lack of expert testimony on the subject.31 And a key witness in that case recently suggested in her memoirs that her memory may have been distorted by the prosecutor’s crafty questioning.32 Given the malleability of human memory, it should come as no surprise that many wrongful convictions have been the result of faulty witness memories, often manipulated by the police or the prosecution.33 6. Confessions are infallible because innocent people never confess. We now know that this is not true. Innocent people do confess with surprising regularity. Harsh interrogation tactics, a variant of Stockholm syndrome, the desire to end the ordeal, emotional and financial exhaustion, family considerations and the youth or feeble-mindedness of the suspect can result in remarkably detailed confessions that are later shown to be utterly false.34 29. See, e.g., United States v. Affleck, 776 F.2d 1451, 1458 (10th Cir. 1985) (“Specialized testimony explaining memory . . . is improper. The average person is able to understand that people forget; thus, a faulty memory is a matter for cross-examination.”). Testimony on memory has been admitted in limited circumstances, such as in cases involving mistaken eyewitness identifications, see supra n.10, and repressed memory caused by stress or trauma, see Isely v. Capuchin Province, 877 F. Supp. 1055, 1064, 1066 (E.D. Mich. 1995) (admitting expert testimony on “repressed memory, its validity and reliability, and whether or not [the plaintiff] has, in fact, experienced repressed memory and/or post-traumatic stress disorder”). 30. United States v. Libby, 461 F. Supp. 2d 3, 16 (D.D.C. 2006). 31. See Peter Berkowitz, The False Evidence Against Scooter Libby, WALL ST. J. (Apr. 6, 2015), http://www.wsj.com/articles/the-false-evidence-against-scooter-libby-1428365673. 32. See JUDITH MILLER, THE STORY: A REPORTER’S JOURNEY (2015); see infra n.113. 33.
See infra nn.51-52 and accompanying text (discussing how a 12-year-old boy accused the wrong man of a murder after the police fed the boy details of the crime); The National Registry of Exonerations, George Franklin, https://www.law.umich.edu/special/exoneration/Pages/casedetail.aspx?caseid3221 (George Franklin was convicted on the basis of his daughter’s testimony—20 years after the crime took place—that she had seen him commit the murder. He was released after it was revealed that the daughter had recalled the memory through hypnosis.); Jim Dwyer, Witness Accounts in Midtown Hammer Attack Show the Power of False Memory, N.Y. TIMES (May 14, 2015), http://www.nytimes.com/2015/05/15/nyregion/witness-accounts-in-midtown-hammer-attack-show-the-power-of-false-memory.html?_r0. 34. One such instance took place in Lake County, Illinois, where local police interrogated Juan Rivera Jr. for 4 straight days until he suffered a psychological breakdown and confessed to sexually assaulting and killing a young girl. Rivera spent 20 years in prison until DNA evidence exonerated him in 2012. See Dan Hinkel, Wrongful Convictions: Exonerated Inmate Wins Early Round In Suit Against Lake County Officials, CHI. TRIB. (Oct. 14, 2013), http://articles.chicagotribune.com/2013-10-14/news/ct-met-juan-rivera-lawsuit-win-20131013_1_wrongful-convictions-dna-evidence-dna-exonerations; see also Brandon L. Garrett, The Substance of False Confessions, 62 STAN. L. REV. 1051, 1052-57 (2010); Jed S. Rakoff, Why Innocent People Plead Guilty, N.Y. REV. OF BOOKS (Nov.
20, 2014), http://www.nybooks.com/articles/archives/2014/nov/20/why-innocent-people-plead-guilty/ (citing criminologists’ estimation that between 2 and 8 percent of convicted felons are innocent people who pleaded guilty, and noting that “young, unintelligent, or risk-averse defendants will often provide false confessions just because they cannot ‘take the heat’ of an interrogation”); Adam Cohen, Why Innocent Men Make False Confessions, TIME MAG. 7. Juries follow instructions. This is a presumption—actually more of a guess— that we’ve elevated to a rule of law.35 It is, of course, necessary that we do so because it links the jury’s fact-finding process to the law. In fact, however, we know very little about what juries actually do when they decide cases.36 Do they consider the instructions at all? Do they consider all of the instructions or focus on only some? Do they understand the instructions or are they confused? We don’t really know. We get occasional glimpses into the operations of juries when they send out questions or someone discloses juror misconduct, and even then the information we get is limited. But we have no convincing reason to believe that jury instructions in fact constrain jury behavior in all or even most cases.37 And, because the information we get from inside the jury room is so limited and sporadic, experience does little to improve our knowledge. Looking at 100 black boxes is no more informative than looking at one. 8. Prosecutors play fair.
The Supreme Court has told us in no uncertain terms that a prosecutor’s duty is to do justice, not merely to obtain a conviction.38 It has also laid down some specific rules about how prosecutors, and the people who work for them, must behave—principal among them that the prosecution turn over to the defense exculpatory evidence in the possession of the prosecution and the police.39 There is reason to doubt that prosecutors comply with these obligations fully. The U.S. Justice Department, for example, takes the position that exculpatory evidence must be produced only if it is material.40 This puts prosecutors in the position of deciding whether tidbits that could be helpful to the defense are significant enough that a reviewing court will find it to be material, which runs contrary to the philosophy of the Brady/Giglio line of cases and increases the risk that highly exculpatory evidence will be suppressed. Beyond that, we have what I have described elsewhere as an “epidemic of Brady violations abroad in the land,”41 a phrase that has caused much controversy but brought about little change in the way prosecutors (Feb. 11, 2013), http://ideas.time.com/2013/02/11/why-innocent-men-make-false-confessions/; Shankar Vedantam, Confessions Not Always Clad in Iron, WASH. POST (Oct. 1, 2007), http://www.washingtonpost.com/wp-dyn/content/article/2007/09/30/AR2007093001326.html. 35. See Weeks v. Angelone, 528 U.S. 225, 234 (2000) (“A jury is presumed to follow its instructions.”); Richardson v. Marsh, 481 U.S. 200, 211 (1987). 36. See David Alan Sklansky, Evidentiary Instructions and the Jury as Other, 65 STAN. L. REV. 407 (2013). 37. See id.; Krulewitch v. United States, 336 U.S. 440, 453 (1949) (Jackson, J., concurring) (“The naive assumption that prejudicial effects can be overcome by instructions to the jury . . . , all practicing lawyers know to be unmitigated fiction.”). 38. See Berger v. United States, 295 U.S.
78, 88 (1935) (“The United States Attorney is the representative not of an ordinary party to a controversy, but of a sovereignty whose obligation to govern impartially is as compelling as its obligation to govern at all; and whose interest, therefore, in a criminal prosecution is not that it shall win a case, but that justice shall be done.”). 39. See Brady v. Maryland, 373 U.S. 83, 87 (1963) (“suppression by the prosecution of evidence favorable to an accused upon request violates due process where the evidence is material either to guilt or to punishment”); United States v. Giglio, 405 U.S. 150, 154 (1972) (the Brady rule includes evidence that could be used to impeach a witness); see also Kyles v. Whitley, 514 U.S. 419, 437-38 (1995) (extending the state’s obligation under Brady to evidence in the possession of the police). 40. See, e.g., United States Attorneys’ Manual, Chapter 9-5.000, ISSUES RELATED TO TRIALS AND OTHER COURT PROCEEDINGS, http://www.justice.gov/usam/usam-9-5000-issues-related-trials-and-other-court-proceedings. 41. United States v. Olsen, 737 F.3d 625, 626 (9th Cir. 2013) (Kozinski, J., dissenting from denial of rehearing en banc); see People v. Velasco-Palacios, F068833, 2015 WL 782632 (Cal. Ct. App. Feb. 24, 2015) (noting that a prosecutor inserted a false confession into the transcript of the defendant’s police interrogation); Denis Slattery, Exclusive: Bronx Prosecutor Bashed and Barred from Courtroom for Misconduct, N.Y. DAILY NEWS (Apr. 4, 2014), http://www.nydailynews.com/new-york/bronx/bronx-prosecutor-barred-courtroom-article-1.1746238 (noting that a Bronx prosecutor failed to present evidence that would have freed a man held at Rikers Island on bogus rape charges). operate in the United States.42 9. The prosecution is at a substantial disadvantage because it must prove its case beyond a reasonable doubt.
Juries are routinely instructed that the defendant is presumed innocent and the prosecution must prove guilt beyond a reasonable doubt, but we don’t really know whether either of these instructions has an effect on the average juror. Do jurors understand the concept of a presumption? If so, do they understand how a presumption is supposed to operate? Do they assume that the presumption remains in place until it is overcome by persuasive evidence or do they believe it disappears as soon as any actual evidence is presented? We don’t really know. Nor do we know whether juries really draw a distinction between proof by a preponderance, proof by clear and convincing evidence and proof beyond a reasonable doubt. These levels of proof, which lawyers and judges assume to be hermetically sealed categories, may mean nothing at all in the jury room. My own experience as a juror certainly did nothing to convince me that my fellow jurors understood and appreciated the difference. The issue, rather, seemed to be quite simply: Am I convinced that the defendant is guilty? Even more troubling are doubts raised by psychological research showing that “whoever makes the first assertion about something has a large advantage over everyone who denies it later.”43 The tendency is more pronounced for older people than for younger ones, and increases the longer the time-lapse between assertion and denial. So is it better to stand mute rather than deny an accusation? Apparently not, because “when accusations or assertions are met with silence, they are more likely to feel true.”44 To the extent this psychological research is applicable to trials, it tends to refute the notion that the prosecution pulls the heavy oar in criminal cases. We believe that it does because we assume juries go about deciding cases by accurately remembering all the testimony and weighing each piece of evidence in a linear fashion, selecting which to believe based on assessment of its credibility or plausibility.
The reality may be quite different. It may be that jurors start forming a mental picture of the events in question as soon as they first hear about them from the prosecution witnesses. Later-introduced evidence, even if pointing in the opposite direction, may not be capable of fundamentally altering that picture and may, in fact, reinforce it.45 And the effect may be worse the longer the prosecution’s case lasts and, thus, the longer it takes to bring the contrary evidence before the jury. Trials in general, and longer trials 42. See CENTER FOR PROSECUTOR INTEGRITY, AN EPIDEMIC OF PROSECUTOR MISCONDUCT, WHITE PAPER (Dec. 2013), available at http://www.prosecutorintegrity.org/wp-content/uploads/EpidemicofProsecutor-Misconduct.pdf. 43. Shankar Vedantam, Persistence of Myths Could Alter Public Policy Approach, WASH. POST (Sept. 4, 2007), http://www.washingtonpost.com/wp-dyn/content/article/2007/09/03/AR2007090300933.html (discussing the results of a 2007 study by psychologist Norbert Schwarz); see Norbert Schwarz et al., Metacognitive Experiences and the Intricacies of Setting People Straight: Implications for Debiasing and Public Information Campaigns, 39 ADVANCES IN EXPERIMENTAL SOC. PSYCHOL. 127, 152 (2007) (“[o]nce a statement is accepted as true, people are likely to attribute it to a credible source—which, ironically, may often be the source that attempted to discredit it—lending the statement additional credibility when conveyed to others”) (citation omitted). 44. Vedantam, supra n.43 (quoting the statement of Peter Kim, an organizational psychologist who published a study in the Journal of Applied Psychology); see Donald L. Ferrin et al., Silence Speaks Volumes: The Effectiveness of Reticence in Comparison to Apology and Denial for Responding to Integrity- and Competence-Based Trust Violations, 92(4) J. APPLIED PSYCHOL. 893-908 (2007). 45.
See Norbert Schwarz et al., Metacognitive Experiences and the Intricacies of Setting People Straight: Implications for Debiasing and Public Information Campaigns, 39 ADVANCES IN EXPERIMENTAL SOC. PSYCHOL. 127, 152 (2007); Eliot G. Disner, Some Thoughts About Opening Statements: Another Opening, Another Show, PRAC. LITIGATOR, Jan. 2004, at 61 (“there is substantial evidence that juries normally make up their minds long before closing argument”). in particular, may be heavily loaded in favor of whichever party gets to present its case first—the prosecution in a criminal case and the plaintiff in a civil case. If this is so, it substantially undermines the notion that we seldom convict an innocent man because guilt must be proven to a sufficient certainty. It may well be that, contrary to instructions, and contrary to their own best intentions, jurors are persuaded of whatever version of events is first presented to them and change their minds only if they are given very strong reasons to the contrary. 10. Police are objective in their investigations. In many ways, this is the bedrock assumption of our criminal justice process. Police investigators have vast discretion about what leads to pursue, which witnesses to interview, what forensic tests to conduct and countless other aspects of the investigation. Police also have a unique opportunity to manufacture or destroy evidence,46 influence witnesses, extract confessions47 and otherwise direct the investigation so as to stack the deck against people they believe should be convicted.48 And not just small-town police in Podunk or Timbuktu.
Just the other day, “[t]he Justice Department and FBI [] formally acknowledged that nearly every examiner in an elite FBI forensic unit gave flawed testimony in almost all [of the 268] trials in which they offered evidence against criminal defendants over more than a two-decade period before 2000.”49 Do they offer a class at Quantico called “Fudging Your Results To Get A Conviction” or “Lying On The Stand 101”? How can you trust the professionalism and objectivity of police anywhere after an admission like that? There are countless documented cases where innocent people have spent decades behind bars because the police manipulated or concealed evidence, but two examples will suffice: 46. One example is the case of Mark Prentice, who pleaded guilty to assault and robbery only after a New York State Police trooper, David Harding, reported that he had found fingerprints matching Prentice in the victim’s house. A subsequent investigation revealed that New York State Police troopers, including Harding, had falsified fingerprint evidence in at least 30 cases, and Harding admitted to planting evidence in Prentice’s case. Prentice was acquitted after spending six years in prison. Harding was then sentenced to 4.5 years in prison for fabricating evidence. See The National Registry of Exonerations, Mark Prentice, https://www.law.umich.edu/special/exoneration/Pages/casedetail.aspx?caseid4540. In addition to the cases recorded by the National Registry of Exonerations, researchers became aware of more than 1,100 cases in which convictions were overturned due to just 13 police corruption scandals, the majority of which involved planting drugs or guns on innocent individuals.
See Chris Seward, Researchers: More than 2,000 False Convictions in Past 23 Years, NBC NEWS (May 21, 2012), http://usnews.nbcnews.com/_news/2012/05/21/11756575-researchers-more-than-2000-false-convictions-in-past-23-years?lite; Sean Gardiner, Brooklyn District Attorney Kenneth Thompson Takes on Wrongful Convictions, WALL ST. J. (Aug. 8, 2014), http://www.wsj.com/articles/brooklyn-district-attorney-kenneth-thompson-takes-on-wrongful-convictions-1407547937 (Brooklyn DA Kenneth Thompson’s conviction integrity unit has ordered the review of more than 100 prior convictions, 70 of which involved accusations that former Brooklyn Detective Louis Scarcella coerced confessions and tampered with witness statements). 47. See supra n.34 (discussing Rivera’s coerced confession at the hands of the Lake County police); Spencer Ackerman, “I Sat In That Place for Three Days, Man”: Chicagoans Detail Abusive Confinement Inside Police “Black Site”, THE GUARDIAN (Feb. 27, 2015), http://www.theguardian.com/us-news/2015/feb/27/chicago-abusive-confinment-homan-square (four African-Americans describe being detained for several days inside a police warehouse, where they were “shackled and interrogated,” denied access to counsel and forbidden from notifying anyone of their whereabouts); see also Miriam S. Gohara, A Lie For a Lie: False Confessions and the Case for Reconsidering the Legality of Deceptive Interrogation Techniques, 33 FORDHAM URB. L.J. 791, 794-95 (2006); Laure Magid, Deceptive Police Interrogation Techniques: How Far is Too Far?, 99 MICH. L. REV. 1168, 1168 (2001). 48. 92 percent of arrest warrants obtained by the Ferguson, Missouri Police Department were issued against African Americans, who as a group were 68 percent less likely than others to have their charges dismissed. See UNITED STATES DEPARTMENT OF JUSTICE-CIVIL RIGHTS DIVISION, INVESTIGATION OF THE FERGUSON POLICE DEPARTMENT, (Mar.
4, 2015), available at http://www.justice.gov/sites/default/files/opa/press-releases/attachments/2015/03/04/ferguson_police_department_report.pdf. 49. See Hsu, supra n.21. In 2013, Debra Milke was released after 23 years on Arizona’s death row based entirely on a supposed oral confession she had made to one Detective Saldate who was much later shown to be a serial liar.50 And then there is the case of Ricky Jackson, who spent 39 years behind bars based entirely on the eyewitness identification of a 12-year-old boy who saw the crime from a distance and failed to pick Jackson out of a lineup.51 At that point, “the officers began to feed him information: the number of assailants, the weapon used, the make and model of the getaway car.”52 39 years! For some victims of police misconduct, exoneration comes too late: Mark Collin Sodersten died in prison while maintaining his innocence.53 After his death, a California appellate court determined that Sodersten had been denied a fair trial because police had failed to turn over exculpatory witness tapes.54 It posthumously set aside the conviction, which no doubt reduced Sodersten’s time in purgatory. 11. Guilty pleas are conclusive proof of guilt. Many people, including judges, take comfort in knowing that an overwhelming number of criminal cases are resolved by guilty plea rather than trial.55 Whatever imperfections there may be in the trial and criminal charging process, they believe, are washed away by the fact that the defendant ultimately consents to a conviction. But this fails to take into account the trend of bringing multiple counts for a single incident—thereby vastly increasing the risk of a life-shattering sentence in case of conviction56—as well as the creativity of prosecutors in hatching up criminal cases where no crime exists57 and the overcriminalization of virtually every aspect of American life.58 It also ignores that 50.
Michael Keifer, Debra Milke Murder Charges Dismissed, AZ CENT. (Dec. 11, 2014), http://www.azcentral.com/story/news/local/phoenix/2014/12/11/%20milke-double-jeopardy-appeals/20253845/. 51. Radley Balko, This Week in Innocence: Ricky Jackson to be Released Tomorrow After 39 Years in Prison, WASH. POST (Nov. 20, 2014), http://www.washingtonpost.com/news/the-watch/wp/2014/11/20/this-week-in-innocence-ricky-jackson-to-be-released-tomorrow-after-39-years-in-prison/. Jackson was exonerated thanks to the dedicated efforts of the Ohio Innocence Project. See Ohio Innocence Project, University of Cincinnati College of Law, http://www.law.uc.edu/oip. 52. Exonerated Man Who Spent 39 Years in Prison Meets Accuser, CBS NEWS VIDEO (Jan. 5, 2015), http://www.cbsnews.com/videos/exonerated-man-who-spent-39-years-in-prison-meets-accuser/. 53. Laura Ernde, Accused Murderer Cleared Seven Months After Prison Death, DAILY J. (Jan. 18, 2007). 54. In re Sodersten, 53 Cal. Rptr. 3d 572 (Ct. App. 2007). 55. Judge Morris Hoffman, for example, cites to the fact that “almost all criminal defendants plead guilty” as support for the proposition that “the actual rate of wrongful convictions in the United States is vanishingly small.” See Morris B. Hoffman, The “Innocence” Myth, WALL ST. J., Apr. 26, 2007, at A19. But see Rakoff, supra n.34 (strongly objecting to the tendency to equate guilty pleas with actual guilt, noting that the current “prosecutor-dictated plea bargain system, by creating such inordinate pressures to enter into plea bargains, appears to have led a significant number of defendants to plead guilty to crimes they never actually committed”). As to the meaning of a 1 percent error rate, see infra pp. xiv-xv. 56. H. Mitchell Caldwell, Coercive Plea Bargaining: The Unrecognized Scourge of the Justice System, 61 CATH. U. L. REV. 63, 72-74 (2011). 57. See, e.g., Arthur Andersen LLP v. United States, 544 U.S.
696, 706 (2005) (reversing Arthur Andersen’s conviction of obstruction of justice under 18 U.S.C. §§ 1512(b)(2)(A) and (B), where the jury instructions, consistent with the government’s reading of the vaguely-worded statute, had all but erased a culpability requirement); United States v. Newman, 773 F.3d 438, 442, 448 (2d Cir. 2014) (pointing out “the doctrinal novelty of [the government’s] recent insider trading prosecutions” and reversing with prejudice two hedge fund managers’ convictions for securities fraud because the government “presented no evidence that [the managers] knew that they were trading on information obtained from insiders in violation of those insiders’ fiduciary duties”); United States v. Goyal, 629 F.3d 912, 921 (2010) (reversing a chief financial officer’s convictions of 15 counts of securities fraud and making false statements where “the government’s case suffered from a total failure of proof”) (internal quotation marks omitted); id. at 922 (Kozinski, C.J., concurring) (“[Goyal] is just one of a string of recent cases in which courts have found that federal prosecutors overreached by trying to stretch criminal law beyond its proper bounds. This is not the way criminal law is supposed to work.”) (citations omitted). 58. Justice Scalia criticized the overcriminalization of federal law in his dissent from denial of certiorari in Sorich v. United States, 555 U.S. 1204 (2009), a case in which the Seventh Circuit affirmed
many defendants cannot, as a practical matter, tell their side of the story at trial because they fear being impeached with prior convictions or other misconduct.59 And, of course, if the trial process is perceived as highly uncertain, or even stacked in favor of the prosecution, the incentive to plead guilty to some charge that will allow the defendant to salvage a portion of his life, becomes immense.60 If the prosecution offers a take-it-or-leave-it plea bargain before disclosing exculpatory evidence, the defendant may cave to the pressure, throwing away a good chance of an acquittal. 12. Long sentences deter crime. In the United States, we have over 2.2 million people behind bars.61 Our rate of approximately 716 prisoners per 100,000 people is the highest in the world, over 5 times higher than that of other industrialized nations like Canada, England, Germany and Australia.62 Sentences for individual crimes are also far longer than in other developed countries. For example, an individual convicted of burglary in the United States serves an average of 16 months in prison, compared with 5 months in Canada and 7 months in England.63 And the average prison sentence for assault in the United States is 60 months, compared to under 20 months in England, Australia and Finland.64 Incarceration is an immensely expensive enterprise. It is expensive for the taxpayers, as the average cost of housing a single prisoner for one year is approximately $30,000.65 A 20-year sentence runs into something like $600,000 in prison costs alone. Long sentences are also immensely hard on prisoners and cruel to their Chicago city employees’ convictions under the honest services mail fraud statute. The statute criminalizes the use of the mail or wire services to carry out a “scheme . . . to deprive another of the intangible right of honest services.” 18 U.S.C. § 1346.
In urging the Court to construe the statute more narrowly, Justice Scalia pointed out that the mail fraud statute “has been invoked to impose criminal penalties upon a staggeringly broad swath of behavior, including misconduct not only by public officials and employees but also by private employees and corporate fiduciaries”—for example, the convictions of “a local housing official who failed to disclose a conflict of interest,” “students who schemed with their professors to turn in plagiarized work” and “lawyers who made side-payments to insurance adjusters in exchange for the expedited processing of their clients’ pending claims.” Sorich, 555 U.S. at 1204 (Scalia, J., dissenting from denial of certiorari) (internal quotation marks omitted); see Harvey A. Silverglate, Three Felonies a Day: How the Feds Target the Innocent (2009) (illustrating the shady practices prosecutors have used in order to convict individuals under vaguely worded federal statutes for conduct no rational person would view as criminal); Alex Kozinski & Misha Tseytlin, You’re (Probably) a Federal Criminal, in IN THE NAME OF JUSTICE 43-56 (Timothy Lynch ed., Cato Institute 2009); Glenn Harlan Reynolds, Ham Sandwich Nation: Due Process When Everything Is a Crime, 113 COLUM. L. REV. Sidebar 102 (2013); George F. Will, When Everything Is a Crime, Wash. Post (Apr. 8, 2015), http://www.washingtonpost.com/opinions/when-everything-is-a-crime/2015/04/08/1929ab88-dd43-11e4-be40-566e2653afe5_story.html. 59. Tracey L. Meares, Rewards for Good Behavior: Influencing Prosecutorial Discretion and Conduct with Financial Incentives, 64 FORDHAM L. REV. 851, 863 (1995). 60. See John H. Blume & Rebecca K. Helm, The Unexonerated: Factually Innocent Defendants Who Plead Guilty, 100 CORNELL L. REV. 157 (2014); Bennett L. Gershman, Threats and Bullying by Prosecutors, 46 LOY. U. CHI. L.J. 327 (2014); see also infra n.73. 61. Roy Walmsley, World Prison Populations List (10th ed., Oct.
2013), INTERNATIONAL CENTRE FOR PRISON STUDIES 3, http://www.prisonstudies.org/sites/prisonstudies.org/files/resources/downloads/wppl_10.pdf (estimating there are 2,239,751 individuals, including pre-trial detainees, in American penal institutions).

62. Id. (Canada’s rate is 118 per 100,000; England’s is 148 per 100,000; Germany’s is 79 per 100,000; and Australia’s is 130 per 100,000).

63. See Adam Liptak, U.S. Prison Population Dwarfs That of Other Nations, N.Y. TIMES (Apr. 23, 2008), http://www.nytimes.com/2008/04/23/world/americas/23iht-23prison.12253738.html?pagewanted=all&_r=0.

64. FINDING DIRECTION: EXPANDING CRIMINAL JUSTICE OPTIONS BY CONSIDERING POLICIES OF OTHER NATIONS, JUST. POL’Y INST. 2 (Apr. 2011), http://www.justicepolicy.org/uploads/justicepolicy/documents/sentencing.pdf.

65. See NATHAN JAMES, THE BUREAU OF PRISONS (BOP): OPERATIONS AND BUDGET, CONG. RES. SERV. 14 (Mar. 4, 2014), available at http://fas.org/sgp/crs/misc/R42486.pdf; Christian Henrichson & Ruth Delaney, What Incarceration Costs Taxpayers, VERA INST. OF JUST. CENTER ON SENT’G AND CORRECTIONS (Feb. 19, 2012), http://www.vera.org/pubs/special/price-prisons-what-incarceration-costs-taxpayers (“total per-inmate cost averaged $31,286” in 2011).

families, as it’s usually very difficult for a prisoner to re-integrate into his family and community after very long prison sentences.66 We are committed to a system of harsh sentencing because we believe that long sentences deter crime and, in any event, incapacitate criminals from victimizing the general population while they are in prison.
And, indeed, the United States is enjoying an all-time low in violent crime rates, which would seem to support this intuition.67 But crime rates have been dropping steadily since the 1990s, and not merely in the United States but throughout the industrialized world.68 Our intuition about harsh sentences deterring crime may thus be misguided.69 We may be spending scarce taxpayer dollars maintaining the largest prison population in the industrialized world, shattering countless lives and families, for no good reason. As with much else in the law, the connection between punishment and deterrence remains mysterious.70 We make our decisions based on faith.

II

What I have listed above are some of the reasons to doubt that our criminal justice system is fundamentally just.71 This is not meant to be an exhaustive list, nor is it clear that all of these uncertainties would, on closer examination, be resolved against the current system. But there are enough doubts on a broad range of subjects touching intimately on the integrity of the system that we should be concerned. The National Registry of Exonerations has recorded 1576 exonerations in the United States since 1989.72 The year 2014 alone saw a record high of 125 exonerations, up from 91 the

66. See generally Jalila Jefferson-Bullock, The Time is Ripe to Include Considerations of the Effects on Families and Communities of Excessively Long Sentences, 83 UMKC L. REV. 73 (2014).

67. See Uniform Crime Reports: Crime in the United States 2013, FBI, http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2013/crime-in-the-u.s.-2013/violent-crime/violent-crime-topic-page/violentcrimemain_final (explaining that an estimated 1.16 million violent crimes occurred in the U.S., which is 12.3 percent below the 2009 level and 14.5 percent below the 2004 level).

68.
See Where Have All the Burglars Gone?, THE ECONOMIST (July 20, 2013), http://www.economist.com/news/briefing/21582041-rich-world-seeing-less-and-less-crime-even-face-high-unemployment-and-economic; see also Inimai M. Chettiar, The Many Causes of America’s Decline in Crime, THE ATLANTIC (Feb. 11, 2015), http://www.theatlantic.com/features/archive/2015/02/the-many-causes-of-americas-decline-in-crime/385364/.

69. Nor does putting more people behind bars necessarily lead to less crime. A recent report by the Brennan Center reveals that “incarceration has been decreasing[ly effective] as a crime fighting tactic since at least 1980,” as increased incarceration has had “no observable effect” on the nationwide decline in violent crimes in the 1990s and 2000s. See DR. OLIVER ROEDER ET AL., WHAT CAUSED THE CRIME DECLINE?, BRENNAN CENTER FOR JUST., 22-23 (Feb. 12, 2015), available at https://www.brennancenter.org/publication/what-caused-crime-decline. A recent study points to “prosecutors—more than cops, judges, or legislators—as the principal drivers of the increase in the prison population,” explaining that “[t]he real change is in the chances that a felony arrest by the police turns into a felony case brought by prosecutors.” See Jeffrey Toobin, The Milwaukee Experiment: What Can One Prosecutor Do About The Mass Incarceration of African-Americans?, THE NEW YORKER (May 11, 2015), http://www.newyorker.com/magazine/2015/05/11/the-milwaukee-experiment.

70. Compare VALERIE WRIGHT, DETERRENCE IN CRIMINAL JUSTICE: EVALUATING CERTAINTY VERSUS SEVERITY OF PUNISHMENT, THE SENT’G PROJECT 6-8 (2010), available at http://www.sentencingproject.org/doc/deterrence%20briefing%20.pdf (compiling several studies that conclude that longer sentences don’t lead to lower recidivism rates, and can even lead to higher rates), with FRANCESCO DRAGO ET AL., THE DETERRENT EFFECTS OF PRISON: EVIDENCE FROM A NATURAL EXPERIMENT, THE INST. FOR THE STUDY OF LAB. 2 (Jul.
2007), available at http://ftp.iza.org/dp2912.pdf (finding that in some cases, a longer sentence can reduce recidivism by 1.24 percent). According to Judge Rakoff, this uncertainty should not stop the judiciary from speaking out against the “evil” of mass incarceration. Jed S. Rakoff, Mass Incarceration: The Silence of the Judges, N.Y. REV. BOOKS (May 21, 2015), http://www.nybooks.com/articles/archives/2015/may/21/mass-incarceration-silence-judges.

71. There are similar reasons to doubt that our civil justice system is fundamentally just, but that’s a topic for another day.

72. The National Registry of Exonerations, http://www.law.umich.edu/special/exoneration/Pages/

year before, and there is reason to believe the trend will continue.73 Certainly the significant number of inmates freed in recent years as the result of various innocence projects and especially as a result of DNA testing in cases where the convictions were obtained in the pre-DNA era, should cause us to question whether the current system is performing as effectively as we’ve been led to believe. It’s no answer to say that the exonerees make up only a minuscule portion of those convicted. For every exonerated convict, there may be dozens who are innocent but cannot prove it. We can be reasonably confident that the system reaches the correct result in most cases, but that is not the test. Rather, we must start by asking how confident we are that every one of the 2.2 million people in prisons and jails across the country is in fact guilty. And if we can’t be sure, then what is an acceptable error rate? How many innocent lives and families are we willing to sacrifice in order to have a workable criminal justice system? If we put the acceptable error rate at 5 percent, this would mean something like 110,000 innocent people incarcerated across the country.
A 1 percent error rate would mean 22,000 innocent people—more or less the population of Nogales, Arizona—wrongly imprisoned. These numbers may seem tolerable unless, of course, you, your friend or loved one draws the short straw. Do we know how these numbers compare to the actual error rate? We have no idea. What we have is faith that our system works very well and the errors, when they are revealed, are rare exceptions.74 Much hinges on retaining this belief: our self image as Americans; the pride of countless judges and lawyers; the idea that we live in a just society; confidence in the power of reason and logic; the certainty that none of us or our loved ones will face the unimaginable nightmare of unjust imprisonment

detaillist.aspx (last visited Apr. 7, 2015). Of these exonerated individuals, 112 were sentenced to death, and 265 spent more than 20 years behind bars. The average time spent in prison was 9 years, with 40 percent imprisoned for more than 10 years. 80 percent were convicted by juries, 7 percent by judges and 12 percent pleaded guilty. 25 percent were exonerated at least in part by DNA evidence. The following factors contributed to their exonerations: mistaken witness identification (34% of exonerations); perjury or false accusation (55%); false confession (13%); defective or misleading forensic evidence (22%); and official misconduct (46%). Cases often involve more than one of these factors. See The National Registry of Exonerations, % Exonerations by Contributing Factor (last visited Apr. 7, 2015), http://www.law.umich.edu/special/exoneration/Pages/ExonerationsContribFactorsByCrime.aspx.

73. 47 of the 125 exonerations in 2014 involved defendants who had pleaded guilty. See University of Michigan, The National Registry of Exonerations, Exonerations in 2014, 3 (Jan. 27, 2015), https://www.law.umich.edu/special/exoneration/Documents/Exonerations_in_2014_report.pdf.
The exonerations were mostly concentrated in California, Texas, New York and Illinois. According to an in-depth study by SAN FRANCISCO magazine, between 1989 and 2004, more than 200 California inmates were freed after courts found they were wrongfully convicted. See Nina Martin, Innocence Lost, S.F. (Nov. 2004), http://deathpenalty.org/downloads/SFMag.pdf. Their stories are the stuff of nightmares. Take, for example, Gloria Killian, a 30-something former law student who had signed up to do freelance detective work for a coin shop owner. One day, an elderly coin collector was robbed and killed, and someone called the Sacramento police accusing “a law student named Gloria” of being involved. Nothing came of this accusation until a year later, when a repeat felon named Gary Masse was convicted of the murder and sentenced to life without parole. He named Killian as his accomplice and claimed she masterminded the robbery. The accusation stuck, and Killian was convicted of conspiracy and murder, and sentenced to 32 years to life. Masse got his sentence reduced to 25 years. Over a decade later, a new investigation uncovered evidence that Masse had entered into an agreement with prosecutors to testify against Killian in exchange for leniency—a fact never disclosed to the defense. The investigation also turned up a letter Masse sent to the DA soon after Killian was sentenced, in which he wrote, “I lied my ass off for you people.” A panel of our court reversed Killian’s conviction in 2002, at which point she had already lost 16 years of her life to prison. See Killian v. Poole, 282 F.3d 1204 (9th Cir. 2002). The prosecutor walked away with an admonishment from the California State Bar. See Martin, supra, at 10-11. 74. 
Judge Hoffman cautions against the assumption that “our criminal justice system [is] so dismal that a rightful conviction seems the exception and not the rule,” urging that we “look not only at the number of wrongfully convicted defendants, but also at the number of rightly convicted ones.” See Hoffman, supra n.55.

or execution; belief in the incomparable integrity and accuracy of our system of justice; faith that we have transcended medieval methods of conviction and punishment so that only those who are guilty are punished, and their punishment is humane and proportionate. There are, we are convinced, no Edmond Dantèses and no Château d’Ifs in America today. But what do we really know? We must reject out of hand the idea that the number of actual exonerations represents all of those who have been wrongly convicted. Convicted prisoners wishing to gain release on grounds of innocence face formidable hurdles.75 To begin with, they are in prison and thus unable to pursue leads the police might have missed; they have to rely on someone on the outside to do it, and that’s often difficult or impossible to accomplish. A prisoner’s access even to his counsel is severely restricted once he’s incarcerated. A loyal friend or relative might do it, but friends and even relatives often abandon defendants who are convicted, no matter how much they may protest their innocence. A few prisoners may obtain the help of an innocence project, but the work is labor-intensive, resources are scarce and manpower is limited, so innocence projects engage in triage, focusing on the most promising cases.76 Of course, it’s often difficult to tell whether a case is promising until you look closely at it, so a promising case can easily be overlooked. But the biggest problem is that new evidence is hard—and often impossible—to find. If it’s a physical crime, police secure the crime scene and seize anything that looks like it could be relevant.
The chance of going back years later and picking up new clues is vanishingly small. The trick then is to get whatever evidence the police have, assuming they didn’t destroy it or release it once it was clear that it wouldn’t be used at trial. If the crime is non-physical, such as fraud, child pornography or computer hacking, the police seize all the relevant computers, hard drives and paper records (including any exculpatory evidence the suspect may have there) and may well discard them after the conviction becomes final. For a brief period, DNA evidence helped exculpate defendants who were convicted in the pre-DNA era,77 but DNA often cannot help identify the true perpetrator because no sample of DNA was found or collected from the crime scene.78 A prisoner has to be exceedingly lucky to

75. See Brandon L. Garrett, Claiming Innocence, 92 MINN. L. REV. 1629, 1670-84 (2008).

76. See Steven A. Krieger, Why Our Justice System Convicts Innocent People, and the Challenges Faced by Innocence Projects Trying to Exonerate Them, 14 NEW CRIM. L. REV. 333, 367-77 (2011) (the average innocence project receives about 600 requests per year, but only has the bandwidth to effectively investigate around 100 cases per year).

77. Often over the dogged opposition of prosecutors who have no wish to have a victory snatched away. Take, for example, Ken Anderson, the Texas district attorney who succeeded in putting Michael Morton in prison for murder. See infra nn.80-82 and accompanying text. Morton was released after twenty-five years based on DNA test results linking the crime to another man, as well as exculpatory evidence that Anderson had withheld during trial.
For six years, Anderson’s successor, John Bradley, fought Morton’s repeated requests for DNA testing, explaining that “[o]nce a prosecutor has a case in which he or someone else has achieved a conviction where a body of people have been convinced beyond reasonable doubt someone is guilty and then sentenced them, the presumption becomes that that is a justified verdict that the prosecutor must defend.” See Brandi Grissom, A Tough Prosecutor Finds His Certitude Shaken by a Prisoner’s Exoneration, THE TEX. TRIB. (Nov. 18, 2011), http://www.texastribune.org/library/multimedia/john-bradley-texas-prosecutor-asserts-change-of-heart/. The prosecutors in Anthony Ray Hinton’s case took the same approach as Bradley: “Despite pleas by Mr. Hinton’s lawyers, who cited conclusions by newly enlisted specialists, the state refused for years to reconsider the evidence” that eventually led to his release after 30 years on death row. See Alan Blinder, Alabama Man Freed After Decades On Death Row, N.Y. TIMES (Apr. 3, 2015), http://www.nytimes.com/2015/04/04/us/anthony-ray-hinton-alabama-prison-freed-murder.html?mwrsm=Email&_r=1; see infra n.167.

78. See Richard A. Rosen, Innocence and Death, 82 N.C. L. REV. 61, 73 (2003) (“[F]or every defendant who is exonerated because of DNA evidence, there have been certainly hundreds, maybe thousands, who have been convicted of crimes on virtually identical evidence,” yet “[f]or these thousands of defendants . . . there was no physical evidence that could have been subjected to scientific scrutiny.”).

collect enough evidence to prove his innocence; most cannot hope to meet that standard.
I think it’s fair to assume—though there is no way of knowing—that the number of exculpations in recent years understates the actual number of innocent prisoners by an order, and probably two orders, of magnitude.79 Wrongful convictions are not merely unjust to the prisoner and his family, they often result in another injustice or series of injustices: When an innocent man is convicted, a guilty man is left free and emboldened to victimize others. The Michael Morton case provides a good example.80 Morton was convicted in 1987 for the 1986 beating murder of his wife. Twenty-five years later he was exonerated when DNA evidence pointed to another man, Mark Norwood, who was eventually convicted of killing Mrs. Morton. However, Norwood has now been charged with the similar beating-death of another woman, Debra Baker; that murder was committed a year after Morton was convicted of his wife’s murder.81 Norwood is awaiting trial for the Baker murder.82 Had police continued to investigate the Morton murder instead of shutting down the investigation once they decided that Michael Morton was the culprit, Debra Baker might still be alive. There’s another question the answer to which we must be reasonably confident: Of those that are guilty, can we be sure that substantial numbers are not spending far more time behind bars than is justified? The question of how much time prisoners spend behind bars is no less important than that of whether only the guilty are being locked up. The ability to pick up the threads of one’s life after three to five years in prison is quite different than after fifteen, twenty or twenty-five years. 
Aside from the brutalizing and often dehumanizing effect of long-term imprisonment,83 an inmate who is released after a lengthy prison term simply does not return to the same world he left behind: Children grow up; spouses find other partners; friends and acquaintances forget; job prospects disappear; life and work skills deteriorate.84 Shorter sentences also reduce the consequences of wrongful convictions. While no time behind bars can be justified for someone who is innocent,85 we must be especially careful before imposing life-altering sentences. By any measure, the United States leads the world in incarceration. In absolute terms, it has more prisoners than any other country. With just 5 percent of the world’s

79. See Samuel R. Gross et al., Rate of False Conviction of Criminal Defendants Who Are Sentenced to Death, 111 PROC. NAT’L ACAD. SCIS. 7230, 7230 (2014), http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4034186/ (acknowledging that “the great majority of innocent defendants remain undetected,” and conservatively estimating that at least 4.1 percent of those on death row are innocent).

80. See supra n.77; Innocence Project, Michael Morton, http://www.innocenceproject.org/cases-false-imprisonment/michael-morton.

81. Morton was convicted and sentenced to life in prison in February 1987, and Baker was murdered in January 1988. See Brandi Grissom, Mark Norwood Indicted in Second Austin Murder, THE TEX. TRIB. (Nov. 9, 2012), http://www.texastribune.org/2012/11/09/mark-norwood-faces-grand-jury-second-austin-murder/.

82. See Claire Osborn & Jazmine Ulloa, Mark Norwood, Sentenced to Life for Morton Murder, Pleads Not Guilty in Second Case, STATESMAN (Jan. 9, 2014), http://www.statesman.com/news/news/local/mark-norwood-sentenced-to-life-for-morton-murder-p/nchx2/; Jazmine Ulloa, Judge Delays Trial in the 1988 Killing of Debra Baker, STATESMAN (Jan. 12, 2015), http://www.statesman.com/news/news/crime-law/judge-delays-trial-in-the-1988-killing-of-debra-ba/njmTY/.
83. See Jefferson-Bullock, supra n.66, at 100-05.

84. See Craig Haney, The Psychological Impact of Incarceration: Implications for Postprison Adjustment, in PRISONERS ONCE REMOVED 46-48 (Jeremy Travis & Michelle Waul eds., The Urb. Inst. Press 2003).

85. Jails, which primarily house the legally innocent, treat their inmates no better than prisons do, and have their own share of horror stories. See Ryan Cooper, How Your Local Jail Became Hell: An Investigation, THE WEEK (Mar. 31, 2015), http://theweek.com/articles/540725/how-local-jail-became-hell-investigation.

population, we have almost a quarter of the world’s prisoners.86 China, with nearly 20 percent of the world’s population, has 16 percent of the world’s prisoners.87 Incarceration rates were not always this high in the United States. For the first three-quarters of the twentieth century, the rate was well under 250 per 100,000.88 Then, starting around 1980, incarceration rates started rising sharply with the advent of the war on drugs, mandatory minimum sentences and three-strikes laws.89 The difference in incarceration rates cannot be explained by higher crime rates in the United States. Crime rates here are roughly equivalent to Canada and in many categories lower than other countries.90 And the crime rate has been dropping in the United States, as in many other industrialized nations.91 Yet, U.S. sentences are vastly, shockingly longer than just about anywhere else in the world.92 There are reasons to doubt whether the length of prison sentences in this country is just. Although elected officials, regardless of party affiliation and political leaning, seem to favor Draconian sentences, and the public seems to support them in the abstract, it’s unclear how much popular support they enjoy when applied to individual defendants. U.S.
District Judge James Gwin of Ohio reported on an informal study he conducted involving 22 jury trials.93 He asked the jurors who had convicted the defendant to write down what each thought was the appropriate sentence. Judge Gwin found that the jurors’ recommended sentences were significantly lower than those recommended by the Sentencing Guidelines: “In several cases, the recommended median Guidelines range was more than 10 times greater than the median jurors’ recommendation. Averaged over more than 20 cases, jurors recommended sentences that were 37% of the minimum Guidelines recommended sentences and 22% of the median Guidelines recommended sentences.”94 Whether we are incarcerating the right people and for an appropriate length of time are important questions to which we do not have very good answers. We are taught early in our schooling that the criminal justice system is tilted heavily in favor of defendants, resolving all doubts in their favor. Movies and television reinforce this idea with countless stories of dedicated police and prosecutors bringing guilty people to justice,95 or of acquittals of the innocent because of the efforts of a dedicated lawyer or investigator.96 Our educational system spends little time pondering the fate

86. Liptak, supra n.63.

87. Walmsley, supra n.61.

88. See JUSTICE POLICY INSTITUTE, THE PUNISHING DECADE: PRISON AND JAIL ESTIMATES AT THE MILLENNIUM 4 (2000), available at http://www.justicepolicy.org/images/upload/00-05_rep_punishingdecade_ac.pdf (charting incarceration rates from 1900 onwards).

89. Id. Some states, like Maine and Minnesota, have stayed at the pre-1980s levels. Others, like Texas and Louisiana, have around 1000 inmates per 100,000 of population—which means that one out of every 100 people in those states are prisoners. See Liptak, supra n.63. Are people in Texas and Louisiana really three times worse than those in Maine and Minnesota?

90.
See Jan van Dijk, John van Kesteren & Paul Smit, CRIMINAL VICTIMISATION IN INTERNATIONAL PERSPECTIVE 43 (2007), http://www.unicri.it/services/library_documentation/publications/icvs/publications/ICVS2004_05report.pdf (ranking the top 15 countries by victimization rates, with Canada and the United States coming in 12th and 13th, respectively).

91. The Curious Case of the Fall in Crime, ECONOMIST (July 20, 2013), http://www.economist.com/news/leaders/21582004-crime-plunging-rich-world-keep-it-down-governments-should-focus-prevention-not.

92. FINDING DIRECTION: EXPANDING CRIMINAL JUSTICE OPTIONS BY CONSIDERING POLICIES OF OTHER NATIONS, supra n.64.

93. James S. Gwin, Juror Sentiment on Just Punishment: Do the Federal Sentencing Guidelines Reflect Community Values?, 4 HARV. L. & POL’Y REV. 173 (2010).

94. Id. at 187-88.

95. See, e.g., JFK (Warner Bros. 1991); Law & Order: Special Victims Unit (NBCUniversal Television Distribution 1999-2015).

96. See, e.g., CONVICTION (Fox Searchlight Pictures 2010); JUST CAUSE (Warner Bros. 1995); MY COUSIN VINNY (Twentieth Century Fox 1992); TO KILL A MOCKINGBIRD (Universal Pictures 1962).

of those unjustly convicted or those wasting their lives behind bars because of a punishment that far outstrips whatever evil they were convicted of committing. No Dumas, Hugo or Zola has risen among us to foment public sympathy to the plight of the unjustly imprisoned.

III

Lawyers and judges are inculcated with the notion that the system works well and there is nothing to worry about. And perhaps it’s true. But there are far too many uncertainties for us to be complacent. Criminal trials as we know them were developed centuries ago at a time when life and technology were very different. The process has remained essentially unchanged since time out of mind. While we have much experience with the process, we know very little about how well it works.
We tell ourselves that the system works, and we really believe it, but this is largely based on faith. When all is said and done, we have only a guess. Below I offer some suggestions on how the system might be improved and validated. I do not suggest how these changes are to be implemented: Some may require legislation; others a change in judicial practices; still others constitutional amendments. Nor do I insist that all my suggestions be implemented immediately. Some may deserve closer attention, and some should be delayed while others are accelerated. There may well be good reasons that my suggestions are unworkable, and perhaps others will come up with better ones. If my proposals raise controversy and opposition, leading to a spirited debate, I will have achieved my purpose.97

A. Juries

Juries matter. They obviously matter in the relatively few cases that are actually tried to them, but they also matter in the multitude of cases that are pled or settled. To the extent the jury is viewed as an unpredictable, erratic force, it increases the uncertainty of the outcome and thus considerably raises the stakes for the parties, especially criminal defendants. If a prosecutor can make a credible case that a jury might return a verdict calling for life without parole, he is very likely to extract a plea deal involving a “mere” 15- or 20-year sentence. Most judges, especially trial judges, express satisfaction with the operation of the jury system. I’ve heard judges say that they seldom or never think juries reach the wrong outcome. I am skeptical of such claims. To begin with, judges don’t know any better than anyone else what is the correct verdict in a case.
The most they can say is that they would have reached the same verdict as the jury.98 But judges are not usually called upon to make findings when they are presiding over a jury trial; their function is to determine whether there is sufficient evidence to support a guilty verdict, a process which presupposes that the prosecution’s witnesses are believed by the trier of fact. This is a very different and much less rigorous process than figuring out who’s lying and who’s telling the truth, and I doubt that judges routinely go through that process in parallel with the jury. I certainly don’t. Actual observation of behavior in the jury room is rare, but it does exist. As cameras have become smaller and less obtrusive over the last quarter century, we’ve

97. “When critics disagree, the artist is in accord with himself.” Oscar Wilde, THE PICTURE OF DORIAN GRAY, Preface (1891).

98. In fact, judges are far more likely to acquit than juries. See Andrew D. Leipold, Why Are Federal Judges So Acquittal Prone?, 83 WASH. U. L.Q. 151, 151 (2005) (“Statistically, federal judges are significantly more likely to acquit than a jury is—over a recent 14 year period, for example, the jury trial conviction rate was 84%, while the bench conviction rate was a mere 55%.”).

had several instances where we have been able to observe jury-room behavior.99 The results are not particularly reassuring.
There is at least one case of documented jury nullification, with every juror expressing the belief that the defendant was guilty yet acquitting him nevertheless.100 In another case, jurors misconstrued the judge’s denial of their request for a certain transcript to mean that they were not entitled to any transcripts from the trial.101 And in a high profile murder trial, one juror experienced aggressive pushback for expressing skepticism of the defendant’s guilt because the medical examiner had switched her diagnosis from accidental drug overdose to homicide only after listening to a tape recording where the defendant said “I got away with it.”102 The juror asked the judge (unsuccessfully) to excuse her from the jury because she was uncomfortable with being pressured, and eventually voted with the majority to convict the defendant while protesting that she was doing so in a “bullied manner.” Worse still, in deliberations after the penalty phase of the trial, a different juror expressed a complete change of heart from the jury’s guilty verdict the day before, emphatically maintaining that she never believed the defendant committed one of the two murders of which he was convicted. In short, “[e]ven with the camera rolling, jurors compromised on verdicts, allowed personality conflicts to interfere with the deliberations, and oversimplified the judge’s instructions.”103 Anecdotal accounts tend to support this view. I always debrief my jurors after they return a verdict (I’ve never had a hung jury) and try to get them to talk about what happened in the jury room. Some of the comments seem entirely rational, but much of what jurors describe looks like a fun-house mirror reflection of Twelve Angry Men. There was one case I remember where the jury acquitted despite what I thought was an iron-clad prosecution case. I was a bit shocked and entirely puzzled about what had happened.
When I debriefed the jury, I got somewhat muted responses from most of the jurors but one gentleman, who turned out to be the foreman, had very strong views. He thought the government was wasting taxpayer money in prosecuting this defendant who had been caught red-handed with a suitcase full of some 10,000 Ecstasy pills just imported from Europe. The foreman was a large man and quite vociferous—almost belligerent—about it. Of course, during voir dire I had asked the usual questions about whether any of the panel members had philosophical objections to our drug laws, and he had answered in the negative. The reality was different; he had strong objections to the war on drugs and managed to pull the jury with him. That one strong personality can dominate the jury room is consistent with my own experience. I’ve sat on two (state) juries. One of the cases was not close by any

99. TV cameras have entered the jury room on at least three occasions. First, in 1986, PBS aired a broadcast showing footage of deliberations in a Wisconsin criminal trial. Inside the Jury Room (PBS Frontline television broadcast 1986). Then, in 1997, CBS aired a 2-hour documentary consisting of footage of jury deliberations in four Arizona trials. Enter the Jury Room (CBS Reports television broadcast 1997). Most recently, in 2004, ABC aired a 7-part TV series following six homicide cases from the pretrial stage to the jury deliberations and final verdict. In the Jury Room (ABC television broadcast 2004).

100. The jurors in the Wisconsin trial all acknowledged early in the deliberations that they believed the defendant was guilty, but concluded that the only way they could achieve a just result was to ignore the law. See Inside the Jury Room (PBS Frontline television broadcast 1986); Margaret E. Guthrie, Film Takes an Inside Look at Deliberations of Jurors, NAT’L L.J. (Apr. 14, 1986).

101.
See Enter the Jury Room (CBS Reports television broadcast 1997); David Schneider, Jury Deliberations and the Need for Jury Reform: An Outsider’s View, 36 JUDGES’ J. 23, 31, 53 (1997). 102. In the Jury Room (ABC television broadcast 2004) (following the deliberations in State of Ohio v. Mark Ducic). The foreman, for example, accused the juror of “fighting logic.” 103. Diane E. Courselle, Struggling With Deliberative Secrecy, Jury Independence, and Jury Reform, 57 S.C. L. REV. 203, 229 (2005) (discussing the jury deliberations in the ABC In the Jury Room series). 44 GEO. L.J. ANN. REV. CRIM. PROC (2015) xix measure, but the other one hinged on the testimony of a single witness, as there was no physical evidence whatsoever. The police had not even managed to recover a large bag of coins that the accusing witness claimed he had handed over to the defendant during a store holdup, even though the defendant was apprehended within 20 minutes of the robbery, just a couple of blocks away. Having been elected foreman, I spoke after every one of my fellow jurors had expressed the view that the defendant was guilty. I reminded my colleagues of the prosecution’s heavy burden of proof and questioned whether the complaining witness’s identification could be trusted given the missing coins. If the defendant was, in fact, the perpetrator, he couldn’t have spent a bag of assorted coins in the time it took to apprehend him, and he couldn’t easily have hidden it when he saw the police approaching. And wouldn’t he have gotten much farther from the scene of the crime if he were carrying a bag of stolen coins? One by one, all but one of the other jurors switched their votes to acquit. The one exception proved impossible to budge so we eventually asked the judge to declare a mistrial, which he did. 
The simple truth is that our confidence in juries rests largely on faith, and our processes are not designed to help us improve the functioning of the jury because there is no systematic feedback mechanism to help us figure out what works and what doesn’t. I’ve recently suggested that we introduce cameras into the jury room,104 and I will not rehash the arguments I’ve advanced in support. Suffice to say that viewing what juries do in actual cases will give us a much better understanding of jury behavior and provide valuable information for different techniques in presenting evidence, instructing juries and jury management. Seeing what juries do in actual cases can also ameliorate or eliminate the endless speculation about which trial errors are harmless and which are prejudicial. Why shoot in the dark when a man’s liberty or life is at stake? The same is true where there is a claim of improper juror conduct. In such circumstances, the trial judge will hear conflicting accounts about what happened in the jury room. Wouldn’t it be better to see and hear what actually transpired? Videos of jury deliberations could be sealed and preserved for viewing by researchers only after the case is final. Or they could be made available to the trial judge and a reviewing court as needed to resolve questions involving the jury’s conduct. Or they could be made available to the lawyers immediately after the verdict. Disclosure could be implemented incrementally over time and rolled back if the process is found to interfere with the jury’s function. But we need to get a close look as to what’s going on in the jury room before we can even begin the process of meaningful reform. In the meantime—or in addition—I offer the following suggestions for reform: 1. Give jurors a written copy of the jury instructions. Jury instructions are often lengthy and difficult to follow. 
Jurors are expected to absorb them by listening, which is probably the worst way to learn new and complex subject matter.105 Many judges try to ameliorate this problem by sending a copy of the instructions into the jury room when the panel retires to deliberate, but some judges refuse to do so. It should be reversible error for a judge to fail to send a full set of jury instructions with the jury when it retires to deliberate. Pre-instructing the jury on key concepts and giving them those instructions in writing is also a good idea. 2. Allow jurors to take notes during trial and provide them with a full trial 104. See Alex Kozinski & John Major, Why Putting Cameras in the Jury Room Is Not as Crazy as You Think, DUKE JUDICATURE (forthcoming 2015). 105. See Michael A. Cohen et al., Auditory Recognition Memory Is Inferior to Visual Recognition Memory, 106 PROC. NAT’L ACAD. SCIS. 6008, 6008 (Apr. 7, 2009). 44 GEO. L.J. ANN. REV. CRIM. PROC (2015)xx transcript. Most judges now allow note-taking and provide writing materials for the jury to use, but a minority refuse to do so. This should be reversible error. Consulting notes during deliberations is immensely useful when the jurors’ memories differ as to what a witness has said. Forcing jurors to rely on their recollections alone exacerbates the distorting memory effects discussed above. In fact, I would go a step farther and give jurors transcripts of the proceedings to consult during deliberations. This was not possible when transcripts had to be transcribed laboriously by hand. But real-time transcripts are now pretty much standard and available for the judge and lawyers to consult while the trial is going on. I can see no justification for keeping jurors in the dark. 3. Allow jurors to discuss the case while the trial is ongoing. Most jury trials now start with a stern admonition that jurors not discuss the case until they are sent out to deliberate. 
It’s unclear why we do this except that we’ve always done it that way.106 When I served as a juror, this restriction seemed unnatural and counterproductive. My guess is that it exacerbates the distorting effects of memory. Allowing jurors to discuss what they’ve heard could give them a chance to express doubts and to remind each other of the need to keep an open mind. 4. Allow jurors to ask questions during the trial. I’ve been doing this for some years in civil cases and it seems to work well. I ask jurors to put any questions in writing and hand them up to me. I then share these questions with the lawyers and let one or both use them during their examinations. Other techniques are possible, including having jurors pose questions to the witnesses directly and letting the lawyers follow up in light of the answers. 5. Tell jurors up-front what’s at stake in the case. In most jurisdictions, jurors in non-capital cases are not told what the likely punishment will be if the defendant is convicted. In fact, we tell jurors not to consider punishment in deciding guilt. I don’t understand why this is appropriate. In making most life decisions, we consider the consequences in determining how much effort to put into deciding and the degree of confidence we must feel before we go forward. Whether to get married or have a risky operation obviously requires a greater psychological commitment than choosing between Starbucks and Peets. Jurors should be told the gravity of the decision they are making so they can take it into account in deciding whether to convict or acquit. As representatives of the community where the defendant committed his crime, the jury should be allowed to make the judgment of whether the punishment is too severe to permit a conviction. Having to confront the jury with the severity of the punishment they are seeking to extract may well deter prosecutors from using overcharging as a bargaining tool. 6. Give jurors a say in sentencing. 
Except for capital cases, we have turned our sentencing process over entirely to experts and professionals. We have mandatory minimums, sentencing guidelines, probation officers and judges at all levels involved in the decision, but we studiously ignore the views of the very people who heard the evidence and are given the responsibility to determine guilt or innocence while reflecting the values of the community in which the offense occurred. This is a system only a lawyer could love. Jurors should be instructed on the range of punishments authorized by law and, if they find the defendant guilty, entrusted to weigh in on the appropriate sentence within that range. And I would make that the absolute upper limit of what punishment the judges actually impose, overriding any sentencing guidelines, mandatory minimums or their own considered judgment.
106. See David A. Anderson, Let Jurors Talk: Authorizing Pre-Deliberation Discussion of the Evidence During Trial, 174 MIL. L. REV. 92, 94-95, 121-24 (2002) (chronicling the history of the prohibition against pre-deliberation discussion and concluding that the rule doesn’t make sense and should be abolished).
B. Prosecutors
Prosecutors hold tremendous power, more than anyone other than jurors, and often much more than jurors because most cases don’t go to trial. Prosecutors and their investigators have unparalleled access to the evidence, both inculpatory and exculpatory, and while they are required to provide exculpatory evidence to the defense under Brady, Giglio, and Kyles v. Whitley, it is very difficult for the defense to find out whether the prosecution is complying with this obligation. Prosecutors also have tremendous control over witnesses: They can offer incentives—often highly compelling incentives—for suspects to testify. 
This includes providing sweetheart plea deals to alleged co-conspirators and engineering jail-house encounters between the defendant and known informants. Sometimes they feed snitches non-public information about the crime so that the statements they attribute to the defendant will sound authentic.107 And, of course, prosecutors can pile on charges so as to make it exceedingly risky for a defendant to go to trial. There are countless ways in which prosecutors can prejudice the fact-finding process and undermine a defendant’s right to a fair trial.108 This, of course, is not their job. Rather, as the Supreme Court has held, “[A prosecutor] is in a peculiar and very definite sense the servant of the law, the twofold aim of which is that guilt shall not escape or innocence suffer. He may prosecute with earnestness and vigor—indeed, he should do so. But, while he may strike hard blows, he is not at liberty to strike foul ones.”109 All prosecutors purport to operate just this way and I believe that most do. My direct experience is largely with federal prosecutors and, with a few exceptions, I have found them to be fair-minded, forthright and highly conscientious.110 But there are disturbing indications that a non-trivial number of prosecutors—and sometimes entire prosecutorial offices— engage in misconduct that seriously undermines the fairness of criminal trials. The misconduct ranges from misleading the jury,111 to outright lying in court112 and tacitly 107. Recently, California Superior Court Judge Thomas Goethals removed the Orange County DA’s office from a high profile murder case after the prosecutors had shown a “chronic failure” to comply with orders to turn over evidence with respect to how defendant Scott Dekraai was assigned a cell next to a prolific jailhouse informant as part of a larger scheme to extract false confessions. People v. Dekraai, Supplemental Ruling 12ZF0128 (Cal. Sup. Ct. Mar. 
12, 2015), http://big.assets.huffingtonpost.com/SUPPLEMENTALRULINGDekraai03122015.pdf; see Paloma Esquivel, D.A. Ran Illegal Snitch Operation in O.C. Jail, Attorneys Say, L.A. TIMES (Feb. 27, 2014), http://articles.latimes.com/2014/feb/27/local/la-me-ln-jailhouse-informant-operation-20140227. Indeed, last year, it was revealed that “two of the most prolific jailhouse informants in Orange and Los Angeles counties, Raymond Cuevas and Jose Paredes[, had] befriended suspects in jail and collected information in more than 30 criminal cases” in exchange for over $150,000 from local law enforcement authorities. See Tony Saavedra, Money, Cable TV, Food Delivery: How Mexican Mafia Snitches Lived Like Kings Behind Bars, O.C. REG. (Nov. 23, 2014), http://www.ocregister.com/articles/cuevas-643108-paredes-informants.html; see also REPORT OF THE 1989-90 LOS ANGELES COUNTY GRAND JURY: INVESTIGATION OF THE INVOLVEMENT OF JAIL HOUSE INFORMANTS IN THE CRIMINAL JUSTICE SYSTEM IN LOS ANGELES COUNTY 28, available at http://perma.cc/S2MB-LTEV (“Copies of arrest reports, case files, and photographs of victims are shown to informants.”); Tracey Kaplan, Santa Clara County: Ex-Jailer Says Planting Informants Was Routine, SANTA CRUZ SENTINEL (Mar. 1, 2015), http://www.santacruzsentinel.com/general-news/20150301/santa-clara-county-ex-jailer-says-planting-informants-was-routine.
108. See, e.g., Peder B. Hong, Summation at the Border: Serious Misconduct in Final Argument in Criminal Trials, 20 HAMLINE L. REV. 43, 45-55 (1996) (outlining 19 forms of serious misconduct).
109. Berger v. United States, 295 U.S. 78, 88 (1935).
110. I was less impressed with the prosecutor in one of the two cases where I participated as a juror. I thought he engaged in unfair tactics that I would not have allowed, had I been the judge.
111. See, e.g., United States v. Kojayan, 8 F.3d 1315, 1322 (9th Cir. 
1993) (AUSA strongly implied that a participant in the crime had not entered into a cooperation agreement, knowing full well that he had). 112. See, e.g., Sidney Powell, Federal Judge Blasts Yet Another Federal Prosecutor for Lying to the Court, OBSERVER (Dec. 9, 2014), http://observer.com/2014/12/federal-judge-blasts-yet-another-federal- 44 GEO. L.J. ANN. REV. CRIM. PROC (2015)xxii acquiescing or actively participating in the presentation of false evidence by police.113 Prosecutorial misconduct is a particularly difficult problem to deal with because so much of what prosecutors do is secret. If a prosecutor fails to disclose exculpatory evidence to the defense, who is to know? Or if a prosecutor delays disclosure of evidence helpful to the defense until the defendant has accepted an unfavorable plea bargain, no one will be the wiser. Or if prosecutors rely on the testimony of cops they know to be liars, or if they acquiesce in a police scheme to create inculpatory evidence, it will take an extraordinary degree of luck and persistence to discover it—and in most cases it will never be discovered. There are distressingly many cases where such misconduct has been docu- mented,114 but I will mention just three to illustrate the point. The first is United States v. Stevens,115 the prosecution of Ted Stevens, the longest serving Republican Senator in history. Senator Stevens was charged with corruption for accepting the services of a building contractor and paying him far below market price—essentially a bribe. The government’s case hinged on the testimony of the contractor, but the government failed to disclose the initial statement the contractor made to the FBI that he was probably overpaid for the services. 
The government also failed to disclose that the contractor was under investigation for unrelated crimes and thus had good reason to curry favor with the authorities.116 Stevens was convicted just a week before he stood for re-election and in the wake of the conviction, he was narrowly defeated, changing the balance of power in the Senate.117 The government’s perfidy came to light when a brave FBI agent by the name of Chad Joy blew the whistle on the government’s knowing concealment of exculpatory evidence. Did the government react in horror at having been caught with prosecutor-for-lying-to-the-court/ (AUSA in United States v. Seggerman, No. 10-cr-948 (S.D.N.Y.) was “lambasted” by Judge Kevin Duffy for being caught in “a flat out lie”—namely, by telling the court that press releases were not issued when they indeed had been). 113. See, e.g., Gathers v. United States, Nos. 09-CO-422, 11-CO-1676, 11-CO-1677, 12-CO-1411, 12-CO-1412, at 17-18 (D.C. Ct. App. Oct. 23, 2014), http://www.dccourts.gov/internet/documents/09-CO- 422.pdf (granting a new trial because the government presented “plainly false evidence highly prejudicial to the outcome” and “knew or should have known of the falsity, however belatedly this falsity may have come to the forefront”). 114. See The Open File Blog, http://www.prosecutorialaccountability.com/ (chronicling nationwide instances of prosecutorial misconduct); Registry Database, CENTER FOR PROSECUTOR INTEGRITY, http://www. prosecutorintegrity.org/registry/database/. The former governor of Illinois just pardoned a prisoner who had served 22 years because the prosecution was tainted with numerous flaws, including prosecutorial misconduct. See Nicholas Schmidle, Freedom for Tyrone Hood, THE NEW YORKER (Jan. 13, 2015), http://www.newyorker.com/news/news-desk/freedom-tyrone-hood. 115. No. 08-cr-231 (EGS), 2009 WL 6525926 (D.D.C. Apr. 7, 2009). 116. 
The whole sordid episode, as well as several others, are detailed in Sidney Powell’s powerful book, Licensed to Lie. See SIDNEY POWELL, LICENSED TO LIE 190-201 (2014). 117. This wasn’t the only prosecution that has had profound effects on American politics. “A revelation in journalist Judith Miller’s new memoir, ‘The Story: A Reporter’s Journey,’ exposes unscrupu- lous conduct by Special Counsel Patrick J. Fitzgerald in the 2007 trial of I. Lewis ‘Scooter’ Libby.” Berkowitz, supra n.31. According to Berkowitz, “It is painful to contemplate how many lives—American and Iraqi—might have been spared had Mr. Libby, the foremost champion within the White House in 2003 of stabilizing Iraq through counterinsurgency operations, not been sidelined and eventually forced to resign because of Mr. Fitzgerald’s multiyear investigation and relentless federal prosecution.” Id. Overly aggressive prosecution also wrecked the political career of longtime Iowa state legislator Henry Rayhons. The Attorney General of Iowa charged 78-year-old Rayhons with rape for having sex with his own wife, who was afflicted with Alzheimer’s. By the time the jury acquitted Rayhons, he had withdrawn from the re-election race for a seat he had held for 18 years. See Eugene Volokh, 78-Year-Old Iowa Ex-Legislator Acquitted of Having Nonconsensual Sex with His Wife, Who was Suffering from Alzhei- mer’s, WASH. POST (Apr. 23, 2015), http://www.washingtonpost.com/news/volokh-conspiracy/wp/2015/04/ 23/78-year-old-iowa-ex-legislator-acquitted-of-having-nonconsensual-sex-with-his-wife-who-was- suffering-from-alzheimers. 44 GEO. L.J. ANN. REV. CRIM. PROC (2015) xxiii its hands in the cookie jar? Did Justice Department lawyers rend their garments and place ashes on their head to mourn this violation of their most fundamental duty of candor and fairness? No way, no how. 
Instead, the government argued strenuously that its ill-gotten conviction should stand because boys will be boys and the evidence wasn’t material to the case anyway.118 It was only the extraordinary persistence and the courageous intervention of District Judge Emmet Sullivan, who made it clear that he was going to dismiss the Stevens case and then ordered an investigation of the government’s misconduct119 that forced the Justice Department to admit its malfeasance—what else could it do?—and move to vacate the former senator’s conviction. Instead of contrition, what we have seen is Justice Department officials of the highest rank suffering torn glenoid labrums from furiously patting themselves on the back for having “done the right thing.”120 The second case comes from my own experience. The defendant was Debra Milke, 118. See POWELL, supra n.116, at 210 (Judge Emmet Sullivan observed: “Not only did the government seek to keep [Chad Joy’s] complaint a secret, but the government claimed that the allegations had nothing to do with the verdict and no relevancy to the defense; that the allegations could be addressed by the Office of Professional Responsibility’s investigation; and that any misconduct had already been addressed and remedied during the trial”). Lack of materiality is the Justice Department’s standard defense when it is caught committing a Brady violation. See DOJ Defends FBI Deputy Director Andrew Weissmann Against Serious Ethics Charges Pending in NY, SEEKING JUST. (Apr. 26, 2013), http://seeking-justice.org/doj-defends-fbi-deputy-director-andrew-weissmann-against-serious-ethics-charges-pending-in-ny/ (the “entire . . . defense [was based on] the claim that the Rules of Professional Conduct require prosecutors to disclose to the defense team only information that is both favorable and material”) (emphasis and internal quotation marks omitted). 
Just as they did in Stevens, the prosecutors refused to acknowledge that they had a duty to turn over exculpatory evidence in United States v. Brown, C.R. No. H-03-363, 2010 WL 3359471 (S.D. Tex. Aug. 2010), affirmed by 650 F.3d 581 (5th Cir. 2011). In Brown, the government secured a paper-thin conviction against a Merrill Lynch executive for perjury and obstruction of justice for his grand jury testimony regarding the company’s involvement in a Nigerian barge transaction—all the while suppressing evidence showing that the executive had testified truthfully. See POWELL, supra n.116, at 376-77. 119. See Henry F. Schuelke III, Special Counsel, Report to Hon. Emmet G. Sullivan of Investigation Conducted Pursuant to the Court’s Order, dated Apr. 7, 2009, In re Special Proceedings, No. 1:09-mc-00198- EGS (D.D.C. Mar. 15, 2012), available at http://legaltimes.typepad.com/files/Stevens_report.pdf. 120. It’s not clear that the Justice Department learned much from the Stevens debacle, as it refused to admit that the exculpatory evidence it suppressed in the Stevens case was material to two other prosecutions that stemmed from the same investigation and involved overlapping issues and witnesses. See United States v. Kohring, 637 F.3d 895 (9th Cir. 2010); United States v. Kott, 423 Fed. App’x 736 (9th Cir. 2011); see also POWELL, supra n.116, at 231. Holding that the prosecution had yet again violated Brady by failing to disclose the very evidence deemed material in the Stevens case, a panel of my court vacated both defendants’ convictions and remanded for a new trial. 
Judge Betty Fletcher lambasted the prosecution’s “flagrant, willful bad-faith misbehavior” as “an affront to the integrity of our system of justice” and found “[t]he prosecution’s refusal to accept responsibility for its misconduct [] deeply troubling and indicat[ive] that a stronger remedy is necessary to impress upon it the reprehensible nature of its acts and omissions.” Kohring, 637 F.3d at 914 (Fletcher, J., concurring in part and dissenting in part); see Kott, 423 Fed. App’x at 738 (Fletcher, J., concurring in part and dissenting in part) (“Despite their assurances that they take this matter seriously, the government attorneys have attempted to minimize the extent and seriousness of the prosecutorial misconduct and even assert that Kott received a fair trial . . . . The government’s stance on appeal leads me to conclude that it still has failed to fully grasp the egregiousness of its misconduct, as well as the importance of its constitutionally imposed discovery obligations.”). A similar case was recently reported from Cuyahoga County, Ohio. Cuyahoga Common Pleas Judge Nancy Russo ruled that three men who had spent almost 20 years in prison for a 1995 murder should be released pending a new trial, explaining: “A review of the evidence firmly supports the conclusion that [former Cuyahoga County prosecutor] Carmen Marino maliciously inserted himself into a criminal proceeding, and that he also sought to suppress evidence from the defendants, that he concealed public records from the citizenry, and that he subverted the process of justice, thereby violating each of the defendants’ individual rights to a fair trial.” See OH: The Long Shadow of Misconduct in Cuyahoga County, The Open File Blog (Apr. 17, 2015), http://www.prosecutorialaccountability.com/oh-the-long-shadow-of-misconduct-in-cuyahoga-county/. The current Cuyahoga County chief prosecutor responded that Marino had done nothing wrong. Id.
who spent 23 years on Arizona’s death row after a conviction and sentence obtained in 1990 based on an oral confession she supposedly made to Phoenix police detective Armando Saldate Jr. as a result of a 20-minute interrogation.121 No one was present in the room with Milke except Saldate, who refused to record the session, despite his supervisor’s admonition that he do so.122 When the session ended, Saldate came out with nothing in writing—not even a Miranda waiver—and claimed Milke had confessed; Milke immediately and steadfastly denied it. The jury believed Saldate, but what the prosecution failed to disclose is that Saldate had a long and documented history of lying in court; he also had a serious disciplinary infraction bearing on his credibility: He had sought to extort sex from a lone female motorist and then lied about it when she reported the incident.123 It is not difficult to imagine that a jury may have been skeptical of Saldate’s testimony that Milke confessed, had it known about his track record. But the Maricopa County District Attorney’s office did not disclose this information, although it was party to many of the proceedings where Saldate had been found to be a liar. The evidence remained hidden for two decades until an unusually dedicated team of lawyers and investigators124 spent hundreds of hours digging through all of the criminal prosecutions in Maricopa County during the era when Saldate had been an investigator. It winnowed down those cases and focused on those where Saldate provided evidence.125 And the state doggedly refused to turn over Saldate’s disciplinary record until forced to disgorge it by an order of the district judge who considered Milke’s federal habeas petition. 
After we vacated the conviction and gave Arizona a chance to re-try Milke, the Arizona Court of Appeals barred any re-trial in an opinion so scathing it made the New York Times.126 The Court of Appeals described the “long course of Brady/ Giglio violations in this case” as a “flagrant denial of due process” and “a severe stain on the Arizona justice system”—one that it hoped would “remain unique in the history of Arizona law.”127 The Arizona Supreme Court recently denied the state’s petition for review,128 so the Court of Appeals decision stands. Maricopa County Attorney Bill Montgomery lamented that “[t]he denial of [the] petition for review is a dark day for Arizona’s criminal justice system.”129 121. See Michael Keifer, supra n.50. 122. See Milke v. Ryan, 711 F.3d 998, 1002 (9th Cir. 2013). 123. See id. at 1007. 124. This team included Milke’s attorney Anders Rosenquist, investigator Kirk Fowler and a dozen Arizona State University law students. See Steve Krafft, Debra Milke Case: Researchers Discovered Detective’s History of Misconduct, FOX 10 NEWS (Sept. 21, 2013), http://www.fox10phoenix.com/story/ 23447616/2013/09/16/debra-milke-case-research-team-discovered-detectives-misconduct. 125. This gargantuan effort is related in detail in our opinion. See Milke, 711 F.3d at 1018. 126. See Arizona: No Retrial for Woman Freed from Death Row, N.Y. TIMES (Dec. 11, 2014), http://www.nytimes.com/2014/12/12/us/arizona-no-retrial-for-woman-freed-from-death-row.html?_r1. 127. Milke v. Mroz, 339 P.3d 659, 665-66, 668 (Ariz. App. Ct. 2014). See Jonathan Abel, Buoying Brady’s Burden, DAILY J. (Mar. 19, 2015) (critiquing the prosecutorial misconduct in Milke’s case as “a symptom of a much larger problem,” particularly with respect to prosecutors’ reluctance to disclose Brady evidence concerning police misconduct). 128. Milke v. Mroz, CV-15-0016-PR (Mar. 17, 2015), available at https://perma.cc/QJE7-3FTN?typepdf. 129. 
News Release: County Attorney Comments on Arizona Supreme Court Ruling in State v. Milke (Mar. 17, 2015), http://www.maricopacountyattorney.org/newsroom/news-releases/2015/2015-03-17- County-Attorney-Comments-on-ASC-Ruling-in-State-v-Milke.html. Montgomery also had unkind things to say about our opinion, but we can live with that; life tenure is a wonderful thing. Less than a month later, Montgomery filed a motion to depublish the Court of Appeals’ decision barring re-trial, which the Arizona Supreme Court denied. Milke v. Mroz, CV-15-0016-PR (Apr. 21, 2015), available at http://www.azcourts.gov/Portals/21/MinutesCurrent/Mot_042115.pdf. Motions to depublish opinions that disclose prosecutorial misconduct, or at least to modify them to delete the name of the offending prosecutor, are common. See, e.g., infra n.193 (discussing the government’s motion in United 44 GEO. L.J. ANN. REV. CRIM. PROC (2015) xxv The third case is unfolding as I write these words. It involves the prosecution in Orange County of Scott Dekraai, who was convicted of having shot several people at a hair salon and is facing a capital penalty-phase trial. The prosecution presented evidence from a jailhouse informant, Fernando Perez, whom Dekraai had purportedly confessed to. 
It turns out that Perez was a serial informant who had presented similar confessions.130 Defense counsel challenged the informant, and Superior Court Judge Thomas Goethals ordered the prosecution to produce evidence bearing on this claim.131 He eventually found that the Orange County District Attorney’s office had engaged in a “chronic failure” to disclose exculpatory evidence pertaining to a scheme run in conjunction with jailers to place jailhouse snitches known to be liars near suspects they wished to incriminate, effectively manufacturing false confessions.132 The judge then took the drastic step of disqualifying the Orange County District Attorney’s office from further participation in the case.133 But this result came only after public defender Scott Sanders “wasted two years uncovering government misconduct, time that he could have spent preparing Dekraai’s defense against the death penalty.”134 Pulling an elephant’s teeth is surely easier than extracting exculpatory evidence from an unwilling prosecution team. These cases are hardly unique or isolated. But they illustrate that three ingredients must be present before we can be sure that the prosecution has met its Brady obligations under the law applicable in most jurisdictions. First, you must have a highly committed defense lawyer with significant resources at his disposal. Second, you must have a judge who cares and who has the gumption to hold the prosecutor’s feet to the fire when a credible claim of misconduct has been presented. And, third, you need a great deal of luck, or the truth may never come out. The misconduct uncovered in the Milke and Dekraai cases seems to implicate many other cases where criminal defendants are spending decades in prison. We can only speculate how many others are wasting their lives behind bars because they lacked the right lawyer or the right judge or the luck needed to uncover prosecutorial misconduct. 
States v. Kojayan).
130. See Paloma Esquivel, Seal Beach Shooting Case Casts Spotlight on Jailhouse Informants, L.A. TIMES (Apr. 20, 2014), http://www.latimes.com/local/la-me-oc-informant-20140421-story.html#page1.
131. This courage did not go unpunished. Since February 2014, around the time that Judge Goethals began looking into prosecutors’ mishandling of informant-related evidence, the Orange County DA’s office has asked to disqualify him from 57 other criminal cases. See Christopher Goffard, O.C. Prosecutors Steering Cases Away from Judge Thomas Goethals, L.A. TIMES (Mar. 13, 2015), http://www.latimes.com/local/orangecounty/la-me-jailhouse-snitch-20150314-story.html.
132. People v. Dekraai, Supplemental Ruling 12ZF0128 (Cal. Sup. Ct. Mar. 12, 2015), available at http://big.assets.huffingtonpost.com/SUPPLEMENTALRULINGDekraai03122015.pdf.
133. See supra n.7; Christopher Goffard, Orange County D.A. Is Removed from Scott Dekraai Murder Trial, L.A. TIMES (Mar. 12, 2015), http://www.latimes.com/local/orangecounty/la-me-jailhouse-snitch-20150313-story.html. The California Attorney General’s Office has since appealed Judge Goethals’s ruling, and I, of course, express no view as to how the case should be decided.
134. Goffard, supra n.133.

While most prosecutors are fair and honest, a legal environment that tolerates sharp prosecutorial practices gives important and undeserved career advantages to prosecutors who are willing to step over the line, tempting others to do the same. Having strict rules that prosecutors must follow will thus not merely avoid the risk of letting a guilty man free to commit other crimes while an innocent one languishes his life away, it will also preserve the integrity of the prosecutorial process by shielding principled prosecutors from unfair competition from their less principled colleagues. Here are some potential reforms that would help achieve these goals:

1. Require open file discovery. If the prosecution has evidence bearing on the crime with which a defendant is being charged, it must promptly turn it over to the defense. North Carolina adopted such a rule by statute after Alan Gell was convicted of murder and sentenced to death, even though the prosecution had statements of 17 witnesses who reported to have seen the victim alive after Gell was incarcerated—evidence that the prosecution failed to disclose until long after trial.135 Three years after its passage, the law forced disclosure of evidence that eventually exonerated three Duke lacrosse players who were falsely accused of rape—and led to the defeat, disbarment and criminal contempt conviction of Durham District Attorney Mike Nifong.136 Prosecutors were none too happy with the law and tried hard to roll it back in 2007 and again in 2012, but the result was an even stronger law that applies not only to prosecutors but to police and forensic experts, as well it should.137

A far weaker law was proposed by several U.S. Senators following the disgraceful prosecutorial conduct during the Stevens case. The law would require prosecutors to disclose all information “that may reasonably appear to be favorable to the defendant.”138 Despite support from both Democrats and Republicans, the bill has made no progress toward passage because of steadfast opposition from the U.S. Department of Justice.139 In his 2012 Preface to these pages,140 Attorney General Eric Holder voiced a strong commitment to ensuring compliance with Brady and related discovery obligations, but all of the measures he mentions leave prosecutors in charge of deciding what evidence will be material to the defense—something they cannot possibly do, because they do not know all the potential avenues a defense lawyer may pursue, and because it’s not in their hearts to look for ways to help the other side.
If the Department of Justice wants to show its commitment to justice, it should drop its opposition to the Fairness in Disclosure of Evidence Act and help get it passed into law.

135. See EVIDENCEPROFBLOG, Open And Shut: North Carolina Strengthens Its Open Discovery Law (June 3, 2011), http://lawprofessors.typepad.com/evidenceprof/2011/06/back-in-2004-north-carolina-governor-mike-easley-signed-a-bill-into-law-that-required-prosecutors-to-share-their-files.html.
136. See Duke Lacrosse Prosecutor Disbarred, CNN (June 17, 2007), http://www.cnn.com/2007/LAW/06/16/duke.lacrosse/; THE ASSOCIATED PRESS, Day in Jail for Ex-Duke Prosecutor, N.Y. Times (Sept. 1, 2007), http://www.nytimes.com/2007/09/01/us/01nifong.html?_r=0&gwh=D729031CB5109A29647D63F43549BEA4&gwt=pay.
137. See N.C. GEN. STAT. § 15A-903(a)(1) (2011), available at http://www.ncleg.net/Sessions/2011/Bills/House/PDF/H408v2.pdf (“Upon motion of the defendant, the court must order: The State to make available to the defendant the complete files of all law enforcement agencies, investigatory agencies, and prosecutors’ offices involved in the investigation of the crimes committed or the prosecution of the defendant.”); EVIDENCEPROFBLOG, supra n.135.
138. Fairness in Disclosure of Evidence Act of 2012, S. 2197, 112th Cong. (2012), available at https://www.govtrack.us/congress/bills/112/s2197/text.
139. See Video Recording: Ensuring that Federal Prosecutors Meet Discovery Obligations: Hearing on S. 2197 Before the S. Judiciary Comm., 112th Cong. (2012) (on file with S. Judiciary Comm.) (statement of James M. Cole, Deputy Att’y Gen., U.S. Dep’t of Justice opposing the bill: “[I]n reacting to the Stevens case, we must not let ourselves forget . . . true improvements to discovery practices will come from prosecutors and agents . . . [i]n other words, new rules are unnecessary.”).
140. Eric Holder Jr., Preface, In the Digital Age, Ensuring that the Department Does Justice iii, 41 GEO. L.J. ANN. REV. CRIM. PROC. (2012).
141. Scott H. Greenfield, The Flood Gates Myth, SIMPLE JUSTICE (Feb. 16, 2015), http://blog.simplejustice.us/2015/02/16/the-flood-gates-myth/.

2. Adopt standardized, rigorous procedures for dealing with the government’s disclosure obligations. For reasons already explained, enforcing the government’s obligations is critical to achieving a level playing field in criminal cases. But policing this conduct is exceedingly difficult for the simple reason that “Brady violations . . . almost always defy detection. The cops know it. The prosecutors know it. The defense and the defendant have no idea whether Brady material exists.”141 Open file discovery would go a long way toward ameliorating the problem, but not far enough. The prosecutor’s file will generally contain what the police and investigators consider to be inculpatory evidence; a great deal might be left out that is unhelpful to the prosecution. Yet the government’s disclosure obligation extends to information that is in the hands of investigators and places an affirmative obligation on prosecutors to become aware of exculpatory evidence that is held by others acting on the government’s behalf.142 Ensuring that the government complies with this obligation can’t be left up to individual prosecutors. Rather, prosecutorial offices must establish firm policies to ensure compliance. This does happen from time to time.
For example, in 1990, Chief Assistant United States Attorney Mary Jo White of the Eastern District of New York, Chief of the Criminal Division Bill Muller and Chief of the Narcotics Unit David Shapiro, among others, issued a detailed, thoughtful 27-page memorandum analyzing the government’s disclosure obligation at the time and recommending procedures to be followed when dealing with informants and other government witnesses.143 One of those recommendations was that the office maintain, and provide to the defense, “information about every case in which an informant has testified as an informant or a defendant, including the district or state in which the proceedings took place, the docket numbers and transcripts, where possible . . . and statements by a judge referring to a witness’s truthfulness and any allegations of double dealing or other misconduct.”144 The memo contained other similarly enlightened recommendations, disclosing a firm commitment to complying with the spirit, not merely the letter, of Brady and its progeny. Some years later, in 1999, a similar set of procedures was adopted by the United States Attorney’s Office in the Northern District of California in a manual drafted by one of the authors of the EDNY memo who had moved there and served as head of the Criminal Division.145 But, according to a lawyer who left the office in 2002, the manual was disregarded by the new U.S. Attorney. Compliance with the government’s disclosure obligations cannot be left to the political vagaries of 93 U.S. Attorneys’ offices and the countless District Attorneys’ offices across the country. Instead, the U.S. Justice Department and the justice department of every state must ensure compliance by setting standards and meaningfully disciplining prosecutors who willfully fail to comply. If they will not do it on their own, Congress and the state legislatures must prod them into it by adopting such standards by legislation.

142. See Kyles v. Whitley, 514 U.S. 419, 437 (1995).
143. See MARY JO WHITE ET AL., BRADY/GIGLIO DISCLOSURES (Oct. 30, 1990) (unpublished internal memorandum, on file with author).
144. Id. at 2.
145. See AUSA Manual for the Northern District of California (unpublished internal manual, on file with author).

3. Adopt standardized, rigorous procedures for eyewitness identification. North Carolina leads the way, once again, with the Eyewitness Identification Reform Act,146 which does just that. It provides in relevant part that lineups “shall be conducted by an independent administrator”; “[i]ndividuals or photos shall be presented to witnesses sequentially, with each individual or photo presented to the witness separately”; the eyewitness must be instructed that he “should not feel compelled to make an identification”; “at least five fillers shall be included in a [photo or live] lineup, in addition to the suspect”; and live identification procedures must be recorded on video.147 This law, too, came as a result of a huge miscarriage of justice when Jennifer Thompson-Cannino mistakenly identified Ronald Cotton as her rapist.148 He spent 11 years in prison before he was exonerated by DNA evidence.149 The cases involving mistaken eyewitness identification are legion.150

146. N.C. GEN. STAT. § 15A-284.52 (2012), http://law.justia.com/codes/north-carolina/2012/chapter-15a/article-14a/section-15a-284.52.
147. Id.
148. See Innocence Project, Ronald Cotton, http://www.innocenceproject.org/cases-false-imprisonment/ronald-cotton. The case and the reform that it triggered were featured on a 60 Minutes episode titled “Eyewitness: How Accurate is Visual Memory?” See EVIDENCEPROFBLOG, Can I Get A(n Eye) Witness: 60 Minutes Story Exposes Problems with Eyewitness IDs (Mar. 9, 2009), http://lawprofessors.typepad.com/evidenceprof/2009/03/those-of-you-wh.html.
149. See Innocence Project, supra n.148.
150. For example, in Gantt v. Roe, 389 F.3d 908, 914 n.8 (9th Cir. 2004), the police first showed an eyewitness a picture of a car owned by an initial suspect named Wilson, which the witness identified as the car he had seen the morning of the crime. The police then showed the witness a six-photo lineup including Wilson’s photo, and “sure enough, [the witness] selected Wilson as someone who ‘looked like the pedestrian he had seen,’” even though Wilson was eventually shown to have zero connection to the crime. Id.; see also Newsome v. McCabe, 256 F.3d 747, 749 (7th Cir. 2001) (there was ample evidence that police officers had “encouraged two witnesses to select [the defendant, who was exonerated after 15 years in prison,] from a lineup . . . yet withheld from the prosecutors information about their coaching of the witnesses and the fact that these witnesses earlier selected pictures from a book of mug shots that did not contain [the defendant]’s photo”).

4. Video record all suspect interrogations.151 The surprising frequency of false confessions should make us deeply skeptical of any interrogation we cannot view from beginning to end. Suspects are frequently isolated and pressured in obvious and subtle ways, and when the process ends we often have very different accounts of what happened inside the interrogation room.152 In those circumstances, whom are we to believe? Most of the time, the judge and juries believe the police. There may have been a time when we had to rely on such second-hand reports, but technology has now made this unnecessary: Video recording equipment is dirt cheap, and storage space for the resulting files is endless. No court should ever admit a confession unless the prosecution presents a video of the entire interrogation process from beginning to end.153 It appears that change is underway. Just last year, the Justice Department reversed its century-old prohibition against recording interrogations and adopted a policy “establish[ing] a presumption that the Federal Bureau of Investigation (FBI), the Drug Enforcement Administration (DEA), the Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF), and the United States Marshals Service (USMS) will electronically record statements made by individuals in their custody.”154

151. In fact, why don’t police officers wear body cameras at all times? It would protect the suspect and the police officer. See Steve Tuttle, Cambridge University Study Shows On-Officer Video Reduces Use-of-Force Incidents by 59 percent, TASER Int’l (Apr. 8, 2013), https://s3.amazonaws.com/uploads.hipchat.com/33/3817/rc4wgppgd39lrns/130408%20Rialto%20AXON%20Flex%20Cambridge%20Study.pdf (the use of “officer worn cameras reduced the rate of use-of-force incidents by 59 percent” and “utilization of cameras led to an 87.5 percent reduction in complaints” by citizens against police officers); see also U.S. DEPARTMENT OF JUSTICE, IMPLEMENTING A BODY-WORN CAMERA PROGRAM: RECOMMENDATIONS AND LESSONS LEARNED (2014), available at http://www.policeforum.org/assets/docs/Free_Online_Documents/Technology/implementing%20a%20body-worn%20camera%20program.pdf.
152. See, e.g., Taylor v. Maddox, 366 F.3d 992 (9th Cir. 2004) (the defendant claimed his confession was coerced, while the detectives argued otherwise); Milke, 711 F.3d at 1002 (Detective Saldate claimed that Milke confessed to the murder during her interrogation, while Milke maintained that Saldate ignored her request for a lawyer and “embellished and twisted [her] statements to make it sound like she had confessed”). In both these cases, we lacked access to a video or audio recording to ascertain what really happened.
153. This practice has been adopted in England, Ireland and Australia, where the general rule is that all interrogations—and not just confessions—must be recorded on audio or video. However, Australia is the only country that explicitly provides that the consequence for failing to record is inadmissibility of the contents of the interrogation. See TOM SULLIVAN, COMPENDIUM: ELECTRONIC RECORDING OF CUSTODIAL INTERROGATIONS, National Association of Criminal Defense Lawyers, July 11, 2014, available at http://www.nacdl.org/WorkArea/DownloadAsset.aspx?id=33287&libID=33256. In addition, a number of states, including Alaska, Arkansas, Minnesota, Montana and New Jersey, require all interrogations to be recorded and consider compliance with that requirement a factor in determining whether a statement made in an interrogation is admissible. See id.
154. See James M. Cole, Memorandum, POLICY CONCERNING ELECTRONIC RECORDING OF STATEMENTS, May 12, 2014, https://s3.amazonaws.com/s3.documentcloud.org/documents/1165422/doj-interrogation-memo.txt; Michael S. Schmidt, In Policy Change, Justice Dept. to Require Recording of Interrogations, N.Y. TIMES (May 22, 2014), http://www.nytimes.com/2014/05/23/us/politics/justice-dept-to-reverse-ban-on-recording-interrogations.html?_r=0.

5. Impose strict limits on the use of jailhouse informants. In response to a devastating report on jailhouse informants issued by the Los Angeles County grand jury in 1990, the county adopted procedures that required the approval of a committee before informants could be used.155 The use of informants consequently plummeted.156 Even still, the practice of using jailhouse informants as a means of detecting and perhaps manufacturing incriminatory evidence has continued in California.157 Serial informants are exceedingly dangerous because they have strong incentives to lie or embellish, they have learned to be persuasive to juries and there is no way to verify whether what they say is true.158 A man jailed on suspicion of a crime should not be subjected to the risk that someone with whom he is forced to share space will try for a get-out-of-jail-free card by manufacturing a confession.

155. See Henry Weinstein, Use of Jailhouse Testimony is Uneven in State, L.A. TIMES (Sept. 21, 2006), http://articles.latimes.com/2006/sep/21/local/me-jailhouse21.
156. Id.
157. See supra n.107.
158. See Russell D. Covey, Abolishing Jailhouse Snitch Testimony, 49 WAKE FOREST L. REV. 1375, 1376-1409 (2014).

6. Adopt rigorous, uniform procedures for certifying expert witnesses and preserving the integrity of the testing process. There is an effort underway to do this at the federal level. A 30-member commission headed by the Justice Department and comprised of forensic scientists, researchers, prosecutors, defense attorneys and judges was founded two years ago with the goal of “improv[ing] the overall reliability of forensic evidence after instances of shoddy scientific analysis by federal, state and local police labs helped convict suspects.”159 However, the Justice Department recently made the unilateral decision that “the subject of pre-trial forensic discovery—i.e., the extent to which information regarding forensic science experts and their data, opinions, methodologies, etc., should be disclosed before they testify in court—is beyond the ‘scope’ of the Commission’s business and therefore cannot properly be the subject of Commission reports or discussions in any respect.”160 This prompted the resignation of commission member Judge Rakoff, who criticized the decision as “a major mistake that is likely to significantly erode the effectiveness of the Commission” and a reflection of “a determination by the Department of Justice to place strategic advantage over a search for the truth.”161 He elaborated: “A primary way in which forensic science interacts with the courtroom is through discovery, for if an adversary does not know in advance sufficient information about the forensic expert and the methodological and evidentiary bases for that expert’s opinions, the testimony of the expert is nothing more than trial by ambush.”162 Judge Rakoff’s noisy resignation had its desired effect: Two days later, the Justice Department reversed its decision to bar the commission from considering issues related to pre-trial forensic discovery.163 Judge Rakoff subsequently returned to the commission, which is now in the process of preparing recommendations for the Attorney General. But why should the Justice Department have to be buffaloed into doing the right thing?

159. See Tim Cushing, Judge Resigns from Forensic Science Committee, Calls Out DOJ’s “Trial By Ambush” Tactics, TECHDIRT (Feb. 5, 2015), https://www.techdirt.com/articles/20150202/11152629883/judge-resigns-forensic-science-committee-calls-out-dojs-trial-ambush-tactics.shtml; Spencer S. Hsu, U.S. To Commit Scientists and New Commission To Fix Forensic Science, WASH. POST (Feb. 15, 2013), http://www.washingtonpost.com/local/crime/us-to-commit-scientists-and-new-commission-to-fix-forensic-science/2013/02/15/e11c31f8-77b3-11e2-8f84-3e4b513b1a13_story.html.
160. See Full Text: Judge’s Protest Resignation Letter, WASH. POST (Jan. 29, 2015), http://www.washingtonpost.com/local/full-text-judges-protest-resignation-letter/2015/01/29/41659da6-a7e1-11e4-a2b2-776095f393b2_story.html.
161. Id.
162. Id.

7. Keep adding conviction integrity units. We know that there are innocent people languishing in prison, but figuring out who they are is very difficult—more so if the prosecution, which has control of whatever evidence there is, is fighting you tooth and nail. That turns out to be a common response from prosecutors confronted with evidence that they may have obtained a wrongful conviction. A separate unit within the prosecutor’s office, with access to all the available evidence, and with no track record to defend, may be the best chance we have of identifying wrongfully convicted prisoners.
More than a dozen such offices have been established across the country164 and more are being added.165 This trend needs to continue and escalate. Better yet, there might be a federal agency to investigate the problem of questionable state convictions. This would reduce the bias that one state agency might have in favor of another.

In addition, state and federal law ought to be revised to give convicted defendants full access to DNA and other evidence in the possession of the prosecution. We have repeatedly witnessed the appalling spectacle of innocent defendants spending many years fighting to obtain the evidence that would eventually exonerate them. Michael Morton spent six additional years in prison because District Attorney John Bradley worked very hard to block Morton’s requests for DNA testing.166 And Anthony Ray Hinton spent more than fifteen years in prison fighting for the right to test evidence that eventually set him free.167 Bruce Godschalk lost seven years168; Frank Lee Smith died in prison waiting for DNA testing that eventually proved his innocence.169

163. See Spencer S. Hsu, Judge Rakoff Returns to Forensic Panel After Justice Department Backs Off Decision, WASH. POST (Jan. 30, 2015), http://www.washingtonpost.com/local/crime/in-reversal-doj-lets-forensic-panel-suggest-trial-rule-changes-after-us-judge-protests/2015/01/30/2f031d9e-a89c-11e4-a2b2-776095f393b2_story.html.
164. Various District Attorneys’ offices in 12 states, as well as the U.S. Attorney’s Office in Washington, D.C., have established conviction integrity units for the purpose of identifying and investigating wrongful conviction claims, often in collaboration with local innocence projects. See Center for Prosecutor Integrity, CONVICTION INTEGRITY UNITS, http://www.prosecutorintegrity.org/ (last visited Mar. 18, 2015); CENTER FOR PROSECUTOR INTEGRITY, CONVICTION INTEGRITY UNITS: VANGUARD OF CRIMINAL JUSTICE REFORM 9 (Dec. 2014), available at http://www.prosecutorintegrity.org/wp-content/uploads/2014/12/Conviction-Integrity-Units.pdf (noting that these conviction integrity units have produced a total of 61 exonerations, with 33 attributed to the unit in Dallas, Texas); Gardiner, supra n.46 (Brooklyn DA Kenneth Thompson overhauled the office’s conviction integrity unit and, in a mere 7 months, has ordered 7 murder convictions overturned).
165. See, e.g., Marisa Gerber, L.A. County D.A. to Create Unit to Review Wrongful-Conviction Claims, L.A. TIMES (Apr. 22, 2015), http://www.latimes.com/local/lanow/la-me-ln-conviction-review-unit-20150422-story.html#page1; Jim Forsyth, Bexar DA Establishes “Conviction Integrity Unit”, WOAI LOCAL NEWS (Feb. 25, 2015), http://www.woai.com/articles/woai-local-news-sponsored-by-five-119078/bexar-da-establishes-conviction-integrity-unit-13288998/.
166. See Brandi Grissom, supra n.77. In the words of the Houston Chronicle, “The fall of John Bradley was swift and severe and justified.” Lisa Falkenberg, Tossed from Office, Ex-Williamson DA Lands Job in Sunny Palau, HOUSTON CHRON. (July 1, 2014), http://www.houstonchronicle.com/news/columnists/falkenberg/article/Falkenberg-5594473.php. Bradley lost the Republican primary for Williamson County District Attorney in 2012, a post he had held for a decade.
167. Just recruiting the panel of experts, including a former F.B.I. official, to review the forensic evidence took Hinton and his lawyers almost a decade. See Alan Blinder, supra n.77; Anthony Ray Hinton Is Free After 30 Years Wrongfully On Death Row, EQUAL JUSTICE INITIATIVE (Apr. 3, 2015), http://www.eji.org/node/1064.
168. See Sara Rimer, DNA Testing in Rape Cases Frees Prisoner After 15 Years, N.Y. TIMES (Feb. 15, 2002), http://www.nytimes.com/2002/02/15/us/dna-testing-in-rape-cases-frees-prisoner-after-15-years.html.
169. See The National Registry of Exonerations, Frank Lee Smith, https://www.law.umich.edu/special/exoneration/Pages/casedetail.aspx?caseid=3644. Smith was exonerated on the basis of DNA testing results 11 months after his death in 2000 and 14 years after his conviction. He had requested DNA testing to no avail for 2 years.

There is no justification for withholding evidence that might set an innocent man free from unjust imprisonment. Whatever impediments have been interposed to prevent access to such evidence to convicted defendants and those working on their behalf ought to be summarily removed by legislation giving them full and swift access to all evidence in possession of the government. Most states now have laws allowing post-conviction access to DNA testing, but many are restrictive in practice—for example, denying requests from inmates who originally confessed to the crime or imposing a deadline of one year after conviction to file a request.170 Nebraska’s statute, however, serves as a good example to emulate. It provides:

[A] person in custody pursuant to the judgment of a court may, at any time after conviction, file a motion, with or without supporting affidavits, in the court that entered the judgement requesting forensic DNA testing of any biological material that:
(a) Is related to the investigation or prosecution that resulted in such judgment;
(b) Is in the actual or constructive possession or control of the state or is in the possession or control of others under circumstances likely to safeguard the integrity of the biological material’s original physical composition; and
(c) Was not previously subjected to DNA testing or can be subjected to retesting with more current DNA techniques that provide a reasonable likelihood of more accurate and probative results.171

The statute further provides that DNA tests must be performed in a nationally accredited laboratory, that the county attorney must submit an inventory to the defense and to the court of all evidence secured by the state in connection with the case.172

170. See Sue Russell, The Right and Privilege of Post-Conviction DNA Testing, PACIFIC STANDARD (Oct. 4, 2012), http://www.psmag.com/politics-and-law/the-right-and-privilege-of-post-conviction-dna-testing-47781; Innocence Project, ACCESS TO POST-CONVICTION DNA TESTING (Oct. 10, 2014), http://www.innocenceproject.org/free-innocent/improve-the-law/fact-sheets/access-to-post-conviction-dna-testing.
171. See Neb. Rev. Stat. § 29-4120.
172. Id.

8. Establish independent Prosecutorial Integrity Units. In my experience, the U.S. Justice Department’s Office of Professional Responsibility (OPR) seems to view its mission as cleaning up the reputation of prosecutors who have gotten themselves into trouble. In United States v. Kojayan,173 we found that Assistant United States Attorney Jeffrey Sinek had misled the district court and the jury. The district judge, who had trusted the AUSA, was so taken aback with the revelation that he barred further re-prosecution of the defendants as a sanction for the government’s misconduct.174 OPR investigated and gave the AUSA a clean bill of health. And no Justice Department lawyer has yet been sanctioned for the Stevens prosecution despite the clear evidence of willful misconduct.175 Prosecutors need to know that someone is watching over their shoulders—someone who doesn’t share their values and eat lunch in the same cafeteria. Move OPR to the Department of Agriculture, and institute similar independent offices in the 50 states.

173. 8 F.3d 1315, 1322 (9th Cir. 1993).
174. Order on Motion for Acquittal, No. 2:91-cr-00622-ER-2, Dkt. 111 (Mar. 9, 1994) (granting the motion to dismiss the indictment with prejudice).
175. The Justice Department did give two of its prosecutors, Joseph Bottini and James Goeke, slaps on the wrists, but the Merit Systems Protection Board recently overturned even these mild sanctions on the basis that the Justice Department violated its own procedures for investigating alleged misconduct. See Lisa Rein, Review Board Clears U.S. Prosecutors Accused of Botching Sen. Ted Stevens’s Corruption Trial, WASH. POST (Jan. 14, 2015), http://www.washingtonpost.com/blogs/federal-eye/wp/2015/01/14/panel-clears-u-s-prosecutors-accused-of-botching-sen-ted-stevenss-corruption-trial/. A bungled attempt at sanctions strikes me as worse than no sanctions at all. What does that say about the sincerity and competence of the Justice Department’s efforts? They can topple a senator and jail Martha Stewart, but they can’t even spank their own misbehaving lawyers?

C. Judges

Judges, especially trial judges, can do a great deal to ensure that prosecutors comply with their constitutional obligations, that only reliable evidence is presented to juries, that juries are properly instructed and that the trial is generally fair.
There has been an avalanche of exonerations in recent years, many of them of people who spent half a lifetime or more behind bars,176 and in every one of those cases there was some sort of proceeding—usually a trial, sometimes a plea—where a judge let an innocent man be convicted and sent him to prison or death row. When such cases are reported, the trial judge is always given a pass, as if he were merely a bystander who watched helplessly while an innocent man had his life ripped away from him. I don’t buy it. Any judge that inexperienced or incompetent has no business presiding over anything more significant than small claims court. In criminal cases, judges have an affirmative duty to ensure fairness and justice, because they are the only ones who can force prosecutors and their investigators and experts to comply with due process. Other than being vigilant, compassionate and even-handed, there are specific measures trial judges can take to ensure fairness in criminal proceedings and avoid the conviction of innocents.

1. Enter Brady compliance orders in every criminal case. The Brady rule is in many ways the ultimate guarantor of fairness in our criminal justice system. This is because police have unparalleled access to the evidence in criminal cases—both inculpatory and exculpatory. Once a crime is reported and police are on the scene, they can secure the area and prevent anyone from touching anything until they are done. They have control of what evidence is sent out for forensic testing; they talk to witnesses and get their impression before anyone else does. Police and prosecutors, working together, can lean on witnesses by threatening prosecution or offering leniency. If there is evidence helpful to the defense, it will generally wind up in the possession of the police; if witnesses have made helpful statements in their initial contact with investigators (as happened in the Stevens case) that information will be in the sole possession of the prosecution.
A defense investigator or lawyer plowing over the same territory after the police have done their job will generally find the scene denuded of clues and witnesses who are skittish and laconic. Brady and its progeny therefore impose important obligations on prosecutors, obligations that are too frequently ignored. In case after case where an innocent person is exonerated after many years in prison, it turns out that the prosecution failed to disclose or actively concealed exculpatory evidence. But Brady is not self-enforcing; failure to comply with Brady does not expose the prosecutor to any personal risk.177 When Judge Sullivan discovered that the prosecutors in the Stevens case had obtained their conviction after failing to disclose exculpatory evidence, he appointed a special counsel, DC attorney Henry Schuelke III, to independently investigate the prosecutors’ conduct.178 Schuelke determined that the lawyers had committed willful Brady violations but that the court lacked the power to sanction the wrongdoers because they had not violated any court-imposed obligations.179

176. See The National Registry of Exonerations, supra n.72 and accompanying.
177. See Imbler v. Pachtman, 424 U.S. 409, 430, 431 n.34 (1976) (prosecutors are absolutely immune for “activities [that are] intimately associated with the judicial phase of the criminal process,” including the willful suppression of exculpatory evidence).
178. See Henry F. Schuelke III, Special Counsel, Report to Hon. Emmet G. Sullivan of Investigation Conducted Pursuant to the Court’s Order, supra n.119.
179. See id.

The solution to this problem is for judges to routinely enter Brady compliance orders, and many judges do so already. Such orders vary somewhat from judge to judge, but typically require the government to turn over, when received, documents and objects, reports of examinations and tests, expert witness opinions and all relevant material required by Brady and Giglio.180 Entering such an order holds prosecutors personally responsible to the court and will doubtless result in far greater compliance.

2. Engage in a Brady colloquy. This procedure was proposed by Professor Jason Kreag in an article published last year in the Stanford Law Review Online,181 and it strikes me as a good idea. The details are outlined in Professor Kreag’s article but the general idea is that, during pretrial hearings and before a defendant enters a guilty plea, the trial judge would have a conversation with the prosecutor on the record, asking him such questions as, “Have you reviewed your file . . . to determine if [it] include[s] information that is favorable to the defense?” and “Have you identified information that is favorable to the defense, but nonetheless elected not to disclose [it] because you believe that the defense is already aware of the information or the information is not material?”182 There is nothing like having to face a judge on the record to impress upon lawyers the need to scrupulously comply with their professional obligations. But the questions must be sufficiently specific and detailed to avoid the mantra, “We’re aware of our Brady obligations and we’ve met them.”

3. Adopt local rules that require the government to comply with its discovery obligations without the need for motions by the defense. The prosecution need not present Brady evidence unless the defense asks for it, usually by motion.183 This seems sort of silly because the defense obviously wants whatever exculpatory evidence the prosecution might have.
Surprisingly, few courts have rules that obviate the need for criminal discovery motions.184 I’m aware of only a dozen or so federal courts that have local rules either stating that the defense doesn’t need to make a formal discovery motion, or requiring the government to disclose Brady/Giglio material within a specific time frame, without mentioning a defense motion.185 An example of such a rule is Eastern District of Washington Local Criminal Rule 16(a), which was adopted just last year. The rule requires the government to make available within 14 days of arraignment: (1) all of the defendant’s oral and written statements, the defendant’s prior record, documents and objects and expert witness opinions that are in the government’s “possession, custody or control or which may become known . . . through due diligence”; (2) information from an “electronic eavesdrop, wiretap or any other interception,” as well as “the authorization for and information gathered from” a tracking device or video/audio recording used during investigations; (3) “search warrants and supporting affidavits”; (4) information regarding whether physical evidence intended to be offered in the government’s case-in-chief was seized without a warrant; and (5) photographs used in any photo lineup, as well as information obtained from any other identification technique.186 Rule 16(a)(6) is a catchall clause that requires the government to “[a]dvise the defendant’s attorney of evidence favorable to the defendant and material to the defendant’s guilt or punishment to which defendant is entitled pursuant to Brady and United States v. Agurs.”187 I have no idea why this isn’t part of the Federal Rules of Criminal Procedure, but it should be.
180. These orders are routine among all the district judges in the Eastern District of Washington. See, e.g., Judge Justin Quackenbush, Scheduling Order at 1, No. 2:15-CR-0025-JLQ (E.D. Wa. Mar. 23, 2015) (“the United States shall forthwith provide, when received, all relevant material required by Brady and by Giglio”) (citations omitted); Judge Edward Shea, Case Management Order at 4, No. 4:14-CR-6053-EFS (E.D. Wa. Feb. 13, 2015) (“The Court further presumes a request for discovery and disclosure under Federal Rules of Evidence 404(b), 608(b), and 609, Brady, Giglio, United States v. Henthorn, 931 F.2d 29 (9th Cir. 1991), and their progeny.”) (citations omitted).
181. Jason Kreag, The Brady Colloquy, 67 STAN. L. REV. ONLINE 47 (2014).
182. Id. at 50.
183. See Bennett L. Gershman, Litigating Brady v. Maryland: Games Prosecutors Play, 57 CASE W. RES. L. REV. 531, 534 (2007), available at http://digitalcommons.pace.edu/cgi/viewcontent.cgi?article=1535&context=lawfaculty (“Prosecutorial disclosure of Brady evidence is not automatic. Prosecutors are typically required to provide Brady evidence only upon a request.”); FED. JUDICIAL CTR., TREATMENT OF BRADY V. MARYLAND MATERIAL IN UNITED STATES DISTRICT AND STATE COURTS’ RULES, ORDERS, AND POLICIES 14 (2007), available at https://bulk.resource.org/courts.gov/fjc/bradyma2.pdf.
184. See LAURAL HOOPER ET AL., FED. JUDICIAL CTR., A SUMMARY OF RESPONSES TO A NATIONAL SURVEY OF RULE 16 OF THE FEDERAL RULES OF CRIMINAL PROCEDURE AND DISCLOSURE PRACTICES IN CRIMINAL CASES: FINAL REPORT TO THE ADVISORY COMMITTEE ON CRIMINAL RULES (2011), available at http://www.uscourts.gov/uscourts/RulesAndPolicies/rules/Publications/Rule16Rep.pdf.
185. Courts that require the government to provide criminal discovery without a motion include the District of Hawaii, District of Kansas, District of New Hampshire, District of New Mexico, Western District of Texas, Eastern District of Washington and Eastern District of Wisconsin. Courts that imply as much include the Middle District of Alabama, Southern District of Alabama, Northern District of California, District of Massachusetts, Northern District of New York and the District of Vermont. See id. at 18.
186. U.S. District Court for the Eastern District of Washington, Local Crim. R. 16(a), available at http://www.waed.uscourts.gov/sites/default/files/Local_Criminal_Rules-20150303_0.pdf.
187. Id. (citations omitted).
4. Condition the admission of expert evidence in criminal cases on the presentation of a proper Daubert showing. As Judge Nancy Gertner has pointed out on numerous occasions,188 courts in criminal cases routinely admit expert evidence lacking the proper foundations and sometimes amounting to little more than guesswork. Few defense lawyers challenge the reliability of expert evidence because few trial judges grant requests for Daubert hearings.189 And appellate courts affirm such denials under a very generous abuse of discretion standard.190 With the mounting number of wrongful convictions based on faulty expert evidence in such diverse areas as arson and shaken baby syndrome, courts must be far more rigorous in enforcing Daubert before allowing experts to testify in criminal trials. Failure to hold a Daubert hearing where the reliability of expert evidence has been credibly challenged should be considered an error of law, as should the refusal to allow a defense memory expert where the case turns on conflicting recollections of past events.191
188. See, e.g., Nancy Gertner, Judges Need to Set a Higher Standard for Forensic Evidence, N.Y. TIMES (Mar. 30, 2015), http://www.nytimes.com/roomfordebate/2015/03/30/robert-durst-handwriting-and-judging-forensic-science/judges-need-to-set-a-higher-standard-for-forensic-evidence; Nancy Gertner, Commentary on the Need for a Research Culture in the Forensic Sciences, 58 UCLA L. REV. 789, 793 (2011).
189. See David E. Bernstein, The Misbegotten Judicial Resistance to the Daubert Revolution, 89 NOTRE DAME L. REV. 27, 50-66 (2013). Moreover, “[s]tatistics substantiate the ubiquity of defense failure to initiate Daubert challenges, confirming the rarity in the trial courts of any defense challenge to a prosecutor’s proffered expert testimony.” See also Peter J. Neufeld, The (Near) Irrelevance of Daubert to Criminal Justice and Some Suggestions for Reform, 95 AM. J. PUB. HEALTH 107, 110 (2005).
190. See, e.g., Gen. Elec. Co. v. Joiner, 522 U.S. 136, 141 (1997).
191. See supra pp. vi-vii and accompanying footnotes.
5. When prosecutors misbehave, don’t keep it a secret. Defense lawyers who are found to have been ineffective regularly find their names plastered into judicial opinions, yet judges seem strangely reluctant to name names when it comes to misbehaving prosecutors.192 Indeed, judges seem reluctant to even suspect prosecutors of improper behavior, as if they were somehow beyond suspicion. For example, the district judge in the Kojayan case, discussed above, could have obviated the appeal and the entire sordid episode by forcing the Assistant U.S. Attorney to answer a simple question: “Did Nourian have a plea agreement with the government?” Defense counsel urged the judge to ask the question but to no avail. It was not until the oral argument before our court that the AUSA was compelled to disclose that fact:
[Q]: Was there a cooperation agreement?
AUSA: Well, your honor, that is not something that’s in the record.
[Q]: I understand. Was there a cooperation agreement?
AUSA: There was an agreement with the Southern District of New York and [Nourian], yes.193
192. See Adam M. Gershowitz, Prosecutorial Shaming: Naming Attorneys to Reduce Prosecutorial Misconduct, 42 U.C. DAVIS L. REV. 1059, 1069-71 & n.21 (2009).
Naming names and taking prosecutors to task for misbehavior can have magical qualities in assuring compliance with constitutional rights. In Baca v. Adams,194 a panel of our court dealt with a case where both the California trial court and the California Court of Appeal concluded that a prosecutor lied on the stand, but nonetheless deemed the error harmless.
During our questioning, we asked the Deputy Attorney General arguing the case whether the lying prosecutor and another untruthful witness had been prosecuted for perjury or otherwise sanctioned. The answer, of course, was that they had not been. We then suggested that, in resolving the case, we would write an opinion naming those who had misbehaved and the failure of the state authorities to take any actions against them. The video of that oral argument made its way to the blogosphere and has been viewed over 24,000 times.195 Not surprisingly, three weeks afterwards, the California Attorney General wrote confessing error and requesting that we remand to the district court with instructions that it grant a conditional writ of habeas corpus.196 The incident, by the way, illustrates the importance of providing video access to court proceedings. It is far easier to hide an injustice from public scrutiny if only the judge and a few lawyers know about it.
Judges who see bad behavior by those appearing before them, especially prosecutors who wield great power and have greater ethical responsibilities, must hold such misconduct up to the light of public scrutiny. Some of us regularly encourage prosecutors to speak to their supervisors, even the United States Attorney, to ensure that inappropriate conduct comes to their attention, with excellent results.197 If judges have reason to believe that witnesses, especially police officers or government informants, testify falsely, they must refer the matter for prosecution. If they become aware of widespread misconduct in the investigation and prosecution of criminal cases, a referral to the U.S. Department of Justice for a civil rights violation might well be appropriate.198
193. United States v. Kojayan, 8 F.3d 1315, 1320 (9th Cir. 1993). The Justice Department reacted with typical insouciance: It filed a motion to depublish the opinion or, in the alternative, to amend the opinion to remove the AUSA’s name. USA’s Motion for Depublication, or in the Alternative, Modification of Opinion w/Declaration of AUSA Sinek, No. 95-50875, Dkt. 51 (Sept. 24, 1993); see supra n.129.
194. No. 13-56132, 2015 WL 412835, at *1 (9th Cir. Jan. 30, 2015).
195. See 13-56132 Johnny Baca v. Derral Adams, YOUTUBE (Jan. 8, 2015), https://www.youtube.com/watch?v=2sCUrhgXjH4.
196. Appellee’s Unopposed Motion for Summary Reversal and Remand to the District Court to Conditionally Grant the Writ, Baca v. Adams (Jan. 29, 2015) (No. 13-56132, Dkt. 33).
197. A memorable example is United States v. Maloney, 755 F.3d 1044 (9th Cir. 2014) (en banc). The AUSA had sandbagged the defense at trial by making for the first time a factual assertion not in the record in his rebuttal during closing argument. At oral argument, I asked the AUSA to go back and show the video of the oral argument to the U.S. Attorney and “see whether this [conduct] is something [she] want[s] to be teaching [her] line attorneys.” 11-50311 United States v. Maloney, YOUTUBE (Sept. 19, 2013), https://www.youtube.com/watch?v=HgafGnA4Eow, at 59:00. A little over two weeks later, we received a letter from Laura Duffy, the U.S. Attorney herself, admitting that the AUSA had acted improperly and promising to “use the video of the argument as a training tool to reinforce the principle that all Assistant U.S. Attorneys must be aware of the rules pertaining to closing argument and must make every effort to stay well within these rules.” Motion to Summarily Reverse the Conviction, Vacate the Sentence and Remand to the District Court, United States v. Maloney (Oct. 7, 2013) (No. 11-50311, Dkt. 52-1). Bravo Ms. Duffy!
198. But not always successful. In our opinion vacating Milke’s conviction, we made an express referral of the matter to the Justice Department based on what appeared to us to be knowing and repeated use of perjured testimony by Detective Saldate in a large number of criminal prosecutions. Milke, 711 F.3d at 1019-20. The Justice Department declined to investigate the matter, yet evidence that Milke’s case was not an isolated incident was readily available. For example, in a recent letter to the editor complaining about Milke’s release, a colleague of Saldate’s in the 1980s stated: “I am painfully aware that Detective Armando Saldate and his now deceased partner were notorious for bending the rules, especially when it came to suspect interviews. Other homicide detectives attempted to make supervisors aware of these serious issues. They were met with disdain and angrily told that if they couldn’t be a team player, they could find another place to work. Nothing else was said for fear of retaliation, and no corrective steps were taken.” See Antonio Morales Jr., Op-ed, Milke Doesn’t Deserve Her Freedom, AZ CENTRAL (Mar. 20, 2015), http://www.azcentral.com/story/opinion/letters/2015/03/19/milke-deserve-freedom/25057361/. If evidence of such widespread misconduct in the highest level of a metropolitan police department is unworthy of even an investigation by the U.S. Justice Department, one must wonder what is.
D. Miscellaneous
On March 8, 2015, A.M. “Marty” Stroud III, a Shreveport lawyer and former state prosecutor, published a remarkable piece in the Shreveport Times reflecting on the case of Glenn Ford, who spent 30 years on death row after being convicted of murder and sentenced to death in 1984.199 Ford was released after the state disclosed evidence proving his innocence. Stroud offered a public apology for his conduct in the case. It is well worth reading in full, but here is the gist of it:
At the time this case was tried there was evidence that would have cleared Glenn Ford. The easy and convenient argument is that the prosecutors did not know of such evidence, thus they were absolved of any responsibility for the wrongful conviction. I can take no comfort in such an argument . . . . Had I been more inquisitive, perhaps the evidence would have come to light years ago . . . . My mindset was wrong and blinded me to my purpose of seeking justice, rather than obtaining a conviction of a person who I believed to be guilty. I did not hide evidence, I simply did not seriously consider that sufficient information may have been out there that could have led to a different conclusion. And that omission is on me. I did not question the unfairness of Mr. Ford having appointed counsel who had never tried a criminal jury case much less a capital one. It never concerned me that the defense had insufficient funds to hire experts . . . . The jury was all white, Mr. Ford was African-American. Potential African-American jurors were struck with little thought about potential discrimination . . . . I also participated in placing before the jury dubious testimony from a forensic pathologist that the shooter had to be left handed . . . . All too late, I learned that the testimony was pure junk science at its evil worst. In 1984, I was 33 years old. I was arrogant, judgmental, narcissistic and very full of myself. I was not as interested in justice as I was in winning. To borrow a phrase from Al Pacino in the movie “And Justice for All,” “Winning became everything.”200
199. A.M. “Marty” Stroud III, RE: “State Should Give Ford Real Justice”, SHREVEPORT TIMES, Mar. 8, 2015, at 6D, http://www.shreveporttimes.com/story/opinion/readers/2015/03/20/lead-prosecutor-offers-apology-in-the-case-of-exonerated-death-row-inmate-glenn-ford/25049063/.
200. Id.
What is remarkable about Stroud’s statement is not that he gained a conviction and death sentence for a man that turned out to be innocent. Or that that man spent three decades caged like an animal. That kind of thing is all too common. Nor is there anything unusual about the confluence of errors that led to the wrongful conviction—failure to uncover exculpatory evidence, inexperienced defense lawyers, race-based jury selection, junk science, and a judge who passively watched the parade and sat on his thumbs. The same goes for a prosecutorial attitude of God-like omniscience and unwillingness to entertain the possibility that the wrong man is being prosecuted. These things happen all the time in case, after case, after case.
What is unusual—unique really—is Stroud’s willingness to accept personal responsibility for the calamity he helped inflict on Glenn Ford and his family—his willingness to embrace this as his personal failure, not just an unfortunate failure of the system. Most prosecutorial attitudes run the gamut from “that’s why they put erasers on pencils” to “they must be guilty of something.” Everyone else in the system, starting with trial judges, absolves himself of personal responsibility when a heinous failure occurs. We could do with a lot less of that.
In a sense, however, the system is responsible because it places a great deal of power and responsibility in young, ambitious lawyers, like Stroud, who have every incentive to close their eyes to the possibility of innocence, to testilying by police, to bogus experts and to suggestive eyewitness identification procedures. A prosecutor certainly does not help advance his career by providing to the defense evidence that his star witness made a statement directly contrary to his testimony before the police started leaning on him—as happened in the shameful prosecution and wrongful conviction of Senator Stevens. Faced with a remote possibility of being found out, and the likelihood that nothing bad will happen even if they are, many prosecutors will turn a blind eye or worse. And that’s how miscarriages of justice happen.
Some of the suggestions above will help ameliorate the problem, but there are some other reforms that require either legislation, a ruling by the Supreme Court, action by parties not involved in the criminal justice process or a constitutional amendment. These may be more difficult to achieve, but here they are nonetheless: 1. Abandon judicial elections. Professor Monroe Freedman made the case for the unconstitutionality of elected state judges in his succinct monograph, The Unconstitu- tionality of Electing State Judges.201 He relied on the separate opinions of Justices O’Connor and Ginsburg in Republican Party of Minnesota v. White,202 citing Justice O’Connor’s opinion for “studies showing that judges who face elections are far more likely to override jury sentences of life without parole and impose the death pen- alty.”203 The difficulty confronting any judge who faces an election is compounded by the well-known practice of prosecutors enlisting one of their own to oppose a judge that they consider to be pro-defense.204 And in at least 19 states, lawyers may also “paper” or “affidavit” a judge by filing a peremptory challenge to disqualify a judge they deem “prejudiced” against their interests, without having to submit any explanation or proof of prejudice.205 This tactic can be used en masse to effectively preclude a judge from hearing any criminal cases, and is precisely what appears to be happening to the judge in Orange County who removed the District Attorney’s office from a high-profile case because of repeated instances of misconduct.206 While many, 201. Monroe H. Freedman, The Unconstitutionality of Electing State Judges, 26 GEO. J. LEGAL ETHICS 217 (2013). 202. 536 U.S. 765 (2002). 203. Freedman, supra n.201, at 218. 204. See Jennifer Emily, Dallas DA Accused of Pushing Prosecutors to Run Against Judges, THE DALLAS MORNING NEWS (Oct. 
7, 2013), http://www.dallasnews.com/news/politics/local-politics/20131006- da-accused-of-pushing-prosecutors-to-run-against-incumbent-judges.ece (six prosecutors from the Dallas County DA’s office were running for state district judge benches, five of whom were challenging incumbent Democratic judges). 205. See Michelle Quinn, District Attorney’s Boycott of a Judge Raises Issues, N.Y. TIMES (Mar. 20, 2010), http://www.nytimes.com/2010/03/21/us/21sfcourt.html?pagewantedall&_r0 (Santa Clara County DA disqualified one judge from 100 cases as retaliation for the judge freeing a child molester after the deputy DA provided false testimony and withheld exculpatory evidence); Maureen Cavanaugh & Pat Finn, San Diego’s Great Judge Boycott, KPBS (Feb. 22, 2010) http://www.kpbs.org/news/2010/feb/22/san- diegos-great-judge-boycott/ (discussing the boycott of certain judges by the San Diego County DA’s office after those judges had either made rulings against the prosecution or criticized prosecutors for failing to disclose exculpatory evidence). 206. Prosecutors from the Orange County DA’s office made blanket disqualification requests against Judge Thomas Goethals in his other criminal cases as soon as he began probing into the misuse of jailhouse informants in the Dekkrai murder trial. See supra n.131; Eric Hartley, Prosecutors Avoiding 44 GEO. L.J. ANN. REV. CRIM. PROC (2015)xxxviii perhaps, most judges resist the pressure and remain impartial, the fact that they may have to face the voters with the combined might of the prosecution and police groups aligned against them no doubt causes some judges to rule for the prosecution in cases where they would otherwise have ruled for the defense.207 2. Abrogate absolute prosecutorial immunity. In Imbler v. Pachtman,208 a divided Supreme Court held that prosecutors are absolutely immune from damages liability for misconduct they commit when performing the traditional activities of a prosecutor. 
Imbler was not a constitutional ruling; the Court was interpreting 42 U.S.C. § 1983. And it was certainly not a result compelled by the language of the statute; section 1983 says nothing about immunity. Rather, Imbler reflected a pure policy judgment that prosecutors needed complete freedom from liability in order to properly discharge their functions. Writing for himself and two others, Justice White would have adopted a more limited immunity rule that would have held prosecutors liable for certain kinds of deliberate misconduct such as willfully failing to disclose Brady and Giglio evidence.209 Under Imbler, prosecutors cannot be held liable, no matter how badly they misbehave, for actions such as withholding exculpatory evidence, introducing fabri- cated evidence, knowingly presenting perjured testimony and bringing charges for which there is no credible evidence. All are immune from liability. A defense lawyer who did any such things (or their equivalents) would soon find himself disbarred and playing house with Bubba. The Imbler majority seemed reassured by the possibility that rogue prosecutors will be subject to other constraints: We emphasize that the immunity of prosecutors from liability in suits under [§] 1983 does not leave the public powerless to deter misconduct or to punish that which occurs. This Court has never suggested that the policy considerations which compel civil immunity for certain governmental officials also place them beyond the reach of the criminal law . . . . Moreover, a prosecutor stands perhaps unique, among officials whose acts could deprive persons of constitutional rights, in his amenability to professional discipline by an associa- tion of his peers. These checks undermine the argument that the imposition of civil liability is the only way to insure that prosecutors are mindful of the constitutional rights of persons accused of crime.210 This argument was dubious in 1976 and is absurd today. 
Who exactly is going to prosecute prosecutors? Despite numerous cases where prosecutors have committed willful misconduct, costing innocent defendants decades of their lives, I am aware of only two who have been criminally prosecuted for it; they spent a total of six days behind bars.211 Judge They Say Is Biased, O.C. REG. (June 13, 2014), http://www.ocregister.com/articles/prosecutors-6182 07-goethals-judge.html?page1. The Orange County Bar Association took notice and passed Resolution 15R-01, titled “Independence of the Judiciary,” in which it stated that it “publicly disapproves of the use of tactics which are, or have the appearance of being, punitive and retaliatory towards any sitting judge,” and that “the excessive use of [the affidavit procedure] against a particular judge can be . . . inappropriate . . . and could be construed as an attempt to intimidate not just that judge, but the entire judiciary, who will and must remain independent.” See Orange Cnty. Bar Ass’n, Resolution 15R-01: Independence of the Judiciary (Mar. 27, 2015), http://www.ocbar.org/Portals/0/pdf/press_releases/2015/2015_03_30_OCBA_ ResolutionR15-01.pdf. 207. See supra n.129 (again, life tenure is a wonderful thing). 208. 424 U.S. 409, 430, 431 n.34 (1976). 209. Id. at 438-45 (White, J., concurring). In fact, on May 1, 2015, the Supreme Court of Canada reversed course and embraced a similar rule. See Henry v. British Columbia (Attorney General), [2015] S.C.C. 24 (Can.) (government may be sued when prosecutors intentionally withhold evidence favorable to the defense). 210. Id. at 428-29. 211. Texas district attorney Ken Anderson, see supra n.77, went to jail for five days (serving only half of his 10-day sentence) for hiding evidence that put Michael Morton in prison for a quarter of a century. And he got even that much because he was found in contempt of a Brady compliance order entered by the 44 GEO. L.J. ANN. REV. CRIM. 
PROC (2015) xxxix There have been a few instances of professional discipline against prosecutors, though even that has been much less than against similarly-situated private law- yers.212 By and large, however, professional organizations are exceedingly reluctant to impose sanctions on prosecutors for misconduct in carrying out their professional responsibilities.213 Sidney Powell’s book, Licensed to Lie, illustrates exhaustively the futility of getting bar disciplinary boards to impose professional discipline for misconduct committed in the course of criminal prosecutions.214 Despite this dismal track record refuting the bland assurances of the Imbler majority that prosecutors will be subject to other forms of control, even if damages lawsuits are not available, the Court has reaffirmed Imbler on numerous occasions. Most recently, in its unanimous opinion in Van de Kamp v. Goldstein,215 the Court denied compensation to the petitioner, Thomas Goldstein, who had spent 24 years in prison based on the testimony of notorious jailhouse snitch Edward Fink. Prosecutors used Fink as a utility infielder in numerous cases, and he somehow always managed to testify that the defendant had confessed.216 Unmoved, the Court held the prosecutors and their supervisors were all protected by absolute immunity and Mr. Goldstein can pound sand.217 What kind of signal does this send to young prosecutors who are out to make a name for themselves? I think it signals that they can be as reckless and self-serving as they want, and if they get caught, nothing bad will happen to them. Imbler and Van de Kamp should be overruled. It makes no sense to give police, who often have to act in high pressure situations where their lives may be in danger, only qualified immu- nity218 while giving prosecutors absolute immunity. It is a disparity that can only be trial judge in that case. See Texas Prosecutor to Serve 10 Days for Innocent Man’s 25-Year Imprisonment, THE GUARDIAN (Nov. 
8, 2013), http://www.theguardian.com/world/2013/nov/08/texas-prosecutor-ken- anderson-michael-morton-trial. None of the prosecutors who concealed evidence in the Stevens criminal case were prosecuted, and the two who were initially disciplined by the DOJ got their sanctions overturned by the Merit Systems Protection Board. See supra n.175. Mike Nifong, the district attorney who committed widespread misconduct when prosecuting the Duke Lacrosse players, was convicted of criminal contempt but sentenced to just one day in jail. See supra n.136. The list of prosecutors who have committed misconduct causing serious, lasting harm to innocent people and who have not themselves been criminally prosecuted is very long indeed. I am aware of no prosecutors, other than Ken Anderson and Mike Nifong, who have been convicted of prosecutorial misconduct. 212. For example, Trinidad County, Colorado District Attorney Frank Ruybalid pleaded guilty to over a dozen instances of professional misconduct and had his law license suspended for six months, but that suspension was immediately suspended, even though “private attorneys ‘have received sanctions more severe than a six-month stayed suspension’ for conduct similar to Ruybalid’s.” See Alan Prendergast, Frank Ruybalid, Trinidad District Attorney, Cops a Plea, Admits Misconduct, WESTWORD (Jan. 29, 2015), http://www.westword.com/news/frank-ruybalid-trinidad-district-attorney-cops-a-plea-admits-misconduct- 6282816 (quoting the settlement agreement). 213. Nor have courts been eager to uphold sanctions imposed by professional organizations. See, e.g., In re Kline, No. 13-BG-851, at 2-3 (D.C. Ct. App. Apr. 9, 2015), available at http://www.dccourts.gov/ internet/documents/13-BG-851.pdf (despite finding that “Bar Counsel proved by clear and convincing evidence that [the prosecutor] intentionally failed to disclose information in violation of [a D.C. 
Rule of Professional Conduct prohibiting prosecutors from intentionally withholding exculpatory evidence from the defense in a criminal case], the panel concluded that “given the confusion regarding the correct interpretation of a prosecutor’s obligations under the rule, sanctioning [the prosecutor] would be unwarranted”). One can hope that prosecutors in the District of Columbia will no longer be confused as to their disclosure obligations after In re Kline. 214. See POWELL, supra n.116, at 397-401. 215. 555 U.S. 335 (2009). 216. Id. at 339. 217. Id. at 349 (“[W]here a § 1983 plaintiff claims that a prosecutor’s management of a trial-related information system is responsible for a constitutional error at his or her particular trial, the prosecutor responsible for the system enjoys absolute immunity just as would the prosecutor who handled the particular trial itself.”). 218. See, e.g., Messerschmidt v. Millender, 132 S. Ct. 1235 (2012); see also Devereaux v. Abbey, 263 F.3d 1070 (9th Cir. 2001) (en banc) (police have only qualified immunity for allegedly fabricating 44 GEO. L.J. ANN. REV. CRIM. PROC (2015)xl explained by the fact that prosecutors and judges are all part of the legal profession and it’s natural enough to empathize with people who are just like you.219 If the Supreme Court won’t overrule Imbler and Van de Kamp, Congress is free to do it by amending 42 U.S.C. § 1983. 3. Repeal AEDPA § 2254(d). Prior to AEDPA taking effect in 1996, the federal courts provided a final safeguard for the relatively rare but compelling cases where the state courts had allowed a miscarriage of justice to occur. One of the better- known examples of this is the case of Ron Williamson, who in 1994 was just 5 days away from being executed for a murder of which he was eventually cleared by DNA evidence. He was saved when U.S. District Judge Frank Seay entered a stay of execution that began a process culminating in Williamson’s exoneration. 
The case is described in depth in John Grisham’s non-fiction book, The Innocent Man.220 The federal court safety-value was abruptly dismantled in 1996 when Congress passed and President Clinton signed the Antiterrorism and Effective Death Penalty Act. Hidden in its interstices was a provision that has pretty much shut out the federal courts from granting habeas relief in most cases, even when they believe that an egregious miscarriage of justice has occurred.221 We now regularly have to stand by in impotent silence, even though it may appear to us that an innocent person has been convicted.222 Not even the Supreme Court may act on what it believes is a constitutional violation if the issue is raised in a habeas petition as opposed to on direct appeal.223 There are countless examples of this, but evidence in a criminal case); Gantt v. City of Los Angeles, 717 F.3d 702 (9th Cir. 2013) (same). 219. Though it raises other questions, it’s also worth taking another look at absolute judicial immunity. See Timothy M. Stengel, Absolute Judicial Immunity Makes Absolutely No Sense: An Argument for an Exception to Judicial Immunity, 84 TEMP. L. REV. 1071 (2012) (arguing that absolute judicial immunity should be removed in cases where malice or corruption is substantiated). 220. See JOHN GRISHAM, THE INNOCENT MAN: MURDER AND INJUSTICE IN A SMALL TOWN (2006). 221. Namely, 28 U.S.C. § 2254(d) provides that a writ of habeas corpus shall not be granted unless the adjudication of the claim on the merits in state court “(1) resulted in a decision that was contrary to, or involved an unreasonable application of, clearly established Federal law, as determined by the Supreme Court of the United States; or (2) resulted in a decision that was based on an unreasonable determination of the facts in light of the evidence presented in the State court proceeding.” 222. See, e.g., Murdoch v. Castro, 609 F.3d 983 (9th Cir. 
2010) (en banc) (Kozinski, J., dissenting) (the plurality applied AEDPA deference and denied habeas relief to a petitioner who had steadfastly maintained his innocence and had “strong proof” that he was in fact innocent). It’s no surprise that courts have “performed miserably in ferreting out the innocent.” See Adam Liptak, Study of Wrongful Convictions Raises Questions Beyond DNA, N.Y. TIMES (July 23, 2007), http://www.nytimes.com/2007/07/23/us/23bar.html?_r1&&gwhD810E36AF10FBA1A836653D78673C1C8&gwtpay. Not only did the Supreme Court decline to hear the appeals of 30 of 31 prisoners who were later exonerated by DNA evidence, but it ruled against the prisoner in the one appeal it did hear. See Brandon L. Garrett, Judging Innocence, 108 COLUM. L. REV. 55, 95 (2008).

223. Compare Brown v. Payton, 544 U.S. 133, 148-49 (2005) (Breyer, J., concurring) (“In my view, this is a case in which Congress’ instruction to defer to the reasonable conclusions of state-court judges makes a critical difference. See 28 U.S.C. § 2254(d)(1). Were I a California state judge, I would likely hold that Payton’s penalty-phase proceedings violated the Eighth Amendment . . . . Nonetheless, in circumstances like the present, a federal judge must leave in place a state-court decision unless the federal judge believes it is ‘contrary to, or involved an unreasonable application of, clearly established Federal law, as determined by the Supreme Court of the United States.’ § 2254(d)(1) . . . . I cannot say that the California Supreme Court decision fails this deferential test.”), and Sessoms v. Grounds, 776 F.3d 615, 631 (9th Cir. 2015) (Kozinski, C.J., reluctantly dissenting) (“But what we must decide is not what Sessoms meant or the officers understood, but whether it was unreasonable for the state courts to conclude that a reasonable officer would have been perplexed as to whether Sessoms was asking for an attorney . . . .
I am dismayed that Sessoms’s fate—whether he will spend his remaining days in prison, half a century or more caged like an animal—turns on such esoterica. But that’s the standard we are bound to apply, even if we are convinced that the habeas petitioner’s constitutional rights were violated.”), with Hinton v. Alabama, 134 S. Ct. 1081, 1083 (2014) (per curiam) (vacating the state court’s judgment on direct appeal upon concluding that Anthony Ray Hinton’s trial attorney “rendered constitutionally deficient performance,” see supra n.77.

perhaps the best illustration is Cavazos v. Smith,224 the case involving a grandmother who had spent 10 years in prison for the alleged shaking death of her infant grandson—a conviction secured by since-discredited junk science. My court freed Smith, but the Supreme Court summarily reversed (over Justice Ginsburg’s impassioned dissent) based on AEDPA.225 AEDPA is a cruel, unjust and unnecessary law that effectively removes federal judges as safeguards against miscarriages of justice. It has resulted and continues to result in much human suffering. It should be repealed.226

4. Treat prosecutorial misconduct as a civil rights violation. The U.S. Justice Department seems ready enough to pursue charges of civil rights violations in cases where police have engaged in physical violence,227 but far more reluctant to pursue misbehaving prosecutors.228 But prosecutors can wreck and take lives just like police, and their actions are often far more premeditated than those of officers who may over-react to a belligerent suspect. And when a prosecutorial office uses known liars as jailhouse snitches, or presents evidence from cops they know are prone to fabricate evidence or conduct suggestive lineups or eyewitness identifications, they are committing civil rights violations with dire consequences for their victims. It is precisely such alternative enforcement mechanisms that the Supreme Court hypothesized in Imbler in deciding to give prosecutors absolute immunity.229 One can only hope that the U.S. Department of Justice will reconsider what appears to be its policy against investigating prosecutorial misconduct in criminal cases as potential civil rights violations.

5. Give criminal defendants the choice of a jury or bench trial. Under current law, either the defendant or the prosecution can insist on trying the case before a jury.230 Conventional wisdom is that defendants prefer juries because it only takes one juror to hang, but experienced defense lawyers know that some kinds of cases can best be tried before a judge—particularly where a defendant wishes to testify but

224. 132 S. Ct. 2 (2011).

225. Citing 2254(d), the Supreme Court explained: “Doubts about whether Smith is in fact guilty are understandable. But it is not the job of this Court, and was not that of the Ninth Circuit, to decide whether the State’s theory was correct. The jury decided that question, and its decision is supported by the record.” 132 S. Ct. at 7.

226. In last year’s Preface, Shon Hopwood details other ill effects of AEDPA’s heartless regime. Shon Hopwood, Preface, Failing to Fix Sentencing Mistakes: How the System of Mass Incarceration May Have Hardened the Hearts of the Federal Judiciary, 43 GEO. L.J. ANN. REV. CRIM. PROC. (2014).

227. For example, the Justice Department brought civil rights charges against the four LAPD officers involved in the brutal beating of Rodney King. See James H. Andrews, US Justice is on Trial in Rodney King Case, CHRISTIAN SCI. MONITOR (Mar. 15, 1993), http://www.csmonitor.com/1993/0315/15121.html. And while it ultimately decided not to prosecute police officer Darren Wilson for the shooting death of Michael Brown, the Justice Department conducted an extensive investigation which culminated in an 86-page report.
See UNITED STATES DEPARTMENT OF JUSTICE-CIVIL RIGHTS DIVISION, DEPARTMENT OF JUSTICE REPORT REGARDING THE CRIMINAL INVESTIGATION INTO THE SHOOTING DEATH OF MICHAEL BROWN BY FERGUSON, MISSOURI POLICE OFFICER DARREN WILSON (Mar. 4, 2015), available at http://apps.washingtonpost.com/g/documents/national/department-of-justice-report-on-the-michael-brown-shooting/1436/.

228. See supra n.198 (discussing the Justice Department’s decision not to investigate the prosecutors’ misconduct in Milke).

229. See Imbler, 424 U.S. at 428 (“This Court has never suggested that the policy considerations which compel civil immunity for certain governmental officials also place them beyond the reach of the criminal law. Even judges, cloaked with absolute civil immunity for centuries, could be punished criminally for willful deprivations of constitutional rights . . . .”).

230. See Fed. R. Crim. P. 23(a) (“If the defendant is entitled to a jury trial, the trial must be by jury unless: (1) the defendant waives a jury trial in writing; (2) the government consents; and (3) the court approves.”).

fears impeachment with prior misdeeds.231 The prosecution has many institutional advantages, not the least being that they get to go first and thus have their theory of the case laid out before the defendant can present any evidence at all.232 I would think it fair to let the defendant get the choice of judge or jury.
Because the government has no constitutional right to a jury, but the defendant does, there should be no constitutional impediment to such a rule.233 And while I’m at it, I’d amend Federal Rule of Evidence 609(a) (and its state analogues) to preclude impeachment of a criminal defendant testifying on his own behalf with evidence of his past criminal convictions.234 Too many defendants are put to the grim choice of either telling their side of the story and having the jury hear of their prior misdeeds, or standing mute and seeming to acquiesce in the prosecution’s case.235 If the defendant lies, a skilled prosecutor will trip him up on cross; there is no need to paint him as a monster before the jury.236

6. Conduct in-depth studies of exonerations. The recent spate of exonerations, especially those obtained by DNA evidence, gives us a window as to what can go wrong in our criminal justice system. It is an important database that ought to make us doubt the supposed infallibility of our criminal justice process. But it can also be a rich source of useful information about why criminal prosecutions go wrong, why police focus on a single innocent suspect, why prosecutors pursue cases without asking hard questions about whether the defendant is truly guilty and why judges and juries are so badly misled in so many cases. This should not be a matter left to academia, although much good work is done there now. Far better, though, if the federal government devoted, say, the cost of one aircraft carrier to analyze and dissect these cases and try to figure out what went wrong and what we can do better in the future. Thus far, the government has only made such an inquiry into a handful of cases.237 This effort needs to be expended on a much larger scale, because even a single wrongful conviction is one too many.

231. See Gordon Van Kessel, Adversary Excesses in the American Criminal Trial, 67 NOTRE DAME L. REV.
403, 482 (1992) (many defendants do not testify because “[t]he threat of felony conviction impeachment can be a powerful deterrent to taking the witness stand”). 232. See supra n.43 and accompanying text (discussing the advantage of going first). 233. See Adam H. Kurland, Providing a Federal Criminal Defendant with a Unilateral Right to a Bench Trial: A Renewed Call to Amend Federal Rule of Criminal Procedure 23(a), 26 U.C. DAVIS L. REV. 309 (1993) (fleshing out the arguments in favor of giving the defendant a unilateral right to choose between a bench or jury trial). 234. See Fed. R. Evid. 609(a)(1)(B) (evidence of a criminal conviction for purposes of attacking a witness’s character for truthfulness “must be admitted in a criminal case in which the witness is a defendant, if the probative value of the evidence outweighs its prejudicial effect to that defendant”). 235. See Roselle L. Wissler & Michael J. Saks, On the Inefficacy of Limiting Instructions: When Jurors Use Prior Conviction Evidence to Decide Guilt, 9 LAW & HUM. BEHAV. 37, 47 (1985) (“[T]he presentation of the defendant’s criminal record [under Rule 609] does not affect the defendant’s credibility, but does increase the likelihood of conviction, and the judge’s limiting instructions do not appear to correct that error. People’s decision processes do not employ the prior-conviction evidence in the way the law wishes them to use it. From a legal policy viewpoint, the risk of prejudice to the defense is greater than the unrealized potential benefit to the prosecution.”). 236. Rape shield laws and Federal Rule of Evidence 412 are based on similar policy considerations. See 124 Cong. Rec. 36,256 (1978) (statement of Sen. Biden) (“The enactment of [the Privacy Protection for Rape Victims Act of 1978, of which Federal Rule of Evidence 412 was the centerpiece,] will eliminate the traditional defense strategy . . . 
of placing the victim and her reputation on trial in lieu of the defendant [and] end the practice . . . wherein rape victims are bullied and cross-examined about their prior sexual experiences[, making] the trial almost as degrading as the rape itself.”).

237. The National Institute of Justice recently sponsored in-depth analyses on three high-profile legal mistakes in an effort to “tease out the sequence of factors that might have contributed to a mistake and, perhaps, lead to a more accident-proof legal system”; the study found that “[t]here was no single bad actor,” with most cases involving “a series of small slip-ups that cascaded into an important mistake.” See

7. Repeal three felonies a day for three years.238 Professor Tim Wu of Columbia Law School recounted a “darkly humorous game” played by Assistant U.S. Attorneys in the Southern District of New York:

[S]omeone would name a random celebrity—say, Mother Theresa or John Lennon. It would then be up to the junior prosecutors to figure out a plausible crime for which to indict him or her. The crimes were not usually rape, murder, or other crimes you’d see on Law & Order but rather the incredibly broad yet obscure crimes that populate the U.S. Code like a kind of jurisprudential minefield: Crimes like “false statements” (a felony, up to five years), “obstructing the mails” (five years), or “false pretenses on the high seas” (also five years). The trick and skill lay in finding the more obscure offenses that fit the character of the celebrity and carried the toughest sentences.239

A big reason prosecutors have so much leverage in plea negotiations is that there are many laws written in vague and sweeping language, inviting prosecutorial adventurism.240 It is thus difficult for individuals charged with a crime to know how to defend themselves and to gauge the likelihood of being acquitted.
Even if ultimately vindicated, the process of being charged itself takes a massive toll. Arthur Andersen, guilty of no crime according to the Supreme Court, nevertheless was put out of business, leaving its 85,000 employees world-wide without jobs.241 Senator Stevens lost his Senate seat even though his prosecution was riddled with misconduct and the Justice Department eventually dismissed all charges. The list of lives and businesses ruined by baseless prosecutions is long.242 And, in the words of George Will, “as the mens rea requirement withers when the quantity and complexity of laws increase, the doctrine of ignorantia legis neminem excusat—ignorance of the law does not excuse—becomes problematic. The regulatory state is rendering unrealistic the presumption that a responsible citizen should be presumed to have knowledge of the law.”243 Repealing a thousand vague and over-reaching laws and replacing them with laws that are cast narrowly to punish morally reprehensible conduct and give fair notice as to what is criminal may not solve the problem altogether, but it would be a good start.

CONCLUSION

‘Nuff said.

Douglas Starr, A New Way to Reform the Judicial System, THE NEW YORKER (Mar. 31, 2015), http://www.newyorker.com/news/news-desk/the-root-of-the-problem.

238. Harvey Silverglate estimates that a typical American commits three felonies a day due to overbroad laws. Silverglate, supra n.58; see Kozinski & Tseytlin, supra n.58, at 44-45 (“[M]ost Americans are criminals and don’t know it, or suspect they are but believe they’ll never get prosecuted.”).

239. Tim Wu, American Lawbreaking, SLATE (Oct. 14, 2007), http://www.slate.com/articles/news_and_politics/jurisprudence/features/2007/american_lawbreaking/introduction.html.

240. See supra n.57 (discussing examples).

241. See Andersen Died in Vain, CHI. TRIB. (Mar.
14, 2012), http://articles.chicagotribune.com/2012-03-14/opinion/ct-edit-andersen-20120314_1_andersen-s-professional-standards-group-andersen-case-founder-arthur-andersen.

242. See, e.g., Goyal, 629 F.3d at 922 (Kozinski, C.J., concurring) (“This case . . . has no doubt devastated the defendant’s personal and professional life.”); POWELL, supra n.116, at 52 (“Jim Brown[, see supra n.118,] was devastated. His entire life and that of his family had been turned upside down. Now he was facing prison time. He was in shock.”).

243. Will, supra n.58 (citing Michael Anthony Cottone, Rethinking Presumed Knowledge of the Law in the Regulatory Age, 82 TENN. L. REV. 137 (2014)).

Copyright © National Academy of Sciences. All rights reserved.

Identifying the Culprit: Assessing Eyewitness Identification

Committee on Scientific Approaches to Understanding and Maximizing the Validity and Reliability of Eyewitness Identification in Law Enforcement and the Courts
Committee on Science, Technology, and Law
Policy and Global Affairs
Committee on Law and Justice
Division of Behavioral and Social Sciences and Education

THE NATIONAL ACADEMIES PRESS
500 Fifth Street, NW
Washington, DC 20001

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

This study was funded by a grant between the National Academy of Sciences and the Laura and John Arnold Foundation.
Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of the organization that provided support for the project.

International Standard Book Number-13: 978-0-309-31059-8
International Standard Book Number-10: 0-309-31059-8
Library of Congress Control Number: 2014955458

Additional copies of this report are available from the National Academies Press, 500 Fifth Street, NW, Room 360, Washington, DC 20001; (800) 624-6242 or (202) 334-3313; http://www.nap.edu.

Copyright 2014 by the National Academy of Sciences. All rights reserved.

Printed in the United States of America

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Ralph J. Cicerone is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. C. D. Mote, Jr., is president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Victor J. Dzau is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy’s purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Ralph J. Cicerone and Dr. C. D. Mote, Jr., are chair and vice chair, respectively, of the National Research Council.

www.national-academies.org

COMMITTEE ON SCIENTIFIC APPROACHES TO UNDERSTANDING AND MAXIMIZING THE VALIDITY AND RELIABILITY OF EYEWITNESS IDENTIFICATION IN LAW ENFORCEMENT AND THE COURTS

Co-Chairs
THOMAS D. ALBRIGHT (NAS), Professor and Director, Vision Center Laboratory and Conrad T. Prebys Chair in Vision Research, Salk Institute for Biological Studies
JED S.
RAKOFF, Senior Judge, United States District Court for the Southern District of New York

Members
WILLIAM G. BROOKS III, Chief of Police, Norwood (MA) Police Department
JOE S. CECIL, Project Director, Division of Research, Federal Judicial Center
WINRICH FREIWALD, Assistant Professor, Laboratory of Neural Systems, The Rockefeller University
BRANDON L. GARRETT, Roy L. and Rosamond Woodruff Morgan Professor of Law, University of Virginia Law School
KAREN KAFADAR, Commonwealth Professor and Chair of Statistics, University of Virginia
A.J. KRAMER, Federal Public Defender for the District of Columbia
SCOTT McNAMARA, Oneida County (NY) District Attorney
CHARLES ALEXANDER MORGAN III, Associate Clinical Professor of Psychiatry, Yale University School of Medicine
ELIZABETH A. PHELPS, Silver Professor of Psychology and Neural Science, New York University
DANIEL J. SIMONS, Professor, Department of Psychology, University of Illinois
ANTHONY D. WAGNER, Professor of Psychology and Neuroscience and Co-Director, Center for Cognitive and Neurobiological Imaging, Stanford University; Director, Stanford Memory Laboratory
JOANNE YAFFE, Professor of Social Work and Adjunct Professor of Psychiatry, University of Utah

Staff
ANNE-MARIE MAZZA, Study Director and Director, Committee on Science, Technology, and Law
ARLENE F. LEE, Director, Committee on Law and Justice
STEVEN KENDALL, Program Officer, Committee on Science, Technology, and Law
KAROLINA KONARZEWSKA, Program Coordinator, Committee on Science, Technology, and Law
ANJALI SHASTRI, Christine Mirzayan Science and Technology Policy Graduate Fellow
SARAH WYNN, Christine Mirzayan Science and Technology Policy Graduate Fellow

COMMITTEE ON SCIENCE, TECHNOLOGY, AND LAW

Co-Chairs
DAVID BALTIMORE (NAS/IOM), President Emeritus and Robert Andrews Millikan Professor of Biology, California Institute of Technology
DAVID S. TATEL, Judge, U.S. Court of Appeals for the District of Columbia Circuit

Members
THOMAS D. ALBRIGHT (NAS), Professor and Director, Vision Center Laboratory and Conrad T. Prebys Chair in Vision Research, Salk Institute for Biological Studies
ANN ARVIN (IOM), Lucile Packard Professor of Pediatrics and Microbiology and Immunology; Vice Provost and Dean of Research, Stanford University
BARBARA E. BIERER, Professor of Medicine, Harvard Medical School
CLAUDE CANIZARES (NAS), Vice President and the Bruno Rossi Professor of Physics, Massachusetts Institute of Technology
ARTURO CASADEVALL (IOM), Leo and Julia Forchheimer Professor of Microbiology and Immunology; Chair, Department of Biology and Immunology; and Professor of Medicine, Albert Einstein College of Medicine
JOE S. CECIL, Project Director, Program on Scientific and Technical Evidence, Division of Research, Federal Judicial Center
R. ALTA CHARO (IOM), Warren P. Knowles Professor of Law and Bioethics, University of Wisconsin at Madison
HARRY T. EDWARDS, Judge, U.S. Court of Appeals for the District of Columbia Circuit
DREW ENDY, Associate Professor, Bioengineering, Stanford University and President, The BioBricks Foundation
MARCUS FELDMAN (NAS), Burnet C. and Mildred Wohlford Professor of Biological Sciences, Stanford University
JEREMY FOGEL, Director, Federal Judicial Center
HENRY T. GREELY, Deane F. and Kate Edelman Johnson Professor of Law and Professor, by courtesy, of Genetics, Stanford University
MICHAEL GREENBERGER, Law School Professor and Director, Center for Health and Homeland Security, University of Maryland
BENJAMIN W. HEINEMAN, JR., Senior Fellow, Harvard Law School and Harvard Kennedy School of Government
MICHAEL IMPERIALE, Arthur F. Thurnau Professor of Microbiology and Immunology, University of Michigan
GREG KISOR, Chief Technologist, Intellectual Ventures
GOODWIN LIU, Associate Justice, California Supreme Court
JENNIFER MNOOKIN, David G. Price and Dallas P. Price Professor of Law, University of California, Los Angeles School of Law
R. GREGORY MORGAN, Vice President and General Counsel, Massachusetts Institute of Technology
ALAN B. MORRISON, Lerner Family Associate Dean for Public Interest and Public Service Law, George Washington University Law School
CHERRY MURRAY (NAS/NAE), Dean, School of Engineering and Applied Sciences, Harvard University
ROBERTA NESS (IOM), Dean and M. David Low Chair in Public Health, University of Texas School of Public Health
HARRIET RABB, Vice President and General Counsel, The Rockefeller University
DAVID RELMAN (IOM), Thomas C. and Joan M. Merigan Professor, Departments of Medicine, and of Microbiology and Immunology, Stanford University and Chief, Infectious Disease Section, VA Palo Alto Health Care System
RICHARD REVESZ, Lawrence King Professor of Law; Dean Emeritus; and Director, Institute for Policy Integrity, New York University School of Law
MARTINE A. ROTHBLATT, Chairman and Chief Executive Officer, United Therapeutics
DAVID VLADECK, Professor and Co-Director, Institute for Public Representation, Georgetown Law School

Staff
ANNE-MARIE MAZZA, Director
STEVEN KENDALL, Program Officer
KAROLINA KONARZEWSKA, Program Coordinator

COMMITTEE ON LAW AND JUSTICE

Chairs
JEREMY TRAVIS (Chair), President, John Jay College of Criminal Justice, The City University of New York
RUTH D. PETERSON (Vice-Chair), Professor of Sociology and Director, Criminal Justice Research Center, Ohio State University

Members
CARL C.
BELL, Staff Psychiatrist, St. Bernard’s Hospital; Staff Psychiatrist, Jackson Park Hospital’s Outpatient Family Practice Clinic; and Professor of Psychiatry and Public Health, University of Illinois at Chicago
JOHN J. DONOHUE III, C. Wendell and Edith M. Carlsmith Professor of Law, Stanford University Law School
MINDY FULLILOVE, Professor of Clinical Psychiatry and Professor of Clinical Sociomedical Sciences and Co-Director, Community Research Group, New York State Psychiatric Institute and Mailman School of Public Health, Columbia University
MARK KLEIMAN, Professor of Public Policy, University of California, Los Angeles
GARY LAFREE, Director, National Consortium for the Study of Terrorism and Responses to Terrorism (START) and Professor, Criminology and Criminal Justice, University of Maryland
JANET L. LAURITSEN, Professor, Department of Criminology and Criminal Justice, University of Missouri
GLENN C. LOURY, Merton P. Stoltz Professor of the Social Sciences, Department of Economics, Brown University
JAMES P. LYNCH, Professor and Chair, Department of Criminology and Criminal Justice, University of Maryland
CHARLES F. MANSKI (NAS), Board of Trustees Professor in Economics, Department of Economics, Northwestern University
DANIEL S. NAGIN, Teresa and H. John Heinz III University Professor of Public Policy and Statistics, Carnegie Mellon University
ANNE MORRISON PIEHL, Associate Professor, Department of Economics and Program in Criminal Justice, Rutgers University
DANIEL B. PRIETO, Director, Cybersecurity and Technology and Director, Defense Industrial Base Cyber Security/Information Assurance, Office of the Secretary of Defense Chief Information Officer
SUSAN B. SORENSON, Professor, University of Pennsylvania
DAVID WEISBURD, Distinguished Professor, Department of Criminology, Law and Society and Director, Center for Evidence-Based Crime Policy, George Mason University; Walter E. Meyer Professor of Law and Criminal Justice, The Hebrew University Faculty of Law
CATHY SPATZ WIDOM, Distinguished Professor, Psychology Department, John Jay College of Criminal Justice, The City University of New York
PAUL K. WORMELI, Executive Director, Integrated Justice Information Systems

Staff
ARLENE F. LEE, Director
EMILY BACKES, Research Associate
MALAY MAJMUNDAR, Senior Program Officer
STEVE REDBURN, Scholar
JULIE SCHUCK, Senior Program Associate
DANIEL TALMAGE, Program Officer
TINA M. LATIMER, Program Coordinator

Acknowledgments

ACKNOWLEDGMENT OF PRESENTERS

The committee gratefully acknowledges the contributions of the following individuals: Karen L. Amendola, Police Foundation; Steven E. Clark, University of California, Riverside; Rob Davis, Police Executive Research Forum; Kenneth Deffenbacher, University of Nebraska at Omaha; Paul DeMuniz, Oregon Supreme Court; Shari Seidman Diamond, Northwestern University and American Bar Foundation; John Firman, International Association of Chiefs of Police; Ronald Fisher, Florida International University; Geoffrey Gaulkin, Special Master, State v. Henderson (NJ); Kristine Hamann, National District Attorney’s Association; Barbara Hervey, Texas Court of Criminal Appeals; Robert J. Kane, Supreme Judicial Study Group on Eyewitness Identification (MA); Saul Kassin, John Jay College of Criminal Justice; Peter Kilmartin, State of Rhode Island; David LaBahn, Association of Prosecuting Attorneys; Elizabeth F. Loftus, University of California, Irvine; Roy S. Malpass, University of Texas at El Paso; Sheri Mecklenburg, U.S. Department of Justice; Christian A.
Meissner, Iowa State University; John Monahan, University of Virginia; Steven D. Penrod, John Jay College of Criminal Justice; P. Jonathon Phillips, National Institute of Standards and Technology; Joseph Salemme, Chicago Police Department; Daniel L. Schacter, Harvard University; Barry Scheck, The Innocence Project; Jessica Snowden, Federal Judicial Center; Nancy K. Steblay, Augsburg College; Gary L. Wells, Iowa State University; John T. Wixted, University of California, San Diego; David V. Yokum, University of Arizona.

ACKNOWLEDGMENT OF REVIEWERS

This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the National Academies’ Report Review Committee. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making its published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the process.

We wish to thank the following individuals for their review of this report: Art Acevedo, Austin, Texas Police Department; Aaron Benjamin, University of Illinois at Urbana-Champaign; Vicki Bruce, Newcastle University; Jules Epstein, Widener University; Jeremy Fogel, Federal Judicial Center; Constantine Gatsonis, Brown University; Henry T.
Greely, Stanford University; Peter Imrey, Cleveland Clinic; Robert Kane, Massachusetts Supreme Court; Timothy Koller, Office of the Richmond County District Attorney; Elizabeth Loftus, University of California, Irvine; Robert Masters, Office of the Queens County District Attorney; Geoffrey Mearns, Northern Kentucky University; and Hal Stern, University of California, Irvine.

Although the reviewers listed above have provided many constructive comments and suggestions, they were not asked to endorse the conclusions or recommendations, nor did they see the final draft of the report before its release. The review of this report was overseen by David Korn, Harvard Medical School and Massachusetts General Hospital and Stephen E. Fienberg, Carnegie Mellon University. Appointed by the National Academies, they were responsible for making certain that an independent examination of this report was carried out in accordance with institutional procedures and that all review comments were carefully considered. Responsibility for the final content of this report rests entirely with the authoring committee and the institution.

Preface

Eyewitness identifications play an important role in the investigation and prosecution of crimes, but they have also led to erroneous convictions. In the fall of 2013, the Laura and John Arnold Foundation called upon the National Academy of Sciences (NAS) to assess the state of research on eyewitness identification and, when appropriate, make recommendations. In response to this request, the NAS appointed an ad hoc study committee that we have been privileged to co-chair.

The committee’s review analyzed relevant published and unpublished research, external submissions, and presentations made by various experts and interested parties.
The research examined fell into two general categories: (1) basic research on vision and memory and (2) applied research directed at the specific problem of eyewitness identification.

Basic research has progressed for many decades, is of high quality, and is largely definitive. Research of this category identifies principled and insurmountable limits of vision and memory that inevitably affect eyewitness accounts, bear on conclusions regarding accuracy, and provide a broad foundation for the committee’s recommendations.

Through its review, the committee came to recognize that applied eyewitness identification research has identified key variables affecting the accuracy of eyewitness identifications. This research has been instrumental in informing law enforcement, the bar, and the judiciary of the frailties of eyewitness identification testimony. Such past research has appropriately identified the variables that may affect an individual’s ability to make an accurate identification. However, given the complex nature of eyewitness identification, the practical difficulties it poses for experimental research, and the still ongoing evolution of statistical procedures in the field of eyewitness identification research, there remains at the time of this review substantial uncertainty about the effect and the interplay of these variables on eyewitness identification. Nonetheless, a range of practices has been validated by scientific methods and research and represents a starting place for efforts to improve eyewitness identification procedures.

In this report, the committee offers recommendations on how law enforcement and the courts may increase the accuracy and utility of eyewitness identifications.
In addition, the committee identifies areas for future research and for collaboration between the scientific and law enforcement communities.

We are indebted to those who addressed the committee and to those who submitted materials to the committee, and we are particularly indebted to the members of the committee. These individuals devoted untold hours to the review of materials, meetings, conference calls, analyses, and report writing. This report is very much the result of the enormous contributions of an engaged community of scholars and practitioners who reached their findings and recommendations after many vigorous and thoughtful discussions.

We also would like to thank the project staff, Karolina Konarzewska, Steven Kendall, Arlene Lee, and Anne-Marie Mazza, and editor Susanna Carey for their dedication to the project and to the work of the committee.

Thomas D. Albright and Jed S. Rakoff
Committee Co-chairs

Contents

SUMMARY
1 INTRODUCTION
2 EYEWITNESS IDENTIFICATION PROCEDURES
3 THE LEGAL FRAMEWORK FOR ASSESSMENT OF EYEWITNESS IDENTIFICATION EVIDENCE
4 BASIC RESEARCH ON VISION AND MEMORY
5 APPLIED EYEWITNESS IDENTIFICATION RESEARCH
6 FINDINGS AND RECOMMENDATIONS

APPENDIXES
A BIOGRAPHICAL INFORMATION OF COMMITTEE AND STAFF
B COMMITTEE MEETING AGENDAS
C CONSIDERATION OF UNCERTAINTY IN DATA ON THE CONFIDENCE–ACCURACY RELATIONSHIP AND THE RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE
BOXES, FIGURES, AND TABLES

Boxes
1-1 The Ronald Cotton Case
1-2 Charge to the Committee
2-1 Blinding
5-1 The Influences of Discriminability and Response Bias on Human Binary Classification Decisions
5-2 Analysis of Receiver Operating Characteristics (ROCs)

Figures
1-1 Memory accuracy and time
5-1 Contingency table for possible eyewitness identification outcomes
C-1 Data inferred from Juslin, Olsson and Winman
C-2 Data from Brewer and Wells
C-3 Data from Experiment 1A in Mickes, Flowe, and Wixted
C-4 Data from Experiment 2 in Mickes, Flowe, and Wixted

Tables
C-1 Conditions and Logarithms of Reported pAUC Values
C-2 Analysis of Variance Table for log(pAUC)

Summary

Eyewitnesses play an important role in criminal cases when they can identify culprits.1 Yet it is well known that eyewitnesses make mistakes and that their memories can be affected by various factors including the very law enforcement procedures designed to test their memories. For several decades, scientists have conducted research on the factors that affect the accuracy of eyewitness identification procedures. Basic research on the processes that underlie human visual perception and memory has given us an increasingly clear picture of how eyewitness identifications are made and, more important, an improved understanding of the principled limits on vision and memory that may lead to failures of identification.
Basic research has been complemented by a growing body of applied research on eyewitness identification, which has examined those variables that particularly affect eyewitnesses to crimes: system variables (conditions such as the procedures followed to obtain identifications that can be controlled by law enforcement) and estimator variables (conditions associated with the actual crime, such as viewing conditions, or factors specific to the eyewitness, such as the race of the victim relative to that of the perpetrator, that cannot be controlled by law enforcement).

1 Throughout this report, the term identification denotes person recognition. Eyewitness identification refers to recognition by a witness to a crime of a culprit unknown to the witness.

Through such scientific research, we have learned that many factors influence the visual perceptual experience: dim illumination and brief viewing times, large viewing distances, duress, elevated emotions, and the presence of a visually distracting element such as a gun or a knife. Gaps in sensory input are filled by expectations that are based on prior experiences with the world. Prior experiences are capable of biasing the visual perceptual experience and reinforcing an individual’s conception of what was seen. We also have learned that these qualified perceptual experiences are stored by a system of memory that is highly malleable and continuously evolving, neither retaining nor divulging content in an informational vacuum. The fidelity of our memories to actual events may be compromised by many factors at all stages of processing, from encoding to storage to retrieval. Unknown to the individual, memories are forgotten, reconstructed, updated, and distorted.
Therefore, caution must be exercised when utilizing eyewitness procedures and when relying on eyewitness identifications in a judicial context.

In 2013, the Laura and John Arnold Foundation called on the National Academy of Sciences (NAS) to appoint an ad hoc study committee to:

1. critically assess the existing body of scientific research as it relates to eyewitness identification;
2. identify any gaps in the existing body of literature and suggest appropriate research questions to pursue that will further our understanding of eyewitness identification and that might offer additional insight into law enforcement and courtroom practice;
3. provide an assessment of what can be learned from research fields outside of eyewitness identification;
4. offer recommendations for best practices in the handling of eyewitness identifications by law enforcement;
5. offer recommendations for developing jury instructions;
6. offer advice regarding the scope of a Phase II consideration of neuroscience research as well as any other areas of research that might have a bearing on eyewitness identification; and
7. write a consensus report with appropriate findings and recommendations.

The committee heard from numerous experts, practitioners, and stakeholders and reviewed relevant published and unpublished literature as well as submissions provided to the committee. In this report, the committee offers its findings and recommendations for:

• identifying and facilitating best practices in eyewitness procedures for the law enforcement community;
• strengthening the value of eyewitness identification evidence in court; and
• improving the scientific foundation underpinning eyewitness identification.
OVERARCHING FINDINGS

The committee is confident that the law enforcement community, while operating under considerable pressure and resource constraints, is working to improve the accuracy of eyewitness identifications. These efforts, however, have not been uniform and often fall short as a result of insufficient training, the absence of standard operating procedures, and the continuing presence of actions and statements at the crime scene and elsewhere that may intentionally or unintentionally influence eyewitnesses’ identifications.

Basic scientific research on human visual perception and memory has provided an increasingly sophisticated understanding of how these systems work and how they place principled limits on the accuracy of eyewitness identification.2 Basic research alone is insufficient for understanding conditions in the field, and thus has been augmented by studies applied to the specific practical problem of eyewitness identification. Applied research has identified key variables that affect the accuracy and reliability of eyewitness identifications and has been instrumental in informing law enforcement, the bar, and the judiciary of the frailties of eyewitness identification testimony.

A range of best practices has been validated by scientific methods and research and represents a starting place for efforts to improve eyewitness identification procedures. A number of law enforcement agencies have, in fact, adopted research-based best practices. This report makes actionable recommendations on, for example, the importance of adopting “blinded” eyewitness identification procedures. It further recommends that standardized and easily understood instructions be provided to eyewitnesses and calls for the careful documentation of eyewitnesses’ confidence statements. Such improvements may be broadly implemented by law enforcement now.
It is important to recognize, however, that, in certain cases, the state of scientific research on eyewitness identification is unsettled. For example, the relative superiority of competing identification procedures (i.e., simultaneous versus sequential lineups) is unresolved.

2 Basic research on vision and memory seeks a comprehensive understanding of how these systems are organized and how they operate generally. The understanding derived from basic research includes principles that enable one to predict how a system (such as vision or memory) might behave under specific conditions (such as those associated with witnessing a crime) and to identify the conditions under which it will operate most effectively and those under which it will fail. Applied research, by contrast, empirically evaluates specific hypotheses about how a system will behave under a particular set of real-world conditions.

The field would benefit from collaborative research among scientists and law enforcement personnel in the identification and validation of new best practices that can improve eyewitness identification procedures. Such a foundation can be solidified through the use of more effective research designs (e.g., those that consider more than one variable at a time, and in different study populations to ensure reproducibility and generalizability), more informative statistical measures and analyses (i.e., methods from statistical machine learning and signal detection theory to evaluate the performance of binary classification tasks), more probing analyses of research findings (such as analyses of consequences of data uncertainties), and more sophisticated systematic reviews and meta-analyses (that take account of current guidelines, including transparency and reproducibility of methods).
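As an illustration of why measures drawn from signal detection theory can be more informative than a single ratio, the following minimal sketch compares two lineup procedures. It is not taken from the report: the counts are invented for illustration, the function name `lineup_measures` is hypothetical, and the equal-variance Gaussian model behind d′ is an assumption. The diagnosticity ratio is taken here as the culprit identification rate divided by the innocent-suspect identification rate.

```python
from statistics import NormalDist


def lineup_measures(hits, culprit_present, false_alarms, culprit_absent):
    """Summarize one lineup procedure from (hypothetical) experimental counts.

    hits            -- culprit identifications in culprit-present lineups
    culprit_present -- number of culprit-present lineups
    false_alarms    -- innocent-suspect identifications in culprit-absent lineups
    culprit_absent  -- number of culprit-absent lineups
    """
    hit_rate = hits / culprit_present
    false_alarm_rate = false_alarms / culprit_absent
    # Inverse of the standard normal CDF; it raises an error for rates of
    # exactly 0 or 1, so real data would need a correction for such cells.
    z = NormalDist().inv_cdf
    return {
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
        # Ratio of correct to incorrect suspect identification rates.
        "diagnosticity_ratio": hit_rate / false_alarm_rate,
        # Discriminability under the assumed equal-variance Gaussian model.
        "d_prime": z(hit_rate) - z(false_alarm_rate),
        # Response criterion; larger values mean more conservative responding.
        "criterion": -0.5 * (z(hit_rate) + z(false_alarm_rate)),
    }


# Invented counts for illustration only.
procedure_a = lineup_measures(60, 100, 20, 100)
procedure_b = lineup_measures(30, 100, 5, 100)

for name, m in (("A", procedure_a), ("B", procedure_b)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

With these invented counts, procedure B has twice the diagnosticity ratio of procedure A, yet the two have nearly the same d′ and differ mainly in the response criterion. Under the stated assumptions, B's witnesses are not discriminating better; they are choosing less often.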
In view of the complexity of the effects of both system and estimator variables and their interactions on eyewitness identification accuracy, better experimental designs that incorporate selected combinations of these variables (e.g., presence or absence of a weapon, lighting conditions, etc.) will elucidate those variables with meaningful influence on eyewitness performance, which can, in turn, inform law enforcement practice of eyewitness identification procedures. To date, the eyewitness literature has evaluated procedures mostly in terms of a single diagnosticity ratio or an ROC (Receiver Operating Characteristic) curve; even if uncertainty is incorporated into the analysis, many other powerful tools for evaluating a “binary classifier” are available and worthy of consideration.3 Finally, syntheses of eyewitness research have been limited to meta-analyses that have not been conducted in the context of systematic reviews. Systematic reviews of stronger research studies need to conform to current standards and be translated into terms that are useful for decision makers.

3 T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (New York: Springer, 2009).

The committee here offers a summary of its key recommendations to strengthen the effectiveness of policies and procedures used to obtain accurate eyewitness identifications.

RECOMMENDATIONS TO ESTABLISH BEST PRACTICES FOR THE LAW ENFORCEMENT COMMUNITY

The committee’s review of law enforcement practices and procedures, coupled with its consideration of the scientific literature, has identified a number of areas where eyewitness identification procedures could be strengthened. The practices and procedures considered here involve acquisition of data that reflect a witness’ identification and the contextual factors that bear on that identification. A recurrent theme underlying the committee’s recommendations is development of and adherence to guidelines that are consistent with scientific standards for data collection and reporting.

Recommendation #1: Train All Law Enforcement Officers in Eyewitness Identification

The committee recommends that all law enforcement agencies provide their officers and agents with training on vision and memory and the variables that affect them, on practices for minimizing contamination, and on effective eyewitness identification protocols.

Recommendation #2: Implement Double-Blind Lineup and Photo Array Procedures

The committee recommends blind (double-blind or blinded) administration of both photo arrays and live lineups and the adoption of clear, written policies and training on photo array and live lineup administration.

Recommendation #3: Develop and Use Standardized Witness Instructions

The committee recommends the development of a standard set of easily understood instructions to use when engaging a witness in an identification procedure.

Recommendation #4: Document Witness Confidence Judgments

The committee recommends that law enforcement document the witness’ level of confidence verbatim at the time when she or he first identifies a suspect.

Recommendation #5: Videotape the Witness Identification Process

The committee recommends that the video recording of eyewitness identification procedures become standard practice.

RECOMMENDATIONS TO STRENGTHEN THE VALUE OF EYEWITNESS IDENTIFICATION EVIDENCE IN COURT

The best guidance for legal regulation of eyewitness identification evidence comes not from constitutional rulings, but from the careful use and understanding of scientific evidence to guide fact-finders and decision-makers. The Manson v. Brathwaite test under the Due Process Clause of the U.S.
Constitution for assessing eyewitness identification evidence was established in 1977, before much applied research on eyewitness identification had been conducted. This test evaluates the “reliability” of eyewitness identifications using factors derived from prior rulings and not from empirically validated sources. As critics have pointed out, the Manson v. Brathwaite test includes factors that are not diagnostic of reliability. Moreover, the test treats factors such as the confidence of a witness as independent markers of reliability when, in fact, it is now well established that confidence judgments may vary over time and can be powerfully swayed by many factors. While some states have made minor changes to the due process framework, wholesale reconsideration of this framework is only a recent development.

Recommendation #6: Conduct Pretrial Judicial Inquiry

The committee recommends that, as appropriate, a judge make basic inquiries when eyewitness identification evidence is offered.

Recommendation #7: Make Juries Aware of Prior Identifications

The committee recommends that judges take all necessary steps to make juries aware of prior identifications, the manner and time frame in which they were conducted, and the confidence level expressed by the eyewitness at the time.

Recommendation #8: Use Scientific Framework Expert Testimony

The committee recommends that judges have the discretion to allow expert testimony on relevant precepts of eyewitness memory and identifications.

Recommendation #9: Use Jury Instructions as an Alternative Means to Convey Information

The committee recommends the use of clear and concise jury instructions as an alternative means of conveying information regarding the factors that the jury should consider.

RECOMMENDATIONS TO IMPROVE THE SCIENTIFIC FOUNDATION UNDERPINNING EYEWITNESS IDENTIFICATION RESEARCH

Basic scientific research on visual perception and memory provides important insight into the factors that can limit the fidelity of eyewitness identification. Research targeting the specific problem of eyewitness identification complements basic scientific research. However, this strong scientific foundation remains insufficient for understanding the strengths and limitations of eyewitness identification procedures in the field. Many of the applied studies on key factors that directly affect eyewitness performance in the laboratory are not readily applicable to actual practice and policy. Applied research falls short because of a lack of reliable or standardized data from the field, a failure to include a range of practitioners in the establishment of research agendas, the use of disparate research methodologies, failure to use transparent and reproducible research procedures, and inadequate reporting of research data. The task of guiding eyewitness identification research toward the goal of evidence-based policy and practice will require collaboration in the setting of research agendas and agreement on methods for acquiring, handling, and sharing data.

Recommendation #10: Establish a National Research Initiative on Eyewitness Identification

The committee recommends the establishment of a National Research Initiative on Eyewitness Identification.

Recommendation #11: Conduct Additional Research on System and Estimator Variables

The committee recommends broad use of statistical tools that can render a discriminability measure to evaluate eyewitness performance and a rigorous exploration of methods that can lead to more conservative responding.
The committee further recommends that caution and care be used when considering changes to any existing lineup procedure, until such time as there is clear evidence for the advantages of doing so.

CONCLUSION

Eyewitness identification can be a powerful tool. As this report indicates, however, the malleable nature of human visual perception, memory, and confidence; the imperfect ability to recognize individuals; and policies governing law enforcement procedures can result in mistaken identifications with significant consequences. New law enforcement training protocols, standardized procedures for administering lineups, improvements in the handling of eyewitness identification in court, and better data collection and research on eyewitness identification can improve the accuracy of eyewitness identifications.

1 Introduction

Accurate eyewitness identifications1 may aid in the apprehension and prosecution of the perpetrators of crimes. However, inaccurate identifications may lead to the prosecution of innocent persons while the guilty party goes free. It is therefore crucial to develop eyewitness identification procedures that achieve maximum accuracy and reliability.

Eyewitness evidence is not infallible. In 1932, Yale University law professor Edwin M. Borchard documented nearly seventy cases of miscarriage of justice caused by eyewitness errors in his book, Convicting the Innocent.2 Years later, in 1967, the U.S. Supreme Court highlighted the danger of erroneous eyewitness identification in United States v.
Wade, stating, “The vagaries of eyewitness identification are well-known; the annals of criminal law are rife with instances of mistaken identification.”3

The Federal Bureau of Investigation (FBI) estimates that U.S. law enforcement made 12,196,959 arrests in 2012. The FBI estimates that 521,196 of these arrests were for violent crimes.4 Accurate data on the number of crimes observed by eyewitnesses are not available. If only a fraction of the violent crimes in the United States involve an eyewitness, the number must be sizable. One estimate based on a 1989 survey of prosecutors suggests that at least 80,000 eyewitnesses make identifications of suspects in criminal investigations each year.5

1 Throughout this report, the term identification denotes person recognition. Eyewitness identification refers to recognition by a witness to a crime of a culprit unknown to the witness.
2 Edwin M. Borchard, Convicting the Innocent: Sixty-Five Actual Errors of Criminal Justice (New York: Garden City Publishing Company, Inc., 1932).
3 United States v. Wade, 388 U.S. 230, 288 (1967).
4 Federal Bureau of Investigation, “Crime in the United States 2012: Persons Arrested,” available at: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2012/crime-in-the-u.s.-2012/persons-arrested/persons-arrested.

BOX 1-1
The Ronald Cotton Case a

In 1984, a college student named Jennifer Thompson was raped in her apartment in Burlington, North Carolina. The police asked her to help create a composite sketch of the rapist. The police then received a tip that a local man named Ronald Cotton resembled the composite, and shortly after the crime, Thompson was shown a photo array containing six photos. With some difficulty, she chose two pictures, one of which was of Cotton. Finally, she said, “I think this is the guy,” pointing to Cotton. “You’re sure,” the lead detective asked, and she responded, “Positive.” Thompson asked, “Did I do OK?” The detectives responded, “You did great.” She has described how those encouraging remarks had the effect of making her more confident in her identification.

The police then showed Thompson a live lineup. Cotton was the only person repeated from the prior photo array. This would make Cotton more familiar and might suggest that he was the prime suspect. Nevertheless, Thompson remained hesitant and was having trouble deciding between two people. After several minutes, she told the police that Cotton “looks the most like him.” The lead detective asked “if she was certain,” and she said, “Yes.” Again, the detectives further reinforced her decision. The lead detective told Thompson, “It’s the same person you picked from the photos.” She later described feeling a “huge amount of relief” when told that she had again picked the right person.

At Ronald Cotton’s criminal trial, Thompson agreed she was “absolutely sure” that he was the rapist. Cotton was sentenced to life in prison plus 54 years. He served 10.5 years before DNA tests exonerated him and implicated another man, Bobby Poole. Not only did the identification procedures increase Thompson’s confidence in the mistaken memory event, but they also resulted in her rejection of the actual culprit. Poole had been presented to Thompson at a post-trial hearing, and she could not recognize him. “I have never seen him in my life,” she said at the time.

In response to this error, the lead detective in the case, Mike Gauldin, later as police chief, was the first in the state to institute a series of new practices, including double-blind lineup procedures. In the years that followed, North Carolina adopted such practices statewide. Ronald Cotton and Jennifer Thompson have since written a book, Picking Cotton, that describes their case and experiences.

a See, generally, http://www.cbsnews.com/news/eyewitness-how-accurate-is-visual-memory/ and http://www.slate.com/articles/news_and_politics/jurisprudence/features/2011/getting_it_wrong_convicting_the_innocent/how_eyewitnesses_can_send_innocents_to_jail.html.

Recently, post-conviction DNA exonerations of innocent persons have dramatically highlighted the problems with eyewitness identifications.6,7 In the United States, more than 300 exonerations have resulted from post-conviction DNA testing since 1989.8 According to the Innocence Project, at least one mistaken eyewitness identification was present in almost three-quarters of DNA exonerations.9 In many of these cases, eyewitness identification played a significant evidentiary role, and almost without exception, the eyewitnesses who testified expressed complete confidence that they had chosen the perpetrator. Many eyewitnesses testified with high confidence despite earlier expressions of uncertainty.10 For example, in the well-known case of Ronald Cotton (see Box 1-1), Jennifer Thompson (the victim) has described how she was initially quite unsure of her eyewitness identification of Cotton, a man later exonerated by DNA testing. She became certain it was Cotton only after the police made confirmatory remarks and had her participate in two identification procedures where Cotton was the only person shown both times. Erroneous eyewitness identifications can occur across the range of criminal convictions in which eyewitness evidence is presented, but most of these cases lack the biological material that can be tested for DNA and used for exoneration purposes.
While eyewitness misidentifications may have been a dominant factor in some erroneous convictions, it is important to note that other factors, including errors at various stages of the legal and judicial processes, may have contributed to the erroneous convictions.

5 A. G. Goldstein, J. E. Chance, and G. R. Schneller, “Frequency of Eyewitness Identification in Criminal Cases: A Survey of Prosecutors,” Bulletin of the Psychonomic Society 27(1): 71, 73 (January 1989).
6 CNN, “Exonerated: Cases by the Numbers,” December 4, 2013, available at: http://www.cnn.com/2013/12/04/justice/prisoner-exonerations-facts-innocence-project/.
7 Taryn Simon, “Freedom Row,” New York Times Magazine, January 26, 2003.
8 The Innocence Project, “DNA Exoneree Case Profiles,” available at: http://www.innocenceproject.org/know/.
9 The Innocence Project, “Eyewitness Identification,” available at: http://www.innocenceproject.org/fix/Eyewitness-Identification.php.
10 Brandon L. Garrett, Convicting the Innocent: Where Criminal Prosecutions Go Wrong 63–68 (Cambridge, MA: Harvard University Press, 2011).

CHARGE TO THE COMMITTEE

In 2013, the Laura and John Arnold Foundation called on the National Research Council (NRC) to assess the state of scientific research on eyewitness identification and to recommend best practices11 for handling eyewitness identifications by law enforcement and the courts. The goal of this effort was to evaluate the scientific basis for eyewitness identification, to help establish the scientific foundation for effective real-world practices, and to facilitate the development of policies to improve eyewitness identification validity in the context of the American justice system.
In response to this charge, the NRC appointed an ad hoc committee, the Committee on Scientific Approaches to Understanding and Maximizing the Validity and Reliability of Eyewitness Identification in Law Enforcement and the Courts (hereinafter, the committee), to undertake this study (see Box 1-2 for the committee’s charge). The committee met three times, held numerous conference calls, heard from various stakeholders (see Appendix B), and reviewed extensive research on eyewitness identification before reaching its findings and recommendations.

11 For the purposes of this report, the committee characterizes best practice as the adoption of standardized procedures based on scientific principles. The committee does not make any endorsement of practices designated as best practices by other bodies.

BOX 1-2
Charge to the Committee

The charge to the NRC was to:

1. critically assess the existing body of scientific research as it relates to eyewitness identification;
2. identify any gaps in the existing body of literature and suggest, as appropriate, research questions to pursue that will further our understanding of eyewitness identification and that might offer additional insight into law enforcement and courtroom practice;
3. provide an assessment of what can be learned from research fields outside of eyewitness identification;
4. offer recommendations for best practices in the handling of eyewitness identifications by law enforcement;
5. offer recommendations for developing jury instructions;
6. offer advice regarding the scope of a Phase II consideration of neuroscience research as well as any other areas of research that might have a bearing on eyewitness identification; and
7. write a consensus report with appropriate findings and recommendations.

SCIENCE AND LAW

Law enforcement officers investigating crimes rely on eyewitness identification procedures to verify that a suspect is the individual seen by an eyewitness.12 Such procedures can take place under conditions that may have significant effects on the accuracy and reliability of an eyewitness’ identification. Unlike officers in the field, laboratory researchers have, in theory, greater control over influences that might contaminate the visual perceptual experience and memory of an eyewitness.

Science is a self-correcting enterprise. Researchers formulate and test hypotheses using observations and experiments, which are then subject to independent review. In science, evidence and data are analyzed and experiments are repeated to ensure that biases or other factors do not lead to incorrect conclusions. Scientific progress results from the review and revision of earlier results and conclusions.

The culture of scientific research is markedly different from a legal culture that must seek definitive results in individual cases. In 1993, in Daubert v. Merrell Dow Pharmaceuticals, Inc., the U.S. Supreme Court ruled that, under Rule 702 of the Federal Rules of Evidence (which covers both civil and criminal trials in the federal courts), a “trial judge must ensure that any and all scientific testimony or evidence admitted is not only relevant, but reliable.”13

Criminal justice and legal personnel have come to rely on eyewitness evidence. Law enforcement officials have first-hand experience with eyewitnesses in criminal investigations and trials, and over the years, some jurisdictions have implemented and strengthened practices and procedures in an attempt to improve accuracy. Consequently, the law enforcement and legal communities have made important contributions to our understanding of eyewitness identifications and the improvement of practices in the field.
Researchers have become increasingly involved in assessing eyewitness identification procedures as law enforcement, lawyers, and judges have themselves sought more accurate procedures and approaches. In the 2009 National Research Council report, Strengthening Forensic Science in the United States: A Path Forward, the committee noted, “in addition to protecting innocent persons from being convicted of crimes that they did not commit, we are also seeking to protect society from persons who have committed criminal acts.”14 This shared common goal of protecting innocent persons and society makes collaboration between the scientific, law enforcement, and legal communities critically important.

12 For ease of reading, throughout the report the committee will use the term officer to mean law enforcement officials and professionals.
13 Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). The Court also noted that “there are important differences between the quest for truth in the courtroom and the quest for truth in the laboratory. Scientific conclusions are subject to perpetual revision. Law, on the other hand, must resolve disputes finally and quickly.”

IDENTIFYING THE CULPRIT

Officers typically use three procedures to identify a perpetrator whose identity is unknown: (1) showups; (2) presentations of photo arrays; and (3) live lineups. A showup is a procedure in which officers present a single criminal suspect to a witness. This procedure usually occurs near the crime location and immediately or shortly after the crime has occurred. Officers also use photo arrays and live lineups, in which they ask the witness to view numerous individuals, one of whom may be the perpetrator. The suspect is presented along with fillers (known non-suspects).
Currently, photo arrays are used more often than live lineups.15,16

If the eyewitness makes a positive identification during a showup, a photo array, or a lineup, the identification may constitute evidence about a suspect’s involvement in a crime. The eyewitness identification may, when considered with other available evidence, establish probable cause to support an arrest. Such evidence may play a pivotal role in enabling the prosecution to meet its burden of proof in a subsequent trial.

In recent years, more law enforcement agencies have created written eyewitness identification policies and have adopted formalized training. However, there are many agencies that do not have standard written policies or formalized training for the administration of identification procedures or for ongoing interactions with witnesses.17

14 National Research Council, Strengthening Forensic Science in the United States: A Path Forward (Washington, DC: The National Academies Press, 2009), p. 12.
15 Police Executive Research Forum, “A National Survey of Eyewitness Identification Procedures in Law Enforcement Agencies,” March 2013, p. 48. The survey indicates that 94.1 percent of responding law enforcement agencies reported that they use photo arrays, while only 21.4 percent reported using live lineups. Sixty-one point eight percent of agencies reported that they use showups. See also J. S. Neuschatz et al., “Comprehensive Evaluation of Showups,” in Advances in Psychology and Law, ed. M. Miller and B. Bornstein (New York: Springer, in press).
16 Throughout the report, unless otherwise specified, references to lineups refer to both photo arrays and live lineups.
17 Police Executive Research Forum, p. 65.

VISION AND MEMORY

At its core, eyewitness identification relies on brain systems for visual perception and memory: The witness perceives the face and other aspects of the perpetrator’s physical appearance and bearing, stores that information in memory, and later retrieves the information for comparison with the visual percept of an individual in a lineup. Recent years have seen great advances in our scientific understanding of the basic mechanisms, operational strategies, and limitations of human vision and memory. These advances inform our understanding of the accuracy of eyewitness identification.

Human vision does not capture a perfect, error-free “trace” of a witnessed event. What an individual actually perceives can be heavily influenced by bias18 and expectations derived from cultural factors, behavioral goals, emotions, and prior experiences with the world. For eyewitness identification to take place, perceived information must be encoded in memory, stored, and subsequently retrieved. As time passes, memories become less stable. In addition, suggestion and the exposure to new information may influence and distort what the individual believes she or he has seen.

Several factors are known to affect the fidelity of visual perception and the integrity of memory. In particular, vision and memory are constrained by processing bottlenecks and various sources of noise.19 Noise comes from a variety of sources, some associated with the structure of the visual environment, some inherent in the optical and neuronal processes involved, some reflecting sensory content not relevant to the observer’s goals, and some originating with incorrect expectations derived from memory. The concept of noise has profound significance for understanding eyewitness identification, as the accuracy of information about the environment gained through vision and stored in memory is necessarily, and often sharply, limited by noise.
The recognition of one person by another, a seemingly commonplace and unremarkable everyday occurrence, involves complex processes that are limited by noise and subject to many extraneous influences. Eyewitness identification research confronts methodological challenges that some other basic experimental sciences do not encounter, as well as practical challenges in establishing adequate experimental controls over the numerous variables that affect visual perception and memory.

18 Bias is defined as any tendency that prevents unprejudiced consideration of a question (see Dictionary.com; http://dictionary.reference.com/browse/bias). Response bias is a general term for a wide range of influences that moderate the responses of participants away from an accurate or truthful response. Response bias can be induced or caused by a number of factors, all relating to the idea that humans do not respond passively to stimuli, but rather actively integrate multiple sources of information to generate a response in a given situation [see M. Orne, “On the Social Psychology of the Psychological Experiment: With Particular Reference to Demand Characteristics and Their Implications,” American Psychologist 17: 776–783 (1962)]. In research, bias is seen in sampling or testing when circumstances select or encourage one outcome or answer over another (see Merriam-Webster.com; http://www.merriam-webster.com/dictionary/bias).
19 Noise refers here to factors that cause uncertainty on the part of an individual about whether a particular signal (e.g., a specific visual stimulus) is present. This use of the term follows the definition used in electronic signal transmission, in which noise refers to random or irrelevant elements that interfere with detection of coherent and informative signals.
APPLIED RESEARCH ON EYEWITNESS IDENTIFICATION: SYSTEM AND ESTIMATOR VARIABLES

Our understanding of the underlying processes and limits of eyewitness identification, derived from basic research on vision and memory, is complemented by research directed specifically at the problem of eyewitness identification. The modern era of eyewitness identification research began in the 1970s. Today, eyewitness identification is generally viewed as a behavioral output. The accuracy and reliability of eyewitness identification are critically modulated by variables that include a witness’ extant cognition and memory and related psychological and situational factors at the time of the event, over the ensuing intervals, and at all stages of recall (see Figure 1-1). Because a crime is an unexpected event, one can draw a natural distinction between variables that reflect the witness’ unplanned situational or cognitive state at the time of the crime and the variables that reflect controllable conditions and internal states following the witnessed events. Researchers categorize these factors, respectively, as estimator variables and system variables.20

System variables describe the characteristics of specific procedures and practices (e.g., the content and nature of instructions given to witnesses who are asked if they are able to make an identification). The criminal justice system can exert some control over system variables by following standardized procedures that are based on scientific knowledge and strengthened through education and training.

One important category of system variables concerns the conditions and protocols for lineup identification. Under current law enforcement practice, eyewitness identification procedures involve having a witness view individuals or images of individuals.
Research indicates that accuracy and reliability of eyewitness identifications may be influenced by the type of presentation (e.g., lineup) used, the likeness of non-suspect lineup participants (fillers) to the suspect, the number of fillers, and the suspect’s physical location in the presentation.21,22 Eyewitness performance may be affected by how the lineup images are presented: simultaneously (as a group) or sequentially (one at a time). System variables, such as the nature of the instructions and feedback provided before and after the identification procedure, may also affect the eyewitness’ identification.

20 G. L. Wells, “Applied Eyewitness-Testimony Research: System Variables and Estimator Variables,” Journal of Personality and Social Psychology 36(12): 1546–1557 (1978).
21 N. K. Steblay et al., “Eyewitness Accuracy Rates in Police Showup and Lineup Presentations: A Meta-Analytic Comparison,” Law and Human Behavior 27(5): 523–540 (October 2003).
22 R. J. Fitzgerald et al., “The Effect of Suspect-Filler Similarity on Eyewitness Identification Decisions: A Meta-analysis,” Psychology, Public Policy, and Law 19(2): 151–164 (May 2013).

Estimator variables affect the accuracy of eyewitness identification, but they are beyond the control of the criminal justice system. Estimator variables tend to be associated with characteristics of the witness or factors that are operating either at the time of the criminal event (perhaps relating to memory encoding) or the retention interval (the time between witnessing an event and the identification process).
Specific examples include the eyewitness’ level of stress or trauma at the time of the incident, the light level and nature of the visual conditions that affect visibility and the clarity of a perpetrator’s features, and the physical distance between the witness and the perpetrator. Both system and estimator variables will be discussed in detail in subsequent chapters.

FIGURE 1-1 Memory accuracy and time. SOURCE: Courtesy of Thomas D. Albright.

EFFORTS AT IMPROVEMENT

In response to insights gained from research on erroneous convictions, there have been attempts to provide recommendations for improving the reliability and validity of eyewitness identifications. An effort of particular note is the National Institute of Justice’s (NIJ) Technical Working Group for Eyewitness Evidence (TWGEYEE). Called together by then-U.S. Attorney General Janet Reno in 1998, members of the working group were asked to develop and publish guidance for improving eyewitness identification procedures.23 The working group recognized the role that memory plays in the mistaken interpretation and remembrance of events and offered guidance based on the practical experiences of the law enforcement community and insights gained from behavioral and psychological research.

The NIJ provided detailed instructions for each step of the eyewitness identification procedure to the approximately 18,000 state and local law enforcement agencies across the nation. After the report was issued, only a few states conducted evaluations and engaged in improvement efforts, including the implementation of new laws and the issuance of corrective guidelines and policies. Consequently, eyewitness identification policies remain fragmented by jurisdiction, except in a minority of states that have adopted state-wide policies.
At present, the United States does not have a uniform national set of protocols.24

JUDICIAL CONSIDERATION OF EYEWITNESS IDENTIFICATION EVIDENCE

The U.S. Supreme Court’s 1977 ruling in Manson v. Brathwaite provides the current framework for judicial review of eyewitness identification under the Due Process Clause of the U.S. Constitution.25 The Manson v. Brathwaite test asks judges to evaluate the “reliability” of eyewitness identifications using factors derived from prior rulings and not from empirically validated sources. The Manson v. Brathwaite ruling was not based on much of the research conducted by scientists on visual perception, memory, and eyewitness identification, and it fails to include important advances that have strengthened standards for judicial review of eyewitness identification evidence at the state level.

In 2011, the Justices of the Massachusetts Supreme Judicial Court convened the Study Group on Eyewitness Identification to “offer guidance as to how our courts can most effectively deter unnecessarily suggestive identification procedures and minimize the risk of a wrongful conviction.” The report made five recommendations to minimize inaccurate identifications: (1) acknowledge variables affecting identification accuracy; (2) develop a model policy and implement best practices for police departments; (3) expand use of pretrial hearings; (4) expand use of improved jury instructions; and (5) offer continuing education.26

23 U.S. Department of Justice, Office of Justice Programs, Eyewitness Evidence: A Guide for Law Enforcement (Washington, DC, 1999).
24 Police Executive Research Forum, p. 65.
25 Manson v. Brathwaite, 432 U.S. 98, 114 (1977).
26 Massachusetts Supreme Judicial Court Study Group on Eyewitness Identification, Report and Recommendations to the Justices, July 24, 2013, available at: http://www.mass.gov/courts/docs/sjc/docs/eyewitness-evidence-report-2013.pdf.
In 2011, the New Jersey Supreme Court issued a unanimous decision in State v. Larry R. Henderson. The opinion revised the legal framework for evaluating and admitting eyewitness identification evidence and directed that improved jury instructions be prepared to help jurors evaluate such evidence. Henderson drew on an extensive review of scientific evidence regarding human vision, memory, and the various factors that can affect the reliability of eyewitness identifications. In July 2012, the court released expanded jury instructions and revised court rules relating to eyewitness identifications in criminal cases.27

In fall 2012, the Oregon Supreme Court also established a new procedure for evaluating whether eyewitness identifications could be used in court. In State v. Lawson, the Court reviewed eyewitness identification research conducted over the past 30 years, determined that the Manson v. Brathwaite test “does not accomplish its goal of ensuring that only sufficiently reliable identifications are admitted into evidence,” and offered a revised procedure that requires the court to make a determination of whether investigators used “suggestive” tactics to get an identification and the extent to which other information supports the identification.28

Despite these improvements and judicial decisions, policies and practices across the country remain inconsistent.

ORGANIZATION OF THE REPORT

This report begins with a description of law enforcement protocols for eyewitness identification (Chapter 2). Chapter 3 presents the legal framework for eyewitness identification evidence. A discussion of the current scientific understanding of visual perception and memory follows in Chapter 4. In Chapter 5, the committee provides an assessment of eyewitness identification research. The report concludes with the committee’s findings and recommendations (Chapter 6).
27 New Jersey Judiciary, “Supreme Court Releases Eyewitness Identification Criteria for Criminal Cases,” July 19, 2012, available at: http://www.judiciary.state.nj.us/pressrel/2012/pr120719a.htm.
28 State v. Lawson, 352 Or. 724 (Or. 2012).

2 Eyewitness Identification Procedures

Police in the United States investigate millions of crimes each year.1 Only a small percentage of the police-investigated crimes involve the use of police-arranged identification procedures. Identification procedures are unnecessary when, for example, the perpetrator is caught during the commission of the criminal act, as in the crime of driving while intoxicated, or when the victim knows the perpetrator, as in crimes of domestic violence.2

Police use identification procedures for numerous reasons. In some circumstances, the police identify a suspect during an investigation and use the identification procedure to test a witness’ ability to identify the suspect as the perpetrator. In other instances, the identification procedure is used as an investigative tool to further an investigation. A positive identification might form probable cause for a search warrant or the apprehension and subsequent questioning of a suspect, or both. Most significant for the purposes of this report are the circumstances in which a witness positively identifies the police suspect as the perpetrator, and the identification serves as compelling evidence in the prosecution of a case.

Data on the number of eyewitness identification procedures are not systematically or uniformly collected.
While the exact number of eyewitness identification procedures related to crimes involving strangers is unknown, mistaken identifications have disastrous effects for those wrongly accused of crimes and for society should a guilty person go free. Mistaken identifications may also erode public confidence in the criminal justice system as a whole.3 Recently, some police departments and prosecutors have implemented stringent eyewitness identification procedures in an effort to reduce erroneous convictions.4

1 Federal Bureau of Investigation, “Crime in the United States 2012: Persons Arrested,” available at: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2012/crime-in-the-u.s.-2012/persons-arrested/persons-arrested.
2 Throughout Chapter 2, the terms law enforcement and police are used interchangeably and refer to all law enforcement agencies at the local, state, and federal levels.

Police-arranged eyewitness identification procedures vary greatly depending on the nature of the case. In some cases, a police-arranged identification is conducted at the very early stages of an investigation. For instance, consider the circumstance in which police respond to a bank robbery in progress. The perpetrator is described as a white male, approximately 6 feet, 2 inches in height wearing an orange shirt. As the police arrive at the crime scene, an officer observes and apprehends a man fleeing the bank wearing an orange shirt and exhibiting similar physical characteristics. In this situation, a police-arranged identification procedure may be conducted on the scene and prior to any significant investigation. At the other extreme are, for example, lengthy homicide or rape cases that include extensive investigations, forensic testing, and eyewitness interviews conducted over a protracted period of time.
Such efforts may culminate in the identification of a suspect and the suspect’s inclusion in a photo array identification procedure. In such a circumstance, an eyewitness may not be asked to identify a perpetrator until months after the commission of the crime, and often after repeated probes of her or his memory by, for example, police, family members, and others.

Identification procedures may be used in different ways for different purposes. They are not always used to identify an unknown perpetrator of a crime. The police may, for example, use photo arrays and confirmatory single photographs to clarify the legal identity (birth name/government name) of an individual who is well known to a witness, but only by a street name. In such examples, a witness may know (and may have known) the perpetrator for years but may only be able to identify him by a common street name, such as “Prince.” The police typically will use an identification procedure to identify the “Prince” to which the witness is referring before they make an arrest or take other investigative measures such as the execution of a search warrant.

3 See, generally, The International Association of Chiefs of Police, “National Summit on Wrongful Convictions: Building a Systemic Approach to Prevent Wrongful Convictions,” August 2013.
4 See The Innocence Project, Eyewitness Identification, available at: http://www.innocenceproject.org/fix/Eyewitness-Identification.php; U.S. Department of Justice, Office of Justice Programs, Eyewitness Evidence: A Guide for Law Enforcement (Washington, DC, 1999); Metropolitan Police—District of Columbia, General Order—Procedures for Obtaining Pretrial Eyewitness Identification, April 18, 2013; New York State District Attorneys Association Best Practice Committee, New York State Photo Identification Guidelines, October 2010; Rhode Island Police Chiefs Association, Lineup and Showup Procedures (Eyewitness Identification), November 2011; and Innocence Project of Texas, Eyewitness Identification Reform, available at: http://www.ipoftexas.org/eyewitness-id.

This chapter reviews the eyewitness identification procedures commonly used by the police and concludes with a brief discussion of situations in which citizens engage in identifying perpetrators without police assistance.

PHOTOGRAPHIC ARRAY

The photo array is the most common police-arranged identification procedure used in the United States.5 A photo array consists of six to nine photographs displayed to a witness. An officer might create an array by selecting photographs of persons deemed to resemble the perpetrator.6 Officers might then display the photographs one at a time to the witness and ask whether she or he recognizes each one. This method is known as a sequential procedure. Officers might also create photo arrays by cutting six square holes in a folder and taping the photographs to the back of the folder so that the faces of the fillers (non-suspects) and suspect are displayed together. When such photographs are presented simultaneously as a two by three matrix, this type of array is referred to as a “six pack.” When, as in this instance, photographs are displayed together, this is referred to as a simultaneous procedure.

In 1999, Attorney General Janet Reno released the U.S. Department of Justice, Eyewitness Evidence: A Guide for Law Enforcement,7 one of the earliest efforts to establish standardized procedures for police-arranged eyewitness identification. The guide set forth rigorous criteria and basic procedures to promote accuracy in eyewitness evidence.8 However, after the guide was released, most police departments in the United States did not adopt these procedures.
Today, many police departments use computer systems to access image databases and assemble photo arrays. Officers enter physical characteristics (e.g., race, gender, hair color) specific to the suspect into a computer, and the system retrieves filler photographs with the desired attributes. If an officer determines that a photograph in the array is suggestive or otherwise inappropriate, she or he can reject one or more fillers and instruct the system to provide alternate photographs. Departments may conduct the procedure without revealing to the witness how many photographs she or he will view.

5 Police Executive Research Forum, “A National Survey of Eyewitness Identification Procedures in Law Enforcement Agencies,” March 2013, p. 48.
6 Historically, the photographs were mug shots in the possession of a police department.
7 U.S. Department of Justice, Office of Justice Programs, Eyewitness Evidence: A Guide for Law Enforcement (Washington, DC, 1999).
8 Ibid, pp. 11–38.

In recent decades, many police agencies and prosecutors have adopted sequential presentation of photographs, based on the belief that this approach improves the performance of an eyewitness. Currently, however, there is no consensus among law enforcement professionals as to whether the sequential presentation procedure is superior to the simultaneous procedure (see Chapter 5).
The District of Columbia Metropolitan Police Department, for example, does not endorse either simultaneous or sequential procedures in its Procedures for Obtaining Pretrial Eyewitness Identification.9 The District Attorneys Association of the State of New York in 2010 adopted recommended policies for New York State and endorsed the simultaneous method.10 On the other hand, in North Carolina, legislation was passed that requires that lineup photographs be presented sequentially,11 and in Massachusetts, the Supreme Judicial Court Study Group on Eyewitness Identification recommended sequential procedures as best practice for Massachusetts Police Departments.12

The committee was presented with information regarding improvement efforts from states including New Jersey, Oregon, Rhode Island, Texas, New York, and Massachusetts. However, the committee is unable to determine the percentage of police departments that have adopted policies for eyewitness identification procedures and instituted training in these procedures.13

Some police departments require that photo arrays be presented to the witness during a procedure that is either “double blind” or “blinded.”14 (See Box 2-1 for a discussion of blinding as used in scientific practice and blinding as used in eyewitness identification procedures.) Blinding is used to prevent conscious and unconscious cues from being given to the witness. In a double-blind procedure, an individual who does not know the identity of the suspect or the suspect’s position in the photo array shows a photo array to the eyewitness. In cases where such a double-blind procedure is not feasible, a “blinded” procedure will approximate the condition of double-blinding. For example, the photo array may be administered by an individual who knows who the suspect is, but is unable to tell when the witness is looking at the suspect’s photo and so is unable to provide even subconscious feedback to the witness. In one common “blinded” procedure, the officer places each photo in a separate envelope or folder and then shuffles the envelopes/folders so that only the witness sees the images therein. Additional recommendations to minimize the possibility of biasing feedback to the witness include requiring that the officer read instructions to the witness from a pre-printed form.15

9 See Metropolitan Police—District of Columbia, General Order—Procedures for Obtaining Pretrial Eyewitness Identification, April 18, 2013.
10 See New York State District Attorneys Association Best Practice Committee, New York State Photo Identification Guidelines, October 2010.
11 N.C. Gen. Stat. § 15A-284.52 (West 2007).
12 See Massachusetts Supreme Judicial Court Study Group on Eyewitness Identification, Report and Recommendations to the Justices (2013).
13 The Police Executive Research Forum’s 2013 survey of eyewitness identification procedures in law enforcement agencies [Police Executive Research Forum, A National Survey of Eyewitness Identification Procedures in Law Enforcement Agencies (2013)] notes that most agencies that completed the survey have no written policy for eyewitness identification procedures and that more agencies provide training to their employees than have written policies. See pp. 79–80.
14 Police Executive Research Forum, p. 64.

If the witness identifies someone from the photo array, some departments ask the witness for a confidence statement. Based upon information presented to the committee, it appears that police departments do not always document identification procedures in instances when an identification is not made. Further, if a witness does make an identification, practices differ as to how such information is documented and preserved.
Some agencies, for example, require officers to document this information in a written report. Others make audio or video recordings of the identification procedure.

LIVE LINEUP

A live lineup is a police-arranged identification procedure in which the physical suspect and fillers stand or sit in front of the witness (either individually, i.e., sequentially or en masse, i.e., simultaneously). The police generally use at least five fillers. Fillers are selected for their physical similarities to the suspect (gender, race, hair length and color, facial hair, height, skin tone, and other distinguishing features). The fillers are presumed to be unknown to the witness. Traditionally, the suspect and fillers are seated or stood in a row, and the witness views the lineup from behind a two-way mirror. Police use both simultaneous and sequential procedures for live lineups.

Live lineups are used in some jurisdictions, but they are not the predominant method used by law enforcement.16 The use of these police identification procedures is limited for a variety of reasons. First, in certain circumstances, legal counsel may be required at a lineup, thereby making it less attractive to police and prosecutors. Second, in smaller jurisdictions, it may be difficult to obtain suitable fillers (e.g., those with appropriate physical similarities to the suspect). Third, conducting a lineup requires a significant amount of time and labor,17 thereby making photo arrays a more attractive alternative that may be undertaken promptly and with less demand on department resources.

15 As discussed in Chapter 3, the courts have been sensitive to the potential for misidentification resulting from “suggestive” identification procedures and have set standards for admissibility of evidence.
16 Police Executive Research Forum, p. 48.
17 Live lineup construction may be further constrained by the inability to hold a suspect in custody without probable cause. See Chapter 3.

BOX 2-1
Blinding

Empirical evidence[a] has shown that the beliefs, desires, and expectations of researchers can influence, often subconsciously, how they observe and interpret the phenomena they study and thus the outcomes of experiments. This evidence has influenced how scientists carry out their experiments, resulting in the use of blind or double-blind procedures to control for this form of bias. Blind assessment[b] has been used since the late 18th century; an early medical trial in 1835 used double-blind assessment, and psychologists started using blinding in the 20th century.[c] By the 1950s, blind assessment in randomized controlled trials was considered standard procedure in both psychological and medical research. Currently, virtually all of science uses some form of blinding.

In single-blind experiments, participants do not know which treatment they are receiving; this form of blinding is used widely across scientific fields. In experiments involving humans, as in medical or psychological research, double-blind procedures are used to guard against “expectancy effects” for both participants and researchers. In a classic double-blind clinical trial, some patients receive active medication and others are given an alternative (either a “standard treatment” or a similar-looking placebo without active ingredients), but neither researchers nor participants know who is receiving which treatment.

In an eyewitness identification setting, double-blinding can be used to prevent a lineup administrator from either intentionally or unintentionally influencing a witness. In these cases, neither the eyewitness nor the administrator knows which persons in a photo array or live lineup are the suspected culprits and which are the fillers.[d,e] In eyewitness identification procedures, as in science, the purpose of double-blinding is to prevent the conscious or subconscious expectations of the administrator from influencing the witness or research outcomes.

In a double-blind photo array, the officer or detective conducting the investigation reads a set of standard instructions to the witness. The instructions may include an advisory that the officer about to show the photos does not know whether any of the photos are of the person who committed the crime. The officer then leaves the room and a second officer (perhaps a patrol officer) displays the photos. It is the duty of this second officer (the “blind administrator”) to show the photos and, if an identification is made, document what the witness said and ask the witness how certain she or he is of the identification. Once all photos have been shown, the officer reports the result of the procedure to the investigating officer (preferably out of earshot of the witness).

As an alternative to a double-blind array, some departments use “blinded” procedures. A blinded procedure prevents an officer from knowing when the witness is viewing a photo of the suspect, but can be conducted by the investigating officer. A common approach is the so-called “folder shuffle.” With a six-photo array, an officer uses eight manila folders. A photograph of a filler is placed in the top folder, and a photograph of the suspect and four additional fillers are placed in the next five folders. The six folders are then shuffled so that the officer does not know which folder contains the image of the suspect. Two folders with blank paper are placed on the bottom of the stack so that the witness is led to believe that there are more than six images in the array (this is referred to as back-loading, and it prevents the witness from knowing when she or he is about to view the last photograph). After reading instructions to the witness, the administering officer sits to the witness’ left and hands him or her one folder at a time and instructs him or her to open each folder and look at the enclosed photo. The cover of the folder blocks the officer from viewing the photo that the witness is viewing. When an identification occurs, the officer notes the witness’ words and reaction and asks about the witness’ confidence in his or her identification.

[a] R. Rosenthal, Experimenter Effects in Behavioral Research (New York: John Wiley, 1976).
[b] M. Stolberg, “Inventing the Randomized Double-Blind Trial: The Nürnberg Salt Test of 1835,” James Lind Library Bulletin (2006), available at: http://www.jameslindlibrary.org/illustrating/articles/inventing-the-randomized-double-blind-trial-the-nurnberg-salt.
[c] T. J. Kaptchuk, “Intentional Ignorance: A History of Blind Assessment and Placebo Controls in Medicine,” Bulletin of the History of Medicine 72(3): 389–433 (1998).
[d] P. Kilmartin, Presentation to the committee, February 6, 2014.
[e] K. Hamann, Presentation to the committee, December 2, 2013.

SHOWUP

A showup is a police-arranged identification procedure in which the police show one person to a witness and ask if she or he recognizes that person. This procedure typically is used when the police locate a suspect shortly after the commission of a crime and within close proximity to the scene. Case law limits the time and distance from a crime during which such a procedure will pass legal standards. In response to such case law, police typically restrict showups to a two-hour time period after the commission of a crime. Ideally, officials take the witness to the location where the suspect has been detained and do not display the suspect in a suggestive manner (e.g., not in a police car, not handcuffed, without drawn weapons). However, as chases, fights, or disarmaments frequently precede showups, the apprehension of a suspect can raise safety issues that make it difficult to adhere to recommended procedures. Further, the nature of a showup does not lend itself to the use of a blinded procedure. A showup is designed to promptly clear innocent suspects, thereby sparing them from a prolonged period of detention as the investigation continues. Delaying the showup to locate an uninvolved officer may defeat that purpose. While some law enforcement agencies use a standard procedure with written instructions when conducting a showup, there is no indication that such procedures are used uniformly. Courts consider showups highly suggestive, and prosecutors urge the police to exercise caution when conducting them.

CONFIRMATORY PHOTOGRAPH

Police will, on occasion, display a single photograph to a witness in an effort to confirm the identity of a perpetrator. Police typically limit this method to situations in which the perpetrator is previously known to or acquainted with the witness.

FIELD VIEW

Police also use field views in attempts to identify perpetrators. The method, which involves inviting a witness to view many people in a context where the perpetrator is thought likely to appear, is used when the police do not have a suspect but believe that the offender frequents a particular location. For example, police investigating a purse snatching may obtain information that the perpetrator frequents a particular recreation site during the lunch hour.
A plainclothes officer or investigator might take the eyewitness to the site and walk around with him or her during the lunch hour without directing his or her attention to any specific individual.

OTHER PROCEDURES: MUG BOOKS AND YEARBOOKS

At times, police use other means to identify perpetrators. In the past, police sometimes had witnesses review mug shot books. Mug books have since been largely replaced by digitized images displayed on computer screens. Nonetheless, there are situations in which the police will have a witness review a large collection of photographs in an effort to identify a perpetrator. Witnesses who identify a perpetrator as being a student at a specific school might be asked to review a yearbook for that school in an effort to identify the perpetrator. When using this method, police typically attempt to mask the names of the students. Similarly, if the offender is believed to be an individual from a certain profession, then the police might have the witness review photographs from the suspect’s professional society.

Social media sites also serve as the catalyst for police-arranged identification procedures. If a witness knows that the perpetrator is a “friend” of Jane Doe through social media, then the police might have the witness review all friends of Jane Doe to see if she or he recognizes the individual.

All of these additional procedures (i.e., confirmatory photo, field view, mug books, yearbooks) have the potential to introduce biases of the sort that blind lineup procedures are designed to avoid.

NON-POLICE IDENTIFICATION PROCEDURES

In some cases, the victims or witnesses, or both, identify suspects without involving the police. A private citizen, organization, or corporation may conduct an investigation before, during, or even after reporting a crime to the police. The identification of suspects by entities other than law enforcement has become increasingly common as more businesses and private citizens use security cameras to identify criminal actors. High-resolution cameras coupled with high-capacity hard drives allow for real-time streaming of video with superior clarity. Such systems are relatively inexpensive and within financial reach of many home and business owners. Additionally, the proliferation of smart phones has put the ability to create a spontaneous, high-quality video record of an event into the hands of more and more people.

The rise of social media has resulted in the rise of private investigations and identifications using this resource. In one recent case, a stabbing victim drew a picture of her assailant and showed it to her husband.18 Upon viewing the picture, the husband believed that the assailant looked familiar and might be his ex-girlfriend. He obtained several photographs of the ex-girlfriend from her personal website and showed them to the victim who, after looking at those and other online images, identified the suspect at a lineup and at trial.

18 New Jersey v. Chen, 27 A.3d 930 (N.J. 2011).

CONCLUSION

Many local, state, and federal law enforcement agencies have adopted policies and practices to address the issue of misidentification. However, efforts are not uniform or systemic.19 Many agencies are unfamiliar with the science that has emerged during the past few decades of research on eyewitness identifications. Questions remain about the optimal design of photo array procedures, including the size of the array, the contents of the photographs, and their relationship to the context of the crime scene. Similar questions apply to the design of live lineups.20 Eyewitness identification is further complicated by the increasing number of situations in which victims and witnesses seek to identify the perpetrator of a crime without the aid of law enforcement. Such identifications raise new concerns about reliability and accuracy of the identification of individuals. Inconsistent and nonstandard practices might easily add noise to the eyewitness identification process, contaminate the witness, and bias the outcome of an identification procedure.

19 See Massachusetts Supreme Judicial Court Study Group on Eyewitness Evidence, p. 2.
20 The design of a live lineup is subject to more practical constraints than a photo array.

3
The Legal Framework for Assessment of Eyewitness Identification Evidence

The admissibility of eyewitness testimony at a criminal trial may be challenged on the basis of procedures used by law enforcement officials in obtaining the eyewitness identification. The U.S. Supreme Court, in its 1977 ruling in Manson v. Brathwaite, set out the modern test under the Due Process Clause of the U.S. Constitution that regulates the fairness and the reliability of eyewitness identification evidence.1 The Court also specified five reliability factors, discussed below, that a judge must consider when deciding whether to exclude the identification evidence at trial.2

Although the constitutional standards for assessing eyewitness testimony have remained unchanged in the decades since the Manson v. Brathwaite decision, a body of research has shed light on the extent to which each of the five reliability factors supports a reliable eyewitness identification. Research has cast doubt, for instance, on the belief that the apparent certainty displayed in the courtroom by an eyewitness is an indicator of an accurate identification, and has found that a number of factors may enhance the certainty of the eyewitness.
Recently, state courts and lower federal courts have taken the lead in developing standards relating to the admissibility of expert evidence, jury instructions, and judicial notice of scientific evidence. Some states have adopted more stringent standards for regulating eyewitness identification evidence than the U.S. Constitution requires, either by legislative statutes or by state court decisions, and have modified or entirely supplanted the Manson v. Brathwaite test to take account of advances in the growing body of scientific research. This chapter describes the changes in the legal standards for eyewitness identification and explores the relationship between the state of the scientific research and the law regulating procedures and evidence.

1 Manson v. Brathwaite, 432 U.S. 98, 113–114 (1977).
2 Manson v. Brathwaite at 114.

EYEWITNESS EVIDENCE AND DUE PROCESS UNDER THE U.S. CONSTITUTION

Beginning with rulings in 1967, the U.S. Supreme Court set out a standard under the Due Process Clause of the Fourteenth Amendment for reviewing eyewitness identification evidence.3 In Manson v. Brathwaite, the Court emphasized that “reliability is the linchpin in determining the admissibility of identification testimony.”4 First, the Court instructed judges to examine whether the identification procedures were unnecessarily suggestive.
Second, to assess whether an identification is reliable, judges were instructed to examine the following five factors: (1) the opportunity of the witness to view the criminal at the time of the crime; (2) the witness’ degree of attention; (3) the accuracy of the witness’ prior description of the criminal; (4) the level of certainty demonstrated at the confrontation; and (5) the time between the crime and the identification procedure.5 The five factors were drawn from earlier judicial rulings and not from scientific research.6

Eyewitness identification evidence continues to be litigated primarily under the flexible two-part Manson v. Brathwaite Due Process test.7 It is important to note, however, that the vast majority of criminal cases are settled through plea bargaining. The role that evidence type and strength play in plea bargaining is complex and necessarily difficult to study. Because eyewitness identification evidence may never be tested at trial, it is doubly important for lawyers and judges to understand the credibility of the proffered evidence.8

In the most recent U.S. Supreme Court ruling addressing a challenge to an eyewitness identification (Perry v. New Hampshire),9 the Court ruled that a due process analysis was not triggered. In that case, while the police were obtaining a description of the suspect, the eyewitness looked out of the apartment window and recognized the suspect standing outside. The police had not intended to conduct an identification procedure. In those circumstances, the Court ruled that the Due Process Clause does not require a preliminary judicial review of the reliability of an eyewitness identification.10

3 In Stovall v. Denno, 388 U.S. 293, 302 (1967), the U.S. Supreme Court first set out a due process rule asking whether identification procedures used were “so unnecessarily suggestive and conducive to irreparable mistaken identification.” The Court elaborated that rule in decisions such as Simmons v. U.S., 390 U.S. 377, 384 (1968) and Foster v. California, 394 U.S. 440, 442 (1969), and then adopted an approach setting out “reliability” considerations in Neil v. Biggers, 409 U.S. 188 (1972). For a description of the development of this doctrine, see, e.g., B. L. Garrett, “Eyewitnesses and Exclusion,” Vanderbilt Law Review 65(2): 451, 463–467 (2012).
4 Brathwaite, 432 U.S. at 114.
5 Id. at 114.
6 Id. at 114. Justice Thurgood Marshall dissented, noting studies indicated that unnecessarily suggestive eyewitness identifications had resulted in “repeated miscarriages of justice resulting from juries’ willingness to credit inaccurate eyewitness testimony.” 432 U.S. at 125–27 (Marshall, J., dissenting).
7 Due process is the most important constitutional right that arises in challenges to eyewitness identification, but rights under the Fourth and Sixth Amendments also may be implicated. The Fourth Amendment protects individuals “against unreasonable searches and seizures,” and the probable cause typically required to seize and arrest a suspect may arise from an eyewitness identification. U.S. Const. Amend. IV. The few lower courts to address the question are divided on whether probable cause is needed to place individuals in a live lineup procedure. Biehunik v. Felicetta, 441 F.2d 228, 230 (2d Cir. 1971); but see, e.g., Wise v. Murphy, 275 A.2d 205, 212–15 (D.C. 1971); State v. Hall, 461 A.2d 1155 (N.J. 1983). In contrast, probable cause is not required to place a person’s photograph in an array, since doing so does not involve a seizure. However, courts may also rule that an illegal stop or seizure renders a subsequent identification inadmissible, absent an “independent” source for the courtroom identification. U.S. v. Crews, 445 U.S. 463, 473 (1980). In addition, the Sixth Amendment provides that, in all criminal prosecutions, the accused has the right “to have the assistance of counsel for his defense.” In United States v. Wade, the Supreme Court held that, once indicted, a person has a right to have a lawyer present at a lineup, reasoning that the right to counsel applies at all “critical” stages of the criminal process. 388 U.S. 218, 235–37 (1967). However, the Court subsequently held that a photo array procedure, of the type now most commonly used by police agencies, does not implicate the Wade right to counsel. U.S. v. Ash, 413 U.S. 300, 321 (1973).
8 As the current report demonstrates, a comparative consideration of evidence value is particularly important in the case of eyewitness identification evidence. Similar consideration should be given when other adjudication mechanisms are used (e.g., bench trials).
9 Perry v. New Hampshire, 132 S. Ct. 716, 718 (2012). In that case, the eyewitness happened to look out her window and see the suspect standing at the crime scene where the police had told him to wait. The Court held that the Due Process Clause did not regulate such a situation, since the police did not intend to conduct an identification procedure. Id. at 729. The Court indicated that the reliability of the evidence could be addressed by federal and state evidentiary standards, and added: “In appropriate cases, some States also permit defendants to present expert testimony on the hazards of eyewitness identification evidence.” Id.
10 Justice Sotomayor dissented, arguing, “Our due process concern . . . arises not from the act of suggestion, but rather from the corrosive effects of suggestion on the reliability of the resulting identification,” and the manner in which “[a]t trial, an eyewitness’ artificially inflated confidence in an identification’s accuracy complicates the jury’s task of assessing witness credibility and reliability.” Perry, 132 S. Ct. at 731–32 (Sotomayor, J., dissenting). Justice Sotomayor also emphasized: “A vast body of scientific literature has reinforced every concern our precedents articulated nearly a half-century ago.” Id. at 738.

STATE LAW REGULATION OF EYEWITNESS EVIDENCE

State Supreme Court Standards

Several state supreme courts have altered or supplemented the federal Manson v. Brathwaite due process rule to focus more on the effects of suggestion, to emphasize certain factors in specific circumstances,11 or to focus on showup identifications in particular.12 New Jersey and Oregon have now supplemented the Manson v. Brathwaite test with separate state law standards regulating eyewitness identification evidence.

In 2011, the New Jersey Supreme Court issued a unanimous decision in State v. Larry R. Henderson that revised the legal framework for admitting eyewitness identification evidence and directed that revised jury instructions be prepared to help jurors evaluate such evidence.13 The new framework was based on the record of hearings before a Special Master that considered an extensive review of scientific research regarding eyewitness identifications.14 The legal framework established by the Henderson opinion relies on pretrial hearings to review eyewitness evidence and more comprehensive jury instructions at trial.15 To obtain a pretrial hearing, a defendant must show some evidence of suggestiveness related to either estimator or system variables that could lead to mistaken identification.16 At the pretrial hearing, the State must offer proof that the eyewitness identification is reliable. However, the ultimate burden of proving a “very substantial likelihood of irreparable misidentification” is on the defendant.17

In July 2012, the New Jersey Supreme Court released an expanded set of jury instructions and related rules that govern the use of suggestive identifications.18 The jury instructions state that “[r]esearch has shown that there are risks of making mistaken identifications” and note that eyewitness evidence “must be scrutinized carefully.”19 Human memory involves three stages: encoding, storage, and retrieval. At “each of these stages, memory can be affected by a variety of factors.”20 The Court identified a set of factors that jurors should consider when deciding whether eyewitness identification evidence is reliable, including estimator variables (e.g., stress, exposure duration, weapon focus, distance, lighting, intoxication, disguises or changed appearance of the perpetrator, time since the incident, and cross-racial effects) and system variables (e.g., lineup composition, fillers, use of multiple viewings, presence of feedback, use of double-blind procedures, and use of showup identifications). The instructions also noted the possible influence of outside opinions, descriptions or identifications by other witnesses, and photographs or media accounts.21

11 See State v. Ramirez, 817 P.2d 774, 780–81 (Utah 1991) (altering three of the reliability factors to focus on effects of suggestion); State v. Marquez, 967 A.2d 56, 69–71 (Conn. 2009) (adopting criteria for assessing suggestion); Brodes v. State, 614 S.E.2d 766, 771 & n.8 (Ga. 2005) (rejecting eyewitness certainty jury instruction); State v. Hunt, 69 P.3d 571, 576 (Kan. 2003) (adopting Utah’s five factor “refinement” of the Biggers factors); State v. Cromedy, 727 A.2d 457, 467 (N.J. 1999) (requiring, when applicable, instruction on cross-racial misidentifications).
12 See, e.g., State v. Dubose, 285 Wis.2d 143, 166 (Wis. 2005); Commonwealth v. Johnson, 650 N.E.2d 1257, 1261 (Mass. 1995); People v. Adams, 423 N.E.2d 379, 383–84 (N.Y. 1981).
13 State v. Henderson, 27 A.3d 872 (N.J. 2011). The Henderson opinion described criticisms of the Manson v. Brathwaite test, including that suggestion may itself affect the seeming “reliability” of the identification. Id. at 877–78. For examples of scholarly criticism of the Manson v. Brathwaite test in light of scientific research, see, e.g., G. L. Wells and D. S. Quinlivan, “Suggestive Eyewitness Identification Procedures and the Supreme Court’s Reliability Test in Light of Eyewitness Science: 30 Years Later,” Law and Human Behavior 33(1): 1, 16 (February 2009); T. P. O’Toole and G. Shay, “Manson v. Brathwaite Revisited: Towards a New Rule of Decision for Due Process Challenges to Eyewitness Identification Procedures,” Valparaiso University Law Review 41(1): 109 (2006).
14 See Report of the Special Master at 16–17, State v. Henderson, No. A-8-08 (N.J. June 18, 2011), available at: http://www.judiciary.state.nj.us/pressrel/HENDERSON%20FINAL%20BRIEF%20.PDF%20(00621142.pdf.
15 In the companion case, State v. Chen, 27 A.3d 930, 932 (N.J. 2011), the New Jersey Supreme Court took an approach that departed from that of the U.S. Supreme Court in Perry, ruling that the defendant may be entitled to a hearing in a case in which the eyewitness identified the defendant using social media, not a police-orchestrated identification procedure.

In 2012, in Oregon v. Lawson, the Oregon Supreme Court established a new procedure for evaluating the admissibility of eyewitness identifications. In a unanimous decision, the Court found “serious questions” about the reliability of eyewitness identification, citing research conducted over the past 30 years.22 The Court determined that the Manson v.
Brathwaite two-step process for weighing eyewitness identification “does not accomplish its goal of ensuring that only sufficiently reliable identifications are admitted into evidence,” because it relies on an eyewitness’ self-reports to determine whether the threshold level of suggestiveness is reached, rendering the identification unreliable.23 The Court set forth a process that requires the trial court to examine whether investigators used “suggestive” identification procedures and whether other factors, such as estimator variables, may have affected the reliability of the identification.24 The Court ruled that “intermediate remedies,” including the use of expert testimony, should be available even if the trial judge concludes that the identification is admissible. The Court also briefly noted that judges might use “case-specific jury instructions.”25

Other states continue to explore possible changes to the judicial review of eyewitness identification evidence. In 2013, the Massachusetts Supreme Judicial Court Study Group on Eyewitness Identification offered guidance on the adjudication of eyewitness identification evidence.26 The report adopted Lawson’s approach of taking judicial notice of “certain scientifically-established facts about eyewitness identification.”27 The report recommended that trial judges conduct pretrial hearings to determine whether suggestive identification procedures were used, and if so, whether these procedures impaired the reliability of identification evidence. Pretrial hearings would consider the effects of both estimator variables (relating to viewing at the crime scene) and system variables (relating to the lineup or showup procedures) on the identification. The report also recommended that the state adopt a set of recommended practices for conducting identification procedures, create new model jury instructions on eyewitness identifications, and set limitations on the admissibility of certainty statements and in-court identifications.28

16 Henderson, 27 A.3d at 878.
17 Id.
18 New Jersey Criminal Model Jury Instructions, Identification (July 19, 2012), available at: http://www.judiciary.state.nj.us/pressrel/2012/jury_instruction.pdf; New Jersey Court Rule 3:11, Record of an Out-of-Court Identification Procedure (July 19, 2012), available at: http://www.judiciary.state.nj.us/pressrel/2012/new_rule.pdf; New Jersey Court Rule 3:13-3, Discovery and Inspection (July 19, 2012), available at: http://www.judiciary.state.nj.us/pressrel/2012/rev_rule.pdf.
19 See New Jersey Criminal Model Jury Instructions, Identification, supra at 2.
20 Id.
21 Id. at 9.
22 State v. Lawson, 352 Ore. 724 (Or. 2012).
23 Id. at 746–748.
24 Id. at 747–748, 755–756.
25 Id. at 759, 763.
26 See Massachusetts Supreme Judicial Court Study Group on Eyewitness Evidence, Report and Recommendations to the Justices (2013).
27 Id. at 48.
28 Id. at 28. In the courtroom, the eyewitness can easily see where the defendant is sitting. Thus, in-court identifications do not reliably test an eyewitness’ memory. Nevertheless, courts have shown great tolerance of in-court identifications, deeming them based on “independent” memory, even following suggestive out-of-court procedures. Garrett, Eyewitnesses and Exclusion, supra. For example, the New York Court of Appeals ruled that “[e]xcluding evidence of a suggestive showup does not deprive the prosecutor of reliable evidence of guilt. The witness would still be permitted to identify the defendant in court if that identification is based on an independent source.” People v. Adams, 423 N.E.2d 379, 384 (N.Y. 1981).

State Statutes Regulating Identification Procedures

Judicial rulings regulating admissibility of eyewitness evidence in the courtroom do not specify the identification procedures to be used by law enforcement officials. However, 14 states have adopted legislation regarding eyewitness identification procedures. Of the 14, 11 states (Connecticut, Illinois, Maryland, North Carolina, Ohio, Texas, Virginia, West Virginia, Wisconsin, Utah, and Vermont) have enacted statutes directly requiring that law enforcement officials adopt written procedures for eyewitness identifications and regulating the particular procedures to be used.29 Three more states (Georgia, Nevada, and Rhode Island) have passed statutes recommending further study, tasking a group with developing best practices, or requiring some form of written policy.30

State statutes typically assert that a trial judge may consider the failure to follow the prescribed procedures as a factor in assessing admissibility and informing the jury. The statutes rarely require that a trial judge exclude such identification evidence from consideration by the jury. However, some of the more detailed statutes, such as those in Ohio, North Carolina, and West Virginia, require that law enforcement officials use particular practices (e.g., eyewitness instructions, a blind administrator). Other statutes require adherence to model policies or guidelines. Utah requires that lineup procedures be recorded.
Some jurisdictions and departments also have voluntarily adopted guidelines or policies regulating eyewitness identifications.31 Several state courts have issued rulings regulating lineup practices (e.g., New Jersey’s Supreme Court has required documentation of identification procedures).32
AIDING JURORS IN ASSESSMENT OF EYEWITNESS TESTIMONY
Expert Witness Testimony Regarding Eyewitness Identification
The standards for assessing the admissibility of testimony by expert witnesses have undergone great changes in the past two decades. Before 1993, the Frye test allowed scientific expert testimony in federal courts if it met the standard of “general acceptance” in the relevant scientific community.33 In 1993, the Supreme Court, in Daubert v. Merrell Dow Pharmaceuticals, Inc.,34 ruled that, under Federal Rule of Evidence 702, a “trial judge must ensure that any and all scientific testimony or evidence admitted is not only relevant, but reliable.”35 Judges determine reliability by assessing the scientific foundation of the expert’s testimony prior to trial, so that “evidentiary reliability will be based upon scientific validity.”36 Many states have adopted Daubert, and many of those that have not formally adopted Daubert have revised their Frye test to adopt much of the Daubert standard. In turn, Federal Rule of Evidence 702 has been revised to incorporate the holding in Daubert.37 Federal and state courts remain divided on whether expert testimony on eyewitness identifications is admissible under Daubert or Frye, and on the proper exercise of trial court discretion when deciding whether to admit such expert testimony.
Appellate rulings emphasize that a trial judge should use discretion when deciding whether proffered expert evidence satisfies the Daubert or Frye standards. An increasing number of rulings emphasize the value of presenting expert testimony regarding eyewitness identification. Some courts have held that it can be an abuse of discretion for a trial judge to bar the defense from admitting such testimony.38 Detailed descriptions of the relevant scientific research findings accompany such decisions.39 There are also many federal and state courts that continue to follow the traditional approach, emphasizing that credibility of eyewitnesses is a matter within the “province of the jury” and insisting that information regarding valid scientific research in this area will not assist the jury in its task.40
29 See Conn. Gen. Stat. § 54-1p (West 2012); 725 Ill. Comp. Stat. § 5/107A-5 (West 2003); Md. Code Ann., Pub. Safety § 3-506 (West 2007); N.C. Gen. Stat. § 15A-284.52 (West 2007); Ohio Rev. Code Ann. § 2933.83 (West 2010); Tex. Code Crim. Proc. Ann. art. 38.20 (West 2011); Utah Code Ann. § 77-8-4 (West 1980); Va. Code Ann. § 19.2-390.02 (West 2005); Va. Code Ann. § 9.1-102.54; 13 V.S.A. § 5581; W. Va. Code Ann. § 62-1E-1 (West 2013); Wis. Stat. § 175.50 (West 2005).
30 Ga. H.R. 352, 149th Gen. Assem., Reg. Sess. (April 20, 2007); Nev. Rev. Stat. § 171.1237 (West 2011); R.I. Gen. Laws § 12-1-16 (West 2012); 2010 Leg. Reg. Sess. (Vt. 2010).
31 See, e.g., John J. Farmer, Jr., Attorney General of the State of New Jersey, “Letter to All County Prosecutors: Attorney General Guidelines for Preparing and Conducting Photo and Live Lineup Identification Procedures” (April 18, 2001), available at: http://www.state.nj.us/lps/dcj/agguide/photoid.pdf; CALEA Standards for Law Enforcement Agencies: 42.2.11 Lineups, available at: http://www.calea.org/content/standards-titles; International Association of Chiefs of Police, Model Policy: Eyewitness Identification (2010).
32 State v. Delgado, 188 N.J. 48, 63–64, 902 A.2d 888 (2006).
33 Frye v. United States, 54 App. D.C. 46, 293 F. 1013 (1923).
34 509 U.S. 579 (1993).
35 Id. at 589.
36 Id. at 590 n.9.
37 Fed. R. Evid. 702.
Rule 702 now provides: A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if: (a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue; (b) the testimony is based on sufficient facts or data; (c) the testimony is the product of reliable principles and methods; and (d) the expert has reliably applied the principles and methods to the facts of the case.
38 See, e.g., Tillman v. State, 354 S.W.3d 425, 441 (Tex. Crim. App. 2011); People v. LeGrand, 835 N.Y.S.2d 523, 524 (2007); State v. Clopten, 223 P.3d 1103, 1117 (Utah 2009); U.S. v. Smithers, 212 F.3d 306, 311–14 (6th Cir. 2000).
39 See, e.g., State v. Copeland, 226 S.W.3d 287, 299–300 (Tenn. 2007); Tillman, 354 S.W.3d at 441; Clopten, 223 P.3d at 1108.
40 For scholarly examination of this case law, see, e.g., “The Province of the Jurist: Judicial Resistance to Expert Testimony on Eyewitnesses as Institutional Rivalry,” Harvard Law Review 126(8): 2381 (2013); R. Simmons, “Conquering the Province of the Jury: Expert Testimony and the Professionalization of Fact-Finding,” University of Cincinnati Law Review 74: 1013 (2006); G. Vallas, “A Survey of Federal and State Standards for the Admission of Expert Testimony on the Reliability of Eyewitnesses,” American Journal of Criminal Law 39(1): 97 (2011).
The trend is toward greater acceptance of expert testimony regarding the factors that may affect eyewitness identification.
In a 2012 decision, the Connecticut Supreme Court disavowed earlier rulings restricting expert testimony and stated that such rulings are now “out of step with the widespread judicial recognition that eyewitness identifications are potentially unreliable in a variety of ways unknown to the average juror.”41 Similarly, the Pennsylvania Supreme Court recently held that expert testimony on eyewitness identifications was no longer per se inadmissible, emphasizing that “courts in 44 states and the District of Columbia have permitted such testimony at the discretion of the trial judge,” and that “all federal circuits that have considered the issue, with the possible exception of the 11th Circuit, have embraced this approach.”42 As the Seventh Circuit Court of Appeals recently explained:
It will not do to reply that jurors know from their daily lives that memory is fallible. The question that social science can address is how fallible, and thus how deeply any given identification should be discounted. That jurors have beliefs about this does not make expert evidence irrelevant; to the contrary, it may make such evidence vital, for if jurors’ beliefs are mistaken then they may reach incorrect conclusions. Expert evidence can help jurors evaluate whether their beliefs about the reliability of eyewitness testimony are correct.43
Courts also have allowed expert witnesses to testify about particular issues concerning eyewitness identifications, such as cross-race effects, stress, weapons focus, suggestive lineup procedures, and the like.44 Rarely have experts conducted eyewitness identification research related to the specific case before the court. However, in one such case, in which an experiment was conducted with the actual photo array used in the case, the federal courts found expert testimony admissible where it was directed not only to general research, but also to the question of whether suggestive procedures affected the identification in that case.45
41 State v. Guilbert, 306 Conn. 218, 234 (Conn. 2012). Prior to that decision, the Connecticut Supreme Court had long ruled that “the reliability of eyewitness identification is within the knowledge of jurors and expert testimony generally would not assist them in determining the question” (State v. Kemp, supra 199 Conn. at 473, 477), and that factors affecting eyewitness memory are “nothing outside the common experience of mankind” (State v. McClendon, supra 248 Conn. at 572, 586).
42 Com. v. Walker, 2014 WL 2208139 *13 (Pa. 2014) (collecting authorities).
43 U.S. v. Bartlett, 567 F.3d 901, 906 (7th Cir. 2009). Other federal courts have found it a proper exercise of discretion to exclude expert testimony on eyewitness identifications. See, e.g., United States v. Lumpkin, 192 F.3d 280, 289 (2d Cir. 1999). Most federal courts treat the subject as one of considerable trial discretion; see, e.g., United States v. Rodriguez-Berrios, 573 F.3d 55, 71–72 (1st Cir. 2006). For a survey of federal decisions, see Lauren Tallent, Note, Through the Lens of Federal Evidence Rule 403: An Examination of Eyewitness Identification Expert Testimony Admissibility in the Federal Circuit Courts, Washington & Lee Law Review 68 (2): 765 (2011); see also Walker, 2014 WL 2208139 *13.
44 See, e.g., Loftus, Doyle & Dysart at § 14-8[a]-[b] p. 408 n. 41–42, 410, n. 53 (5th Edition, 2013) (collecting cases).
Expert witnesses who explain the complications of eyewitness identification can be expensive. Most criminal defendants are indigent and cannot afford such assistance.46 In Ake v.
Oklahoma, the Supreme Court held that an indigent defendant has a constitutional due process right to assistance by an expert witness only if that expert assistance is so crucial to the defense (or such a “significant factor”) that its denial would deprive the defendant of a fundamentally fair trial.47 In federal courts, funding for expert witnesses is available, and requests by indigent defendants are common.48 In state courts, such assistance is uncommon, especially in state courts that rarely find denial of expert assistance on eyewitness matters to be a due process violation.
Expert testimony on eyewitness memory and identifications has many advantages over jury instructions as a method to explain relevant scientific framework evidence to the jury: (1) Expert witnesses can explain scientific research in a more flexible manner, by presenting only the relevant research to the jury; (2) Expert witnesses are familiar with the research and can describe it in detail; (3) Expert witnesses can convey the state of the research at the time of the trial; (4) Expert witnesses can be cross-examined by the other side; and (5) Expert witnesses can more clearly describe the limitations of the research. The benefits of expert testimony are offset somewhat by the expense. In addition, conflicting testimony by opposing experts may lead to confusion among the jurors. Nonetheless, trial judges have discretion to determine whether the potential benefits of expert testimony outweigh the cost.
Jury Instructions Regarding Eyewitness Identification
Some courts restricting expert testimony have found jury instructions regarding the fallible nature of eyewitness identifications to be an acceptable substitute for expert testimony.49 At the conclusion of a criminal trial, the trial judge can instruct jurors on the factors that may result in an erroneous identification while also offering instructions on the legal principles jurors must apply when assessing the factual record. Such instructions may also be given when the witness testifies. Judges tend to rely on model or pattern instructions, because any departure from these standard instructions may be a ground for appellate reversal.
45 Newsome v. McCabe, 319 F.3d 301 (7th Cir. 2003).
46 See, e.g., Bureau of Justice Statistics, “Indigent Defense,” available at: http://www.bjs.gov/index.cfm?ty=pbdetail&iid=995.
47 470 U.S. 68, 82–83 (1985). Even if an indigent defendant receives funding to retain an expert, the judge may ultimately decide that the expert testimony is not admissible at trial.
48 18 U.S.C. § 3006A(e)(1).
49 See, e.g., U.S. v. Jones, 689 F.3d 12, 20 (1st Cir. 2012) (“The judge was fully entitled to conclude that this general information could be more reliably and efficiently conveyed by instructions rather than through dueling experts.”).
The New Jersey Supreme Court viewed jury instructions as preferable to expert testimony.50 The New Jersey instructions adopted following the Henderson decision are by far the most detailed set of jury instructions regarding eyewitness identification evidence. Traditionally, instructions regarding eyewitness identifications have been brief and remind the jurors to consider the following: (1) the credibility of an eyewitness is like that of any other witness and (2) any eyewitness identification is part of the prosecutor’s burden of proof in a criminal case.51 Many state courts have held that, although general jury instructions regarding credibility and the burden of proof are appropriate, more specific instructions on eyewitness identifications are considered an inappropriate judicial comment on the evidence.52 Following the U.S. Supreme Court’s decision in Manson v.
Brathwaite, some state courts supplemented their jury instructions by including the five reliability factors named by the Supreme Court.53 In 1972, in U.S. v. Telfaire, the D.C. Circuit Court of Appeals adopted a set of influential model jury instructions to be used in appropriate federal cases involving eyewitness identifications.54 The instructions emphasized the following:
You must consider the credibility of each identification witness in the same way as any other witness, consider whether he is truthful, and consider whether he had the capacity and opportunity to make a reliable observation on the matter covered in his testimony.55
The Telfaire instructions departed from the brief traditional instruction by adding that the jury should consider factors related to the initial sighting, including “how long or short a time was available, how far or close the witness was, how good were lighting conditions, [and] whether the witness had had occasion to see or know the person in the past.” The decision also noted that an identification is more reliable if the witness is able to pick the defendant out of a group, rather than at a showup, and that the jury should consider the length of time between the crime and the identification.56
50 The New Jersey Supreme Court indicated: “Jury charges offer a number of advantages: they are focused and concise, authoritative (in that juries hear them from the trial judge, not a witness called by one side), and cost-free; they avoid possible confusion to jurors created by dueling experts; and they eliminate the risk of an expert invading the jury’s role or opining on an eyewitness’ credibility.” Henderson, 27 A.3d at 925.
51 New Jersey courts used such instructions a decade before Henderson. See, e.g., State v. Robinson, 165 N.J. 32, 46–47 (N.J. 2000). Some states have also approved instructions informing the jury that there may be an “independent source” for an in-court identification. See, e.g., State v. Cannon, 713 P.2d 273, 281 (Ariz. 1985).
52 Brodes v. State, 279 Ga. 435, 439 & n.6 (Ga. 2005) (surveying state case law).
53 State v. Tatum, 219 Conn. 721 (1991).
54 U.S. v. Telfaire, 469 F.2d 552, 558 (D.C. Cir. 1972). Some federal courts follow that approach, while others adopt a “flexible approach.” See, e.g., United States v. Luis, 835 F.2d 37, 41 (2d Cir. 1987). Some more recent federal model instructions include added detail, reflecting variables such as stress and cross-race identifications. See, e.g., Third Circuit Model Criminal Jury Instructions, 4.15 (Jan. 2014), available at: http://www.ca3.uscourts.gov/sites/ca3/files/2013%20Chapter%204%20final%20revised.pdf.
Some states have adopted cautionary instructions on specific issues related to eyewitness identification evidence. In State v. Ledbetter, the Connecticut Supreme Court ordered lower courts to use a special instruction in cases in which law enforcement failed to instruct the eyewitness that the perpetrator may or may not be present in a lineup.57 The Georgia Supreme Court concluded in 2005 that one particular use of the Manson v. Brathwaite factors must no longer be permitted: “we can no longer endorse an instruction authorizing jurors to consider the witness’ certainty in his/her identification as a factor to be used in deciding the reliability of that identification.”58 Other courts have done the same.59 In 1999, the New Jersey Supreme Court ruled in State v. Cromedy that instructions on cross-racial identifications are required in certain cases.60
Expert testimony on eyewitness memory and identifications appears to have many advantages when used as a method to explain relevant scientific framework evidence to the jury.
However, when expert testimony is not available to the defense, jury instructions may be a preferable alternative means to inform the jury of the findings of scientific research in this area. Brief instructions may not, however, provide sufficient guidance to explain the relevant scientific evidence to the jury, but lengthy instructions may be cumbersome and complex.
55 U.S. v. Telfaire, 469 F.2d at 559.
56 Id. at 558.
57 State v. Ledbetter, 275 Conn. 534, 579–580 (2005) (The instruction reads, in part, “the individual conducting the procedure either indicated to the witness that a suspect was present in the procedure or failed to warn the witness that the perpetrator may or may not be in the procedure. Psychological studies have shown that indicating to a witness that a suspect is present in an identification procedure or failing to warn the witness that the perpetrator may or may not be in the procedure increases the likelihood that the witness will select one of the individuals in the procedure, even when the perpetrator is not present. Thus, such behavior on the part of the procedure administrator tends to increase the probability of a misidentification.”)
58 Brodes, 279 Ga. at 442.
59 See, e.g., supra Commonwealth v. Payne, 426 Mass. 692 (1998); State v. Romero, 191 N.J. 59 (2007).
60 State v. Cromedy, 158 N.J. 112 (1999); see also Innocence Project, “Know the Cases: McKinley Cromedy,” available at: http://www.innocenceproject.org/Content/McKinley_Cromedy.php.
More research is warranted to better understand how best to communicate to jurors the factors that may affect the validity of eyewitness testimony and support a more sensitive discrimination of the strengths and weaknesses of eyewitness testimony in individual cases.
Indeed, research findings on the effectiveness of jury instructions on assessment of eyewitness identification evidence have been mixed. In general, such studies find that jury instructions cause jurors to become more suspicious of all eyewitness identification evidence.61 A recent study of the effect of the New Jersey jury instructions used in Henderson found that the instructions reduced juror reliance on both strong and weak eyewitness identification evidence.62 Among the few studies finding that jury instructions succeed in increasing jurors’ sensitivity to the strength of such evidence are those that study the effect of jury instructions presented before the eyewitness testimony rather than at the end of the case before deliberation.63 Such studies also have examined instructions that use visual aids rather than rely on a judge’s recitation of written instructions.64 In addition, research studies might explore the use of videotape as an alternative way to present such information65 and the effects of moving jury instructions to precede the introduction of the testimony by the eyewitness.
61 For a review of this research, see K. A. Martire and R. I. Kemp, “The Impact of Eyewitness Expert Evidence and Judicial Instruction on Juror Ability to Evaluate Eyewitness Testimony,” Law and Human Behavior 33:225–236, 226 (reviewing studies of jury instructions on eyewitness identification and concluding that increased skepticism and confusion is a common result); see also J. L. Devenport, C. D. Kimbrough, and B. L. Cutler, “Effectiveness of traditional safeguards against erroneous conviction arising from mistaken eyewitness identification,” in Expert testimony on the psychology of eyewitness identification, ed. B. L.
Cutler (New York: Oxford University Press, 2009), 51–68 (summarizing research studying the Telfaire jury instruction and concluding that “cautionary jury instructions may be an ineffective safeguard against erroneous convictions resulting from mistaken eyewitness identifications.”).
62 A. P. Papailiou, D. V. Yokum, C. T. Robertson, “The Novel New Jersey Eyewitness Instruction Induces Skepticism But Not Sensitivity,” August 2014, available at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2475217.
63 See, e.g., N. B. Pawlenko, M. A. Safer, R. A. Wise, and B. Holfeld, “A Teaching Aid for Improving Jurors’ Assessments of Eyewitness Accuracy,” Applied Cognitive Psychology 27(2): 190–197. Other studies are reviewed in Martire and Kemp, supra note 105 at 226.
64 Pawlenko et al., supra note 107.
65 For an example of videotaped instructions, see Federal Judicial Center, The Patent Process: An Overview for Jurors, available at: http://www.youtube.com/watch?v=ax7QHQTbKQE.
CONCLUSION
Manson v. Brathwaite set out, under the Due Process Clause of the U.S. Constitution, the modern test that regulates the fairness and the reliability of eyewitness identification evidence. The test evaluates the “reliability” of eyewitness identifications using factors derived from prior rulings and not from empirically validated sources. It includes factors that are not diagnostic of reliability and treats factors such as the confidence of a witness as independent markers of reliability when, in fact, it is now well established that confidence judgments may vary over time and can be powerfully swayed by many factors.
The best guidance for legal regulation of eyewitness identification evidence comes not, however, from constitutional rulings, but from the careful use and understanding of scientific evidence to guide fact-finders and decision makers.
4 Basic Research on Vision and Memory
Accurate eyewitness identification requires that a witness to a crime correctly sense, perceive, and remember objects and events that occurred and recall them later. The veracity of the witness’ identification thus depends on the limits of sensation, perception, and memory. Recent scientific studies have yielded great advances in our understanding of how vision and memory work. This chapter provides a brief overview of current knowledge, identifies areas in which vision and memory are imperfect, and describes implications for the accuracy of eyewitness identification. These implications, in turn, have guided much of the applied research on this topic (see Chapter 5) and provide a general framework for the recommendations made herein (see Chapter 6).
VISION AND MEMORY IN CONTEXT
This chapter begins by offering a concrete example to place the body of basic scientific research on vision and memory in context so as to better communicate its relevance to eyewitness identification. In the sections that follow the example, the different functional steps of the sequence (highlighted in italics) are dissected in some detail, with special reference to its limitations and the ways in which it may fail to deliver accurate eyewitness identification.
While returning home late, you hear a muffled scream from around the street corner. Seconds later, you come face-to-face with a man turning the corner and moving swiftly past you. Instantaneously, properties of the scene are conveyed to you through patterns of light cast on the backs of your eyes and sensed by photoreceptors in your retina. Only a fraction of the information sensed is selected for further processing; in this case you focus your attention on certain features of the man’s face. Those features are integrated and interpreted to yield a coherent percept of the man. As you round the corner, you perceive, through an identical process, the victim slumped lifelessly against a wall. You quickly grasp the meaning of these perceptual experiences, and they immediately elicit both cognitive and visceral components (e.g., increased heart rate) of fear and anxiety.
Your percepts are initially encoded in short-term working memory, where content is limited and labile. Your elevated level of arousal may cause interference and some loss of content, but with time and recognition of the importance of the experience, your percepts are consolidated into long-term memory. Long-term memories are maintained in storage but subject to ongoing updates and modifications resulting from new experiences and perhaps distortions caused by sustained levels of stress.
At a later date, you are asked to look at a police lineup that includes a suspect apprehended near the crime scene. Visual features of the men in the lineup are sensed, selectively attended, and perceived, using the same visual processes engaged on the night in question. Some of these features—the high brow and sharp cheekbones of one man in the lineup—elicit retrieval of memories of your visual experiences on the night of the crime. The simultaneously perceived and retrieved experiences are implicitly compared, leading to a cycle of greater visual scrutiny of the man in front of you and retrieval of additional details of the original percept.
The context of the lineup procedure, the sight of the man, and the retrieved memories trigger latent emotions and anxiety, which may interfere with your comparison of percept and memory. Eventually, the comparison reaches your internal criterion for identification: You decide, with an implicit level of certainty, that your current visual percept and the percept from the night of the crime were caused by the same external source (the man now in front of you), and you assert that you have identified the person you witnessed at the crime scene.
VISION
Functional Processes of Vision
To understand the contributions and limitations of vision to eyewitness identification, it is useful to consider the workings of three functional stages of visual processing—sensation, attention, and perception—bearing in mind that they comprise highly interdependent elements of a continuous operation. Sensation is the initial process of detecting light and extracting basic image features. Sensations themselves are evanescent, and only a small fraction of what is sensed is actually perceived. Attention is the process by which information sensed by the visual system is selected for further processing. Perception is the process by which attended visual information is integrated, linked to environmental cause, made coherent, and categorized through the assignment of meaning, utility, value, and emotional valence. In addition, memories and emotions resulting from prior experiences with the world can influence all stages of visual processing and thus define a thread that weaves throughout the following discussions.
All of the functional processes of vision are beset by noise, which affects the quality and types of information accessible from the visual environment, and bears heavily on the validity of eyewitness identification.
Before considering the processes of sensation, attention, and perception in greater detail, consideration is given to the concept of noise in visual processing and to ways of interpreting its impact on visual experience.
The Fundamental Role of Noise
Vision is usefully understood as the process of detecting informative signals about the external world and using those signals to recognize objects, make decisions, and guide behavior. As with any signal detection, there are occasionally factors that lead to uncertainty on the part of the observer about whether a particular signal is present. These factors are generically termed noise, following the definition used in electronic signal transmission, in which noise refers to random or irrelevant elements that interfere with detection of coherent and informative signals. In vision, noise comes from a variety of sources, some associated with the structure of the visual environment (e.g., occluding surfaces, glare, shadows), some inherent to the optical and neuronal processes involved (e.g., scattering of light in the eye), some reflecting sensory content not relevant to the observer’s goals (e.g., a distracting sign or a loud sound), and some originating with incorrect expectations derived from memory. Consider, for example, the seemingly simple problem of detecting a green light while waiting at a traffic signal. In this case, your ability to “see” the green light may be compromised by glare or dust on your windshield, by poor visual acuity, by your eyes having been aimed instead at the driver of the adjacent car, by the presence of other (irrelevant) colored lights in your field of view (e.g., a traffic signal at a different intersection or the lights of a nearby restaurant), by a cell phone conversation, or by the news on the car radio.
The significance of this view for eyewitness identification is profound, as it helps us to realize that the accuracy of information about the environment—the face of a criminal, for example—gained through vision is necessarily, and often sharply, limited by noise.1
The fact that vision is noise-limited suggests a familiar statistical framework—signal detection theory—for assessing and understanding the effects of noise on visual perception and recognition ability.2 Signal detection theory has long been successfully applied to analogous problems in electronic signal reception.3 To illustrate these principles as applied to sensory processing, consider the problem of detecting a vibrating cell phone in your pocket. Anyone who has operated a cell phone in vibrate mode will be familiar with two types of signal detection errors: (1) the occasional sense that the phone is vibrating in your pocket, only to discover that it is not, and, conversely, (2) the phone call that is sometimes missed because you attribute the vibration to some other cause. Signal, in this example, is a subtle tactile stimulus resulting from an incoming phone call. Noise, in this example, is all of the other things in your environment that may also lead to subtle tactile stimulation, such as vibration of your car seat, a shift of keys in your pocket, or the touch of another person.
Signal detection theory posits that there are three main factors that determine whether a signal will be detected: (1) the distribution of stimuli (e.g., the variety of stimulus magnitudes) that reflect noise only, (2) the distribution of stimuli that reflect signal, and (3) the observer’s criterion for “deciding” that a specific stimulus resulted from noise sources or signal.
An important factor for the fidelity of signal detection is the degree to which noise and signal distributions overlap with one another. In the case of the vibrating cell phone, if the distributions of tactile stimuli resulting from noise and signal overlap, as is often the case, then there will always be some cases in which you believe the phone is vibrating when it is not (noise stimuli attributed to signal source), and there will be some cases in which the phone is vibrating and you miss the call (signal stimuli attributed to noise source).
1 W. S. Geisler, “Sequential Ideal-Observer Analysis of Visual Discriminations,” Psychological Review 96(2): 267–314 (1989).
2 D. M. Green and J. A. Swets, Signal Detection Theory and Psychophysics (New York: Wiley, 1966).
3 W. W. Peterson, T. G. Birdsall, and W. C. Fox, “The Theory of Signal Detectability,” Proceedings of the IRE Professional Group on Information Theory 4(4): 171–212 (1954).
The third factor that influences signal detection in the presence of noise is the observer’s decision criterion, which is simply the value (e.g., stimulus amplitude) above which a stimulus is attributed to signal, and below which a stimulus is attributed to noise. In the same sense that your car radio is programmed to “decide” (and allow you to hear) when informative patterns of electromagnetic radiation (signal) are sufficiently different from random fluctuations (noise), an observer adopts a criterion for deciding whether a stimulus is caused by a signal or is simply a manifestation of noise. This criterion reflects the level of precision acceptable for the observer’s needs, given uncertainty about whether a given stimulus reflects a real signal.
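The interplay of overlap and criterion described above can be made concrete with a small numerical sketch. It is an illustration only and is not taken from the report: it assumes the common equal-variance Gaussian model of signal detection theory, in which noise-only and signal stimuli are normally distributed with the same spread, and the names `detection_rates` and `d_prime` are hypothetical. Under that assumption, a fixed criterion yields a hit rate (signal correctly attributed to signal) and a false-alarm rate (noise attributed to signal), and the separation between the two distributions can be recovered from those two rates.

```python
from statistics import NormalDist


def detection_rates(signal_mean, criterion, noise_mean=0.0, sd=1.0):
    """Return (hit_rate, false_alarm_rate) for a fixed decision criterion.

    Assumption (not from the report): noise-only and signal stimuli are
    Gaussian with a shared standard deviation `sd`.
    """
    noise = NormalDist(noise_mean, sd)
    signal = NormalDist(signal_mean, sd)
    # A stimulus is attributed to signal when it exceeds the criterion.
    hit_rate = 1.0 - signal.cdf(criterion)
    false_alarm_rate = 1.0 - noise.cdf(criterion)
    return hit_rate, false_alarm_rate


def d_prime(hit_rate, false_alarm_rate):
    """Separation of the two distributions in standard-deviation units."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Same overlap (signal mean one SD above noise), three criteria.
    for criterion in (0.0, 0.5, 1.0):
        hits, false_alarms = detection_rates(signal_mean=1.0, criterion=criterion)
        print(f"criterion={criterion:.1f}  hits={hits:.3f}  "
              f"false alarms={false_alarms:.3f}  misses={1 - hits:.3f}")
```

In this sketch, moving the criterion changes the mix of misses and false alarms but not the overlap itself; as long as the distributions overlap, no criterion eliminates both kinds of error.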
In practice, the criterion4 used is determined by a host of factors unique to the circumstances, including psychological and social demands and behavioral goals. These factors collectively determine the relative “costs” of incorrect attributions of signal as noise (“misses”) and of noise as signal (“false alarms”). If an individual places high value on not missing a phone call, then she or he will adopt a very liberal criterion, in which all stimuli reflecting real incoming calls (signal) are successfully detected, but many noise stimuli (e.g., shifting keys in a pocket) are erroneously (and frustratingly) believed to be incoming calls. By contrast, if an individual places little value on detecting incoming phone calls, she or he will adopt a conservative criterion, in which many calls are missed and noise stimuli rarely elicit an effort to answer the phone, which may be of value to the individual who wishes to avoid distraction.

The example of the signal detection logic used for the vibrating cell phone applies similarly to all aspects of visual perceptual experience, including the conditions of witnessing criminal events. The uncertainty about visual events caused by manifold sources of noise will inevitably lead to inaccurate visual perceptual experiences, which result from conditions in which an observer fails to detect a critically informative stimulus as “real” (attributing the stimulus instead to a source of noise) or confidently perceives a noise stimulus to have originated from an informative source. The latter instance is problematic because it increases the likelihood that observers will unwittingly “construct,” on the basis of expectations derived from memory and situational context, perceptual experiences to account for noise erroneously interpreted as signal.
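The trade-off between misses and false alarms described above can be made concrete with a small numerical sketch. It is an illustration only, not part of the research cited here: it assumes that stimulus magnitudes from noise and from signal are each normally distributed with equal spread, and the means, spread, criterion values, and the function name `detection_rates` are all hypothetical choices made for this example.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration: stimulus magnitudes
# from noise alone and from a real call are each assumed to be normally
# distributed with the same spread, so the two distributions overlap.
NOISE = NormalDist(mu=0.0, sigma=1.0)
SIGNAL = NormalDist(mu=1.0, sigma=1.0)


def detection_rates(criterion):
    """Return (hit rate, false-alarm rate) for a given decision criterion.

    A stimulus is attributed to signal when its magnitude exceeds the criterion.
    """
    hit_rate = 1.0 - SIGNAL.cdf(criterion)         # signal correctly detected
    false_alarm_rate = 1.0 - NOISE.cdf(criterion)  # noise mistaken for signal
    return hit_rate, false_alarm_rate


# Compare a liberal, a neutral, and a conservative criterion.
for label, criterion in [("liberal", -0.5), ("neutral", 0.5), ("conservative", 1.5)]:
    hits, false_alarms = detection_rates(criterion)
    print(f"{label:>12}: hits={hits:.2f}  misses={1 - hits:.2f}  "
          f"false alarms={false_alarms:.2f}")
```

Under these assumptions, lowering the criterion raises the hit rate and the false-alarm rate together, and raising it lowers both. Because the two distributions overlap, no criterion removes both kinds of error at once.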
What follows from this consideration of uncertainty and decision criteria for visual perception is that the actual impact of factors that limit the amount of visual information available to an eyewitness (factors considered in more detail below) will depend on the criterion adopted. The criterion may reflect the values and prejudices of the eyewitness, his or her motivational and emotional state, and a variety of behavioral goals. In principle, the observer’s criterion can be altered by instruction or incentives, but it is important to note that the criterion held by an observer witnessing a crime scene cannot be anticipated, nor can it be altered after the fact. It is an “estimator variable,” which simply needs to be recognized and understood when evaluating eyewitness reports. By contrast, the decision criterion held by an observer at the time of identification can be controlled, and there may be valid reasons for doing so (see Chapter 5).5

In the following discussions of sensation, attention, and perception, the various means and conditions under which many different types of noise introduce uncertainty in visual signal detection (and thus fundamentally limit the accuracy of eyewitness identification) are addressed.

4 The criterion is sometimes referred to as bias.

Visual Sensation

When an observer views an object of any sort (such as a person) or events involving the object (a criminal act), patterns of light reflected from the environment are focused by the lens at the front of the eye and projected onto the back surface of the eye (the retina) to form the retinal image.
Light in the image is initially “sensed” by the activation of photoreceptors, and early stages of sensory processing function to detect spatial and temporal contrast along a number of dimensions, including intensity and wavelength of light.6 These contrast measurements are integrated by subsequent processing stages in the brain to yield representations of basic image features, or primitives, such as oriented image contours.7

Several sources of noise, or factors that limit the ratio of signal to noise, can restrict the visual information accessible to these early sensory processes. Some factors are inherent to the visual system and largely uncontrollable (e.g., the scattering of light by the fluid and tissues of the eye) and can be exacerbated by common observer-specific visual deficits (e.g., myopia, poor contrast sensitivity, or color blindness). Other factors are dependent on viewing conditions (e.g., the effects of viewing time and level of illumination).8 Both of these types of factors predictably influence the quantity of information—the visual signal strength—that a viewer gains from a visual scene, and thus the degree to which the perceptual experience can accurately reflect the properties of the external world.9 At the extreme, short viewing times and low levels of illumination simply reduce the number of correlated photons reaching the retina to the point where they scarcely exceed photon noise, and uncertainty is very high.10 At slightly longer viewing times and greater illumination levels, signal-to-noise levels improve, but there may remain marked limits on visual sensitivity. Visual acuity, for example, which is a measure of the ability to resolve the fine spatial details of a visual pattern, is known to decline significantly with decreases in illumination.11

5 L. Mickes, H. D. Flowe, and J. T. Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory: Comparing the Diagnostic Accuracy of Simultaneous and Sequential Lineups,” Journal of Experimental Psychology: Applied 18(4): 361–376 (2012).
6 M. Meister and M. Tessier-Lavigne, “Low-level Visual Processing: The Retina,” in Principles of Neural Science, 5th Edition, ed. E. Kandel, J. H. Schwartz, T. M. Jessell, S. A. Siegelbaum, and A. J. Hudspeth (New York: McGraw-Hill Professional, 2012), 577–601.
7 C. D. Gilbert, “Intermediate-level Visual Processing and Visual Primitives,” in Principles of Neural Science, 5th Edition, ed. E. Kandel, J. H. Schwartz, T. M. Jessell, S. A. Siegelbaum, and A. J. Hudspeth (New York: McGraw-Hill Professional, 2012), 602–620.
8 D. G. Pelli, “Uncertainty Explains Many Aspects of Visual Contrast Detection and Discrimination,” Journal of the Optical Society of America A 2(9): 1508–32 (1985). D. G. Pelli, “The Quantum Efficiency of Vision,” in Vision: Coding and Efficiency, ed. C. Blakemore (Cambridge: Cambridge University Press, 1990), 3–24. G. Sperling, “The Information Available in Brief Visual Presentations,” Psychological Monographs: General and Applied 74(11, Whole No. 498): 1–29 (1960).

Signal-to-noise loss can depend on the direction of the observer’s gaze. Visual acuity is highest at the observer’s center of gaze. The center is the part of your visual system that is used for fine sensing, such as reading or scrutinizing faces in a social context. Acuity drops off markedly with angular distance from this center, such that the quality and quantity of information sensed a mere 10 degrees from center are far less than what is available at the center of gaze.12 Under unrestricted viewing conditions, the movements of the eyes largely overcome the effects of gaze direction.
However, under the viewing conditions associated with a typical crime, this source of noise may place severe limitations on the ability of the observer to sense key pieces of information that are not present at the center of gaze. To appreciate the impact of these limitations, consider that patients with macular degeneration are effectively blinded in the region of the visual field possessing highest acuity, and must rely instead on the much-reduced quality of visual information gained from the peripheral visual field. To compensate for this clinical loss, images and text must be greatly magnified to enable comprehension—an option that is clearly not available to an eyewitness.

9 G. Sperling, “A Signal-to-Noise Theory of the Effects of Luminance on Picture Memory: Comment on Loftus,” Journal of Experimental Psychology: General 115(2): 189–192 (1986).
10 S. Hecht, S. Shlaer, and M. H. Pirenne, “Energy, Quanta, and Vision,” Journal of General Physiology 25(6): 819–840 (1942).
11 P. W. Cobb, “The Influence of Illumination of the Eye on Visual Acuity,” American Journal of Physiology 29: 76–99 (1911). S. Hecht, “A Quantitative Basis for the Relation Between Visual Acuity and Illumination,” Proceedings of the National Academy of Sciences 13: 569–574 (1927). S. Shlaer, “The Relation Between Visual Acuity and Illumination,” Journal of General Physiology 21(2): 165–188 (1937).
12 H. Strasburger, I. Rentschler, and M. Jüttner, “Peripheral Vision and Pattern Recognition: A Review,” Journal of Vision 11(5):13, 1–82 (2011).

Visual Attention

Light falling on all parts of the retina is available to be sensed—and must be sensed for it to be available for further processing—but only a small fraction of the information sensed reaches awareness or is used by the observer for recognition, action, or storage in memory.
This limited access to visual sensory information is a product of selective attention.13 Attention is an active process that can be directed by external factors—visual attributes with high salience, such as a bright light or an unfamiliar object—or by internal control.14 If you are searching for a coffee cup, for example, you may explicitly direct your attention to the table where it was last seen. Attention can be directed to different types of image content, including specific locations in space,15 specific image features (such as a specific color),16 or specific objects (such as the coffee cup).17

Attended image content is transiently enhanced to increase the fidelity of visual experience.18 Attention interacts with sensory processing, for example, by selectively enhancing contrast19 and potentially overcoming low signal-to-noise levels resulting from limited viewing time or illumination.20 The effects of attention on contrast enhancement can be potentiated further when attention is commanded by emotionally laden stimuli.21 Image content not falling within the focus of attention is processed with less fidelity.22 In some cases, unattended content is effectively invisible: It does not reach awareness, it is not perceived, and it is not available for use in guiding decisions or actions, or for storage in memory.23

Different pieces of visual information compete for selection,24 as their attributes of physical salience, location in space, novelty, and relevance to the observer’s needs and behavioral goals are always changing.25 The outcome of the competition is highly susceptible to noise (in this instance, noise is defined as uncontrolled factors that bias the focus of attention and create uncertainty about the content of a visual scene), because the informational content of the visual image vastly exceeds what can be attended at any point in time. The implications of such noise for eyewitness identification are profound. An observer must “select” what to attend to, often within a short window of time, without advance warning, in the presence of many novel objects and events, and under such confounding influences as anxiety and fear.

The signal detection framework is readily adaptable to the problem of noise in visual attention and provides some insights into the limits of attentional selection in the presence of noise.26 In essence, this signal detection approach quantifies the extent to which multiple items competing with one another for attention affect attentional enhancement for any one of the items.27 Reductions in efficiency are common under such noise conditions. Indeed, sensitivity to unattended items can be markedly reduced under conditions of high “perceptual load,” in which there are many objects simultaneously competing for attention.28 The spacing of items in the visual field also impacts visual sensitivity.29 When objects are closely spaced, their discriminability is reduced.

13 W. James, Principles of Psychology (New York: Henry Holt, 1890); H. Pashler, J. Johnston, and E. Ruthruff, “Attention and Performance,” Annual Review of Psychology 52: 629–651 (2001).
14 M. I. Posner, “Orienting of Attention,” Quarterly Journal of Experimental Psychology 32: 3–25 (1980).
15 Ibid.
16 A. F. Rossi and M. A. Paradiso, “Feature-specific Effects of Selective Visual Attention,” Vision Research 35(5): 621–634 (1995).
17 J. Duncan, “Selective Attention and the Organization of Visual Information,” Journal of Experimental Psychology: General 113(4): 501–517 (1984).
18 H. Pashler, J. Johnston, and E. Ruthruff, “Attention and Performance,” Annual Review of Psychology 52: 629–651 (2001).
19 M. Carrasco et al., “Attention Alters Appearance,” Nature Neuroscience 7: 308–313 (2004).
20 M. I. Posner, C. R. Snyder, and B. J. Davidson, “Attention and the Detection of Signals,” Journal of Experimental Psychology 109(2): 160–174 (1980). M. Carrasco and B. McElree, “Covert Attention Accelerates the Rate of Visual Information Processing,” Proceedings of the National Academy of Sciences 98(9): 5363–5367 (2001). Y. Yeshurun and M. Carrasco, “Attention Improves or Impairs Visual Performance by Enhancing Spatial Resolution,” Nature 396: 72–75 (1998). M. Carrasco et al., “Covert Attention Increases Spatial Resolution with or without Masks: Support for Signal Enhancement,” Journal of Vision 2(6): 467–79 (2002). E. Blaser et al., “Measuring the Amplification of Attention,” Proceedings of the National Academy of Sciences 96(20): 11681–11686 (1999). K. Anton-Erxleben and M. Carrasco, “Attentional Enhancement of Spatial Resolution: Linking Behavioural and Neurophysiological Evidence,” Nature Reviews Neuroscience 14(3): 188–200 (2013). J. W. Couperus and G. R. Mangun, “Signal Enhancement and Suppression During Visual-Spatial Selective Attention,” Brain Research 1359: 155–177 (2010).
21 E. A. Phelps, S. Ling, and M. Carrasco, “Emotion Facilitates Perception and Potentiates the Perceptual Benefits of Attention,” Psychological Science 17(4): 292 (2006).
22 Posner, Snyder, and Davidson, “Attention and the Detection of Signals.” Y. Yeshurun and M. Carrasco, “Attention Improves or Impairs Visual Performance by Enhancing Spatial Resolution,” Nature 396: 72–75 (November 1998).
23 A. Mack and I. Rock, Inattentional Blindness (Cambridge, MA: MIT Press, 1998).
24 R. Desimone and J. Duncan, “Neural Mechanisms of Selective Visual Attention,” Annual Review of Neuroscience 18: 193–222 (March 1995).
25 J. M. Wolfe and T. S. Horowitz, “What Attributes Guide the Deployment of Visual Attention and How Do They Do It?” Nature Reviews Neuroscience 5: 495–501 (June 2004). H. E. Egeth and S. Yantis, “Visual Attention: Control, Representation, and Time Course,” Annual Review of Psychology 48(1): 269–297 (February 1997). M. I. Posner, “Orienting of Attention,” Quarterly Journal of Experimental Psychology 32(1): 3–25 (1980). A. Treisman and G. Gelade, “A Feature Integration Theory of Attention,” Cognitive Psychology 12(1): 97–136 (January 1980). L. Itti and C. Koch, “A Saliency-based Search Mechanism for Overt and Covert Shifts of Visual Attention,” Vision Research 40(10–12): 1489–1506 (June 2000).
26 G. Sperling and M. J. Melchner, “The Attention Operating Characteristic: Examples from Visual Search,” Science 202(4365): 315–318 (October 1978). G. Sperling and B. A. Dosher, “Strategy and Optimization in Human Information Processing,” in Handbook of Perception and Human Performance, ed. K. Boff, L. Kaufman, and J. Thomas (New York: Wiley, 1986).
27 Ibid.
One explanation offered for this “crowding effect” is that the spacing of visual items is smaller than the resolution of visual attention.30 The visual phenomenon of crowding suggests that a crime committed in a visually complex scene, such as a sporting event, could easily place limits on the ability of a witness to accurately perceive the facial features of a perpetrator.

A related consequence of attentional noise is that competing interests can readily hijack the attentional focus. The technique of misdirection—one of the original mainstays of performance magic—directs attention to uninformative image content and exploits the invisibility of unattended features.31 The well-studied inattentional blindness effect is another example of this phenomenon, in which attention that is pre-directed to one behaviorally significant property of a visual scene precludes awareness of other features that also may be important.32 (For a dramatic demonstration of this effect, produced by Simons and Chabris,33 see http://tinyurl.com/inattentional-blindness.)

Inattentional blindness effects translate well to real-world interactions between people. An individual can be surprisingly unaware of surreptitious changes to the physical appearance of another person while engaged in conversation.34 One demonstration of this phenomenon involved two strangers (experimenter and pedestrian) in a brief face-to-face conversation on a sidewalk. At some point in the conversation an opaque door was carried between the two individuals, and another person with different appearance, clothing, and voice quickly replaced the experimenter. More than half of the participants (pedestrians) failed to notice that their conversation partner had changed. This finding suggests that naturally occurring events that briefly divert attention have the potential to markedly impair the accuracy of eyewitness identifications.

28 N. Lavie, “Perceptual Load as a Necessary Condition for Selective Attention,” Journal of Experimental Psychology: Human Perception and Performance 21(3): 451–468 (June 1995). J. W. Couperus, “Perceptual Load Influences Selective Attention Across Development,” Developmental Psychology 47(5): 1431–1439 (September 2011).
29 D. M. Levi, “Crowding—An Essential Bottleneck for Object Recognition: A Mini-review,” Vision Research 48: 635–654 (2008).
30 J. Intriligator and P. Cavanagh, “The Spatial Resolution of Visual Attention,” Cognitive Psychology 43: 171–216 (2001).
31 G. Kuhn et al., “Misdirection in Magic: Implications for the Relationship Between Eye Gaze and Attention,” Visual Cognition 16(2–3): 391–405 (2008). S. L. Macknik, S. Martinez-Conde, and S. Blakeslee, Sleights of Mind: What the Neuroscience of Magic Reveals About Our Everyday Deceptions (New York: Henry Holt and Co., 2010).
32 A. Mack and I. Rock, Inattentional Blindness (Cambridge, MA: MIT Press, 1998). U. Neisser and R. Becklen, “Selective Looking: Attending to Visually Specified Events,” Cognitive Psychology 7(4): 480–494 (October 1975). D. Simons, “Attentional Capture and Inattentional Blindness,” Trends in Cognitive Sciences 4(4): 147–155 (April 2000).
33 D. J. Simons and C. F. Chabris, “Gorillas in Our Midst: Sustained Inattentional Blindness for Dynamic Events,” Perception 28: 1059–1074 (1999).
34 D. J. Simons and D. T. Levin, “Failure to Detect Changes to People During a Real-World Interaction,” Psychonomic Bulletin and Review 5(4): 644–649 (1998).
Attentional hijacking is particularly characteristic of stimuli that elicit strong emotional responses, such as fear and arousal.35 Visual stimuli that trigger fear responses act as powerful external cues that command attention.36 While this potentiates sensitivity to those stimuli, at the considerable expense of sensitivity to others, it is often the case that the attended emotional stimuli are not the ones with relevant informational content.37 The so-called weapon focus is a real-world case in point for eyewitness identification, in which attention is compellingly drawn to emotionally laden stimuli, such as a gun or a knife, at the expense of acquiring greater visual information about the face of the perpetrator (see also discussion of weapon focus in Chapter 5).38 (One might argue that this is an adaptation that benefits immediate action or engagement with a threatening stimulus, but it is surely detrimental to one’s efforts to bear witness.)

Visual Perception

Visual perception is the conscious functional result of efforts to identify the environmental causes of the pattern of light cast onto the back of the eye.39 Perception does not reflect the sensory world passively, as camera film detects patterns of light. On the contrary, visual perception is constructive and entails (1) integrating and segmenting attended attributes of the visual image into objects, (2) complementing and interpreting the product with expectations derived from memory of prior experiences with the world, and (3) assigning meaning and emotional valence by reference to prior knowledge of function and value.40 All of these perceptual processes are affected by noise. Because the things perceived are the things we place into memory, perceptual noise can dramatically limit the accuracy of eyewitness identification.

35 C. H. Hansen and R. D. Hansen, “Finding the Face in the Crowd: An Anger Superiority Effect,” Journal of Personality and Social Psychology 54: 917–924 (1988). E. Fox et al., “Facial Expressions of Emotion: Are Angry Faces Detected More Efficiently?” Cognition and Emotion 14(1): 61–92 (2000). R. Compton, “The Interface Between Emotion and Attention: A Review of Evidence from Psychology and Neuroscience,” Behavioral and Cognitive Neuroscience Reviews 2(2): 115–129 (2003). R. L. Bannerman, E. V. Temminck, and A. Sahraie, “Emotional Stimuli Capture Spatial Attention But Do Not Modulate Spatial Memory,” Vision Research 65: 12–20 (15 July 2012).
36 J. A. Easterbrook, “The Effects of Emotion on Cue Utilization and the Organization of Behavior,” Psychological Review 66(3): 183–201 (1959).
37 E. Ferneyhough et al., “Anxiety Modulates the Effects of Emotion and Attention on Early Vision,” Cognition and Emotion 27(1): 166–176 (2013). G. Pourtois and P. Vuilleumier, “Dynamics of Emotional Effects on Spatial Attention in the Human Visual Cortex,” Progress in Brain Research 156: 67–91 (2006).
38 T. Kramer, R. Buckhout, and P. Eugenio, “Weapon Focus, Arousal, and Eyewitness Memory: Attention Must Be Paid,” Law and Human Behavior 14(2): 167–184 (1990). R. S. Truelove, “Do Weapons Automatically Capture Attention,” Applied Cognitive Psychology 20(7): 871–893 (2006). E. F. Loftus, G. R. Loftus, and J. Messo, “Some Facts About ‘Weapon Focus’,” Law and Human Behavior 11(1): 55–62 (1987).
39 W. James, Principles of Psychology (New York: Henry Holt, 1890). S. Harnad, ed., Categorical Perception: The Groundwork of Cognition (New York: Cambridge University Press, 1987). T. D. Albright, “Perceiving,” Daedalus (in press).
The process of feature integration and interpretation may be distorted by images of an object unique to a specific angle of view.41 The retinal pattern generated by a face viewed directly from the front differs considerably—with changes in aspect ratio and relative placement of facial features—from that generated by a face viewed from an oblique side angle. Viewing a face from an angle above or below center (as might be the case if the criminal were standing over you, or below you on the stairs) also yields retinal distortions of facial features. In this case, the distortions prominently mimic facial gestures of smiling versus frowning, and perhaps cause incorrect inferences about the emotional state of the person observed and his or her intentions and motivations. (This distortion is the basis for the Japanese Noh Theatre mask effect, in which a rigid mask tilted forward leads to the appearance of a smile and backward leads to the appearance of a frown—an effect you can simulate by simply looking into the mirror and tilting your face up or down.)42

Viewing conditions can also affect the perception of face, gender, and age.43 Investigators found that faces that were physically identical—and particularly those bordering on androgyny—were perceived as unambiguously male or female depending on where they appeared in the observer’s visual field. The spatial patterning of these effects was distinctive and stable for each observer. Perceptual distortions of this sort are a source of noise that may have important implications for the accuracy of eyewitness identification.

40 C. D. Gilbert, “The Constructive Nature of Visual Processing,” in Principles of Neural Science, 5th Edition, ed. E. Kandel, J. H. Schwartz, T. M. Jessell, S. A. Siegelbaum, and A. J. Hudspeth (New York: McGraw-Hill Professional, 2012). T. D. Albright, “On the Perception of Probable Things: Neural Substrates of Associative Memory, Imagery, and Perception,” Neuron 74(2): 227–245 (2012).
41 W. G. Hayward and P. Williams, “Viewpoint Dependence and Object Discriminability,” Psychological Science 11(1): 7–12 (2000).
42 M. J. Lyons et al., “The Noh Mask Effect: Vertical Viewpoint Dependence on Facial Expression Perception,” Proceedings of the Royal Society B: Biological Sciences 267(1459): 2239–2245 (2000).
43 A. Afraz, M. Vaziri-Pashkam, and P. Cavanagh, “Spatial Heterogeneity in the Perception of Face and Form Attributes,” Current Biology 20(23): 2112–2116 (2010).

Perceptual distortions also may be introduced through memory recall. The way an observer experiences a visual scene—the setting, the people, and the actions associated with a crime—is commonly influenced as much by expectations from prior experience with the world as it is by the precise patterns of light cast upon the retina. There are good reasons why this is true. As noted above, the sensory input (the pattern of light received) is often noisy, incomplete, and ambiguous, and memories of what is likely to be out there, given the context, are called on to fill in the blanks, reconcile ambiguities, and leave clear and coherent percepts.44 This perceptual completion is probabilistic.45 It is a hypothesis, and its accuracy naturally depends on the degree to which the observer’s expectations match the noisy sensory data. What is implied is that the same mechanism that grants the certainty of perceptual experience in the face of noise and ambiguity is also capable of implicitly fabricating content that does not correspond to external reality and yet is experienced with no less certainty.
Performance magic relies on this constructive nature of perceptual experience, and that nature is also the foundation for many visual illusions and forms of visual art.46

In a classic experiment that drives home the point, Bruner and Postman looked at the ability of observers to recognize “trick” playing cards.47 The trick cards were created by altering the color of a given suit (e.g., a red seven of spades). Observers were shown a series of cards with brief presentations. Some cards were trick, and the remainder normal. With astonishing frequency, observers reported that the trick cards were normal. When questioned, observers defended their reports, even after being allowed to scrutinize the trick cards, thus demonstrating that learned properties of the world are capable of sharply altering our experience and, moreover, reinforcing our convictions about what we have seen, even in the face of countermanding sensory evidence. In view of this inherent dependence of perception on prior experiences and context—and, importantly, the fact that the viewer is commonly none the wiser when perception differs from the “ground truth” of the external world—it appears that accurate eyewitness identification may be difficult to achieve.

Additional noise (in this case defined as uncertainty resulting from loss of perceptual resolution) may result from the fact that visual perception is categorical.48 Although the objects of our experience vary broadly along multiple sensory dimensions, we lump them into categories based upon prior associations, many of which stem from common functions, physical properties, meanings, or emotional valence. Apples in a basket or the many typographic fonts for the letter “A” are visually distinct, yet we readily perceive them as categorically identical. For most behavioral and cognitive goals, perceptual processing is greatly simplified by treating all members of a category as the same, despite their differences. It rarely matters, for example, whether the apple we choose is dappled on one side or irregular in shape, nor does the font used bear greatly on our ability to read. One of the functional corollaries of categorical perception is that observers are far better at discriminating between objects from different categories than objects from the same category.49 Evidence indicates that the structure of object memory is also categorical, suggesting that perceived objects are encoded in memory as a category type, often without specific detail.50

Perceptual categorization naturally applies to faces.51 We readily categorize faces by distinctions along the obvious dimensions of gender, age, and race, but we also draw distinctions along dimensions such as skin tone, hair color and style, presence and type of facial hair, such subtler factors as shape of cheeks and jaw, and subjective qualities such as attractiveness. The practical consequence of this for eyewitness identification is that the precision of a perceptual experience may be reduced within any of these categories, particularly because we typically witness criminal events for such a brief period of time. The ensuing memory of the experience will likely reflect that reduced precision, and the memory retrieved may regress to a category prototype or to other exemplars of the perceived category.52 The witness may categorically perceive a square-jawed man with a moustache, but the fine details needed for individuation of a suspect are neither perceived nor encoded in memory. For example, although you may have seen the iconic Marlboro Man countless times on billboards and in magazines, it is unlikely that you could distinguish him in a lineup from other square-jawed mustachioed men.

44 Albright, “On the Perception of Probable Things.”
45 D. C. Knill and W. Richards, eds., Perception as Bayesian Inference (Cambridge: Cambridge University Press, 1996). D. Kersten, “High-level Vision as Statistical Inference,” in The New Cognitive Neurosciences, 2nd Edition, ed. M. S. Gazzaniga (Cambridge: MIT Press, 1999), 353–363. D. Kersten, P. Mamassian, and A. Yuille, “Object Perception as Bayesian Inference,” Annual Review of Psychology 55: 271–304 (February 2004).
46 E. H. Gombrich, Art and Illusion: A Study in the Psychology of Pictorial Representation (London: Phaidon, 1960). T. D. Albright, “The Veiled Christ of Cappella Sansevero: On Art, Vision and Reality,” Leonardo 46(1): 19–23 (2013). Macknik, Martinez-Conde, and Blakeslee, Sleights of Mind: What the Neuroscience of Magic Reveals About Our Everyday Deceptions (New York: Henry Holt and Co., 2010).
47 J. S. Bruner and L. Postman, “On the Perception of Incongruity: A Paradigm,” Journal of Personality 18(2): 206–223 (1949).
48 W. James, Principles of Psychology (New York: Henry Holt, 1890). S. Harnad, ed., Categorical Perception: The Groundwork of Cognition (New York: Cambridge University Press, 1987).
49 R. Goldstone, “Influences of Categorization on Perceptual Discrimination,” Journal of Experimental Psychology: General 123(2): 178–200 (1994). R. Goldstone, Y. Lippa, and R. M. Shiffrin, “Altering Object Representations Through Category Learning,” Cognition 78(1): 27–43 (2001).
50 E. Tulving, “Episodic and Semantic Memory,” in Organization of Memory, ed. E. Tulving and W. Donaldson (New York: Academic Press, 1972), 381–403. L. K. Tyler et al., “Processing Objects at Different Levels of Specificity,” Journal of Cognitive Neuroscience 16(3): 351–362 (2004). M. J. Farah and J. L. McClelland, “A Computational Model of Semantic Memory Impairment: Modality Specificity and Emergent Category Specificity,” Journal of Experimental Psychology: General 120(4): 339–357 (1991). C. Gerlach et al., “Categorization and Category Effects in Normal Object Recognition: A PET Study,” Neuropsychologia 38(13): 1693–1703 (2000). G. W. Humphreys and E. M. Forde, “Hierarchies, Similarity, and Interactivity in Object Recognition: ‘Category-Specific’ Neuropsychological Deficits,” Behavioral and Brain Sciences 24(3): 453–476 (2001).
51 J. M. Beale and F. C. Keil, “Categorical Effects in the Perception of Faces,” Cognition 57(3): 217–239 (1995). D. T. Levin, “Classifying Faces by Race: The Structure of Face Categories,” Journal of Experimental Psychology: Learning, Memory, and Cognition 22(6): 1364–1382 (1996). D. T. Levin and J. Beale, “Categorical Perception Occurs in Newly Learned Faces, Cross-Race Faces, and Inverted Faces,” Perception and Psychophysics 62: 386–401 (2000). M. A. Webster et al., “Adaptation to Natural Facial Categories,” Nature 428(6982): 557–561 (2004). Y. Lee et al., “Broadly Tuned Face Representation in Older Adults Assessed by Categorical Perception,” Journal of Experimental Psychology: Human Perception and Performance 40(3): 1060–1071 (2014).
MEMORY

Functional Processes of Memory

Conscious visual perceptual experiences, rendered by the processes described in the previous section on vision, are commonly stored as declarative memories, meaning that they can be consciously accessed and expressed as knowledge about the world (as distinct from procedural memories, such as motor skills).53 Declarative memories are of two types, semantic and episodic, reflecting a distinction between memories of meanings, facts, and concepts versus memories of events (such as those witnessed during a crime).54 Declarative memories are conceptualized as involving three core processes—encoding, storage, and retrieval—which refer to the placement of items in memory, their maintenance therein, and subsequent access to the stored information.55

52 J. Huttenlocher, L. V. Hedges, and J. L. Vevea, “Why Do Categories Affect Stimulus Judgement?” Journal of Experimental Psychology: General 129(2): 220–241 (2000). R. Goldstone, Y. Lippa, and R. M. Shiffrin, “Altering Object Representations Through Category Learning,” Cognition 78(1): 27–43 (2001).
53 W. James, Principles of Psychology (New York: Henry Holt, 1890). B. Milner, Physiologie de l’hippocampe, ed. P. Passouant (Paris: Centre National de la Recherche Scientifique, 1962), 257–272. L. R. Squire and J. Wixted, “The Cognitive Neuroscience of Human Memory since H.M.,” Annual Review of Neuroscience 34: 259–288 (2011).
54 Tulving, “Episodic and Semantic Memory.”
55 E. Tulving, “Organization of Memory: Quo vadis?” in The Cognitive Neurosciences, ed. M. S. Gazzaniga (Cambridge, MA: MIT Press, 1995), 839–847.

Like vision, memory is also beset by noise. Encoding, storage, and remembering are not passive, static processes that record, retain, and divulge their contents in an informational vacuum, unaffected by outside influences.
The contents cannot be treated as a veridical permanent record, like photographs stored in a safe. On the contrary, the fidelity of our memories for real events may be compromised by many factors at all stages of processing, from encoding through storage, to the final stages of retrieval. Without awareness, we regularly encode events in a biased manner and subsequently forget, reconstruct, update, and distort the things we believe to be true.56

The following sections discuss memory encoding, storage, and retrieval, with emphasis on the limits of these processes as they pertain to eyewitness identification. Emotions can strongly influence these processes of memory; some specific actions are highlighted. The phenomenon of “recognition memory” is also discussed. This refers to the specific type of memory retrieval in which a stimulus (e.g., a face) is used to probe memory, and the rememberer (e.g., an eyewitness) must decide whether the strength of the elicited memory evidence is sufficient to declare that the stimulus was previously encountered or is novel. Recognition memory underlies eyewitness identification, as the witness must make a recognition decision.

Memory Encoding

Memory encoding refers to the process whereby perceived objects and events are initially placed into storage. The encoding process involves two stages, which are commonly distinguished by the quantity of information stored, the duration of storage, and the susceptibility to interference.57 Short-term or working memory is the conscious content of recent perceptual experiences or information recently recalled from long-term storage. Information that remains at the focus of attention persists in and forms the contents of short-term memory. This form of memory is of limited duration and capacity58 and labile, decaying quickly with time and easily disrupted by other perceptual or cognitive processes.59 Through cellular and molecular events that play out over time, the contents of short-term memories may be encoded and consolidated into long-term memory,60 which is more enduring (albeit evolving with ongoing experience), and of greater capacity.

56 J. T. Wixted, “The Psychology and Neuroscience of Forgetting,” Annual Review of Psychology 55: 235–269 (2004). E. Tulving and D. M. Thomson, “Encoding Specificity and Retrieval Processes in Episodic Memory,” Psychological Review 80(5): 352–373 (1973). Y. Dudai, “Reconsolidation: The Advantage of Being Refocused,” Current Opinion in Neurobiology 16(2): 174–178 (2006). E. F. Loftus, “Planting Misinformation in the Human Mind: A 30-Year Investigation of the Malleability of Memory,” Learning and Memory 12(4): 361–366 (2005). R. A. Bjork, “Interference and Memory,” in Encyclopedia of Learning and Memory, ed. L. R. Squire (New York: Macmillan, 1992), 283–288. J. A. McGeoch, “Forgetting and the Law of Disuse,” Psychological Review 39(4): 352–370 (1932). J. G. Jenkins and K. M. Dallenbach, “Obliviscence during Sleep and Waking,” The American Journal of Psychology 35(4): 605–612 (1924). B. J. Underwood and L. Postman, “Extra-Experimental Sources of Interference in Forgetting,” Psychological Review 67(2): 73–95 (1960).
57 R. C. Atkinson and R. M. Shiffrin, “Human Memory: A Proposed System and its Control Processes,” in The Psychology of Learning and Motivation (Volume 2), ed. K. W. Spence and J. T. Spence (New York: Academic Press, 1968), 89–195. W. James, Principles of Psychology (New York: Henry Holt, 1890). A. Baddeley, “Working Memory: Looking Back and Looking Forward,” Nature Reviews Neuroscience 4(10): 829–839 (2003). A. Baddeley, Working Memory (New York: Oxford University Press, 1986).
The structure of an individual’s full library of long-term declarative memories can be thought of as a collection of associations between items of specific semantic (e.g., the fact that person X is a 34-year-old female) or episodic content (e.g., the fact that person X was at location Y on the night of the witnessed crime).61 As the individual gains new experiences, long-term declarative memories may be updated by adding new content to the existing library or by forming new associations between existing content.62

Memories are particularly labile during the encoding process. The contents of short-term memory are limited and highly subject to interference by subsequent sensory, cognitive, emotional, or behavioral events; the contents can also be biased by prior knowledge, expectations, or beliefs, resulting in a distorted representation of experience. Short-term memories of events that happened early in a witnessed proceeding may simply be forgotten with the passage of time or badly compromised by attention directed to subsequent emotional events or cognitive and behavioral demands (e.g., anxiety, fear, the need to escape). In such cases, the compromised information may never be consolidated fully into long-term storage or that storage may contain distorted content.63 At the same time, the quality of encoding of stimuli that are attended is commonly enhanced by highly emotional content.64

58 G. A. Miller, “The Magical Number Seven,” The Psychological Review 63(2): 81–97 (1956).
59 J. Jonides et al., “The Mind and Brain of Short-Term Memory,” Annual Review of Psychology 59: 193–224 (2008).
60 E. Kandel and L. Squire, Memory: From Mind to Molecules (New York: Scientific American Library, 2008).
61 J. R. Anderson, The Architecture of Cognition (Cambridge: Harvard University Press, 1983). J. R. Anderson and C. Lebiere, The Atomic Components of Thought (Mahwah: Lawrence Erlbaum Associates, 1998).
62 M. P. Walker et al., “Dissociable Stages of Human Memory Consolidation and Reconsolidation,” Nature 425: 616 (2003).
63 J. L. McGaugh, “Memory—a Century of Consolidation,” Science 287(5451): 248–251 (2000). J. L. McGaugh and B. Roozendaal, “Role of Adrenal Stress Hormones in Forming Lasting Memories in the Brain,” Current Opinion in Neurobiology 12(2): 205–210 (2002).
64 K. N. Ochsner, “Are Affective Events Richly Recollected or Simply Familiar? The Experience and Process of Recognizing Feelings Past,” Journal of Experimental Psychology: General 129(2): 242–261 (2000). D. Talmi et al., “Immediate Memory Consequences of the Effect of Emotion on Attention to Pictures,” Learning and Memory 15(2008): 172–182. E. A. Kensinger and D. L. Schacter, “Neural Processes Supporting Young and Older Adults’ Emotional Memories,” Journal of Cognitive Neuroscience 7 (2008): 1–13. E. A. Phelps, “Emotion and Cognition: Insights from Studies of the Human Amygdala,” Annual Review of Psychology 57: 27–53 (2006).

Memory Storage

Memory storage refers to the long-term retention of information after encoding. The stability of stored information is continuously challenged and subject to modification. We forget, qualify, or distort existing memories as we acquire new perceptual experiences and encode new content and associations into memory.65

Forgetting can be partially mitigated, and memories stabilized, by habits of retrieval (or reactivation) and reconsolidation, which happen whenever we tell the story of our experiences.66 Reactivation is not perfect. With each implicit retrieval or explicit telling of a story, we may unconsciously smooth over inconsistencies or modify content based on our prior beliefs, the accounts of others, or through the lens of new information.
We may add embellishments that reflect opinions, emotions, or prejudices67 rather than observed facts; or we may simply omit disturbing content and pass over fine details.68

65 J. T. Wixted, “The Psychology and Neuroscience of Forgetting,” Annual Review of Psychology 55: 235–269 (2004). Tulving and Thomson, “Encoding Specificity and Retrieval Processes.” Y. Dudai, “Reconsolidation: The Advantage of Being Refocused,” Current Opinion in Neurobiology 16(2): 174–178 (2006). E. F. Loftus, “Planting Misinformation in the Human Mind: A 30-Year Investigation of the Malleability of Memory,” Learning and Memory 12(4): 361–366 (2005). R. A. Bjork, “Interference and Memory,” in Encyclopedia of Learning and Memory, ed. L. R. Squire (New York: Macmillan, 1992), 283–288. J. A. McGeoch, “Forgetting and the Law of Disuse,” Psychological Review 39(4): 352–370 (1932). J. G. Jenkins and K. M. Dallenbach, “Obliviscence During Sleep and Waking,” The American Journal of Psychology 35(4): 605–612 (1924). B. J. Underwood and L. Postman, “Extra-Experimental Sources of Interference in Forgetting,” Psychological Review 67(2): 73–95 (1960). E. F. Loftus, “The Malleability of Human Memory,” American Scientist 67(3): 312–320 (1979). D. J. Yi et al., “When a Thought Equals a Look: Refreshing Enhances Perceptual Memory,” Journal of Cognitive Neuroscience 20(8): 1371–1380 (2008).
66 C. M. Alberini, Memory Reconsolidation (Waltham: Academic Press, 2013).
67 D. L. Schacter, Psychology, Second Edition (New York: Worth Publishers, 2011), 253–254. E. F. Loftus and H. G. Hoffman, “Misinformation and Memory, the Creation of New Memories,” Journal of Experimental Psychology 118(1): 100–104 (1989). G. Mazzoni and A. Memon, “Imagination Can Create False Autobiographical Memories,” Psychological Science 14(2): 186–188 (2003).
68 F. C. Bartlett, Remembering: A Study in Experimental and Social Psychology (London: Cambridge University Press, 1932).

A second threat to the stability of long-term memories is, ironically, our life-long ability to learn new things. Because memory mechanisms are inherently plastic throughout life, content stored for the long term is surprisingly labile in the face of new information. Our memories are thus an ever-evolving account of our experiences. A memory that reflects witnessing person X at location Y on a particular evening might be readily and notably updated by subsequent learning that location Y is the home of a business associate of person X. Our memories of the witnessed actions of person X may be qualified by new knowledge of his or her life history. Moreover, because new content can be added and the source of that content forgotten, we may attribute our updated memories to the originally witnessed events—in some cases substantially changing what we believe we have seen.69 It is thus not surprising that newly incorporated information need not be true to fact. Research on false memories shows that it is possible to plant fabricated content in memory, which leads us to recall things we never experienced.70

69 D. S. Lindsay and M. K. Johnson, “Recognition Memory and Source Monitoring,” Bulletin of the Psychonomic Society 29(3): 203–205 (1991). D. L. Schacter and C. S. Dodson, “Misattribution, False Recognition and the Sins of Memory,” Philosophical Transactions of the Royal Society: Biological Sciences 356(1413): 1385–1393 (2001). L. A. Henkel, N. Franklin, and M. K. Johnson, “Cross-Modal Source Monitoring Confusions Between Perceived and Imagined Events,” Journal of Experimental Psychology: Learning, Memory, and Cognition 26(2): 321–335 (2000). D. L. Schacter, ed., Memory Distortion: How Minds, Brains, and Societies Reconstruct the Past (Cambridge, MA: Harvard University Press, 1995). K. J. Mitchell and M. K. Johnson, “Source Monitoring: Attributing Mental Experiences,” in The Oxford Handbook of Memory, ed. E. Tulving and F. I. M. Craik (New York: Oxford University Press, 2000), 179–195. H. L. Roediger III and K. B. McDermott, “Creating False Memories: Remembering Words Not Presented in Lists,” Journal of Experimental Psychology: Learning, Memory, and Cognition 21(4): 803–814 (1995).
70 Loftus, “Planting Misinformation in the Human Mind.” E. F. Loftus and J. E. Pickrell, “The Formation of False Memories,” Psychiatric Annals 25(12): 720–725 (1995). M. K. Johnson and C. L. Raye, “False Memories and Confabulation,” Trends in Cognitive Sciences 2(4): 137–145 (1998).
71 L. J. Kleinsmith and S. Kaplan, “Paired-Associate Learning as a Function of Arousal and Interpolated Interval,” Journal of Experimental Psychology 65(2): 190–193 (1963). M. W. Eysenck, “Arousal, Learning, and Memory,” Psychological Bulletin 83(3): 389–404 (1976). F. Heuer and D. Reisberg, “Vivid Memories of Emotional Events: The Accuracy of Remembered Minutiae,” Memory and Cognition 18(5): 496–506 (1990). T. Sharot and E. A. Phelps, “How Arousal Modulates Memory: Disentangling the Effects of Attention and Retention,” Cognitive, Affective, and Behavioral Neuroscience 4(3): 294–306 (2004). E. A. Kensinger, R. J. Garoff-Eaton, and D. L. Schacter, “Memory for Specific Visual Details Can Be Enhanced by Negative Arousing Content,” Journal of Memory and Language 54(1): 99–112 (2006). E. Kensinger, “Remembering Emotional Experiences: The Contribution of Valence and Arousal,” Reviews in the Neurosciences 15(4): 241–251 (2004).

The emotional content of stored memories is a factor that appears to promote long-term retention; memories of highly arousing emotional stimuli, such as those associated with a witnessed crime, tend to be more enduring than memories of non-arousing stimuli.71 Highly salient, unexpected, or arousing events—such as the Kennedy assassination or the Space Shuttle disaster—are commonly more strongly stored in memory, and their later retrieval is often associated with the subjective experience of high vividness and a sense of reliving72 (although not necessarily with greater accuracy, as detailed below). The stronger encoding and storage of emotional memories results from the engagement of a specialized system of stress hormones (glucocorticoids) which is triggered by arousing content and has potentiating effects on the neuronal processes underlying memory consolidation and storage.73 Despite the vividness and the sense of reliving that characterizes retrieval of emotional memories, there are many indications that such memories are just as prone to errors.74 This may reflect, in part, memory enhancements, of the sort described above, which accompany frequent re-consolidation or re-telling of the story of the emotional experience, and often include details (some true to fact, some not) learned after the experience.75 Although emotional memories are often inaccurate in detail, one important corollary of their vividness is that they are frequently held with high confidence.76 This breakdown of the relationship between accuracy and confidence can obviously undermine eyewitness accounts.77

72 G. Wolters and J. J. Goudsmit, “Flashbulb and Event Memory of September 11, 2001: Consistency, Confidence and Age Effect,” Psychological Report 96: 605–619 (2005). E. A. Kensinger, A. C. Krendl, and S. Corkin, “Memories of an Emotional and a Nonemotional Event: Effects of Aging and Delay Interval,” Experimental Aging Research 32: 23–45 (2006). U. Neisser and N. Harsch, “Phantom Flashbulbs: False Recollections of Hearing the News about Challenger,” in Affect and Accuracy in Recall: Studies of “Flashbulb” Memories, ed. E. Winograd and U. Neisser (New York: Cambridge University Press, 1992): 9–31. K. S. LaBar and E. A. Phelps, “Arousal-Mediated Memory Consolidation: Role of the Medial Temporal Lobe in Humans,” Psychological Science 9(6): 490–493 (1998).
73 J. L. McGaugh, “Memory: A Century of Consolidation,” Science 287(5451): 248–251 (2000). J. L. McGaugh and B. Roozendaal, “Role of Adrenal Stress Hormones in Forming Lasting Memories in the Brain,” Current Opinion in Neurobiology 12(2): 205–210 (2002).
74 E. A. Kensinger, “Remembering the Details: Effects of Emotion,” Emotion Review 1(2): 99–113 (2009). T. Sharot, M. R. Delgado, and E. A. Phelps, “How Emotion Enhances the Feeling of Remembering,” Nature Neuroscience 7(12): 1376–1380 (2004). H. Schmolck, E. A. Buffalo, and L. R. Squire, “Memory Distortions Develop over Time: Recollections of the O. J. Simpson Trial Verdict after 15 and 32 Months,” Psychological Science 11(1): 39–45 (2000). S. R. Schmidt, “Autobiographical Memories for the September 11th Attacks: Reconstructive Errors and Emotional Impairment of Memory,” Memory and Cognition 32(3): 443–454 (2004). T. W. Buchanan and R. Adolphs, “The Role of the Human Amygdala in Emotional Modulation of Long-Term Declarative Memory,” in Emotional Cognition: From Brain to Behavior, ed. S. Moore and M. Oaksford (Amsterdam: John Benjamins Publishing, 2002), 9–34.
75 E. Soleti et al., “Does Talking About Emotions Influence Eyewitness Memory? The Role of Emotional vs. Factual Retelling on Memory Accuracy,” Europe’s Journal of Psychology 8(4): 632–640 (2012).

The enduring plasticity of stored memories is a serious concern for the validity of eyewitness identification.
A witness’ inevitable interactions with law enforcement and legal counsel, not to mention communications from journalists, family, and friends, have the potential to significantly modify the witness’ memory of faces encountered and of other event details at the scene of the crime.78 Thus, the fidelity of retrieved events—and the accuracy of identification—is likely to be greater when retrieval occurs closer to the time of the witnessed events. The conclusion above has important implications for law enforcement and the legal process and calls into question the validity of in-court identifications and their appropriateness as statements of fact.

Memory Retrieval

Memory retrieval refers to the process by which stored information is accessed and brought into consciousness, where it can be used to make decisions and guide actions. Retrieval of long-term declarative memories is often triggered through association with an external stimulus (i.e., a retrieval cue).79 For example, the slight stubble on a lineup participant’s face may be enough to elicit retrieval of a suspect’s entire face. These same retrieval processes can also be engaged internally—a verbally triggered stream of thought related to a witnessed crime may readily bring to mind visual features of the perpetrator. A corollary of this association-based phenomenon is that memory retrieval is often context dependent; a memory may be more readily retrieved if the observer is in physical surroundings that are the same as or similar to those in which the original experiences took place (because the surroundings provide additional cues to trigger memory retrieval).80

76 U. Rimmele et al., “Emotion Enhances the Subjective Feeling of Remembering, Despite Lower Accuracy for Contextual Details,” Emotion 11(3): 553–562 (2011). Kensinger, “Remembering the Details.” Neisser and Harsch, Affect and Accuracy in Recall. E. A. Phelps and T. Sharot, “How (and Why) Emotion Enhances the Subjective Sense of Recollection,” Current Directions in Psychological Science 17(2): 147–152 (2008).
77 K. A. Houston et al., “The Emotional Eyewitness: The Effects of Emotion on Specific Aspects of Eyewitness Recall and Recognition Performance,” Emotion 13(1): 118–128 (2013). R. B. Edelstein et al., “Emotion and Eyewitness Memory,” in Memory and Emotion, ed. D. Reisberg and P. Hertel (New York: Oxford University Press, 2004): 308–346. S-A. Christianson, “Emotional Stress and Eyewitness Memory: A Critical Review,” Psychological Bulletin 112(2): 284–309 (1992).
78 M. S. Zaragoza and S. M. Lane, “Sources of Misattribution and Suggestibility of Eyewitness Memory,” Journal of Experimental Psychology: Learning, Memory, and Cognition 20(4): 934–945 (1994). W. C. Thompson, K. A. Clarke-Stewart, and S. J. Lepore, “What Did the Janitor Do? Suggestive Interviewing and the Accuracy of Children’s Accounts,” Law and Human Behavior 21(4): 405–426 (1997). D. S. Lindsay and M. K. Johnson, “The Eyewitness Suggestibility Effect and Memory for Source,” Memory and Cognition 17(3): 349–358 (1989).
79 E. Tulving and Z. Pearlstone, “Availability Versus Accessibility of Information in Memory for Words,” Journal of Verbal Learning and Verbal Behavior 5: 381–391 (1966).

Memory retrieval is heavily affected by various sources of noise.
Similarities of meaning or appearance between retrieval cues and items in memory can easily lead to retrieval of the wrong item, producing a false memory.81 This is particularly a problem given the categorical nature of memory.82 The rugged mustachioed man in the lineup may lead to retrieval of the familiar categorical prototype—the Marlboro Man—rather than the specific person perceived at the scene of the crime, which in turn could interfere with or lead to errors in recognition (i.e., identification). Another type of memory retrieval failure is caused by “intrusion errors,” in which information known to be commonly associated with events of a general type becomes incorporated into the retrieved content of a specific memory (and subsequently incorporated into the reconsolidated memory). For example, because guns are often associated with robbery, an observer may readily and unwittingly incorporate a gun into the retrieved version of his or her memory of a witnessed robbery.

Intrusion errors are one manifestation of a larger retrieval problem in which there is loss of information about the source of a memory. In cases of “source memory failure,” we effectively forget how we know things (forget when and where we learned the content of our memories). What this means practically is that we may attribute later acquisition of information to earlier experiences. An eyewitness might learn from the police or some other source that a potential suspect has a moustache and then attribute that knowledge to the witnessed events, which may, in turn, have disastrous consequences for the ability of the eyewitness to accurately report what she or he has seen.

80 D. Godden and A. Baddeley, “Context Dependent Memory in Two Natural Environments,” British Journal of Psychology 66(3): 325–331 (1975). S. M. Smith and E. Vela, “Environmental Context-Dependent Eyewitness Recognition,” Applied Cognitive Psychology 6: 125–139 (1992). S. M. Smith and E. Vela, “Environmental Context-Dependent Memory: A Review and Meta-Analysis,” Psychonomic Bulletin and Review 8(2): 203–220 (2001). Tulving and Thomson, “Encoding Specificity and Retrieval Processes.”
81 J. R. Anderson, “A Spreading Activation Theory of Memory,” Journal of Verbal Learning and Verbal Behavior 22(3): 261–295 (1983). A. M. Collins and E. F. Loftus, “A Spreading-Activation Theory of Semantic Processing,” Psychological Review 82(6): 407–428 (1975). H. L. Roediger III, D. A. Balota, and J. M. Watson, “Spreading Activation and Arousal of False Memories,” in The Nature of Remembering: Essays in Honor of Robert G. Crowder, ed. H. L. Roediger III, J. Nairne, I. Neath, and A. Surprenant (Washington, DC: American Psychological Association, 2001): 95–115. C. J. Brainerd and V. F. Reyna, The Science of False Memory (New York: Oxford University Press, 2005).
82 Tulving, “Episodic and Semantic Memory.” M. J. Farah and J. L. McClelland, “A Computational Model of Semantic Memory Impairment: Modality Specificity and Emergent Category Specificity,” Journal of Experimental Psychology: General 120(4): 339–357 (1991). G. W. Humphreys and E. M. Forde, “Hierarchies, Similarity, and Interactivity in Object Recognition: ‘Category-Specific’ Neuropsychological Deficits,” Behavioral and Brain Sciences 24(3): 453–476 (2001).

As for the processes of memory encoding and storage, the emotional content of memory also affects memory retrieval. As noted above, memory retrieval is commonly context dependent.
A related and well-documented phenomenon that bears on emotional memories is state dependent memory, in which retrieval accuracy is best if the individual’s cognitive state at the time of retrieval matches the cognitive state at the time of encoding.83 When memories have an emotional component, retrieval may be best when the individual is induced to a corresponding emotional state (mood dependent memory),84 which is accomplished by verbally or physically placing him or her in the same context, and may offer a valuable investigative tool for probing eyewitness accounts.85

83 D. W. Goodwin et al., “Alcohol and Recall: State-Dependent Effects in Man,” Science 163(3873): 1358–1360 (1969). Tulving and Thomson, “Encoding Specificity and Retrieval Processes,” Psychological Review 80(5): 352–373 (1973). E. Girden and E. Culler, “Conditioned Responses in Curarized Striate Muscle in Dogs,” Journal of Comparative Psychology 23(2): 261–274 (1937). D. A. Overton, “State-Dependent or ‘Dissociated’ Learning Produced with Pentobarbital,” Journal of Comparative and Physiological Psychology 57(1): 3–12 (1964).
84 P. M. Kenealy, “Mood State-Dependent Retrieval: The Effects of Induced Mood on Memory Reconsidered,” The Quarterly Journal of Experimental Psychology Section A: Human Experimental Psychology 50(2): 290–317 (1997). P. A. Lewis and H. D. Critchley, “Mood-Dependent Memory,” Trends in Cognitive Sciences 7(10): 431–433 (2003). G. H. Bower, “Mood and Memory,” American Psychologist 36(2): 129–148 (1981). F. I. M. Craik and R. S. Lockhart, “Levels of Processing: A Framework for Memory Research,” Journal of Verbal Learning and Verbal Behavior 11(6): 671–684 (1972). Kensinger, “Remembering the Details.” K. A. Leight and H. C. Ellis, “Emotional Mood States, Strategies, and State-Dependency in Memory,” Journal of Verbal Learning and Verbal Behavior 20(3): 251–266 (1981).
85 S. M. Smith and E. Vela, “Environmental Context-Dependent Eyewitness Recognition,” Applied Cognitive Psychology 6: 125–139 (1992).
86 G. Mandler, “Recognizing: The Judgment of Previous Occurrence,” Psychological Review 87(3): 252–271 (1980).

Recognition Memory

Recognition memory is a specific type of declarative memory retrieval in which a sensory stimulus (a “cue” stimulus) elicits a memory of the stimulus stored following a prior encounter and often the sequence of events involving the stimulus, the spatial context in which the stimulus was experienced, and the presence of other objects, people, or thoughts that had appeared with the stimulus during the event.86 Recognition memory decisions are based on the retrieved memory evidence, which can be triggered by the stimulus and can also emerge from an active search of items in memory. One factor affecting the strength of the evidence retrieved is the similarity between the cue stimulus and the stimulus or stimuli that was/were previously encountered during the event. An observer engaged in this process holds an implicit criterion for the strength of evidence required to reach a positive decision. In the case of eyewitness identification, this process is routinely elicited by viewing faces in a lineup. When the evidence retrieved is insufficient to reach a decision, this can lead to a cycle of ever-greater scrutiny of the cue stimulus and efforts to recollect additional details of the original event. Ultimately a decision must be made about whether the retrieved evidence is sufficient to declare that the stimulus was previously experienced (or previously experienced in the particular event of interest) or whether the stimulus is novel (or not from the event of interest).
If a recognition event occurs—that is, if the memory search triggered by one of the faces in the lineup leads to a strong enough subjective experience that the face is familiar and/or the recollection of sufficient event details—then the witness may declare that they recognize the face as having been previously encountered. Recognition memory decisions can thus be thought of as the final stage in the process of eyewitness identification.

Because it is a form of memory retrieval, recognition memory is susceptible to all of the factors summarized above that are known to interfere with retrieval. Recognition memory differs from other forms of retrieval (such as recalling a phone number or a cake recipe), however, in that a comparison must be made between the retrieved evidence and a decision threshold. That is, as noted above, recognition judgments require a decision criterion, an understanding of which presents a unique set of challenges for eyewitness identification (and recognition memory, generally). In particular, an observer’s report of recognition (or, in a lineup setting, of identification) is influenced not simply by the strength or quality of the recalled memory evidence. The report of recognition (identification of a lineup member) is also influenced by the level of evidence that the observer finds acceptable to reach such a decision, i.e., by his or her decision criterion, or bias. An observer who holds a liberal criterion will likely recognize many true targets (i.e., the guilty), but will frequently err by reporting recognition of many false targets (i.e., innocents). Conversely, an observer who holds a conservative criterion will avoid the problem of erroneous recognition (identification), but will fail to identify some true targets. Estimating (or controlling) the observer’s decision criterion is thus a critical step in efforts to judge the validity of an identification (see also Chapter 5).
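The tradeoff between a liberal and a conservative decision criterion can be illustrated with a minimal sketch. It assumes an equal-variance Gaussian signal detection model, a common textbook formalization that the text above does not itself specify. The function name `recognition_rates`, the parameter names, and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def recognition_rates(d_prime, criterion):
    """Return (hit_rate, false_alarm_rate) for one observer.

    Assumed model (not taken from the report): memory-evidence strength
    is distributed N(0, 1) for novel faces and N(d_prime, 1) for faces
    seen before. The observer reports "recognized" whenever the evidence
    exceeds `criterion`.
    """
    unit = NormalDist(0.0, 1.0)
    # Probability that evidence from a previously seen face exceeds the criterion.
    hit_rate = 1.0 - unit.cdf(criterion - d_prime)
    # Probability that evidence from a novel face exceeds the criterion.
    false_alarm_rate = 1.0 - unit.cdf(criterion)
    return hit_rate, false_alarm_rate


# Same memory strength (d_prime), two hypothetical criteria.
liberal = recognition_rates(d_prime=1.0, criterion=0.0)
conservative = recognition_rates(d_prime=1.0, criterion=1.5)

print("liberal      hit=%.3f false_alarm=%.3f" % liberal)
print("conservative hit=%.3f false_alarm=%.3f" % conservative)
```

With these illustrative numbers, the liberal observer reports recognition for roughly 84 percent of previously seen faces but also for 50 percent of novel ones, while the conservative observer's rates fall to roughly 31 percent and 7 percent. The underlying memory strength is identical in both cases; only the criterion differs, which is why a report of recognition cannot be interpreted without some estimate of the criterion.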
Recognition memory for faces differs greatly between familiar and unfamiliar faces.87 Because we often identify familiar individuals with ease, we tend to think we are generally very good at face recognition. However, we are not as good with unfamiliar faces.88 All of the sources of noise that influence perception and memory contribute to these difficulties, and they are exacerbated by the attempts by criminals to conceal their identity (even a change in hairstyle and clothing can have a major effect on recognition).

87 P. J. B. Hancock, V. Bruce, and A. M. Burton, “Recognition of Unfamiliar Faces,” Trends in Cognitive Sciences 4(9): 330–337 (2000).

The ability to recognize unfamiliar faces differs widely across individuals. At one extreme are those people, referred to as “super recognizers,” who rarely forget a face.89 At the other end of the spectrum are “face-blind people (prosopagnosics),” who have great difficulty recognizing even highly familiar faces.90 Current estimates of the fraction of the general population afflicted by prosopagnosia are as high as ~2 percent.91 The ability of an eyewitness to identify a suspect may thus differ greatly from individual to individual simply as a consequence of general variations in face recognition ability.

CONCLUSION

The shortcomings of eyewitness identification present a societal problem that has profound implications for our systems of law and justice. Ultimately, a solution to this problem must be informed by a thorough understanding of human vision and memory. The processes of vision and memory, which are fundamental to human experience, have been frequent targets of scientific investigation since the 19th century.
The past few decades have seen an explosion of additional research that has led to important insights into how vision and memory work, what we see and remember best, and what causes these processes to fail. The committee has reviewed much of this research, as it pertains to eyewitness identification, and has identified restrictions on what can be seen under specific environmental and behavioral conditions (e.g., poor illumination, limited viewing duration, viewing angle), factors that impede the ability to attend to critically informative features of a visual scene (e.g., the deleterious effect of an attention-grabbing element, such as a weapon, on the ability to correctly perceive the features of the assailant’s face), distortions of perceptual experience derived from expectations, and ways in which emotion and stress enhance or suppress specific perceptual experiences. Memory is often far from a faithful record of what was perceived through the sense of sight: its contents can be forgotten or contaminated at multiple stages, it can be biased by the very practices designed to elicit recall, and it is heavily swayed by emotional states associated with witnessed events and their recall.

88 V. Bruce, “Changing Faces: Visual and Non-Visual Coding Processes in Face Recognition,” British Journal of Psychology 73: 105–116 (1982).
89 R. Russell, B. Duchaine, and K. Nakayama, “Super-Recognisers: People with Extraordinary Face Recognition Ability,” Psychonomic Bulletin and Review 16(2): 252–272 (2009).
90 T. Susilo and B. Duchaine, “Advances in Developmental Prosopagnosia Research,” Current Opinion in Neurobiology 23(3): 423–429 (2013).
91 I. Kennerknecht et al., “First Report of Prevalence of Nonsyndromic Hereditary Prosopagnosia (HPA),” American Journal of Medical Genetics Part A 140(15): 1617–1622 (2006).
From this analysis, the committee must conclude that there are insurmountable limits on vision and memory imposed by our biological nature and the properties of the world we inhabit. With this knowledge, it is possible to more fully appreciate the value and risks associated with eyewitness reports and accordingly advise those who collect, handle, defend, consider, and adjudicate such reports.

5 Applied Eyewitness Identification Research

The committee was tasked with (1) critically assessing the existing body of scientific research on eyewitness identification; (2) identifying gaps in the literature; and (3) suggesting other research that would further the understanding of eyewitness identification and improve law enforcement and courtroom practice. Eyewitness identification research resides in both the scientific literature and the law and justice-related scholarly literature. Although experiential, anecdotal, and some administrative records from law enforcement and the judiciary could contribute to a better understanding of eyewitness identification, the committee did not comprehensively review this more qualitative material. The committee did, however, examine select examples of law enforcement policies and influential judicial rulings.

In late 2013, the committee compiled an extensive and comprehensive bibliography from the following nine electronic databases, with the search limited to publications over the past two decades (i.e., since 1993): Academic Search Premier (EBSCO), Embase (Elsevier), MEDLINE (National Library of Medicine), NCJRS Abstracts Database (U.S.
Department of Justice), PsycINFO (American Psychological Association), PubMed (National Institutes of Health), Scopus (Elsevier), Web of Science (Thomson Reuters), and LexisNexis.1 Papers were drawn from such fields as social science, cognitive science, behavioral science, neuroscience, criminology, and law using Boolean-logic-based search strategies designed to identify empirical research reports, review articles, systematic reviews, meta-analyses, and articles in law reviews and legal journals. The committee concentrated its review on the subset of the bibliography deemed most important to its task, focusing more on the scientific literature than on the law review literature. These materials included meta-analyses and systematic reviews and primary research in neuroscience, statistics, and eyewitness identification.

1 The law review literature was represented by the citations from the LexisNexis search. While all these materials were not reviewed in detail, several of the documents informed Chapter 3 of this report (The Legal Framework for Assessment of Eyewitness Identification).

This report also was informed by several early foundational papers and written comments from, and presentations to, the committee by representatives from science, law enforcement, state courts and government, private organizations, and other interested parties. The comments and presentations revealed additional highly relevant new findings, some recently published or in press and others in submission. The agenda for each committee meeting is available in Appendix B. All materials submitted to the committee are retained in the Academies’ public access file and are available upon request.

COMMITTEE ASSESSMENT

Many factors affect eyewitness accuracy.
Some factors are related to protocols within the law enforcement and legal systems, while others are related to characteristics associated with the crime scene, perpetrator, and witness. System variables are those that the criminal justice system can influence through the enforcement of standards and through education and training of law enforcement personnel in the use of best practices2 and procedures (e.g., by specifying the content and nature of instructions given to witnesses prior to a lineup identification). Estimator variables include factors operating either at the time of the criminal event (relating to visual experience or memory encoding) or during the retention interval (the time between witnessing an event and the identification process). Specific examples include the eyewitness’ level of stress or trauma at the time of the incident, the light level and nature of the visual conditions that affect visibility and clarity of a perpetrator’s features, similarity of age and race of the witness and perpetrator, presence or absence of a weapon during the incident, and the physical distance separating the witness from the perpetrator.

2 As noted in Chapter 1, for the purposes of this report, the committee characterizes best practice as the adoption of standardized procedures based on scientific principles. The committee does not make any endorsement of practices designated as best practices by other bodies.

A scientific consensus about the effects of some factors has emerged, but no such consensus exists for many other factors. One method of assessing scientific consensus is by surveys of experts. A 2001 survey collected
responses from 64 psychologists about their courtroom experiences and their opinions on 30 eyewitness-related phenomena to determine the “general acceptance” of these phenomena within the eyewitness identification research community.3 General acceptance is relevant to whether scientific testimony is admissible as evidence in court (see Chapter 3). The survey revealed substantial agreement about which findings these experts felt were sufficiently reliable to present in court.4

The committee examined the scientific literature on eyewitness identification, focusing first on quantitative syntheses, largely systematic reviews and meta-analyses, which were identified in a comprehensive search of electronic databases designed to locate research on both estimator and system variables. In addition, primary research studies were identified in this database search, many of which were also highlighted in the relevant systematic reviews and meta-analyses. Finally, some researchers forwarded manuscripts to the committee that have been submitted for peer review or are in press. In its examination of this body of literature, the committee assessed the quality of the identified research and, where possible, worked to derive summary empirical generalizations related to variables of interest.

Quantitative Syntheses of Eyewitness Identification Research

The committee first evaluated the consistency of research findings across studies for system and estimator variables by studying published quantitative reviews of empirical research. Systematic reviews, which collect and appraise available research on specific hypotheses or research questions, are efforts to synthesize the effects of variables across studies.
3 S. M. Kassin et al., “On the ‘General Acceptance’ of Eyewitness Testimony Research: A New Survey of the Experts,” American Psychologist 56(5): 405–416 (2001).
4 Kassin et al. also compared the reliability assessments of the 2001 survey to assessments from a similar 1989 survey and noted that, for the 17 propositions retested, there was a remarkable degree of consistency: “most experts saw as sufficiently reliable expert testimony on the wording of questions,” lineup instructions, attitudes and expectations, the accuracy-confidence correlation, the forgetting curve, exposure time, and unconscious transference. “There was less, if any, consensus on the effects of color perception in monochromatic light,” “observer training, high levels of stress, the accuracy of hypnotically refreshed testimony, and event violence.” The authors observed that two phenomena were seen as significantly more reliable than had been the case when the initial survey was conducted: weapon focus effect and hypnotic suggestibility effects. See p. 410.

Within systematic reviews, meta-analysis is often, but not always, used to compute the effects of variables as well as to identify factors that explain differences across studies. When assumptions about consistency of data collected across studies are met, meta-analysis provides a quantitative summary of empirical findings by statistically averaging effect sizes across individual studies, thereby increasing the precision of the effect size estimate as well as the statistical power to detect effects.
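As a rough illustration of how averaging across studies sharpens an estimate, the sketch below pools effect sizes using inverse-variance weights. This is one common fixed-effect approach; the report does not say which weighting scheme the reviewed meta-analyses used, so the method is an assumption here. The function name is hypothetical, and the three effect sizes and variances are invented purely for the example.

```python
import math


def pool_fixed_effect(effects: list[float], variances: list[float]) -> tuple[float, float]:
    """Return the inverse-variance weighted mean effect and its standard error."""
    # Each study is weighted by the reciprocal of its sampling variance,
    # so more precise studies contribute more to the pooled estimate.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Invented effect sizes and sampling variances for three hypothetical studies.
effects = [0.30, 0.50, 0.40]
variances = [0.04, 0.09, 0.02]
pooled, se = pool_fixed_effect(effects, variances)
print(round(pooled, 3), round(se, 3))
```

In this sketch the pooled standard error is smaller than the standard error of any single study, which is the gain in precision the paragraph above refers to. It holds only if the studies estimate a common effect, which is the consistency assumption the text mentions.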
Done well, systematic reviews with or without meta-analysis provide evidence for practice and policy for such fields as health care,5 crime and justice, social welfare, and education.6 The utility of systematic reviews for informing practice and policy is predicated on the included studies being transparently reported, conducted so as to minimize risk of bias, and representing as complete a sample as possible of research conducted on the central question, including both published and unpublished studies. In turn, systematic reviews should specify inclusion criteria and data extraction procedures a priori, use independent and duplicate procedures for study selection and data extraction, rigorously evaluate potential biases in included studies, and interpret results of meta-analyses in terms that are useful to decision-makers. Further, meta-analyses should not be conducted outside the context of systematic reviews. In short, both systematic reviews and the studies they include need to be transparent and reproducible in order to best inform practice and policy decisions about eyewitness identification.

The committee examined quantitative reviews that covered decades of research on both estimator variables (exposure duration,7 retention interval,8 stress,9 weapon focus,10 own-race bias,11 and own-age bias12) and system variables (identification test medium, i.e., live lineup versus photo array,13 biased and unbiased lineup instructions,14 post-identification feedback,15 simultaneous versus sequential lineup presentation,16 target absent versus target present lineups,17 foil similarity,18 blinding,19 showup versus lineup,20 prior mug shot exposure,21 verbal description and identification,22 and the cognitive interview23). Many of these quantitative reviews were published recently, with more than one-third published since 2010. However, none of the reviews met all current standards for conducting and reporting systematic reviews,24 and few met even a majority of these standards, making assessment of the credibility of their findings problematic.

5 See the Cochrane Collaboration, available at: http://www.cochrane.org.
6 See the Campbell Collaboration, available at: http://www.campbellcollaboration.org.
7 B. H. Bornstein et al., “Effects of Exposure Time and Cognitive Operations on Facial Identification Accuracy: A Meta-Analysis of Two Variables Associated with Initial Memory Strength,” Psychology, Crime and Law 18(5): 473–490 (2012).
8 K. A. Deffenbacher et al., “Forgetting the Once-Seen Face: Estimating the Strength of an Eyewitness’s Memory Representation,” Journal of Experimental Psychology: Applied 14(2): 139–150 (2008).
9 K. A. Deffenbacher et al., “A Meta-Analytic Review of the Effects of High Stress on Eyewitness Memory,” Law and Human Behavior 28(6): 687–706 (2004).
10 J. M. Fawcett et al., “Of Guns and Geese: A Meta-Analytic Review of the ‘Weapon Focus’ Literature,” Psychology, Crime and Law 19(1): 35–66 (2013).
11 C. A. Meissner and J. C. Brigham, “Thirty Years of Investigating the Own-Race Bias in Memory for Faces—A Meta-Analytic Review,” Psychology, Public Policy, and Law 7(1): 3–35 (2001).
12 M. G. Rhodes and J. S. Anastasi, “The Own-Age Bias in Face Recognition: A Meta-Analytic and Theoretical Review,” Psychological Bulletin 138(1): 146–174 (2012).
13 B. L. Cutler et al., “Conceptual, Practical, and Empirical Issues Associated with Eyewitness Identification Test Media,” in Adult Eyewitness Testimony: Current Trends and Developments, ed. D. F. Ross (New York: Press Syndicate of the University of Cambridge, 1994), 163–181.
14 S. E. Clark, “A Re-Examination of the Effects of Biased Lineup Instructions in Eyewitness Identification,” Law and Human Behavior 29(4): 395–424 (2005). S. E. Clark, “Costs and Benefits of Eyewitness Identification Reform: Psychological Science and Public Policy,” Perspectives on Psychological Science 7(3): 238–259 (2012). N. K. Steblay, “Social Influence in Eyewitness Recall: A Meta-Analytic Review of Lineup Instruction Effects,” Law and Human Behavior 21(3): 283–297 (1997). N. K. Steblay, G. L. Wells, and A. B. Douglass, “The Eyewitness Post Identification Feedback Effect 15 Years Later: Theoretical and Policy Implications,” Psychology, Public Policy, and Law 20(1): 1–18 (2014).
15 S. E. Clark and R. D. Godfrey, “Eyewitness Identification Evidence and Innocence Risk,” Psychonomic Bulletin and Review 16(1): 22–42 (2009). A. B. Douglass and N. K. Steblay, “Memory Distortion in Eyewitnesses: A Meta-Analysis of the Post-Identification Feedback Effect,” Applied Cognitive Psychology 20(7): 859–869 (2006).
16 Clark, “Costs and Benefits of Eyewitness Identification Reform.” S. E. Clark, R. T. Howell, and S. L. Davey, “Regularities in Eyewitness Identification,” Law and Human Behavior 32(3): 187–218 (2008). N. K. Steblay et al., “Eyewitness Accuracy Rates in Sequential and Simultaneous Lineup Presentations: A Meta-Analytic Comparison,” Law and Human Behavior 25(5): 459–473 (2001). N. K. Steblay et al., “Seventy-two Tests of the Sequential Lineup Superiority Effect: A Meta-Analysis and Policy Discussion,” Psychology, Public Policy, and Law 17(1): 99–139 (2011).
17 Clark, “A Re-Examination of the Effects of Biased Lineup Instructions in Eyewitness Identification.” Clark, Howell, and Davey, “Regularities in Eyewitness Identification.” Clark and Godfrey, “Eyewitness Identification Evidence and Innocence Risk.”
18 Clark, “Costs and Benefits of Eyewitness Identification Reform.” Clark and Godfrey, “Eyewitness Identification Evidence and Innocence Risk.” Clark, Howell, and Davey, “Regularities in Eyewitness Identification.” R. J. Fitzgerald et al., “The Effect of Suspect-Filler Similarity on Eyewitness Identification Decisions: A Meta-Analysis,” Psychology, Public Policy, and Law 19(2): 151–164 (2013). S. L. Sporer et al., “Choosing, Confidence, and Accuracy: A Meta-Analysis of the Confidence-Accuracy Relation in Eyewitness Identification Studies,” Psychological Bulletin 118(3): 315–327 (1995).
19 Clark, “Costs and Benefits of Eyewitness Identification Reform.”
20 Clark, “Costs and Benefits of Eyewitness Identification Reform.” N. K. Steblay et al., “Eyewitness Accuracy Rates in Police Showup and Lineup Presentations: A Meta-Analytic Comparison,” Law and Human Behavior 27(5): 523–540 (2003).
21 K. A. Deffenbacher et al., “Mugshot Exposure Effects: Retroactive Interference, Mugshot Commitment, Source Confusion, and Unconscious Transference,” Law and Human Behavior 30(3): 287–307 (2006).
22 C. A. Meissner, S. L. Sporer, and K. J. Susa, “A Theoretical Review and Meta-Analysis of the Description-Identification Relationship in Memory for Faces,” European Journal of Cognitive Psychology 20(3): 414–455 (2008).
23 A. Memon et al., “The Cognitive Interview: A Meta-Analytic Review and Study Space Analysis of the Past 25 Years,” Psychology, Public Policy, and Law 16(4): 340–372 (2010).

After examining the reviews, the committee concluded that the findings may be subject to unintended biases and that the conclusions are less credible than was hoped. In many cases, the data from the studies cited were not readily available or were not clearly presented.
Nevertheless, these reviews were helpful in highlighting some of the issues associated with specific research questions and in identifying primary studies that might be both credible and important.

RESEARCH STUDIES ON SYSTEM VARIABLES

After its assessment of the systematic reviews and meta-analytic studies, the committee’s review focused on the most-studied system variables. Key system variables, such as lineup procedures (e.g., simultaneous vs. sequential lineups, blinded vs. non-blinded lineup administration) and the collection/use of witness confidence statements, can have a marked influence over the validity of eyewitness identifications. In the following section, one of the most important practical issues raised by this influence is addressed: What is the best way to evaluate the effects of system variables on the diagnostic accuracy of eyewitness reports, and how might we use the results of such an evaluation to optimize the states of key system variables and thus maximize performance of an eyewitness? This question is, in principle, relevant to all system variables, but we address it first in the timely and controversial context of simultaneous versus sequential lineup presentations and in the role of eyewitness confidence judgments in evaluation of identification performance. This examination of lineup procedures and confidence reports is followed by a brief discussion of the effects on eyewitness performance of another important system variable: the extent and content of communications between the witness and the larger community (law enforcement, legal defense, the press, family and friends, etc.).

Evaluating Eyewitness Performance

Perhaps the most important empirical question that can be asked about eyewitness identification is: How well do witnesses perform as a function of different system and estimator variables?
For example, do factors such as the structure of a lineup, stress, or weapon focus affect the ability of a witness to provide reliable information? If so, what practices will yield the best performance? The issues are multifaceted, and the answers likely depend upon many factors. Given the complexity of these issues, the experimental literature to date has focused largely on one of the more tractable problems: How do different lineup identification procedures affect witness identifications? The committee will use this focus (and its eminent practical relevance) to illustrate how one might go about evaluating eyewitness performance generally.

Most lineup identification procedures take one of two forms: simultaneous or sequential. In a simultaneous procedure, the witness views all individuals in the lineup at the same time and either identifies one (or more) as the perpetrator or reports that the person she or he saw at the crime scene was not in the lineup. In a sequential procedure, the witness views individuals one at a time and reports whether or not each one is the person from the crime scene. Rigorous evaluation of eyewitness identification performance as a function of these two procedures requires a formal understanding of the task that the witness confronts, and it requires criteria for assessing the outcome.

24 See, e.g., Institute of Medicine, Finding What Works in Health Care: Standards for Systematic Reviews (Washington, DC: The National Academies Press, 2011) and B. J. Shea et al., “Development of AMSTAR: A Measurement Tool to Assess the Methodological Quality of Systematic Reviews,” BMC Medical Research Methodology 7:10 (2007), doi:10.1186/1471-2288-7-10.
The task of a witness viewing a lineup is an example of what is known as a binary classification problem.25 Each eyewitness faces two possible (binary) states associated with each person in the lineup (guilt or innocence), and the witness must assign each person to one of two classes (guilty or innocent). For each decision, the witness can be correct or incorrect, yielding four possible outcomes: a correct classification as guilty (“hit”), an incorrect classification as guilty (“false alarm”), a correct classification as innocent (“correct rejection”), and an incorrect classification as innocent (“miss”). These outcomes are commonly presented in a contingency table26 (see Figure 5-1), and the frequencies in each part of that table are the raw data used to evaluate performance on a binary classification task, such as eyewitness identification.27

25 The binary classifier in this context is defined as the witness operating under a specific set of conditions, such as lineup procedures.
26 Also termed “confusion matrix.”
27 The prevalence or “base rate” (the fraction of individuals in each category in the population; guilty or innocent, in the eyewitness problem) is also a factor that may come into play when evaluating binary classification performance.
28 See, e.g., T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (New York: Springer, 2009) and A. Smola and S. V. N. Vishwanathan, Introduction to Machine Learning (Cambridge: Cambridge University Press, 2008).

There are many different performance measures that can be derived from data of this sort; indeed, the fields of statistical classification and machine learning are replete with tools for the evaluation of binary classifiers.28
The preferred measure will depend to a large degree upon the criteria one adopts for performance evaluation. Perhaps the simplest measure of binary classification performance is the ratio of the hit rate (HR) to the false alarm rate (FAR), i.e., HR/FAR.29 The magnitude of this measure, which is known in the eyewitness identification literature as the “diagnosticity ratio,” is proportional to the likelihood that a classification is correct, i.e., that the person identified as guilty is actually guilty.30 The diagnosticity ratio is appealing if the most critical criterion is avoiding erroneous identifications.

29 The “rate” associated with each cell of the contingency table is computed as the number of counts within that cell (e.g., number of people correctly classified as guilty) divided by the number of instances that are truly in that class (e.g., total number of guilty people being classified). Thus, hit rate (HR) = number of hits / (number of hits + number of misses), and false alarm rate (FAR) = number of false alarms / (number of false alarms + number of correct rejections).
30 The “diagnosticity ratio” is also known in other disciplines by other names; e.g., “positive likelihood ratio” or “LR+ = Likelihood Ratio of a Positive Call;” see Peter Lee, Bayesian Statistics: An Introduction (Chichester: Wiley, 2012), Sec 4.1.

FIGURE 5-1 Contingency table for possible eyewitness identification outcomes.
SOURCE: Courtesy of Thomas D. Albright.

Witness Classification of Lineup Participant | True Status of Lineup Participant: guilty | True Status of Lineup Participant: innocent
guilty | “Hit” (true positive) | “False Alarm” (false positive)
innocent | “Miss” (false negative) | “Correct Rejection” (true negative)
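The rate definitions in footnote 29 and the diagnosticity ratio can be worked through in a few lines. The sketch below is only an illustration: the class name `ContingencyTable` and the counts are hypothetical, not data from any study cited in this report.

```python
from dataclasses import dataclass


@dataclass
class ContingencyTable:
    """Counts for the four outcomes of a binary classification task."""
    hits: int                # guilty, classified as guilty
    misses: int              # guilty, classified as innocent
    false_alarms: int        # innocent, classified as guilty
    correct_rejections: int  # innocent, classified as innocent

    def hit_rate(self) -> float:
        # HR = hits / (hits + misses)
        return self.hits / (self.hits + self.misses)

    def false_alarm_rate(self) -> float:
        # FAR = false alarms / (false alarms + correct rejections)
        return self.false_alarms / (self.false_alarms + self.correct_rejections)

    def diagnosticity_ratio(self) -> float:
        # HR / FAR; undefined (ZeroDivisionError) when there are no false alarms.
        return self.hit_rate() / self.false_alarm_rate()


# Invented counts for illustration only.
table = ContingencyTable(hits=60, misses=40, false_alarms=15, correct_rejections=85)
print(table.hit_rate(), table.false_alarm_rate(), table.diagnosticity_ratio())
```

With these invented counts the hit rate is 0.6 and the false alarm rate is 0.15, giving a diagnosticity ratio of about 4. Note that the ratio alone does not reveal whether it arose from a high hit rate or a low false alarm rate.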
Not surprisingly, the diagnosticity ratio was adopted in pioneering efforts to identify lineup conditions that would yield better witness identification performance.31 Most laboratory-based studies and meta-analyses of the effects of lineup procedures on eyewitness identification performance show that, with standard lineup instructions informing the witness that the perpetrator may or may not be present, the sequential procedure produces a higher diagnosticity ratio.32 That is, when considering only those cases in which a witness actually selects someone from a lineup, the ratio of correct to false identifications is commonly higher with the sequential than with the simultaneous procedure.33

A higher diagnosticity ratio could result from a higher hit rate, a lower false alarm rate, or some combination of the two. Some early reports suggested that sequential procedures (relative to simultaneous) lead to fewer false alarms without changing the frequency of hits, which would result in a higher diagnosticity ratio.34 More recent laboratory-based studies and meta-analyses typically show that sequential procedures (relative to simultaneous) are associated with a somewhat reduced hit rate accompanied by a larger reduction in the false alarm rate, thereby resulting in diagnosticity ratios higher than those yielded by simultaneous procedures.35 In other words, when using a single diagnosticity ratio as a measure of eyewitness performance, the sequential procedure (relative to simultaneous) comes closer to satisfying the popular criterion that those identified as guilty are actually guilty. In light of these findings, many policy makers have advocated sequential procedures, and those procedures have been adopted by law enforcement in many jurisdictions.

While policy decisions and practice have been influenced by the aforementioned studies, there are other criteria worthy of consideration when evaluating eyewitness performance. One alternative is revealed by asking why the diagnosticity ratio changes across lineup conditions. This question can be addressed given a plausible model of the mechanisms underlying human recognition memory. Most models of recognition memory are based on the idea that a cue (e.g., a face in a lineup) results in the retrieval of information stored in memory (see Chapter 4). When the retrieved information provides enough evidence to satisfy the observer, they make an identification; that is, they decide that the stimulus is “recognized.” Explicit in this model are two important parameters: the observer’s memory sensitivity (that is, the “discriminability” between the strength of memory evidence elicited by a previously encountered stimulus and that elicited by novel stimuli), and the degree of evidence that the observer requires to make an identification (“response criterion” or “bias”) (see Box 5-1).

The first of these two parameters, discriminability, is important for evaluating eyewitness performance. It tells whether a difference in performance under different task conditions reflects a true improvement in memory-based discrimination, i.e., an improvement in the strength of the observer’s retrieved memory evidence of the perpetrator.

31 R. C. L. Lindsay and G. L. Wells, “Improving Eyewitness Identifications from Lineups: Simultaneous Versus Sequential Lineup Presentation,” Journal of Applied Psychology 70(3): 556–564 (1985).
32 Steblay et al., “Eyewitness Accuracy Rates in Sequential and Simultaneous Lineup Presentations.” Steblay et al., “Seventy-two Tests of the Sequential Lineup Superiority Effect.” S. D. Gronlund et al., “Robustness of the Sequential Lineup Advantage,” Journal of Experimental Psychology: Applied 15(2): 140–152 (2009). S. D. Gronlund, J. T. Wixted, and L. Mickes, “Evaluating Eyewitness Identification Procedures Using ROC Analysis,” Current Directions in Psychological Science 23(1): 3–10 (2014).
33 But see C. A. Carlson, S. D. Gronlund, and S. E. Clark, “Lineup Composition, Suspect Position, and the Sequential Lineup Advantage,” Journal of Experimental Psychology: Applied 14(2): 118–128 (2008), for a counterexample. Also, Clark, Moreland, and Gronlund have demonstrated that the accuracy advantage of sequential lineups as measured by diagnosticity ratios has decreased over time since the original report. Reanalysis of diagnosticity data for sequential studies showed slight, non-significant decreases in correct identification effects and increases in false identification effects, which together combine to produce a significant decrease in the advantage of sequential over simultaneous lineup methods. See S. E. Clark, M. B. Moreland, and S. D. Gronlund, “Evolution of the Empirical and Theoretical Foundations of Eyewitness Identification Reform,” Psychonomic Bulletin and Review 21(2): 251–267 (2014).
34 R. C. L. Lindsay, “Applying Applied Research: Selling the Sequential Lineup,” Applied Cognitive Psychology 13(3): 219–225 (1999). G. L. Wells, S. M. Rydell, and E. P. Seelau, “The Selection of Distractors for Eyewitness Lineups,” Journal of Applied Psychology 78(5): 835–844 (1993).
35 A recent field-based study comparing sequential to simultaneous procedures in a limited number of jurisdictions computed the diagnosticity ratio using filler identifications as the false alarm rate (because the innocence or guilt of the suspect is unknown in such situations). See G. L. Wells, N. K. Steblay, and J. E. Dysart, “Double-Blind Photo-Lineups Using Actual Eyewitnesses: An Experimental Test of a Sequential versus Simultaneous Lineup Procedure,” Law and Human Behavior, 15 June 2014, doi: 10.1037/lhb0000096. When computed in this manner, the data revealed a modest diagnosticity ratio advantage for the sequential procedure. However, Amendola and Wixted re-analyzed a subset of the data for which proxy measures of ground truth were available [K. Amendola and J. T. Wixted, “Comparing the Diagnostic Accuracy of Suspect Identifications Made by Actual Eyewitnesses from Simultaneous and Sequential Lineups,” accepted by Journal of Experimental Criminology (2014)]. Their analyses suggested that identification of innocent suspects is less likely and identification of guilty suspects is more likely when using the simultaneous procedures. While future field studies are needed, these latter findings raise the possibility that diagnosticity is higher for the simultaneous procedure. See also Clark, Moreland, and Gronlund, who report that published diagnosticity ratios have changed over time, reflecting a significant decrease in the advantage of sequential over simultaneous lineup procedures. (Clark, Moreland, and Gronlund, “Evolution of the Empirical and Theoretical Foundations of Eyewitness Identification Reform.”)

The fact that these two measures (the likelihood that an identified person is guilty vs. discriminability) do not assess the same thing is counterintuitive, a fact that has generated controversy in the field of eyewitness
APPLIED EYEWITNESS IDENTIFICATION RESEARCH 81

BOX 5-1
The Influences of Discriminability and Response Bias on Human Binary Classification Decisions

All human decisions about the classification of objects based on memory—including a witness’ classifications of guilt or innocence for faces in a lineup, an individual’s decision as to whether a piece of luggage is his or her own, a botanist’s recognition of a specific type of fern, a radiologist’s detection of a tumor in a mammogram, or the determination of the sex of a newly-hatched chicken—can be distilled down to the influence of two factors that are rooted in causal models of recognition memory: the degree to which the relevant objects are discriminable by the decider (the decider’s sensitivity to the difference between them), and the decider’s criterion for making a decision (response bias, or the decider’s degree of specificity in making choices).a There are, of course, many other variables that will affect the outcome (e.g., levels of stress, attentional focus, potential rewards or expectations), but all of these are believed to exert their influence over memory-based classification decisions by affecting discriminability and/or response bias.

To illustrate the distinction between discrimination and response bias as applied to a real-world decision problem, consider how an audiologist conducts a hearing test. In a hearing test, an individual might be asked to detect sounds along a continuum of loudness and to indicate when a sound is present. The audiologist wants to know how well someone can discriminate presence versus absence of a sound, but that assessment is complicated by the criterion people use when deciding to say that they heard a sound (response bias). Some people are hesitant to respond positively, saying “I hear it” only when they are absolutely certain (“conservative” responders).
Others are more willing to respond positively, saying “I hear it” with less information and greater uncertainty (“liberal” responders). Those with a conservative bias are less likely to report hearing a sound in general, so they will have both fewer correct detections (“hits”) and fewer overt mistakes (“false alarms”). By contrast, those with a liberal bias are more likely to say that they heard a sound, so they will have more hits but also more false alarms. Importantly, this can occur even if the conservative and liberal responders do not differ in their ability to discriminate the presence or absence of sound.

a See, e.g., W. P. Banks, “Signal Detection Theory and Human Memory,” Psychological Bulletin 74(2): 81–99 (1970); J. P. Egan, Recognition Memory and the Operating Characteristic (Bloomington: Indiana University Hearing and Communication Laboratory, 1958); D. M. Green and J. A. Swets, Signal Detection Theory and Psychophysics (New York: Wiley, 1966).

identification research.36 Intuitively, if sequential lineups yield a higher likelihood that an identified person is guilty (as quantified by a higher diagnosticity ratio), then it seems as if that procedure yields objectively better performance. The problem with this intuition is that it fails to take into account the second of the two parameters of recognition memory models—the response bias or degree of evidence that the observer finds acceptable to make an identification. This parameter, which is distinct from discriminability, reflects the witness’ tendency to pick or not to pick someone from the lineup.
If a witness sets a high bar for acceptable evidence—a conservative bias—then he or she will be unlikely to select anyone from the lineup (low pick frequency), meaning that he or she will have more misses (will be more likely to fail to select the suspect because he or she is less likely to make a selection at all) and fewer false alarms. Conversely, if a witness sets a low bar for acceptable evidence—a liberal bias—then she or he will be more likely to make a selection from the lineup (a high pick frequency), meaning he or she will have more hits and will make more false identifications. Differences in pick frequency can, and generally do, lead to differences in the ratio of hit rates to false alarm rates; all else being equal, the diagnosticity ratio will be higher for a conservative bias than for a liberal bias.37 In other words, simply by inducing a witness to adopt a more conservative bias, it is possible to increase the likelihood that an identified person is actually guilty. Importantly, this may be true even if the procedure yields no better, or potentially worse, discriminability.38

Despite its merits, a single diagnosticity ratio thus conflates the influences of discriminability and response bias on binary classification, which muddies the determination of which procedure, if any, yields objectively better discriminability in eyewitness performance. To overcome this problem, some investigators have recently adopted a technique from signal detection

36 See, e.g., J. T. Wixted and L. Mickes, “The Field of Eyewitness Memory Should Abandon Probative Value and Embrace Receiver Operating Characteristic Analysis,” Perspectives on Psychological Science 7(3): 275–278 (2012); Clark, “Costs and Benefits of Eyewitness Identification Reform”; G. L. Wells, “Eyewitness Identification Probative Value, Criterion Shifts, and Policy Regarding the Sequential Lineup,” Current Directions in Psychological Science 23(1): 11–16 (2014); and Steblay, et al.
“Seventy-two Tests of the Sequential Lineup Superiority Effect.”
37 The sole exception to this rule is the case in which classifications are made at chance level of performance, i.e., when the observer exhibits no ability to discriminate.
38 L. Mickes, H. D. Flowe, and J. T. Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory: Comparing the Diagnostic Accuracy of Simultaneous vs. Sequential Lineups,” Journal of Experimental Psychology: Applied 18(4): 361–376 (2012). C. A. Meissner et al., “Eyewitness Decisions In Simultaneous and Sequential Lineups: A Dual Process Signal Detection Theory Analysis,” Memory and Cognition 33(5): 783–792 (2005). M. A. Palmer and N. Brewer, “Sequential Lineup Presentation Promotes Less-Biased Criterion Setting but Does Not Improve Discriminability,” Law and Human Behavior 36(3): 247–255 (2012).

theory, which distinguishes the relative influences of discriminability and bias on binary classification.39 This technique involves analysis of Receiver Operating Characteristics (see Box 5-2). ROC analysis has been used extensively in multiple contexts of human decision-making, notably in basic research on visual perception and memory and applied studies of medical diagnostic procedures.40 In essence, ROC analysis examines diagnosticity ratios integrated over different response biases. This approach to eyewitness research has been promoted based on the claim that it can enable lineup procedures to be evaluated by their effect on discrimination, separate from response bias, and—importantly—because the dimensions of analysis (discriminability and response bias) correspond to the mechanistic parameters of causal models of human recognition memory.
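The dependence of a single diagnosticity ratio on response bias can be made concrete with a small simulation. The sketch below is illustrative only: it assumes an equal-variance Gaussian model in which memory evidence for an innocent suspect is distributed N(0, 1) and evidence for the culprit N(d′, 1), and in which the witness makes an identification whenever the evidence exceeds a criterion. The function names and parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def rates_at_criterion(d_prime, criterion):
    """Return (hit_rate, false_alarm_rate) for one decision criterion.

    Assumed model: innocent-suspect evidence ~ N(0, 1), culprit evidence
    ~ N(d_prime, 1); an identification is made if evidence > criterion.
    """
    hit_rate = 1.0 - STD_NORMAL.cdf(criterion - d_prime)
    false_alarm_rate = 1.0 - STD_NORMAL.cdf(criterion)
    return hit_rate, false_alarm_rate


def sweep(d_prime, criteria):
    """Tabulate (criterion, HR, FAR, HR/FAR) with discriminability held fixed."""
    rows = []
    for k in criteria:
        hr, far = rates_at_criterion(d_prime, k)
        rows.append((k, hr, far, hr / far))
    return rows


# Same discriminability throughout; only the criterion becomes stricter.
for k, hr, far, ratio in sweep(1.0, [0.0, 0.5, 1.0, 1.5, 2.0]):
    print(f"criterion={k:.1f}  HR={hr:.3f}  FAR={far:.3f}  HR/FAR={ratio:.2f}")
```

Under these assumptions the ratio of hits to false alarms rises as the criterion becomes stricter even though d′ never changes. With d′ set to 0, the ratio stays at 1 for every criterion, which corresponds to the chance-performance exception described in note 37.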
Use of ROC analysis to evaluate eyewitness performance requires cal- culating the diagnosticity ratio for different response bias conditions (see Box 5-2). Using expressed confidence level (ECL) as a proxy for response bias (see below), a small set of recent studies using ROC analysis has re- ported that discriminability (area under the ROC curve) for simultaneous lineups is as high, or higher, than that for sequential lineups.41 In other words, when eyewitness identification performance is evaluated based on a criterion of bias-free discriminability, the results differ from those based on a single diagnosticity ratio, and they do so because the latter fails to account for response bias. Looking broadly at the many empirical studies that have used a single diagnosticity ratio to evaluate eyewitness performance, as well as the more recent findings using ROC analysis, it appears that the practical advantage of one lineup procedure over another depends to a large degree upon the performance criterion that one adopts. From the perspective of many, the ideal lineup procedure would elicit a conservative bias (thus reducing false identifications) and high discriminability (that is, optimizing memory sensitivity). If there exists no discriminability advantage for one lineup 39 D. M. Green and J. A. Swets, Signal Detection Theory and Psychophysics (New York: Wiley, 1966); D. McNicol, A Primer of Signal Detection Theory (London: George Allen and Unwin, 1972). 40 J. A. Swets, “ROC Analysis Applied to the Evaluation of Medical Imaging Techniques,” Investigative Radiology 14(2): 109–121 (1979). 41 Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory." C. A. Carlson and M. A. Carlson, “An Evaluation of Lineup Presentation, Weapon Presence, and a Distinctive Feature Using ROC Analysis,” Journal of Applied Research in Memory and Cognition 3(2): 45–53 (2014). D. G. Dobolyi and C. S. 
Dodson, “Eyewitness Confidence in Simultaneous and Sequential Lineups: A Criterion Shift Account for Sequential Mistaken Identification Overconfidence,” Journal of Experimental Psychology: Applied 19 (4): 345–357 (2013). S. D. Gronlund et al., “Showups Versus Lineups: An Evaluation Using ROC Analysis,” Journal of Applied Research in Memory and Cognition 1(4): 221–228 (2012). Copyright © National Academy of Sciences. All rights reserved. Identifying the Culprit: Assessing Eyewitness Identification 84 IDENTIFYING THE CULPRIT BOX 5-2 Analysis of Receiver Operating Characteristics (ROCs) Binary classification decisions by human observers are affected by both discriminability (the observer’s sensitivity to the difference between target and non-targets) and response bias (the observer’s degree of specificity in making a response). Analysis of Receiver Operating Characteristics (ROCs) is a method from signal detection theory that enables one to distinguish the relative influences of discriminability and response bias on binary classification decisions. ROC analysis is performed by plotting the frequency of decisions that are hits (correctly detecting a target) versus the frequency of decisions that are false alarms (incor- rectly classifying a non-target as a target). The positive diagonal in an ROC plot (see figure next page) corresponds to response bias, moving from high specificity at the lower left corner [no detection of targets (hit rate = 0) and no incorrect attribution of non-targets as targets (false alarm rate = 0)], to low specificity at the upper right corner [all targets detected (hit rate = 1.0) and all non-targets attributed as targets (false alarm rate = 1.0)]. Because all points along this positive diagonal reflect equal ratios of hits to false alarms, they vary in response bias (i.e., the frequency of lineup picks, or “pick frequency”), but they do not manifest differences in discriminability. 
The negative diagonal in an ROC plot corresponds, by contrast, to discriminability, moving from chance discriminability at the intersection with the positive diagonal, where hits and false alarms are equally likely, to the highest discriminability in the upper left corner, where all targets are detected (hit rate = 1.0), but no non-targets are at- tributed as targets (false alarm rate = 0). To see how measured hit and false alarm rates vary over different conditions of discriminability and response bias in laboratory experiments, one can manipu- late or estimate these conditions and record a diagnosticity ratio (HR/FAR) for each condition. The typical result is a set of diagnosticity ratios that, when plotted in the ROC space (represented by the dots in the figure at right), form a curve spanning from lower left to upper right. The extent to which that curve deviates (bows above and away) from the positive diagonal is a quantitative measure of discriminability (assessed as the area under the curve) for which response bias has been factored out. ROC analysis has been used extensively in basic and applied research on recognition memory. In these experiments, response bias is sometimes ma- nipulated explicitly by encouraging observers to be more or less selective in Copyright © National Academy of Sciences. All rights reserved. Identifying the Culprit: Assessing Eyewitness Identification APPLIED EYEWITNESS IDENTIFICATION RESEARCH 85 their responses. Frequently, however, “expressed confidence level” (ECL)—the confidence that an observer holds in his or her classification—is used as a proxy for response bias, based on the assumption that more confident observers are likely to be more specific (conservative) in their responses, whereas less confident observers are likely to be less specific (liberal) in their responses. Receiver Operating Characteristic (ROC) curve. SOURCE: Courtesy of Thomas D. Albright. 
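The construction described in this box can be sketched in a few lines. The example below is a minimal illustration, not the procedure used in any of the cited studies: it assumes that identifications have been binned by expressed confidence, that each decision is treated as binary (suspect identified or not), and that the area is obtained by the trapezoidal rule up to the highest false alarm rate observed. All counts and names are hypothetical.

```python
def roc_points(hits_by_confidence, false_alarms_by_confidence,
               n_target_present, n_target_absent):
    """Cumulative (false_alarm_rate, hit_rate) points.

    The input lists hold counts per confidence bin, ordered from the most
    confident bin to the least confident. Accumulating down the list
    corresponds to moving from a conservative to a liberal criterion.
    """
    points = [(0.0, 0.0)]
    hits = 0
    false_alarms = 0
    for h, f in zip(hits_by_confidence, false_alarms_by_confidence):
        hits += h
        false_alarms += f
        points.append((false_alarms / n_target_absent,
                       hits / n_target_present))
    return points


def area_under_points(points):
    """Trapezoidal area under the points, up to the largest observed FAR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical counts: 100 target-present and 100 target-absent lineups,
# three confidence bins (high, medium, low).
pts = roc_points([30, 20, 10], [2, 6, 12], 100, 100)
area = area_under_points(pts)
```

Because the most liberal point in such data typically falls well short of a false alarm rate of 1.0, the area computed this way covers only part of the ROC space; a curve lying on the positive diagonal over the same range would serve as the chance baseline.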
[Figure: ROC plot of hit rate (vertical axis, 0 to 1.0) against false alarm rate (horizontal axis, 0 to 1.0).]

procedure over another,42 then eyewitness performance may benefit from any procedure (such as sequential) that elicits a more conservative response bias.43 But one can only make that judgment after having applied an empirical test to determine whether a procedure offers a discriminability advantage. Future research might explore the possibility that other methods of inducing a conservative response bias (such as verbal instructions to the witness to be cautious in making an identification) might be combined with procedures that improve discriminability in order to optimize eyewitness identification performance.

Perhaps the greatest practical benefit of recent debate over the utility of different lineup procedures is that it has opened the door to a broader consideration of methods for evaluating and enhancing eyewitness identification performance. ROC analysis is a positive and promising step with numerous advantages. For example, the area under the ROC curve is a single-number index of discriminability. Moreover, this index reflects a parameter-free approach to binary classification performance; the outcome is entirely data-dependent and thus identical across all users drawing from

42 The committee notes that some of the few recent reports using ROC analysis indeed claim improved discriminability for simultaneous lineup conditions, but the reported discriminability improvements are small.
43 In reality, a more conservative bias may not always be beneficial, and whether it is or not depends upon a number of factors that have an impact distinct from diagnostic accuracy and are difficult to quantify.
All else being equal, the “best” response bias will be one that maximizes the “expected value” of the outcome (Green and Swets, Signal Detection Theory and Psychophysics; Swets, “ROC Analysis Applied to the Evaluation of Medical Imaging Techniques”). For the problem of eyewitness identification, the response bias that maximizes expected value can be computed from the prevalence of guilty suspects in lineups and from societal values or costs associated with each of the possible eyewitness decisions (errors and correct assignments). Reliable data on prevalence are difficult to come by, and value/cost quantities are difficult to assign and likely to vary significantly across crimes and cultures. One can nonetheless gain an intuition for how these factors might define the best response bias conditions. Consider, for example, the consequences of decreasing the prevalence of guilty suspects in lineups. In this case, expected value can be maximized by inducing a conservative bias—i.e., if innocence is a priori likely, then there is value gained by being more selective in your response. Similarly, the optimal response bias will depend upon normative costs associ- ated with different types of eyewitness errors. Generally speaking, if a society places greater emphasis on not identifying the innocent, relative to failing to identify the guilty, then expected value can be increased by inducing a more conservative response bias. But the opposite would be true if there were greater societal pressures for identifying the guilty, relative to protect- ing the innocent. Although an understanding of the relationship between response bias and expected value is important, expected value in this case has little to do with the diagnostic accuracy of an eyewitness report. But it does nonetheless bear on decisions about which lineup procedure should be employed. Copyright © National Academy of Sciences. All rights reserved. 
Identifying the Culprit: Assessing Eyewitness Identification APPLIED EYEWITNESS IDENTIFICATION RESEARCH 87 the same data set.44 Most importantly for its application to the problem of evaluating eyewitness performance, the ROC approach possesses a dis- tinct advantage because the dimensions of analysis—discriminability and response bias—map directly onto the mechanistic parameters of causal models of human recognition memory (see Chapter 4). In other words, the approach affords insight into and quantification of the sensory and cogni- tive processes that are believed to underlie memory-based classification decisions (see Box 5-1), such as eyewitness identifications. Despite these merits, as a general statistical procedure for evaluation of binary classification performance and as a tool for evaluation of eyewit- ness performance, the ROC approach has some well-documented quantita- tive shortcomings. For example, ROC analysis depends on the ability to manipulate response bias or to estimate it from some other variable, and in the case of eyewitness identification that ability has been the subject of some debate. Recent studies have used expressed confidence level (ECL)— a measure of a witness’ confidence in his or her selection—as a proxy for response bias,45 based on the common-sense logic that a witness who has high confidence in their lineup selection should manifest a more conserva- tive response bias than a witness who selected someone from the lineup despite lacking confidence in that selection (i.e., someone who made a selection even though they were not certain—a liberal response bias). 
This proxy relationship is inherently noisy within individuals, and the noisy rela- tionship is exacerbated by the fact that the eyewitness identification ROC is population-based; individual data points are obtained from different people who may scale their confidence reports differently.46 On the other hand, it is empirically clear that, when scaled appropriately (within and across individuals), different levels of expressed confidence do, in fact, correspond to different pick frequencies and response biases.47 44 Green and Swets, Signal Detection Theory and Psychophysics. D. J. Hand, “Measuring Classifier Performance: A Coherent Alternative to the Area under the ROC Curve,” Machine Learning 77, 103–123 (2009). 45 See, e.g., N. Brewer and G. L. Wells, “The Confidence-Accuracy Relationship in Eyewit- ness Identification: Effects of Lineup Instructions, Foil Similarity, and Target-Absent Base Rates,” Journal of Experimental Psychology: Applied 12(1): 11–30 (2012); Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory”; and Carlson and Carlson, “An Evaluation of Lineup Presentation.” 46 ECL is affected by over-confidence and under-confidence at the individual level, and the current implementation of the ROC approach, combining results across subjects, does not build this measurement error into the analysis or the comparison of empirical ROC curves. See Appendix C. 47 See, e.g., Table 1 of Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Anal- ysis of Eyewitness Memory,” which summarizes confidence ratings, hit rates, false alarm rates, and diagnosticity ratios (HR/FAR) derived from data published in Brewer and Wells, “The Confidence-Accuracy Relationship in Eyewitness Identification.” Brewer and Wells employed Copyright © National Academy of Sciences. All rights reserved. 
Identifying the Culprit: Assessing Eyewitness Identification 88 IDENTIFYING THE CULPRIT An additional prerequisite for the use of ECL as a measure of re- sponse bias is that an orderly relationship exists between confidence and accuracy—that witnesses expressing greater confidence are more likely to be accurate in their identifications. Although this hypothesis conforms to intuition,48 the existence of a significant confidence–accuracy relationship has been challenged repeatedly over the years.49 Recent evidence, however, suggests ways of improving the confidence–accuracy relationship (and ob- taining more reliable measurements of it).50 While the ECL measure thus has potential, more research on this and other possible methods of estimat- ing or controlling response bias is warranted to support efforts to extract a bias-free measure of discriminability. Another technical concern raised by the use of ROC analysis to evalu- ate eyewitness identification performance is that it relies on a partial, rather than full, area under the ROC curve measure (see Box 5-2) as an index of discriminability that is separate from response bias. This is necessitated by the fact that the highest false alarm rates in eyewitness identification data are commonly well below 1.0, even under the most liberal response bias a “confidence calibration” technique to normalize scaling of expressed confidence across witnesses. Both hit rates and false alarm rates declined steeply—implying an increasingly con- servative response bias—as confidence levels increased. Diagnosticity ratios increased mono- tonically with increasing confidence. An identical pattern can be seen in Table 3 of Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory.” See also H. L. Roediger III, J. T. Wixted, and K. A. DeSoto, “The Curious Complexity Between Confidence and Accuracy in Reports from Memory,” in Memory and Law, ed. L. Nadel and W. 
Sinnott-Armstrong (Oxford: Oxford University Press, 2012), 97. 48 K. A. Deffenbacher and E. F. Loftus, “Do Jurors Share a Common Understanding Con- cerning Eyewitness Behavior?,” Law and Human Behavior 6: 15–30 (1982); and G. L. Wells, T. J. Ferguson, and R. C. L. Lindsay, “The Tractability of Eyewitness Confidence and Its Implication for Triers of Fact,” Journal of Applied Psychology 66: 688–696 (1981). 49 G. L. Wells and D. M. Murray, “Eyewitness Confidence,” in Eyewitness Testimony: Psy- chological Perspectives, ed. G. L. Wells and E. F. Loftus (New York: Cambridge University Press, 1984). B. L. Cutler and S. D. Penrod, Mistaken Identification: The Eyewitness, Psy- chology, and the Law (Cambridge: Cambridge University Press, 1995). R. K. Bothwell, K. A. Deffenbacher, and J.C. Brigham,“Correlation of Eyewitness Accuracy and Confidence: Opti- mality Hypothesis Revisited,” Journal of Applied Psychology 72:691–695 (1987). S. L. Sporer et al., “Choosing, Confidence, and Accuracy: A Meta-Analysis of the Confidence-Accuracy Relation in Eyewitness Identification Studies,” Psychological Bulletin 118(3): 315–327 (1995). T. A. Busey et al., “Accounts of the Confidence-Accuracy Relation in Recognition Memory,” Psychonomic Bulletin and Review 7(1): 26-48 (2000). 50 N. Brewer and G. L. Wells, “The Confidence-Accuracy Relationship in Eyewitness Identi- fication. P. Juslin, N. Olsson, and A. Winman, “Calibration and Diagnosticity of Confidence in Eyewitness Identification: Comments on What Can Be Inferred From the Low Confidence- Accuracy Correlation,” Journal of Experimental Psychology: Learning, Memory, and Cog- nition 22(5): 1304–1316 (September 1996). Roediger, Wixted, and DeSoto, “The Curious Complexity between Confidence and Accuracy.” Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory.” Copyright © National Academy of Sciences. All rights reserved. 
conditions.51 In practice, partial area under the curve is computed by truncating the ROC curve at the highest false alarm rate obtained. Because the standard error of the partial area under the curve measure depends upon the degree of truncation, accuracy of this discriminability measure can easily vary across conditions and across studies, making interpretation difficult.52

While ROC analysis has many recognized merits for the evaluation of binary classification, the residual concerns associated with its typical use for evaluating eyewitness performance merit consideration of other statistical approaches to this problem. As noted above, many methods have been proposed—and adopted in specific applications—for evaluation of binary classification performance.53 The committee knows of no instance in which any of these alternative methods has been applied to the problem of eyewitness identification. Moreover, because they have not been vetted, the committee is not in a position to endorse any specific statistical tool; the committee nevertheless encourages a general exploration of these alternatives. These alternatives may have their own share of unforeseen problems, and/or the performance criteria employed by them may bear no meaningful relationship to the sensory and cognitive processes involved in eyewitness identification. Nonetheless, some of these methods may provide greater insight into the factors that affect eyewitness identification performance and may, in turn, suggest ways of improving performance. To illustrate this opportunity by example, we consider the following possibilities.
It has been argued that a basic weakness of the existing ROC approach to binary classification performance results from the fact that, in principle 51 Carlson and Carlson, “An Evaluation of Lineup Presentation.” Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory.” 52 Along the same lines, accuracy of discriminability measures derived from ROC studies may be called into question when those studies do not take into account uncertainty in the data used to construct the ROC curves; see Appendix C. An argument has also been made that the area under the ROC curve can be a flawed metric for comparing binary classification conditions when the costs of classification errors are not precisely known and are different for different conditions (Hand, “Measuring Classifier Performance”). The costs of classification errors may be similar across some lineup comparisons and across some conditions of other systems variables, and for others they may be different. But for the most part they are not precisely known, and this is thus a topic that deserves greater attention given the growing use of ROC-based evaluation of eyewitness identification performance. 53 Numerous methods for the evaluation of binary classifiers have been developed and applied in the field of machine learning, which seeks to optimize autonomous classification devices (such as, for example, the fingerprint lock access control on a smart phone, which must quickly and reliably distinguish the finger from another). This field has a long and rich history, and candidate methods are summarized in several texts on statistical classification and machine learning, such as Hastie, Tibshirani, and Friedman, The Elements of Statistical Learn- ing and A. Smola and S. V. N. Vishwanathan, Introduction to Machine Learning (Cambridge: Cambridge University Press, 2009). Copyright © National Academy of Sciences. All rights reserved. 
Identifying the Culprit: Assessing Eyewitness Identification 90 IDENTIFYING THE CULPRIT (and in practice under certain commonly unrecognized conditions), the area under the ROC curve is dependent on imprecise assumptions about the costs of classification errors across different classification conditions.54 One might suppose, for example, that the cost of a miss for a crime of murder is greater than the cost of a miss for a stolen car. But without a precise understanding of these relative decision costs, the area under the ROC curve measure can be incoherent, in that it depends as much on the classification conditions as it does on the sensitivity of the classifier. An al- ternative method has been proposed to address this problem—derivation of the “H measure”—that enables the performance of binary classifiers to be compared using a common metric that is independent of the cost distribu- tions for different types of classification errors.55 The committee supports exploration of this alternative. Another avenue for exploration emerges from the fact that the litera- ture evaluating eyewitness identification performance has focused exclu- sively on the positive predictive value (PPV) of a witness’ classification as guilty. For a given response bias, PPV is related to the diagnosticity ratio, in that, given equal prevalence of the culprit in two conditions (e.g., lineup procedures) being compared, a higher diagnosticity ratio leads to a higher PPV. As discussed above, the diagnosticity ratio is a critical piece of information in efforts to evaluate eyewitness performance. As for any binary classification, however, there is also information associated with a negative response, which is the predictive value of a classifier’s assertion that a target is not present (in the eyewitness case, the witness’ assertion of innocence). 
This negative predictive value (NPV) is related to a different ratio of decisions, namely (1-FAR)/(1-HR),56 in that, given equal prevalence of the target in the two procedures being compared, higher values of this ratio correspond to higher values of NPV. While NPV is commonly used to evaluate the accuracy of human classification decisions, such as in medical diagnosis, and is a source of information that may similarly be of additional value in efforts to evaluate lineup procedures, it has been largely neglected in the field of eyewitness identification.57 One might hold the intuition that PPV and NPV are monotonically related to one another—believing that the likelihood that the

54 See Hand, “Measuring Classifier Performance.”
55 Ibid.
56 The reciprocal of this ratio is called the “negative likelihood ratio.” See, e.g., T. Hoffmann, S. Bennett, and C. del Mar, Evidence-Based Practice Across the Health Professionals (Chatswood: Elsevier Australia, 2009).
57 It seems likely that this neglect stems from the fact that the primary concern in eyewitness identification has been on incorrect assertions of guilt (i.e., false identifications) rather than incorrect assertions of innocence. There are normative values in society that reinforce this concern (as exemplified, for example, by Blackstone’s formulation: “Better that 10 guilty persons escape than that one innocent suffer.”)

witness will correctly identify the culprit is proportional to the likelihood that the witness will correctly identify lineup candidates as innocent—and thus conclude that evaluation of PPV alone is sufficient.
Contrary to that intuition, however, evidence from studies of analogous binary classifica- tion problems reveals that these two predictive probabilities can vary with respect to one another in complex ways.58 In practice, NPV-related measures (quantified as negative likelihood ratios) can be subjected to ROC analysis to account for the effects of response bias in the same manner as PPV-related measures (quantified as positive likelihood ratios, i.e., diagnosticity ratios)—the ROC axes in the NPV case corresponding to 1-HR and 1-FAR. Consideration of NPV and its relationship to PPV, by this and other means, may provide additional in- sight into the ways in which estimator and system variables (such as lineup procedures) influence eyewitness identification performance.59 In sum, a formal understanding of the task facing an eyewitness, in conjunction with an appreciation of causal models of human recogni- tion memory, has led to a potentially more comprehensive method—ROC analysis—for evaluating eyewitness identification performance. Despite these advances, it is important that practitioners in this field broadly ex- plore the large and rich field of statistical tools for evaluation of binary classifiers. While the committee recognizes that these tools are uninvesti- gated for this application and may possess their own share of unforeseen problems or disadvantages, a move in this direction may be of great value for improving the validity of eyewitness identification. 
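The relationship among hit rate (HR), false alarm rate (FAR), prevalence, PPV, and NPV described above follows from Bayes' rule, and a short sketch may make it concrete. The function names and the rates for the two lineup procedures below are hypothetical, chosen only for illustration; they are not data from any study cited here. The sketch assumes equal prevalence of culprit-present lineups in both procedures.

```python
def predictive_values(hit_rate, false_alarm_rate, prevalence):
    """Return (PPV, NPV) from Bayes' rule.

    hit_rate (HR): P(witness says "guilty" | culprit present)
    false_alarm_rate (FAR): P(witness says "guilty" | culprit absent)
    prevalence: P(culprit present), assumed equal across procedures
    """
    true_pos = hit_rate * prevalence
    false_pos = false_alarm_rate * (1.0 - prevalence)
    false_neg = (1.0 - hit_rate) * prevalence
    true_neg = (1.0 - false_alarm_rate) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def diagnosticity_ratio(hit_rate, false_alarm_rate):
    """Positive likelihood ratio, HR / FAR."""
    return hit_rate / false_alarm_rate


def negative_likelihood_ratio(hit_rate, false_alarm_rate):
    """(1 - HR) / (1 - FAR); at fixed prevalence, smaller values give higher NPV."""
    return (1.0 - hit_rate) / (1.0 - false_alarm_rate)


# Hypothetical (HR, FAR) pairs for two lineup procedures; illustrative only.
procedures = {"A": (0.60, 0.10), "B": (0.40, 0.05)}
PREVALENCE = 0.5  # assumed base rate of culprit-present lineups

results = {}
for name, (hr, far) in procedures.items():
    ppv, npv = predictive_values(hr, far, PREVALENCE)
    results[name] = (ppv, npv)
    print(
        name,
        "DR =", round(diagnosticity_ratio(hr, far), 2),
        "LR- =", round(negative_likelihood_ratio(hr, far), 3),
        "PPV =", round(ppv, 3),
        "NPV =", round(npv, 3),
    )
```

With these made-up rates, procedure B has the higher diagnosticity ratio and the higher PPV, yet the lower NPV, which is one simple way the two predictive values can move in opposite directions.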
Interactions with Eyewitnesses (Feedback)
The nature of law enforcement interactions with the eyewitness before, during, and after the identification plays a role in the accuracy of eyewitness identifications and in the confidence expressed in the accuracy of those identifications by witnesses.60 Law enforcement’s maintenance of neutral pre-identification communications, relative to the identification of a suspect, is seen as vital to ensuring that the eyewitness is not subjected to conscious or unconscious verbal or behavioral cues that could influence the eyewitness’ identification (see Box 2-1).61 If a witness happened to overhear an officer say, “We’ve got him, but before we finalize the arrest, let’s have the witness confirm it,” the witness might be biased to confirm the suspect’s identity in a showup. Furthermore, some types of law enforcement communication with a witness, after the witness has made an identification (e.g., “Good work! You picked the right guy…”), can increase confidence in an identification, regardless of whether the identification is correct.62
As discussed in Chapter 2, use of “blinded” or “double-blind” lineup identification procedures is an effective strategy for reducing the likelihood that a witness will be exposed to cues from interactions with law enforcement (such as feedback) that could influence identifications and/or confidence in those identifications. More generally, efforts to maintain objectivity and eliminate potentially informative communication will help ensure that eyewitness reports are not contaminated by knowledge or opinions held by others.
58 S-Y Shiu and C. Gatsonis, “The Predictive Receiver Operating Characteristic Curve for the Joint Assessment of the Positive and Negative Predictive Values,” Philosophical Transactions, Series A, Mathematical, Physical and Engineering Sciences 366 (1874): 2313–2333 (2008).
59 Another potentially informative analysis that combines PPV and NPV measures is known as a PROC (predictive ROC), which affords the opportunity to see how a given system or estimator variable may have interacting (synergistic or antagonistic) effects on assertions of guilt and innocence. See Shiu and Gatsonis, “The Predictive Receiver Operating Characteristic Curve.”
60 S. E. Clark, T. E. Marshall, and R. Rosenthal, “Lineup Administrator Influences on Eyewitness Identification Decisions,” Journal of Experimental Psychology: Applied 15(1): 63 (2009).
61 Clark, Moreland, and Gronlund, “Evolution of the Empirical and Theoretical Foundations of Eyewitness Identification Reform”: “…the performance advantage for unbiased instructions has decreased only slightly over the past 32 years. However, none of the correlations approached statistical significance.” p. 258.
62 Douglas and Steblay, “Memory Distortion in Eyewitnesses.”
RESEARCH STUDIES ON ESTIMATOR VARIABLES
The impact of estimator variables on eyewitness accuracy is harder to measure in the field than the impact of system variables.63 Consequently, estimator variables have been studied nearly exclusively in laboratory settings. The committee’s review revealed the need for further empirical research in individual studies and systematic reviews of research on these factors.
The committee’s review focused on the most-studied estimator variables: weapon focus, stress and fear, own-race bias, exposure, and retention interval. It is important to emphasize, however, that numerous other estimator variables may affect both the reliability and the accuracy of eyewitness identifications. Research has shown that the physical distance between the witness and the perpetrator is an important estimator variable, as it directly affects the ability of the eyewitness to discern visual details,64 including features of the perpetrator65 (see discussion of vision in Chapter 4). Research has also shown that an appearance change can greatly diminish the eyewitness’ ability to recognize the perpetrator; the eyewitness’ ability to remember faces of his or her own age group is often superior to his or her ability to remember faces of another age group (own-age bias); and if an eyewitness hears information or misinformation from another person before law enforcement involvement, his or her recollection of the event and confidence in the identification can be altered (co-witness contamination).66 Interactions between and among these variables have not been addressed systematically by researchers.
63 G. L. Wells, “What Do We Know about Eyewitness Identification?” American Psychologist (May 1993): 553, 555.
64 B. Uttl, P. Graf, and A. L. Siegenthaler, “Influence of Object Size on Baseline Identification, Priming, and Explicit Memory: Cognition and Neurosciences,” Scandinavian Journal of Psychology 48(4): 281–288 (2007).
65 C. L. Maclean et al., “Post-Identification Feedback Effects: Investigators and Evaluators,” Applied Cognitive Psychology 25(5): 739–752 (2011).
66 R. Zajac and N. Henderson, “Don’t It Make My Brown Eyes Blue: Co-Witness Misinformation about a Target’s Appearance Can Impair Target-Absent Lineup Performance,” Memory 17(3): 266–278 (2009).
Weapon Focus
The presence of an unusual object at the scene of a crime can impair visual perception and memory of key features of the crime event. Research suggests that the presence of a weapon at the scene of a crime captures the visual attention of the witness and impedes the ability of the witness to attend to other important features of the visual scene, such as the face of the perpetrator (see also discussion of visual attention in Chapter 4). The ensuing lack of memory of these other key features may impair recognition of a perpetrator in a subsequent lineup.
A 1992 analysis of weapon focus studies found that the presence of a weapon reduced both identification accuracy and feature accuracy (e.g., the eyewitness’ ability to recall clothing, facial features, and more).67 A more recent analysis of the weapon focus literature concluded that the presence of a weapon has an inconsistent effect on identification accuracy, in that larger effect sizes were observed in threatening scenarios than in non-threatening ones.68 As the retention interval increased, the weapon focus effect size decreased. The analysis further indicated that the effect of a weapon on accuracy is slight in actual crimes, slightly larger in laboratory studies, and largest for simulations.
One possible cause of the inconsistent effects of the presence of a weapon is suggested by a recent laboratory-based study that exposed participants to crime videos.69 These investigators used ROC analysis to investigate discriminability as a function of (1) sequential versus simultaneous lineups; (2) the presence of a weapon; and (3) the presence of a distinctive facial feature. Importantly for the present discussion, discriminability was reduced when the perpetrator possessed a weapon, but only when no distinctive facial feature was present. This interaction between weapon focus and distinctive feature highlights the importance of exploring the effects of interactions between different estimator variables on eyewitness identification performance.
Additional questions remain as to what causes reduced eyewitness performance in cases where a weapon is present. Is the effect caused by a diversion of selective attention, as is suggested by basic research on the phenomenon of inattentional blindness (see Chapter 4)? Is stress a significant factor, i.e., does anxiety cause the witness to focus less on the features of a person’s face? To what extent is the prominence of the issue an artifact of the particular studies included in the meta-analysis? Is it possible, for example, that the magnitude of the weapon effect depends on whether the data are collected in a laboratory setting versus the real world? To this latter point, some analyses of weapon focus have been conducted using archival records of crimes involving weapons.70 Unfortunately, such efforts often encounter serious methodological difficulties that include a lack of information about the crime (e.g., exposure duration) and the general lack of “ground truth” regarding accuracy of any identification, among other problems.
67 N. K. Steblay, “A Meta-analytic Review of the Weapon Focus Effect,” Law and Human Behavior 16(4): 413, 415–417 (1992).
68 Fawcett et al., “Of Guns and Geese.”
69 Carlson and Carlson, “An Evaluation of Lineup Presentation.”
70 See, e.g., Fawcett et al., “Of Guns and Geese.”
Stress and Fear
High levels of stress or fear can affect eyewitness identification.71,72,73 This finding is not surprising, given the known effects of fear and stress on vision and memory (see Chapter 4). Under conditions of high stress, a witness’ ability to identify key characteristics of an individual’s face (e.g., hair length, hair color, eye color, shape of face, presence of facial hair) may be significantly impaired.74
In the particular case of weapon focus, it may not be possible to sufficiently test the effects of stress and heightened stress in the laboratory because of limitations on human participant research that uses realistic and heightened threats. A meta-analysis of the effect of high stress on eyewitness memory nonetheless found some support for the notion that stress impairs both eyewitness recall and identification accuracy.75 The study authors noted that lineup type “moderated the effect of heightened stress on the false alarm rate.”76 They also suggested that the modest effect of stress may be caused by the fact that the analysis included many studies that involved modest stress-induction.77 Earlier studies were more mixed but with clearer results at “high levels of cognitive anxiety.”78 The findings of an earlier study “provide a concrete illustration of catastrophic decline” of eyewitness identification performance at high anxiety levels.79 The correct identification rate went from 75 percent for those with low-state anxiety to 18 percent for those with high-state anxiety.80
The effects of suggestion may be particularly important when the original memory is of a highly stressful event. A recent study looked at more than 850 active-duty military personnel participating in a mock POW camp phase of U.S. military survival school training, which included aggressive interrogation and physical isolation-related stress.81 The study found that misinformative details of the interrogation event (e.g., regarding the identity of the interrogator), which were introduced after the event had been encoded into long-term memory, affected identification accuracy. The study also found that memories acquired during stressful events are highly vulnerable to modification by exposure to post-event misinformation, even in individuals whose level of training and experience might be expected to make them relatively immune to such influences.
Another recent study comparing the eyewitness accuracy of officers and citizens concentrated on the effects of stress and weapon focus.82 The results of this study showed that officers were less stressed and aroused than citizens, but that both police and citizens made more errors when a weapon was inferred or present.
71 Deffenbacher et al., “A Meta-Analytic Review of the Effects of High Stress.”
72 C. A. Morgan III et al., “Accuracy of Eyewitness Memory for Persons Encountered During Exposure to Highly Intense Stress,” International Journal of Law and Psychiatry 27(3): 265–279 (2004).
73 C. A. Morgan III et al., “Accuracy of Eyewitness Identification Is Significantly Associated with Performance on a Standardized Test of Recognition,” International Journal of Law and Psychiatry 30(3): 213–223 (2007).
74 C. A. Morgan III et al., “Misinformation Can Influence Memory for Recently Experienced, Highly Stressful Events,” International Journal of Law and Psychiatry 36(1): 11–17 (2013).
75 Deffenbacher et al., “A Meta-Analytic Review of the Effects of High Stress.” It should be noted that the effect sizes for stress-induced support were small with wide confidence intervals, indicating considerable heterogeneity across studies. Although the authors assert that 300 studies with null findings would be required to negate the small effects found in this meta-analysis, fewer studies might be needed if they resulted in opposite effects.
76 Ibid., 700.
77 Ibid., 704.
78 Ibid., 689.
79 T. Valentine and J. Mesout, “Eyewitness Identification Under Stress in the London Dungeon,” Applied Cognitive Psychology 23(2): 151–161 (2009).
80 K. A. Deffenbacher, “Estimating the Impact of Estimator Variables on Eyewitness Identification: A Fruitful Marriage of Practical Problem Solving and Psychological Theorizing,” Applied Cognitive Psychology 22(6): 822 (2008).
81 Morgan et al., “Misinformation Can Influence Memory.”
82 J. C. DeCarlo, “A Study Comparing the Eyewitness Accuracy of Police Officers and Citizens,” (PhD Diss, City University of New York, 2010).
Own-Race Bias
The race and ethnicity of a witness as it relates to that of the perpetrator is another important estimator variable. In eyewitness identification, own-race bias describes the phenomenon in which faces of people of races different from that of the eyewitness are harder to discriminate (and thus harder to identify accurately) than are faces of people of the same race as the eyewitness.83 In the laboratory, this effect is manifested by higher hit rates and lower false alarm rates (higher diagnosticity ratio) in the recognition of faces of an observer’s own race relative to hits and false alarms for recognition of faces of other races.84 Own-race bias occurs in both visual discrimination and memory tasks, in laboratory and field studies, and across a range of races, ethnicities, and ages. Recent analyses revealed that cross-racial (mis)identification was present in 42 percent of the cases in which an erroneous eyewitness identification was made.85
A recent meta-analysis of own-race bias found an interaction between own-race bias and the duration of viewing exposure: reducing the amount of time allowed for viewing of each face significantly increased the magnitude of the bias, largely manifested as an increase in the proportion of false alarm responses to other-race faces.86 Own-race bias also interacts with the memory retention interval; cross-race errors of identification were greater when there were longer periods of time between the initial exposure and the memory retrieval.87 A recent study found that “context reinstatement,” wherein a researcher asks an individual to mentally re-create the context in which an incident occurred, failed to influence the identification of other-race faces.88
Although the existence of own-race bias is generally accepted, the causes for this effect are not fully understood. Some possible explanations are rooted in in-group/out-group models of human behavior (e.g., favoritism in which decisions regarding members of one’s own “group” are regarded as having greater importance than decisions regarding members of a different “group”) and differential perceptual expertise that results from different degrees of exposure to and familiarity with same versus other races. Recent work has examined the role that stereotyping might play.89 One study suggests that, in general, cross-race identification is further impaired when faces are presented in a group (as opposed to one at a time).90 Additional research is needed to identify procedures that may help estimate the degree of own-race biases in individual eyewitnesses following an identification procedure. Until the scientific basis for these effects is better understood, great care may be warranted when constructing lineups in instances where the race of the suspect differs from that of the eyewitness.
83 R. S. Malpass and J. Kravitz, “Recognition for Faces of Own and Other Race,” Journal of Personality and Social Psychology 13(4): 330–334 (1969).
84 Meissner and Brigham, “Thirty Years of Investigating the Own-Race Bias.”
85 The Innocence Project, “What Wrongful Convictions Teach Us About Racial Inequality,” available at: http://www.innocenceproject.org/Content/What_Wrongful_Convictions_Teach_Us_About_Racial_Inequality.php.
86 Meissner and Brigham, “Thirty Years of Investigating the Own-Race Bias.”
87 Ibid.
88 J. R. Evans, J. L. Marcon, and C. A. Meissner, “Cross-Racial Lineup Identification: Assessing the Potential Benefits of Context Reinstatement,” Psychology, Crime, and Law 15(1): 19–28 (2009).
Exposure Duration
Eyewitness identification researchers have long believed that exposure duration (e.g., time spent observing a perpetrator’s face during a crime) is positively correlated with the accuracy of eyewitness identification. The courts also have assumed that exposure duration has an effect on identification accuracy.91 Meta-analyses on the effects of exposure time have found that relatively long exposure durations produce greater accuracy92 and a larger and more stable effect size for exposure duration on eyewitness identification accuracy.93 Longer exposures were associated with higher rates of correct identifications and lower false alarm rates. Exposure duration may affect, or interact with, other variables, including own-race bias and the confidence–accuracy relationship assessed immediately after the lineup decision.94
The findings and conclusions from eyewitness identification studies of exposure duration are in keeping with much of the basic research on visual system function (reviewed in Chapter 4). This basic research indicates that the additional information available from longer viewing times reduces uncertainty and enables better detection and discrimination of visual stimuli.
89 H. M. Kleider, S. E. Cavrak, and L. R. Knuycky, “Looking Like a Criminal: Stereotypical Black Facial Features Promote Face Source Memory Error,” Memory and Cognition 40(8): 1200–1213 (2012).
90 K. Pezdek, M. O’Brien, and C. Wasson, “Cross-Race (but Not Same-Race) Face Identification Is Impaired by Presenting Faces in a Group Rather Than Individually,” Law and Human Behavior 36(6): 488–495 (2012).
91 Manson v. Brathwaite, 432 U.S. 98, 114 (1977), for example, included as a factor for assessing the reliability and admissibility of an identification, “the opportunity of the witness to view the criminal at the time of the crime” and explained that this factor includes both the length of time and the viewing conditions.
92 B. H. Bornstein et al., “Effects of Exposure Time and Cognitive Operations on Facial Identification Accuracy: A Meta-Analysis of Two Variables Associated with Initial Memory Strength,” Psychology, Crime, and Law 18(5): 473–490 (2012). The authors state, “We used z as the primary effect size measure for differences between proportions correct, but we also converted z to Pearson’s r for comparability to other meta-analyses (see Tables 1 and 2). The rs were then normalized and averaged to obtain the overall mean effect sizes. We also report the value of Cohen’s d associated with each mean effect size” (Bornstein et al., “Effects of Exposure Time and Cognitive Operations”). Although not defined, presumably z refers to the usual difference in means divided by its standard error, and, from their tables, their r was calculated as z divided by the square root of the reported sample size.
93 B. H. Bornstein, K. A. Deffenbacher, E. K. McGorty, and S. D. Penrod, “The Effect of Cognitive Processing on Facial Identification Accuracy: A Meta-Analysis” (Unpublished manuscript, University of Nebraska-Lincoln, 2007).
94 M. A. Palmer, et al., “The Confidence–Accuracy Relationship for Eyewitness Identification Decisions: Effects of Exposure Duration, Retention Interval, and Divided Attention,” Journal of Experimental Psychology: Applied 19(1): 55–71 (2013).
Retention Interval
Retention interval, or the amount of time that passes from the initial observation and encoding of a memory to a future time when the initial observation must be recalled from memory, can affect identification accuracy. Laboratory studies have demonstrated that stored memories are more likely to be forgotten with the increasing passage of time and can easily become “enhanced” or distorted by events that take place during this retention interval (see discussion of memory in Chapter 4). The amount of time between viewing a crime and the subsequent identification procedure can be expected to similarly affect the accuracy of the eyewitness identification, either independently or in combination with other variables.95
It is difficult to specify the precise relationship between retention interval and the accuracy of eyewitness identification testimony and to estimate when a lengthy retention interval will significantly impair the accuracy of identification. Although, in general, it appears that longer retention intervals are associated with poorer eyewitness identification performance, the strength of this association appears to vary greatly across the circumstances of the initial encounter, identification procedures, and research methodologies.96 A meta-analysis of published facial recognition and eyewitness identification studies found, for example, that an increase in the retention interval was associated with a decreased probability of an accurate identification of a previously seen but otherwise unfamiliar face.97 This same study also found that the rate of forgetting for an unfamiliar face is greatest soon after the initial observation and tends to level off over time, but was unable to specify the shape of this function.
The effect of the retention interval also is influenced by the strength and quality of the initial memory that is encoded, which, in turn, may be influenced by other estimator variables associated with witnessing the crime (such as the degree of visual attention) and viewing factors (such as distance, lighting, and exposure duration). As the retention interval becomes longer, the opportunity for intervening events to alter the memory also becomes greater, and other variables may interact with the retention interval to impair performance (see also discussion of memory in Chapter 4). During the retention interval, the ability to accurately identify faces of other races drops off especially quickly, relative to same-race accuracy.98 Also, for those eyewitnesses who initially express less confidence in their identification, there is a greater decrease in accuracy of identification when the retention interval is longer.99
95 One month is the most commonly encountered delay by British police. G. Pike, N. Brace, and S. Kynan, The Visual Identification of Suspects: Procedures and Practice (London: Policing and Reducing Crime Unit, 2002), cited by Deffenbacher et al., “Forgetting the Once-Seen Face.” Law enforcement authorities may have little control over the time required to identify a suspect and obtain the cooperation of the eyewitness to participate in an identification procedure. Thus, retention interval has commonly been considered an estimator variable in eyewitness identification studies.
96 See J. Dysart and R. C. L. Lindsay, “The Effects of Delay on Eyewitness Identification Accuracy: Should We Be Concerned?” in The Handbook of Eyewitness Psychology: Volume II: Memory for People, ed. R. C. L. Lindsay, D. F. Ross, J. D. Read, and M. P. Toglia. (Mahwah: Lawrence Erlbaum and Associates, 2006), 361–373.
97 Deffenbacher et al., “Forgetting the Once-Seen Face.” More than 20 of the published studies included in the meta-analysis found no significant effect of retention interval.
98 J. L. Marcon et al., “Perceptual Identification and the Cross-Race Effect,” Visual Cognition 18(5): 767–779 (2010) (finding that the cross-race effect was more pronounced when the retention interval was lengthened). Meissner and Brigham, “Thirty Years of Investigating the Own-Race Bias” [meta-analysis finding that as retention time increased “participants increasingly adopted a more liberal response criterion when responding to other-race faces. This liberal response criterion indicated that participants required less evidence from memory (e.g., familiarity or memorability of the face) to respond that they had previously seen an other-race face.”].
99 J. Sauer et al., “The Effect of Retention Interval on the Confidence–Accuracy Relationship for Eyewitness Identification,” Law and Human Behavior 34: 337–347 (2010) (finding greater overconfidence at lengthy retention intervals).
CONCLUSION
Research on eyewitness identification has appropriately identified the variables that may affect an individual’s ability to make an accurate identification. Early research findings played an important role in alerting law enforcement, prosecutors, defense counsel, and the judiciary to factors that might influence the accuracy of identifications. In some jurisdictions, eyewitness identification research was used to improve policies and procedures and to educate and train officers. However, much remains unsettled in many areas of eyewitness identification research.
While past research appropriately identified system and estimator variables that may affect an individual’s ability to make an accurate identification, this research might be strengthened in several ways. Greater collaboration between the police, courts, and researchers might lead to increased consensus on research agendas and the conceptualization of variables to be examined. More attention to reproducibility and transparency is needed in the selection of data collection strategies and reporting of data. Analyses need to be reported completely, including estimates of effects, confidence intervals, and significance levels. Further, in order to be useful to stakeholders, the statistical findings of this research need to be translated back into terms that can be readily understood by practice and policy decision-makers.
Further, our understanding of errors in eyewitness identification will benefit from more effective research designs, more informative statistical measures and analyses, more probing analyses of research findings, and more sophisticated systematic reviews and meta-analyses. In view of the complexity of the effects of both system and estimator variables, and their interactions, on eyewitness identification accuracy, better experimental designs that incorporate selected combinations of these variables (e.g., presence or absence of a weapon, lighting conditions, etc.) will elucidate those variables with meaningful influence on eyewitness performance, which can inform law enforcement practice of eyewitness identification procedures. To date, the eyewitness literature has evaluated procedures mostly in terms of a single diagnosticity ratio or an ROC curve; even if uncertainty is incorporated into the analysis, many other powerful tools for evaluating a “binary classifier” are worthy of consideration.100
When primary studies such as those described above are available in sufficient quantities, it is important that their results are synthesized using systematic reviews that conform to current best standards.101 These quantitative reviews would necessarily employ transparent, reproducible procedures for locating all relevant published and unpublished research; employ independent, duplicate procedures for selection of studies, extraction of data, and assessment of risk of bias; use meta-analytic procedures that account for the heterogeneity of outcomes both within and across studies; and interpret confidence intervals around pooled effects in a way that is readily understandable by stakeholders. These systematic reviews (which would be regularly updated as new studies are conducted) can be used to further refine the research agenda in eyewitness identification research and to establish priorities for funding of additional primary research.
100 Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning.
101 See, e.g., A. Liberati, et al., “The PRISMA Statement for Reporting Systematic Reviews and Meta-Analyses of Studies That Evaluate Health Care Interventions: Explanation and Elaboration,” PLoS Medicine 6(7): e1000100. doi:10.1371/journal.pmed.1000100 (2009) and Institute of Medicine, Finding What Works in Health Care: Standards For Systematic Reviews (Washington, DC: The National Academies Press, 2011).
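One common way to carry out a meta-analysis that accounts for heterogeneity across studies and yields a confidence interval around the pooled effect is a random-effects model. The sketch below uses the DerSimonian-Laird estimator as an illustrative choice; the text above does not name a specific method, and the function name and the effect sizes and variances in the example are hypothetical, not taken from any study cited here.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effect sizes with a DerSimonian-Laird random-effects model.

    effects: per-study effect estimates (e.g., standardized mean differences)
    variances: per-study sampling variances of those estimates
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect (inverse-variance) pooled estimate.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))

    # Method-of-moments estimate of between-study variance, floored at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's sampling variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))

    return {
        "pooled": pooled,
        "tau2": tau2,
        "q": q,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }


# Hypothetical effect sizes from three studies with equal sampling variance.
summary = random_effects_pool([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(summary)
```

When the study effects disagree more than sampling error alone would predict, the estimated between-study variance is positive and the interval around the pooled effect widens accordingly, which is the behavior a stakeholder-facing summary would need to convey.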
6 Findings and Recommendations
Eyewitnesses make mistakes. Our understanding of how to improve the accuracy of eyewitness identifications is imperfect and evolving. In the previous chapters, we described law enforcement procedures to elicit accurate eyewitness identifications; the courts’ handling of eyewitness identification evidence; the science of visual perception and memory as it applies to eyewitness identifications; and the contributions of scientific research to our understanding of the variables that affect the accuracy of identifications. On the basis of its review, the committee offers its findings and recommendations for
• identifying and facilitating best practices in eyewitness procedures for the law enforcement community;
• strengthening the value of eyewitness identification evidence in court; and
• improving the scientific foundation underpinning eyewitness identification.
OVERARCHING FINDINGS
The committee is confident that the law enforcement community, while operating under considerable pressure and resource constraints, is working to improve the accuracy of eyewitness identifications. These efforts, however, have not been uniform and often fall short as a result of insufficient training, the absence of standard operating procedures, and the continuing presence of actions and statements at the crime scene and elsewhere that may intentionally or unintentionally influence eyewitness’ identifications.
Basic scientific research on human visual perception and memory has provided an increasingly sophisticated understanding of how these systems work and how they place principled limits on the accuracy of eyewitness identification (see Chapter 4).1 Basic research alone is insufficient for understanding conditions in the field and thus has been augmented by studies applied to the specific practical problem of eyewitness identification (see Chapter 5). Such applied research has identified key variables that affect the accuracy and reliability of eyewitness identifications and has been instrumental in informing law enforcement, the bar, and the judiciary of the frailties of eyewitness identification testimony.
A range of best practices has been validated by scientific methods and research and represents a starting place for efforts to improve eyewitness identification procedures. A number of law enforcement agencies have, in fact, adopted research-based best practices. This report makes actionable recommendations on, for example, the importance of adopting “blinded” eyewitness identification procedures. It further recommends that standardized and easily understood instructions be provided to eyewitnesses and calls for the careful documentation of eyewitness’ confidence statements. Such improvements may be broadly implemented by law enforcement now.
It is important to recognize, however, that, in certain cases, the state of scientific research on eyewitness identification is unsettled. For example, the relative superiority of competing identification procedures (i.e., simultaneous versus sequential lineups) is unresolved. The field would benefit from collaborative research among scientists and law enforcement personnel in the identification and validation of new best practices that can improve eyewitness identification procedures.
Such a foundation can be solidified through the use of more effective research designs (for example, those that consider more than one variable at a time, and in different study populations to ensure reproducibility and generalizability), more informative statistical measures and analyses (i.e., methods from statistical machine learning and signal detection theory to evaluate the performance of binary classification tasks), more probing analyses of research findings (such as analyses of consequences of data uncertainties), and more sophisticated systematic reviews and meta-analyses (that take account of current guidelines, including transparency and reproducibility of methods).
1 Basic research on vision and memory seeks a comprehensive understanding of how these systems are organized and how they operate generally. The understanding derived from basic research includes principles that enable one to predict how a system (such as vision or memory) might behave under specific conditions (such as those associated with witnessing a crime), and to identify the conditions under which it will operate most effectively and those under which it will fail. Applied research, by contrast, empirically evaluates specific hypotheses about how a system will behave under a particular set of real-world conditions.
In view of the complexity of the effects of both system and estimator variables and their interactions on eyewitness identification accuracy, better experimental designs that incorporate selected combinations of these variables (e.g., presence or absence of a weapon, lighting conditions, etc.) will elucidate those variables with meaningful influence on eyewitness performance, which can, in turn, inform law enforcement practice of eyewitness identification procedures.
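As one illustration of the statistical measures for binary classification tasks mentioned above, the sketch below computes a diagnosticity ratio and the points of an ROC curve from lineup outcomes grouped by witness confidence. The counts, the three confidence levels, and the function names are hypothetical and chosen only for illustration; they are not data from the report.

```python
def diagnosticity_ratio(correct_ids, n_culprit_present, false_ids, n_culprit_absent):
    """Correct-identification rate divided by false-identification rate."""
    correct_rate = correct_ids / n_culprit_present
    false_rate = false_ids / n_culprit_absent
    if false_rate == 0:
        raise ValueError("ratio is undefined when there are no false identifications")
    return correct_rate / false_rate


def roc_points(correct_by_conf, false_by_conf, n_culprit_present, n_culprit_absent):
    """Cumulative (false-ID rate, correct-ID rate) pairs, highest confidence first."""
    points, correct, false = [], 0, 0
    for c, f in zip(correct_by_conf, false_by_conf):
        correct += c
        false += f
        points.append((false / n_culprit_absent, correct / n_culprit_present))
    return points


# Hypothetical suspect identifications at high, medium, and low confidence,
# from 100 culprit-present and 100 culprit-absent lineups.
correct_by_conf = [30, 20, 10]
false_by_conf = [2, 5, 8]
overall = diagnosticity_ratio(sum(correct_by_conf), 100, sum(false_by_conf), 100)
curve = roc_points(correct_by_conf, false_by_conf, 100, 100)
print(overall, curve)
```

A single diagnosticity ratio collapses these outcomes to one number, whereas the ROC points show how the balance between correct and false identifications shifts as lower-confidence responses are included.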
To date, the eyewitness literature has evaluated procedures mostly in terms of a single diagnosticity ratio or an ROC (Receiver Operating Characteristic) curve; even if uncertainty is incorporated into the analysis, many other powerful tools for evaluating a “binary classifier” are available and worthy of consideration.2 Finally, syntheses of eyewitness research have been limited to meta-analyses that have not been conducted in the context of systematic reviews. Systematic reviews of stronger research studies need to conform to current standards and be translated into terms that are useful for decision-makers.
2 T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (New York: Springer, 2009).
The committee offers the following recommendations to strengthen the effectiveness of policies and procedures used to obtain accurate eyewitness identifications.
RECOMMENDATIONS TO ESTABLISH BEST PRACTICES FOR THE LAW ENFORCEMENT COMMUNITY
The committee’s review of law enforcement practices and procedures, coupled with its consideration of the scientific literature, has identified a number of areas where eyewitness identification procedures could be strengthened. The practices and procedures considered here involve acquisition of data that reflect a witness’ identification and the contextual factors that bear on that identification. A recurrent theme underlying the committee’s recommendations is development of, and adherence to, guidelines that are consistent with scientific standards for data collection and reporting.
Recommendation #1: Train All Law Enforcement Officers in Eyewitness Identification
The resolution and accuracy of visual perceptual experience, as well as the fidelity of our memories to events perceived, may be compromised by many factors at all stages of processing (see Chapter 4). Perceptual experiences are limited by uncertainties and biased by expectations. Unknown to the individual, memories are forgotten, reconstructed, updated, and distorted. An eyewitness’s memory can be contaminated by a wide variety of influences, including interaction with the police.
The committee recommends that all law enforcement agencies provide their officers and agents with training on vision and memory and the variables that affect them, on practices for minimizing contamination, and on effective eyewitness identification protocols. In addition to instruction at the police academy, officers should receive periodic refresher training, and officers assigned to investigative units should receive in-depth instruction. Dispatchers should be trained not to “leak” information from one caller to the next and to ask for information in a non-leading way. Police officers should be trained to ask open-ended questions, avoid suggestiveness, and efficiently manage scenes with multiple witnesses (e.g., minimize interactions among witnesses).
Recommendation #2: Implement Double-Blind Lineup and Photo Array Procedures
Decades of scientific evidence demonstrate that expectations can bias perception and judgment and that expectations can be inadvertently communicated.3 Even when lineup administrators scrupulously avoid comments that could identify which person is the suspect, unintended body gestures, facial expressions, or other nonverbal cues have the potential to inform the witness of the suspect’s location in the lineup or photo array. Double-blinding is central to the scientific method because it minimizes the risk that experimenters might inadvertently bias the outcome of their research, finding only what they expected to find. For example, in medical clinical trials, double-blind designs are crucial to account for experimenter biases, interpersonal influences, and placebo effects.
To minimize inadvertent bias, double-blinding procedures are sometimes used in which the test administrator does not know the composition of the photo array or lineup. If administrators are not involved with construction of the lineup and are unaware of the placement of the potential suspect in the sequence, then they cannot influence the witness.
Some in the law enforcement community have responded to calls for double-blind lineup administration with concern, citing the potential for increased financial costs and human resource demands. The committee believes there are ways to reduce these costs and recommends that police departments consider procedures and new technologies that increase efficiency of data acquisition under double-blind procedures or those procedures that closely approximate double-blind procedures. If an administrator who does not know the identity of the suspect cannot be assigned to the task, then a non-blind administrator (one knowing the status of the individuals in the lineup) might use a computer-automated presentation of lineup photos. If computer-based presentation technology is unavailable, then the administrator could place photos in numbered folders that are then shuffled, as is current practice in some jurisdictions.
3 See Box 2-1.
The committee recommends blind (double-blind or blinded) administration of both photo arrays and live lineups and the adoption of clear, written policies and training on photo array and live lineup administration. Police should use blind procedures to avoid the unintentional or intentional exchange of information that might bias an eyewitness. The “blinded” procedure minimizes the possibility of either intentional or inadvertent suggestiveness and thus enhances the fairness of the criminal justice system.
Suggestiveness during an identification procedure can result in suppression of both out-of-court and in-court identifications and thereby seriously impair the prosecution’s ability to prove its case beyond a reasonable doubt. The use of double-blind procedures will eliminate a line of cross-examination of officers in court.
Recommendation #3: Develop and Use Standardized Witness Instructions
The committee recommends the development of a standard set of easily understood instructions to use when engaging a witness in an identification procedure. Witnesses should be instructed that the perpetrator may or may not be in the photo array or lineup and that the criminal investigation will continue regardless of whether the witness selects a suspect. Administrators should use witness instructions consistently in all photo arrays or lineups, and can use pre-recorded instructions or read instructions aloud, in the manner of the mandatory reading of Miranda Rights. Accommodations should be made when questioning non-English speakers or those with restricted linguistic ability. Additionally, the committee recommends the development and use of a standard set of instructions for use with a witness in a showup.
Recommendation #4: Document Witness Confidence Judgments
Evidence indicates that self-reported confidence at the time of trial is not a reliable predictor of eyewitness accuracy.4 The relationship between the witness’ stated confidence and accuracy of identifications may be greater at the moment of initial identification than at the time of trial.
However, the strength of the confidence-accuracy relationship varies, as it depends on complex interactions among such factors as environmental conditions, persons involved, individual emotional states, and more.5 Expressions of confidence in the courtroom often deviate substantially from a witness’ initial confidence judgment, and confidence levels reported long after the initial identification can be inflated by factors other than the memory of the suspect. Thus, the committee recommends that law enforcement document the witness’ level of confidence verbatim at the time when she or he first identifies a suspect, as confidence levels expressed at later times are subject to recall bias, enhancements stemming from opinions voiced by law enforcement, counsel and the press, and to a host of other factors that render confidence statements less reliable. During the period between the commission of a crime and the formal identification procedure, officers should avoid communications that might affect a witness’ confidence level. In addition, to avoid increasing a witness’ confidence, the administrator of an identification procedure should not provide feedback to a witness. Following a formal identification, the administrator should obtain the witness’ level of confidence by self-report (this report should be given in the witness’ own words) and document this confidence statement verbatim. Accommodations should be made for non-English speakers or those with restricted linguistic ability.
4 See, e.g., C. M. Allwood, J. Knutsson, and P. A. Granhag, “Eyewitnesses Under Influence: How Feedback Affects the Realism in Confidence Judgements,” Psychology, Crime, and Law 12(1): 25–38 (2006); B. H. Bornstein and D. J. Zickafoose, “‘I Know I Know It, I Know I Saw It’: The Stability of the Confidence-Accuracy Relationship Across Domain,” Journal of Experimental Psychology-Applied 5(1): 76–88 (1999); P. A. Granhag, L. A. Stromwall, and C. M. Allwood, “Effects of Reiteration, Hindsight Bias, and Memory on Realism in Eyewitness Confidence,” Applied Cognitive Psychology 14(5): 397–420 (2000); and H. L. Roediger III, J. T. Wixted, and K. A. DeSoto, “The Curious Complexity between Confidence and Accuracy in Reports from Memory” in Memory and Law, ed. L. Nadel and W. P. Sinnott-Armstrong (Oxford: Oxford University Press, 2012).
5 See, e.g., J. M. Talarico and D. C. Rubin, “Confidence, Not Consistency, Characterizes Flashbulb Memories,” Psychological Science 14(5): 455–461 (September 2003).
Recommendation #5: Videotape the Witness Identification Process
The committee recommends that the video recording of eyewitness identification procedures become standard practice.
Although videotaping does have drawbacks (e.g., costs, witness advocates opposing videotaping of witnesses’ faces, and witnesses not wanting to be videotaped), it is necessary to obtain and preserve a permanent record of the conditions associated with the initial identification. When necessary, efforts should be made to obtain non-intrusive recordings of the initial identification process and to accommodate non-English speakers or those with restricted linguistic ability. Measures should also be taken to protect the identity of eyewitnesses who may be at risk of harm because they make an identification.
RECOMMENDATIONS TO STRENGTHEN THE VALUE OF EYEWITNESS IDENTIFICATION EVIDENCE IN COURT
The best guidance for legal regulation of eyewitness identification evidence comes not from constitutional rulings, but from the careful use and understanding of scientific evidence to guide fact-finders and decision-makers. The Manson v. Brathwaite test under the Due Process Clause of the U.S.
Constitution for assessing eyewitness identification evidence was established in 1977, before much applied research on eyewitness identification had been conducted. That test evaluates the “reliability” of eyewitness identifications using factors derived from prior rulings and not from empirically validated sources. As critics have pointed out, the Manson v. Brathwaite test includes factors that are not diagnostic of reliability. Moreover, the test treats factors such as the confidence of a witness as independent markers of reliability when, in fact, it is now well established that confidence judgments may vary over time and can be powerfully swayed by many factors. While some states have made minor changes to the due process framework (e.g., by altering the list of acceptable “reliability” factors; see Chapter 3), wholesale reconsideration of this framework is only a recent development (e.g., the recent decisions by state supreme courts in New Jersey and Oregon; see Chapter 3).
Recommendation #6: Conduct Pretrial Judicial Inquiry
Eyewitness testimony is a type of evidence where (as with forms of forensic trace evidence) contamination may occur pre-trial. Judges rarely make pre-trial inquiries about evidence in criminal cases without one of the parties first raising an objection. In cases involving eyewitness evidence, however, parties may not be sufficiently knowledgeable about the relevant scientific research to raise concerns.
Judges have an affirmative obligation to ensure the reliability of evidence presented at trial. To meet this obligation, the committee recommends that, as appropriate, a judge make basic inquiries when eyewitness identification evidence is offered.
While the contours of such an inquiry would need to be established on a case-by-case basis, at a minimum, the judge could inquire about prior lineups, what information had been given to the eyewitness before the lineup, what instructions had been given to the eyewitness in connection with administering the lineup, and whether the lineup had been administered “blindly.” The judge could also entertain requests from the parties for additional discovery and could ask the parties to brief any issues raised by these inquiries. A judge also could review reports of the eyewitness’ confidence and any recordings of the identification procedures. When assessing the reliability of an identification, a judge could also inquire as to what eyewitness identification procedures the agency had in place and the degree to which they were followed. Both pre-trial judicial inquiries and any subsequent judicial review would create an incentive for agencies to adopt written eyewitness identification procedures and to document the identifications themselves.
If these initial inquiries raise issues with the identification process, a judge could conduct a pre-trial hearing to review the reliability and admissibility of eyewitness identification evidence and to assess how it should be treated at trial if found admissible. If indicia of unreliable eyewitness identifications are present, the judge should apply applicable law in deciding whether to exclude the identifications or whether some lesser sanction is appropriate. As discussed in the sections that follow, a judge may limit portions of the testimony of the eyewitness. A judge can also ensure that the jury is provided with a scientific framework within which to evaluate the evidence.
Recommendation #7: Make Juries Aware of Prior Identifications
The accepted practice of in-court eyewitness identifications can influence juries in ways that cross-examination, expert testimony, or jury instructions are unable to counter effectively. Moreover, as research suggests (see Chapters 4 and 5), the passage of time since the initial identification may mean that a courtroom identification is a less accurate reflection of an eyewitness’ memory. In-court confidence statements may also be less reliable than confidence judgments made at the time of an initial out-of-court identification, as memory fails and/or confidence grows disproportionately. The confidence of an eyewitness may increase by the time of the trial as a result of learning more information about the case, participating in trial preparation, and experiencing the pressures of being placed on the stand.
An identification of the kind dealt with in this report typically should not occur for the first time in the courtroom. If no identification procedure was conducted during the investigation, a judge should consider ordering that an identification procedure be conducted before trial. In any case, whenever the eyewitness identifies a suspect in the courtroom, it is important for jurors to hear detailed information about any earlier identification, including the procedures used and the confidence expressed by the witness at that time. The descriptions of prior identifications and confidence at the time of those earlier out-of-court identifications provide more useful information to the fact-finders and decision-makers.
Accordingly, the committee recommends that judges take all necessary steps to make juries aware of prior identifications, the manner and time frame in which they were conducted, and the confidence level expressed by the eyewitness at the time.
Recommendation #8: Use Scientific Framework Expert Testimony
The committee finds that a scientific framework describing what factors may influence a witness’ visual experience of an event and the resolution and fidelity of that experience, as well as factors that underlie and influence subsequent encoding, storage, and recall of memories of an event, can inform the fact-finder in a criminal case. As discussed throughout this report, many scientifically established aspects of eyewitness memory are counterintuitive and may defy expectations. Jurors will likely need assistance in understanding the factors that may affect the accuracy of an identification. In many cases this information can be most effectively conveyed by expert testimony.
Contrary to the suggestion of some courts, the committee recommends that judges have the discretion to allow expert testimony on relevant precepts of eyewitness memory and identifications. Expert witnesses can explain scientific research in detail, capture the nuances of the research, and focus their testimony on the most relevant research. Expert witnesses can convey current information based on the state of the research at the time of a trial. Expert witnesses can also be cross-examined, and limitations of the research can be expressed to the jury.
Certainly, qualified experts will not be easy to locate in a given jurisdiction; and indigent defendants may not be able to afford experts absent court funds. Moreover, once the defense secures an expert, the prosecution may retain a rebuttal expert, adding complexity to the litigation. Further investigation may explore the effectiveness of expert witness presentation of relevant scientific findings compared with jury instructions.
Until there is a clearer understanding of the strengths and weaknesses of this technique, the committee views expert testimony as an appropriate and effective means of providing the jury with information to assess the strength of the eyewitness identification.
Expert witnesses should not be permitted to testify without limits. An expert explaining the relevant scientific framework can describe the state of the research and focus on the factors that are particularly relevant in a given case. However, an expert must not be allowed to testify beyond the limits of his or her expertise. Although current scientific knowledge would allow an expert to inform the jury of factors bearing on their evaluation of an eyewitness’ identification, the committee has seen no evidence that the scientific research has reached the point that would properly permit an expert to opine, directly or through an equivalent hypothetical question, on the accuracy of an identification by an eyewitness in a specific case.
In many jurisdictions, expert witnesses who can testify regarding eyewitness identification evidence may be unavailable. In state courts, funding for expert witnesses may be far more limited than funding in federal courts. The committee recommends that local jurisdictions make efforts to ensure that defendants receive funding to obtain access to qualified experts.
Recommendation #9: Use Jury Instructions as an Alternative Means to Convey Information
The committee recommends the use of clear and concise jury instructions as an alternative means of conveying information regarding the factors that the jury should consider.
Jury instructions should explain, in clear language, the relevant principles.
Like the New Jersey instructions,6 the instructions should allow judges to focus on factors relevant to the specific case, since not all cases implicate the same factors. Jury instructions do not need to be as detailed as the New Jersey model instructions and do not need to omit all reference to underlying research. With the exception of the New Jersey instructions, jury instructions have tended to address only certain subjects, or to repeat the problematic Manson v. Brathwaite language, which was not intended as instructions for jurors.
Appropriate legal organizations, together with law enforcement, prosecutors, defense counsel, and judges, should convene a body to establish model jury instructions regarding eyewitness identifications.
6 New Jersey Criminal Model Jury Instructions, Identification (July 19, 2012), available at: http://www.judiciary.state.nj.us/pressrel/2012/jury_instruction.pdf. New Jersey Court Rule 3:11, Record of an Out-of-Court Identification Procedure (July 19, 2012), available at: http://www.judiciary.state.nj.us/pressrel/2012/new_rule.pdf, New Jersey Court Rule 3:13-3. Discovery and Inspection (July 19, 2012), available at: http://www.judiciary.state.nj.us/pressrel/2012/rev_rule.pdf.
RECOMMENDATIONS TO IMPROVE THE SCIENTIFIC FOUNDATION UNDERPINNING EYEWITNESS IDENTIFICATION RESEARCH
Basic scientific research on visual perception and memory provides important insight into the factors that can limit the fidelity of eyewitness identification (see Chapter 4). Research targeting the specific problem of eyewitness identification (see Chapter 5) complements basic scientific research. However, this strong scientific foundation remains insufficient for understanding the strengths and limitations of eyewitness identification procedures in the field.
Many of the applied studies on key factors that directly affect eyewitness performance in the laboratory are not readily applicable to actual practice and policy. Applied research falls short because of a lack of reliable or standardized data from the field, a failure to include a range of practitioners in the establishment of research agendas, the use of disparate research methodologies, failure to use transparent and reproducible research procedures, and inadequate reporting of research data. The task of guiding eyewitness identification research toward the goal of evidence-based policy and practice will require collaboration in the setting of research agendas and agreement on methods for acquiring, handling, and sharing data.
Recommendation #10: Establish a National Research Initiative on Eyewitness Identification
To further our understanding of eyewitness identification, the committee recommends the establishment of a National Research Initiative on Eyewitness Identification (hereinafter, the Initiative). The Initiative should involve the academic research community, law enforcement community, the federal government, and philanthropic organizations. The Initiative should (1) establish a research agenda to guide research for the next decade; (2) formulate practice- and policy-relevant research questions; (3) identify opportunities for additional data collection; (4) systematically review research to examine emerging findings on the impact of system and estimator variables; (5) translate research findings into policies and procedures that are both practical and appropriate for law enforcement; and (6) set priorities and timelines for issues to be addressed, the conduct of research, the development of best practices, and formal assessments.
The committee notes that there appear to be few existing partnerships between the scientific community and law enforcement organizations and therefore recommends that the National Science Foundation (NSF) and the National Institute of Standards and Technology (NIST) take a leadership role working with other federal agencies, such as the National Institute of Justice (NIJ), the Bureau of Justice Statistics (BJS), and the Federal Bureau of Investigation (FBI), to support such collaborations.
The impact on society of innocents being incarcerated while perpetrators remain free, in conjunction with limited federal resources, highlights the need for both public and private support for this Initiative.
To enhance the scientific foundation of eyewitness identification research and practice, the Initiative should commit to the following:
a. Include a practice- and data-informed research agenda that incorporates input from law enforcement and the courts and establishes methodological and reporting standards for research to assess the fundamental performance of various aspects of eyewitness identification procedures as well as synthesize research findings across studies.
b. Develop protocols and policies for the collection, preservation, and exchange of field data that can be used jointly by the scientific and law enforcement communities. Data collection procedures used in the field should be developed to ensure the relevance of the collected data, to facilitate analysis of the data, and to minimize potential bias and loss of data through incomplete recording strategies.
Law enforcement agencies should take the lead in collecting, maintaining, and sharing relevant data from the field.
Much of the data that would be useful for the evaluation of eyewitness identification procedures have been collected in the form of administrative records and may be readily adapted for use in research. Comprehensive data should be collected on lineup composition and witness selections (i.e., fillers, non-identifications, and position of suspect in lineup).
c. Develop and adopt guidelines for the conduct and reporting of applied scientific research on eyewitness identification that conform to the highest scientific standards. All eyewitness research, including field-based studies, laboratory-based studies, and research synthesis, should use rigorous research methods and provide detailed reporting of both methods and results, including (1) pre-registration of all study protocols; (2) investigation of research questions and hypotheses informed by the needs of practice and policy; (3) adoption of strict operationalization of key measures and objective data collection; (4) development of experimental designs informed by analytical concerns; (5) use of proper statistical procedures that account for the often nontraditional nature of data in this field (e.g., estimates of effects with appropriate statements of uncertainty, multiple responses from different scenarios from the same individuals, effects of order and time of presentation when important, treatment of extreme observations or outliers); (6) reporting of participant recruitment and selection and assignment to conditions; (7) complete reporting of findings including effect sizes and associated confidence intervals for both significant and non-significant effects; and (8) derivation of conclusions that are grounded firmly in the findings of the study, are framed in the context of the strengths and limitations of study methodology, and clearly state their implications for practice and policy decisions.
Strict adherence to guidelines for eyewitness identification research will result in more credible research findings that can guide policy and practice. Research that conforms to guidelines will withstand rigorous scrutiny by peers, will be verifiable through replication, and will permit inclusion in systematic reviews, leading to greater confidence in the validity and generalizability of findings.
d. Adopt rigorous standards for systematic reviews and meta-analytic studies. Meta-analyses of primary studies should be conducted only in the context of systematic reviews that locate and critically appraise all research findings, including those from unpublished studies. Analyses should consistently appraise and account for possible biases in the included research. Studies that do not adequately conduct or report research methods, such as randomization, should be identified in the findings. Sensitivity analyses considering impacts of lower quality or inadequately reported studies on pooled effect estimates should be conducted and reported. When attempting to draw conclusions from studies with missing data, reviewers should first attempt to contact the authors of the research for additional information.
When missing data cannot be retrieved from researchers, imputation methods should, if used, be specific, transparent, and reproducible. Statistical methods for meta-analysis should conform to current best practice, using models appropriate to the level of heterogeneity of results across studies, computing both point estimates and confidence intervals around effect sizes, and translating the results of meta-analyses into terms that are both understandable and useful to practice and policy decision makers.
e. Provide basic instruction for police, prosecutors, defense counsel, and judges on aspects of the scientific method relevant to eyewitness identification procedures (e.g., the rationale for blinded administration), including principles of research design and the uncertainties associated with data analysis. Training should cover the importance of data collection and interpretation, including the role of standardized eyewitness identification procedures and documentation of witness statements of confidence. Competencies acquired through such training (quantitative reasoning, understanding principles of research design, and recognition of data uncertainties) are likely to apply to issues beyond eyewitness identification. For example, the knowledge and skills from training can be applied to other issues that personnel face, either in forensic science technologies or in process administration, evaluation, and quality improvement. Similarly, scientists will benefit from a greater knowledge of legal issues, standards, and procedures related to the problem of eyewitness identification. Training of both communities (law and science) will enhance communication and lead to productive collaborations.
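Recommendation d asks that meta-analyses use models suited to the heterogeneity of results across studies and report both point estimates and confidence intervals. As a rough illustration only, the sketch below pools hypothetical effect sizes with inverse-variance weights and the DerSimonian-Laird estimate of between-study variance. That estimator is one common random-effects choice, not one the report itself names. The function name `pool_effects`, the 1.96 normal multiplier for an approximate 95% interval, and the sample numbers are all assumptions made for the example.

```python
import math


def pool_effects(effects, variances):
    """Pool study effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- per-study effect estimates (hypothetical inputs)
    variances -- per-study sampling variances of those estimates
    Returns the pooled estimate, an approximate 95% confidence interval,
    Cochran's Q, and the estimated between-study variance tau2.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect (inverse-variance) estimate, needed to compute Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q measures how much the studies disagree with each other.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to every study's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1.0 / sum_re)

    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical effect sizes and variances from four imagined studies.
    result = pool_effects([0.10, 0.35, 0.60, 0.20], [0.02, 0.03, 0.05, 0.01])
    print({k: v for k, v in result.items()})
```

Rerunning the function with lower-quality or inadequately reported studies removed gives a simple version of the sensitivity analysis that recommendation d describes.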
The collaborative research initiative between researchers and law enforcement communities will be challenging as it will necessitate (1) standardized police procedures;7 (2) systematic valid evidence collection and data entry and analysis; and (3) education and training for both researchers and law enforcement professionals on the differences between these two communities in their use of terms and considerations of standards of evidence and uncertainties in data. These three elements of a collaborative initiative are critical to advancing the science related to eyewitness identifications, as each bears directly on the integrity of the foundation upon which the efficacy and validity of current and future practices will be judged. Without such a foundation, practical advances in our scientific understanding are unlikely to occur.
7 The term standardized procedures refers to the notion that professionals reliably follow the same set of steps or procedures. Such standardization ensures that data across cases can be considered comparable and, to a greater extent, more reliable. Although reliability is not equivalent to validity, it is essential before researchers can assess questions of validity. Without standardized procedures, valid comparisons between departments and regions of the country cannot be achieved.
The committee further recommends that the Initiative support research to better understand the following: (1) the variables that affect the accuracy, precision, and reliability of eyewitness identifications, and how those variables interact and vary in practice; (2) the (possibly joint) impact of estimator and system variables on both identification accuracy and response bias; (3) best practices for probing witness memory with the least potential for bias or contamination; (4) best strategies to assess witnesses’ confidence levels when making an identification; (5) appropriate types of instructions for police, witnesses, and juries to best inform and facilitate the collection and interpretation of eyewitness identifications; (6) photo array composition and procedures; (7) identification procedures in the field (showups); (8) innovative technologies that might increase the reliability of eyewitness testimony (e.g., algorithm-based computer face recognition software, computer administered photo arrays, and mobile technologies with photo identification programs); and (9) the most effective means of informing jurors how to consider the factors that affect the strengths and weaknesses of eyewitness identification evidence.
Recommendation #11: Conduct Additional Research on System and Estimator Variables
Among the many variables that can affect eyewitness identification, the procedures for constructing a lineup have received the greatest attention in recent years. As discussed in Chapter 5, the question as to whether a simultaneous or sequential lineup is preferred is a specific case of the more general question of what conditions might improve the performance of an eyewitness. The answer to that question depends upon the criteria used to evaluate performance, and much of the debate has thus focused on the analysis tools for evaluation. These tools have improved significantly over the years, beginning with the use of a diagnosticity ratio, which uses the likelihood that the person identified is actually guilty as an evaluation criterion. More recently, the diagnosticity ratio approach has been augmented by analysis of Receiver Operating Characteristics (ROC analysis), which uses a measure of discriminability (i.e., a measure of how well the witness can discriminate between different possible matches to his or her memory of the face of the culprit) as an evaluation criterion.
In principle, ROC analysis is a positive step, if only because it incorporates more information (i.e., the earlier diagnosticity ratio is one component of the ROC analysis). But a more complex question concerns how policy-makers and practitioners should weigh the two evaluation criteria that have been considered thus far (likelihood of guilt and discriminability) when making a decision about which lineup procedures to adopt. The answer is particularly nuanced because the two criteria do not always lead to the same conclusion; one lineup procedure may yield poorer discriminability while at the same time increasing the likelihood that the identified person is actually guilty.
The committee concludes that there should be no debate about the value of greater discriminability: to promote a lineup procedure that yields less discriminability would be akin to advocating that the lineup be performed in dim instead of bright light. For this reason, the committee recommends broad use of statistical tools that can render a discriminability measure to evaluate eyewitness performance. But a lineup procedure that improves discriminability can yield greater or lesser likelihood of correct identification, depending on how the procedure is applied (see Chapter 5). For lineup procedures that yield greater discriminability, greater likelihood of correct identification would appear preferable and can be achieved by methods that elicit a more conservative response bias, such as a sequential (relative to simultaneous) lineup procedure.8 The committee thus recommends a rigorous exploration of methods that can lead to more conservative responding (such as witness instructions) but do not compromise discriminability.
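The contrast between the two criteria discussed above can be made concrete with a small calculation. The following is a minimal sketch, not the committee's own method. It assumes the convention common in the eyewitness literature that the diagnosticity ratio is the correct-identification rate in culprit-present lineups divided by the false-identification rate in culprit-absent lineups, and that an ROC curve is traced by accumulating identifications from the most confident to the least confident. The function names and every count are hypothetical.

```python
import math


def roc_points(correct_ids, false_ids, n_present, n_absent):
    """Return cumulative (false-ID rate, correct-ID rate) pairs.

    correct_ids and false_ids hold counts per confidence level, ordered
    from most to least confident. Each pair is one operating point: the
    rates obtained if only identifications at or above that confidence
    level were counted.
    """
    points = []
    cum_correct = 0
    cum_false = 0
    for c, f in zip(correct_ids, false_ids):
        cum_correct += c
        cum_false += f
        points.append((cum_false / n_absent, cum_correct / n_present))
    return points


def diagnosticity_ratio(point):
    """Correct-ID rate divided by false-ID rate at one operating point."""
    false_rate, correct_rate = point
    if false_rate == 0:
        return math.inf  # no false identifications at this threshold
    return correct_rate / false_rate


def partial_auc(points):
    """Trapezoidal area under the ROC from (0, 0) to the last point."""
    area = 0.0
    prev_x, prev_y = 0.0, 0.0
    for x, y in points:
        area += (x - prev_x) * (y + prev_y) / 2.0
        prev_x, prev_y = x, y
    return area


if __name__ == "__main__":
    # Hypothetical counts for high, medium, and low confidence, from
    # 100 culprit-present and 100 culprit-absent lineups.
    pts = roc_points([30, 20, 10], [2, 6, 12], 100, 100)
    for pt in pts:
        print(pt, round(diagnosticity_ratio(pt), 2))
    print("partial AUC:", round(partial_auc(pts), 4))
```

In this made-up data set the diagnosticity ratio falls as less confident identifications are included, even though every operating point lies on the same curve. A single ratio therefore mixes response bias with discriminability, whereas the area under the curve summarizes the whole set of operating points.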
In view of these considerations of performance criteria and recommendations about analysis tools, can we draw definitive conclusions about which lineup procedure (sequential or simultaneous) is preferable? At this point, the answer is no. Using discriminability as a criterion, there is, as yet, not enough evidence for the advantage of one procedure over another. The committee thus recommends that caution and care be used when considering changes to any existing lineup procedure, until such time as there is clear evidence for the advantages of doing so. From a larger perspective, the identification of factors (such as specific lineup procedures or states of other system variables) that can objectively improve eyewitness identification performance must be among the top priorities for this field. This leads us to three additional recommendations.
a. The committee recommends a broad exploration of the merits of different statistical tools for use in the evaluation of eyewitness performance. ROC analysis represents an improvement over a single diagnosticity ratio, yet there are well-documented quantitative shortcomings to the ROC approach. But are there alternatives? As noted in Chapter 5, the task facing an eyewitness is a binary classification task and there exist many powerful statistical tools for evaluation of binary classification performance that are widely used, for example, in the field of machine learning. While none of these tools has been vetted for application to the problem of eyewitness identification, they offer a potentially rich resource for future investigation in this field.
b. The alternative (sequential) lineup procedure was introduced as part of an effort to improve eyewitness performance. While, as noted above, it remains unclear whether the procedure has improved eyewitness performance, that goal is still primary.
In an effort to achieve that goal, many studies over the past three decades have explored the possibility that other factors may also affect performance, but until recently these investigations have not evaluated performance using a discriminability measure. The committee therefore recommends a broad exploration of the effects of different system variables (e.g., additional variants on lineup procedures, witness lineup instructions) and estimator variables (e.g., presence or absence of weapon, elapsed time between incident and identification task, levels of stress) and, importantly, interactions between these variables using either the ROC approach or other tools for evaluation of binary classifiers that can be shown to have advantages over existing analytical methods.
8 The committee stresses, however, that adoption of a more conservative response bias necessitates a compromise by which fewer lineup “picks” are made overall and thus fewer guilty suspects are identified (see Chapter 5).
c. Building upon the committee’s call for a practice- and data-informed research agenda that incorporates input from law enforcement and the courts and establishes methodological and reporting standards for research, the committee recommends that the scientific community engaged in studies of eyewitness identification performance work closely with law enforcement to identify other system and estimator variables that might influence performance and practical issues that might preclude certain strategies for influencing performance.
In addition, the committee recommends that policy decisions regarding changes in procedure should be made on the basis of evidence of superiority and should be made in consultation with police departments to determine which procedure yields the best combination of performance and practicality.
CONCLUSION
Eyewitness identification can be a powerful tool. As this report indicates, however, the malleable nature of human visual perception, memory, and confidence; the imperfect ability to recognize individuals; and policies governing law enforcement procedures can result in mistaken identifications with significant consequences. New law enforcement training protocols, standardized procedures for administering lineups, improvements in the handling of eyewitness identification in court, and better data collection and research on eyewitness identification can improve the accuracy of eyewitness identifications.
Appendixes
Appendix A
Biographical Information of Committee and Staff
CO-CHAIRS
Thomas D. Albright, Ph.D., (NAS) is Professor and Conrad T. Prebys Chair in Vision Research at the Salk Institute for Biological Studies, where he joined the faculty in 1986. Dr. Albright is also Director of the Salk Institute Center for the Neurobiology of Vision, Adjunct Professor of Psychology and Neurosciences at the University of California, San Diego, and Visiting Centenary Professor at the Indian Institute of Science, Bangalore. Dr.
Albright is an authority on the neural basis of visual perception, memory, and visually guided behavior. Probing the relationship between the activity of brain cells and perceptual state, his laboratory seeks to understand how visual perception is affected by attention, behavioral goals, and memories of previous experiences. His discoveries address the ways in which context influences visual perceptual experience and the mechanisms of visual associative memory and visual imagery. An important goal of this work is the development of therapies for blindness and perceptual impairments resulting from disease, trauma, or developmental disorders of the brain. A second aim of Dr. Albright’s work is to use our growing knowledge of brain, perception, and memory to inform design in architecture and the arts, and to leverage societal decisions and public policy.
Albright received a Ph.D. in psychology and neuroscience from Princeton University in 1983. He is a recipient of numerous honors for his work, including the National Academy of Sciences Award for Initiatives in Research. Dr. Albright is a member of the National Academy of Sciences, a fellow of the American Academy of Arts and Sciences, a fellow of the American Association for the Advancement of Science, and an associate of the Neuroscience Research Program. He is currently president of the Academy of Neuroscience for Architecture; a member of the National Academy of Sciences Committee on Science, Technology, and Law; and serves on the Scientific Advisory Committee for the Indian National Brain Research Center.
Jed S. Rakoff, J.D., has been a United States District Judge for the Southern District of New York since 1996. Prior to his appointment, he was a federal prosecutor (1973–1980) and a criminal defense lawyer at two large New York law firms (1980–1995).
Judge Rakoff is coauthor of 5 books and the author of more than 110 published articles, 500 speeches, and 1,200 judicial opinions. He has been an Adjunct Professor at Columbia Law School since 1988, teaching upper class seminars in science and the law, class actions, white collar crime, and the interplay of civil and criminal law.
Judge Rakoff is a Commissioner on the National Commission on Forensic Science and is a former member of the Governance Board of the MacArthur Foundation Initiative on Law and Neuroscience. He was a member of the National Research Council Committee on the Development of the Third Edition of the Reference Manual on Scientific Evidence and the Committee on the Review of the Scientific Approaches Used During the FBI’s Investigation of the 2001 Bacillus anthracis Mailings. He is a member of the American Academy of Arts and Sciences and the American Law Institute. He is a Judicial Fellow at the American College of Trial Lawyers, a former director of the New York Council of Defense Lawyers, and former chair of the Criminal Law Committee, New York City Bar Association. Judge Rakoff received a B.A. from Swarthmore College in 1964, an M.Phil. from Oxford University in 1966, and a J.D. from Harvard Law School in 1969.
MEMBERS
William G. Brooks III is the Chief of the Norwood, Massachusetts Police Department. He began his tenure on May 1, 2012. He served as the Deputy Chief with the Wellesley Police Department from 2000 to 2012. As Deputy Chief, Brooks was involved in hiring, discipline, administration, budgeting, training, and multi-agency coordination. Prior to 2000, he served as a patrolman with the Westwood Police Department from 1977 to 1982 and as an officer with the Norwood Police Department from 1982 to 2000. In Norwood, he served as a patrolman and sergeant and as a detective sergeant for 14 years, supervising all criminal investigations conducted by detectives.
Chief Brooks has been a police academy instructor for 30 years and a presenter on eyewitness identification for 6 years. He presents nationally on behalf of the Innocence Project, is a member of the Massachusetts Supreme Judicial Court’s Study Committee on Eyewitness Identification, and was the 2012 recipient of the Innocence Network’s Champion of Justice Award. Chief Brooks holds a master’s degree in criminal justice and is a graduate of the FBI National Academy.
Joe S. Cecil, Ph.D., J.D., is a Project Director in the Division of Research at the Federal Judicial Center. Currently, he is directing the Center’s Program on Scientific and Technical Evidence. As director, Dr. Cecil is responsible for judicial education and training in the area of scientific and technical evidence and served as principal editor of the first two editions of the Center’s Reference Manual on Scientific Evidence, which is the primary source book on evidence for federal judges. He also has published several articles on the use of court-appointed experts. Dr. Cecil is currently directing a research project that examines the difficulties that arise with expert testimony in federal courts, with an emphasis on clinical medical testimony and forensic science evidence. Other areas of research interest include federal civil and appellate procedure, jury competence in complex civil litigation, and assessment of rule of law in emerging democracies. Dr. Cecil serves on the editorial boards of social science and legal journals. He previously served on the National Academies’ Panel on Confidentiality and Data Access and the Committee on Identifying the Needs of the Forensic Sciences Community.
He currently is a member of the National Academy of Sciences’ Committee on Science, Technology, and Law and was a member of its Access to Research Data: Balancing Risks and Opportunities subcommittee. Dr. Cecil received his doctorate (in psychology) and law degree from Northwestern University.
Winrich Freiwald, Ph.D., is Assistant Professor, Laboratory of Neural Systems, The Rockefeller University. Dr. Freiwald is interested in the neural processes that form object representations as well as those that allow attention to make those representations available for social behavior and cognition. Dr. Freiwald co-discovered a specialized neural machinery for face processing located in the temporal and frontal lobes of the brain. He and his colleagues further showed that this machinery is composed of a small network of a fixed number of face selective regions, termed face patches, each dedicated to a different aspect of face processing and all closely connected with each other. Dr. Freiwald’s laboratory aims to understand the inner workings of this system, from the level of individual cells to the interactions of brain areas, in order to answer questions such as: How does face selectivity emerge in a single cell? How is information transformed from one face patch to another? What is the contribution of each face patch to different face recognition abilities like the recognition of a friend or a smile? How do the different face patches interact in different tasks? And how is information extracted from a patch when a perceptual decision is made?
Dr. Freiwald, a native of Oldenburg, Germany, performed his graduate work at the Max Planck Institute for Brain Research in Frankfurt and received his Ph.D. from Tübingen University in 1998. He then joined the Institute for Brain Research at the University of Bremen as a lecturer.
Starting in 2001, he worked as a postdoctoral fellow at the Massachusetts Institute of Technology, Massachusetts General Hospital, Harvard Medical School, and the Hanse Institute for Advanced Study in Delmenhorst, Germany. He was head of the primate brain imaging group at the Centers for Advanced Imaging and Cognitive Sciences in Bremen from 2004 to 2008 and a visiting associate at the California Institute of Technology in 2009. He joined The Rockefeller University as assistant professor in 2009. Dr. Freiwald was named a Pew Scholar in 2010, a McKnight Scholar in 2011, and a NYSCF-Robertson Neuroscience Investigator in 2013.
Brandon L. Garrett is the Roy L. and Rosamond Woodruff Morgan Professor of Law at the University of Virginia Law School. Garrett joined the law faculty in 2005. His research and teaching interests include criminal procedure, wrongful convictions, habeas corpus, corporate crime, scientific evidence, civil rights, civil procedure, and constitutional law.
Mr. Garrett’s recent research includes studies of DNA exonerations, organizational prosecutions, and eyewitness identification procedures in Virginia. In 2011, Harvard University Press published Mr. Garrett’s book, Convicting the Innocent: Where Criminal Prosecutions Go Wrong, examining the cases of the first 250 people to be exonerated by DNA testing. In 2013, Foundation Press published his co-authored casebook, Federal Habeas Corpus: Executive Detention and Post-Conviction Litigation. Mr. Garrett is currently completing a new book, in contract with Harvard University Press, examining corporate prosecutions.
Mr. Garrett attended Columbia Law School, where he was an articles editor of the Columbia Law Review and a Kent Scholar. After graduating, he clerked for the Honorable Pierre N. Leval of the United States Court of Appeals for the Second Circuit. He then worked as an associate at Neufeld, Scheck & Brustin LLP in New York City.
Karen Kafadar, Ph.D., is Commonwealth Professor and Chair of Statistics at the University of Virginia. Dr. Kafadar received her B.S. in mathematics and M.S. in statistics at Stanford University and her Ph.D. in statistics from Princeton University. Before joining the Statistics Department in 2014, she was Mathematical Statistician at the National Institute of Standards and Technology, member of the technical staff at Hewlett Packard’s RF/Microwave R&D Department, Fellow in the Division of Cancer Prevention at National Cancer Institute, Professor and Chancellor’s Scholar at University of Colorado-Denver, and Rudy Professor of Statistics at Indiana University-Bloomington. Her research focuses on robust methods, exploratory data analysis, characterization of uncertainty in the physical, chemical, biological, and engineering sciences, and methodology for the analysis of screening trials, with awards from CDC, American Statistical Association (ASA), and American Society for Quality.
Kafadar was editor of Technometrics and the review section of the Journal of the American Statistical Association and is currently Biology, Medicine, and Genetics Editor for The Annals of Applied Statistics. She has served on several National Research Council committees and is a past or present member on the governing boards for ASA, Institute of Mathematical Statistics, International Statistical Institute, and National Institute of Statistical Sciences. She is a Fellow of the ASA, the American Association for the Advancement of Science, and the International Statistical Institute; she has authored more than 100 journal articles and book chapters; and has advised numerous M.S. and Ph.D. students.
A.J. Kramer, J.D., is Federal Public Defender for the District of Columbia.
He earned a Bachelor of Arts from Stanford University (1975), followed by a Juris Doctorate from the Boalt Hall School of Law at the University of California at Berkeley (1979). Mr. Kramer clerked for the Honorable Procter Hug, Jr., at the United States Court of Appeals for the Ninth Circuit in Reno, Nevada. He spent seven years as an Assistant Federal Public Defender in San Francisco, California, followed by three years as the Chief Assistant Federal Public Defender in Sacramento, California. He taught legal research and writing at Hastings College of the Law, University of California, San Francisco from 1982 to 1988. Mr. Kramer was appointed Federal Public Defender for the District of Columbia in 1990.
A permanent faculty member at the National Criminal Defense College in Macon, Georgia, and at the Western Trial Advocacy Institute in Laramie, Wyoming, Mr. Kramer is a Fellow of the American College of Trial Lawyers. He is currently a member of the American Bar Association Criminal Justice Section Council and a member of the United States Judicial Conference Advisory Committee on the Rules of Evidence.
Scott McNamara, J.D., graduated from Syracuse University with a major in mathematics. Mr. McNamara attended Vermont Law School, graduating cum laude in 1991. On July 20, 1992, he became an Oneida County Assistant District Attorney. As such, he handled thousands of cases with a concentration in narcotic and homicide prosecutions. McNamara was the Bureau Chief of the Narcotics Unit for twelve years, and he was also the First Assistant District Attorney for six years. During his years in the District Attorney’s Office, he was a member and the lead prosecutor assigned to the Oneida County Drug Task Force. He also chaired the Oneida County District Attorney’s Office Death Penalty Committee. From 2001 to 2006, Mr.
McNamara represented the District Attorney’s Office on the Joint Terrorism Task Force. In January of 2007, Mr. McNamara took office as the Oneida County District Attorney and has since been elected, and re-elected, by the citizens of Oneida County. His tenure as District Attorney has been one of proactive engagement and problem-solving. He has created an Economic Crime Unit, a Conviction Integrity Unit, and he has appointed a community liaison to improve communication and accessibility between the District Attorney’s Office and the diverse population it serves. In addition, Mr. McNamara initiated a strategy of video recording all police interrogations in Oneida County. He has always maintained that his goal as the county’s chief law enforcement officer is to continue the legacy of bringing justice to those victimized by crime while recognizing the need to safeguard and enhance fairness within the legal system.
For 10 years, Mr. McNamara taught search and seizure at the Mohawk Valley Police Academy. He was also an adjunct instructor at Mohawk Valley Community College, where he taught both criminal law and constitutional criminal procedural law. McNamara currently is an adjunct instructor at Utica College, where he teaches legal concepts of criminal fraud.
Charles Alexander Morgan III, M.D., is Associate Clinical Professor of Psychiatry, Yale University School of Medicine. Over the course of twenty years at Yale University and the Neurobiological Studies Unit of the National Center for Posttraumatic Stress Disorder, Dr. Morgan’s neurobiological and forensic research has established him as an international expert in posttraumatic stress disorder (PTSD), in eyewitness memory, and in human performance under conditions of high stress. He is a forensic psychiatrist and has testified as an expert on memory and PTSD at the International Tribunal on War Crimes, the Hague, Netherlands. Dr. Morgan is a subject matter expert in the selection and assessment of U.S.
Military Special Operations and Special Mission Units. His work has provided insight into the psycho-neurobiology of resilience in elite soldiers and has contributed to the training mission of U.S. Army special programs. For his work in the special operations community, Dr. Morgan was awarded the U.S. Army Award for Patriotic Service in 2008. In 2010, Dr. Morgan was awarded the Sir Henry Wellcome Medal and Prize for his research on enhancing cognitive performance under stress in special operations personnel. In 2011, Dr. Morgan deployed to Afghanistan as an operational advisor with the Asymmetric Warfare Group.
Elizabeth A. Phelps, Ph.D., is Silver Professor of Psychology and Neural Science at New York University. Her research examines the cognitive neuroscience of emotion, learning, and memory. Her primary focus has been to understand how human learning and memory are changed by emotion and to investigate the neural systems mediating their interactions. She has approached this topic from a number of different perspectives, with an aim of achieving a more global understanding of the complex relations between emotion and memory. As much as possible, Dr. Phelps has tried to let the questions drive the research, not the techniques or traditional definitions of research areas. Dr. Phelps has used a number of techniques (behavioral studies, physiological measurements, brain-lesion studies, fMRI) and has collaborated with a number of people in other domains (social and clinical psychologists, psychiatrists, neuroscientists, economists, physicists). Dr. Phelps received a Ph.D. in neuroscience from Princeton University.
Daniel J. Simons, Ph.D., is a professor in the department of psychology at the University of Illinois, where he heads the Visual Cognition Laboratory.
His research explores the limits of awareness and memory, the reasons why we often are unaware of those limits, and the implications of such limits for our personal and professional lives. He is best known for his research that demonstrates how people are far less aware of their visual surroundings than they think. Dr. Simons received his B.A. from Carleton College and his Ph.D. in experimental psychology from Cornell University. He then spent 5 years on the faculty at Harvard University before being recruited to Illinois in 2002. He has published more than 50 articles for professional journals, and his work has been supported by the National Institutes of Health, the National Science Foundation, and the Office of Naval Research. He is a Fellow and Charter Member of the Association for Psychological Science and an Alfred P. Sloan Fellow, and he has received many awards for his research and teaching, including the 2003 Early Career Award from the American Psychological Association. His research adopts methods ranging from real-world and video-based approaches to computer-based psychophysical techniques, and it includes basic behavioral measures, survey and individual difference methods, simulator studies, and training studies. This diversity of approaches helps establish closer links between basic research on the mechanisms of attention, perception, memory, and awareness and how those mechanisms operate in the real world.
In addition to his scholarly research, Dr. Simons is the co-author (with Christopher Chabris) of the New York Times bestselling book, The Invisible Gorilla. He has penned articles for the New York Times, the Wall Street Journal, the Los Angeles Times, and the Chicago Tribune (among others), and he appears regularly on radio and television.
Anthony D.
Wagner, Ph.D., is a Professor of Psychology and Neuroscience and Co-Director, Center for Cognitive and Neurobiological Imaging, Stanford University. He is also Director of the Stanford Memory Laboratory. At Stanford since 2003, Dr. Wagner’s research explores how the brain supports learning, memory, and executive function. In addition to his basic science, his research examines memory dysfunction in clinical populations and the role of neuroscience evidence in legal and educational settings. He is on the faculty in the Psychology Department and participates in the Neurosciences Program, the Symbolic Systems Program, the Human Biology Program, and the Stanford Center for Longevity. Externally, he is a member of the MacArthur Foundation’s Research Network on Law and Neuroscience. He is a Fellow of the American Association for the Advancement of Science, and a recipient of the American Psychological Association’s Distinguished Scientific Award for Early Career Contribution, among other honors. Dr. Wagner received a Ph.D. in psychology from Stanford University in 1997.

Joanne Yaffe, Ph.D., is Professor, College of Social Work, University of Utah and Adjunct Professor of Psychiatry, College of Medicine, University of Utah. Her scholarly interests are in evidence-based practice and using scientific knowledge for policy and practice decisions. She is particularly interested in the synthesis of research through systematic reviews and meta-analysis, and, with colleagues in the United Kingdom, was funded by the Cochrane Collaboration to develop guidelines for reporting systematic reviews without included studies. She is affiliated with the Social Welfare Coordinating Group and the Knowledge Translation Group of the Campbell Collaboration and has worked with the Methods Group of the Cochrane Collaboration. Dr.
Yaffe is a member of the International Advisory Group for CONSORT-SPI, which has developed guidelines for the reporting of randomized trials for complex social and psychological interventions. Dr. Yaffe received a B.S. in Psychology from University of Massachusetts, an M.S.W. from the University of Michigan, and a Ph.D. in Social Work and Psychology from the University of Michigan. She has advanced training in systematic reviews and meta-analysis.

STAFF

Anne-Marie Mazza, Ph.D., is the Director of the Committee on Science, Technology, and Law. Dr. Mazza joined the National Academies in 1995. She has served as Senior Program Officer with both the Committee on Science, Engineering and Public Policy and the Government-University-Industry Research Roundtable. In 1999, she was named the first director of the Committee on Science, Technology, and Law, a newly created activity designed to foster communication and analysis among scientists, engineers, and members of the legal community. Dr. Mazza has been the study director on numerous Academy reports including Reference Manual on Scientific Evidence, 3rd Edition (2011); Review of the Scientific Approaches Used During the FBI’s Investigation of the 2001 Anthrax Letters (2011); Managing University Intellectual Property in the Public Interest (2010); Strengthening Forensic Science in the United States: A Path Forward (2009); Science and Security in A Post 9/11 World (2007); Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health (2005); and Intentional Human Dosing Studies for EPA Regulatory Purposes: Scientific and Ethical Issues (2004). Between October 1999 and October 2000, Dr.
Mazza divided her time between the National Academies and the White House Office of Science and Technology Policy, where she served as a Senior Policy Analyst responsible for issues associated with a Presidential Review Directive on the government-university research partnership. Before joining the Academy, Dr. Mazza was a Senior Consultant with Resource Planning Corporation. She is a fellow of the American Association for the Advancement of Science. Dr. Mazza was awarded a B.A., M.A., and Ph.D. from The George Washington University.

Arlene F. Lee, J.D., is the Board Director for the Committee on Law and Justice (CLAJ). Prior to joining CLAJ, Ms. Lee was the Director of Policy at the Center for the Study of Social Policy, where she focused on helping federal and state elected officials develop research-informed policies and funding to improve results for children and families. In this capacity, she oversaw PolicyforResults.org, a leading national resource for results-based policy. Previously she was the Executive Director of the Maryland Governor’s Office for Children, where she chaired the Children’s Cabinet and was responsible for the cabinet’s fund of 60+ million dollars annually. She has served as the Deputy Director of the Georgetown University Center for Juvenile Justice Reform, Director of the Federal Resource Center for Children of Prisoners, and Youth Strategies Manager for the Governor’s Office of Crime Control and Prevention. Ms. Lee is also the author of numerous articles and coauthored The Impact of the Adoption and Safe Families Act on Children of Incarcerated Parents. She has a B.A. in Sociology from Washington College and a J.D. from Washington College of Law, American University. As a result of her work, Ms. Lee was named one of Maryland’s Top 100 Women and has received three Governor’s Citations.

Steven Kendall, Ph.D., is Program Officer for the Committee on Science, Technology, and Law. Dr.
Kendall has contributed to numerous Academy reports including the Reference Manual on Scientific Evidence, 3rd Edition (2011); Review of the Scientific Approaches Used During the FBI’s Investigation of the 2001 Anthrax Letters (2011); Managing University Intellectual Property in the Public Interest (2010); and Strengthening Forensic Science in the United States: A Path Forward (2009). Dr. Kendall received his Ph.D. from the Department of the History of Art and Architecture at the University of California, Santa Barbara, where he wrote a dissertation on 19th century British painting. He received his M.A. in Victorian Art and Architecture at the University of London. Prior to joining the National Research Council in 2007, Dr. Kendall worked at the Smithsonian American Art Museum and The Huntington in San Marino, California.

Karolina Konarzewska is Program Coordinator for the Committee on Science, Technology, and Law. Ms. Konarzewska received a B.A. in Political Science from the College of Staten Island, City University of New York and an M.A. in International Relations, New York University. Prior to joining The National Academies, she worked at various research institutions in Washington, DC, where she covered political and economic issues pertaining to Europe, Russia, and Eurasia.

Appendix B

Committee Meeting Agendas

Meeting 1
Washington, DC

Monday, 2 December 2013

OPEN SESSION

8:00 Continental Breakfast
8:30 Opening Remarks and Introductions
Co-chairs: Thomas D. Albright, Salk Institute for Biological Studies
Jed S. Rakoff, U.S.
District Court for the Southern District of New York
8:45–9:30 Charge to the Committee
Speaker: Anne Milgram, Laura and John Arnold Foundation
9:30–11:00 The Science of Memory: A Dynamic Process
Speakers: Daniel L. Schacter, Harvard University (via videoconference)
John T. Wixted, University of California, San Diego
11:00–11:15 Break
11:15–12:00 Overview of Eyewitness Identification
Speaker: Gary L. Wells, Iowa State University
12:00–1:00 Lunch
1:00–2:30 Meta-Analytical Reviews of System and Estimator Variables
Speakers: Nancy K. Steblay, Augsburg College
Christian A. Meissner, Iowa State University
Kenneth Deffenbacher, University of Nebraska at Omaha
2:30–3:00 Strengths and Weaknesses of Eyewitness Research Methodologies
Speaker: Steven D. Penrod, John Jay College of Criminal Justice
3:00–3:30 General Acceptance of Eyewitness Testimony Research
Speaker: Saul Kassin, John Jay College of Criminal Justice
3:30–3:45 Break
3:45–4:15 Simultaneous and Sequential Lineups
Speaker: Roy S. Malpass, University of Texas at El Paso
4:15–5:15 Perspectives on Eyewitness Identification
Speakers: John Firman, International Association of Chiefs of Police
David LaBahn, Association of Prosecuting Attorneys
Kristine Hamann, National District Attorney’s Association
Barry Scheck, The Innocence Project

Tuesday, 3 December 2013

CLOSED SESSION: 8:00–9:15

OPEN SESSION

9:30–10:15 Police Practices
Speakers: Joseph Salemme, Chicago Police Department
Rob Davis, Police Executive Research Forum
10:15–11:45 Judicial Findings and Recommendations, Including Jury Instructions
Speakers: The Honorable Robert J.
Kane, Supreme Judicial Study Group on Eyewitness Identification (MA)
The Honorable Geoffrey Gaulkin, Special Master, State v. Henderson (NJ)
The Honorable Paul De Muniz, Oregon Supreme Court
The Honorable Barbara Hervey, Texas Court of Criminal Appeals
11:45–12:30 Research on Jury Instructions
Speakers: Shari Seidman Diamond, Northwestern University and American Bar Foundation
David V. Yokum, University of Arizona

CLOSED SESSION: 12:30–2:00

Meeting 2
Washington, DC

Thursday, 6 February 2014

OPEN SESSION

8:30–8:45 Opening Remarks and Introductions
Co-chairs: Thomas D. Albright, Salk Institute for Biological Studies
Jed S. Rakoff, U.S. District Court for the Southern District of New York
8:45–9:30 The Illinois Pilot Program on Sequential Double-Blind Identification Procedures
Speaker: Sheri Mecklenburg, U.S. Department of Justice
9:30–10:15 Face Recognition and Human Identification
Speaker: P. Jonathon Phillips, National Institute of Standards and Technology
10:15–10:30 Break
10:30–11:15 Evaluating Eyewitness Research in Court: Moving from General to Specific Inference
Speaker: John Monahan, University of Virginia
11:15–12:00 Eyewitness Identification from the Perspective of State Attorney Generals
Speaker: Peter Kilmartin, State of Rhode Island
12:00–12:45 Lunch
12:45–1:30 Costs and Benefits of Eyewitness Identification Reforms
Speaker: Steven E.
Clark, University of California, Riverside
1:30–2:30 Misinformation and the Creation of False Memories
Speaker: Elizabeth Loftus, University of California, Irvine (via videoconference)
2:30–3:15 Obtaining Better Descriptive Information: The Use of the Cognitive Interview
Speaker: Ronald Fisher, Florida International University

CLOSED SESSION: 3:30–5:30

Friday, 7 February 2014

CLOSED SESSION: 8:00–2:00

Meeting 3
Washington, DC

Thursday, 24 April 2014

OPEN SESSION

10:30 Welcome
Co-chairs: Thomas D. Albright, Salk Institute for Biological Studies
Jed S. Rakoff, U.S. District Court for the Southern District of New York
10:35–11:30 Photo Arrays in Eyewitness Identification Procedures
Speaker: Karen L. Amendola, Police Foundation

CLOSED SESSION: 11:45–5:00

Friday, 25 April 2014

CLOSED SESSION: 8:30–3:00

Appendix C

Consideration of Uncertainty in Data on the Confidence-Accuracy Relationship and the Receiver Operating Characteristic (ROC) Curve

What has happened is history. What might have happened is science and technology. So what you are really interested in is what might have happened if you could do it all over again.
John W. Tukey, 18 November 1992, in a discussion of assessing the uncertainty in cancer mortality rates at the National Cancer Institute

Both the Receiver Operating Characteristic (ROC) and the confidence–accuracy relationship involve data (usually, as the proportions of participants in a given study that meet some criterion) and hence are subject to various sources of uncertainty, including measurement error, random variations from external conditions, and biases (such as the tendency to respond “conservatively” or “liberally”; see examples of these biases in Chapter 5).
Appendix C focuses on quantification of uncertainty in some of the errors caused by measurement and other random sources. Because the confidence-based ROC curve is justified by an implicit assumption that confidence and accuracy are related, the first section of this appendix discusses the incorporation of uncertainty when assessing the strength of the confidence–accuracy relationship, and the second section does the same for the ROC curve. In what follows, HR denotes the hit rate (or “sensitivity” of a procedure on which the confidence–accuracy relationship or ROC is being constructed), and FAR (or, 1 – specificity; see Chapter 5) denotes the false alarm rate.[1]

CONFIDENCE–ACCURACY RELATIONSHIP

When authors talk about the confidence–accuracy relationship, they usually are referring to a correlation coefficient or to a slope of the line fitted to the points (C, A), where a measure of the eyewitness’s expressed confidence level C is on the x-axis, and a measure of the witness’s accuracy A is on the y-axis. However one measures the significance of the confidence–accuracy relationship (e.g., in either a correlation coefficient or a slope of the line fitted to the [C, A] points), it is important to note that both expressed confidence level (C) and reported accuracy (A) are based on data and thus are subject to uncertainty, both from random and systematic sources of variation and from biases (see, e.g., Chapter 5 for examples of biases and other variables, such as the type of lineup procedure). In this appendix, we consider the effects of uncertainty in only “A” and “C” in assessing the strength of the confidence–accuracy relationship. Ideally, one would repeat the incident multiple times and assess the error in the repetitions.
Unfortunately, such repetition is usually not possible, and one must rely on approximate measures of uncertainty with regard to the (C, A) points. Approaches for characterizing the uncertainty in the confidence–accuracy relationship, using data in the published literature, follow.

Consider the following data:[2]

1) $n_1 = 44$ participants who expressed “Low” confidence (confidence ratings 1,2,3); their overall accuracy was stated as 61%. Taking the median of these three confidence ratings, $C_1 = 2$ and $A_1 = 0.61$. The estimated standard error of this proportion is $(0.61 \cdot 0.39/44)^{1/2} = 0.0735$.

[1] The data cited here are used for convenience, as the source publications provided sufficient details about the illustrations.
[2] These data are cited in H. L. Roediger III, J. T. Wixted, and K. A. DeSoto, “The Curious Complexity Between Confidence and Accuracy in Reports from Memory,” in Memory and Law, ed. L. Nadel and W. P. Sinnott-Armstrong (Oxford: Oxford University Press, 2012), p. 109, who in turn cite Odinot, Wolters, and van Koppen [G. Odinot, G. Wolters, and P. J. van Koppen, “Eyewitness Memory of a Supermarket Robbery: A Case Study of Accuracy And Confidence after 3 Months,” Law and Human Behavior 33: 506–514 (2009)] as the source of these data, from nine “central witnesses” (five other witnesses were not interviewed by the police). The sample sizes (44, 203, 326) apparently arise from having “averaged across different categories (person descriptions, object descriptions, and action details) for the nine central witnesses interviewed in that study”; see J. T. Wixted et al., “Confidence Judgments Are Useful in Eyewitness Identifications: A New Perspective,” submitted to Applied Psychology 2014, p. 17.

2) $n_2 = 203$ participants who expressed “Medium” confidence (confidence ratings 4,5,6); their overall accuracy was stated as 71%.
Taking the median of these three confidence ratings, $C_2 = 5$ and $A_2 = 0.71$. The estimated standard error of this proportion is $(0.71 \cdot 0.29/203)^{1/2} = 0.0318$.

3) $n_3 = 326$ participants who expressed “High” confidence (confidence rating 7); their overall accuracy was stated as 85%. Thus, $C_3 = 7$ and $A_3 = 0.85$. The estimated standard error of this proportion is $(0.85 \cdot 0.15/326)^{1/2} = 0.0198$.

A plot of these three data points might suggest a highly convincing relationship between accuracy and confidence. However, the relationship is not “statistically significant” when assessed via a weighted linear regression (where weights are inversely proportional to either the standard errors or the variances), nor via an unweighted Pearson correlation coefficient or a Spearman’s rank correlation coefficient (which depends less on the assignment of “Low,” “Medium,” and “High” as 2, 5, 7, respectively, than do the other two methods). Separate tests comparing the proportions 0.85 (“High”) versus either 0.71 (“Medium”) or 0.61 (“Low”) are “statistically significant,” but not the test for comparing the proportions 0.71 (“Medium”) and 0.61 (“Low”). Statistical significance is difficult to achieve with only three data points. Moreover, none of these tests takes into account the potential for error in the self-reported “C” values (2,5,7), which, as discussed in the previous paragraph, is likely to exist.

Consider a second set of data, reported in Juslin, Olsson, and Winman.[3] In this article, the authors considered two lineup conditions, denoted as “suspect-similarity” and “culprit-description.” The authors correctly note that the identification rates at each expressed confidence level for these two conditions are very similar; hence, as the condition had no effect on identification accuracy, one might as well pool “successes/trials” across the two conditions to reduce the uncertainty in each of the accuracy rates and thus gain greater power.
Even after combining the two conditions, however, the numbers of trials in the 10 ECL categories (0.1 = “10% confident,” 0.2 = “20% confident” ... 1.0 = “100% confident”) are not very high (the 10 numbers range from 7 for ECL = 20% to 45 for ECL = 90%). To increase the chances of seeing a meaningful relationship between confidence and accuracy, the authors pool 0.1 with 0.2, 0.3 with 0.4, 0.5 with 0.6, 0.7 with 0.8, and 0.9 with 1.0.

[3] P. Juslin, N. Olsson, and A. Winman, “Calibration and Diagnosticity of Confidence in Eyewitness Identification: Comments on What Can Be Inferred from the Low Confidence-Accuracy Correlation,” Journal of Experimental Psychology: Learning, Memory, and Cognition 22(5): 1304–1316 (September 1996).

Although Table 2 in Juslin, Olsson, and Winman provides the counts (numbers of trials), it does not tabulate the accuracies (numbers of correct responses). One can estimate these accuracies by weighted averages of the displayed percentages shown in the plots in their Figure 2[4] for the “suspect-similarity condition” (“A” = 0.27, 0.38, 0.51, 0.55, 0.87; n = 15, 21, 25, 29, 51) and for the “culprit-description condition” (“A” = 0.18, 0.66, 0.63, 0.90, 0.91; n = 10, 18, 28, 41, 37). In the confidence level categories (15%, 35%, 55%, 75%, 95%), the pooled accuracies (each followed in parentheses by its standard error and the total sample size on which it is based) are, respectively, 23.4% (8.5%, n = 25), 50.9% (8.0%, n = 39), 52.6% (6.9%, n = 53), 75.5% (5.1%, n = 70), and 88.7% (3.4%, n = 88).
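The pooling step just described can be sketched in a few lines. This is a minimal illustration that assumes the per-condition accuracies and counts exactly as printed above; the function name `pool` is hypothetical, and the standard error uses the usual binomial formula for a proportion. With the inputs as printed here, four of the five pooled accuracies reproduce the reported values, while the third category computes to about 57.3% rather than the reported 52.6%; that may reflect a transcription difference in one of the printed inputs, which the sketch does not attempt to resolve.

```python
import math

# Accuracies and trial counts per pooled confidence category, as printed in the text.
similarity_acc = [0.27, 0.38, 0.51, 0.55, 0.87]   # suspect-similarity condition
similarity_n = [15, 21, 25, 29, 51]
description_acc = [0.18, 0.66, 0.63, 0.90, 0.91]  # culprit-description condition
description_n = [10, 18, 28, 41, 37]
confidence_levels = [0.15, 0.35, 0.55, 0.75, 0.95]


def pool(acc_a, n_a, acc_b, n_b):
    """Return (pooled accuracy, binomial standard error, total n) for two conditions."""
    n = n_a + n_b
    p = (acc_a * n_a + acc_b * n_b) / n   # trial-weighted average of the two proportions
    se = math.sqrt(p * (1.0 - p) / n)     # standard error of a binomial proportion
    return p, se, n


pooled = [
    pool(a, na, b, nb)
    for a, na, b, nb in zip(similarity_acc, similarity_n, description_acc, description_n)
]

for level, (p, se, n) in zip(confidence_levels, pooled):
    print(f"confidence {level:.2f}: accuracy {p:.3f}, standard error {se:.3f}, n = {n}")
```

The reported standard errors (8.5%, 8.0%, 6.9%, 5.1%, 3.4%) follow from the same binomial formula applied to the reported accuracies and sample sizes.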
For these data, both the unweighted correlation coefficient, 0.9766 (t-statistic = 7.865, p-value 0.004), and the slope of the weighted linear regression (points weighted inversely proportional to their standard errors), 0.773 (standard error 0.085, p-value 0.003), are statistically significant, in that such convincing data of a relationship between confidence and accuracy would be unlikely to arise if, in fact, no association existed.

Another method for assessing the significance of the unweighted correlation is through the simulation of a large number of trials on the basis of the data that were observed. For each trial, one can first simulate five confidence values, uniformly distributed between the endpoints that were observed: $c_1$ is uniformly distributed on (0.05, 0.25) (mean is the observed 0.15); $c_2$ is uniformly distributed on (0.25, 0.45) (mean is the observed 0.35); ... $c_5$ is uniformly distributed on (0.85, 1.00). Next, one simulates five proportions using the observed conditions: $a_1$ is a binomial variate (n = 25, p = 0.234) divided by n = 25; $a_2$ is a binomial variate (n = 39, p = 0.509) divided by n = 39; ... $a_5$ is a binomial variate (n = 88, p = 0.887) divided by n = 88. For each trial with five simulated c values and their five corresponding a values, one calculates a Pearson correlation coefficient. Figure C-1 shows a plot of the five data points, with limits of one standard error on the estimated accuracies (left panel) and the histogram of the 1,000 simulated Pearson correlation coefficients (right panel). The median is 0.9534 (close to the observed 0.9766), the upper and lower quartiles are 0.916 and 0.977, and the central 90% of the 1,000 values lie between 0.8650 and 0.993. Thus, an approximate 90% confidence interval for the true correlation coefficient (0.865, 0.993) definitely does not include zero, a further indication of the significance of the Pearson correlation coefficient.
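The correlation and the simulation described above can be sketched as follows. This is an illustrative reconstruction, not the committee's code: the helper names are hypothetical, the two middle uniform intervals (0.45–0.65 and 0.65–0.85) are assumed from the pattern of the intervals that the text states explicitly, and the random seed is arbitrary, so the simulated quantiles will be close to, but not identical with, the published ones.

```python
import math
import random
import statistics

conf = [0.15, 0.35, 0.55, 0.75, 0.95]       # midpoints of the pooled confidence categories
acc = [0.234, 0.509, 0.526, 0.755, 0.887]   # reported pooled accuracies
n_per_cat = [25, 39, 53, 70, 88]            # reported pooled sample sizes
# Uniform ranges for simulated confidence; the 3rd and 4th are assumed from the pattern.
bounds = [(0.05, 0.25), (0.25, 0.45), (0.45, 0.65), (0.65, 0.85), (0.85, 1.00)]


def pearson(x, y):
    """Unweighted Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Observed correlation and its t-statistic on (number of points - 2) degrees of freedom.
r_obs = pearson(conf, acc)
t_obs = r_obs * math.sqrt(len(conf) - 2) / math.sqrt(1.0 - r_obs ** 2)


def simulate_once(rng):
    """One simulated trial: five uniform confidences and five binomial proportions."""
    c = [rng.uniform(lo, hi) for lo, hi in bounds]
    a = [sum(rng.random() < p for _ in range(n)) / n for p, n in zip(acc, n_per_cat)]
    return pearson(c, a)


rng = random.Random(0)                      # arbitrary seed, for repeatability only
sims = sorted(simulate_once(rng) for _ in range(1000))
median_r = statistics.median(sims)
lo90, hi90 = sims[50], sims[949]            # bounds of the central 90% of simulated values

print(f"observed r = {r_obs:.4f}, t = {t_obs:.3f}")
print(f"simulated median = {median_r:.4f}, central 90% = ({lo90:.3f}, {hi90:.3f})")
```

Drawing each binomial count as a sum of Bernoulli draws keeps the sketch within the standard library; any binomial sampler would serve equally well.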
The example above illustrates the importance of incorporating known uncertainty in the estimated accuracy for the confidence level category. The relationship between confidence and accuracy should take into account (1) the repeated responses of a limited number of “eyewitnesses” in the study and (2) the uncertainty in an eyewitness’s “expressed confidence level.” The 2009 National Research Council report, Strengthening Forensic Science in the United States: A Path Forward, cited studies in which fingerprint examiners reached different conclusions when presented with exactly the same evidence at a later time.[5]

[4] See pages 1310–1311 of Juslin, Olsson, and Winman for the data in their Table 2 and Figure 2, respectively.
[5] National Research Council, Strengthening Forensic Science in the United States: A Path Forward (Washington, DC: The National Academies Press, 2009), p. 139.

FIGURE C-1 Data Inferred from Juslin, Olsson, and Winman.
NOTE: Adapted from Juslin, Olsson, and Winman, “Calibration and Diagnosticity of Confidence in Eyewitness Identification.” The left panel plots confidence-accuracy data from p. 1311. Data are pooled into five categories; accuracies are inferred from p. 1313. Data are shown with limits of one standard error and weighted least squares regression line. The right panel is a histogram of 1000 simulated Pearson correlation coefficients, using data from the 5 categories shown in the left panel. The central 90% of the simulated values lie between 0.853 and 0.993, indicating that the true unweighted Pearson correlation coefficient is significantly different from zero. Courtesy of Karen Kafadar.

Quite possibly, in many of these laboratory studies on which these confidence–accuracy relationships are based, participants
may express different levels of confidence if presented with exactly the same set of circumstances and procedures 6 months later.

The existing literature varies in its assessment of the significance of the confidence–accuracy relationship, with some articles suggesting a very strong relationship and many others suggesting that the relationship is weak or nonexistent. The lack of significance in the confidence–accuracy relationship may result from other factors not taken into account. For example, Smalarz and Wells suggest that restricting the plot to only those data corresponding to “choosers” may strengthen the relationship.[6] Other factors that might affect the relationship include the presence or absence of a weapon, the level of stress during the incident, and the length of exposure to the perpetrator. Roediger and colleagues state that “the simple assumption usually made that confidence and accuracy are always tightly linked is wrong…the relation between confidence and accuracy depends on the method of analysis, on the target material being remembered, on who is doing the remembering, and (in situations where memory is tested by recognition) on the nature of the lures and distractors. In addition, there is more than one way to measure the relationship between confidence and accuracy, and not every way is equally relevant to what courts of law would like to know about the issue.”[7]

Studies that incorporate numerous variables, as well as soliciting a confidence statement at various times (e.g., immediately, or 10 minutes after the incident, or 1 hour after the incident), would be valuable.

RECEIVER OPERATING CHARACTERISTIC ANALYSIS

A receiver operating characteristic (ROC) is a reliable, time-honored assessment of test performance. ROC has been used for decades in the medical test diagnostic literature.
Conventionally, as noted in Chapter 5, two procedures were compared using a single diagnosticity ratio: DR = HR/FAR = hit rate/false alarm rate, or sensitivity / (1 – specificity). Wixted and colleagues observed that the diagnosticity ratio, DR, can vary depending on an eyewitness’s ECL and hence proposed the use of an ECL-based ROC curve to compare two lineup procedures (simultaneous versus sequential).[8]

[6] L. Smalarz and G. L. Wells, “Eyewitness Certainty as a System Variable,” in Reform of Eyewitness Identification Procedures, ed. B. L. Cutler (Washington, DC: American Psychological Association, 2013), 161–177.
[7] Roediger, Wixted, and DeSoto, “The Curious Complexity Between Confidence and Accuracy in Reports from Memory.”

The ECL-based ROC curve for a given procedure (e.g., simultaneous) is constructed as follows:

1) Collect participants in a study and subject them to the experimental conditions.
2) For each participant, record whether she or he accurately selected the correct suspect or accurately passed over the filler and the expressed confidence level in the decision.
3) Collect all the responses for participants who answered “100% confident” (say, $n_1$ of them) and record the combined FAR (false alarm rate, or 1 – specificity) and HR (hit rate, or sensitivity) across $n_1$ participants $(\mathrm{FAR}_1, \mathrm{HR}_1)$.
4) Repeat step 3 for all participants who answered “90% confident” (or higher; say, $n_{0.9}$ of them), resulting in the data pair $(\mathrm{FAR}_{0.9}, \mathrm{HR}_{0.9})$.
5) Repeat step 3 for all participants who answered “80% confident” (or higher; say, $n_{0.8}$), resulting in the data pair $(\mathrm{FAR}_{0.8}, \mathrm{HR}_{0.8})$.
6) Continue to repeat step 3 for the groups of participants who answered “70% confident” ... “10% confident” (or higher; say, $n_{0.7}$ ... $n_{0.1}$ of them).
7) Plot the 10 data pairs, $(\mathrm{FAR}_1, \mathrm{HR}_1)$, ..., $(\mathrm{FAR}_{0.1}, \mathrm{HR}_{0.1})$.
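The seven steps above can be sketched as a small routine. Everything in this example is hypothetical: the two lists of (confidence, identification) records are invented for illustration, and the choice of denominators (all target-present lineups for HR, all target-absent lineups for FAR) is an assumption about how the rates are normalized, since the steps do not spell it out.

```python
# Hypothetical records: (expressed confidence in percent, suspect identified?).
# target_present: the culprit is in the lineup, so an identification is a hit.
# target_absent: the culprit is not in the lineup, so an identification is a false alarm.
target_present = [
    (100, True), (100, True), (90, True), (90, False), (80, True),
    (70, True), (60, True), (50, True), (30, False), (20, True),
]
target_absent = [
    (100, False), (90, True), (80, False), (70, False), (60, True),
    (50, False), (40, True), (30, False), (20, False), (10, True),
]


def roc_points(present, absent, levels):
    """Cumulative (level, FAR, HR) triples for confidence thresholds in `levels`."""
    points = []
    for level in levels:
        # Steps 3-6: count identifications made at this confidence level or higher.
        hits = sum(1 for c, identified in present if identified and c >= level)
        false_alarms = sum(1 for c, identified in absent if identified and c >= level)
        # Assumed normalization: divide by all lineups of the corresponding type.
        points.append((level, false_alarms / len(absent), hits / len(present)))
    return points


# Step 7: ten thresholds, from "100% confident" down to "10% confident".
levels = list(range(100, 0, -10))
points = roc_points(target_present, target_absent, levels)

for level, far, hr in points:
    print(f"confidence >= {level:3d}%: FAR = {far:.2f}, HR = {hr:.2f}")
```

Because each threshold includes every response at that level or higher, both FAR and HR can only stay the same or grow as the threshold is lowered, which is what gives the plotted curve its rising shape.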
This plot results in the ROC curve, whose points (FAR, HR) correspond to different ECLs.

The plotted points usually are connected by straight lines, and the slope of the ROC curve at each of those plotted points represents the DR corresponding to that confidence category. The ROC curve illustrates the separate DRs rather than calculating a single DR collapsed across all confidence categories. As with the confidence–accuracy relationship, it is important to recognize the uncertainty in the estimated (FAR, HR) data points. How does the uncertainty in FAR and HR, and hence in the diagnosticity ratio (DR = HR/FAR), translate into uncertainty in the ROC curve?

[8] L. Mickes, H. D. Flowe, and J. T. Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory: Comparing the Diagnostic Accuracy of Simultaneous and Sequential Lineups,” Journal of Experimental Psychology: Applied 18: 361–376 (2012). See especially pp. 362–365 for a description of ROC analysis in the medical literature and applied to the eyewitness identifications.

The effect of uncertainty in estimates of HR, FAR, DR (= HR/FAR) on the ROC curve can be seen by simulating new HR and FAR values, assuming that the observed HR and FAR values are the true “means” of the simulated distributions. As a first example, consider the set of data from Brewer and Wells[9] which is cited by Mickes, Flowe, and Wixted in their Table 1.[10] The data are: HR = (.090, .237, .320, .355, .370); FAR = (.002, .015, .030, .038, .041), leading to five diagnosticity ratios (rounded) DR = (45, 16, 11, 9, 9). The article states that the experiment involved 1,200 participants.
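As a quick check of the arithmetic, the five rounded diagnosticity ratios follow directly from the listed hit rates and false alarm rates. The variable names below are illustrative only.

```python
# Hit rates and false alarm rates as listed in the text (Brewer and Wells data).
hit_rates = [0.090, 0.237, 0.320, 0.355, 0.370]
false_alarm_rates = [0.002, 0.015, 0.030, 0.038, 0.041]

# Diagnosticity ratio at each confidence threshold: DR = HR / FAR.
dr = [hr / far for hr, far in zip(hit_rates, false_alarm_rates)]
dr_rounded = [round(x) for x in dr]

for hr, far, ratio in zip(hit_rates, false_alarm_rates, dr):
    print(f"HR = {hr:.3f}, FAR = {far:.3f}, DR = {ratio:.1f}")
```

The ratio falls steeply as the confidence threshold is relaxed, which is the variation in DR that motivates plotting a full curve instead of reporting a single ratio.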
As above, one can simulate each of the five hit rates and the five false alarm rates, with 4,000 independent trials and 1,200 participants, in such a way that the means of the five distributions of hit rates (HRs) and the means of the five distributions of false alarm rates (FARs) equal the values observed in the experiment [e.g., (0.090, 0.237, 0.320, 0.355, 0.370) for HR and (0.002, 0.015, 0.030, 0.038, 0.041) for FAR], leading to five distributions of 4,000 diagnosticity ratios (HR/FAR). For example, consider simulating 1,200 individuals whose HR is 0.090 = 9.0%. One expects that, on average, about (9%) × 1,200 = 108 of the simulated 1,200 participants will have “hits.” When repeating this trial of 1,200 individuals, the number might be 110, or 95, or some other number around, but usually not exactly, 108. Repeating the trial 4,000 times, one can average the 4,000 numbers (e.g., 108, 110, 95…) and divide by 1,200, yielding a mean simulated HR. The advantage is that one can also use the 4,000 numbers to calculate a standard deviation.[11] One repeats exactly the same exercise for the five FAR rates, yielding a mean FAR and a standard deviation, $SD_{FAR}$.

As noted in Chapter 5, in real life, HR and FAR will be estimated on the same set of 1,200 participants, so the two numbers, HR and FAR, in the five (HR, FAR) pairs, will be correlated. In the simulation, HR and FAR are independent, so the estimated uncertainties are likely to be optimistic; the real uncertainties could well be larger.

One can then plot three sets of points (each set contains five points): (1) (mean HR, mean FAR) (this plot should look qualitatively similar to the one in Figure 6(A) in Mickes, Flowe, and Wixted[12]); (2) (mean HR − $SD_{HR}$, mean FAR − $SD_{FAR}$) [these points should lie somewhat below the points plotted in (1)]; and (3) (mean HR + $SD_{HR}$, mean FAR + $SD_{FAR}$) [these points should lie somewhat above the points plotted in (1)].

[9] N. Brewer and G. L.
Wells, “The Confidence-Accuracy Relationship in Eyewitness Identifi- cation: Effects of Lineup Instructions, Foil Similarity, and Target-Absent Base Rates,” Journal of Experimental Psychology: Applied 12(1): 11–30 (2012) (as cited by Mickes et al., Table 1, p. 367). 10 Mickes, Flowe, and Wixted, p. 367. 11 Or Standard Deviation Hit Rate (SDHR), which also can be obtained from standard formu- las for the standard deviation of the binomial distribution. See G. Snedecor and W. Cochran, Statistical Methods, Sixth Ed. (Ames, Iowa: Iowa State University Press, 1967). 12 Mickes, Flowe, and Wixted, p. 371. Copyright © National Academy of Sciences. All rights reserved. Identifying the Culprit: Assessing Eyewitness Identification APPENDIX C 147 FIGURE C-2 Data from Brewer and Wells. NOTE: Adapted from Brewer and Wells, “The confidence-accuracy relationship in eyewitness identification.” The data are cited by Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory.” Courtesy of Karen Kafadar. Figure C-2 shows bands of one standard error in both HR and FAR, illustrating one source of uncertainty in the ROC curve due to estimating HR and FAR. The same approach to calculating uncertainties was used for the two sets of (HR, FAR) values given by the “simultaneous” and “se- quential” data in Mickes, Flowe, and Wixted, Table 3.13 The text indicates that Experiment 1A used n = 598 participants, so the simulation assumed n = 600. In Figure C-3, “M” refers to “siMultaneous,” and “Q” refers to “seQuential.” Note that the “M” and “Q” points fall roughly in the same pattern as in Mickes, Flowe, and Wixted’s Figure 6A.14 Note the substan- tial overlap in the bands of “one standard deviation” surrounding each of the data points, indicating no “statistically significant” differences between the “M” (simultaneous) and “Q” (sequential) points.15 If one were to take 13 Ibid, p. 372. 14 Ibid, p. 371. 
15 The bands of two standard deviations would overlap even more. Copyright © National Academy of Sciences. All rights reserved. Identifying the Culprit: Assessing Eyewitness Identification 148 APPENDIX C into account the effects of using the same eyewitness in the same study with different responses to different tasks, the variability would be even larger. When the same exercise is repeated for the data in Experiment 2 (n=631), similarly ambiguous results (see Figure C-4) are obtained. As Mickes and colleagues suggest, the differences between simultaneous and sequential are even less impressive, and especially so once bands of one standard errors around the points are shown. These further analyses on these published data sets suggest the follow- ing conclusions. 1) The strength of the confidence-accuracy relationship involves un- certainty in the measures of both A (accuracy) and C (confidence), as well as other factors that can influence the relationship. 2) A ROC curve incorporates more information than a single DR (diagnosticity ratio = HR/FAR) using a third variable [different test thresholds in the medical literature; in the present context, different expressed confidence levels (ECLs); i.e., HR and FAR at FIGURE C-3 Data from Experiment 1A in Mickes, Flowe, and Wixted. NOTE: Adapted from Mickes, Flowe, and Wixted, “Receiver Operating Character- istic Analysis of Eyewitness Memory.” Courtesy of Karen Kafadar. Copyright © National Academy of Sciences. All rights reserved. Identifying the Culprit: Assessing Eyewitness Identification APPENDIX C 149 different expressed confidence levels]. As is true with any data, the data from which a ROC is constructed (FARs, HRs, expressed confidence levels) have uncertainty, and that uncertainty is passed on to the ROC. A comparison of two ROCs without recognizing that uncertainty can be misleading. As with any tool, one must be careful in how one draws inferences when comparing ROC curves. 
3) Other methods for comparing two procedures (in which the outcome is a binary classification such as “identification” / “no identification” of an individual) exist in other literature.16

These analyses considered only the most obvious form of random measurement error. The ROC may be influenced by other sources of bias; these sources are not considered or displayed in the plots shown here (see Chapter 5). Also, the ROC curve takes into consideration only the probability that an eyewitness who makes a positive identification of a suspect has correctly identified the true culprit (positive predictive value); it does not take into consideration the rule-out probability that an eyewitness who fails to make an identification of a suspect has correctly recognized that the suspect is not the true culprit (negative predictive value) (see Chapter 5).

16 See, e.g., T. Hastie, R. Tibshirani, and J.H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (New York: Springer, 2009) for a discussion on classification and evaluation methods of statistical machine learning research.

FIGURE C-4 Data from Experiment 2 in Mickes, Flowe, and Wixted.
NOTE: Adapted from Mickes, Flowe, and Wixted, “Receiver Operating Characteristic Analysis of Eyewitness Memory.” Courtesy of Karen Kafadar.
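The simulation described earlier in this appendix for the Brewer and Wells data can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce Figures C-2 through C-4: the function names are hypothetical, HR and FAR are simulated independently (as the text assumes), both rates use the same 1,200 participants (as the text assumes), and the example run uses fewer than the 4,000 trials described in the text so that a pure-Python loop finishes quickly.

```python
import math
import random

# Observed rates for the five confidence categories (Brewer and Wells data
# as quoted in this appendix).
HR = [0.090, 0.237, 0.320, 0.355, 0.370]
FAR = [0.002, 0.015, 0.030, 0.038, 0.041]
N_PARTICIPANTS = 1200  # stated size of the experiment


def diagnosticity_ratios(hr, far):
    """Return DR = HR / FAR for each confidence category."""
    return [h / f for h, f in zip(hr, far)]


def binomial_sd(p, n):
    """Analytic standard deviation of an observed rate with n participants."""
    return math.sqrt(p * (1.0 - p) / n)


def simulate_rate(p, n, n_trials, rng):
    """Repeat an n-participant experiment n_trials times.

    Returns the mean and sample standard deviation of the simulated rates.
    """
    rates = []
    for _ in range(n_trials):
        hits = sum(1 for _ in range(n) if rng.random() < p)
        rates.append(hits / n)
    mean = sum(rates) / n_trials
    variance = sum((r - mean) ** 2 for r in rates) / (n_trials - 1)
    return mean, math.sqrt(variance)


def roc_bands(hr, far, n, n_trials, seed=0):
    """Return (lower, center, upper) (HR, FAR) points, one triple per category.

    HR and FAR are simulated independently, as in the text.
    """
    rng = random.Random(seed)
    bands = []
    for h, f in zip(hr, far):
        mean_hr, sd_hr = simulate_rate(h, n, n_trials, rng)
        mean_far, sd_far = simulate_rate(f, n, n_trials, rng)
        bands.append((
            (mean_hr - sd_hr, mean_far - sd_far),
            (mean_hr, mean_far),
            (mean_hr + sd_hr, mean_far + sd_far),
        ))
    return bands


if __name__ == "__main__":
    print([round(dr) for dr in diagnosticity_ratios(HR, FAR)])
    # The text uses 4,000 trials; 200 keeps this pure-Python sketch quick.
    for lower, center, upper in roc_bands(HR, FAR, N_PARTICIPANTS, n_trials=200):
        print(lower, center, upper)
```

The three points returned for each confidence category correspond to the three sets of points described in the text: the simulated means and the means shifted down and up by one standard deviation. The analytic `binomial_sd` value should be close to the simulated standard deviation, which is the equivalence noted in footnote 11.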
ALTERNATIVE ANALYSIS TO CONFIDENCE-BASED ROC FOR COMPARING PROCEDURES

As noted in Chapter 5, the diagnosticity ratio [hit rate/false alarm rate = HR/FAR = sensitivity/(1 – specificity)] can depend not only on an eyewitness’ tendency toward “conservative” or “liberal” identification (as measured by expressed confidence level), but also on numerous other factors, including: (1) lineup procedure (e.g., two levels: simultaneous versus sequential); (2) presence or absence of a weapon (two levels; more levels could be considered, such as gun, knife, towel, none); (3) stress (e.g., three levels: high, medium, low); (4) elapsed time between incident and exam (e.g., three levels: 30 min, 2 hours, 1 day); (5) race difference (e.g., two levels: same or different race; or four levels: eyewitness/culprit = white/white; white/non-white; non-white/white; non-white/non-white); (6) participant (e.g., N levels, corresponding to N participants).

If a study is sufficiently large, one could develop a performance metric for each participant in the study corresponding to each of these conditions. For example, one could construct a ROC curve and calculate as the performance metric the logarithm of the area under the curve, or log(AUC), for each person and each condition in the study. One could also use as a performance metric the logarithm of the odds (log odds) of a correct decision; e.g., log(HR/(1-HR)) or log((1-FAR)/FAR). Consider the following approach: Let $y_{ijk\ell mnr}$ denote the log(AUC) or a log odds (or another performance metric) for the rth trial using participant n (n = 1, ..., N) for procedure i, weapon level j, stress level k, time condition ℓ, and cross-race effect m.17 One could write:

$$y_{ijk\ell mnr} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_\ell + \varphi_m + (\alpha\beta)_{ij} + \cdots(\text{interactions})\cdots + \varepsilon_{ijk\ell mnr}$$

where μ represents the overall average log(AUC) or log odds across all conditions, the next six terms reflect the main effects of A (lineup procedure: i = 1 for sequential and i = 2 for simultaneous); B (weapon: j = 1 for presence and j = 2 for absence of weapon); C (stress level: k = 1 for low, k = 2 for medium, k = 3 for high); D (elapsed time between incident and report: ℓ = 1 for 30 minutes, ℓ = 2 for 2 hours, ℓ = 3 for 1 day); E (cross-race effect: m = 1 for same race and m = 2 for different races); F (participant effect: n = 1, 2, ..., N participants); “(interactions)” reflects the joint effect of two or more factors together; and the last term, $\varepsilon_{ijk\ell mnr}$, represents any random error in the rth trial that is not specified from the previous terms (e.g., measurement, “ECL,” multiple trials).

This approach would allow one to separate the effects of the different factors, to assess which factors have the greatest influence on the outcome (here, logarithm of the area under the ROC curve: bigger is better), and to evaluate the importance of these factors relative to variation among “eyewitnesses.” It may be that eyewitnesses are the greatest source of variability, dominating the effects of all other factors. Or it may be that, in spite of person-to-person variability, one or more factors still stand out as having strong influence on the outcome.

17 When the performance metric is a log odds, this model is known as logistic regression; see, e.g., F. Harrell, Regression Modeling Strategies (New York: Springer-Verlag, 2001). A model where the performance metric is log(AUC) was studied by F. Wang and C. Gatsonis. See F. Wang and C. Gatsonis, “Hierarchical Models for ROC Curve Summary Measures: Design and Analysis of Multi-Reader, Multi-Modality Studies of Medical Tests,” Statistics in Medicine 27: 243-256 (2008).
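To make the scale of such a design concrete, the following Python sketch computes the two log-odds performance metrics named above and enumerates the conditions implied by the example factor levels. It is a hedged illustration only: the function names and the dictionary of level labels are hypothetical, and the two-level version of the race factor is assumed.

```python
import math
from itertools import product


def log_odds_hit(hr):
    """log(HR / (1 - HR)): log odds of a hit."""
    return math.log(hr / (1.0 - hr))


def log_odds_correct_rejection(far):
    """log((1 - FAR) / FAR): log odds of avoiding a false alarm."""
    return math.log((1.0 - far) / far)


# Example factor levels taken from the list in the text (labels are illustrative).
LEVELS = {
    "procedure": ["sequential", "simultaneous"],
    "weapon": ["present", "absent"],
    "stress": ["low", "medium", "high"],
    "elapsed_time": ["30 min", "2 hours", "1 day"],
    "race": ["same", "different"],
}

# Every combination of levels is one condition per participant.
CONDITIONS = list(product(*LEVELS.values()))

if __name__ == "__main__":
    print(len(CONDITIONS), "conditions per participant")
    # Using the most liberal Brewer and Wells point quoted earlier as an example.
    print(log_odds_hit(0.370), log_odds_correct_rejection(0.041))
```

With these example levels each participant would need a performance metric in 2 × 2 × 3 × 3 × 2 = 72 conditions, which indicates why the text conditions the approach on a sufficiently large study.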
Note that (1) other covariates could be included, such as age and gender of participant; and (2) the ROC curve need not be defined in terms of expressed confidence level thresholds if a more sensitive measure of response bias (tendency toward “liberal” versus “conservative” identifications) can be developed.

For example, C. A. Carlson and M. A. Carlson18 use partial area under the curve, or pAUC, as a summary measure of the information in an ROC curve (bigger is better), for each of twelve different conditions defined by three factors:

(1) Procedure, three levels: simultaneous (SIM: suspect in position 4), sequential (SEQ2: suspect in position 2), sequential (SEQ5: suspect in position 5);
(2) Weapon focus, two levels: present versus absent;
(3) Distinctive feature, two levels: present versus absent.

The data are provided in their Table 3, along with 95% confidence intervals.19 Because the length of a confidence interval is proportional to the standard error, pAUC values with shorter confidence intervals correspond to smaller standard errors and hence should have higher weights. The logarithms of the reported pAUC values and weights (reciprocals of the lengths of the reported confidence intervals) are given below in Table C-1.

For the Carlson study, the data on all N = 2,675 participants (720 undergraduates and 1,955 SurveyMonkey respondents) were combined, and expressed confidence levels were solicited on a 7-point scale. Variations in the twelve log(pAUC) values can be decomposed into three main effects (one each for procedure, weapon, and feature), and their two-way interactions. (The raw data may permit a more detailed analysis.) The data can be analyzed using a less complex model than that stated above (because the model has fewer terms):

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + \varepsilon_{ijk}$$

where $y_{ijk}$ denotes (5 + log(pAUC)) for procedure i (i = 1, 2, 3), weapon condition j (j = 1, 2), and feature k (k = 1, 2); $\mu$ represents the overall average log(pAUC) across all conditions; $\alpha_i$ represents the effect of procedure i; $\beta_j$ represents the effect of weapon condition j; $\gamma_k$ represents the effect of feature condition k; and the next three terms reflect the three two-factor interactions between the main factors.

18 C. A. Carlson and M. A. Carlson, “An Evaluation of Lineup Presentation, Weapon Presence, and a Distinctive Feature Using ROC Analysis,” Journal of Applied Research in Memory and Cognition 3(2): 45–53 (2014).
19 Ibid., p. 49.

TABLE C-1 Conditions and Logarithms of Reported pAUC Valuesᵃ

Condition  Procedure  Weapon  Feature  5 + log(pAUC)  Weight
1          SIM        Yes     Yes      1.31112        47.6
2          SIM        Yes     No       1.72983        33.3
3          SIM        No      Yes      0.92546        55.6
4          SIM        No      No       1.87643        45.5
5          SEQ2       Yes     Yes      1.49344        47.6
6          SEQ2       Yes     No       1.22774        47.6
7          SEQ2       No      Yes      1.08798        52.6
8          SEQ2       No      No       1.58875        41.7
9          SEQ5       Yes     Yes      1.70316        38.5
10         SEQ5       Yes     No       0.98262        58.8
11         SEQ5       No      Yes      0.65719        66.7
12         SEQ5       No      No       1.49344        55.6

ᵃ Adapted from data on pAUC from Table 3 in C. A. Carlson and M. A. Carlson, “An Evaluation of Lineup Presentation, Weapon Presence, and a Distinctive Feature Using ROC Analysis,” Journal of Applied Research in Memory and Cognition 3(2): 45–53 (2014). The addition of “5” to log(pAUC) is simply to avoid negative numbers; the inferences from the analysis remain unchanged. Courtesy of Karen Kafadar.
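The degrees of freedom implied by this reduced model, and the F-statistics and p-values reported for it in Table C-2, can be checked with a short Python sketch. The sums of squares are copied from Table C-2; everything else (function names, dictionary layout) is hypothetical. The p-value function uses a closed form that holds only when the denominator has exactly 2 degrees of freedom, which is the case here; it is not a general F-distribution routine, and small differences from the table reflect rounding of the published sums of squares.

```python
from math import prod

# Number of levels of each factor in the Carlson and Carlson design.
LEVELS = {"Procedure": 3, "Weapon": 2, "Feature": 2}
N_CONDITIONS = prod(LEVELS.values())  # 12 cells, one log(pAUC) value each

# Sums of squares as reported in Table C-2.
SUM_SQ = {
    ("Procedure",): 8.04,
    ("Weapon",): 2.94,
    ("Feature",): 14.72,
    ("Procedure", "Weapon"): 0.59,
    ("Procedure", "Feature"): 10.41,
    ("Weapon", "Feature"): 34.80,
}
RESIDUAL_SS = 7.12


def term_df(term):
    """Degrees of freedom of a main effect or interaction."""
    return prod(LEVELS[factor] - 1 for factor in term)


def residual_df():
    """Observations minus the grand mean minus all model terms."""
    return N_CONDITIONS - 1 - sum(term_df(term) for term in SUM_SQ)


def f_upper_tail_den2(f_stat, df_num):
    """P(F > f_stat) for df_num numerator and exactly 2 denominator df."""
    x = df_num * f_stat / (df_num * f_stat + 2.0)
    return 1.0 - x ** (df_num / 2.0)


def anova_rows():
    """Return {term name: (df, mean square, F, p)} from the sums of squares."""
    df_res = residual_df()
    if df_res != 2:
        raise ValueError("closed-form p-value assumes 2 residual df")
    ms_res = RESIDUAL_SS / df_res
    rows = {}
    for term, ss in SUM_SQ.items():
        df = term_df(term)
        ms = ss / df
        f_stat = ms / ms_res
        rows[" x ".join(term)] = (df, ms, f_stat, f_upper_tail_den2(f_stat, df))
    return rows


if __name__ == "__main__":
    for name, (df, ms, f_stat, p) in anova_rows().items():
        print(f"{name:22s} df={df} MS={ms:.2f} F={f_stat:.3f} p={p:.3f}")
```

With twelve observations and nine model degrees of freedom plus the grand mean, only two degrees of freedom remain for the residual, which helps explain why none of the effects reaches conventional significance.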
The analysis of variance, where log(pAUC) values are weighted according to the values in the last column of Table C-1, is given in Table C-2 below. None of the factors is significant.20 It must be stressed that the complete set of raw data may yield a more powerful analysis with different results, as might a different summary measure of the ROC curve, such as AUC, or area under the ROC curve.21

20 We can decompose the two degrees of freedom in the sum of squares for Procedure (three levels), 8.04, into two single degree of freedom contrasts, SEQ2 versus SEQ5 (4.14), and SIM versus the average of SEQ2 and SEQ5 (3.90), and consider all pairwise interaction terms among the four “main effects.” All single degree-of-freedom effects remain non-significant, in either this weighted analysis or in an unweighted analysis.
21 For a discussion of the advantages and disadvantages of using AUC versus pAUC as a summary measure, see S. D. Walter, “The Partial Area Under the Summary ROC Curve,” Statistics in Medicine 24(13): 2025–2040 (July 2005).

TABLE C-2 Analysis of Variance Table for log(pAUC)ᵃ

Source of Variation  Degrees of Freedom  Sum of Squares  Mean Square  F-statistic  p-value
Procedure            2                   8.04            4.02         1.129        0.470
Weapon               1                   2.94            2.94         0.826        0.460
Feature              1                   14.72           14.72        4.138        0.179
Procedure×Weapon     2                   0.59            0.30         0.083        0.923
Procedure×Feature    2                   10.41           5.21         1.463        0.406
Weapon×Feature       1                   34.80           34.80        9.780        0.089
Residuals            2                   7.12            3.56

ᵃ Adapted from data on pAUC from Table 3 in C. A. Carlson and M. A. Carlson, “An Evaluation of Lineup Presentation, Weapon Presence, and a Distinctive Feature Using ROC Analysis,” Journal of Applied Research in Memory and Cognition 3(2): 45–53 (2014). Courtesy of Karen Kafadar.

34 LHUMB 241
34 Law & Hum. Behav. 241
© 2011 Thomson Reuters. No Claim to Orig. US Gov. Works.

Law and Human Behavior
June, 2010
Original Article

REPEATED EYEWITNESS IDENTIFICATION PROCEDURES: MEMORY, DECISION MAKING, AND PROBATIVE VALUE

Ryan D. Godfrey, Steven E. Clark [FNa1]

Copyright © 2010 by American Psychology-Law Society/Division 41 of the American Psychological Association; Ryan D. Godfrey, Steven E. Clark
Published online: 8 July 2009
© American Psychology-Law Society/Division 41 of the American Psychological Association 2009

Abstract Two experiments examined the effects of multiple identification procedures on identification responses, confidence, and similarity relationships. When the interval between first and second identification procedures was long (Experiment 1), correct and false identifications increased, but the probative value of a suspect identification changed little; consistent witnesses were more confident than inconsistent witnesses; and the similarity relationships between suspect and foils were unchanged. When the interval between first and second identification procedures was short (Experiment 2), suspect identification rates changed little, but foil identifications increased significantly; confidence for all identifications increased; consistent witnesses were more confident than inconsistent witnesses; and similarity relationships changed such that witnesses were less likely to identify the suspect as being the best match to the perpetrator.
Keywords Eyewitness identification • Repeated testing procedures • Memory • Decision making

This paper examines an enduring problem in the area of eyewitness memory and eyewitness identification, illustrated by the following scenario: An eyewitness to a crime is presented with a suspect, to either identify that person as the perpetrator of the crime or to exclude that person from further suspicion. Later, that same suspect is presented to that same witness, again for the purpose of identification or exclusion. The question, of course, is: How is the second identification procedure influenced by the first identification procedure?

In actual criminal investigations these identification procedures may vary from mugshot searches, one-person in-field showups, photographic and live lineups, appearances at evidentiary and preliminary hearings, and finally at trial before the jury. Thus, a suspect may be presented to the same witness on several occasions prior to the witness identifying that person at the trial, in court, before the jury. There is archival evidence to suggest that repeated identification procedures are fairly common in criminal investigations (Behrman & Davey, 2001).

Repeated identification procedures have been involved in several highly publicized DNA exoneration cases. Ronald Cotton served 12 years and Clark McMillan served 22 before DNA established their innocence. Thomas Brewster's DNA results led to his release as he was awaiting trial. In the Cotton case, the witness identified him three times, from a photographic lineup, a live lineup, and at a preliminary hearing, prior to identifying him at trial (Thompson-Cannino, Cotton, & Torneo, 2009). McMillan was identified at trial by one witness who did not identify him from the first lineup, and by a second witness who did not identify him from the first two lineups (www.innocenceproject.org).
Brewster was identified only after the second lineup--11 years later (Wells & Loftus, 2003). Of course, in each case, the identifications were mistaken.

These cases are consistent with laboratory results showing that false identification rates increase, and accuracy on the whole decreases, when there are multiple identification procedures (Haw, Dickinson, & Meissner, 2007; Hinz & Pezdek, 2001; Memon, Hope, Bartlett, & Bull, 2002; Pezdek & Blandon-Gitlin, 2005; see Deffenbacher, Bornstein, & Penrod, 2006, for a review).

This paper addresses questions both theoretical and applied. At the theoretical level, the critical questions are concerned with how a prior identification procedure, denoted P1, affects the witness's identification decision, and his or her confidence in that decision, at a subsequent identification procedure P2. The applied questions concern the changes in the patterns of correct and false identifications at P2 following P1, and how the changes in these patterns affect the evidentiary or probative value of the evidence that is presented to the jury.

THEORETICAL QUESTION: HOW ARE WITNESS DECISIONS AT P2 AFFECTED BY P1?

Exposure of the suspect to the witness at P1 can affect the outcome at P2 in two ways that have been extensively discussed in the literature. P1 and P2 can be linked by the witness's commitment to his or her previous decision. In this scenario, the witness identifies the suspect at P1 and then becomes attached or committed to that response such that it carries forward to P2. However, it is difficult to determine, for a given witness who makes the same response at P1 and P2, whether the repetition of that response occurs because the witness is committed to his or her P1 response or is simply consistent. In either case, the P1 response carries forward to P2. To capture both commitment and consistency, we will refer to the repetition of identification responses broadly as a consistency effect.
Alternatively, P1 and P2 can be linked due to an alteration of memory. In this scenario, information about the suspect is stored in memory in the course of the first identification procedure at P1. Later at P2, the suspect is indeed a good match to memory because information about that suspect is stored in memory. Even if the witness does not identify the suspect at P1, that suspect may seem more familiar at P2, simply by having been seen at P1. In these cases, the witness may be unable to partition his or her memory in such a way as to know that the suspect's increased familiarity is due to the exposure at P1, rather than the suspect's presence at the time of the crime. The witness may make a source-confusion error (Johnson, Hashtroudi, & Lindsay, 1993; Lindsay, Allen, Chan, & Dahl, 2004; Lindsay, Hagen, Read, Wade, & Garry, 2004), as the source of the suspect's familiarity is misattributed to the crime rather than the prior identification procedure. This error has also been described as an unconscious transference (Loftus, 1976), so named because the witness unwittingly transfers the image of the suspect from P1 to the scene of the crime, such that the suspect, who may be innocent, becomes the witness's memory of the perpetrator.

Consistency and familiarity effects may be distinguished based on the witness's response at P1. Consistency, whether or not it is accompanied by increased familiarity, results in two identifications of the suspect, at P1 and again at P2. Increased familiarity alone, however, will produce a different pattern of responses, a nonidentification of the suspect at P1 followed by an identification of the repeated suspect at P2. We refer to this pattern of results, a nonidentification of the suspect at P1, followed by an identification of that same suspect at P2, as a no-to-suspect shift. These no-to-suspect shifts are the signature of misplaced familiarity effects.
The effects of a prior identification on a subsequent identification procedure are often assessed by comparing the performance of an experimental group (denoted here as P1/P2) to that of a control group (denoted here as C/P2). However, examination of two hypothetical patterns of results in Table 1 shows that one cannot assess the effects of multiple identification procedures by comparing P2 performance for the P1/P2 experimental group to that of the C/P2 control group. The C/P2 control condition is shown in the top panel of Table 1, and two different versions of the P1/P2 experimental condition are shown in the middle and bottom panels. Both examples show 50 witnesses in the control condition, 50 witnesses in the experimental condition, and a higher rate of false identifications for the experimental (.50) than for the control (.10). The middle panel illustrates the consistency pattern. In this experimental condition, 20 of 50 witnesses identified the suspect at P1 (.40), and 5 more identified him at P2 (.50). Clearly the bulk of the mistaken identifications occurred at P1 and carried forward to P2. The bottom panel illustrates the misplaced familiarity pattern, with 5 of 50 witnesses making a false identification at P1 (.10), and with 20 more added at P2 (.50). Here, the higher false identification rate is driven by the 20 no-to-suspect shifts.

Table 1 Illustrations of increased false identification rates with multiple identification procedures

Identification procedures     P1            P2
Control                       -             5/50 (.10)
Experimental (consistency)    20/50 (.40)   25/50 (.50)
Experimental (familiarity)    5/50 (.10)    25/50 (.50)

Table gives numbers and proportions of witnesses who identify the suspect. P1 and P2 refer to the first and second identification procedures, respectively

These two examples illustrate the importance of the P1 data for interpreting the link between P2 and P1.
The examples also show that consistency and commitment effects will not produce much of an increase in false identification rates unless witnesses first identify the suspect at P1; otherwise, there is no identification to which a witness can become committed, and there can be no consistent identification of the suspect.

Several studies have shown the pattern in which P2 false identification rates are higher for Experimental than for Control conditions. We now ask to what extent these increased false identification rates are the product of no-to-suspect shifts. Only a few studies report the P1 and P2 identification data in sufficient detail to allow an assessment of no-to-suspect shifts.

Dysart, Lindsay, Hammond, and Dupuis (2001) used a yoked design to compare P2 lineup identifications for three groups of witnesses. One group had identified one of the P2 lineup members in the course of a previous mugshot search; another group had seen that person during the mugshot search, but had not identified him; the third group had not seen that person during the mugshot search. Dysart et al.'s results showed a very strong consistency effect and virtually no familiarity effect. Witnesses who identified the person at P1 were very likely (.614) to identify him at P2; witnesses who saw the person at P1, but did not identify him, were less likely (.091) to identify him at P2 than witnesses who had not seen the person at all (.205).

Memon et al. (2002) also presented mugshots at P1, followed by a lineup at P2. The identification rate for the innocent suspect at P1 was .10, and the false identification rate at P2 was .29. The increase was due primarily to a consistency effect, as over half (.580) of the witnesses who identified the suspect at P1 did so again at P2. There was also some evidence for a familiarity effect shown in the number of no-to-suspect switches; 30 out of 116 (.259) participants who did not identify the suspect at P1 did identify him at P2.
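The decomposition used in the preceding paragraphs, in which P2 suspect identifications are split into repeated identifications and no-to-suspect shifts, can be illustrated with a brief Python sketch. The function names are hypothetical, and the sketch adopts the simplifying assumption of the Table 1 examples that every witness who identified the suspect at P1 does so again at P2.

```python
def rate(count, total):
    """Proportion of witnesses, e.g. 30 of 116 no-to-suspect shifts."""
    return count / total


def decompose_p2(n_witnesses, p1_ids, p2_ids):
    """Split P2 suspect identifications into repeats and no-to-suspect shifts.

    Assumes every witness who identified the suspect at P1 repeats that
    identification at P2 (the simplification used in the Table 1 examples).
    """
    shifts = p2_ids - p1_ids
    non_identifiers_at_p1 = n_witnesses - p1_ids
    return {
        "p1_rate": rate(p1_ids, n_witnesses),
        "p2_rate": rate(p2_ids, n_witnesses),
        "repeated_ids": p1_ids,
        "no_to_suspect_shifts": shifts,
        "shift_rate": rate(shifts, non_identifiers_at_p1),
    }


if __name__ == "__main__":
    # Hypothetical panels from Table 1 (50 witnesses each).
    print("consistency:", decompose_p2(50, p1_ids=20, p2_ids=25))
    print("familiarity:", decompose_p2(50, p1_ids=5, p2_ids=25))
    # Shift rate reported for Memon et al. (2002).
    print("Memon et al.:", round(rate(30, 116), 3))
```

Both hypothetical panels end at the same P2 rate (.50), yet the consistency panel contains 5 no-to-suspect shifts and the familiarity panel 20, which is why the P1 data are needed to interpret the P2 result.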
Hinz and Pezdek (2001) presented six-person lineups at both P1 and P2. In their experimental condition, an innocent suspect was presented in the lineup at both P1 and P2, whereas in the control condition the innocent suspect was presented only in the P2 lineup and not in the P1 lineup. The false identification rate in the P2 lineup was higher for the experimental condition (.467) than for the control condition (.120). However, these results are difficult to interpret for two reasons. First, analogous to the illustration in the middle panel of Table 1, 32 of 107 (.299) witnesses made a false identification at P1, and nearly half (.485) of the witnesses who made any identification at P1 identified the innocent suspect, a figure considerably higher than the expected false identification rate if the lineup were unbiased (.167), which suggests that the P1 lineup was quite biased. [FN1] Thus, many of the identifications at P2 were obtained from the biased lineup at P1 and then carried forward. Nonetheless, there were 10 (out of a possible 41) no-to-suspect shifts. However, the interpretation of this result is questionable because the suspect appeared in the same position, position 5, in both lineups, which may have created a response bias for position 5. Evidence consistent with the response bias hypothesis is provided by an additional condition that presented a lineup with six novel faces at P2. Of the witnesses who made an identification from this lineup, 67% identified the person in position 5. Taken together, the identification results in the experimental condition may be the product of a biased lineup at P1 and a response bias to pick position 5, both of which carried forward to P2.

Consistency in Nonidentifications, Criterion Shifts, and Other Strategic Effects

The results from Dysart et al. (2001) suggest another side of response consistency that has been largely overlooked. Not only may witnesses be consistent in their identifications; they may also be consistent in their nonidentifications. In their study, witnesses who passed over the repeated suspect at P1 were less likely to identify him at P2 than witnesses who had not seen that suspect. Witnesses may also adjust their criterion for making an identification, becoming more or less willing to identify anyone (suspect or foil) as a result of the identification procedure at P1.

Multiple Identifications and Eyewitness Confidence

Multiple identification procedures may affect witness confidence in three ways. First, confidence may carry forward from P1 to P2. Thus, if P1 identifications are made with relatively higher confidence, that higher confidence may carry forward such that witnesses are more confident in their decisions at P2 when P2 is preceded by P1 than when P2 is not preceded by P1. Second, Shaw and his colleagues (Shaw, 1996; Shaw & McClure, 1996) have shown that confidence may increase with repeated questioning. Thus, for witnesses who identify the suspect, or reject the suspect, consistently at P1 and again at P2, their confidence in their decision may increase from P1 to P2. Third, witnesses whose response patterns are inconsistent, for example by shifting from a nonidentification at P1 to a suspect-identification at P2, may respond with lower confidence compared to witnesses who were not exposed to P1.

All three of these possibilities may contribute to the confidence witnesses express in their identification responses at P2. The mean confidence expressed by witnesses in the experimental P1/P2 group relative to the C/P2 control group will likely depend on the mix of these different possible outcomes.

APPLIED QUESTION: HOW DOES P1 AFFECT THE PROBATIVE VALUE OF SUSPECT IDENTIFICATIONS AT P2?

What does it mean when a witness identifies a suspect as the perpetrator of a crime?
Does the identification mean, “he did it”? Or, is it a false identification of an innocent person? The succinct form of this question is: Given that the witness identified the suspect, what is the likelihood that he is in fact the perpetrator? This is a question of probative value (see Federal Rules of Evidence, 2004). In eyewitness identification experiments, the probative value of a suspect identification may be estimated as a conditional probability. Specifically, it is the probability that the suspect is guilty, given that the suspect was identified which, in its simplest form, is calculated as the conditional probability <>, where <>is the probability of a suspect identification given that the suspect is guilty, and <> is the probability of a suspect identification given that the suspect is innocent (see Clark & Godfrey, 2009; Clark, Howell, & Davey, 2008; Wells & Lindsay, 1980; Wells & Olson, 2002). 34 LHUMB 241 Page 5 34 Law & Hum. Behav. 241 © 2011 Thomson Reuters. No Claim to Orig. US Gov. Works. For present purposes, the probative value question is a question about how a prior identification procedure affects both correct and false identification rates. However, almost all of the published studies have focused only on false identification rates. A few studies have focused only on correct identification rates (Greenberg & Ruback, 1992; Lindsay, Nosworthy, Martin, & Martynuck, 1994), and a few have examined the changes in both correct and false identification rates that are necessary to evaluate probative value (Brown, Deffenbacher, & Sturgill, 1977; Haw et al., 2007). Brown et al. (1977) conducted two experiments which presented guilty or innocent suspects in the course of a mugshot search (P1) that preceded a photographic lineup (P2). Correct and false identifications both increased as a result of the inter- vening exposure. 
In one experiment, the false identification rate increased (from .08 to .20) proportionally more than the correct identification rate (which increased from .51 to .65), producing an overall decrease in the probative value of a suspect identification, whereas in the other experiment the increases in correct (.24 to .45) and false identifications (.18 to .29) were roughly proportional, leading to a very slight increase in the probative value of a suspect identification. The close applicability to eyewitness identification is limited by the fact that their lineups consisted of various combinations of multiple targets and multiple innocent suspects.

In the Haw et al. (2007) study, participants were presented with 8 target faces in the initial presentation phase, 6 test trials in the intervening (P1) phase, and 16 test trials in the critical test phase (P2). Both correct and false identifications increased as a result of the intervening exposure. A greater increase in false identifications (from .07 to .46) than correct identifications (.37 to .70) led to a substantial decrease in the probative value of a suspect identification (from .84 to .60). It is not surprising that under these conditions, with several similar faces studied and tested within a short period of time, there would be some confusion on the part of the witness as to whether a particular face was seen during the initial study phase or during the P1 test phase. This study leaves open the question of whether source confusions will arise if there is only one person shown during the initial exposure, and only one intervening test trial.

The “Or” Rule for Suspect Identifications

One last issue needs to be considered in evaluating multiple identification procedures. The two examples from Table 1 assumed 100% consistency for suspect identifications between P1 and P2. That is, all witnesses who identified the suspect at P1 identified him again at P2.
This 100% consistency effect does not typically occur in the laboratory. Rather, just as some of the nonidentifications switch to suspect identifications at P2, some of the suspect identifications at P1 may also switch to nonidentifications at P2.

In laboratory experiments, all of the witnesses in the experimental condition are presented with P2, and the relevant comparison is between P2 in the P1/P2 experimental group and P2 in the C/P2 control group. Actual criminal investigations may not follow that procedure. Rather, it may often be the case that a witness is “finished” if he or she identifies the suspect, and will *245 continue on to P2 only if he or she did not identify the suspect at P1. To take this into consideration in laboratory experiments, one can compare the proportion of witnesses who identify the suspect from P1 or P2 in the experimental group with the proportion of witnesses who identify the suspect in the C/P2 control group. To illustrate, assume that 20 out of 50 witnesses identify the suspect at P1. Of those 20 witnesses, assume that 5 did not repeat their identification at P2. Of the 30 witnesses who did not identify the suspect at P1, 10 do identify the suspect at P2. Thus, there would be 25 (20 - 5 + 10) suspect identifications out of 50 (50% suspect ID rate); however, by the “or” rule there would be 30 identifications out of 50 (20 + 10) because the 5 who switched to a nonidentification would have been “finished” after their initial identification (60% suspect ID rate).

THE PRESENT EXPERIMENTS

In the present experiments, both guilty suspect and innocent suspect conditions are compared in experimental (P1/P2) and control (C/P2) conditions (the terms guilty-suspect condition and innocent-suspect condition correspond to the terms target-present condition and target-absent condition that are often used in the literature).
This allows us to evaluate the effect of repeated lineup procedures on both correct and false identifications and to calculate the probative value of a suspect identification.

Participants in the experimental condition were presented with a photographic showup at P1, followed by a photographic lineup at P2. The photo showup and lineup always contained the same suspect. That is, a guilty-suspect showup was followed by a guilty-suspect lineup, and an innocent-suspect showup was followed by an innocent-suspect lineup. For the showup, witnesses were given the response options of identifying the suspect or giving one of two nonidentification responses, a definitive “not-him” rejection, or a not-sure response. This allows us to examine which subset of witnesses is more likely to make no-to-suspect shifts. One reasonable prediction is that the shifts will occur mainly for witnesses who responded not sure on the showup rather than for witnesses who rejected the showup. This prediction is based on a simple assumption regarding the match of the suspect to the witness's memory. For witnesses who identify the suspect at the showup the match between the suspect and memory is above a high criterion, for witnesses who reject the suspect at the showup the match of the suspect to memory is below a low criterion, and for witnesses who give not sure responses the match of the suspect to memory is above the lower criterion, but below the higher criterion. If the features of the suspect are added to memory as a result of the showup, these middling match values may be increased to allow an identification response.

We also asked witnesses who did not make an identification from the lineup to indicate who they thought was most similar to the perpetrator. The most similar question examines how the showup at P1 may change the similarity relationships of lineup members at P2.
The likelihood of a suspect identification at P2 may increase as a result of the witness's exposure to the suspect at P1 because the suspect becomes more likely to be the best match. In this case, P1 changes the similarity relationships of the lineup members. Alternatively, the exposure to the suspect at P1 may leave similarity relationships unchanged, having its effect only on suspects who were already the witness's best match to memory.

EXPERIMENT 1

Participants

Participants were 240 undergraduate students at the University of California, Riverside, who participated in partial fulfillment of a requirement for an introductory psychology class.

Materials and Procedure

Participants were seated in a small room in groups of one to four where they watched a videotape that depicted a person being robbed at an ATM. There were two versions of the video, each showing a different perpetrator committing the same crime. The video, filmed from the perspective of a passing witness, began with the viewer walking through the central commons area of a college campus. Moments after passing a woman at an ATM, a male voice is heard off-camera yelling and demanding money. The camera turns toward the commotion, at which point the perpetrator is on camera, with gun in hand, for approximately 10-15 s. One of two different perpetrators was presented to half of the witnesses within each condition.

After watching the video, participants completed a brief questionnaire (which asked for general demographic information) and a short version of the Big Five Personality Inventory (John, Donahue, & Kentle, 1991). These tasks took roughly 5-10 min. Following these tasks participants were asked to write down everything they could recall about the crime including a description of the perpetrator. They were also asked to indicate on a 1-to-6 scale how confident they were *246 that they would be able to identify the perpetrator later.
[FN2] Participants were then presented with either a one-person photographic showup, or with a control task. One week later, they returned to the laboratory and were shown a guilty-suspect or innocent-suspect lineup. The details of the showups and lineups are as follows.

Brief verbal descriptions and digital photographs of the perpetrators were used as input into an image-driven software system at the San Bernardino County, California Sheriff's Department, to obtain a pool of mugshot photographs similar in appearance to the perpetrator. One of the photographs was selected as the innocent suspect based upon similarity ratings obtained in another study, and the others were used as lineup fillers in the guilty-suspect lineup. The digital photograph of the innocent suspect was then re-entered into the software to select a pool of fillers similar to the innocent suspect. Thus, fillers in the guilty suspect lineup were selected based on their similarity to the guilty suspect and fillers in the innocent suspect lineup were selected based on their similarity to the innocent suspect. To produce homogeneity across the photographs, all backgrounds were digitally altered to a uniform shade of gray and all clothing was altered to a simple black t-shirt. The brightness was adjusted as necessary to give a consistent appearance across all photographs.

Showups or Control Question at P1

Shortly after viewing the video, half of the witnesses were presented with a guilty-suspect or an innocent-suspect showup. The same suspect photographs were used in the showup and lineup, with the showup photographs being slightly enlarged and mirror-reversed. Following instructions indicating that the person in the photograph may or may not be the ATM robber, participants gave one of three responses--to identify the suspect, reject the suspect, or indicate that they were not sure.
These witnesses also rated their confidence on a 1-to-6 scale. The control group witnesses were presented with a single photograph, but one quite dissimilar from anyone in the ATM robbery video (the photograph showed an Asian female), and they were asked a question that had no relevance to the robbery and did not require episodic memory (about the person's college major).

Lineups at P2

One week later all witnesses were presented with either a guilty-suspect or innocent-suspect lineup, consisting of the suspect plus five fillers who were similar either to the guilty or innocent suspect, arranged in a 3 x 2 array, with the position of the suspect counterbalanced across all six positions. Instructions prior to the lineup explicitly stated that the perpetrator may or may not be in the lineup. Participants were allowed to identify a photograph, to state that the perpetrator was not present (reject the lineup), or to say they were not sure. After their response, they were asked to rate their confidence on a 1-to-6 scale. Finally, nonidentifying witnesses were asked who in the lineup looked most similar to the perpetrator.

RESULTS AND DISCUSSION

Five sets of analyses are presented. First, we present the identification data for the showup (P1). Second, we present the identification data for the lineup (P2), comparing performance of witnesses who were given the showup to witnesses who were not given the showup. Third, we present the lineup identification data conditional upon witness responses to the showup, analyzing response consistency and response switching. Fourth, we present the probative value analyses. Fifth, we present the confidence data.

Showup Identifications

The guilty suspect was identified slightly but not significantly more often (.200) than the innocent suspect (.100), χ² (1, N = 120) = 2.35, p = .125, r = .14.
Definitive rejection responses were given significantly more often when the suspect was innocent (.733) than when he was guilty (.467), χ² (1, N = 120) = 8.89, p = .029, r = .27. Not sure responses were given significantly more often *247 when the suspect was guilty (.333) than when innocent (.167), χ² (1, N = 120) = 4.44, p = .035, r = .19.

Lineup Identifications

Response rates for the lineup task, for suspect, foil, reject, and not sure responses, for both guilty and innocent suspect conditions, for the Experimental and Control conditions, are shown in Table 2. The presentation of the analyses is organized as follows: Because of the special status of suspect identifications in criminal investigations and trials, suspect identification analyses are presented first, for: (a) correct and false suspect identification rates, (b) suspect identification rates based on the “or” rule, and (c) probative value of suspect identifications. A second set of analyses focuses on the theoretical explanations, examining (a) how witness responses were consistent or shifted from showup to lineup, and (b) similarity assessments for control and experimental conditions.

Suspect Identification Rates

Correct and False Suspect Identification Rates

Witnesses who had seen the guilty suspect in a showup were more likely to identify that person from the lineup a week later (.433) than witnesses who had not seen that suspect in an earlier showup (.233), χ² (1, N = 120) = 5.40, p = .020, r = .21. A similar pattern was shown for the innocent suspect condition. False identification rates of the innocent suspect were higher for witnesses who had seen the suspect a week before in the showup (.183) than for witnesses who had not seen the showup (.117); however, this increase was not statistically reliable, χ² (1, N = 120) = 1.05, p = .306, r = .09.
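
The two comparisons just reported can be checked from the cell counts given in Table 2 (26 and 14 suspect identifications out of 60 for the guilty suspect, 11 and 7 out of 60 for the innocent suspect). The sketch below is an illustration only, not the authors' analysis script. It assumes a Pearson chi-square without continuity correction and an effect size computed as r = sqrt(chi-square / N); the function name is hypothetical.

```python
import math


def chi_square_2x2(hits_a, n_a, hits_b, n_b):
    """Pearson chi-square (1 df, no continuity correction) for two proportions.

    Assumption: this is the test behind the reported statistics; the source
    does not say which variant was used.
    """
    a, b = hits_a, n_a - hits_a          # group A: identified / did not identify
    c, d = hits_b, n_b - hits_b          # group B: identified / did not identify
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))   # upper-tail probability for 1 df
    r = math.sqrt(chi2 / n)              # effect size (phi coefficient)
    return chi2, p, r


# Experimental vs. control suspect identifications, 60 witnesses per cell (Table 2).
guilty = chi_square_2x2(26, 60, 14, 60)
innocent = chi_square_2x2(11, 60, 7, 60)

for label, (chi2, p, r) in (("guilty", guilty), ("innocent", innocent)):
    print(f"{label}: chi2 = {chi2:.2f}, p = {p:.3f}, r = {r:.2f}")
```

Under these assumptions the computed values agree with the reported statistics to the precision given in the text.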
Table 2 Lineup identification response proportions and frequencies by showup response for Experiment 1

                              Identification response (P2)
                              Suspect      Foil         Reject       Not sure
Guilty suspect
  Control                     .233 (14)    .150 (9)     .317 (19)    .300 (18)
  Experimental                .433 (26)    .133 (8)     .200 (12)    .233 (14)
    Showup response (P1)
      Identify suspect        .667 (8)     .167 (2)     .083 (1)     .083 (1)
      Reject suspect          .321 (9)     .179 (5)     .357 (10)    .143 (4)
      Not sure                .450 (9)     .050 (1)     .050 (1)     .450 (9)
Innocent suspect
  Control                     .117 (7)     .300 (18)    .317 (19)    .267 (16)
  Experimental                .183 (11)    .267 (16)    .267 (16)    .283 (17)
    Showup response (P1)
      Identify suspect        .333 (2)     .333 (2)     .167 (1)     .167 (1)
      Reject suspect          .182 (8)     .250 (11)    .341 (15)    .227 (10)
      Not sure                .100 (1)     .300 (3)     .000 (0)     .600 (6)

Suspect Identifications Based on the “Or” Rule

We calculated suspect identification rates using the “or” rule; specifically, if a witness identified the suspect at the showup or at the lineup, his or her response was counted as a suspect identification. The “or” rule produced a correct identification rate of .500, significantly higher than that of the control group (.233), χ² (1, N = 120) = 9.19, p = .002, r = .28, and a false identification rate (.250) that was marginally significantly higher than that of the control group (.117), χ² (1, N = 120) = 3.56, p = .059, r = .17.

Probative Value of a Suspect Identification

The probative value of a suspect identification was calculated by dividing the number of (correct) guilty suspect identifications by the sum of the guilty and innocent suspect identifications. This probative value calculation speaks to the question, Given that the suspect was identified, what is the likelihood that he is guilty? The probative value calculations were .667 for the control condition, .702 for the experimental condition, and .667 for the experimental condition using the “or” rule.
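
As a worked illustration of the probative value calculation and the “or” rule, the counts in Table 2 can be plugged in directly. This is a sketch under the assumption that every cell contains 60 witnesses (so counts and rates give the same ratio) and that the “or” rule totals are the showup identifiers plus the witnesses who first identified the suspect at the lineup; the function and variable names are hypothetical.

```python
def probative_value(guilty_ids, innocent_ids):
    """Share of all suspect identifications that fall on the guilty suspect."""
    return guilty_ids / (guilty_ids + innocent_ids)


# Lineup (P2) suspect identifications from Table 2, 60 witnesses per cell.
control = probative_value(14, 7)
experimental = probative_value(26, 11)

# "Or" rule: identified the suspect at the showup (P1) OR at the lineup (P2).
# Showup identifiers plus reject/not-sure witnesses who chose the suspect at P2.
or_guilty = 12 + (9 + 9)
or_innocent = 6 + (8 + 1)
or_rule = probative_value(or_guilty, or_innocent)

print(f"or-rule rates: correct = {or_guilty / 60:.3f}, false = {or_innocent / 60:.3f}")
print(f"probative value: control = {control:.3f}, "
      f"experimental = {experimental:.3f}, or rule = {or_rule:.3f}")
```

The three ratios are all close to two-thirds, which is the basis for reading the showup as leaving probative value essentially unchanged.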
Clearly, the probative value of a suspect identification did not change as a result of the intervening showup.

Response Distributions, Consistency, and Change

Consistency and Response Switches

Patterns of response consistency and response switching varied depending on the witness's first response and the guilt or innocence of the suspect. The lineup identification responses, conditional upon the showup identification response, are given in Table 2. When the suspect was guilty, witnesses who identified him at the showup (n = 12) were likely to identify him again (8 of 12) at the lineup. However, 9 out of 28 (.321) witnesses who incorrectly rejected the guilty-suspect showup and 9 out of 20 (.450) who had responded that they were not sure identified the guilty suspect at the lineup. A higher proportion of witnesses who had responded not sure at the showup later shifted to identify the suspect at the lineup compared to witnesses who had rejected the showup; however, the difference was not statistically reliable, χ² (1, N = 48) = .82, p = .36, r = .13. Combined, 18 of 48 nonidentification responses (.375) were changed to suspect identifications at the lineup. Thus, when the suspect was guilty, witnesses were consistent in their identifications, but a high *248 proportion (.375) of nonidentifying witnesses shifted to identify the suspect.

The results were slightly different when the suspect was innocent. Two-thirds of the witnesses who misidentified the innocent suspect at the showup did not identify him at the lineup. Although the numbers are small (only six people identified the innocent suspect from the showup), they appear not to show the same level of consistency that was shown when the suspect was guilty.
There were, however, a few no-to-suspect switches: 8 of 44 witnesses (.182) who rejected the showup and 1 out of 10 who had responded not sure went on to identify the innocent suspect from the lineup. Combined, 9 of 54 correct nonidentifications at the showup (.167) were changed to false identifications of the innocent suspect at the lineup.

Foil, Reject, and Not Sure Responses

The differences in foil, reject, and not sure responses for the lineup, comparing experimental and control conditions, were all small, and none reached statistical significance (all p values > .14). Of course, for the suspect identification rates to increase, the frequencies of other responses must decrease. The fact that the differences were small suggests that the increases in suspect identifications did not draw responses disproportionately from the other response categories.

Most Similar Question

Witnesses who did not make an identification were asked to indicate who looked the most similar to the perpetrator, and these responses were added to the suspect identification rates to give an estimate of the number of witnesses who assessed the suspect as being the best match to memory. The result of this analysis is shown in the left-hand panel of Fig. 1.

Fig. 1 Proportions of witnesses who identified the suspect, or did not identify the suspect but identified him as the best match, for Experiments 1 and 2 (figure not displayable)

When the suspect was guilty, 22 of 37 (.595) nonidentifiers in the control condition, and 11 of 26 (.423) nonidentifiers in the experimental condition, chose the suspect as being their best match. Adding these to the suspect identifications gives totals of 36 (14 + 22) for the control condition and 37 (26 + 11) for the experimental condition.
Clearly, the proportion of witnesses for whom the suspect was the best match did not differ for control (.600) and experimental (.617) group witnesses, χ² (1, N = 120) = .03, p = .99, r = .016.

For the innocent suspect, 10 of 17 (.588) nonidentifying witnesses in the control condition, and 12 of 33 (.364) in the experimental condition, chose the suspect as the best match. Thus, a total of 17 (7 + 10) control and 23 (11 + 12) experimental group witnesses indicated the suspect was the best match. The proportion of witnesses who identified the suspect as the best match was higher for the experimental (.383) than for the control condition (.283); however, the difference was not statistically reliable, χ² (1, N = 120) = 1.35, p = .33, r = .106.

These analyses suggest that the intervening lineup did not substantially alter the similarity relationships for the suspect relative to other lineup members. This was clearly the case for guilty-suspect lineups, where the proportions of best-match results were nearly identical for experimental and control conditions. The implication of these results is that the exposure to the suspect at the showup had little effect unless the suspect was a good match to memory to begin with. This is less clear for the innocent suspect lineups, where the best-match proportion was slightly higher for the experimental group than for the control group. Although the statistical comparison did not reach *249 significance, we should not rule out the possibility that exposure to the innocent suspect may in fact change the similarity relationships in the lineup such that the suspect, who otherwise would not have been the best match to memory (among all lineup members), becomes the best match to memory because of the exposure at the showup.

Confidence Judgments
Confidence ratings were obtained at three points: After the video, participants indicated their prospective confidence that they would be able to identify the perpetrator, and they were asked to indicate their confidence in the identification decisions they made immediately following the showup (experimental group only) and lineup.

Experimental Lineup Versus Control Lineup

Means and standard deviations for confidence are presented in Table 3, for suspect, foil, and reject responses, for guilty suspect and innocent suspect lineups in the experimental and control conditions. The analyses do not include confidence judgments for not-sure responses. Confidence in these responses is implied by the response category itself. Three separate 2 x 2 analyses of variance were conducted for the confidence of suspect, foil, and reject responses at the lineup, comparing experimental and control conditions and guilty versus innocent suspects. None of these analyses produced any main effects or interactions. Across all three analyses, for suspect, foil, and nonidentification responses, there were no significant differences between experimental and control conditions, or between guilty and innocent suspect conditions, and no interaction (.83 > p > .17).

Table 3 Mean confidence judgments and standard deviations by response and condition for Experiment 1
Confidence judgments for “not sure” responses are not reported. SD standard deviation

                        Identification response (P1)    Identification response (P2)
                        Yes        No                   Suspect    Foil       Reject
No prior showup
  Guilty suspect        -          -                    3.29       3.89       4.26
    SD                                                  1.01       0.60       1.24
    n                                                   (14)       (9)        (19)
  Innocent suspect      -          -                    3.71       3.22       4.37
    SD                                                  1.11       1.00       1.07
    n                                                   (7)        (18)       (19)
Prior showup (P1)
  Guilty suspect        4.17       4.18                 3.81       3.50       4.25
    SD                  0.94       1.09                 1.23       .926       .866
    n                   (12)       (28)                 (26)       (8)        (12)
  Innocent suspect      4.50       4.41                 4.09       3.38       4.56
    SD                  0.84       1.26                 1.22       1.09       .727
    n                   (6)        (41)                 (11)       (16)       (16)

The nonsignificant differences between guilty and innocent suspect conditions suggest that the relationship between witness confidence and accuracy was weak or nonexistent. Indeed, the confidence-accuracy correlations were r = .05, N = 87, p = .62 for the showup, r = .14, N = 89, p = .19 for the experimental group lineups, and r = .06, N = 86, p = .59 for the control group lineups. The nonsignificant differences between experimental and control conditions, however, do not imply that there were no differences at all between the experimental and control conditions, or that there were no differences within the experimental group witnesses.

Additional analyses were conducted to determine the extent to which confidence on the showup carried forward to confidence on the lineup and to compare confidence for consistent witnesses who identified the suspect at the showup and lineup to confidence for inconsistent witnesses who switched from a nonidentification (including both reject and not sure responses) on the showup to a suspect identification on the lineup. The results are shown in the left-hand panel of Fig. 2. The analyses that follow collapse over guilty and innocent suspect conditions, because separate analyses would be based on few observations. The table shows, however, that the patterns of results for guilty and innocent suspect identifications were very much the same.

Fig. 2 Confidence for consistent and inconsistent witnesses in the experimental condition, and confidence in control condition, for Experiments 1 and 2. Results are collapsed over guilty and innocent suspect lineup conditions.
Mean confidence ratings (M) and frequencies (n) are given beneath, separately for guilty and innocent suspect conditions (panels for Experiment 1 and Experiment 2 not displayable)

Confidence Carried Forward from Showup to Lineup

Confidence in suspect identifications at the showup was higher (M = 4.28, SD = 0.89) than confidence in suspect identifications made at the control condition lineup (M = 3.43, SD = 0.98), t(37) = 2.81, p = .008, r = .419. Ten of eighteen witnesses who identified the suspect at the *250 showup identified him again at the lineup. These consistent witnesses made suspect identifications with relatively higher confidence, at both the showup (M = 4.40, SD = 0.97) and at the lineup (M = 4.50, SD = 0.85) 1 week later. Thus, compared to the control condition, this small group of consistent witnesses made their suspect identifications with higher confidence at the showup, and that high confidence carried forward to the lineup. These consistent witnesses were more confident in their suspect identifications at the lineup compared to witnesses in the control condition, t(29) = 2.97, p = .006, r = .483.

Confidence of Consistent Versus Inconsistent Witnesses

Of the 37 witnesses who identified the suspect at the lineup in the experimental condition, 27 (.729) had changed their response from a nonidentification at the showup. The suspect identifications of these inconsistent witnesses were made with less confidence (M = 3.67, SD = 1.27), compared to the consistent witnesses (M = 4.50, SD = 0.85), t(35) = 1.91, p = .064, r = .307. However, the confidence of suspect identifications for these inconsistent witnesses was not significantly different from the confidence of control condition witnesses, t(46) = 0.71, p = .481, r = .104.
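
The first comparison in the preceding section (showup suspect identifications versus control lineup suspect identifications) can be approximately reproduced from the reported summary statistics. The sketch below assumes an independent-samples t test with pooled variance and an effect size of r = sqrt(t² / (t² + df)); the source does not state which variant was used. The group sizes of 18 (12 + 6 showup identifiers) and 21 (14 + 7 control lineup identifiers) are taken from Tables 2 and 3 and are consistent with the reported 37 degrees of freedom. The function name is hypothetical.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t test from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the difference
    t = (m1 - m2) / se
    r = math.sqrt(t ** 2 / (t ** 2 + df))            # effect size derived from t and df
    return t, df, r


# Showup suspect identifications vs. control-lineup suspect identifications.
t, df, r = pooled_t(4.28, 0.89, 18, 3.43, 0.98, 21)
print(f"t({df}) = {t:.2f}, r = {r:.3f}")
```

Because the inputs are rounded means and standard deviations, the result matches the reported t(37) = 2.81 and r = .419 only to within rounding.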
These analyses taken together show that the confidence of a suspect identification at the experimental condition lineup depended on whether that witness was consistent in identifying the suspect, or had switched from a nonidentification to a suspect-identification response. Witnesses who identified the suspect at the showup did so with higher confidence than witnesses who identified the suspect at the control condition lineup, and these higher-confidence suspect identifications did carry forward to the experimental condition lineup. The overall null result comparing the confidence for suspect identifications for experimental versus control condition lineups is likely due to the fact that .729 of suspect identifications in the experimental condition were relatively lower-confidence inconsistent responses rather than the relatively higher-confidence consistent suspect identification responses.

We also considered the confidence of witnesses who were consistent in their rejection of the suspect in both the showup and lineup. These witnesses (n = 25) showed slightly less confidence in the rejection of the lineup (M = 4.43, SD = 0.82) than in the rejection of the showup (M = 4.57, SD = 1.24), a small decrease that was not statistically significant, t(22) = 0.62, p = .544, r = .131. They did not differ significantly from confidence for rejections given by witnesses in the control condition (M = 4.32, SD = 1.14), t(62) = 0.62, p = .537, r = .078.

Summary and an Account of the Results

The results of Experiment 1 may be summarized as follows: Photo showups conducted prior to a photographic lineup produced an increase in the correct identification rate for the guilty suspect, and an increase in the false identification rate for the innocent suspect. The increase in the correct identification rate was statistically significant, whereas the increase in the false identification rate was not.
However, this combination of results should not be misinterpreted as “an effect” in one case and “no effect” in the other, as the increases for both correct and false identification rates were proportionally equal; thus, the probative value of a suspect identification was unchanged *251 by the prior presentation of the suspect's photograph in the showup procedure.

The higher suspect identification rate in the experimental condition was produced by a combination of suspect identifications at the showup that were carried forward to the lineup and also by a large number of no-to-suspect shifts. The increase in suspect identifications came about with no change in the similarity relationships between the suspect and the other lineup members. The proportions of witnesses who identified the suspect as the best match did not vary significantly across experimental and control conditions.

Witnesses who identified the suspect at both the showup and lineup maintained their level of confidence over the week delay, and they were more confident than witnesses who shifted from nonidentification to suspect-identification responses and witnesses who identified the suspect in the control condition. Suspect identifications in the experimental condition were a mix of high-confidence consistent witnesses, but proportionally more low-confidence inconsistent, no-to-suspect shifting witnesses, and thus the mean confidence for the experimental condition was slightly, but not significantly, higher than for the control condition.

Our account of these results is simple. As the witness watches the video of the robbery, features of the perpetrator are stored in memory with contextual information linking those features to the crime. Then, at the showup, features of the suspect are added to memory and are episodically linked to the context of the showup.
Thus, initially the memories of the perpetrator and the suspect may be more easily distinguished. However, over the course of the week delay between the showup and the lineup, as the memories and their contextual links fade, it becomes more difficult to distinguish the memory of the perpetrator from the memory of the suspect presented at the showup. Consequently, identification rates for the suspect increase. This account of the results makes a straightforward prediction: If the lineup were presented shortly after the showup, rather than a week later, no-to-suspect shifts would be much less likely, and suspect identification rates in the experimental condition would not be higher than those in the control condition. This prediction may seem to have already been disconfirmed by results of Haw et al. (2007) which showed large effects of intervening showups when the showups and lineups were conducted within the same experimental session. However, as Haw et al. noted, their procedures deviated from those of typical staged crime eyewitness experiments, in that their participants were presented with 8 target faces, followed by 6 showup test trials, followed by 16 lineup tests, all within one session. Such multiple-study, multiple-test procedures may add noise to the memory system (see Dennis & Humphreys, 2001, and Gronlund & Elam, 1994, for different perspectives regarding interfering noise and memory), increasing the likelihood of source confusion errors. If indeed the no-to-suspect shifts obtained by Haw et al. were due to the increased noise from multiple memory traces, then the shifts should be much less likely when there is one target, one showup, and one lineup within a single experimental session. There is, however, one other aspect of multiple testing procedures that is often overlooked and must be considered. 
The increase in the identification rate of a repeated face may arise because of an inherent suggestiveness in multiple testing procedures, even when there is no source confusion, and no loss of contextual information to distinguish between the memory of the crime and the memory of the prior identification procedure. Specifically, the witness may be quite aware that one person is repeated across identification procedures, and may consider that this repeated person is “special” in some way. In this case, the transference would certainly not be unconscious, but rather would arise from the conscious memory across identification procedures. Thus, as Wells and Quinlivan (2009) have noted, multiple identification procedures may be “highly suggestive to the extent that the witness can discern which person is common” to both procedures (p. 8). In this view, the increase in suspect identifications occurs not because of the limitations of memory, but in spite of such limitations.

One other aspect of the Experiment 1 results is also addressed in Experiment 2. Experiment 1 showed no relationship between identification accuracy and confidence. This held for both experimental and control conditions. It may be that the ability of witnesses to assess their accuracy declines with longer retention intervals. This possibility is consistent with the optimality hypothesis proposed by Deffenbacher (1980), which predicts stronger confidence-accuracy relationships when information processing conditions are better. The optimality hypothesis, initially framed in terms of encoding conditions, has received empirical support (Bothwell, Deffenbacher, & Brigham, 1987; Olsson, 2000). The question posed here is whether the optimality hypothesis applies not only to the encoding of information, but also to the retention of information. There are few available data relevant to this point.
Juslin, Olsson, and Winman (1996) examined confidence and accuracy at a 1-hour and 1-week delay, and found no change in the confidence-accuracy correlation. However, the interpretation of this null result is complicated because they also reported no change in accuracy over the 1-week delay, which suggests that the retention interval manipulation, for whatever reason, was simply ineffective.

*252 EXPERIMENT 2

Experiment 2 used the same procedure and materials as Experiment 1, with two differences. First, the retention interval between the showup and lineup was shortened from 1 week to approximately 30 min. Second, there were two additional groups, one that received a guilty-suspect showup followed by an innocent-suspect lineup, and one that received an innocent-suspect showup followed by a guilty-suspect lineup. The results of these two groups are not relevant to the hypotheses being considered and are not discussed further.

Participants

Participants were 192 undergraduate students at the University of California, Riverside, who participated as partial fulfillment of a requirement for an introductory psychology class.

Procedure

The procedure was identical to that used for Experiment 1 through the presentation of the showup. The only difference up to that point was that the suspect photograph in the showup was not altered in any way, and was thus identical to the suspect photograph to be viewed in the lineup. However, following the showup a second filler task--a questionnaire about various preferences (music, food, etc.), which took approximately 20 min to complete--was administered prior to the presentation of the lineup. Following this second filler task all participants were presented with a six-person, guilty-suspect or innocent-suspect lineup.
RESULTS AND DISCUSSION

Following the same outline as Experiment 1, five sets of analyses are presented: (1) the showup results, (2) the overall identification results, (3) the response consistency and switching results, (4) the probative value calculations, and finally (5) the confidence results.

Showup Identifications

Suspect identification rates were higher for the guilty-suspect showup (.271) than for the innocent-suspect showup (.083), χ²(1, N = 96) = 5.79, p = .016, r = .25. Rejection rates were higher for the innocent suspect (.729) than for the guilty suspect (.438), χ²(1, N = 96) = 8.40, p = .004, r = .30. Rates of not sure responses did not differ significantly for guilty and innocent suspect showups, χ²(1, N = 96) = 1.43, p = .232, r = .12.

Lineup Identifications

The response proportions for suspect, foil, reject, and not sure responses are shown in Table 4. Analyses are presented for: (a) correct and false identification rates, (b) suspect identification rates based on the “or” rule, and (c) the probative value of a suspect identification. A second set of analyses examines (a) how witness responses were consistent or inconsistent from showup to lineup, and differences in response distributions for experimental and control conditions, and (b) the assessment of similarity relationships via the most similar question.

Correct and False Identification Rates

Neither correct nor false identification rates differed significantly, comparing experimental group witnesses presented with the showup versus control group witnesses who were not, χ²(1, N = 96) = .05, p = .832, r = .02 for correct identifications, and χ²(1, N = 96) = .10, p = .749, r = .03 for false identifications.
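The chi-square statistics reported for the showup can be checked with a short calculation. The sketch below is a minimal illustration, not the authors' analysis code: it assumes a Pearson chi-square without continuity correction on a 2 x 2 table, takes the effect size r to be the square root of chi-square divided by N, and reconstructs the cell counts from the reported proportions with 48 witnesses per showup condition (for example, 13 of 48 and 4 of 48 suspect identifications). The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic, its p value for 1 degree of freedom, and the
    effect size r = sqrt(chi2 / N).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    r = math.sqrt(chi2 / n)
    return chi2, p, r


# Rows: guilty-suspect showup, innocent-suspect showup.
# Columns: identified the suspect, did not identify the suspect.
# Counts are reconstructed from the reported rates (.271 and .083 of 48).
chi2, p, r = chi_square_2x2(13, 35, 4, 44)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, r = {r:.2f}")
```

Under these assumptions the calculation agrees with the reported values for suspect identifications (5.79, .016, .25); the same function applied to the rejection counts (35 of 48 versus 21 of 48) gives the reported 8.40.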
Suspect Identifications Based on the “Or” Rule

Calculation of suspect identification rates based on the “or” rule produced only very small changes in the correct and false identification rates. With the additional suspect identifications, the correct (.375) and false identifications (.125) in the experimental condition were identical to those in the control group.

Table 4 Lineup identification response proportions and frequencies by showup response for Experiment 2

                              Identification response (P2)
                              Suspect       Foil          Reject        Not sure
Guilty suspect
  Control                     .354 (17)     .146 (7)      .229 (11)     .271 (13)
  Experimental                .375 (18)     .229 (11)     .250 (12)     .146 (7)
  Showup response (P1)
    Identify suspect          1.000 (13)    .000 (0)      .000 (0)      .000 (0)
    Reject suspect            .143 (3)      .333 (7)      .476 (10)     .048 (1)
    Not sure                  .143 (2)      .286 (4)      .143 (2)      .429 (6)
Innocent suspect
  Control                     .125 (6)      .167 (8)      .479 (23)     .229 (11)
  Experimental                .104 (5)      .375 (18)     .292 (14)     .229 (11)
  Showup response (P1)
    Identify suspect          .750 (3)      .000 (0)      .000 (0)      .250 (1)
    Reject suspect            .029 (1)      .400 (14)     .400 (14)     .171 (6)
    Not sure                  .111 (1)      .444 (4)      .000 (0)      .444 (4)

*253 Probative Value of a Suspect Identification

The probative value of a suspect identification was calculated as it was for Experiment 1. Given the very small changes in suspect identification rates in the experimental condition, the differences in probative value were negligible across control (.739), experimental (.783), and “or” rule (.750) calculations.

Response Distributions, Consistency, and Change

Consistency and Response Switches

The consistency and response switching data are shown in Table 4. Clearly, those who identified the guilty suspect stuck to their decision. None of the 13 witnesses who identified the guilty suspect at the showup changed their response at the lineup.
Only 4 people identified the innocent suspect at the showup; of these, 3 identified the innocent suspect again at the lineup. The one who did not switched to a “not sure” response.

Foil, Reject, and Not Sure Responses

For guilty-suspect lineups, comparing the experimental condition to the control condition, foil identifications increased slightly, from .146 to .229, lineup rejections increased by one, and not sure responses decreased, from .271 to .146. None of these analyses approached statistical significance (all p's > .13). For innocent-suspect lineups, foil identifications increased from .167 to .375, χ²(1, N = 96) = 5.27, p = .022, r = .234, and lineup rejections decreased from .479 to .292, χ²(1, N = 96) = 3.56, p = .059, r = .193.

Most Similar Question

The analysis of responses for the most similar question is shown on the right side of Fig. 1. For the guilty suspect, 7 of 30 nonidentifying witnesses indicated the suspect was the best match for the experimental condition and 15 of 31 witnesses indicated that the suspect was the best match for the control condition. Adding the most similar responses to the identification responses, the proportion of witnesses who thought the suspect was the best match was slightly, but not significantly, lower for the experimental condition (.521) than for the control condition (.667), χ²(1, N = 96) = 2.12, p = .212, r = .147. For the innocent suspect condition, 13 of 42 nonidentifying witnesses in the control condition, but only 3 of 43 nonidentifying witnesses in the experimental condition, chose the suspect as most similar. Adding these most similar responses to the identification responses, the proportion of witnesses for whom the innocent suspect was the best match was significantly lower for the experimental condition (.167) than for the control condition (.396), χ²(1, N = 96) = 6.24, p = .022, r = .255.
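The probative value figures reported for Experiment 2 can be reproduced with simple arithmetic. The sketch below is illustrative only: the text says the value was "calculated as it was for Experiment 1" without restating the formula here, so the code assumes probative value is the correct identification rate divided by the sum of the correct and false identification rates. The function name is hypothetical, and the "or" rule counts (18 and 6 of 48) are inferred from the reported rates of .375 and .125.

```python
def probative_value(correct_rate, false_rate):
    """Share of suspect identifications that are correct, assuming guilty and
    innocent suspects are equally likely to be presented (an assumption)."""
    return correct_rate / (correct_rate + false_rate)


# (correct identification rate, false identification rate) per calculation.
# Counts come from the suspect column of Table 4, with 48 witnesses per cell.
rates = {
    "control": (17 / 48, 6 / 48),
    "experimental": (18 / 48, 5 / 48),
    "or rule": (18 / 48, 6 / 48),
}

for label, (correct, false) in rates.items():
    print(f"{label}: {probative_value(correct, false):.3f}")
```

Under this assumed definition the three values round to .739, .783, and .750, matching the figures in the text, which suggests the ratio form is at least consistent with what was reported.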
Confidence Judgments

Experimental Versus Control Condition Lineups

Mean confidence judgments for suspect, foil, and reject responses, for guilty and innocent suspect lineups, in the experimental and control conditions, are shown in Table 5. *254 Separate 2 x 2 analyses of variance were conducted for suspect, foil, and reject responses, with lineup condition and suspect guilt as factors.

Table 5 Mean confidence judgments and standard deviations by response and condition for Experiment 2

                        Identification response (P1)    Identification response (P2)
                        Yes         No                  Suspect     Foil        Reject
No prior showup
  Guilty suspect        -           -                   4.24        2.57        4.09
    SD                                                  0.83        0.79        1.45
    n                                                   (17)        (7)         (11)
  Innocent suspect      -           -                   2.67        3.00        4.04
    SD                                                  0.82        1.31        1.15
    n                                                   (6)         (8)         (23)
Prior showup (P1)
  Guilty suspect        4.08        3.86                4.56        4.00        3.67
    SD                  1.26        1.24                1.10        1.55        1.15
    n                   (13)        (21)                (18)        (11)        (12)
  Innocent suspect      4.25        4.43                3.80        3.78        4.07
    SD                  0.50        1.01                0.84        1.17        1.21
    n                   (4)         (35)                (5)         (18)        (14)

Confidence judgments for “not sure” responses are not reported. SD = standard deviation.

These analyses showed that suspect identifications were given with higher confidence in the experimental lineup than in the control condition lineup, F(1, 42) = 4.90, p = .032, r = .323, and were given with greater confidence when the suspect was guilty than when he was innocent, F(1, 42) = 12.53, p = .001, r = .479. The interaction between lineup condition and suspect guilt was not statistically significant, F(1, 42) = 1.534, p = .222, r = .188. Foil identifications were made with greater confidence in the experimental lineup condition than in the control lineup condition, F(1, 40) = 7.48, p = .009, r = .397, but did not differ as a function of the suspect's guilt or innocence, F(1, 40) = 0.07, p = .799, r = .042. There was no interaction between lineup condition and suspect guilt, F(1, 40) = 0.65, p = .425, r = .126.
For lineup rejections, there were no effects of lineup condition or suspect guilt, and no interaction (all p's > .49).

Confidence-accuracy correlations were also calculated. These correlations were r = .18, N = 73, p = .13 for the showup, r = .22, N = 78, p = .051 for the experimental condition lineup, and r = .37, N = 72, p = .002 for the control condition lineup.

Two additional sets of analyses examined the extent to which confidence carried forward from showup to lineup in the experimental condition, and compared consistent versus inconsistent suspect identifications within the experimental condition. Each of these is presented below, and relevant data are shown on the right side of Fig. 2. Again, the number of false identifications was very small and thus the analyses presented below collapse over correct and false identifications.

Confidence Carried Forward from Showup to Lineup

Experimental witnesses who identified the suspect on the showup (M = 4.12, SD = 1.11) expressed greater confidence than control witnesses who identified the suspect on the lineup (M = 3.83, SD = 1.07); however, this difference was not statistically reliable, t(38) = 0.84, p = .408, r = .111. Of the 17 witnesses who identified the suspect from the showup, 16 identified him a second time from the lineup. These consistent witnesses were confident in their identification on the showup (M = 4.13, SD = 1.15) and became significantly more confident on the lineup (M = 4.75, SD = 1.06) than they had been on the showup, t(15) = 2.44, p = .028, r = .533. Compared to the control witnesses, this group of consistent witnesses expressed greater confidence at the showup, which not only carried forward but actually increased on the subsequent lineup. This led to significantly higher confidence in their suspect identifications on the lineup compared to control witnesses who did not view a showup, t(37) = 2.65, p = .012, r = .399.

Confidence of Consistent Versus Inconsistent Witnesses

There were 23 witnesses who identified the suspect at the lineup, and of those, 7 (.304) had not made an identification at the showup. The suspect identifications of these inconsistent witnesses were made with significantly less confidence (M = 3.57, SD = 0.53) than those of consistent witnesses (M = 4.75, SD = 1.06), t(22) = 2.74, p = .012, r = .504. There was no significant difference between the confidence expressed by inconsistent witnesses versus the confidence of control condition witnesses for suspect identifications (M = 3.83, SD = 1.07), t(29) = 0.60, p = .553, r = .111.

We also considered witnesses who rejected both the showup and the lineup. Their confidence in rejecting the showup (M = 4.58, SD = 1.02) was significantly higher than their confidence in rejecting the lineup (M = 3.96, SD = 1.16), t(23) = 3.16, p = .004, r = .550. There were too few inconsistent rejectors (n = 2) to analyze.

Summary

The results of Experiment 2 may be summarized as follows.

Lineup Identification Responses

The lineup identification results showed a strong carryforward consistency pattern for suspect identifications. Almost all of those who identified the suspect at the showup identified him again at the lineup. Only a handful of witnesses who did not identify the suspect at the showup later identified him at the lineup. However, witnesses in the experimental condition were more likely to identify someone from the lineup than were witnesses in the control condition. The increase in the identification rate was not focused on the suspect, however, but was distributed across lineup members.

Witness Confidence

Confidence was a significant predictor of accuracy, for experimental and control lineups, but not for the showup in the experimental condition. Suspect and foil identifications were both made with more confidence in the experimental condition than in the control condition.
For consistent suspect identifiers in the experimental *255 condition, confidence increased from the showup to the lineup, and for consistent rejectors, the rejection of the lineup was made with significantly less confidence than the rejection of the showup.

Similarity

The proportion of witnesses who identified the suspect as being the best match to the perpetrator, either by actually identifying him as the perpetrator, or by choosing him as most similar, was lower in the experimental than the control condition. Thus, the showup changed the similarity relationships between the suspect and the foils in the lineup.

Comparisons Between Experiment 1 and Experiment 2

Experiments 1 and 2 differed only in the delay between the showup and the lineup--1 week in Experiment 1 versus about 30 min in Experiment 2. This difference produced very different patterns of identification responses, had limited effects on confidence, and changed the pattern of similarity relationships between the suspect and the foils in the lineup. The specifics are given as follows:

Identification Responses

Experiment 1 showed a mix of carry-forward consistency and nonidentification to suspect-identification switches, whereas Experiment 2 showed a pattern of consistency and almost no nonidentification to suspect-identification switches.

Confidence

Consistent with Deffenbacher's (1980) optimality hypothesis, the relationship between confidence and accuracy for lineup identifications was stronger in Experiment 2 which had a very short retention interval than in Experiment 1 which had a 1-week retention interval. The key comparison is for the control condition because there is no intervening identification procedure that might alter the confidence-accuracy relationship.
A direct comparison of the correlations for Experiment 1 (.059) and Experiment 2 (.367) shows the difference to be statistically significant (z = 2.0, p = .046).

The statistical comparisons between experimental and control conditions were quite different for Experiments 1 and 2. Specifically, suspect and foil identifications were made with greater confidence in the experimental condition than the control condition in Experiment 2, but not in Experiment 1. However, although the statistical outcomes were different, the patterns for consistent and inconsistent witnesses were very much the same, as shown in Fig. 1: Suspect identifications at the lineup, from witnesses who had not identified the suspect at the showup, were made with the same level of confidence as suspect identifications made by control group witnesses. Consistent suspect identifications at the lineup were made with higher confidence than inconsistent or control group suspect identifications. The statistical differences between the two experiments were due to differences in the mix of high-confidence consistent witnesses versus low-confidence inconsistent witnesses. In Experiment 1 only 27.1% of the suspect identifications were high-confidence consistent identifications, whereas in Experiment 2, 69.4% of suspect identifications were high-confidence consistent identifications.

Similarity

The patterns of responses to the most similar question were very different in Experiment 1 versus Experiment 2, as shown in Fig. 2. The figure shows the proportions of witnesses who identified the suspect, and the proportions who did not identify the suspect, but identified him as most similar. It is clear from the figure that this proportion did not vary across control and experimental conditions in Experiment 1. However, in Experiment 2 the best-match proportion was significantly lower in the experimental condition than in the control condition.
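The comparison of the two confidence-accuracy correlations reported above (.059 versus .367, z = 2.0) has the form of a test between correlations from independent samples. The sketch below shows one standard way to carry out such a test, Fisher's r-to-z transformation; the text does not name the method actually used, so this is an assumption. The function name is hypothetical, and because the Experiment 1 sample size is not given in this section, the value of 80 passed below is a placeholder for illustration only and is not a reported figure.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent correlations,
    using Fisher's r-to-z transformation."""
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z2 - z1) / standard_error
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# r = .059 (Experiment 1 control) and r = .367, N = 72 (Experiment 2 control).
# n1 = 80 is a hypothetical placeholder, not a value taken from the article.
z_stat, p_value = compare_independent_correlations(0.059, 80, 0.367, 72)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

The result depends on the sample sizes, so the printed values are only meaningful once the actual Experiment 1 control-condition N is substituted for the placeholder.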
GENERAL DISCUSSION

In this section we outline a tentative account of the results from both experiments and discuss the implications of the results for the criminal justice system.

Memory, Misplaced Familiarity, Suggestiveness, and Criterion Shifts

The results of the two experiments are consistent with a simple memory explanation. In Experiment 1, when the suspect is presented at the showup, information about him is stored in a memory trace that is distinct from the memory trace representing the perpetrator. Because these two memory traces are initially distinct, witnesses in Experiment 2 were able to selectively retrieve information from their memory of the perpetrator while partitioning out or discounting their memory of the suspect from the showup. However, over time, the two memories--the memory of the perpetrator from the crime and the memory of the suspect from the showup--become less distinguishable, and it is this merging of the two memory *256 traces that produces the switches from nonidentification to suspect identification responses in Experiment 1.

The results of Experiment 2 are consistent with the memory-based theory outlined above, but are inconsistent with the view that suspect identification rates increase because of the inherent suggestiveness due to the suspect's repetition across identification procedures. This suggests that the increases in suspect identifications reported by Haw et al. (2007) were not due simply to the repetition of the suspect, but rather to the multiple-target, multiple-test procedure.

It is important to note that although suspect identifications did not increase substantially in Experiment 2, the overall identification rate did. In addition, these identifications were made with higher confidence relative to the control group.
These results suggest that the repeated identification procedures increased witnesses' expectations that the perpetrator would be present in the lineup and are consistent with a downward shift in the decision criterion.

Taken together, the results of Experiments 1 and 2 suggest two separate components that contribute to multiple testing procedures. First, misplaced familiarity due to the memory of the suspect, which cannot be selectively disregarded when it becomes entangled with the memory of the perpetrator, produces the no-to-suspect shifts shown in Experiment 1. Second, heightened expectations and suggestiveness altered the decision processes, increasing the overall identification rate and changing the patterns of confidence in Experiment 2.

One aspect of the Experiment 2 results that is not easily explained is why nonidentifying witnesses in the experimental condition were less likely to identify the suspect as being most similar to the perpetrator. This may have been a by-product of disregarding the memory of the suspect from the showup, or may be a response bias.

Practical Implications

In Experiment 1, a prior showup led to increases in both correct and false identification rates. Guilty suspects and innocent suspects were both more likely to be identified, and as a result the probative value of a suspect identification changed very little. This pattern was shown in the simple comparison of lineup identification responses and when suspect identifications were counted by the “or” rule.

There have been many recommendations for reform of eyewitness identification procedures (U.S. Department of Justice, 1999; Wells & Seelau, 1995; Wells et al., 1998), but a ban on multiple identification procedures has not been among them. It is unlikely that there will be any ban on law enforcement conducting multiple identification procedures. Nor is it likely that multiple identification procedures will figure into decisions about admissibility at trial.
Thus, it will be largely up to juries to consider the implications of such procedures in rendering verdicts. The simplest summary of these implications is that multiple identification procedures increase the likelihood that the suspect will be identified, but they do not increase the likelihood that those who are identified are actually guilty. The suspect will simply be identified more often, whether he is guilty or not. In addition, the confidence of the witness should carry less weight as it may be a product of the multiple identification procedures, rather than an indicator of accuracy.

Prior identification procedures can affect subsequent identifications in at least three ways. (1) As we showed in Experiment 1, and as Memon et al. (2002) showed earlier, misplaced familiarity and source confusion can lead witnesses who did not identify the suspect at P1 to identify him at P2. Consider that in these cases, the witness has initially provided probative exculpatory evidence by not identifying the suspect (at P1), but as a result of having seen the suspect (at P1) becomes more likely to identify that suspect when presented with him later (at P2). (2) Prior identification procedures can also produce higher suspect identification rates if the identification procedure for P1 is biased. Consider the case in which a witness identifies the suspect from a biased lineup at P1. Such an identification might carry little weight with a jury who would correctly see it as the product of the bias. But what if the biased lineup is followed by an unbiased lineup? Will the jury now accept the identification? Results from Hinz and Pezdek (2001) suggest that a false identification produced by a biased first lineup is not undone by the presentation of a second unbiased lineup. (3) Prior identification procedures can also alter witnesses' expectations and decision strategies at the subsequent lineup.
Although this did not produce a significant increase in the suspect identification rate in Experiment 2, it remains an open question as to whether such alterations in decision processes might significantly increase suspect identification rates under other conditions.

Limitations, New Questions, and Future Research

In the present experiments, the first identification procedure was always a showup. How might this affect the patterns of results? Showups tend to have low identification rates. Also, showups allow for little confusion as to who appeared at the first procedure, unlike what may occur when witnesses are shown several (Memon et al., 2002) or hundreds (Dysart et al., 2001) of mugshots or a six-person lineup (Hinz & Pezdek, 2001). The point, of course, is that what carries forward from P1 to P2 may depend on the identification procedure used at P1.

*257 A second limitation of Experiment 2 is that the short interval between the first and second identification procedure is unlikely in real criminal investigations. What is most important is the functional situation created by the short P1--P2 interval, rather than the means by which it was created. Experiment 2 shows what can happen when witnesses are able to partition their memories so as to disregard their memories of the suspect from the showup. Our interpretation of the results is that under such conditions memory effects may disappear, but suggestiveness effects and criterion shifts may emerge. Specifically, the repeated identifications may be suggestive in increasing witnesses' expectation that the perpetrator will be in the lineup, and as a result, they lower their decision criterion. The ability to disregard the memory of the suspect does not imply that the first identification procedure will have no effect on the second identification.
The disregard of the memory of the suspect, obtained with a short P1-P2 interval, may be obtained by other means. For example, detectives may review a witness's previous statements in the course of the second identification procedure. Also, there may be other circumstances that minimize the entangling of the memory of the perpetrator versus the memory of the suspect in a prior identification procedure. For example, the crime and the initial identification may occur in very similar contexts, or may occur in very different contexts. One might expect that if the crime and the first identification procedure occur in different contexts that witnesses would be more likely to keep the two memories separated over time.

There are, of course, many questions left for future research. One question that we did not address in this research concerned witness beliefs about the second identification procedure. Specifically, what did they think about why they were presented with another identification procedure? Witnesses in actual criminal investigations may hold a wide range of beliefs and assumptions about the purpose of multiple identification procedures. Is the second lineup being conducted because the first response was incorrect? Have the police shifted their focus to a different suspect? Is the second procedure a “formality” for which the expectation is that the witness will repeat previous responses? What witnesses believe about subsequent identification procedures will no doubt depend on the specific communications they have with police.

One other result, somewhat outside of the main focus of the research, was that the confidence-accuracy relationship appeared to be stronger in Experiment 2 in which the retention interval was short than in Experiment 1 in which the retention interval was much longer. The result is consistent with the optimality hypothesis.
However, there are few data bearing on the confidence-accuracy relationship as a function of retention interval, and additional research would be quite useful.

Most important is the need for additional studies that directly compare the effects of multiple identification procedures for both correct and false identifications. The results of Experiment 1, like those of Brown et al. (1977) and Haw et al. (2007), showed that correct and false identifications both increased due to the pre-lineup exposure to the suspect. In Experiment 1, the increases were proportional so as to produce no change in the probative value of a suspect identification. How general is that result? Are there conditions in which correct and false identification rates might vary separately or disproportionately? Future research to address these questions will deepen our understanding of how memory and decision processes are altered through multiple identification procedures, and will expand our knowledge of the effects of multiple identification procedures in the criminal justice system.

Acknowledgments

The authors thank Kathy Pezdek for providing the raw data from Hinz & Pezdek (2001), Neil Brewer for his insightful comments and suggestions, and the San Bernardino County Sheriff's Department for their assistance. This research was supported by National Science Foundation grant SES-0647947.

REFERENCES

Behrman, B. W., & Davey, S. L. (2001). Eyewitness identification in actual criminal cases: An archival analysis. Law and Human Behavior, 25, 475-491.
Bothwell, R. K., Deffenbacher, K. A., & Brigham, J. C. (1987). Correlation of eyewitness accuracy and confidence: Optimality hypothesis revisited. Journal of Applied Psychology, 72, 691-695.
Brown, D. L., Deffenbacher, K. A., & Sturgill, W. (1977). Memory for faces and the circumstances of encounter. Journal of Applied Psychology, 62, 311-318.
Buckhout, R., Figueroa, D., & Hoff, E. (1975). Eyewitness identification: Effects of suggestion and bias in identification from photographs. Bulletin of the Psychonomic Society, 6, 71-74. Clark, S. E., & Godfrey, R. D. (2009). Eyewitness identification evidence and innocence risk. Psychonomic Bulletin and Review, 16, 22-42. Clark, S. E., Howell, R. T., & Davey, S. L. (2008). Regularities in eyewitness identification. Law and Human Behavior, 32(3), 187-218. Deffenbacher, K. A. (1980). Eyewitness accuracy and confidence: Can we infer anything about their relationship? Law and Human Behavior, 4(4), 243-260. Deffenbacher, K. A., Bornstein, B. H., & Penrod, S. D. (2006). Mugshot exposure effects: Retroactive interference, mugshot commitment, source confusion, and unconscious transference. Law and Human Behavior. 30(3), 287-307. Dennis, S., & Humphreys, M. S. (2001). A context noise model of episodic word recognition. Psychological Review, 108(2), 452-478. Dysart, J. E., Lindsay, R. C. L., Hammond, R., & Dupuis. P. (2001). Mug shot exposure prior to lineup identification: Interference, transference and commitment effects. Journal of Applied Psychology, 86(6), 1280-1284. *258 Federal Rules of Evidence. (2004). Committee on the Judiciary, 108th Congress, House of Representatives. Washington DC: U.S. Government Printing Office. Greenberg, M. S., & Ruback, R. B. (1992). Effect of a prior lineup identification on subsequent lineup identification. In T. Grisso (Ed.) Perspectives in law & psychology: After the crime, victim decision making (pp. 65-69). New York: Plenum Press. Gronlund, S. D., & Elam, L. E. (1994). List-length effect: Recognition accuracy and variance of underlying distributions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(6), 1355-1369. Haw, R. M., Dickinson, J. J., & Meissner, C. A. (2007). The phenomenology of carryover effects between show-up and lineup identification. Memory, 15(1), 117-127. Hinz, T., & Pezdek, K. (2001). 
The effect of exposure to multiple lineups on face identification accuracy. Law and Human Behavior, 25, 185-198.
John, O. P., Donahue, E. M., & Kentle, R. (1991). “The big five” inventory-- version 4a and 54. Technical Report. Berkeley, CA: University of California, Berkeley, Institute of Personality and Social Psychology.
Johnson, M. K., Hashtroudi, S., & Lindsay, S. D. (1993). Source monitoring. Psychological Bulletin, 114(1), 3-28.
Juslin, P., Olsson, N., & Winman, A. (1996). Calibration and diagnosticity in eyewitness identification: Comments on what can be inferred from the low confidence-accuracy correlation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(5), 1304-1316.
Lindsay, R. C. L., Nosworthy, G. J., Martin, R., & Martynuck, C. (1994). Using mug shots to find suspects. Journal of Applied Psychology, 79, 121-130.
Lindsay, S. D., Allen, G. P., Chan, J. C. K., & Dahl, L. C. (2004). Eyewitness suggestibility and source similarity: Intrusions of details from one event into memory reports of another event. Journal of Memory and Language, 50(1), 96-111.
Lindsay, S. D., Hagen, L., Read, J. D., Wade, K. A., & Garry, M. (2004). True photographs and false memories. Psychological Science, 15(3), 149-154.
Loftus, E. F. (1976). Unconscious transference in eyewitness identification. Law and Psychology Review, 2, 93-98.
Memon, A., Hope, L., Bartlett, J., & Bull, R. (2002). Eyewitness recognition errors: The effects of mugshot viewing and choosing in young and old adults. Memory & Cognition, 30, 1219-1227.
Olsson, N. (2000). A comparison of correlation, calibration, and diagnosticity as measures of the confidence-accuracy relationship in witness identification. Journal of Applied Psychology, 85, 504-511.
Pezdek, K., & Blandon-Gitlin, I. (2005). When is an intervening lineup most likely to affect eyewitness identification accuracy?
Legal & Criminological Psychology, 10, 247-263.
Shaw, J. S. (1996). Increases in eyewitness confidence resulting from postevent questioning. Journal of Experimental Psychology: Applied, 2(2), 126-146.
Shaw, J. S., & McClure, K. A. (1996). Repeated postevent questioning can lead to elevated levels of eyewitness confidence. Law and Human Behavior, 20(6), 629-653.
Thompson-Cannino, J., Cotton, R., & Torneo, E. (2009). Picking cotton: Our memoir of injustice and redemption. New York: St. Martin's Press.
U.S. Department of Justice. (1999). Eyewitness evidence: A guide for law enforcement. Washington DC: U.S. Department of Justice.
Wells, G. L., & Lindsay, R. C. (1980). On estimating the diagnosticity of eyewitness nonidentifications. Psychological Bulletin, 3, 776-784.
Wells, G. L., & Loftus, E. F. (2003). Eyewitness memory for people and events. In A. Goldstein (Ed.), Comprehensive handbook of psychology, vol. 11. Forensic psychology (pp. 149-160). New York: Wiley.
Wells, G. L., & Olson, E. A. (2002). Eyewitness identification: Information gain from incriminating and exonerating behaviors. Journal of Experimental Psychology: Applied, 8(3), 155-167.
Wells, G. L., & Quinlivan, D. S. (2009). Suggestive eyewitness identification procedures and the supreme court's reliability test in light of eyewitness science: 30 years later. Law and Human Behavior, 33, 1-24.
Wells, G. L., & Seelau, E. P. (1995). Eyewitness identification: Psychological research and legal policy on lineups. Psychology, Public Policy, and Law, 1(4), 765-791.
Wells, G. L., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law and Human Behavior, 22(6), 603-647.
www.innocenceproject.org. Accessed 30 June 2009.

[FNa1]. R. D. Godfrey S. E.
Clark Psychology Department, University of California, Riverside, Riverside, CA 92521, USA e-mail: ryan.godfrey@email.ucr.edu

[FN1]. Our assertion that the target-absent lineup was biased may appear at odds with the results of an empirical mock witness evaluation performed by Hinz and Pezdek (2001), which showed that the innocent suspect was identified at a rate no different from chance by participants who were provided with a description of the target. However, this was a weak test of lineup fairness, as the description provided to participants actually contained no information about the target's appearance. The description did not give information about height, weight, race, or hair color, and was thus likely to have been irrelevant to the lineup fairness evaluation. Under such circumstances, unless the lineup photographs are very different (such as those presented by Buckhout, Figueroa, & Hoff, 1975), the mock witness test reduces to guessing.

[FN2]. The prospective confidence data, for Experiment 1 and for Experiment 2, are not central to the main questions addressed in this paper, and detailed analyses are not presented. We did analyze these data, however, and in the interest of full disclosure, note some unexpected results. For both experiments the prospective confidence data were analyzed as a 2 x 2 ANOVA, with condition (experimental versus control) and guilt of the suspect (guilty or innocent) as factors. Both experiments showed no differences due to the experimental condition or the guilt of the suspect, as would be expected, given that these confidence judgments are obtained prior to the identification procedures. However, both experiments showed an unexpected significant interaction (although not the same pattern). Through additional analyses we traced the interactions to a single research assistant in each experiment who elicited high prospective confidence, and did not conduct each condition of the experiment equally often.
It is not surprising that witnesses' reported subjective feelings of confidence would vary across experimenters. However, the key question is whether these experimenter-specific effects carried forward in such a way as to alter witnesses' responses to subsequent identification and confidence questions. To address this possibility we conducted all of the other analyses, with and without data collected by these two research assistants. Not only were the general patterns of results the same across both sets of analyses, so too were the statistical conclusions.

34 Law & Hum. Behav. 241 END OF DOCUMENT

Effects of Mugshot Commitment on Lineup Performance in Young and Older Adults

CHARLES A. GOODSELL1*, JEFFREY S. NEUSCHATZ2 and SCOTT D. GRONLUND1
1University of Oklahoma, USA
2The University of Alabama in Huntsville, USA

SUMMARY

Two experiments assessed the effects of mugshot commitment on the ability to make a subsequent lineup identification. Young (17–37 years) and older (55–87 years) participants viewed a crime video featuring a younger (20 years) or older (64 years) culprit. Some participants viewed a 50-photograph culprit-absent mugbook. Following a 1-week delay, participants returned to view a culprit-present lineup. In Experiment 1, mugbook choosers tended to select their prior selection in the lineup and mugbook nonchoosers tended to reject the lineup. In Experiment 2, mugshot choosers rejected a lineup that did not contain their prior selection. Commitment to a prior selection and commitment to a selection strategy were the cause of the majority of lineup errors. As previously reported, mugshot exposure harms subsequent lineup identification, and this appears to be primarily the result of commitment effects. Copyright © 2008 John Wiley & Sons, Ltd.

In criminal cases, eyewitness identification testimony can be the most compelling and persuasive evidence against a defendant (e.g.
Cutler, Penrod, & Dexter, 1990; Fox & Walters, 1986; Wells, Ferguson, & Lindsay, 1981). It also is one of the most common forms of direct evidence used in the courtroom, and research has illustrated that direct evidence is more persuasive to juries than circumstantial evidence (Bergman, 1996; Dorf, 2001; Kaci, 1995). Because eyewitness identification can be inaccurate (Neuschatz & Cutler, 2008; Rattner, 1988; Wells, 1993; Wells, Small, Penrod, Malpass, Fulero, & Brimacombe, 1998), this is problematic for an innocent suspect. According to the Innocence Project (http://www.innocenceproject.com, retrieved 28 February 2008), of the individuals released from prison due to incontrovertible DNA evidence, approximately 75% of these individuals were convicted due in part to mistaken eyewitness identification (Connors, Lundregan, Miller, & McEwan, 1996; Scheck, Neufeld, & Dwyer, 2000). Unfortunately, the most persuasive evidence may not be the most reliable.

Wells (1978) argued that the most effective way that psychologists could respond to the inaccuracy of eyewitness research was through system variable research. System variable research involves investigation of the procedures under control of the justice system to see if changes to how those procedures were conducted could enhance eyewitness accuracy.

APPLIED COGNITIVE PSYCHOLOGY Appl. Cognit. Psychol. 23: 788–803 (2009) Published online 5 September 2008 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/acp.1512
*Correspondence to: Charles A. Goodsell, Department of Psychology, University of Oklahoma, 455 W. Lindsey, Norman, OK 73019, USA. E-mail: cgoodsell@ou.edu

One such procedure commonly used by the police involves presenting mugshots to a witness prior to viewing a lineup. The mugshot exposure effect (e.g. Memon, Hope, Bartlett, & Bull, 2002) occurs when an eyewitness is shown photographs (mugshots) following a crime and then, at a later time, shown a lineup.
Past research has demonstrated that the exposure to mugshots can produce a variety of negative effects on lineup performance, including a decrease in correct identifications (e.g. problems for convicting the guilty) and an increase in incorrect identifications (e.g. potential conviction of the innocent) (Deffenbacher, Bornstein, & Penrod, 2006). These effects include: Selecting an innocent individual (a foil) from a lineup that was previously seen among a number of photographs shown to the eyewitness prior to the lineup (Brown, Deffenbacher, & Sturgill, 1977), or selecting a foil that was previously selected from a number of mugshots (Brigham & Cairns, 1988; Dysart, Lindsay, Hammond, & Dupuis, 2001; Gorenstein & Ellsworth, 1980). There are generally two explanations offered to account for the negative influence of mugshot exposure on lineup performance: Familiarity and commitment. Familiarity effects can occur due to a phenomenon known as unconscious transference. For an eyewitness to a crime, this occurs when the memory of an innocent bystander is confused with their memory of the culprit (Buckhout, 1974; Loftus, 1976; Read, Tollestrup, Hammersley, McFadzen, & Christensen, 1990; Ross, Ceci, Dunning, & Toglia, 1994). This effect can be understood in terms of the source-monitoring framework (Johnson, Hashtroudi, & Lindsay, 1993) or fuzzy trace theory (see Reyna & Brainerd, 1995). The source-monitoring framework predicts that witnesses may have difficulty discriminating between separate memory traces (e.g. a bystander vs. culprit) and may not realise that a feeling of familiarity (when making an identification) may be due to some other past exposure to an individual rather than witnessing the crime (Lindsay, 1994). With regard to mugshots, participants are more likely to make a false identification of an individual viewed in the mugshot phase compared to photographs of individuals that had not been seen before (e.g. 
Brown et al., 1977; Davies, Shepherd, & Ellis, 1979). The exposure to mugshots could lead to a source error because a foil seems familiar due to the exposure from the mugshot rather than from witnessing the event. From the perspective of fuzzy trace theory (Reyna & Brainerd, 1995), mugshot exposure strengthens verbatim representations of those foils relative to the witnessed event, making a familiar foil more likely to be selected.

The second factor, and the focus of this paper, suggests that the negative effects of mugshot exposure result from commitment. The commitment effect can take two forms: (1) Committing to a previously selected foil and (2) committing to a selection strategy. Gorenstein and Ellsworth (1980) proposed that once an eyewitness has chosen someone from an initial group of photographs, they are likely to choose that same person again in a later identification task, regardless of whether they chose the culprit originally (committing to a previously selected foil). The mugshot choosers in their study were more likely to select their prior choice than the actual culprit or other foils. Further support for committing to a previously viewed foil was demonstrated by Blunt and McAllister (in press), Dysart et al. (2001), Haw, Dickinson, and Meissner (2007), Hinz and Pezdek (2001) and Schooler, Foster, and Loftus (1988).

Others have proposed that committing to a particular foil may not be necessary to observe the deleterious effects of mugshot exposure. It may be that witnesses commit to a response strategy (e.g. mugshot choosers continue to choose and nonchoosers do not). Consistent with the commitment to strategy notion, Memon et al. (2002) demonstrated that individuals who selected someone from a set of culprit-absent mugshots (mugshot choosers) were more likely to falsely identify a previously seen critical foil during the subsequent lineup phase. That is, mugshot choosers continue to choose from a subsequent lineup task, regardless of whether their prior choice is there. In a similar vein, Brigham and Cairns (1988) found that those individuals who do not select a foil (mugshot nonchoosers) tend to remain committed to not making a choice; that is, they indicate that the culprit is absent in both the mugshots and the lineup.

Considering mugshot choosers and nonchoosers separately is important, and differential predictions emerge when considering familiarity versus commitment explanations of mugshot exposure effects. For mugshot choosers, the familiarity hypothesis predicts different outcomes depending on whether a prior mugshot choice is in the lineup along with a familiar foil (an unselected individual from the mugbook). First, in the absence of their prior mugshot choice, participants should select either the familiar foil, because it is the only one that has been viewed previously and therefore should be perceived as most familiar, or reject the lineup because the familiar foil falls below their selection criterion (because it seems less familiar than their prior choice, which is not there). Second, in the presence of their prior choice, the diagnosticity of familiarity may be reduced due to the presence of the chooser’s prior mugshot choice along with another familiar foil. Participants could select their prior choice because it is more familiar, choose between the two if they are equally familiar, or reject the lineup because more than one choice is familiar. Haw et al. (2007) found that witnesses who identified an innocent suspect from a show-up were highly likely (74%) to select that same foil again in a target-absent lineup. When witnesses who identified the innocent show-up were presented with a target-present array, they were less likely (50%) to choose him again.
Here we see the effects of competition due to familiarity affecting lineup decisions. Note that in the culprit-present lineups employed in the current study, the culprit also should seem familiar and would compete with a familiar foil (from the mugbook) and their prior mugshot choice. In contrast, the commitment explanation (commit-to-foil) predicts mugshot choosers should select their prior choice in a lineup that contains it, and reject the lineup if their mugshot selection is not present, irrespective of the presence of another familiar foil or even the culprit.

For mugshot nonchoosers, according to the familiarity hypothesis, exposure to a familiar foil during the lineup should increase the likelihood of making a false identification of that foil (e.g. transference effects). However, the commitment explanation (commit-to-selection-strategy) predicts mugshot nonchoosers subsequently will reject the lineup (Brigham & Cairns, 1988). Note that this could be due to poor memory for the event or simply a high selection criterion. However, we will investigate whether these effects change in the presence of the actual target (culprit-present lineups). The actual target should be a familiar selection and should compete with other familiar foils.

One recent study has attempted to empirically disentangle the contributions of familiarity and commitment as they contribute to mugshot exposure effects. Memon et al. (2002) had both young and older participants view a video of a mock crime and exposed some of them to a series of 12 culprit-absent mugshots. One of these mugshots was deemed the critical foil and also appeared in the culprit-absent lineup phase 2 days later. Memon and her colleagues found that mugshot choosers were more likely to select the familiar critical foil from the lineup than either mugshot nonchoosers or the no-mugshot control group.
They concluded that mugshot choosing is linked to selection of a familiar foil because the mugshot choosers selected the critical foil at a greater rate. Additionally, Memon et al. concluded that commitment did not play a major role in lineup decisions. However, close inspection of Memon and her colleagues’ data indicates that they did find evidence for commitment. By definition, the commit-to-foil strategy would require that the witness be able to choose their mugbook selection in the lineup. However, in the Memon et al. (2002) study, only those individuals who happened to choose the predesignated critical foil could show commitment effects. There were only 13 of 73 individuals (17.8%) who chose the critical foil from the mugbook. As a consequence, these are the only participants that could be tested for the commit-to-foil strategy. Of these 13, 8 remained committed to the critical foil, showing a strong commit-to-foil effect. In fact, this commitment to foil rate (P = .615) matched that reported in Dysart et al. (2001) (P = .614). Furthermore, the mugshot nonchoosers showed a high rate of lineup rejection (P = .65), thereby exhibiting a strong commitment to their selection strategy (Brigham & Cairns, 1988).

Interpretation of the Memon et al. (2002) results is problematic because, as designed, familiarity was confounded with commitment: It is unclear which effect is responsible for the high rate of false identifications. Mugshot choosers from the Memon et al. study who did not see their prior selection during the lineup exhibited a transference error if they selected the familiar critical foil, while those who did see their prior choice exhibited a commit-to-foil error. Additionally, because only culprit-absent lineups were employed, it is unclear if these effects would persist given the opportunity to view the actual culprit.
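The proportions quoted in this reanalysis of Memon et al. (2002) follow directly from the counts given in the text. As a minimal sketch, the arithmetic can be checked in a few lines of Python; the variable names are illustrative only and do not come from either study.

```python
# Counts as quoted in the text for Memon et al. (2002).
# Variable names are hypothetical labels chosen for this sketch.
n_individuals = 73            # "13 of 73 individuals"
n_chose_critical_foil = 13    # chose the critical foil from the mugbook
n_stayed_committed = 8        # selected that same foil again in the lineup

# Share of individuals who could show a commit-to-foil effect at all.
pct_testable = 100 * n_chose_critical_foil / n_individuals

# Commit-to-foil rate among those who could show it.
commit_rate = n_stayed_committed / n_chose_critical_foil

print(f"testable for commit-to-foil: {pct_testable:.1f}%")
print(f"commit-to-foil rate: {commit_rate:.3f}")
```

The small denominator (13 participants) is the point of the passage: the commit-to-foil rate rests on very few observations.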
The goal of the present study was to evaluate the commitment effect while controlling the contribution from familiarity. Specifically, lineup construction was tailored for each individual so that all mugshot choosers would see a lineup that contained their prior choice along with both a familiar foil and the actual culprit. This design will allow a test of the contributions of commitment.

The present study also adopts several additional methodological changes to better simulate mugshot exposure in the real world. For example, Dysart et al. (2001) pointed out that a limitation of past research on mugshot exposure was the unrealistically low number of mugshots used in searches. Previous researchers (Brown et al., 1977; Brigham & Cairns, 1988; Gorenstein & Ellsworth, 1980; Memon et al., 2002) typically have employed fewer than 20 mugshots. Police departments may only use such a small number when they have a specific suspect in mind. In such a circumstance, this procedure would be treated as an identification and another identification procedure would be unlikely (L. Perkins, personal communication, 2 October 2007¹). Dysart and her colleagues showed participants up to 600 mugshots. In line with their observation that participants find searching through such a large number of mugshots confusing (see Lindsay, Nosworthy, Martin, & Martynuck, 1994), and the need for further research on the optimal mugbook size (McAllister, 2007), we decided to try 50 photos.

In addition to the number of photos shown, many studies used small retention intervals from 20 minutes to 1–2 days between mugshot search and identification (e.g. Blunt & McAllister, in press; Dysart et al., 2001; Memon et al., 2002). It is unclear if these effects persist after a longer delay. In addition, it would be rare for the police to build a case against someone identified in a mugshot search and conduct a formal identification procedure in such a short period of time (L.
Perkins, personal communication, 2 October 2007¹). Participants in the current study experienced a 1-week retention interval before making a lineup identification. To our knowledge, this is the first study to test commitment over such a long retention interval.

¹H. Lloyd Perkins, Chief of Police, Skaneateles Police Department, Skaneateles, NY.

Finally, little research has examined the effects of mugshot commitment on elderly eyewitnesses. Of the four studies reported in the recent Deffenbacher et al. (2006) meta-analysis that included the possibility for commitment effects, only Memon et al. (2002) included the age-of-witness factor, and none included the age-of-culprit factor. In the current study, young and older participants viewed a mock crime video featuring either a younger or older culprit. Prior research showed that memory for faces (e.g. Anastasi & Rhodes, 2005; Anastasi & Rhodes, 2006; Bartlett & Fulton, 1991; Bartlett, Strater, & Fulton, 1991; Lamont, Stewart-Williams, & Podd, 2005; Perfect & Harris, 2003; Searcy, Bartlett, & Memon, 1999), as well as eyewitness identification accuracy (e.g. Memon, Bartlett, Rose, & Gray, 2003; Wright & Stroud, 2002), varied for young and older adults when older targets were used. For example, Wright and Stroud (2002) proposed an own-age bias in recognition. That is, younger individuals are better at identifying a younger culprit correctly, and older individuals are better at identifying an older culprit correctly. However, in the Wright and Stroud study, older individuals did not display the typical higher rate of false recognitions (e.g. Fulton & Bartlett, 1991). One possible explanation for this lack of replication is that Wright and Stroud’s older population ranged in age from only 35 to 55. Memon et al.
(2003) used a more representative older sample and found that for both culprit present and absent lineups, older participants made more false choices than did the younger participants, regardless of the age of the culprit. Our no-mugshot participants will serve as a test of these hypotheses, as they cannot be affected by the influence of mugshot exposure. We expect similar results to that of Memon et al. (2003) (higher false alarms for the older witnesses), as this part of the experiment closely resembles their study.

In sum, those individuals exposed to mugshots will show a decreased correct identification rate and an increased incorrect identification rate (Deffenbacher et al., 2006). Furthermore, we are predicting a difference between mugshot choosers and mugshot nonchoosers. Mugshot choosers should select the same foil again in the lineup (commit-to-foil), while mugshot nonchoosers should reject the lineup (commit-to-selection strategy). We will examine both age-of-witness and age-of-culprit factors for these effects.

EXPERIMENT 1

Method

Participants

Young witness participants (ages 17–37, M = 21.1, N = 108) were recruited from introductory psychology classes at The University of Alabama in Huntsville. All participants received course credit in exchange for their participation. Older witness participants (ages 55–87, M = 70.2, N = 97) were recruited from local churches, retirement communities, and senior citizen organisations. All participants were treated in accordance with the APA ethical guidelines.

Design

The study employed a 2 (Age of Participant: Young vs. Older) × 2 (Age of Culprit: Young vs. Older) × 2 (Prelineup Condition: No-Mugshot vs. Mugshot) between-subjects design. Older and young participants were randomly assigned to the mugshot or no-mugshot condition as well as the older or younger culprit condition.
Dependent measures included final lineup selection and an assessment of their confidence in their lineup decision.

Materials

Event. All participants were randomly assigned to view one of two short videos of a staged crime featuring either a young culprit or an older culprit. The video begins showing a secretary’s office. After 5 seconds elapse, the culprit, either a 20-year-old white male with brown hair and no facial hair (young culprit condition) or a 64-year-old white male with gray hair and no facial hair (older culprit condition), enters and begins a conversation with the secretary. After a 10-second conversation, the culprit hands the secretary a piece of paper. She takes the paper and walks into the next room. As soon as she enters the next room, the culprit reaches over the desk and steals a wallet out of her purse and walks quickly out of the room. The culprit in each video is in view for 25 seconds. The only difference in the two videos is the age of the culprit.

Photographs. Photographs selected for the mugbook and the lineup were chosen based on their match to the description of the culprit (Clark & Tunnicliff, 2001; Luus & Wells, 1991; Wells, Rydell, & Seelau, 1993). Specifically, an introductory psychology section at The University of Alabama in Huntsville viewed the crime videos and provided an eyewitness description of both culprits, following the witness description method suggested by a chief of police¹. These descriptions were tallied for the most frequently occurring features. Photographs for the study were generated from volunteers from local churches, senior centres and colleges from regions of upstate New York. Photographers were given these descriptions of the culprits and asked to take pictures of individuals who matched the description.
A different introductory class was given the witness descriptions of the culprit and asked to rate the 76 photographs collected for each culprit on their similarity to the descriptions using a five-point scale. These ratings were averaged and the photographs were rank-ordered. Photographs ranked number 4–53 were placed in the mugbook.

Mugshots. Two mugbooks were created using Microsoft PowerPoint and were each composed of 50 head and shoulder digital colour photographs (mugshots), 800 × 600 pixels each, of either young (18–25 years old) or older (55–93 years old) white males. The culprit was not included. One mugshot appeared on each page and each photograph was numbered from 1 to 50.

Lineups. Each participant had a lineup created for his or her particular condition. The lineup consisted of six colour photographs presented simultaneously. For participants in the no-mugshot condition, the lineup consisted of the culprit, photographs ranked number 4 and 5 (from the mugbook), and the photographs ranked number 1–3. Photographs 1–3 did not appear in the mugshot phase. Participants in the mugshot condition who selected a photograph from the mugshot phase (mugshot choosers) saw the culprit, the photograph they selected from the mugbook, the photograph ranked number 4 (from the mugbook), and the three new photographs that the no-mugshot group saw (photographs ranked number 1–3). When a participant did not select a photograph from the mugbook phase (mugshot nonchoosers), or chose the photograph ranked number 4 (the familiar foil), their lineup was the same as the no-mugshot group.

Confidence assessment. A seven-point scale (1 = not at all certain, 7 = totally certain) was administered to assess the participant’s confidence in their lineup decision.

Screening measures/filler task.
The digit span and the digit symbol coding tests from the Wechsler adult intelligence scale (WAIS-III; Wechsler, 1997) were administered to all participants as a filler task. If the participant did not score within 1.5 standard deviations of their own age norm, the decision was made in advance not to record their data. None were excluded for this reason. Also, each participant was given an eye test using a Snellen chart to assure that his or her corrected vision was at least 20/40. Only one individual was excluded due to poor vision.

Procedure

During the first session, participants were seated in a small room and shown a 30-second video. They were instructed to focus on any conversations they saw in the video and watch for nonverbal behaviour. After the video, participants completed the digit span, the digit symbol coding test (WAIS-III; Wechsler, 1997) and the Snellen eye test. These tests were given to simulate the delay that typically occurs between witnessing an event and police questioning. Following these tasks, participants in the no-mugshot condition were informed that they had just witnessed a mock crime and that they would be asked questions about the event in 1 week. They were dismissed and asked to return a week later. Participants in the Mugshot condition were asked to search through the mugbook and indicate whether the culprit was present or if he was not present among the photographs they viewed. They were given as much time as they desired to look at the photos and were allowed to look at the photographs more than once. Following their mugbook decision, they were asked to return in 1 week for some follow-up questions. All participants returned a week later and were shown the appropriate culprit-present lineup. Participants were asked to indicate whether the culprit was in the lineup, and if so, indicate which photo. They were told that the culprit may or may not be in the lineup.
Next, all the participants indicated their confidence in their decision on a seven-point scale. After completing this procedure the participants were debriefed, and thanked for their participation.

Results and discussion

Mugshot exposure effects

First we address whether exposure to mugshots resulted in lineup performance decrements assessed in terms of correct identifications, incorrect identifications and incorrect rejections. These dependent variables were treated as a dichotomy and we conducted three separate 2 (Prelineup Condition: No-Mugshot vs. Mugshot) × 2 (Age of Witness: Young vs. Older) × 2 (Age of Culprit: Young vs. Older) logistic analyses. Effect size measures are reported as log odds ratios (ESlor).

Correct identifications. Exposure to the mugbook had a significant negative impact on participants’ abilities to identify the culprit correctly from the lineup. Results from the logistic analysis revealed a significant effect of mugshot condition, χ²(1, N = 205) = 12.01, p < .001, ESlor = 1.24, which indicated that those in the mugshot condition were significantly less likely to correctly identify the culprit. A significant effect of witness age emerged, χ²(1, N = 205) = 4.21, p < .05, ESlor = .71, indicating that younger witnesses were significantly more likely to identify the culprit correctly. There was no significant effect of culprit age and none of the interactions were significant.

Incorrect identifications. Mugshot viewing significantly increased incorrect identifications. Those in the Mugshot group were significantly more likely to err compared to the no-mugshot group, χ²(1, N = 205) = 6.38, p < .05, ESlor = .73. No other significant effects emerged.

Incorrect rejection. The logistic analysis showed no significant results for incorrect rejections.
This indicated that the rate of incorrect rejection did not differ by mugshot condition, age of witness or age of the culprit.

Mugshot choosing and lineup performance

The goals of this paper were to extend work on mugshot exposure and the commitment bias to an older culprit and an older population of witnesses, examine the commitment bias in the face of the true culprit, and determine whether the Memon et al. (2002) results would extend to a different experimental design. Because mugshot choosers cannot make the same type of lineup errors as mugshot nonchoosers, we analysed their lineup outcomes separately. That is, mugshot choosers have an opportunity to commit to a previously selected foil whereas mugshot nonchoosers cannot. Both choosers and nonchoosers, on the other hand, can commit to a selection strategy (i.e. choosing/not choosing). Table 1 illustrates the proportion of errors attributed to the various erroneous selections each group made (correct identifications excluded).

Mugshot choosers. Of the 74 mugbook choosers, only 7 correctly identified the culprit from the lineup. Of the 67 participants who chose incorrectly from the lineup, 48 (P = .716) chose the same innocent foil they had selected previously from the mugbook. As shown in Table 1, those who committed to a foil represented the majority of the errors compared to those who subsequently indicated the culprit was not present (P = .149, Z = 6.63, p < .001, h = 1.24), or those who selected a new foil (P = .090, Z = 7.40, p < .001, h = 1.44) or a familiar foil (P = .045, Z = 8.01, p < .001, h = 1.65). In addition, commit-to-foil best explains these errors because only 9 mugshot choosers chose someone other than their original choice (i.e. commit-to-selection-strategy). Note that Memon et al. (2002) found that older participants were significantly more likely to make a selection from a mugbook. If this is true, then older eyewitnesses may be susceptible to different types of errors in the lineup task.
However, a 2 (witness age) × 2 (culprit age) logistic analysis on choosing rate revealed that there were no significant differences in the rate of mugshot choosing between the young witnesses and the older witnesses, χ²(1, N = 102) = .32, ns, ESlor = .05, or between either culprit, χ²(1, N = 102) = .68, ns, ESlor = .36, and no interaction, χ²(1, N = 102) = 3.69, ns, ESlor = 1.15. In addition, the type of lineup errors did not differ between young and old witnesses. Specifically, the 2 × 2 logistic analysis on commit-to-foil errors indicated no effects of age of witness, χ²(1, N = 67) = 2.07, ns, ESlor = .79, or culprit, χ²(1, N = 67) = .24, ns, ESlor = .38, as well as no interaction, χ²(1, N = 67) = 1.65, ns, ESlor = .50.

Table 1. Proportion of lineup errors as a function of type of selection for the mugshot condition

Group                  Commitment   Familiar foil   New foil   Miss
Choosers (N = 67)      .716 (48)    .045 (3)        .090 (6)   .149 (10)
Nonchoosers (N = 23)   .913 (21)    .000 (0)        .087 (2)   N/A

Note: N presented in parentheses. Miss is an incorrect rejection. Miss and commitment are equivalent for nonchoosers.

Mugshot nonchoosers. Of the 28 participants who did not select a photo from the mugbook, 21 (P = .750) indicated that the culprit was not present in the lineup, thus showing a commitment to response strategy (Brigham & Cairns, 1988). This represented the majority (P = .913) of the lineup errors made by the mugshot nonchoosers (see Table 1). Because of the small number of nonchoosers, we combined across the age variables and performed simple z tests for proportions. Note that effect size measures for the difference in proportions are given by Cohen’s h (see Cohen, 1988).
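The z tests for proportions and Cohen's h mentioned here can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' analysis code: it assumes a pooled-variance two-proportion z test and the textbook arcsine definition of h, and the function names are hypothetical. Applied to the chooser counts in Table 1 (48 commitment errors and 10 misses, each out of 67 errors), the pooled z statistic lands close to the reported Z = 6.63. The arcsine h is of the same order as the reported effect sizes but need not match them exactly, because the paper does not spell out how h was computed.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled-variance z statistic for the difference between two proportions.

    Assumption: the paper's 'z for proportion' test is the pooled form.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Counts taken from Table 1 (choosers, 67 lineup errors in total).
z_commit_vs_miss = two_proportion_z(48, 67, 10, 67)
h_commit_vs_miss = cohens_h(48 / 67, 10 / 67)
print(round(z_commit_vs_miss, 2), round(h_commit_vs_miss, 2))
```

The same two functions can be pointed at any other pair of counts in Table 1, for example commitment errors versus new-foil selections.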
The rate at which the mugshot nonchoosers rejected the lineup (21 out of 28, or P = .750) was greater than that of the mugshot choosers (P = .135, Z = 6.03, p < .001, h = 1.3), as well as those in the no-mugshot condition (P = .301, Z = 4.31, p < .001, h = .77).

Age effects: no-mugshot condition

To examine a possible own-age bias, we conducted two separate 2 (age of witness) × 2 (age of culprit) logistic analyses on hits and incorrect identifications, respectively. No support for an own-age bias was evident. The results indicated that for both hit rate and incorrect identification rate, there were no differences between young and older witnesses viewing either a young or older culprit.

Confidence measure

Following each participant’s lineup decision, a measure of confidence was taken. Responses were given on a seven-point scale, where a low score indicated a low degree of certainty and a high score indicated a high degree of certainty. Two questions are of interest here. First, does commitment to an incorrect foil result in greater confidence, and second, does age or mugshot exposure play a role? Commitment to a foil did not result in greater confidence than any other lineup outcome. For example, those who remained committed to a foil (M = 3.65) were no more confident than those in the no-mugshot condition who correctly identified the culprit (M = 3.74, t(195) = .692, ns). Next we conducted a 2 (Age of Witness: Young vs. Older) × 2 (Age of Culprit: Young vs. Older) × 2 (Mugshot Condition: Mugshot vs. No-Mugshot) ANOVA. There was a significant main effect of age-of-witness; younger witnesses (M = 3.90) were more certain in their selections than older witnesses (M = 3.30), F(1, 196) = 5.61, p < .05, η²p = .028. This replicates prior eyewitness research showing greater confidence of younger witnesses (e.g. Neuschatz et al., 2005). There were no significant differences found between accurate and inaccurate eyewitnesses.
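The partial eta squared reported for the age-of-witness effect on confidence can be recovered from the F statistic and its degrees of freedom. The sketch below assumes the standard conversion for a single effect, partial eta squared = (F × df_effect) / (F × df_effect + df_error); the function name is hypothetical and this is not the authors' code.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Age-of-witness main effect on confidence: F(1, 196) = 5.61.
eta_p2 = partial_eta_squared(5.61, 1, 196)
print(round(eta_p2, 3))  # 0.028, matching the reported value
```

By Cohen's (1988) conventional benchmarks, a value of this size is a small effect.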
Overall, the majority of errors due to mugshot exposure were mediated by commitment. That is, commitment to a foil, as well as commitment to selection strategy, were responsible for 76% of the total errors made. Mugshot choosers were highly likely to select their prior mugshot choice in a lineup, even when that lineup contained the real culprit. Finally, mugshot nonchoosers were highly likely to reject the lineup that contained the real culprit. Although mugshot choosers were likely to remain committed to their prior selection, it is still unclear what would happen if their prior mugshot choice was not available, given a culprit-present lineup. Recall that Memon et al. (2002) showed that mugshot choosers were highly likely to select the familiar critical foil in a culprit-absent lineup two days later (e.g. commit-to-selection strategy). That is, choosers will continue to choose, and because the critical foil seems familiar, it is the best choice. Would mugshot choosers make similar decisions in the context of our design? Experiment 2 was designed to test this possibility.

EXPERIMENT 2

If mugshot choosing does promote a strategy for lineup choosing of a familiar foil (Memon et al., 2002), it should be the case that those participants who made a mugshot selection would still choose if their choice was not in the lineup but was replaced with an arguably familiar foil, the actual culprit. In the current experimental context, mugshot choosers could select either the familiar foil or the actual culprit, because they should both seem familiar. The results from Experiment 1 were consistent with a source-monitoring framework (Lindsay, 1994) in that a new memory trace may have been created when a photo was selected from the mugbook that was confused with the memory of the actual culprit.
However, if prior commitment is the factor responsible for poorer performance for those who view a mugbook, we should see that those individuals who choose from the mugbook should be hesitant to choose from a lineup that did not contain that prior choice, despite the familiarity of the alternatives. To test this, procedures were the same as Experiment 1, except that for mugshot choosers in Experiment 2, their lineups did not contain their prior choice. We were interested in whether they would (1) choose from the lineup based on familiarity (e.g. choose the familiar foil or the culprit; commit-to-selection strategy) or (2) reject the lineup, presumably because their prior choice was not there (commit-to-foil).

Method

Participants

Participants (ages 18–37, M = 20.5, N = 56) were recruited from introductory psychology classes at The University of Alabama in Huntsville. All participants received course credit in exchange for their participation and were treated in accordance with the APA ethical guidelines.

Design

Participants were randomly assigned to either the mugshot or no-mugshot condition. The dependent measure of interest was their final lineup selection.

Materials and procedure

The materials were identical to those described in Experiment 1. All materials were based on the young-culprit condition from the prior experiment, as we did not find any age-of-witness or age-of-culprit differences in Experiment 1. The procedure was identical as well; however, lineup construction was slightly different for those in the mugshot condition. For mugshot choosers, their lineup did not contain their mugbook choice. This meant that the lineups were identical for both mugshot nonchoosers and no-mugshot participants. The lineups were the same as those in the no-mugshot condition from Experiment 1. The lineups contained the culprit, a familiar foil (from the mugbook), and four new foils.

Results and discussion

Mugshot exposure effects

Replicating the results from Experiment 1, the no-mugshot group (P = .407) tended to make more correct identifications than the mugshot group (P = .207, Z = 1.63, p = .051, one-tailed, h = .48). However, there were no differences between the no-mugshot group (P = .259) and the mugshot group (P = .241, Z = .154, p > .05, h = .05) in incorrect identifications. This result is seemingly at odds with the results from Experiment 1; however, the mugbook choosing data will provide an explanation of the discrepant results. Consistent with Experiment 1, there was no difference in the incorrect rejection rate.

Mugshot choosing effects

The goal of Experiment 2 was to determine whether mugshot choosers would select from a lineup that did not contain their choice. Memon et al. (2002) demonstrated that mugshot choosers would select the familiar foil at a high rate (P = .40) from a culprit-absent lineup. However, this was not the case in the present experiment, which employed culprit-present lineups. Of the 25 mugshot choosers in this experiment, only 4 (P = .16) selected the familiar mugbook foil from the lineup, showing a small transference effect due to familiarity. The actual culprit fared no better; only 3 (P = .12) correctly identified him. The foil identification rate was similar, as another 3 (P = .12) selected a new foil. The majority of mugshot choosers falsely rejected the culprit-present lineup, as 15 (P = .60) witnesses indicated the culprit was not in the lineup. Recall that in Experiment 1, mugshot choosers were likely to select their prior choice in the lineup. The results of Experiment 2 suggest that mugshot choosers were looking for their previous mugbook selection, and when that selection was not present, decided to select no one.
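The one-tailed p value attached to the correct-identification comparison in Experiment 2 (Z = 1.63) can be checked against the standard normal distribution. The sketch below is an illustration under that assumption, using only the Python standard library; the function name is hypothetical.

```python
import math

def one_tailed_p(z):
    """Upper-tail probability of a standard normal variate at z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Reported statistic for no-mugshot vs. mugshot correct identifications.
p_value = one_tailed_p(1.63)
print(round(p_value, 4))
```

The result sits just above .05, in line with the reported p = .051 (small differences in the last digit can arise because the reported Z is itself rounded). This is why the text describes the no-mugshot advantage as a tendency rather than a significant difference.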
Consistent with the commitment-to-foil hypothesis, the selection of an incorrect foil from a series of mugshots becomes a better choice in the lineup, even when pitted against the actual culprit. This explains the high rate of incorrect identifications in Experiment 1, and the high rate of incorrect rejections in Experiment 2 for those in the mugshot conditions. Interestingly, we did observe evidence for a limited role for familiarity, as the number of participants demonstrating a transference error equaled the correct identification rate.

GENERAL DISCUSSION

The results of these two experiments are clear: Mugshot choosing behaviour has a strong effect on subsequent lineup decision-making. Mugshot exposure led to high rates of incorrect identifications as well as incorrect lineup rejections. This primarily was due to the effect of commitment. These results add to prior work on the commitment effect (Brigham & Cairns, 1988; Dysart et al., 2001), help to specify better the commitment hypothesis, and extend this work to include both age-of-witness and age-of-culprit variables. Interestingly, for those in the mugshot conditions, the commitment effect was so strong that no reliable age difference emerged. That is, errors were driven by the commitment effect regardless of either the age of witness or the age of the culprit. We outlined commitment as having two components, one for mugshot choosers and another for mugshot nonchoosers. Mugshot choosers will select their prior mugshot choice if given the opportunity and will reject a lineup that does not contain it. This occurred even when the opportunity to select the actual culprit was available. It demonstrates how prejudicial mugbook viewing is.
Following a choice from the mugbook, these errors are most likely due to a source monitoring error because the mugshot choosers confused their mugshot choice for the actual culprit. Alternatively, according to fuzzy trace theory (Reyna & Brainerd, 1995), the memory trace of the foil chosen from the mugbook could result in a stronger verbatim trace. Although the perceived familiarity of the individual from the mugshot undoubtedly plays a role in its subsequent choice from the lineup, familiarity alone would not seem to be very diagnostic when the lineup contains their familiar mugshot choice, another familiar nonchosen mugbook foil, plus a photo of the actual culprit. For mugshot choosers, selection from the mugbook creates a new memory trace that is stronger than the original trace of the culprit due to its recency. This is evident as the majority of mugshot choosers selected their mugshot choice from the lineup. One could argue, however, that perhaps the original memory for the culprit was somehow biased, altered or replaced, as a result of selecting someone from the mugbook (Loftus & Loftus, 1980; Schooler et al., 1988). However, McCloskey and Zaragoza (1985; Zaragoza, McCloskey, & Jamis, 1987) demonstrated that memory for true and suggested events may remain separately represented. Our data would seem to be more consistent with this latter interpretation given that a small number of mugshot choosers (N = 7) subsequently were able to identify the culprit correctly.

The second component of our commitment hypothesis stated that mugshot nonchoosers are likely to reject a subsequent lineup after failing to select from a mugshot search. This finding suggests that those who rejected the mugbook and lineup may be committing to a strategy of not choosing.
Alternatively, they may simply have a high selection criterion due to the witness’s internal criterion for lineup decisions, or perhaps viewing the mugshots causes the witness to raise their criterion because they become confused and less confident in their memories. Some who correctly reject the mugbook actually may have better memories of the event. After all, not selecting from the mugbook was the correct decision because the culprit was not in the mugbook. However, because they rejected the lineup incorrectly, the 1-week delay apparently caused a witness to (1) no longer have a strong memory for the event, or (2) use a high selection criterion. Interestingly, those mugshot nonchoosers who did select from the lineup were highly accurate; 5 of 7 (P = .714) correctly identified the culprit. This matches the results for nonchoosers reported in the Brigham and Cairns (1988) study (9 of 11, P = .818). These mugshot condition participants were the only ones who made two correct decisions (rejecting the mugbook and selecting the culprit), so perhaps those with a better memory are less subject to the deleterious effects of mugshot exposure. It is unclear if this is the result of an individual difference factor that imbues some participants with a better than average memory, or if anyone who happens to encode an event well can resist the deleterious effects of mugshot exposure.

The results of our two experiments failed to support the Memon et al. (2002) finding that mugshot choosers would select a familiar foil (from the mugbook) at a higher rate. Several differences between our study and theirs, such as the perceived familiarity of our familiar foil, differences in delay, and use of culprit-present lineups may account for the discrepant results. However, Blunt and McAllister (in press) also found no evidence of familiarity effects with procedures that were similar to Memon et al., with the only notable exception being that they manipulated mugbook size.
Furthermore, a commit-to-foil effect was found in their large mugbook condition (200 mugshots) but not in the small mugbook (12 mugshots) condition. These results are consistent with what we found in Experiment 1. It may be the case that differences in encoding (exposure duration and quality) and at retrieval (effects of delay) can explain the differences between the three studies. Delay between mugshot exposure and lineup varied from 20 minutes in the Blunt and McAllister study, to 48 hours in the Memon et al. study, to 1 week in the current study. Perhaps the commit-to-foil effect is most likely when memory for the culprit is weak. Additionally, we should note that the photos seen in the lineup phase were the same photos that had been viewed in the mugbook. Perhaps commitment would operate differently if the mugshot differed between the mugbook and the lineup. Clearly, more work is needed in order to understand how these variables affect lineup decisions.

We did not observe differences between our young and older witnesses. Memon et al. (2002) reported that older witnesses chose more often from both the mugbook and the lineup. They also reported that older witnesses tended to false alarm more frequently to the familiar foil. Close inspection of their data indicates that older witnesses in the no-mugshot group selected the foil designated as the critical foil (for the mugshot conditions) at a very low rate (P = .05) compared to the young witnesses (P = .20). Following mugshot exposure, the older witnesses chose the familiar foil at a similar rate (P = .28) as the younger witnesses (P = .29; see footnote 2). Although this interaction was not significant, age differences (i.e. the older witnesses choosing the familiar foil at a greater rate in the mugshot condition) are probably responsible for their results.
Perhaps this difference would have emerged as significant given a larger sample size; however, because measures of effect size were not reported, it is not clear if this would be the case. Note that this interpretation fits with the finding that seniors relied more on perceived familiarity when making lineup decisions (Bartlett & Fulton, 1991; Fulton & Bartlett, 1991; Searcy et al., 1999). This highlights the importance of several issues. First, effects of familiarity and commitment should be considered individually in order to evaluate their separate effects. Note that these are not mutually exclusive explanations. Second, age may have an effect on lineup decisions involving familiar foils.

Recall that we found some evidence of age differences in confidence, but did not find evidence of age differences on lineup performance as has been reported elsewhere (e.g. Searcy et al., 1999). There are several possible explanations. First, considering mugshot exposure, the commitment effect was so strong that it represented the majority of lineup errors regardless of the age variables. Second, by using a larger search set (e.g. number of mugshots) and longer delay between event and lineup identification, we could have limited our young participants’ ability to use recollection, which has been shown to account for their recognition advantages (see Yonelinas, 2002 for a review). Indeed, in the recent review of the literature on age effects in eyewitnesses, Bartlett and Memon (2007) found that age differences are greatest when eyewitness performance is better. This may explain why our no-mugshot group did not show the predicted age-related differences (Memon et al., 2003). Lastly, our experiment, like many others, only used a single young and a single older culprit. It is problematic to make generalisations about estimator variables (Wells, 1978) based on a single exemplar. The same logic can be applied to our null findings for the no-mugshot witnesses.
We did not find support for the own-age bias (Wright & Stroud, 2002). Anastasi and Rhodes (2006) found that the own-age bias appeared when subjective ratings of age were considered. To test this possibility, we showed the photographs of the two culprits to a separate group of younger and older individuals (from the same age ranges used in the current study). An introductory psychology class at the University of Oklahoma (N = 36) rated the young culprit as 21.8 years old (actual age = 20) and the older culprit as 59.8 (actual age = 64), and a group of older adult volunteers (N = 36) rated the young culprit as 22.7 years old and the older culprit as 64.4 years old. It does not appear that perception of age affected the participants in the current study.

Footnote 2: Mugshot data represent combined mugshot conditions from the Memon et al. (2002) study, assuming equal Ns.

As noted by previous researchers (Dysart et al., 2001; Memon et al., 2002), eyewitnesses who identify a photograph from a series of mugshots should not be asked to make a subsequent lineup identification. However, as Lindsay et al. (1994) pointed out, by allowing witnesses to choose more than one photo using a ‘might be’ criterion, a mugshot search may provide a useful search tool to obtain a smaller group of suspects for the police to pursue. More research is needed to see if alternative techniques like this would protect against the deleterious effects of commitment. Finally, as computer technology advances, the ability to conduct mugshot searches is becoming not only more sophisticated (e.g. sorting techniques, use of dynamic information like audio/video), but more common in police departments (see McAllister, 2007 for a review).
Research on both system and estimator variables is important (see Wells, Memon, & Penrod, 2006) and it is clear that further research on how these variables affect mugshot viewing is needed.

ACKNOWLEDGEMENTS

The authors thank Rose Marie Devine and Colleen Goodsell for help with the creation of the stimuli. Additionally, we thank Mitra Adhami, Casey Cozelos, Ashley Hayes, Marla Pigg, Deah Quinlivan and Julie Mathis for assistance with the data collection. This study was supported by an American Psychology–Law Society (AP–LS) grant-in-aid awarded to the first author.

REFERENCES

Anastasi, J. S., & Rhodes, M. G. (2005). An own-age bias in face recognition for children and older adults. Psychonomic Bulletin & Review, 12, 1043–1047.
Anastasi, J. S., & Rhodes, M. G. (2006). Evidence for an own-age bias in face recognition. North American Journal of Psychology, 8, 237–252.
Bartlett, J. C., & Fulton, A. (1991). Familiarity and recognition of faces in old age. Memory & Cognition, 19, 229–238.
Bartlett, J. C., & Memon, A. (2007). Eyewitness memory in young and older adults. In R. C. L. Lindsay, D. F. Ross, J. D. Read, & M. P. Toglia (Eds.), The handbook of eyewitness psychology, vol II: Memory for people (pp. 309–338). Mahwah: Lawrence-Erlbaum.
Bartlett, J. C., Strater, L., & Fulton, A. (1991). False recency and false fame of faces in young adulthood and old age. Memory & Cognition, 19, 177–188.
Bergman, P. (1996). A bunch of circumstantial evidence. University of San Francisco Law Review, 30, 985.
Blunt, M. R., & McAllister, H. A. (in press). Mug shot exposure effects: Does size matter? Law and Human Behavior.
Brigham, J. C., & Cairns, D. L. (1988). The effect of mugshot inspections on eyewitness identification accuracy. Journal of Applied Social Psychology, 18, 1394–1410.
Brown, E., Deffenbacher, K., & Sturgill, W. (1977). Memory for faces and the circumstances of the encounter. Journal of Applied Psychology, 62, 311–318.
Buckhout, R. (1974). Eyewitness testimony.
Scientific American, 231, 23–31.
Clark, S. E., & Tunnicliff, J. L. (2001). Selecting lineup foils in eyewitness identification experiments: Experimental control and real-world simulation. Law and Human Behavior, 25, 199–216.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.
Connors, E., Lundregan, T., Miller, N., & McEwan, T. (1996). Convicted by juries, exonerated by science: Case studies in the use of DNA evidence to establish innocence after trial. NIJ Research Report. US Department of Justice.
Cutler, B. R., Penrod, S., & Dexter, H. R. (1990). Juror sensitivity to eyewitness identification evidence. Law and Human Behavior, 14, 185–191.
Davies, G., Shepherd, J., & Ellis, H. (1979). Effects of interpolated mug shot exposure on accuracy of eyewitness identification. Journal of Applied Psychology, 64, 232–237.
Deffenbacher, K. A., Bornstein, B. H., & Penrod, S. D. (2006). Mugshot exposure effects: Retroactive interference, mugshot commitment, source confusion, and unconscious transference. Law and Human Behavior, 30, 287–307.
Dorf, M. C. (2001). How reliable is eyewitness testimony?: A decision by New York state’s highest court reveals unsettling truths about juries. Writ: FindLaw’s Legal Commentary. Retrieved 8 July 2007 from http://writ.news.findlaw.com/dorf/20010516.html
Dysart, J. E., Lindsay, R. C. L., Hammond, R., & Dupuis, P. (2001). Mug shot exposure prior to lineup identification: Interference, transference, and commitment effects. Journal of Applied Psychology, 86, 1280–1284.
Fox, S. G., & Walters, H. A. (1986). The impact of general versus specific expert testimony and eyewitness confidence upon mock juror judgment. Law and Human Behavior, 10, 215–228.
Fulton, A., & Bartlett, J. (1991). Young and old faces in young and old heads: The factor of age in face recognition.
Psychology & Aging, 6, 623–630.
Gorenstein, G. W., & Ellsworth, P. C. (1980). Effect of choosing an incorrect photograph on a later identification by an eyewitness. Journal of Applied Psychology, 65, 616–622.
Haw, R. M., Dickinson, J. J., & Meissner, C. A. (2007). The phenomenology of carryover effects between show-up and line-up identification. Memory, 15, 117–127.
Heinz, T., & Pezdek, K. (2001). The effect of exposure to multiple lineups on face identification accuracy. Law and Human Behavior, 25, 185–198.
Innocence Project. (n.d.). Retrieved 29 February 2008 from http://www.innocenceproject.com
Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114, 3–28.
Kaci, J. H. (1995). Culprit evidence (3rd ed.). Cincinnati: Copperhouse Publishing.
Lamont, A. C., Stewart-Williams, S., & Podd, J. (2005). Face recognition and aging: Effects of target age and memory load. Memory & Cognition, 33, 1017–1024.
Lindsay, R. C. L., Nosworthy, G. J., Martin, R., & Martynuck, C. (1994). Using mug shots to find suspects. Journal of Applied Psychology, 79, 121–130.
Lindsay, D. S. (1994). Memory source monitoring and eyewitness testimony. In D. F. Ross, J. D. Read, & M. P. Toglia (Eds.), Eyewitness testimony: Current trends and developments. New York: Springer-Verlag.
Loftus, E. F. (1976). Unconscious transference in eyewitness identification. Law and Psychology Review, 2, 93–98.
Loftus, E. F., & Loftus, G. R. (1980). On the permanence of stored information in the human brain. American Psychologist, 35, 409–420.
Luus, C. A. E., & Wells, G. L. (1991). Eyewitness identification and the selection of distractors for lineups. Law and Human Behavior, 15, 43–57.
McAllister, H. (2007). Mug books: More than just large photospreads. In R. C. L. Lindsay, D. F. Ross, J. D. Read, & M. P. Toglia (Eds.), The handbook of eyewitness psychology, vol II: Memory for people (pp. 35–58). Mahwah: Lawrence-Erlbaum.
McCloskey, M., & Zaragoza, M. (1985).
Misleading postevent information and memory for events: Arguments and evidence against memory impairment hypotheses. Journal of Experimental Psychology: General, 114, 3–18.
Memon, A., Bartlett, J., Rose, R., & Gray, C. (2003). The aging eyewitness: Effects of age on face, delay, and source-memory ability. Journal of Gerontology: Psychological Sciences, 58B, 338–345.
Memon, A., Hope, L., Bartlett, J., & Bull, R. (2002). Eyewitness recognition errors: The effects of mugshot viewing and choosing in young and old adults. Memory & Cognition, 30, 1219–1227.
Neuschatz, J. S., & Cutler, B. L. (2008). Eyewitness identification. In H. L. Roediger, III (Ed.), Cognitive psychology of memory. Vol 2 of learning and memory: A comprehensive reference, 4 vols. (J. Byrne Editor; pp. 845–865). Oxford: Elsevier.
Neuschatz, J. S., Preston, E. L., Burkett, A. D., Toglia, M. P., Lampinen, J. M., Neuschatz, J. S., et al. (2005). The effects of post-identification feedback and age on retrospective eyewitness memory. Applied Cognitive Psychology, 19, 435–453.
Perfect, T. J., & Harris, L. J. (2003). Adult age differences in unconscious transference: Source confusion or identity blending? Memory & Cognition, 31, 570–580.
Rattner, A. (1988). Convicted but innocent: Wrongful conviction and the criminal justice system. Law and Human Behavior, 12, 283–293.
Read, J. D., Tollestrup, P., Hammersley, R., McFadzen, E., & Christensen, A. (1990). The unconscious transference effect: Are innocent bystanders ever misidentified? Applied Cognitive Psychology, 4, 3–31.
Reyna, V. F., & Brainerd, C. J. (1995). Fuzzy-trace theory: An interim synthesis. Learning & Individual Differences, 7, 1–75.
Ross, D. R., Ceci, S. J., Dunning, D., & Toglia, M. P. (1994). Unconscious transference and mistaken identity: When a witness misidentifies a familiar but innocent person.
Journal of Applied Psychology, 79, 918–930.
Scheck, B., Neufeld, P., & Dwyer, J. (2000). Actual innocence. New York: Doubleday.
Schooler, J. W., Foster, R. A., & Loftus, E. F. (1988). Some deleterious consequences of the act of recollection. Memory & Cognition, 16, 243–251.
Searcy, J. H., Bartlett, J. C., & Memon, A. (1999). Age differences in accuracy and choosing in eyewitness identification and face recognition. Memory & Cognition, 27, 538–552.
Wechsler, D. (1997). The WAIS-III administration and scoring manual. San Antonio: The Psychological Corporation.
Wells, G. L. (1978). Applied eyewitness-testimony research: System variables and estimator variables. Journal of Personality and Social Psychology, 36, 1546–1557.
Wells, G. L. (1993). What do we know about eyewitness identification? American Psychologist, 48, 553–571.
Wells, G. L., Ferguson, T. J., & Lindsay, R. C. L. (1981). The tractability of eyewitness confidence and its implication for triers of fact. Journal of Applied Psychology, 66, 688–696.
Wells, G. L., Memon, A., & Penrod, S. D. (2006). Eyewitness evidence: Improving its probative value. Psychological Science in the Public Interest, 7, 45–75.
Wells, G. L., Rydell, S. M., & Seelau, E. P. (1993). The selection of distractors for eyewitness lineups. Journal of Applied Psychology, 73, 835–844.
Wells, G. L., Small, M., Penrod, S. D., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law and Human Behavior, 22, 603–607.
Wright, D. B., & Stroud, J. N. (2002). Age differences in lineup identification accuracy: People are better with their own age. Law and Human Behavior, 26, 641–654.
Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years of research. Journal of Memory and Language, 46, 441–517.
Zaragoza, M. S., McCloskey, M., & Jamis, M. (1987).
Misleading postevent information and recall of the original event: Further evidence against the memory impairment hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 36–44.

Citation: 13 U. Pa. J.L. & Soc. Change 137 2009-2010

ANATOMY OF A WRONGFUL CONVICTION: STATE v. DEDGE AND WHAT IT TELLS US ABOUT OUR FLAWED CRIMINAL JUSTICE SYSTEM

ARMEN H. MERJIAN*

"A moment of rare enlightenment is at hand. For generations, American lawyers and crusaders have fought to overturn the convictions of people they believed innocent. Until recently, they had to rely on witnesses to recant or for the real perpetrators to confess. In what seems like a flash, DNA tests performed during the last decade of the century not only have freed seventy-four individuals but have exposed a system of law that has been far too complacent about its fairness and accuracy. What matters most is not how these people got out of jail but how they got into it." Barry Scheck, Peter Neufeld, and Jim Dwyer, Actual Innocence1

"[T]he law holds, that it is better that ten guilty persons escape, than that one innocent suffer." Lord William Blackstone2

I. INTRODUCTION

It is one of the greatest injustices of all: the wrongful conviction and imprisonment of an innocent person.
Although a few notorious examples have garnered publicity, it is an injustice that has been repeated thousands of times over the past few decades in the United States.3 Because the majority of these cases involve serious crimes, moreover, the consequences have been dire: even in non-capital cases, those wrongfully convicted typically lose years of their lives behind bars while struggling to prove their innocence. A 2005 non-exhaustive study of exonerations in the United States from 1989 through 2003 "found 340 exonerations, 327 men and 13 women; 144 of them were cleared by DNA evidence, [and] 196 by other means."4

* Member, New York and Connecticut Bars. B.A. Yale University 1986; J.D. Columbia University 1990. The author is a civil rights and poverty lawyer at Housing Works, Inc., the largest provider of HIV/AIDS services in the State of New York. The author wishes to thank Wilton Dedge, Gary Dedge, Mary Dedge, Nina Morrison, Mark Horwitz, Milton Hirsch, and Sandy D'Alemberte for their invaluable assistance.
1 BARRY SCHECK, PETER NEUFELD & JIM DWYER, ACTUAL INNOCENCE xix-xx (2001).
2 LORD WILLIAM BLACKSTONE, COMMENTARIES ON THE LAWS OF ENGLAND, BOOK IV, CH. 27, 359 (1765).
3 See, e.g., Richard A. Wise, Kirsten A. Dauphinais & Martin A. Safer, A Tripartite Solution to Eyewitness Error, 97 J. CRIM. L. & CRIMINOLOGY 807, 809 (2007) ("One survey of Ohio criminal justice officials estimates that wrongful convictions occur in about 1 of every 200 felony criminal cases (.5%). This translates to more than 5000 innocent persons being convicted of serious crimes in 2002.").

The authors found that in most cases, [the ultimately exonerated individuals] had been in prison for years. More than half had served terms of ten years or more; 80% had been imprisoned for at least five years.
As a group, they had spent more than 3400 years in prison for crimes for which they should never have been convicted-an average of more than ten years each.5 In four of the cases, the state acknowledged the innocence of the wrongfully convicted posthumously, for the men had died in prison.6 At least two recent members of the United States Supreme Court have decried the alarming frequency of wrongful convictions, particularly in capital cases. In a recent speech, Justice John Paul Stevens expressed his concerns, stating, "[t]he recent development of reliable scientific evidentiary methods has made it possible to establish conclusively that a disturbing number of persons who had been sentenced to death were actually innocent."7 Justice Sandra Day O'Connor similarly observed, "we cannot ignore the fact that in recent years a disturbing number of inmates on death row have been exonerated. These exonerations have included at least one mentally retarded person who unwittingly confessed to a crime that he did not commit."8 A recent report by Professor James Liebman of Columbia Law School, et al., reveals the extent of the problem: 68% of all death verdicts imposed and fully reviewed during the 1973-1995 study period were reversed by courts due to serious errors. Analyses presented for the first time here reveal that 76% of the reversals at the two appeal stages where data are available for study were because defense lawyers had been egregiously incompetent, police and prosecutors had suppressed exculpatory evidence or committed other professional misconduct, jurors had been misinformed about the law, or judges and jurors had been biased. . . . 82% of the cases sent back for retrial at the second appeal phase ended in sentences less than death, including 9% that ended in not guilty verdicts.9

4 Samuel R. Gross, Kristen Jacoby, Daniel J. Matheson, Nicholas Montgomery & Sujata Patil, Exonerations in the United States 1989 Through 2003, 95 J. CRIM. L.
& CRIMINOLOGY 523, 524 (2005) [hereinafter Gross et al.].
5 Id. See Facts on Post-Conviction DNA Exonerations, The Innocence Project, available at http://www.innocenceproject.org/Content/351.php (examining 242 post-conviction DNA exonerations in the United States, the Innocence Project reports that the average length of time served by exonerees is twelve years).
6 Gross et al., supra note 4, at 524.
7 Justice Stevens Criticizes Election of Judges, WASH. POST, Aug. 4, 1996, at A14.
8 Atkins v. Virginia, 536 U.S. 304, 320 n.25 (2002).
9 JAMES S. LIEBMAN ET AL., A BROKEN SYSTEM PART II: WHY THERE IS SO MUCH ERROR IN CAPITAL CASES AND WHAT CAN BE DONE ABOUT IT i (Feb. 11, 2002), http://www.law.columbia.edu/brokensystem2/report.pdf.

In other words, more than eight in ten cases retried because of serious error were found not to merit the death penalty, and nearly one of every ten defendants sentenced to die was found not guilty of the crime of which he or she was convicted.10 Several factors contribute to wrongful convictions, including ineffective assistance of counsel, police and prosecutorial misconduct, false confessions, mistaken identification, the use of unreliable jailhouse informants, or "snitches," and the admission of faulty "scientific" evidence.11 Often, as in the case to which we will soon turn, several of these factors combine to produce an erroneous conviction. Of all of these factors, however, eyewitness misidentification has proven to be the most troublesome. Four decades ago, Justice Brennan warned of the dangers of mistaken identification: The vagaries of eyewitness identification are well-known; the annals of criminal law are rife with instances of mistaken identification. Mr. Justice Frankfurter once said: "What is the worth of identification testimony even when uncontradicted? The identification of strangers is proverbially untrustworthy.
The hazards of such testimony are established by a formidable number of instances in the records of English and American trials. These instances are recent-not due to the brutalities of ancient criminal procedure."12 Numerous subsequent studies have confirmed Justice Brennan's wisdom. According to the Center on Wrongful Convictions at Northwestern Law School, "[e]rroneous eyewitness testimony-whether offered in good faith or perjured-no doubt is the single greatest cause of wrongful convictions in the U.S. criminal justice system."13 Indeed, data gathered by Cardozo Law School's Innocence Project shows that erroneous eyewitness identifications "contributed to over 75% of the 177 wrongful convictions" that were overturned by the use of DNA evidence through 2006.14 Meanwhile, a 2004 study by Yale University and U.S. Navy researchers, led by Yale behavioral scientist Charles A. Morgan, found that even healthy victims who get a good look at their perpetrators are unlikely to identify them accurately later.15 The researchers evaluated elite Navy and Marine officers participating in Prisoner of War survival training, which includes sleep and food deprivation. They found that only thirty percent of officers in a high-stress group made accurate identifications of officers who had posed as "enemy" interrogators.16

10 Id.
11 See, e.g., The Causes of Wrongful Convictions, The Innocence Project, available at http://www.innocenceproject.org/understand/.
12 United States v. Wade, 388 U.S. 218, 228 (1967).
13 ROB WARDEN, HOW MISTAKEN AND PERJURED EYEWITNESS TESTIMONY PUT 46 INNOCENT AMERICANS ON DEATH ROW: AN ANALYSIS OF WRONGFUL CONVICTIONS SINCE RESTORATION OF THE DEATH PENALTY FOLLOWING FURMAN v. GEORGIA 1 (2001), http://www.deathpenaltyinfo.org/StudyCWC2001.pdf. See Kampshoff v. Smith, 698 F.2d 581, 585 (2d Cir. 1982) ("There can be no reasonable doubt that inaccurate eyewitness testimony may be one of the most prejudicial features of a criminal trial.").

Notably,
officers who were more confident about their identification were not more likely to be accurate.17

14 INNOCENCE PROJECT, BENJAMIN N. CARDOZO SCHOOL OF LAW, EYEWITNESS MISIDENTIFICATION IN FLORIDA AND NATIONWIDE (2009), http://www.innocenceproject.org/docs/FloridaMistakenID.pdf.
15 Charles A. Morgan et al., Accuracy of Eyewitness Memory for Persons Encountered During Exposure to Highly Intense Stress, 27 INT'L J.L. & PSYCHIATRY 265 (2004).
16 Id. at 272.

The study concluded that "[c]ontrary to the popular conception that most people would never forget the face of a clearly seen individual who had physically confronted them and threatened them for more than [thirty minutes], a large number of subjects in this study were unable to correctly identify their perpetrator."18 Thankfully, due largely to DNA testing and improved science, there is an increasing awareness of wrongful convictions in the United States, which parallels the increasing number of exonerations. The rate of exonerations increased sharply between 1989 and 2003, "from an average of twelve a year from 1989 through 1994, to an average of forty-two a year since 2000. The highest yearly total was forty-four, in 2002 and again in 2003."19 Yet, these exonerations have not come easily. In case after case, the wrongfully convicted have been forced to fight excruciating battles in an attempt to establish their innocence. They must struggle against a legal system that has erected significant hurdles in the path of post-conviction appeals, and against prosecutors who, time and time again, claim that "finality" and a sense of closure for victims and their families are more important than conclusively determining the truth.20 The case of State v. Dedge illustrates the tyranny of this system.

II. STATE v.
DEDGE

On the afternoon of December 8, 1981, Wilton Dedge was working as a mechanic in a garage in New Smyrna Beach, Florida, a small community located approximately fifteen miles

17 Id. at 274.
18 Id.
19 Gross et al., supra note 4, at 527.
20 See Sally Watt, Unlocking the Evidence, ORLANDO WEEKLY, June 28, 2000, http://www.orlandoweekly.com/util/printready.asp?id=1823; Leonora LaPeter, Guilty Until Proven Innocent, ST. PETERSBURG TIMES, Nov. 14, 2004, available at http://www.sptimes.com/2004/11/14/State/Guilty_until_proven_i.shtml (quoting Florida chief assistant state attorney, Robert Holmes, who emphasized the need for finality); see, e.g., Cynthia E. Jones, Evidence Destroyed, Innocence Lost: The Preservation of Biological Evidence Under Innocence Protection Statutes, 42 AM. CRIM. L. REV. 1239, 1265-66 (2005) ("[C]riminal justice officials have argued that allowing belated actual innocence challenges grossly undermines the government's well-established interests in finality of judgments and providing victim closure. . . . Applying the interest in finality of judgments to post-conviction DNA testing, criminal justice officials argue that finality must trump the very human desire of the convicted to perpetually seek their freedom through every available avenue, including subjecting old evidence to DNA testing and other technologies that might become available."). It is not only prosecutors who adhere to this philosophy of "finality." For example, in 1998, Chief Justice Sharon Keller of the Texas Court of Criminal Appeals wrote an opinion denying a new trial to Roy Criner, who was convicted of rape and murder, even though DNA testing revealed that the semen found in the victim was not his. Guilt and Innocence: If a coroner rejects a finding of homicide, should a conviction stand?, HOUSTON CHRONICLE, Sept. 16, 2009 ("'We can't give new trials to everyone who establishes after conviction that they might be innocent,' Keller told a PBS interviewer.
According to the judge, such a situation would mean there would be no finality in a criminal justice system. 'And finality,' she said, 'is important.'"). See also In re Troy Anthony Davis, No. 08-1443, 2009 U.S. LEXIS 5037, at *7 (2009) (Scalia, J., dissenting) ("This Court has never held that the Constitution forbids the execution of a convicted defendant who has had a full and fair trial but is later able to convince a habeas court that he is 'actually' innocent. Quite to the contrary, we have repeatedly left that question unresolved, while expressing considerable doubt that any claim based on alleged 'actual innocence' is constitutionally cognizable.") (emphasis in original).

south of Daytona Beach.21 A high school drop-out and twenty years old at the time, Mr. Dedge lived with his parents in nearby Port St. John, Florida, scraping by with odd jobs that included, as on this fateful day, installing and rebuilding transmissions. "I was still a kid," Mr. Dedge explained, "surfing, skateboarding, having a good time, and just living for the moment. I really didn't have any plans yet on how I was going to live my life."22 That same afternoon at four o'clock, Jane Smith, a seventeen-year-old cosmetology student, returned to her family home in Canaveral Groves, Florida, about forty-seven miles south of New Smyrna Beach, after a job search. Her father, stepmother, and sister were not home. Changing clothes inside her room, she heard a noise and turned to face a large, tall, and powerful man wielding a razor knife.23 The assailant cut off her clothes and brutally raped her two times. In addition, using the razor knife, he slowly and deliberately cut her sixty-five times on her face and body over a forty-five minute period.24 After the assailant punched Ms. Smith in the face, he left with the contents of her purse.25 Ms.
Smith then called her boyfriend, who took her to the hospital for treatment and for the preparation of a rape kit.26 She provided the police with a description of her assailant: he was between six feet and six feet two inches tall, weighing between 160 and 200 pounds,27 with hazel eyes, a receding hairline, and long, blond hair. Meanwhile, police carefully searched Ms. Smith's bedroom for clues, taking her sheets and other materials to the laboratory for analysis. The police found two pubic hairs, but nothing else of value at the scene. Days after the crime, Ms. Smith and her sister drove to a nearby town, stopping at a convenience store for refreshments. There, Ms. Smith saw a man who, she told her sister, looked like her attacker. Ms. Smith's sister recognized the man from elementary school; she believed his name was "Walter Hedge."28 Ms. Smith refused to summon the police. About a week later, however, she returned to the convenience store and saw the same man. This time, she called the police and eventually met with a detective on January 6, 1982, nearly a month after the crime occurred. On January 8, 1982, Brevard County police arrested Walter Dedge, Wilton's older brother, based on the statements by Ms. Smith's sister.29 Walter Dedge was later released from

21 The background facts provided in the ensuing pages were gleaned from an interview with Wilton Dedge himself and with his mother and father, extensive interviews with various members of Mr. Dedge's legal team, and a review of all of the papers and proceedings in the case of State v. Dedge, unless otherwise attributed.
22 Telephone Interview with Wilton Dedge (Dec. 30, 2006) [hereinafter Dedge Interview].
23 Trial Transcript at 403-04, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. Aug. 22, 1984) [hereinafter 1984 Transcript] (prosecutor referring to the instrument in question as a "razor knife") (copy on file with author).
The victim later clarified, "It was a razor blade, it wasn't a knife, and it had a little switch on it." Id. at 446.
24 Watt, supra note 20.
25 See 1984 Transcript, supra note 23, at 404; LaPeter, supra note 20.
26 1984 Transcript, supra note 23, at 405; Memorandum of Law in Support of Post-conviction Motion for an Order Releasing Trial Evidence for DNA Testing at 1, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. Apr. 24, 1997) [hereinafter 1997 Memo] (copy on file with author).
27 Dedge v. State, 442 So. 2d 429, 430 (Fla. Ct. App. 1983).
28 Innocence Project, Know the Cases: Wilton Dedge, http://www.innocenceproject.org/Content/84.php (last visited Mar. 6, 2010) [hereinafter Innocence Project]; 1997 Memo, supra note 26.
29 Innocence Project, supra note 28; 1997 Memo, supra note 26, at 2.

custody after Ms. Smith identified Wilton Dedge in a photo lineup.30 At the time, Wilton Dedge had long, blond hair, but he stood five feet five inches tall and weighed a scrawny 125 pounds,31 upwards of seventy-five pounds lighter and nine inches shorter than the assailant Ms. Smith described to the police. Dedge was dumbfounded, repeatedly proclaiming his innocence. He had no criminal record, much less a record of such brutality. In addition, several witnesses could place him at the garage at the time of the crime, and indeed the entire day. Ms. Smith erroneously identified Walter; now she misidentified Wilton. This would all be cleared up in short order. Wilton Dedge remembers not being worried, explaining: "I knew I was innocent, so I knew it would get cleared up. My parents are law-abiding people. I was raised to believe in the legal system. I knew it would get straightened out."32

A.
The Evidence and the Trial

DNA analysis was not yet available to confirm the source of the pubic hairs found at the crime scene, but a forensic expert analyzed the hairs and compared them to samples from the victim and from Dedge. One of the hairs belonged to Ms. Smith. Comparing the other hair to the sample from Dedge, the expert observed both similarities and differences. "However," the expert noted, "the differences were not sufficient to entirely eliminate Dedge as a possible source."33 The only other "evidence" that the police were able to develop before trial involved the use of a scent dog months after the crime. In March 1982, Dedge wet his hands in the Brevard County Courthouse bathroom, dried them on paper towels from a bathroom dispenser, and handed the paper towels to an investigator. The investigator grasped the paper towels by the edges, hung them to dry, and then placed them in a paper bag from a coffee shop in the building.34 Eight days later, police dog handler John Preston and his German shepherd, Harrass II, conducted a "scent lineup" using the sheets from Ms. Smith's bedroom and four dirty sheets from the local jail that Dedge had never touched. Harrass II sniffed the dried, eight-day-old paper towels in the bag and Preston walked the canine up and down the lineup of sheets, commanding him to "search." On the second pass, Harrass II stopped at the (bloody) sheet from Ms. Smith's bed, allegedly detecting Mr. Dedge's scent on the sheet-more than three months after the crime. Harrass II was later

30 See LaPeter, supra note 20.
31 Dedge, 442 So. 2d at 430; Innocence Project, supra note 28.
32 Watt, supra note 20.
33 LaPeter, supra note 20. Microscopy comparison, or comparing hairs under a microscope, has been used in criminal trials since 1879, and it has been widely criticized. Modern studies and the advent of DNA testing raise questions of the reliability of microscopy for determining guilt in a court of law. See Clive A. Stafford Smith & Patrick D.
Goodman, Forensic Hair Comparison Analysis: Nineteenth Century Science or Twentieth Century Snake Oil?, COLUM. HUM. RTS. L. REV. 227, 233 (1996) (finding that hair comparisons have been accepted in criminal prosecutions without being subjected to validation required of any legitimate science). For example, in a blind test of 240 crime labs throughout the country, the rate of unacceptable matches-from failing to recognize a hair match to making an erroneous one-ranged from 27.6 to 67.8 percent. Diana Baldwin & Ed Godfrey, Hair Analysis Under Scrutiny, DAILY OKLAHOMAN, June 3, 2001 (discussing a 1970s proficiency testing program sponsored by the Law Enforcement Assistance Administration (LEAA), formerly a part of the U.S. Justice Department) (citing BARRY SCHECK, PETER NEUFELD & JIM DWYER, ACTUAL INNOCENCE 209-10 (2001)).
34 See LaPeter, supra note 20.

brought to Ms. Smith's home, where he supposedly indicated Dedge's presence more than three months earlier by touching his nose to various areas in the house.35 The trial began in September 1982 and lasted eight days. The State relied upon three things to prove Dedge's guilt: 1) the eyewitness testimony of Ms. Smith; 2) the hair analysis; and 3) the dog scent lineup. First, Ms. Smith's testimony was alarmingly contradictory. Dedge is between seven and nine inches shorter than the assailant she had described to the police. Ms. Smith had described a large and muscular assailant with hazel eyes36 and a "receding hairline."37 She described him as a man with "big arms" who "looked like a construction worker," and who easily threw her around and pinned her down.38 Dedge is a small, slight man-just one inch taller than the victim.39 He has blue eyes, not hazel, and to this day he sports a full head of hair.40 Second, the hair analysis certainly did not confirm Dedge's presence at the scene of the crime.
Not only was there no identical match, but there were several differences; the State's own expert concluded merely that "the differences were not sufficient to entirely eliminate Dedge as a possible source."41 As Dedge later explained, "that's what their expert said, but during the course of the trial the D.A. reinforced it until at the end he was telling the jury that we have a perfect match."42 Finally, the dog scent evidence was profoundly flawed. As a November 2000 article in Science magazine explains, canine scent evidence is routinely submitted in criminal trials despite the fact that there is "little or no underlying body of scientific evidence affirming the validity of its use."43 These tests are questionable at best, and here, the test was conducted more than three months after the crime, using eight-day-old, dried paper towels touched by others aside from Dedge and stored in a paper bag. The defense prepared to present expert testimony that would explain the flaws inherent in scent identification evidence and the impossibility of tracking scent under the circumstances, but the trial judge refused to admit the testimony.44 In fact, the judge rejected the testimony without even viewing the evidence the defense proffered.45

35 Dedge, 442 So. 2d at 430.
36 Petition for Expungement of Record, Factual Findings and Other Relief Including Actions for Declaratory Relief and Damages and Equitable Relief Under Extraordinary Writ Authority at 4, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. 2005) (document undated) [hereinafter 2005 Petition] (copy on file with author).
37 See 1984 Transcript, supra note 23, at 511. The victim also told the police that although she later discovered that the attacker had hair, at one point, "I got the appearance that he was bald." Id. at 512. Upon retrial in 1984, she described the attacker's hairline as "far receding." Id. at 456, 511.
38 1997 Memo, supra note 26, at 10.
39 See supra note 31 and accompanying text.
40 Adam Liptak,
Prosecutors Fight DNA Use for Exoneration, N.Y. TIMES, Aug. 29, 2003 ("He still sports a full head of hair."); Dedge Interview, supra note 22.
41 See supra note 33 and accompanying text.
42 Larry King Live (CNN television broadcast Dec. 21, 2005).
43 I. Lehr Brisbin, Jr., Steven Austad & Steven K. Jacobson, Canine Detectives: The Nose Knows - Or Does It? Unreliability of Scent Evidence, SCIENCE, Nov. 10, 2000, at 1093.
44 Dedge, 442 So. 2d at 430-31.
45 See infra note 49 and accompanying text.

Mr. Dedge took the stand and proclaimed his innocence. In addition, six witnesses confirmed his alibi: he was at the auto shop nearly fifty miles away at the time of the crime.46 Four of the witnesses testified that they were certain Dedge worked at the shop until closing, between 5:00 and 5:30 p.m. In fact, the shop owner testified that he closed the shop with Dedge and the two of them rode their motorcycles to a nearby pub, eating and drinking together before heading to another bar.47 Mr. Dedge could not have committed the crime. "The shop was pretty small," said Dedge, "about the size of a house lot. It wasn't like I could disappear from the job without anyone noticing."48 The jury deliberated for four hours before pronouncing Dedge guilty of the rape and robbery.49 The judge sentenced him to thirty years in prison. Gary Dedge, Wilton's father, recalled his reaction to the jury's decision: "We felt like the world had just dropped out from under us."50 He continued, "[w]e couldn't understand how the jury could believe the ridiculous evidence against Wilton. I've trailed deer, and even a good deer dog can't follow a deer trail beyond twenty-four hours. How in the world could a dog follow a human trail, through clothing and shoes, three months later?"
51 "I knew the guy was a scam artist," said Gary Dedge, "but most of the people on the jury had never gone hunting, and they believed anything the prosecutor told them, taking it as gospel."52

B. A Second Chance

As Dedge served his prison sentence, his parents scraped together the money to mount an appeal. The trial court's exclusion of expert scent identification testimony provided a strong basis. On appeal, Florida's Fifth District Court of Appeal concluded that expert qualifications are "to be decided by the trial court determined by the testimony adduced," and the trial court's failure to view the defense's proffered videotape before determining that the defense witness did not qualify as an expert in human scent discrimination was reversible error.53 As a result, Dedge was given a second chance. At the second trial, in August 1984, newly-retained defense counsel, Mark Horwitz, stood prepared to impeach the State's evidence. Upon investigating Preston, the dog handler, Mr. Horwitz discovered that "[t]he guy would say just about anything."54 Transcripts of Preston's testimony in previous cases contained glaring inconsistencies and outrageous, unsupported claims.55 According to Horwitz, transcripts revealed that Preston would "say one thing on one day and the very opposite on another."56

46 Dedge Interview, supra note 22.
47 LaPeter, supra note 20.
48 Dedge Interview, supra note 22.
49 LaPeter, supra note 20.
50 Telephone Interview with Gary Dedge (Nov. 5, 2006) [hereinafter G. Dedge Interview].
51 Id.
52 Id.
53 Id.
54 Telephone Interview with Mark Horwitz (Oct. 9, 2006) [hereinafter Horwitz Interview].
55 Id.
56 Id.

For example, Preston testified that he was a member of
the United States Police Canine Association, but he was not.57 Preston testified that he was a member of the United States Canine Association; he was not.58 Additionally, "[t]he amount of training the dog had allegedly received changed in different cases," says Horwitz.59 Preston testified that his dog had received 540 hours of training at the Tom McGean School for Dogs, but later testified that the dog received merely 250 hours of training at the school.60 In fact, it appeared no one, including the Brevard County prosecutors who frequently utilized Preston's services, had ever tracked his testimony for consistency or critically analyzed Preston's assertions. Among his more outlandish claims, Preston asserted that Harass II could track a six-year-old scent.61 Preston also testified in a robbery trial that he was able to track the scent of the robber over an asphalt parking lot, two or three weeks after the crime, and that Harass II could determine how the robber entered and exited the scene, solely based upon scent.62 On cross-examination, Horwitz pointed out problems in the investigation and handling of evidence in the Dedge case. Namely, various investigators had handled the paper bag containing Dedge's paper towels, and the paper bag was kept in an evidence locker right next to the sheets, quite possibly contaminating the evidence. When Preston argued that the scent could not have passed through the paper bag, Horwitz introduced testimony in the aforementioned robbery in which Preston contended that the robber's scent passed through his leather shoes and onto the asphalt.63 Since leather soles are thicker than a paper bag, Preston was forced to admit inconsistencies during cross-examination. Credibility issues of this sort rendered the prosecution's weak case against Dedge even weaker. Then came Clarence Zacke. A seven-time convicted felon,64 Zacke was a notorious snitch. As the St.
Petersburg Times explains: [A] one-time millionaire with an auto salvage business, Zacke had been sentenced to 180 years for three murder-for-hire plots. He tried to hire two hit men to kill a witness in a drug-smuggling case against him. He tried to get someone else to murder one of the hit men. In jail, Zacke tried to hire another inmate to kill the state attorney who prosecuted him, to "get even."65 Prosecutors shaved 120 years off of Zacke's sentence for turning State's evidence on other defendants in the case.66 Unfortunately for Dedge, on his way to a bail hearing in January 1984, he shared a prison van with Zacke, striking up a conversation despite his attorney's strict instruction "not to talk to anyone."67 That night, Zacke's son called the prosecutor to offer Zacke's testimony against Dedge. Specifically, Zacke claimed that Dedge-who had never met Zacke before-confessed to committing the crime, stating, "I just raped and cut some old hog."68 Ms. Smith was seventeen at the time of the attack.69

57 1984 Transcript, supra note 23, at 864.
58 Id. at 865.
59 Horwitz Interview, supra note 54.
60 1984 Transcript, supra note 23, at 851-53.
61 Horwitz Interview, supra note 54.
62 Id.
63 Id.
64 1984 Transcript, supra note 23, at 1203.
65 LaPeter, supra note 20.
66 See, e.g., Liptak, supra note 40 ("A truck confiscated by the State was also released to [Zacke's] wife as part of the same deal.").
67 Horwitz Interview, supra note 54.

This was not the first time that Zacke had mysteriously provided key testimony for the Brevard County prosecutors in the retrial of a high-profile and questionable case. In 1981, Gerald Stano was tried for the brutal murder of seventeen-year-old Cathy Lee Scharf.70 Stano, whom many believed was mentally ill,71 confessed to the crime, along with dozens of other murders in several states.
Some of these confessions were dismissed as patently false, while other confessions led to plea bargains.72 There was not a shred of evidence linking Stano to any of the crimes, including the Scharf murder: no physical evidence, eyewitness testimony, or forensic evidence of any kind.73 Given this lack of evidence, and because the details of Stano's confession did not match the Scharf crime, the jury failed to reach a verdict.74

During the second trial of Mr. Stano in 1983, the Brevard County state attorneys unleashed their secret weapon: Clarence Zacke. Zacke testified that Stano confessed to the murder when Stano was conveniently alone with Zacke.75 During his testimony, Zacke provided details that, unlike Stano's confession, matched the crime. This time, the jury convicted Stano and sentenced him to death.76

Over a decade later, in 1998, with Stano's appeals exhausted, there was a break in the case: in an interview with journalist Nash Rosenblatt, Zacke retracted his testimony. As Mr. Rosenblatt's sworn affidavit to the court describes:

Zacke told me that what he testified to at Stano's trial was not true. Zacke said that Zacke's attorney came to him after the mistrial in Mr. Stano's case and said that the state wanted Zacke to testify for them because they were having trouble obtaining a conviction of Mr. Stano. Zacke agreed to do so, in return for favors from the state. After that, according to Zacke, two persons from the prosecutor's office told him what to say at trial.77

In March 1998, the Florida Supreme Court denied Stano a retrial based on Zacke's retraction. The Court ruled that, even if Rosenblatt's affidavit were admissible evidence, "there was no reasonable probability that the outcome of a new trial would produce an acquittal."78

68 1984 Transcript, supra note 23, at 1210.
69 Transcript of State's Closing Argument at 11, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. July 21, 2004) (copy on file with author).
70 See Stano v. State, 473 So.
2d 1282, 1285 (Fla. 1985).
71 See, e.g., John A. Torres, Tales of a Jailhouse Snitch, FLA. TODAY, Nov. 26, 2004, at 1A.
72 See Martin Dyckman, Infamous Justice, ST. PETERSBURG TIMES, Aug. 22, 2004, at 1P.
73 Brief of Appellant at 6, Stano v. State (Fla. March 20, 1998) [hereinafter Stano Brief], available at http://www.law.fsu.edu/library/flsupct/92614/92614ini.pdf ("No physical evidence, contraband, eyewitness testimony, or forensic evidence of any kind connects Petitioner to this or any other homicide.").
74 Torres, supra note 71.
75 Id. ("Then, a curious thing happened: Stano and Dedge spent just enough time with Zacke for him to testify against them -- saying they had confessed their crimes."); Dyckman, supra note 72 ("In both cases, the state needed stronger evidence for retrials. In both cases, Zacke miraculously turned up to say the defendants had boasted of the crimes.").
76 Stano Brief, supra note 73, at 6.
77 Id. at 21 (citation omitted).
78 Stano v. Florida, 708 So. 2d 271, 275 (Fla. 1998).

Stano had confessed.79 It was of no moment that Stano's confession did not match the crime, or that the conviction hinged upon Zacke's testimony, without which the first jury had refused to convict Stano. Three days after the court's ruling, Stano was electrocuted.80 In his final statement, Stano exclaimed, "I am innocent. . . . Now I am dead and you do not have the truth."81

In August 1984, Zacke took the stand in the trial against Wilton Dedge, State v. Dedge, and provided details of the crime that smacked of coaching, details that even Dedge did not know. "He knew more about my case than I did," Dedge observed.82 Zacke even threw in some fanciful touches: Dedge allegedly confessed to Zacke that he drove his motorcycle over 160 miles per hour to Ms.
Smith's house, arriving in fifteen minutes (a nearly fifty-mile trip),83 and that he was able to commit the crime and return to the garage without anyone noticing his absence.84 To inflame the jury, Zacke testified that Dedge "never mentioned the girl's name, he just called her a bitch, that's all he ever said, that's the only name I knew . . . ."85 He added that Dedge threatened to kill Ms. Smith if he was ever released.

"So what you're telling us is you'll conspire to kill somebody to keep from going to jail, but you wouldn't lie to get out of jail?" Horwitz asked Zacke on cross-examination.86 "Maybe hard to believe, yes," Zacke replied.87 "He was one of the most intelligent witnesses I'd ever seen," said Horwitz, "masterful," even "artistic" on the stand.88

Given that Dedge was innocent of the crime, and that Zacke was unequivocally lying, the question arises: where did Zacke obtain his information? Nina Morrison of the Innocence Project, who would later represent Dedge, explained:

When you look at the wealth and type of detail he had, there are only three places that he could have received that information from: Wilton, Wilton's attorneys, or from prosecutors. We know it didn't come from Wilton or his lawyers. It doesn't take a rocket scientist to connect the dots.89

Talbot "Sandy" D'Alemberte, the former President of the American Bar Association and of Florida State University, who represented Mr. Dedge in his civil suit against the State of Florida, similarly wondered how Zacke got his information. "[H]ow did Zacke suddenly appear on this van with Wilton," D'Alemberte questioned.90 "The state attorney said Zacke was going to his own hearing, but a review of the file shows no hearing for Zacke that matches with this time. What the heck is he doing on this van and who put him there?"91

79 Id. at 272.
80 Dyckman, supra note 72.
81 Id.
82 Dedge Interview, supra note 22.
83 1984 Transcript, supra note 23, at 1214.
84 Id. at 1213-14.
85 Id. at 1213.
86 Id.
at 1230.
87 Id.
88 Horwitz Interview, supra note 54.
89 Torres, supra note 71.
90 Telephone Interview with Sandy D'Alemberte (Oct. 12, 2006) [hereinafter D'Alemberte Interview].
91 Id. The prosecutors have denied any misconduct, noting that Zacke made no demands in return for his testimony against Mr. Dedge. After trial, however, Zacke was granted his two requests: his confiscated truck was returned to his wife, and he was transferred to another prison. Laurin Sellers, DNA Test Prompts Brevard Man to Seek 3rd Trial, ORLANDO SENTINEL, July 15, 2002, at A1, available at 2002 WLNR 12808577.

Dedge spent the entire day in the transport van, from 6:30 in the morning until after dark. At the start of the journey, there were four or five other prisoners in the van, Dedge explained.92 "Then we stopped in a little town, and everyone else got off and boarded another van -- even though some of the others were going in the same direction as I was -- and Zacke got on. In hindsight, I should have known something wasn't right."93

Dedge's alibi witnesses from the auto shop did not testify. Horwitz believed that the gruff appearance of the alibi witnesses, with one witness having a criminal record, had worked against Dedge at the first trial.94 Dedge again took the stand, however, and forcefully denied the allegations, calling Zacke a bold-faced liar.95 And once again, defense witnesses challenged the hair evidence. "The experts admitted they couldn't say that it was Wilton Dedge's hair," said Horwitz, "just that it's a white guy with blond hair, like Wilton."96

But that is not what the prosecutor told the jurors. He told the jurors: "this pubic hair was identical to some of [Dedge's] hairs, identical in every single respect. . . . So we have hairs from this Defendant that are identical to, in every characteristic, to the pubic hair from this sheet."97 For Dedge to be innocent, the prosecutor added, "you would have to assume there is a man out there who committed this crime . . . and that . . . [this] particular man would have pubic hair identical to Wilton Dedge . . . ."98

The all-male jury deliberated for seven hours and once again found Dedge guilty. In addition, Zacke's testimony that Dedge had threatened to kill Ms. Smith permitted the judge to increase the sentence: this time, he received a life sentence.99 Dedge was in shock. "It was like watching everything through a third person," he explained.100 "They said I showed no emotion, but they didn't understand what shock is."101

C. Year after Year of Futile Hope

It was 1984, and at the age of twenty-two, Dedge was facing the rest of his life in a Florida prison. Following the verdict, Dedge spent nearly two years locked away in solitary confinement, at his own request, preferring the cruel monotony of "solitary" to the horrors lurking beyond the solid steel door to his cell. "If I wanted to avoid rape and abuse," Dedge explained, "my choice was simple: either stab a guy or go into solitary. I chose solitary."102 Separated from the bad but also the meager good (the human contact, noises, smells, shadows, and sunlight that remind us of our humanity), Dedge searched for the strength to keep going. He could always choose to go back into the general population at any time. "[I]f you want to get out of solitary" and remain safe, a guard advised him, "you've got to go out there and stab somebody."103

92 Dedge Interview, supra note 22.
93 Id.
94 Horwitz Interview, supra note 54.
95 Id.
96 Id.
97 1984 Transcript, supra note 23, at 1659 (emphasis added).
98 Id. at 1659-60 (emphasis added).
99 LaPeter, supra note 20.
100 Dedge Interview, supra note 22.
101 Id.
102 Id.

With two trials disastrously decided, hope was hard to come by. With the second trial only making things worse, what were the chances of a third trial?
Those chances, however slim, continued to keep Dedge alive. Dedge's parents also held out hope and, with their modest means, took out a second mortgage on their house, depleted their pension, scrimped and saved every penny, and continued to appeal the conviction.104 "I basically lived from one appeal to the next," said Dedge.105 "I knew I wasn't guilty, so I believed in the system. I had faith that sooner or later I'd get out."106 As the years passed, however, the appeals dried up, and the precious years of young adulthood disappeared with them. Dedge wrote to dozens of lawyers, but none would take his case. In fact, only one of them even bothered to acknowledge his request, declining to assist him.107 Meanwhile, Dedge read anything he could find to pass the time in solitary, borrowing as many library books as the prison would allow. At least once or twice a month, Dedge's parents made the long, exhausting trip to visit their son. They did so every month of his confinement, at times traveling over 200 miles from their home.108

Finally, in 1986, Dedge secured a transfer to a less dangerous prison facility, a facility in which he would venture out of solitary confinement. To call any of the facilities in which the state confined Dedge "safe," however, would be absurd. Dedge witnessed stabbings, rapes ("one day, the lights went out, and the back-up generator did not kick on"),109 and beatings. "You've got to be on your toes 24/7," said Dedge.110 "You're always looking over your shoulder, watching what you say and what you do."111 Yet it was not the violence but the boredom that posed the greatest challenge. "Every day is the same as the last, week after week, month after month, and year after year. There is nothing to look forward to but the same monotony, day in and day out."112 Dedge did his best to improve himself, stay out of trouble, and stay sane.
He took a course in small business management taught by a professor from the community college, earning about thirty credits until the State transferred him to a facility where no such courses were offered. The new facility needed a welder, and Dedge had taken welding courses in prison.113 Dedge also studied water management and waste disposal, earning licenses in both waste water management and drinking water management. For the last eight years of his incarceration, he ran the water plant at the Cross City Correctional Facility in Cross City, Florida.114

103 Id.
104 G. Dedge Interview, supra note 50.
105 Dedge Interview, supra note 22.
106 Id.
107 Id.
108 G. Dedge Interview, supra note 50.
109 Dedge Interview, supra note 22.
110 Id.
111 Id.
112 Id.
113 Id.
114 Dedge Interview, supra note 22.

D. Enter the Innocence Project

In 1988, when Dedge first learned of DNA testing, he wrote the Florida State Attorney's Office requesting that DNA testing be done on the physical evidence in his case. "I knew that DNA testing was my key to the door," said Dedge, "and that if I ever got the evidence tested, I'd be out."115 The state attorney had the authority and discretion to authorize such DNA tests, but refused to do so.116 Six years later, in October 1994, Dedge happened to see the end of a segment on Good Morning America featuring Peter Neufeld, co-Director of the Innocence Project in New York. Founded in 1992 by Mr. Neufeld and Barry Scheck (famous for his role as a member of the O.J. Simpson "Dream Team"), the Innocence Project helps ostensibly innocent inmates challenge their convictions using DNA evidence. Intrigued by what he heard, Dedge wondered if the Innocence Project might be able to help him. He decided to write Mr. Neufeld. Dedge knew it was a long shot, but he had written every other lawyer he could think of, so why not one more?
In the moving 2006 documentary, After Innocence, which highlights the struggles of the wrongfully convicted after release, Dedge read portions of that fateful letter:

Dear Mr. Neufeld, my name is Wilton Dedge and I am very interested in your organization "Innocence Project." I caught the tail end of your interview on the Good Morning America show a few weeks ago. . . . I don't know where else to turn. I tried everything I could to prove my innocence when this first started. When I found out that the police were looking for me, I turned myself in, knowing it was all a mistake and that it would be straightened out. . . . I could write a number of pages telling you how outlandish the case is, but I know you are very busy so I'll close for now. I thank you in advance for your time and any help you can give.117

The small staff of attorneys at the Innocence Project was overwhelmed with requests like Dedge's. Even with the help of a small army of law student interns, the Innocence Project receives thousands more requests than it can handle. "We currently have 10,000 cases pending from inmates across the country," said Morrison.118 Then there is the vetting process. Given their limited resources and the overwhelming demand, the Innocence Project must fully investigate a prisoner's claim, and the potential for a post-conviction appeal, before agreeing to add the case to their docket. This process can take three or four years.119

Dedge would not have to wait three or four years, however. Appalled by the weakness of the case against Dedge, and hopeful that DNA could help secure his freedom, the Innocence Project agreed to take his case in 1995. Distinguished Florida defense attorney Milton Hirsch agreed to assist as local counsel on a pro bono basis. In April 1997, Scheck and Hirsch filed a motion to permit DNA testing of the evidence taken from the scene of the crime.

115 Id.
116 2005 Petition, supra note 36, at 5.

This included anal and vaginal swabs taken from the victim
and the hairs recovered from her bed.120 They emphasized the weakness of the prosecution's case, particularly the testimony of Preston:

Prosecutors from New York, Florida, and Arizona, as well as Federal Postal Inspectors working in Florida, Ohio, Kentucky, and New York have found Mr. Preston's and his dogs' abilities to be questionable, his claims unfounded, and his testimony unusable. In fact, an internal investigation conducted in 1983 by the Special Investigations Division of the Chief Postal Inspector's office resulted in the recommendation that Mr. Preston should no longer be used by the Postal Inspector's service.121

117 AFTER INNOCENCE (New Yorker Films 2005) [hereinafter AFTER INNOCENCE].
118 Interview with Nina Morrison, in New York, N.Y. (Oct. 11, 2006) [hereinafter Morrison Interview].
119 Id.

A journalist in Arizona investigated Preston from May 1984 through October 1985, moreover, finding that "Mr. Preston's dogs were clearly wrong in some 40 different incidents, and that, early on, almost no prosecutors had ever conducted background checks of Preston's claims."122 Results of this investigation were aired on the ABC television program 20/20 after Dedge's second trial had been completed.123

And there was more. In 1986, the United States Court of Appeals for the Sixth Circuit reversed the Ohio robbery conviction of Dale Sutton, a conviction based in large measure on Preston's testimony. In reaching its decision, the court explained:

Preston also testified for the government at Sutton's trial. Preston offered testimony both as to his expertise in training and using "scenting" dogs and as to Harass II's training and qualifications as a "scenting" dog.
Sutton alleges, and the government does not now contest, that during the course of Sutton's trial Preston testified untruthfully as to his credentials, background, and training, and as to the abilities and ancestry of his German shepherd, Harass II.124

Given the paucity of evidence against Dedge, DNA tests were warranted to definitively establish whether he was in fact the perpetrator.

The argument was powerful, but there was one formidable problem: at the time, Florida law did not expressly provide a right to DNA testing. Although DNA tests had been around for years (gaining widespread notoriety in the trial of O.J. Simpson in 1995), Florida, like the vast majority of states, had failed to enact legislation to keep pace with this critical new technology. Florida's rule governing "post-conviction" remedies, Rule 3.850, contained no provision for DNA testing.125

120 Id.
121 1997 Memo, supra note 26, at 3 n.l1.
122 Id.
123 Id.
124 Sutton v. Rowland, No. 84-3785, 1986 U.S. App. LEXIS 19922, at *3-4 (6th Cir. Jan. 7, 1986) (per curiam). After Dedge's conviction, Preston was also "discredited through an embarrassing field test ordered by a Brevard County judge in another case and the revelation that some of his 'credentials' were bogus." Sellers, supra note 91, at A1. Soundly impeaching his claims in the Dedge case and others, Preston was unable to track a scent merely five days old. Id.
125 See FLA. R. CRIM. P. 3.850.

Under Rule 3.850, a convicted individual is prohibited from making a motion to vacate or set aside his sentence "more than 2 years after the judgment and sentence become
To obtain relief based on newly discovered evidence under this rule, the Florida Supreme Court required the moving party to meet two requirements: 1) "the asserted facts must have been unknown by the trial court, by the party, or by counsel at the time of trial, and it must appear that defendant or his counsel could not have known them by the use of diligence"; and 2) "the newly discovered evidence must be of such nature that it would probably produce an acquittal on retrial." 127 Florida's state attorneys vigorously opposed the motion. They argued that Rule 3.850 did not sanction DNA testing and, even if it did, the DNA testing that Dedge requested had been available "since 1993," well over two years before his motion. 1'This new evidence was not, or should not have been, unknown to Dedge or his attorneys long before the motion. "He knew about DNA testing and didn't do anything," said Robert Holmes, the state attorney who prosecuted Dedge in 1984. He "sat on his hands." 1 2 9 This was merely the opening volley in a seven-year war that the state attorneys would wage against Dedge and his attorneys in their quest to secure DNA testing and, ultimately, Dedge's freedom. Never mind that Dedge had spent year after year writing every lawyer he could find to secure further representation, only to be rejected by every one of them, or that when he first learned of the possibility of DNA testing in 1988, he immediately wrote to the state attorney to request the procedure, only to be denied. And never mind that this new miracle test could firmly resolve, once and for all, Dedge's guilt or innocence, potentially freeing a man who had already spent fifteen years in prison, six of which occurred after the state attorney denied his request for DNA testing. Strict adherence to the proper procedure and the "finality" of the jury's decision were more important than the search for truth and justice. 
"Their position seemed to be, 'apr&s Wilton, le deluge,"' Hirsch explains.>10 If this prisoner secured DNA tests, then everyone would have to be granted DNA tests. "But this was factually untrue," Hirsch points out. 1' 1 For the majority of crimes, there is no biological evidence, or it is lost or destroyed after conviction. "More importantly, in Wilton's case, there was substantial doubt about his guilt, and there was a substantial basis to believe that DNA could prove his innocence. Agreeing to test everyone for whom there is an independent basis to doubt guilt, and readily available DNA evidence, is a good thing.">12 As for finality, the DNA tests could establish, once and for all, the innocence or guilt of the convicted. Indeed, Hirsch points out that 126 Id. at (b). 127 Jones v. Florida, 591 So. 2d 911, 914-15 (Fla. 1991) (citation and internal quotations omitted). 128 State's Response to Defendant's Motion for Order Releasing Trial Evidence at 2, State v. Dedge. No. 82-135-CF-A (Fl. Cir. Ct. July 22. 1997) (copy on file with author). 129 Watt. supra note 20. 13 0 Telephone Interview with Milton Hirsch (Sept. 22. 2006) (hereinafter Hirsch Interview). 131 id. 132 Id. Four years later, in adopting an amendment to the Florida Rules of Criminal Procedure permitting post-conviction DNA testing, Florida Supreme Court Justice Anstead echoed these thoughts: "We are hardly opening any floodgates. But, for the rare case that presents a credible claim, we have the unique opportunity to lay to rest, through definitive DNA testing. the concern that a serious miscarriage of justice may have occurr ed." Amendment to Florida Rules of Criminal Procedure Creating Rule 3.853 (DNA Testing) Amendment to Florida Rules of Appellate Procedure 9.140 & 9.141, 807 So. 2d 633. 654 (Fla. 2001) (per curiam) (Anstead. J., concurring in pant and dissenting in part). 152 Vol. 
one of the first three Florida cases to examine DNA evidence after conviction confirmed the inmate's guilt.133

In August 1998, the court issued its decision, finding that because the DNA testing Dedge requested "was available in 1993 and the Defendant waited until April 24, 1997 to file his motion for release of evidence for DNA testing, the Defendant's DNA claim is procedurally barred."134 The Innocence Project appealed the decision to the court of appeal. Surely prisoners stuck in the state penitentiary could not be charged with knowledge of the latest DNA technology135 and the two-year statute of limitations, particularly where, as here, the prisoner could not even find a lawyer to help him.136 "[S]uch a holding," the Innocence Project asserted, "would be a monstrous distortion of Rule 3.850(b)(1), and of any notion of fair play."137

In December 1998, approaching Dedge's seventeenth year in prison, the court of appeal issued its ruling. In a 2-1 decision, it affirmed the trial court's denial of Dedge's motion without comment.138 Judge Winifred Sharp issued a strong dissent, emphasizing that "[t]he evidence of Dedge's guilt, other than the victim's testimony, was minimal."139 Specifically, Judge Sharp observed that the pubic hair from the crime scene "established only that Dedge 'could not be eliminated' as a possible source"; "[a]n inmate who had his sentence reduced from 180 years to 60 years testified Dedge confessed to him"; and "there was testimony that [scent dogs] were incorrect 40% of the time."140 Judge Sharp noted the unfairness of charging Dedge with knowledge of both DNA testing and of the two-year limitation:

Frankly, I think it is a very harsh reading of the two-year time limit in [R]ule 3.850 . . . . DNA testing is a recent, highly accurate, application of scientific principles unknown at the time of Dedge's trial.
It is not well known to or understood by most lawyers and judges, I would wager, even in 1998. I think it unfair and unrealistic to expect an indigent, serving two life sentences in prison, to have had notice of the existence of PCR-based testing, and possible application to his case prior to 1995 when it was first discussed by a Florida court.141

133 Hirsch Interview, supra note 130.
134 Order Denying Motion for an Order Releasing Trial Evidence at 3, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. Aug. 25, 1998) (copy on file with author).
135 See Initial Brief of Petitioner/Appellant at 7, Dedge v. State, 723 So. 2d 322 (Fla. Dist. Ct. App. 1998) (No. 82-2503) ("[T]he law does not impose the duty upon any defendant, much less an untutored, incarcerated defendant such as Wilton Dedge, to be diligent in learning of every development in DNA science that might alter his litigation posture.").
136 Id. at 8 ("[F]or a decade Mr. Dedge had no attorney.").
137 Id. at 7.
138 Dedge v. State, 723 So. 2d 322 (Fla. Dist. Ct. App. 1998).
139 Id. at 322 (Sharp, J., dissenting).
140 Id. at 322-23.
141 Id. at 324.

And she spoke passionately of the injustice of doing so in Dedge's case:

One of my worst nightmares as a judge, is and has been, that persons convicted and imprisoned in a "legal" proceeding, are in fact innocent. If there is a way to establish their true innocence on the basis of a highly accurate objective scientific test, like the PCR, in good conscience it should be permitted. This case calls out for such relief: the evidence of Dedge's guilt at trial was minimal; the PCR test had not been developed at the time of his trial. . . . [I]f successfully performed, [the test] will likely be absolutely conclusive of either his guilt or innocence.
Not to do the testing consigns a possibly innocent man to spend the rest of his life in prison.142

Unfortunately, Judge Sharp did not carry the vote in December 1998; on formalistic grounds, the two-judge majority consigned Mr. Dedge to many more years of prison.

E. The Inherent Authority of the Court

Nearly two more years would pass without movement on the case, and seemingly without hope of any movement. The 1998 appellate decision appeared to foreclose any further DNA testing, and thus any further proceedings to prove Mr. Dedge's innocence. But Wilton's legal team had not given up hope. Notwithstanding the 1998 decision, Hirsch called upon Brevard Circuit Court Judge Bruce W. Jacobus to exercise the court's "inherent authority to permit the release of certain evidence for the purpose of conducting DNA testing."143 Hirsch argued that Dedge was not seeking to overturn or vacate his conviction, but merely to obtain DNA testing that "might support an application for executive clemency."144

It was a long shot, but it worked. On June 16, 2000, Judge Jacobus ordered release of the evidence to ReliaGene Technologies for DNA testing, over the state attorneys' vehement objections.145 This was the first ruling of its kind in Florida history.146 The results, however, would not be available any time soon. The laboratories that conduct DNA tests have always faced large backlogs, with the average case requiring five or six months for a result. In addition, Dedge and his team had decided to test the vaginal and anal swabs from the rape kit first. After awaiting the outcome, they learned that the samples were too degraded to procure results under the existing technology; they were thus forced to request a mitochondrial DNA test on the pubic hairs.147

142 Id.
143 Defendant's Amended Motion for Release of Certain Evidence for the Purpose of Conducting DNA Testing at 2, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. June 8, 2000) (citing Garmire v. Red Lake, 265 So. 2d 2,
4-5 (Fla. 1972) ("[C]riminal courts may fashion [measures] within their inherent powers to provide necessary procedures and processes for the recovery of evidentiary items held by them.")) (copy on file with author); Miami Herald Publ'g Co. v. Collazo, 329 So. 2d 333, 336 (Fla. Dist. Ct. App. 1976) ("Every court has the inherent power to do all things that are reasonably necessary for the administration of justice within the scope of its jurisdiction, subject to valid existing laws and constitutional provisions.").
144 Defendant's Amended Motion for Release of Certain Evidence for the Purpose of Conducting DNA Testing, supra note 143, at 3.
145 Order Granting Defendant's Motion for Release of Certain Evidence for the Purpose of Conducting DNA Testing at 1, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. June 16, 2000) ("Under the very unique factual circumstances of this case, the Court will exercise its inherent authority to release specified evidence for purposes of DNA testing which was not a readily available technology at the time of the Defendant's trial or appeals. . . .") (copy on file with author).
146 Morrison Interview, supra note 118.
147 Id.

(Mitochondrial DNA is contained in the cytoplasm of the cell, rather than the nucleus, and is passed by a mother to both male and female offspring.) This is an expensive test, however, and before the test could be conducted, Dedge's team was required to raise the necessary funds.148 Just like that, another year of life was forfeited, or worse, it was relegated to the torture of prison.

In June 2001, the results were in: the pubic hair recovered at the crime scene did not belong to Dedge.149 Since the victim had testified that only she, her sister (who shared the same mitochondrial DNA), and the perpetrator had ever been in her bed,150 there could be no doubt that Wilton Dedge was innocent of the crime.
Dedge's team again moved under Rule 3.850 (yes, that Rule 3.850) to have his conviction overturned based on this new evidence. Unlike at the time of his first post-conviction motion, Dedge now possessed conclusive evidence of his innocence. "It took three years just to secure the right to get the evidence tested," said Hirsch, "and another year to secure the results. But when the results came back, I thought, we're done here, because clearly he's innocent."151

But they were far from done. Despite the conclusive DNA test results, the Florida Attorney's Office again opposed the motion on the grounds that it was time-barred.152 "Rules are rules," said state attorney Holmes, and "[i]t would be a nightmare if old cases were reopened. There is a need for finality."153 "The first three years that they fought to prevent the test, I was disappointed with the state attorneys' actions," Hirsch explains.154 He adds:

But after the DNA results came back, I thought, this is Kafkaesque. It was obvious to the prosecutors that Wilton was innocent. They didn't care and they said they didn't care. As a former prosecutor, I was horrified that they said his innocence didn't matter and that he had no remedy. If anything, they redoubled their efforts to prevent Wilton from gaining his release.155

The twenty-year anniversary of Dedge's incarceration came and went. In March 2002, Brevard County Judge Preston Silvernail denied Dedge's motion, agreeing with the state attorneys that, under Florida Supreme Court precedent, it was too late.156

148 Id.
149 Id. ReliaGene Technologies, which conducted the DNA testing, reported: "Wilton Dedge is excluded as the DNA donor." John A. Torres, Can DNA Set This Man Free?, FLA. TODAY, July 18, 2004, at 4, available at 2004 WLNR 23210128.
150 See infra note 182 and accompanying text.
151 Hirsch Interview, supra note 130.
152 State's Response to Defendant's Motion to Vacate Judgment and Sentence at 2, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. Dec. 20,
2001) ("[H]e is procedurally barred from filing this successive motion.") (copy on file with author).
153 Watt, supra note 20.
154 Hirsch Interview, supra note 130.
155 Id.
156 Order Denying Defendant's Motion to Vacate Judgment and Sentence, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. March 18, 2002) (copy on file with author). Anomalously, Judge Silvernail also ruled that even if the motion were not time-barred, and even if the DNA tests confirmed that the pubic hairs recovered from the victim's bed were not Dedge's, he would still not be entitled to post-conviction relief: The Court finds that there is not a reasonable probability that the outcome of the trial would have changed had the new test results been introduced at trial showing that the pubic hair(s) recovered from the victim's bed sheets did not belong to the Defendant. The victim identified the Defendant as the perpetrator and adamantly testified that he, not his brother, raped her. The victim made this identification several times . . . . Id. at 4.

Dedge's team appealed the ruling to the court of appeal, the same court that had denied the 1997 motion for testing by a 2-1 vote. Now, four years later, there was a critical difference: in October 2001, the Florida State Legislature enacted Criminal Rule 3.853, establishing a right to post-conviction DNA testing within two years of sentencing or, for all older cases, by October 1, 2003.157 The new rule also provided that a motion based on the evidence obtained is to be treated as raising a claim of newly discovered evidence.158 Dedge's case helped inspire the new law.

In November 2002, the Fifth District Court of Appeal unanimously affirmed Judge Silvernail's denial under Rule 3.850, but ruled that Wilton had a right to file a new motion under Rule 3.853.159 Judge Sharp, the lone voice of reason in the 1998 appellate decision, now found herself part of a unanimous majority. "Finally," she observed, "the Legislature has provided a limited remedy for convicted persons to seek to exonerate themselves by resort to DNA evidence . . . which the courts have not done."160 She warned Dedge to adhere strictly to the two-year limitation period, however, "since at this point, arguments based on due process and fundamental fairness have not succeeded in this state."161

F. Innocence Is Irrelevant

Dedge's team immediately filed a new motion before Judge Silvernail. The state attorneys opposed the motion on classic "Catch 22" grounds: the DNA testing on the hair had not technically been obtained pursuant to Rule 3.853, because the rule had not yet been enacted. Dedge's motion should not be "treated as raising a claim of newly-discovered evidence"162 under the rule, they argued, since the DNA test was not secured pursuant to Rule 3.853.163 If he had actually waited a year, he could have availed himself of the new rule.164 But because he secured the DNA test before the rule came into effect, he should be denied any relief. This, despite the fact that Dedge's case had inspired the new law. "You couldn't make this stuff up," said Morrison.165

157 Amendment to Florida Rules of Criminal Procedure Creating Rule 3.853 (DNA Testing) Amendment to Florida Rules of Appellate Procedure 9.140 & 9.141, 807 So. 2d 633 (Fla. 2001).
158 Id. at 635; Fla. R. Crim. P. 3.853(d)(2) ("A motion to vacate filed under rule 3.850 or a motion for postconviction or collateral relief filed under rule 3.851, which is based solely on the results of the court-ordered DNA testing obtained under this rule, shall be treated as raising a claim of newly-discovered evidence").
159 Dedge v. State, 832 So. 2d 835 (Fla. Dist. Ct. App. 2002).
160 Id. at 836 (Sharp, J., concurring).
161 Id. at 837.
162 State's Response to Defendant's Motion for Post-Conviction Relief Based on DNA Testing at 1-2, State v. Dedge, No.
82-135-CF-A (Fla. Cir. Ct. March 26, 2003) (citation and internal quotations omitted).
163 See id.
164 Carl Hiaasen, Still Behind Bars, Despite DNA Evidence, MIAMI HERALD, May 9, 2004, at L1, available at 2004 WLNR 19420712.
165 Morrison Interview, supra note 118. In his responsive papers, Hirsch commented, "[s]uch a position seems harsh and illogical to a degree to be found only in Kafka's The Trial, never in the courts of Justice of Florida." Defendant's Reply to State's Response to Defendant's Motion for Post-Conviction Relief at 4, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. April 7, 2003) (copy on file with author).
166 Order Granting Defendant's Motion for Postsentencing DNA Testing at 9, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. June 23, 2003) (copy on file with author).
167 Id. at 5.
168 Brief of Respondent/Appellee at 17, State v. Dedge, No. 5D03-2238 (Fla. Dist. Ct. App. Jan. 28, 2004).
169 Morrison Interview, supra note 118.
170 AFTER INNOCENCE, supra note 117.
171 Dedge Interview, supra note 22.
172 Id.
173 Hiaasen, supra note 164, at L1.
174 Id.
175 Id.

In June 2003, Judge Silvernail ruled that the DNA results were admissible, just as the Fifth District Court of Appeal had directed in its November 2002 decision.166 "There is a reasonable probability that the Defendant would have been acquitted had the evidence been admitted at trial," Silvernail opined.167 It appeared that Wilton would have his new trial and soon, he believed, his freedom. In fact, might the State agree to free him, given his manifest innocence, rather than subject him to another trial?

It would not. To the contrary, the State appealed Judge Silvernail's decision, a frivolous appeal given that the judge was merely following the appellate court's explicit instructions on the matter. In his papers opposing the State's appeal, Hirsch fulminated:

The State's legal advisors have submitted a 17 page brief . . . have demanded this Court's time and attention, have added to the days and weeks and months that Wilton Dedge will spend behind bars, all in defense of the incarceration of a man whose innocence they have entirely ceased to contest. Such conduct on behalf of the State's legal advisors is . . . but words are so terribly inadequate. Shameful? Monstrous? Orwellian?168

"We lost another year of Wilton's life litigating an issue that the State had already lost," Morrison explains.169 After Innocence contains footage of Dedge in prison during this period. We see him in handcuffs and leg cuffs, led by a guard to meet with Morrison to discuss the prospects on appeal. Ever glancing downward, his large, sullen, deliberate eyes seem to cower, projecting sadness and a broken despondency that words cannot convey.170 "You quit getting your hopes up after a while," Dedge later reflected.171 "I really didn't have much faith any more. By this point, my feelings were, when I'm out of the gates, I'll believe it."172

At oral argument on appeal in early 2004, Judge David Monaco of the Fifth District Court of Appeal appeared perplexed by the State's position: "Why hasn't this guy been given a new trial? Why is the State standing in the way?"173 Judge Emerson Thompson then interjected: "Let me ask you a hypothetical question. If you knew with 100 percent certainty that this man was absolutely innocent, would that change your position in this case?"174 "No," the State Assistant Attorney General responded, "[t]hat is not the issue."175 At that moment, says Hirsch, "the judge's jaw dropped open. It was the first time I had ever seen that in all my years of practice."176 "If I hadn't been in the courtroom at the time," Morrison adds, "I would not have believed that it happened. You could have heard a pin drop."177 In April 2004, the court of appeal affirmed Silvernail's decision.
178 Three months later, Dedge and his team were back before Judge Silvernail to request a new trial based upon the exculpatory DNA evidence. Refusing to consent to a new trial, the state attorneys pulled out all the stops in opposing the request. Even under the new rule, state attorney Holmes and his colleague argued, Dedge should not be granted a new trial. Holmes is the attorney who, twenty years earlier, had relied upon the pubic hair to secure Dedge's conviction, telling the jury that it was "identical in every single respect" to Dedge's,179 and that to acquit Dedge, the jury would have to assume that the real perpetrator had "pubic hair identical to Wilton Dedge."180 Now he was arguing that the hair was irrelevant. As his colleague explained, the fact that it did not match Wilton's hair proved nothing:

They've got a hair that could have come from God knows where, and it's not Wilton Dedge's, and we ought to just say, let that man walk. Let the citizens of Brevard County find out if he really is a rapist, whether he really assaults people in the way that he did here. Let's find out the hard way is what [Morrison] tells you. . . . [B]ut let me tell you, pubic hair is around, too. Just go into the urinal down there at the end of the hall and take a look. It's there. It gets pulled out. . . . [T]here deserves to be some finality for victims, for the community, for everybody. . . . These people don't have to be fair. They aren't fair. They don't want to be fair. This is "Project Innocence."181

The victim had stated that only she, her sister, and the rapist had ever been in her bed.182 The state attorneys then brought in the victim's father, twenty-two years after the crime, to proffer a brand new theory to explain the presence of the pubic hair in the victim's bed. Two weeks before the crime, Mr. Smith testified, his daughter had purchased a new dresser, and two men had moved the dresser into her room.183 The pubic hair in Ms.
Smith's bed might have come from one of these men. "We were tempted to ask on cross-examination if the job was performed by the 'Naked Movers of Central Florida,'" says Morrison, "but this was no laughing matter. It was yet another example of the absurd lengths to which the State was willing to go to keep an innocent man in prison."184

176 Hirsch Interview, supra note 130.
177 Morrison Interview, supra note 118.
178 State v. Dedge, 873 So. 2d 338 (Fla. Dist. Ct. App. 2004).
179 See supra note 97 and accompanying text.
180 See supra note 98 and accompanying text.
181 Transcript of State's Closing Argument, supra note 69, at 4, 9, 11-12, and 15. Much of this speech, in all of its ironic poignancy, is captured in the film After Innocence. AFTER INNOCENCE, supra note 117.
182 Adam Liptak, Prosecutors Fight DNA Use for Exoneration, N.Y. TIMES, Aug. 29, 2003, at A1 ("Though the victim said that only she, her sister and the rapist could have left the hairs in her sheets, the tests excluded the sisters and Mr. Dedge.").
183 Morrison Interview, supra note 118.
184 Id.

Even without the hair evidence, "[t]here's a great deal of other evidence," the state attorney continued,185 including, he emphasized, the testimony of Zacke or Preston.186 A sense of irony again saturated the courtroom: Norm Wolfinger, head of the Brevard County State Attorney's Office, was a former public defender who had seen his own innocent client sentenced to death based on the testimony of Preston in 1982. Mr. Wolfinger's client, Juan Florencio Ramos, spent five years in prison, four of them on death row, for a murder he didn't commit.187 There was no physical evidence linking Ramos to the crime, only the damning "testimony" of Harass II.188 After sniffing a pack of cigarettes from Ramos, the dog allegedly detected his scent on the victim's blouse, among four others in the line-up.
As the Florida Supreme Court later noted, "[t]he victim's shirt was the only one that had been worn by a female and was the only shirt with blood on it."189 Then, faced with five knives, Harass II licked the knife used in the murder. It was the only knife with blood, i.e., food, on it (just as Ms. Smith's sheets were the only sheets with blood on them in the Dedge lineup).190 Based upon this scent "identification," Ramos was sentenced to die.191 Ramos' freedom was secured, in part, because of the 1985 exposé on 20/20 revealing that Harass II and other scent dogs routinely misidentified suspects.192 "I think Preston has gone beyond the bounds of what other people think is reasonable," Wolfinger said at the time, adding, "I wouldn't want my life to depend on what that dog says."193 Decades later, Wolfinger's attorneys were arguing that Wilton Dedge's life should depend on what the very same dog had said.

For their final act, the state attorneys dropped a bombshell: they moved to have the semen from the crime scene tested before granting a new trial.194 "The problem is that hair does not have to be from the perpetrator," they argued.195 "The semen that he left does."196 After fighting against the DNA tests for more than seven years, the State now demanded that further DNA tests be conducted to "back up" the mitochondrial DNA test.197 Under the technology available in 2000, the semen samples could not be tested, but a new method of DNA testing, called "Y chromosome" testing, might well produce results. Judge Silvernail said, "I feel like I've been thrown a curveball on this a little bit,"198 but he ultimately ordered the new tests.199 There would be no trial until the new results were in. It was back to prison for Dedge.

185 Transcript of State's Closing Argument, supra note 69, at 18.
186 Id. at 10, 15-19.
187 Sydney P. Freedberg, Florida's Wrongly Convicted Condemned/Freed from Death Row, ST. PETERSBURG TIMES, July 4, 1999, at 1A.
188 Id.
189 Ramos v. State, 496 So.
2d 121, 122 (Fla. 1986).
190 Id.
191 Freedberg, supra note 187.
192 Id.
193 Alex Beasly, Legal Foes Differ on Value of Dog's Nose, ORLANDO SENTINEL, Jan. 30, 1984, at C1, C4.
194 AFTER INNOCENCE, supra note 117.
195 Id.
196 Id.
197 Transcript of State's Closing Argument, supra note 69, at 5.
198 Id.
199 Order Granting State's Request for DNA Testing on Semen on Anal Smear Slide, State v. Dedge, No. 82-135-CF-A (Fla. Cir. Ct. July 23, 2004) (copy on file with author).
200 Morrison Interview, supra note 118.
201 Id.
202 Id.
203 See Order, supra note 199 ("The State of Florida . . . shall bear all costs.").
204 Morrison Interview, supra note 118; Forensic Test Results Report #2, ReliaGene Technologies, Aug. 11, 2004 ("Wilton Dedge and all his paternal relatives are excluded as the DNA donor in [the relevant sample].") (copy on file with author).
205 Dedge Interview, supra note 22.
206 Id.
207 See Beth Kassab, Innocent Ex-Con Wants State to Pay, ORLANDO SENTINEL, Mar. 24, 2005 (noting that Dedge left jail "without so much as a bus ticket home").
208 Laurin Sellers, Cleared Man Leaves Prison with Nothing, ORLANDO SENTINEL, Sept. 12, 2004, at A.
209 Id. ("He didn't receive the counseling or job referrals or temporary housing the state offers paroled murderers, rapists and thieves.").

As its name suggests, the Y chromosome test isolates genetic markers on the Y chromosome, which only men possess. Unlike mitochondrial DNA, these markers are passed from father to son.200 At the time, scientists were able to look at eleven of these markers (they are now able to examine seventeen).201 The key in any such tests is to secure enough markers from the strand of DNA to conclusively determine if there is or is not a match, for any individual might coincidentally share a few random markers with another. Any markers that do not match, however, automatically rule out a suspect. The great fear was that the sample was so small and degraded that no markers could be extracted, or worse, that a marker or two might randomly match with Dedge. The latter was unlikely, but given the macabre twists in the case to this point, there was ample cause for concern.

ReliaGene Technologies began testing on July 29, 2004, promising to deliver expedited results within ten working days, a dramatic improvement on the normal waiting time for DNA testing.202 This time, money was not an issue: since the State had requested the test, the State would bear the costs.203 On August 11, 2004, the results were in. Scientists were only able to extract four markers, but Dedge was ruled out on two of them (as were all of his paternal relatives).204 To no one's surprise, the DNA evidence again proved conclusively that he was not the rapist. "I didn't know whether to laugh or cry," said Dedge.205 "I ended up laughing, because I couldn't cry in front of hardened cons."206

G. Free at Last

On August 12, 2004, the State of Florida released Wilton Dedge after twenty-two years of wrongful incarceration, at two o'clock in the morning. If prison is torture, then the State of Florida had tortured this innocent man for twenty-two straight years. To show their contrition, state authorities provided Dedge absolutely nothing upon release.207 Since he was technically awaiting his results in jail, rather than prison, the State did not even give him the one-hundred dollars given to released prisoners.208 And since technically he was not a convicted felon, he did not receive the job training and post-release assistance provided all other releasees.209 In fact, Dedge did not even have clothes to wear when he left jail. After visiting her son earlier in the day, Mary Dedge returned home and gathered up her husband's clothes and shoes so that Dedge could finally walk into freedom.210

In another ironic twist of unbridled proportions, the State Attorney's Office took the credit for freeing Dedge, blaming Dedge's lawyers for resisting DNA testing and forcing Dedge "to languish in prison."211 "The bottom line is they had evidence that could exonerate him and two months ago they objected to it being tested," Wolfinger told the press.212 "If it weren't for [the State Attorney's] office asking for this DNA test, he might still be in jail for years. I'm proud of this office for having that test done."213 Dedge had requested DNA testing of the evidence in 1988.214 The State Attorney's Office's denial of that request consigned him to sixteen more years in jail.

To this day, none of the prosecutors has apologized to Dedge in person, a fact that still burns. Wolfinger apologized in the press the day of Dedge's release and wrote Dedge a letter of apology, but as Dedge and his family note, he wasn't even one of the prosecutors who put Dedge behind bars for twenty-two years.215 Those prosecutors have yet to extend Dedge an apology. "I was brought up to believe you respect the law and what it stands for," said Dedge Sr., "and I do. But I have no respect for the state attorneys. They should be disbarred, and they should spend time in jail. I don't know how they look themselves in the mirror."216 Dedge Sr. is not a man given to hyperbole. "I think if they want to be man enough to call me a rapist," said Wilton Dedge, "then they ought to be man enough to come in person, to my face, and apologize to me and my family. That's the least they could have done."217

H. The Quest for Compensation

Dedge emerged from twenty-two years of prison without a cent to his name, without a job or any job prospects, without health coverage, a college degree, or even clothes of his own.
His parents were not in great financial shape either. Legal battles spanning more than two decades had caused significant financial hardship. To pay for their son's defense, his appeals, prison spending money, exorbitant phone bills from collect prison calls, and even the DNA tests, the Dedge family had scraped and borrowed, taking a second mortgage on their home, depleting Dedge Sr.'s entire pension fund, and paying $17,000 in penalties alone for the early withdrawals.218 Dedge's father would be forced to work years beyond his planned retirement age.219

210 Telephone Interview with Mary Dedge (Nov. 5, 2006).
211 LaPeter, supra note 20.
212 Id.
213 Id.
214 Dedge Interview, supra note 22.
215 Id.; G. Dedge Interview, supra note 50.
216 G. Dedge Interview, supra note 50.
217 Dedge Interview, supra note 22.
218 Randy Shultz, No Law for 'Taking' of Man's Life, PALM BEACH POST, Apr. 3, 2005.
219 G. Dedge Interview, supra note 50.

The State of Florida offered absolutely no compensation for stealing twenty-two years of this man's life, and much of his parents' lives as well. Florida, like so many other states across the country, has no law or procedure for compensating wrongfully convicted and imprisoned individuals,220 and neither state officials nor state legislators were keen to change that. It was quite possible that Dedge and his family would receive nothing for their torturous odyssey.

That is when Sandy D'Alemberte, éminence grise of the Florida bar, stepped in. Special Counsel to Hunton & Williams and former President of the American Bar Association and Florida State University, Mr. D'Alemberte is as accomplished and esteemed a lawyer as you will find in the State of Florida, or just about anywhere else. He is also a former legislator and, until 2006, served as President of the Florida Innocence Initiative, Florida's version of the Innocence Project.
In 2004, he agreed to represent Dedge on a pro bono basis in his quest for compensation. D'Alemberte's strategy was two-fold: if need be, he would file suit against the State of Florida for wrongful conviction and imprisonment, but he would also utilize his significant contacts and good standing with the Florida Legislature to lobby for a bill to compensate Dedge for all that the state put him through.221 This was an egregious case, and it would be nice if the State agreed to pay rather than force Dedge to pursue even more litigation. "He was a remarkably good citizen in prison," said D'Alemberte, "with no major disciplinary reports during twenty-two years of incarceration."222 Among other things, D'Alemberte noted, Dedge had worked for the state: he ran a waste water plant for one of the prisons for eight years. "The state would have to pay for this," D'Alemberte concluded.223

To assess the value of Dedge's claim, D'Alemberte commissioned an economic study of the losses occasioned by the wrongful conviction. The study, by three eminent labor economists, looked at such factors as lost wages, lost social security payments (unable to work, Dedge had made absolutely no payments into the system), services rendered to the State in prison, and the money Dedge's parents expended over the years, with interest.224 "To be conservative," D'Alemberte explained, "we didn't even factor in his skills, even though he had them."225 The final report assessed the economic losses to Dedge and his family, with interest, at $2,582,000. This figure did not include "the costs associated with the loss of liberty and reduced quality of life suffered by Mr. Dedge and his family,"226 i.e., the twenty-two years of torture. D'Alemberte presented his report to state officials, seeking a total of $4.9 million in compensation for the economic losses (as calculated by the experts) together with the "loss of liberty."
227 Florida's Speaker of the House, Allan Bense, suggested that Dedge file suit rather than seek compensation from the Legislature. Dedge should "explore all the local options before all the taxpayers of Florida participate," he argued.228 "It's our policy."229 But there was no policy for seeking and obtaining compensation for wrongful conviction in Florida. "Doesn't that have a familiar ring to it?" St. Petersburg Times columnist Martin Dyckman observed.230 "Fairness doesn't matter. Rules do. Even if there are no rules."231

220 Martin Dyckman, Find a Way to Compensate Dedge, Now, ST. PETERSBURG TIMES, Feb. 20, 2005 ("Unlike 19 other states . . . Florida has no law, rule, or precedent for compensating someone like Dedge."); Shultz, supra note 218 ("Florida has no law to compensate the wrongly convicted . . . ."); Justice After Prison: Fair, Straightforward Compensation for the Innocent, DAYTONA BEACH NEWS-J., Apr. 29, 2008, at 0A4 ("As cases nationwide are overturned by DNA evidence, states have set up automatic programs that provide a quick, dignified compensation process to victims of wrongful convictions. Florida, which leads the nation in exonerations, needs such a process.").
221 D'Alemberte Interview, supra note 90.
222 Id.
223 Id.
224 Id.
225 Id.
226 D'Alemberte Interview, supra note 90.
227 Id.
228 Dyckman, supra note 72.

In April 2005, a House Committee recommended that the State offer Dedge no more than $200,000.232 D'Alemberte's attempt to negotiate a fair settlement proved fruitless, and the Legislature ended its session on May 6, 2005 without authorizing a dime for Dedge.233 On May 27, 2005, D'Alemberte filed suit against the State.
It was a long shot based not upon any statute (for there was none), but upon the "all writs" authority of the court.234 "In the absence of legislative relief," D'Alemberte's papers explained, "any remedy is dependent on extraordinary equitable proceedings."235 The failure to provide such proceedings, D'Alemberte argued, would violate the Florida Constitution, which states: "[t]he courts shall be open to every person for redress of any injury."236 D'Alemberte argued that Dedge deserved redress for the taking of his liberty.

While Judge William Gary of Leon County, Florida considered the State's motion to dismiss the case, Dedge took whatever employment he could find given his skill set and lack of work history, including landscaping and home improvement. And he continued to adjust to a life of freedom. For twenty-two years, he was denied virtually every right and every comfort, including the right to make the minutest decisions for himself. "You're told what to do, when to do it, and how to do it every day," said Dedge.237 "When I brought him home," said Dedge Sr., "he asked me, 'can I go outside and smoke?' I told him, you are a free man, and you don't have to ask anybody's permission."238

Everything was new to Dedge, from cell phones to computers, emails to DVDs, to say nothing of the massive social and cultural changes. Waiting for his father in a supermarket parking lot, Dedge explained, he spent thirty minutes just trying to figure out how to turn on the car radio.239 "Those new dashboards have so many buttons crammed onto one small space," he said. Above all, Dedge enjoyed being outdoors. "Just seeing the night sky was great. The open space, the stars . . . . For twenty-two years, I couldn't look more than twenty feet without seeing a fence or bars."240 Not surprisingly, Dedge now works full-time in landscaping, a job that allows him to enjoy the Florida sky without limitation.
In August 2005, Judge Gary dismissed the lawsuit, ruling that the State of Florida enjoyed sovereign immunity from such lawsuits.241 Unless the Legislature agreed to waive that immunity by passing a law specifically allowing the wrongfully convicted to seek compensation from the State, the State could not be sued. "While everyone is in agreement that what happened to Wilton Dedge is tragic," the judge opined, "only the Legislature can address the issue of compensation under existing law."242 The Legislature told Dedge to seek compensation in court, and now the court was telling Dedge to take it up with the Legislature. Morrison was right: you really couldn't make this stuff up.243

229 Id.
230 Id.
231 Id.
232 Shultz, supra note 218. Florida State Representative David Simmons explained, "[w]e're not in the business of providing a lottery to someone who's been wrongly convicted." Jackie Halifax, Senate to Look at Compensation for Wrongly Convicted, ASSOC. PRESS, Feb. 9, 2005.
233 Laurin Sellers, Legislature Shuns Cleared Inmate, ORLANDO SENTINEL, May 23, 2005, at 10B.
234 2005 Petition, supra note 36, at 1.
235 Id. at 15.
236 Id. (quoting Fla. Const. art. I, § 21).
237 Dedge Interview, supra note 22.
238 G. Dedge Interview, supra note 50.
239 Dedge Interview, supra note 22.
240 Id.
241 Do the Right Thing for Dedge, ST. PETERSBURG TIMES, Sept. 2, 2005, available at http://sptimes.com/2005/09/02/Opinion/Do the right thing fo.shtml.

Now that Dedge had exhausted his legal options, the Legislature could no longer deflect his plea for compensation. The longer they did so, moreover, the longer this wound, this indictment of the Florida judicial and prosecutorial system, would fester.
When the Legislature returned to session, several lawmakers, including Senator Mike Haridopolos (R-Indialantic), Senator Daniel Webster (R-Winter Garden), and Representative John Quinones (R-Kissimmee), sponsored a bill to provide Dedge with two million dollars, along with free tuition and fees at any State university.244 Passed by the Legislature and signed by Governor Jeb Bush on December 14, 2005, the law acknowledged Dedge's "valuable services for the state" and the "significant expenses" his parents incurred in establishing his innocence.245 The law made no mention of the seven-year battle the State waged to resist DNA testing and exoneration, or the fact that Dedge was released three years after the DNA evidence had established his innocence. The lawmakers did, however, acknowledge that "the state's system of justice yielded an imperfect result with tragic consequences."246

"For 22 years, Wilton Dedge was wrongfully denied one of his basic rights as an American: his liberty," Senator Webster commented.247 "While no amount of money will ever fully compensate Mr. Dedge, today the Senate voted to put justice and compassion above politics and allow the Dedge family to finally move on with their lives."248 For blocking the earlier passage of this legislation, Speaker Bense apologized to Dedge.249 Governor Jeb Bush, after signing the bill in Tallahassee, flew into a small airport near Dedge's home for a ceremonial signing of the bill, and to apologize to Dedge.250 Dedge explains that then-Governor Bush told him, "I was not in office at the time, but I wish to extend my apologies for the wrong that was done to you."251

242 Victor Thompson, Judge Kicks Dedge's Case, FLA. TODAY, Aug. 30, 2005, available at http://www.nlada.org/DMS/Documents/1125593964.01/archives.
243 Morrison Interview, supra note 118.
244 S. 12-B, Spec. Sess. B (Fla. 2005).
245 Id.
246 Id.
247 Press Release, Florida State Senator Mike Haridopolos (Dec. 8, 2005) (copy on file with author).
248 Id.
249 Carol Marbin Miller, Wrongly Convicted Man to Get $2 Million for His Prison Ordeal, MIAMI HERALD, Dec. 10, 2005, available at http://www.miltonhirschpllc.com/docs/Miamni / 20Herald.pdf.
250 See John Kennedy & Laurin Sellers, Governor Approves $2 Million for Dedge, ORLANDO SENTINEL, Dec. 15, 2005.
251 G. Dedge Interview, supra note 50.

I. Zacke/Preston Postscript

When Clarence Zacke took the stand in 1983, he explained that he came forward to testify in order to protect womanhood.252 He could not stand by silently while Dedge threatened to harm Ms. Smith. Although Zacke was originally sentenced to 180 years in jail, the assistance he repeatedly provided the Florida State Attorney's Office allowed him to be released from prison in 2004. The fact that his lies had sent one man to the electric chair and another to prison for twenty-two years would not derail his release: the statute of limitations for perjury had long since run out in both of those cases.253

News of his imminent parole reached Zacke's adopted daughter, however, and she immediately contacted the authorities with a chilling tale: Zacke had repeatedly raped her when she was a young child.254 She could not believe that he was to be released.255 Summoning all her courage, she agreed to wear a wire into prison to visit Zacke.256 During the course of their conversation, Zacke confirmed her story.257 After being tried for rape, Zacke was sentenced to life in prison in December 2005.258

In denying Gerald Stano's request for a new trial based on Zacke's retraction of his damning testimony and the revelation that he was coached and rewarded by prosecutors, the Florida Supreme Court explained that "recanted testimony is exceedingly unreliable."259 The testimony of Zacke was reliable enough to sentence a man to death, and another to a double-life sentence.
But the retraction of that testimony, with an explanation of the basis for the perjured testimony, was "exceedingly unreliable" and thus insufficient grounds even for a retrial.

Meanwhile, in 2008, with the help of the Florida Innocence Project, Brevard County, Florida's third victim of John Preston's dog scent "evidence" finally gained his freedom after twenty-seven years in prison.260 Like Ramos and Dedge, William Dillon was convicted for murder based in large part on the testimony of Preston.261 And like Dedge, Dillon's case involved the testimony of jailhouse informants as well as highly questionable witness testimony.262 In late 2008, the conviction was overturned on the basis of DNA evidence proving that, contrary to the evidence presented by Preston and his scent dog, Dillon had not worn a bloody t-shirt linked to the murder.263

252 D'Alemberte Interview, supra note 90; John A. Torres, Prior Zacke Knowledge May Have Been Hidden, FLA. TODAY, Jan. 23, 2006.
253 Dyckman, supra note 72.
254 D'Alemberte Interview, supra note 90.
255 Id.
256 Id.
257 Id.
258 Torres, supra note 252.
259 Stano v. State, 708 So. 2d 271, 275 (Fla. 1998) (citation and internal quotations omitted).
260 Torres, supra note 252.
261 Deanna Sheffield, 26 Years, ORLANDO WEEKLY, Oct. 11, 2007, available at http://www.orlandoweekly.com/features/story.asp?id=11883; John A. Torres & Jeff Schweers, Dog Handler Led to Bad Evidence, FLA. TODAY, June 21, 2009, available at http://florida-issues.blogspot.com/2009/06/dog-handler-led-to-bad-evidence.html; John A. Torres, Exonerated Wilton Dedge Inspired Dillon, FLA. TODAY, Nov. 27, 2008, available at http://lethal-injection-florida.blogspot.com/2008/11/exonerated-wilton-dedge-inspired-dillon.html [hereinafter Torres, Exonerated Wilton Dedge].
262 Torres, Exonerated Wilton Dedge, supra note 261; Sheffield, supra note 261.
263 Sheffield, supra note 261; Torres, Exonerated Wilton Dedge, supra note 261.
UNIV. OF PENNSYLVANIA JOURNAL OF LAW AND SOCIAL CHANGE [Vol. 13

Dillon had met Dedge in prison and was inspired by Dedge's exoneration to secure his own.264 On the day of Dillon's release, Dedge drove to Dillon's jail to greet him and to share some words of wisdom.265 Dedge may soon be making another trip: as this article goes to press, a fourth Brevard County man, "still in prison more than two decades after Preston and his German shepherd provided the key evidence allegedly tying him to the scene of the crime," is currently challenging his murder conviction with the help of Centurion Ministries, a group that has helped exonerate more than forty wrongfully convicted individuals.266

III. CONCLUSION: AN URGENT NEED FOR REFORM

A single case, State v. Dedge, illustrates the myriad problems and manifest injustices in our criminal justice system. Armed only with grossly inconsistent eyewitness testimony, combined with unreliable microscopy evidence, prosecutors never should have brought the case in the first place, notwithstanding the understandable desire for justice. Indeed, the decision to prosecute Dedge on this scant evidence not only led to his wrongful conviction, but it put an end to any further investigation of the crime, ensuring that the real perpetrator would never be apprehended. The real rapist still walks among us.

As we have seen, canine scent identification has "little or no underlying body of scientific evidence affirming the validity of its use,"267 yet it continues to be used in criminal trials. In Dedge, it was used by prosecutors without any inquiry into the outlandish assertions of the handler, a "regular" in Brevard County. The use of Clarence Zacke, likely the key element in Dedge's second wrongful conviction, was even more egregious.
Even where prosecutors do not expressly offer reduced time in return for testimony, snitches benefit in myriad ways by testifying, whether through prison transfers or the use of that cooperation to obtain early parole. Prosecutors had already shaved a whopping 120 years off of Zacke's sentence. At Dedge's trial, moreover, Zacke admitted: "I'm hoping, that I will be on record with the State Department of Corrections [and] that this will look favorably when I do come up for parole."268

"If any other lawyer offered a witness merely five-hundred dollars in return for his testimony, he'd be disbarred, and charges would be filed," says Horwitz.269 "Yet, prosecutors offer to take fifteen years, even 120 years off of a person's prison term. Which do you think has a greater chance to subvert the system?"270

Even if prosecutors have no intent to suborn perjury, the system is inherently subject to abuse. Several studies of the wrongfully convicted have demonstrated the critical role that lying snitches played in those convictions. According to the Northwestern University Law School's Center on Wrongful Convictions, for example, "there have been 111 death row exonerations since

264 Torres, Exonerated Wilton Dedge, supra note 261.
265 Id.
266 Scott Maxwell, Did Magical Dog Jail a 4th Innocent Man?, ORLANDO SENTINEL, June 24, 2009, available at http://www.orlandosentinel.com/news/local/orl-asecorl-maxwell-preston-062409062409jun24,0,6536319.full.column; Five Years Later: Wilton Dedge and Dog-Scent Evidence, Wrongful Convictions, available at http://wrongful-convictions.blogspot.com/2009/08/five-years-later-wilton-dedge-and-dog.html (stating that in July 2009, one year after Preston died, Florida state attorney Norm Wolfinger ordered a review of the murder and sexual battery cases in which Preston testified).
267 See Brisbin et al., supra note 43 and accompanying text.
268 1984 Transcript, supra note 23, at 1220.
269 Horwitz Interview, supra note 54.
270 Id.
capital punishment was resumed in the 1970s. The snitch cases account for 45.9% of those."271

Since the mid-1980s, federal sentencing guidelines have made things even worse. In federal cases, the only way to get below the sentencing guidelines is to provide "substantial assistance" to the government, i.e. the prosecutors. As Horwitz points out, "defendants will say anything they need to in order to get their sentence lowered."272

After DNA burst onto the scene, the Florida state attorneys' conduct in defending Dedge's conviction can only be described, to use Hirsch's word, as Orwellian. But their desire to preserve the conviction at all costs was not atypical. All too many prosecutors have ignored the United States Supreme Court's admonition that:

[The prosecutor] is the representative . . . whose obligation to govern impartially is as compelling as its obligation to govern at all; and whose interest, therefore, in a criminal prosecution is not that it shall win a case, but that justice shall be done. As such, he is in a peculiar and very definite sense the servant of the law, the twofold aim of which is that guilt shall not escape or innocence suffer.273

As Scheck explains, that admonition was certainly ignored in Dedge:

They wouldn't give us access to the evidence. They wouldn't permit the DNA testing. Even when we got DNA testing that was exculpatory of Dedge, they didn't want the judges to look at it. They didn't care whether he was innocent. They were more interested in covering themselves against the possibility they made a grievous mistake. The so-called "finality of the system" was more important than getting an innocent person out of jail and finding the person who really committed the crime.274

The Dedge case may be egregious, but it is by no means unique.
Although Dedge's case has led to changes in Florida's law, Florida and numerous other states continue to erect impossible hurdles to the wrongfully convicted, particularly with regard to the testing and introduction of DNA evidence. As one commentator explains, "[e]mpirical proof suggests that prosecutors have consented to DNA tests in less than fifty percent of the cases in which testing later exonerated the inmate."275 As a result, thousands of wrongfully convicted prisoners across this country continue to struggle to secure the preservation, testing, and introduction of biological evidence that could set them free.

271 Center on Wrongful Convictions, The Snitch System 3 (Winter 2004-2005), available at http://www.law.northwestern.edu/wrongfulconvictions/issues/causesandremedies/snitches/SnitchSystemBooklet.pdf.
272 Horwitz Interview, supra note 54.
273 Berger v. United States, 295 U.S. 78, 88 (1935); accord Donnelly v. DeChristoforo, 416 U.S. 637, 648-49 (1974) (Douglas, J., dissenting) ("The function of the prosecutor under the Federal Constitution is not to tack as many skins of victims as possible against the wall. His function is to vindicate the rights of the people as expressed in the laws, and give those accused of crime a fair trial."); MODEL RULES OF PROF'L CONDUCT R. 3.8 cmt. 1 (2007) ("A prosecutor has the responsibility of a minister of justice and not simply that of an advocate. This responsibility carries with it specific obligations to see that the defendant is accorded procedural justice, that guilt is decided upon the basis of sufficient evidence.").
274 AFTER INNOCENCE, supra note 117.
275 Daniel S. Medwed, The Zeal Deal: Prosecutorial Resistance to Post-Conviction Claims of Innocence, 84 B.U. L. REV. 125, 129 (2004).
State: Guilty until proven innocent http://www.sptimes.com/2004/11/14/news_pf/State/Guilty_until_proven_i.shtml[10/16/2015 9:34:20 AM]

Guilty until proven innocent
21 years, 10 months, 23 days. For a crime he did not commit
LEONORA LaPETER
Published November 14, 2004

A hint of a smile played across Wilton Dedge's face, hardly noteworthy except that the man does not show emotion. It's not that he doesn't want to smile or laugh or even cry about his life, it's just that you can't change the way you've been for 22 years in the space of two weeks. But working on his 1988 Nissan at a friend's auto repair shop in Cocoa, Dedge's lips curled up, just a little, when he learned that his buddy's brother drives a school bus for a living. "Geez, how do you deal with that?" Dedge asked Dave Kryston. "I know how crazy we were." Kryston's 1987 Monte Carlo was parked next to the Nissan. Smoking a cigarette in one of the greasy bays, out of the hot sun, Kryston looked with pity and sadness at the friend he hadn't seen in more than two decades. "Glad to see a smile on your face," he said, nodding. "Glad to see you smile." Dedge lit a Marlboro and moved into the bay next to Kryston. He stared at the oil-marked floor, exhaled, and finally broke the silence. "Well, it's been a while. It really has."

* * *

Dec. 8, 1981, New Smyrna Beach, after lunch

Dedge leaned under the hood of a Dodge van, working on the transmission. Two guys who owned the garage had hired him for a few days. He had just turned 20 and lived with his parents in Port St. John, a town of 400 anchored by the twin towers of a Florida Power & Light plant, where his father worked. He did a few days in jail once for reckless driving and running from police, charges that were dropped. He had a motorcycle and plenty of girls after him.
A high school dropout, he surfed, drag-raced, skateboarded and partied, flitting from job to job, a couple of weeks here, a few months there. Mostly he worked for a phone cable company and in car garages. He was broke. After Dedge installed the rebuilt transmission, owner John Paul gave him a check for $25, which he cashed at the gas station across the street. Four people at the shop were sure Dedge stayed till closing, between 5 and 5:30 p.m. Paul wasn't sure when Dedge left. The other shop owner, John Huey, said he and Dedge closed the shop and hopped on their motorcycles for the quick ride to Pub 44. Dedge downed a sandwich and five or six beers in half an hour. As darkness fell, they rode to Moe's Bar.

* * *

4:35 p.m., Dec. 8, 1981. Canaveral Groves, 46 miles away

She heard a noise and he was there, in the doorway of her bedroom. He wore jeans, a white T-shirt cut off at the sleeves and brown motorcycle gloves. He had long, fine blond hair bleached from the sun, a fine mustache and greenish-blue eyes. Muscular, he looked like a construction worker or a surfer. "Hi, Trish," he said. She was 17. Trish was her nickname; it was in big letters on her cosmetology case on the floor and on plaques around her room. Her eyes went to the box cutter in the stranger's right hand. With the retractable razor blade he sliced off her clothes, cut her and threw her on the bed. He held the blade over her face. She did not scream, did not cry. Don't hurt me, she thought. He sliced her face and neck, stomach and chest, arms and legs, working the razor back and forth, sometimes making X's on her skin. The weapon at her neck, he looked her in the eye and raped her. He emptied her purse, picked up her wallet, put it down. He raped her again and slugged her in the face. Through her bedroom window, she spotted a neighbor getting his mail.
She grabbed a glass from atop her stereo and threw it at the bedroom door. Again he punched her in the face, and he was gone. In 45 minutes, he had cut her 65 times.

* * *

Dec. 12, 1981. Port St. John

Trish's boyfriend took her to the hospital. Most of her external injuries were superficial; one cut on her calf measured 7 inches long. After four days, it was time to get out of the house. Trish and her sister drove to nearby Port St. John to check out the home they lived in before their parents divorced. About 9 p.m., Trish pulled her Pontiac up to a Jiffy Mart and popped in to buy her sister a Coke and cigarettes. Feeling someone staring at her, Trish turned. He was standing by a video game. His mustache looked slightly darker, he seemed shorter, and yet . . . She ran to the car, shaken. "What?" her sister said. "I think the guy that did it is in the store." "Are you sure?" "I'm pretty sure. . . . It looked like him," she said, tears running. "But he (the rapist) looked taller." She had told police he stood maybe 6 foot, 160 to 180 pounds. Just then he exited the store. Trish's sister recognized him. They rode the bus together to elementary school; she thought his name was Walter. Did Trish want to call the police? Trish said no.

* * *

January 1982, Brevard County

Trish returned to the Jiffy Mart a week later and saw him again. She thought maybe he looked shorter in the store because she wore 3-inch heels when she saw him there the first time. Now, certain it was him, she called police. Her sister had thought his name was Walter Hedge. Later Trish corrected it to Walter Dedge. On Jan. 8, Brevard sheriff's Sgt. Steven Kindrick arrested Wilton Dedge's older brother, Walter. Two days later, he showed Trish's sister a photo lineup that included Walter. "I hope you haven't put him in jail," she said. "It's his brother."
Kindrick released Walter and arrested Wilton the next day. The officer had shown Trish a new photo lineup. "That's the one," she said, pointing to Wilton's photo. No doubt.

* * *

March 10, 1982. Brevard County Courthouse

Dedge wet both hands at the restroom sink, dried them on paper towels from a wall dispenser and handed them to state attorney's investigator George Dirschka. Holding the towels by their edges, Dirschka hooked them into a paper clip. He hung them to dry from the window ledge of his office for 30 minutes. He folded and put them in a clean paper bag from the coffee shop downstairs, sealed it with a red evidence tag and placed it in his desk drawer. Eight days later, Sgt. Kindrick took the bag to the fifth-floor jury room, where a crime scene investigator arranged a lineup of five sets of bedsheets. In the No. 3 position were Trish's sheets - a white top one and a bottom one with a desert scene in tan, brown and blue with some blood on it. The other four sets were white sheets from the jail's dirty laundry. In came dog handler John Preston, a former Pennsylvania patrol officer, and his man-trailing purebred German shepherd, Harrass II. Preston stuck the bag with the paper towels in front of his dog. "Suche," he commanded in German. Search. Twice Preston walked the dog past the five piles of sheets. On the second run, Harrass II stopped, put down his head and sniffed at pile No. 3.

* * *

March 23, 1982, Sanford Regional Crime Lab

David Jernigan, a microanalyst with the Florida Department of Law Enforcement, removed Trish's twin bed sheets from a paper envelope. He hung them from a clothesline over an 8-foot long table covered with butcher paper. He "swept" the sheets, removing all debris. Two pubic hairs dropped to the paper; one of them was light brown.
He placed it on a slide, side-by-side with a pubic hair collected from Wilton Dedge, and viewed them under his microscope. "Both similarities and differences were noted between this pubic hair and . . . the sample from Dedge," Jernigan wrote in his report. "However, the differences were not sufficient to entirely eliminate Dedge as a possible source." Prosecutors were sure Dedge was their man: Trish was convincing, the height difference now attributed to the attacker possibly having worn boots; the police artist sketch Trish did right after the rape turned out to be a dead-on likeness of Dedge; they had the dog-scent evidence, and now, the pubic hair. It all fit.

* * *

Sept. 20, 1982, Brevard County Courthouse

The trial lasted eight days. Trish said Dedge was the one. The dog handler and the hair analyst testified. Dedge said it wasn't him. His alibi witnesses - five in all - put him at the auto shop, nearly 50 miles away. The jury deliberated four hours and pronounced him guilty as charged. At sentencing, Dedge's father beseeched Circuit Judge J. William Woodson. "Your honor," Walter Gary Dedge Sr. said, "there is someone out there that did this, and they have not been apprehended." "Maybe," the judge answered, "but the juries I have seen let a lot of guilty ones go, in my mind and the defense attorneys' minds, that they know are guilty." Dedge's lawyer, Joseph Moss, jumped in. "As a practicing attorney of 12 years, I have got to tell you, I have never had a case that I thought was as wrong from a jury as this one." Moss said the dog handler's scent test seemed unreliable. The judge said he had heard somewhere that the jury had based its verdict on the victim's testimony. Trish originally said the rapist was 6 foot, maybe 180 pounds. Dedge was between 5-5 and 5-6, 125 pounds. "Your honor, you heard the testimony," Moss said. "The physical description of her assailant was - the first three days - clearly a much larger man than my client."
"They (prosecutors) argued that somebody with a knife in their hand looks mighty big," the judge replied. Dedge didn't hear much after the judge said "30 years." His mind went blank, all sound left the room. * * * Late 1982, Sumter Correctional Institution So many inmates had knives and sharp instruments that the prison in Bushnell was known as "Gladiator School." Anger and racial tension permeated the place. Dedge sought out old-timers for advice. He learned that to talk about what you're in for is a sign of disrespect. He learned that he had to fight anyone who robbed or tried to rape him. He got into six or seven fights, he says, and was not raped. He was shocked the first time he saw two men lip-locked in the recreation yard. State: Guilty until proven innocent http://www.sptimes.com/2004/11/14/news_pf/State/Guilty_until_proven_i.shtml[10/16/2015 9:34:20 AM] Those first 18 months he kept to himself, unassuming, observant, until good news came as the calendar turned to 1984. An appeal court ruled that Dedge's trial judge should not have barred his side from putting on an expert to challenge the dog-scent evidence. He was entitled to a new trial. * * * Jan. 23, 1984, aboard a prison transport vehicle Dedge left Sumter Correctional at 7 a.m., bound for Brevard County to request bail while he waited for his second trial. At prisons and county jails along the way, the van picked up and dropped off inmates. Three prisoners got off at the Reception and Medical Center at Lake Butler; Clarence Zacke got on. A one-time millionaire with an auto salvage business, Zacke had been sentenced to 180 years for three murder-for- hire plots. He tried to hire two hit men to kill a witness in a drug-smuggling case against him. He tried to get someone else to murder one of the hit men. In jail, Zacke tried to hire another inmate to kill the state attorney who prosecuted him, to "get even." For testifying against others, Zacke already had gotten his sentence cut. 
Now he was being transported to Brevard, to testify against someone else. Shackled and seated on wooden benches opposite each other, Dedge and Zacke talked for two hours. The next night, Dedge's prosecutor, John Dean Moxley Jr., got a call at home. "Does the name Dedge mean anything to you?" asked Zacke's son, Richard. At Dedge's bail hearing a few days later, Clarence Zacke testified that Dedge told him he "raped and cut up some old hog." Zacke knew details: that Dedge worked at the auto shop in New Smyrna Beach, that five people there were his alibi. That one of them, John - a biker with a scraggly beard, long hair and a prison record - made a poor impression on the jury. Zacke said Dedge told him he drove his souped-up Kawasaki motorcycle more than 160 mph, made the 45-mile drive to the victim's double-wide in 15 minutes and got back to the shop before anybody noticed he was gone. He said Dedge told him that if he saw his victim again, he would kill her. Bail? Denied.

* * *

August 1984, Dedge's second trial

Dedge's new lawyer, Mark Horwitz, had a slew of ammunition ready - transcripts from other cases - when the dog handler took the stand. Preston testified he was a member of the United States Police Canine Association. He was not. Another time Preston had misrepresented his level of training at the Tom McGean School for Dogs in Pennsylvania. The U.S. Postal Service had investigated Preston, questioning the reliability of his tracking in a number of criminal cases. Dog handling experts had accused Preston of cuing his dogs and had questioned his assertion that his dogs could track someone years after the fact. In one case, a judge ordered a test after just four days; Preston's dog failed.
Now Horwitz asked: Various investigators handled the paper bag containing the paper towels Dedge had touched; wouldn't that contaminate the evidence? The lawyer read Preston his testimony that a person's scent could pass through the leather soles of his shoes. Preston caved. Maybe scent could travel through a paper bag. He wasn't sure. Though the scent evidence was undone, for this trial prosecutors had something new. They had Zacke. A veteran snitch, his testimony in other cases shaved 130 years from his sentence and got a confiscated pickup truck returned to his girlfriend. He also got the prison transfer he wanted. For testifying against Dedge, he hoped to improve his chance for parole. Wearing prison blues and rubber sandals, Zacke told the jury that Dedge bragged he had to fight off girlfriends he had so many. Then why rape somebody? Zacke said he had asked. "He kind of grinned back at me," Zacke testified, "and he said, because, those girls that flock all over me . . . he said there's no challenge to it." Dedge's alibi witnesses did not testify this time; his lawyer counseled that their rough exteriors had not played well at the first trial. Defense experts challenged the state's dog scent and hair evidence; Dedge testified that he was innocent and that Zacke had lied. The all-male jury deliberated seven hours and reached the identical verdict as had the first jury. At sentencing, Dec. 12, 1984, prosecutor Robert Wayne Holmes pointed to Zacke's testimony that Dedge had threatened to kill the victim. That gave the judge the leeway to exceed the 30 years Dedge got the first time. Wilton Dedge had himself a new sentence: life in prison.

* * *

Prison: life inside

One of his best buddies was a murderer ("it was self-defense"), and he counted drug dealers, burglars, kidnappers and counterfeiters as friends. ("I would rather not even know what they are in for. I'd rather see what they are like.") He got his GED.
He carved a miniature piano out of red oak, with tiny black walnut keys. He saw a man a few beds over get stabbed six times for turning down a lover. He gathered plastic spoons from the canteen and melted them together to make an airboat. He got a Grim Reaper tattooed on his arm by an inmate with crude instruments - a sharpened staple attached to an ink pen cartridge filled with a mixture of oil, water and soot from burned spoons. He got out for a few hours in 1986, to attend his grandmother's funeral. He wore a gray suit and shackles, and he was embarrassed. He held his brother's new baby in the prison visitation room, where he would see three of his grandparents for the last time. He joined prison Toastmasters to overcome his shyness, speaking in front of other prisoners about surfing, pollution and prison conditions. He saw an inmate raped by 20 others. He would write down his anger about the system and the prosecutor, then rip up what he wrote. He glued matchsticks together to make a galley ship, with sails made of resin paper. He paid an inmate clerk $10 to get to the head of a waiting list to take a class to become a certified water and wastewater plant operator. He trained for more than a year and gained 10 pounds to compete in a U.S. prison weightlifting competition. At the last moment, the competition was canceled. His softball team won the prison tournament and, for a few minutes, he forgot where he was. He did not show emotion. He trusted no one. Two men targeted him to become their "pressure punk," a sex partner. An inmate gave him a knife and told him to stab one or both of them, it was the only way. Dedge sought advice from a guard, who he said told him: stab the men. "I can't see myself stabbing someone. Fighting is one thing, something you have to do in there, but sticking someone with a knife is personal."
Instead, he reported that he feared for his life and twice was put in protective custody away from other prisoners, for a total of two years. He said it was like solitary confinement.

* * *

Prison: seeking DNA

Five years he lived court document to court document until they slowed to a trickle, his appeals exhausted. He spent two years with nothing to cling to, until he read a newspaper article about a new DNA test. Dedge had his former lawyer's secretary check: The pubic hair and semen sample evidence still remained. He wrote to 35 lawyers. Not one would take his case. Four years passed. 1990. 1991. 1992. 1993. Home in Port St. John, his parents took down the black-light posters in his bedroom and gave his bunk beds to his brother for his kids. They moved in his grandmother's bed and hung pictures of snow-covered mountains and palm trees at sunset. They visited their son at least once a month. In early 1994, watching Good Morning America, Dedge saw lawyer Peter J. Neufeld, who along with Barry C. Scheck had started the Innocence Project, a nonprofit legal clinic that works to obtain new DNA testing for inmates. Back then it consisted of the lawyers and a handful of New York law school students. Dedge wrote and the Innocence Project started investigating. The pace was excruciating. A college student would write Dedge, introducing himself or herself and counseling patience. That student would graduate and Dedge would get another letter of introduction. Three more years passed. 1995. 1996. 1997. Dedge earned certification to operate water and wastewater plants and got a transfer to Cross City Correctional, a prison with a water plant inside its maximum security gates. He ran the system, repaired the equipment, adjusted the chemicals.
In 1997, Scheck's team, unable to get Dedge's prosecutors to agree to test his DNA, asked a judge for permission, one of the first such requests in Florida. The state fought the request to apply new science to an old case, saying the time for postconviction relief had passed. "Without rules, we would never have finality in any case," said Holmes, the prosecutor at Dedge's second trial. Appeals courts agreed that Dedge was "procedurally barred" from obtaining the evidence. By now, prominent Miami lawyer Milton Hirsch, an expert on criminal procedure, had joined the case. Working for free, he appealed to a judge to allow the test for the sake of justice. In 2000, the judge agreed. The semen sample had degraded across 18 years; that DNA test was inconclusive. But the test on the pubic hair was definitive: It had not come from Dedge. At trial, Holmes had told the jury that the pubic hair all but put Dedge in the victim's bed. Now he said the hair was irrelevant, it could have come from anywhere. Hirsch demanded a new trial. Holmes eventually argued that Dedge's timing was off. Having earlier argued that Dedge was too late - the time for appeals had passed - the state now argued that he was too early. The Florida Legislature had just passed a law providing a mechanism for prisoners to seek DNA testing in older cases. Because Dedge got permission for DNA testing before the state passed its law, Holmes said, he should not get to take advantage of it. The state argued to the Fifth District Court of Appeal that Dedge's possible innocence was beside the point. Rules are paramount. The appeal court issued its ruling Jan. 13, 2003: Dedge could use the DNA evidence to seek a new trial.

* * *

Oct. 26, 2001, Daytona Beach

The private investigator knocked on Trish's door. Lorraine Yuen says she identified herself as working for Dedge's attorneys.
Trish refused to talk but followed when Yuen headed to her car to leave. "What do you want to know?" Trish said. She was 17 when she was raped; now she was 37. Could she have identified the wrong man? Yuen asked. "No, why?" The hair is not Dedge's, Yuen said. "That can't be." Trish asked Yuen if she was a reporter; she had dodged them for years. Yuen produced her ID. Trish said she always wondered about the hair because she had been a hairdresser; the hair could have belonged to anyone. Yuen clarified. This was pubic hair. Trish said only three people ever were in her bed - herself, her sister and Wilton Dedge. "Could I possibly have put the wrong man behind bars for 20 years?" Trish said. She said she forgave Dedge and would not stand in the way of his release. But she told Yuen she wouldn't help him get out unless it was proven beyond a doubt that he did not rape her. She said she wanted to talk more but needed time to think and pray. But after their 30-minute conversation, Trish took no more of Yuen's calls. She told prosecutors that Yuen tricked her by pretending she worked for the State Attorney's Office. Yuen denied it.

* * *

Aug. 12, 2004. Freedom

After the inconclusive DNA test on the semen sample in 2000, a more advanced test, known as Y-chromosome, became available. The two sides now flopped positions. The state, which had opposed tests, wanted this one. Let the chips fall where they may. If Dedge is innocent, let the test prove it. The defense, which had demanded tests, opposed it - for now. It had been three years since DNA proved the pubic hair came from somebody other than Dedge. His lawyers said another test was just a stall tactic. The state got its way. On Aug. 11, Dedge got a call from Nina Morrison, one of his lawyers. The results were back: the semen was not his. He was getting out. At 1:15 a.m. Aug.
12, he was released into the arms of his parents. They took him to Port St. John, back to his old room. His mother's pink flamingo collection had grown exponentially, the house seemed a whole lot smaller and the vegetation had grown so much the neighborhood looked like a "jungle." That night, he and his 65-year-old father walked the streets, holding hands. * * * The state: Sorry, but no regrets State Attorney Norman Wolfinger wrote Dedge a letter the day he got out. "I have no words that I can say to you that will ever be able to adequately express my heartfelt apology," he began. "There is also no way that I can give back to you the precious time you lost in prison as an innocent man away from your family and loved ones. But I want you to know that I am sorry this has happened to you." The letter is better than nothing, Dedge said, but not enough. "He (Brevard prosecutors) got up in a courtroom and said a lot of bad things about me," Dedge said. "Why can't he apologize to my face?" Wolfinger, who became state attorney after Dedge's second trial, said the office is blameless. "It hits in the pit of my stomach about what happened, and it should," Wolfinger said. "But did I personally or this office do anything wrong? No! "What you have here is the miracle of science coming forward. I don't know anyone who's perfect and makes the right decisions every time except for God. All you can do is be as honest and as vigilant as you can to search for the truth." The victim's identification was strong, he said. The composite sketch she dictated to police closely resembled Dedge. "The issue is does that composite sketch look like Dedge? Jury No. 1 thought it did. Jury No. 2 thought it did. Should we punish them?" What about the state using the "innocence doesn't matter" argument to block the DNA testing?
"I don't think we were saying it (innocence) wasn't relevant," Wolfinger said. "I think we were talking about time frames. I think it's a legal thing. I don't know. . . ." He blasted Dedge's lawyers. "The bottom line is they had evidence that could exonerate him and two months ago they objected to it being tested," Wolfinger said. "If it weren't for (the state attorney's) office asking for this DNA test, he might still be in jail for years. I'm proud of this office for having that test done." He continued, indignant. "If I knew I had semen of my client that is innocent and it could prove his innocence, would I let him languish in prison? No! Absolutely not. I'd ask for the test." "That is just a big fat lie," Hirsch said. He said he had argued that before the judge ordered another test, he should rule on whether the pubic hair evidence warranted a new trial. A last-minute wrinkle emerged 10 days before Dedge was released. A woman who knew Dedge as a teenager gave prosecutors a statement, accusing him of raping her 26 years ago. She said she was 14 and said she didn't report it because she was afraid. The state did not investigate her allegation because the statute of limitations had long since expired. Her statement went into Dedge's file. "When people make complaints, it becomes part of the public record," Wolfinger said. "Right now it's just an allegation. But all that's ended. There's nowhere it could go." Dedge did not know about the allegation. He said he didn't do it. His lawyers investigated and provided affidavits from two witnesses who contradict the woman's statement. They say it's a shame to tarnish Dedge's reputation after everything he's been through.
Trish's sister said Trish did not want to be interviewed about Dedge's release or about the pain of discovering, after all these years, that whoever raped her was never called to account. Wolfinger said Trish still wants her attacker pursued but said it will be a tall order to find him after 22 years. He said it's up to the public to help solve the crime. "If anyone sees someone who looks like Dedge or that composite picture," he said, "certainly they should call the sheriff's department." * * * Epilogue The guys at his brother's metal fabrication plant chipped in $300. A lady sent a $100 coupon for Wal-Mart. Someone in the checkout lane at Publix handed him $10. A dentist offered free service. Someone sent him $5,000 anonymously through a church. "I'm not used to all this kindness," Dedge said. Seven job offers streamed in, many in the wastewater and water plant businesses. But he's not ready for a full-time job. He's working for a concrete business and in home improvement. "I'm not quite ready to be tied down. I'm trying to stay busy so I don't have to think about it. Sometimes, I listen to the radio when I'm driving down the road just to keep from thinking. Basically, I'm trying to put it behind me." Leading the effort to compensate Dedge for his lost 22 years is Sandy D'Alemberte, a former legislator and Florida State University president whose office is next door to the Florida Innocence Initiative. The law caps claims against the state at $100,000; to get more, the Legislature must pass a special claims bill. Dedge's attorneys say they intend to file a civil rights lawsuit for wrongful imprisonment. J. Cheney Mason, an Orlando lawyer, said he is investigating whether Zacke, the snitch, was planted on the prison transport bus to get Dedge to talk. Zacke has said that prosecutors from the Dedge case fed him information to make another case, against Gerald Stano, an accused serial killer who was later executed. Zacke was sentenced to 180 years in 1982. 
By testifying against others and with good behavior, he is due to be released in January 2006. Dedge says he's trying not to focus on how wrong the justice system treated him. Like his parents, he believes it best to live in the moment. "I don't want to have a bad attitude. It's there in the back of my mind, but I don't want to dwell on it right now. I'm having too much fun with new things. "I'm very, very disappointed. There's anger there. But I can't dwell on anger or I'll mess up my life. I'm trying to enjoy things instead of dwelling on anger." Hirsch vents for them both. He has called the prosecutors' actions "moronic," "monstrous, shameful" and "Orwellian." "When Wilton got out, he was pleased and forward-looking and capable of not dwelling on the bitterness of the past," Hirsch said. "But when it occurs to me that some prosecutors got up the next day and put on their pants and go to work and prosecute the next Wilton Dedge defendant with no consequences and no change in the criminal justice system, I can't get past that." Dedge is focused on learning to be a grownup. For 22 years, he hasn't paid rent or an electric bill. Others told him what to wear, what to eat. He lives with his parents and wants an apartment of his own. He walked around an island of belts at JCPenney recently, finally settling on a brown one with two rows of metal-lined holes running the length. It cost $19.99. "We'd make these in leathercraft," he said, pointing to a cognac-colored braided belt. "Cost me about $4 to make." A woman trying to pick a belt for someone asked Dedge his belt size. "He's bigger than you are," she said. He's heard that before. He turns 43 next Sunday. His once-long blond hair is cropped above his ears. He displays a soldier's stoic countenance - except when he looks at his parents and his eyes soften, a little.
People approach, and he wonders what they really want. He looks for the bad rather than the good. It has been awkward going from guilty to innocent of doing something so awful. "Sometimes, I'm self-conscious about how I act around people. Like women, I'm thinking, "Should I do this? Or will they take this the wrong way?' Out of the blue it will pop in my mind: I wonder if they're thinking I did this or not." He regrets that he didn't get to marry and have kids. He has a girlfriend but isn't sure about children. "I don't want to be an old man going out to play ball with my kid." He thinks it unwise to bring a life into a world where he just got a bank account. Two weeks after his release, driving the '88 Nissan his father gave him, Dedge stopped at a red light in Titusville. A sheriff's deputy pulled up alongside, waiting for the light, just like him. For the first time since Dedge got out, he was afraid. Times researcher Caryn Baird contributed to this report, which includes information from court records, State Attorney's Office files, Milton Hirsch's files and Florida Today. Staff writer Leonora LaPeter can be reached at 727-893-8640 or lapeter@sptimes.com © Copyright, St. Petersburg Times. All rights reserved. Forgetting the Once-Seen Face: Estimating the Strength of an Eyewitness’s Memory Representation Kenneth A. Deffenbacher University of Nebraska at Omaha Brian H. Bornstein and E. Kiernan McGorty University of Nebraska—Lincoln Steven D. Penrod John Jay College of Criminal Justice The fidelity of an eyewitness’s memory representation is an issue of paramount forensic concern. Psychological science has been unable to offer more than vague generalities concerning the relation of retention interval to memory trace strength for the once-seen face. 
A meta-analysis of 53 facial memory studies produced a highly reliable association (r = .18, d = 0.37) between longer retention intervals and positive forgetting of once-seen faces, an effect equally strong for both face recognition and eyewitness identification studies. W. A. Wickelgren’s (1974, 1975, 1977) theory of recognition memory provided statistically satisfactory fits to 11 different empirical forgetting functions. Applied to the results of field studies of eyewitness memory, the theory yields predictions relevant to fact finders’ evaluations of eyewitness credibility. A plausible upper limit for witness initial memory strength corresponds to a probability of .67 of being correct on a fair six-person lineup. Furthermore, not only can the percentage of remaining memory strength be determined for any retention interval, but this strength estimate can be translated into an estimated probability of being correct on a fair lineup of a specified size.

Keywords: eyewitness memory strength, forgetting of faces, retention interval, single-trace fragility theory

Unless the state possesses incriminating physical evidence, eyewitness identification testimony is crucial whenever the prosecution attempts to prove that the defendant and the perpetrator are one and the same. The reliability of an identification is affected by two classes of variables, system variables and estimator variables (Wells, 1978). System variables are those under the control of the criminal justice system, for instance, the instructions given to eyewitnesses before they consider a lineup or photospread or the method by which members of the lineup other than the suspect are chosen. Estimator variables are those beyond the control of the criminal justice system and whose effects can only be estimated.
These factors include, among many other estimator variables, the duration of the eyewitness’s exposure to the perpetrator, lighting conditions at the crime scene, and retention interval, the length of the interval between observation of the suspect and test of the eyewitness’s memory. Given the controllability of system variables, a considerable amount of research has been focused on them, given the greater promise that such research would lead to increases in the reliability of eyewitness memory testing procedures. Indeed, sufficient re- search progress on system variables had accumulated in the last 2 decades of the 20th century that the U.S. Department of Justice issued guidelines for the collection of eyewitness evidence based on these findings (Technical Working Group for Eyewitness Ev- idence, 1999). Police in a number of jurisdictions around the United States have already adopted these guidelines as standard practice. Progress in producing forensically useful empirical generaliza- tions has not been nearly as great in the case of estimator variables. Nevertheless, research on these variables may have the potential to produce not only greater understanding of situations in which eyewitnesses may experience perceptual or memorial problems but also to yield empirical generalizations that may assist the trier of fact (judge or juror) when he or she must assess the fidelity of an eyewitness’s memory representation (cf. Wells, Memon, & Pen- rod, 2006). In making this assessment, the key estimator variables are initial memory strength for the perpetrator’s face and length of the retention interval. Many other estimator variables have their effect only as they affect initial memory strength. These variables include duration of exposure to the perpetrator, illumination conditions, presence or absence of other foci of attention (e.g., a weapon), eyewitness stress level, and whether the perpetrator is of a different race, among others. 
To make a proper assessment, the trier of fact would not only need to have an estimate of the witness’s initial memory strength for the perpetrator and to know the length of the retention interval but also to understand the nature of the forgetting function for the human face. The forgetting function, of course, is the curve that specifies the strength of the memory representation over the retention interval. That is, the forgetting function specifies how rapidly memory strength, plotted on the ordinate of the graph, decreases as a function of time, plotted on the abscissa. Knowing the rate of memory strength loss and retention interval length allows one to specify the proportion of original memory strength remaining. To specify in absolute terms how much memory strength remains, one must know the initial memory strength, the “starting point” on the ordinate of the forgetting function.

Kenneth A. Deffenbacher, Department of Psychology, University of Nebraska at Omaha; Brian H. Bornstein and E. Kiernan McGorty, Department of Psychology, University of Nebraska–Lincoln; Steven D. Penrod, Department of Psychology, John Jay College of Criminal Justice. An earlier version of the meta-analysis included in this article and curve fits for two of the empirical forgetting functions included in Table 2 were presented at the Off the Witness Stand: Using Psychology in the Practice of Justice conference held March 1–3, 2007, in New York, NY. Correspondence concerning this article should be addressed to Kenneth Deffenbacher, Department of Psychology, University of Nebraska, Omaha, NE 68182-0274. E-mail: kdeffenbacher@mail.unomaha.edu

Journal of Experimental Psychology: Applied, 2008, Vol. 14, No. 2, 139–150. Copyright 2008 by the American Psychological Association. 1076-898X/08/$12.00. DOI: 10.1037/1076-898X.14.2.139
Typically, the length of the retention interval for an eyewitness can easily be established to a reasonable degree of precision from information provided in police reports. Until now, however, psychological science has not had a means to provide at least a ballpark estimate of initial memory strength for a witness. Furthermore, cognitive psychologists have not established whether the forgetting curve for the human face is even of the same form as Ebbinghaus (1913) had determined. For that matter, it has not always been abundantly clear whether there even is a statistically reliable association between retention interval length and facial recognition memory (Deffenbacher, 1986). For example, in the period between 1970 (approximately the beginning of modern research on eyewitness testimony) and 1985, studies testing the effect of retention interval length on memory for the human face included a substantial minority reporting a null effect. An initial meta-analysis of this literature by Deffenbacher (1986) included 15 studies reporting a null effect out of a total of 33 studies, even though overall he found a highly reliable effect (p < .0001) of memory test delay on face recognition memory: The average z was 1.46, yielding a meta-analytic Z of 8.38 and an equivalent correlation of .25 (as retention interval or delay increased, forgetting increased). Including retention interval as part of a much more comprehensive meta-analysis than Deffenbacher’s, Shapiro and Penrod (1986) also documented statistically reliable effects of retention interval length on face recognition memory. With results of these previous meta-analyses in hand, an immediate attempt to describe the forgetting function for once-seen faces might seem in order. However, there are good reasons to conduct a more up-to-date meta-analysis of face memory studies before searching for a suitable theoretical forgetting function.
In the more than 2 decades since 1986, the published body of research findings concerning the effect of delay has increased by more than 60%. The number of null or negative (“negative” forgetting or reminiscence) results has also continued to increase. A further concern is the proportion of studies that have been conducted in the context of the eyewitness identification paradigm rather than with the standard face recognition task in the tradition of cognitive psychology. The eyewitness identification paradigm usually exposes witnesses to a single target face (perpetrator) in a scripted scenario. Memory for this face is tested by embedding it in a 5- to 9-person live or photo lineup (target present) or by substituting someone else who is a match to the perpetrator’s description (target absent). Witnesses are asked to identify the perpetrator or to indicate that he or she is not present. The recog- nition memory task, on the other hand, exposes observers to a relatively large number of target faces. A recognition memory test typically includes the target set plus an equal number of unfamiliar distracter faces. Observers are exposed to faces serially and are to respond “yes” or “no” as to whether a given face has been seen previously. It turns out that the proportion of eyewitness identifi- cation studies has more than doubled, increasing from 27% of published studies concerned with the effect of delayed memory test in Deffenbacher’s (1986) meta-analysis to 57% at present. As a result, not only has there been a considerable increase in the proportion of studies with greater forensic applicability, but it is entirely possible that the effect size for retention interval could be different for eyewitness identification studies than for face recognition studies. Consider one possibility. 
Results of face recognition memory studies have typically been assumed to represent high estimates of the amount of facial memory obtaining in real-world eyewitness identification settings. If eyewitness identification studies did indeed produce lower estimates of initial memory strength than did face recognition memory studies, then there would be less room for the decline of any forgetting function to occur. Thus, retention interval effects might be less for studies in the eyewitness identification paradigm because of the greater probability of a restriction of range in possible loss of memory strength, as compared with face recognition studies. On the other hand, the direction of a difference in the effect size for retention interval could well be in the opposite direction. A number of published meta-analyses of the effects of other independent variables have yielded generally larger effects on memory for eyewitness identification studies than for laboratory face recognition studies. For instance, Deffenbacher, Bornstein, Penrod, and McGorty (2004) found a considerably larger negative effect of heightened stress on memory for witnesses in studies conducted in the more forensically relevant eyewitness identification tradition than for witnesses in face recognition studies. For all these reasons, before attempting a theoretical description of the forgetting function for face memory, we deemed it advisable to conduct an up-to-date meta-analysis of the effects of retention interval on the strength of a witness’s memory representation for the once-seen face. We next present the methodology followed and the results obtained from this meta-analysis.

Meta-Analysis of Retention Interval Effects

Method

Sample characteristics. Clark (2005), Deffenbacher et al. (2004), and Reisberg and Heuer (2007) have all agreed that the legal standards for proffered scientific testimony established in Daubert v.
Merrell Dow Pharmaceuticals (1993) have strengthened the legal system’s preference for meta-analytic conclusions based on a body of well-conceived, well-executed, and easily retrievable studies. Hence, we made the decision to include only published studies in our sample. A thorough search of relevant citation retrieval systems was made. These systems included PsycINFO, Medline, and Social SciSearch (the Social Science Citation Index). We also examined the citations in published research and in social science convention proceedings. The present study sample consists of 39 published articles, books, and book chapters. These sources, listed in Table 1, generated 53 independent tests (N = 5,405) of the hypothesis that longer retention intervals have a negative effect on memory strength for the once-seen face. Individual sample sizes ranged considerably, from a low of 8 to a high of 590 (M = 101.98). Retention intervals associated with these studies ranged from 1 s to 350 days.

Table 1
Effect Sizes for Proportion Correct Recognition Memory or Identification Accuracy

Study                                             n     RI         z     r
Scapinello & Yarmey (1970)                        40    20 min     0.00  .00
Smith & Nielsen (1970)                            144   10 s       3.30  .28
Wallace et al. (1970)
  Children                                        200   5 min      0.00  .00
  Adults                                          200   5 min      0.00  .00
Goldstein & Chance (1971)                         52    2 days     0.00  .00
Shepherd & Ellis (1973)                           36    35 days    1.92  .32
Laughery et al. (1974)                            292   1 week     0.00  .00
Chance et al. (1975)                              144   2 days     0.00  .00
Egan et al. (1977)                                86    54 days    1.65  .18
Davies et al. (1978)                              40    19 days    1.96  .31
Walker-Smith (1978)                               8     19 s       2.58  .91
Yarmey (1979)                                     84    30 days    2.32  .25
Ellis et al. (1980)                               48    1 week     2.58  .37
Courtois & Mueller (1981)                         128   28 days    2.76  .24
Deffenbacher et al. (1981): Control: 2 min/2 wk   22    2 weeks    1.96  .42
Krouse (1981)                                     76    2.5 days   2.58  .30
Mauldin & Laughery (1981)                         100   47.5 hr    0.00  .00
Barkowitz and Brigham (1982)                      237   1 week     2.58  .17
Brigham et al. (1982)                             88    22 hr      5.65  .60
Shepherd et al. (1982)
  Experiment 2                                    40    343 days   3.58  .57
  Experiment 3                                    104   90 days    0.72  .07
Krafka & Penrod (1985)
  TP/no context                                   24    22 hr      0.22  .04
  TP/context                                      20    22 hr      0.45  .10
  TA/no context                                   21    22 hr      2.16  .47
  TA/context                                      20    22 hr      1.41  .32
Cutler et al. (1986): Experiment 2                287   23 days    2.75  .16
Chance & Goldstein (1987)                         59    5 days     0.00  .00
Cutler et al. (1987a)                             165   1 week     1.60  .12
Cutler et al. (1987b)                             290   12 days    0.47  .03
Peters (1988)                                     212   6 days     0.00  .00
Read et al. (1989)
  Early rehearsal                                 68    1 week     2.63  .32
  Late rehearsal                                  68    1 week     1.30  .16
Ellis & Flin (1990)                               153   1 week     1.96  .16
Podd (1990)                                       90    2 weeks    1.75  .18
Read et al. (1990)                                90    100 min    0.00  .00
Goodman et al. (1991)                             48    4.5 days   1.04  .15
Peters (1991)
  Experiment 1                                    71    26 days    0.00  .00
  Experiment 3                                    64    13 days    0.00  .00
Shepherd et al. (1991)                            96    1 month    1.96  .20
Wixted & Ebbesen (1991): Experiment 2             195   2 weeks    2.81  .20
Yarmey et al. (1996)
  TP: 5 min/24 hr                                 69    24 hr      1.45  .17
  TA: 5 min/24 hr                                 76    24 hr      0.82  .09
  Sh/TP: 5 min/24 hr                              69    24 hr      1.29  .16
  Sh/TA: 5 min/24 hr                              70    24 hr      3.05  .36
Peters (1997): Experiment 2                       96    6 months   3.00  .31
MacLin et al. (2001)                              64    30 min     1.34  .17
Memon et al. (2003)
  Older adults: TP                                45    1 week     1.86  .28
  Younger adults: TP                              42    1 week     0.40  .06
  Older adults: TA                                42    1 week     3.75  .58
  Younger adults: TA                              42    1 week     0.97  .15
Yarmey (2004)                                     590   4 hr       0.00  .00
Brewer et al. (2006)
  TP                                              37    30 min     0.00  .00
  TA                                              66    30 min     0.00  .00

Note. RI = length of delay between shortest and longest retention intervals. TP = target present lineup; TA = target absent lineup; Sh = showup.

Statistical procedures. As we always compared the longest and shortest retention intervals in each study to determine effect size, we selected z scores for a difference between proportions as the primary dependent measure. For the studies in our sample, a z score for a difference between proportions was occasionally reported or, more often, could be calculated post hoc.
In instances in which a test of the hypothesis was reported as not statistically significant, but no statistics were cited, we followed the conservative procedure of entering a z of zero (Rosenthal, 1995). Otherwise, we entered a z score associated with the p value of the effect size estimate, 1.65 for p = .05, one-tailed, for instance. To test the statistical reliability of any estimate of typical effect size, we calculated a one-sample t test and an associated 95% confidence interval (Rosenthal & DiMatteo, 2002). Given that r and d are more frequently encountered measures of effect size, and in the case of r, may be a more generally useful measure, we have reported mean effect sizes in terms of r and d as well. In the case of r, we first converted the z-score measures of effect size for each study to r by dividing z by the square root of n, a conversion formula recommended by Rosenthal and DiMatteo (2002). Each of these biserial correlation coefficients between retention interval (short or long) and memory accuracy was then normalized by conversion to the equivalent Fisher’s z’ score before averaging. Values of Cohen’s d equivalent to the mean effect size expressed in terms of r were obtained by use of the expression $d = 2r/(1 - r^2)^{1/2}$.

Results and Discussion

For each study, we subtracted the proportion correct associated with the longer retention interval from the proportion correct associated with the shorter retention interval. Thus, a positive result represented positive forgetting, a loss of memory. A negative result represented negative forgetting, or reminiscence. When we report effect size in terms of r, then a positive r means that longer retention intervals were associated with more forgetting. The unweighted mean r was .18, significantly different from zero, t(52) = 4.78, p < .005, with the 95% confidence interval (CI) extending from .10 to .26. The mean effect size for r in this instance is equivalent to d = 0.37, a small to medium effect size (Cohen, 1988).
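The effect-size conversions described in the statistical procedures can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis script: the function names are hypothetical, the d conversion assumes the standard expression d = 2r/(1 - r^2)^{1/2}, and the three (z, n) pairs are rows copied from Table 1 purely for demonstration.

```python
import math
from statistics import mean

def z_to_r(z, n):
    """Convert a study's z score to r by dividing z by the square root of n."""
    return z / math.sqrt(n)

def mean_r(rs):
    """Average correlations via Fisher's z' (atanh), then transform back."""
    return math.tanh(mean(math.atanh(r) for r in rs))

def r_to_d(r):
    """Cohen's d equivalent to a correlation r (assumed standard conversion)."""
    return 2 * r / math.sqrt(1 - r * r)

if __name__ == "__main__":
    # Illustrative subset of Table 1: (z, n) for three studies.
    rows = [(3.30, 144), (1.92, 36), (2.58, 48)]
    rs = [z_to_r(z, n) for z, n in rows]
    r_bar = mean_r(rs)
    print("per-study r:", [round(r, 3) for r in rs])
    print("mean r:", round(r_bar, 3), "equivalent d:", round(r_to_d(r_bar), 3))
```

Averaging in the Fisher z' scale rather than averaging raw correlations is the step the text calls normalizing; applying `r_to_d` to the reported mean r of .18 reproduces the reported d of 0.37 after rounding.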
We should note that when all possible pairwise comparisons of retention intervals in studies that had more than two retention intervals were treated as independent effect sizes, the sample size increased from 53 to 78, and the mean effect size was .17 (d = 0.34), remarkably close to the results we obtained when only the longest and shortest retention intervals were compared. We next applied a test of homogeneity of variances across the sample of weighted effect sizes to determine whether the degree of variability exceeded that expected on the basis of sampling error alone. A chi-square value of 23.19 (df = 52, p > .05) indicated that the degree of variability did not exceed that expected on the basis of sampling error. Strictly speaking, then, no moderator analyses were required. However, given our prediction that studies conducted in the context of the eyewitness identification paradigm might well show more or even less of an effect of retention interval on memory strength as compared with face recognition studies, we nevertheless calculated mean effect sizes across 23 face recognition memory studies and 30 eyewitness identification studies. In the former case, the mean r was .21, t(22) = 3.18, p < .005, 95% CI = .08–.34; in the latter case, it was .16, t(29) = 3.58, p < .005, 95% CI = .07–.25. However, the difference between these two correlations was not significant by a two-sample t test, t(51) = .21, p > .05. Hence, the nature of the research paradigm was not a moderator of average effect size. Post hoc, it was suggested to us that a particularly strong moderator of the effect size for delay might be the duration of delay itself. In the third column of Table 1 (RI), we have included the length of delay between the shortest and longest retention intervals for each study in our sample.
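The article does not state the formula behind its homogeneity test. One common choice for a set of correlations, assumed here and not confirmed by the source, weights each Fisher-transformed r by n - 3 and sums the weighted squared deviations from the weighted mean; the result is referred to a chi-square distribution with k - 1 degrees of freedom. The function name and the sample values below are hypothetical.

```python
import math

def homogeneity_chi_square(rs, ns):
    """Return (chi_square, df) for a set of correlations and sample sizes.

    Assumed formula: Q = sum((n_i - 3) * (z_i - z_bar)**2), where z_i is
    Fisher's z' for study i and z_bar is the (n_i - 3)-weighted mean of z_i.
    """
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return q, len(rs) - 1

if __name__ == "__main__":
    # Hypothetical effect sizes, not taken from the meta-analysis.
    q, df = homogeneity_chi_square([0.10, 0.30, 0.20], [103, 103, 53])
    print(f"chi-square = {q:.2f} on {df} df")
```

A chi-square value well below its degrees of freedom, as with the reported 23.19 on 52 df, indicates no more spread among effect sizes than sampling error alone would produce.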
Noting that the most commonly encountered delay for British police has been a month (Pike, Brace, & Kynan, 2002), we estimated the average r to be .27, t(6) = 3.39, p < .01, 95% CI = .08–.44, for the seven studies with delays of a month or more. For studies with lesser durations of delay, we estimated the average r to be .17, t(45) = 3.99, p < .005, 95% CI = .08–.25. The difference in magnitude of these two correlations suggests that duration of the memory test delay itself might moderate effect size. This conjecture cannot be supported, however, because the difference between effect sizes at shorter and longer durations of delay was not significant, t(51) = .38, p > .05. Even so, it is interesting to note that the upper bound of the confidence interval for the studies with a maximum delay of a month or more was .44, as compared with a comparable figure of .25 for studies with a maximum delay of less than a month. Thus, despite 22 of the sample of 53 effect sizes being null or negative, we have found a statistically reliable effect size estimate for the effect of retention interval on proportion of correct recognition judgments, a measure of memory accuracy for the human face. Furthermore, our effect size estimate does not vary as a function of whether it is a product of studies done in the face recognition memory paradigm or of studies conducted in the eyewitness identification tradition. Hence, it is reasonable to conclude that increased delay of a test for recognition memory for the once-seen face portends decreased probability of correct recognition judgments. This decreased probability presumably reflects loss of underlying memory trace strength. Our estimate of the effect size for retention interval on memory for faces is also likely an underestimate of the actual value.
The 28% of studies reporting forgetting effects that were not statistically significant but for which no statistics were cited resulted in our entering a conservative value of z = 0.00 in each instance. Most likely a small but positive amount of forgetting was actually exhibited by participants in such studies. Our meta-analyses put us in a better position to specify what happens over time to a person’s memory representation for an unfamiliar face. At least now we can say with some assurance that memory strength will be weaker at longer retention intervals than at briefer ones. However, our meta-analyses do not permit us to specify the shape of the forgetting function and answers to related questions, such as whether the memory representation will ever be truly lost, much less when. To address these questions, we would need to be able to specify a theoretical forgetting function that would satisfactorily fit empirical forgetting functions, particularly for studies in which facial recognition memory was tested at three or more retention intervals. The latter requirement would enable us to assess fit to nonlinear functions.

Finding a Theoretical Forgetting Function for the Human Face

Criteria

As indicated earlier, the trier of fact has had no useful way to estimate the initial strength of an eyewitness’s memory representation for the once-seen face. Clearly, for it to have forensic applicability, any candidate theoretical forgetting function must (a) be able to provide an estimate of initial memory strength; (b) be accurate at predicting where future points will fall as retention interval increases, a strong test of the theory (Wixted & Carpenter, 2007); and (c) be able to satisfactorily fit group forgetting data, the form in which empirical forgetting functions exist in studies of memory for the human face included in our meta-analyses.
If a theoretical forgetting function is found that meets these criteria, eyewitness memory researchers should finally have evidence bearing directly on their belief that the forgetting function for the once-seen face is Ebbinghausian in nature. That is, 93% of experts in the field of eyewitness testimony research, when surveyed most recently (Kassin, Tubb, Hosch, & Memon, 2001), agreed that there was a research basis for the notion that the rate of memory loss for an event is greatest right after an event and then levels off over time. A still large majority (83%) of these same experts agreed that this generalization was reliable enough for psychologists to present in courtroom testimony. There has been little direct evidence provided to date, however, that the faith of these experts is justified when it comes to specifying the forgetting function for the once-seen human face. Consider the critique provided by Elliott (1993):

    The Ebbinghaus forgetting curve . . . is another dubious metaphor for most eyewitness circumstances, both because the human face seems to have special properties as a stimulus, and because the retention intervals that are pertinent to identification scarcely ever include the very short ones where most forgetting presumably occurs. There is now a large enough number of results that are null or negative with respect to the Ebbinghaus hypothesis that their presence ought certainly to form part of any testimony that might be given: They should no longer be treated simply as error. (p. 429)

Selection of a Theory of Forgetting

The only theory meeting the first criterion for forensic applicability, provision of an estimate of initial memory strength at 0 s after stimulus cessation, is Wickelgren’s (1972, 1974, 1975, 1977, 1979) single-trace fragility theory of recognition memory. Thus, Wickelgren’s theory is the only one that we evaluate for its ability to meet the remaining two criteria.
In its least complex version (Wickelgren, 1975, 1977), the form of the retention function is $m = L t^{-D} e^{-It}$, where m represents memory strength at a given retention interval, t seconds after target stimulus exposure has ended; L is initial memory strength at 0 s after stimulus exposure ends; D is the rate parameter for a time-decay process, which is inversely proportional to the rate of memory consolidation; and I is the rate parameter for the loss of memory strength due to interference, which is directly proportional to the similarity of the target stimulus to subsequently encountered stimuli. Of course $e \approx 2.72$, the base of the natural or Naperian system of logarithms. It is important to note that Wickelgren (1974) proposed that at least for recognition memory, an interval-scale measurement of memory strength (d’) is possible by making relatively weak, yet plausible assumptions concerning how statistical decision theory would translate strength into yes–no decisions. For all practical purposes, then, both m and L are measured in terms of d’ units. Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory is unique in that rather than distinguishing between short- and long-term traces, it posits a single memory trace and two mechanisms productive of forgetting. An interference-free, time-decay process produces rapid forgetting in the first seconds and minutes of the retention interval because initially trace fragility is very high. As the neurophysiological process of consolidation begins to decrease trace fragility, however, the rate of forgetting slows in a negatively accelerated fashion, and less is forgotten per unit time. Consolidation, showing a negatively accelerated increase over time, is assumed to continue to decrease trace fragility and its susceptibility to the time-decay process throughout the life of the memory trace.
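As a minimal sketch, the retention function described above, m = L · t^(−D) · e^(−It), can be written as a short Python function. The function name and the example parameter values are illustrative assumptions, not values taken from Wickelgren. Because the power term is undefined at exactly t = 0, the sketch accepts only positive t.

```python
import math

def memory_strength(t_seconds: float, L: float, D: float, I: float) -> float:
    """Power-exponential retention function: m = L * t**(-D) * exp(-I * t)."""
    if t_seconds <= 0:
        raise ValueError("t_seconds must be positive: t**(-D) is undefined at t = 0")
    return L * t_seconds ** (-D) * math.exp(-I * t_seconds)

# Hypothetical parameter values, chosen only to show the shape of the curve.
L_EXAMPLE, D_EXAMPLE, I_EXAMPLE = 2.0, 0.025, 6e-8
delays = {"1 min": 60, "1 hr": 3600, "1 day": 86400, "1 week": 604800}
curve = {label: memory_strength(t, L_EXAMPLE, D_EXAMPLE, I_EXAMPLE)
         for label, t in delays.items()}
for label, m in curve.items():
    print(f"{label:>7}: m = {m:.2f}")
```

Under these assumed values, the printed strengths fall quickly over the first minutes and more slowly afterward, which is the negatively accelerated pattern the theory describes.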
The negative power component of Wickelgren’s forgetting function, $t^{-D}$, would appear to be a plausible model of the negatively accelerated loss of trace fragility over time and therefore the continually decreasing amount of trace strength lost per unit time. As the contribution of the time-decay process to forgetting declines in power function fashion with increases in the retention interval, the second process, a storage interference process, operating in a negative exponential fashion ($e^{-It}$), would be expected to increase its influence on the rate of forgetting at longer retention intervals. This prediction might explain a result noted by Deffenbacher, Carr, and Leu (1981), who found that for recognition memory of both faces and words, the amount of forgetting due to retroactive interference with an item’s trace in storage increased over a 2-week retention interval. We should note that a simpler version of Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory has been proposed (e.g., Wixted & Carpenter, 2007). This version, in effect, contains only two free parameters, initial memory strength and the rate of forgetting due to time decay. The version we have selected (Wickelgren, 1974, 1975, 1977) contains a third parameter, rate of forgetting due to interference generated subsequent to encoding of the stimulus. It would be prudent to justify the necessity of the additional free parameter. We have found it necessary to retain the interference parameter to secure an adequate fit to empirical forgetting functions that included retention intervals greater than 1–2 weeks in length. The two-parameter version provides about the same degree of fit as the three-parameter one for retention intervals up to this length. At longer intervals, however, face recognition memory appears to require a source of forgetting in addition to time decay.
Values of the time-decay parameter sufficient for a good fit at shorter intervals were not sufficient to account for the considerable additional forgetting at longer intervals. Indeed, a plot of log memory strength (d’) against log time reveals a downward inflection in empirical forgetting functions that occurs between an interval corresponding to about 1 week and ones corresponding to a month or more (Deffenbacher, 1986). Interestingly, Valentine, Pickering, and Darling (2003) found in their analysis of 314 lineups conducted by the London Metropolitan Police that the probability of identifying the suspect decreased drastically in the interval between 1 week and 1 month, declining from .66 to .34. Finally, face recognition in a forensic context often includes an institutional source of interference subsequent to encoding of the perpetrator’s face, the exposure of the witness to mugshots before a memory test by means of a live or photographic lineup (Deffenbacher, Bornstein, & Penrod, 2006). This sort of interference could be problematic if the later lineup were a target-absent one.

Previous Tests of the Theory

In the first decade after the introduction of the single-trace fragility theory of forgetting in the 1970s, a modest amount of empirical support was generated. For instance, Wickelgren (1972, 1974, 1975) found that the theory provided an excellent fit to forgetting functions obtained for episodic memory representations for frequently encountered words and for pictures of commonly encountered objects. In the three publications just cited, Wickelgren reported a dozen experiments resulting in 35 separate $r^2$ statistics, averaged across 3–10 research participants in each instance. The median $r^2$ was .89, the proportion of empirical forgetting function variance accounted for by single-trace fragility theory.
All these experiments used yes–no recognition memory tasks, with memory for verbal and pictorial materials being tested under a variety of conditions and measured at retention intervals up to 2 years in length. To the best of our knowledge, there have been only two previously published attempts to fit any theory of forgetting to face recognition memory forgetting functions. Deffenbacher (1986) not only conducted a meta-analysis of memory for the once-seen face as a function of retention interval but also conducted a preliminary test of the ability of Wickelgren’s (1972, 1974, 1975, 1977, 1979) single-trace fragility (power-exponential) theory to fit empirical forgetting functions for face recognition memory. He found that Wickelgren’s power-exponential theory provided relatively good fits to the functions from five different studies. Wixted and Ebbesen (1991, Experiment 2) showed that a simple power function was an excellent fit to their empirical forgetting function for face recognition memory tested at retention intervals ranging from 1 hr to 2 weeks in length. Unfortunately, except for the single effort of Wixted and Ebbesen (1991, Experiment 2), neither Deffenbacher nor anyone else ever followed up these first curve-fitting forays with any further theory testing or development in regard to the forgetting of faces. Furthermore, neither Deffenbacher nor anyone else ever made a serious attempt to determine to what extent either Wickelgren’s (1972, 1974, 1975, 1977, 1979) power-exponential theory or any other theory of forgetting might have forensic application. In the next section, we attempt to remedy the first of these two deficiencies. We remedy the second deficiency in a subsequent section.

New Tests of the Theory

Table 2 illustrates the results of our fitting Wickelgren’s (1972, 1974, 1975, 1977, 1979) power-exponential theory to 11 empirical forgetting functions obtained from the face recognition memory and eyewitness identification literatures.
These 11 data sets were obtained from studies that included at least three retention intervals, that obtained a significant effect for retention interval (positive forgetting), and for which sufficient information was provided to calculate d’ values as a measure of memory strength at each of the tested retention intervals. Six of the data sets were obtained from studies published since Deffenbacher’s (1986) preliminary test of Wickelgren’s theory of forgetting.

Table 2
Fit of Single-Trace Fragility Theory to Empirical Forgetting Functions
Observed and (predicted) d’ memory strength after various delays

Study                                   0 s      5 min         2 days        7 days
Barkowitz & Brigham (1982)              (1.70)   1.47 (1.47)   1.19 (1.24)   1.14 (1.18)

                                        0 s      10 min        2 days        7 days
Chance & Goldstein (1987)
  White faces                           (2.49)   2.12 (2.12)   1.96 (1.82)   1.61 (1.72)
  Japanese faces                        (1.53)   1.30 (1.30)   1.02 (1.02)   0.88 (0.76)

                                        0 s      1 min         2 days        28 days
Courtois & Mueller (1981)               (3.26)   2.94 (2.94)   2.41 (2.39)   1.93 (1.95)

                                        0 s      5 min         1 day         7 days
Ellis & Flin (1990)
  7 years/2-s encoding time             (1.13)   0.98 (0.98)   0.74 (0.84)   0.70 (0.72)
  10 years/2-s encoding time            (1.78)   1.54 (1.54)   0.98 (1.20)   0.72 (0.62)
  10 years/6-s encoding time            (1.98)   1.72 (1.72)   1.72 (1.47)   1.23 (1.26)

                                        0 s      3 min         6 days        35 days
Shepherd & Ellis (1973)                 (1.97)   1.73 (1.73)   1.24 (1.28)   0.78 (0.74)

                                        0 s      7 days        30 days       90 days       350 days
Shepherd, Ellis, & Davies (1982)        (2.78)   1.92 (1.92)   1.62 (1.64)   1.47 (1.17)   0.00 (0.29)

                                        0 s      1 hr          1 day         1 week        2 weeks
Wixted & Ebbesen (1991, Experiment 2)   (2.47)   2.01 (2.01)   1.75 (1.83)   1.46 (1.57)   1.41 (1.37)

                                        0 s      1 min         7 days        30 days
Yarmey (1979)                           (3.39)   3.12 (3.12)   2.44 (2.30)   1.47 (1.50)

We should note that these 11 functions were of necessity fitted by eye so as to minimize the sum of absolute deviations of predicted and observed values.
Least-squares or maximum likelihood estimates of parameter values were not possible, given that each forgetting function contained only three or four retention intervals and that the observed values at each retention interval were group d’ scores. Fortunately, we were able to begin our curve-fitting exercise by taking advantage of parameter values required to fit Wickelgren’s (1975) data for frequently encountered English words and Ryback, Weinert, and Fozard’s (1970) data for recognition of pictures of common everyday objects. We discovered, however, that the value of the time-decay parameter needed to fit our data for unfamiliar faces was only one-tenth that required for the data by Wickelgren (1975) and Ryback et al. (1970). The same value of the time-decay parameter (.025) provided good fits for 10 of the 11 forgetting curves. A value of .02 improved the fit slightly for the remaining study (Yarmey, 1979). Values of the interference parameter that we used here were up to an order of magnitude smaller ($6 \times 10^{-8}$) than that required to fit the data of Wickelgren (1975) and Ryback et al. (1970), $6 \times 10^{-7}$. The forgetting data from Barkowitz and Brigham (1982), Courtois and Mueller (1981), and Shepherd, Ellis, and Davies (1982) and Chance and Goldstein’s (1987) data from Caucasians viewing Caucasian faces were fit with the $6 \times 10^{-8}$ value of the interference parameter, and the data from the remaining seven studies were fit by values of the interference parameter that were up to 16 times greater. The values provided in the 0-s column of Table 2 are estimates of L, the initial memory strength parameter.
Given that all the data for memory measurement as a function of retention interval were group, rather than individual, in nature, and given the lack of any previously established estimates of initial memory strength for unfamiliar faces, we obtained initial strength estimates by substituting for the predicted value of d’ in Wickelgren’s (1975, 1977) equation the observed value of d’ obtained from the first retention interval at which face memory was measured and then solving for L. It should therefore not be surprising that the predicted and observed values match perfectly at the first retention interval for each forgetting function. Clearly, any statistical assessment of the adequacy of fit includes only the degree of fit at retention intervals subsequent to the first. This approach also permits assessment of how well the theory predicts where future points will fall as retention interval increases beyond Time 0. A statistical assessment of the fit of Wickelgren’s (1972, 1974, 1975, 1977, 1979) power-exponential theory to the 11 empirical face forgetting functions was made by applying a chi-square goodness-of-fit test in each instance. In no instance was the chi-square test significant. Hence, in each case the null hypothesis that both observed and predicted values represent the same forgetting function could not be rejected. An omnibus chi-square test of the fit of retention interval data from all 11 functions (23 df) was also not significant. The quality of the curve fits by the power-exponential theory is especially encouraging when one notes that the observed values of d’ are by necessity group scores rather than being based on individually computed scores such as Wickelgren obtained from continuous recognition memory experiments. Thus, it can be said that Wickelgren’s (1972, 1974, 1975, 1977, 1979) power-exponential theory has met all three criteria set out earlier for any theory of forgetting to have potential forensic applicability.
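The procedure just described (solve for L from the first tested interval, then predict the later intervals) can be sketched in Python for one row of Table 2. The sketch assumes the retention function m = L · t^(−D) · e^(−It) with t in seconds, and uses the parameter values the article reports for the Barkowitz and Brigham (1982) data (D = .025, I = 6 × 10⁻⁸). The function and variable names are hypothetical.

```python
import math

def retention_factor(t_seconds, D, I):
    """Fraction of initial strength left after t seconds: t**(-D) * exp(-I * t)."""
    return t_seconds ** (-D) * math.exp(-I * t_seconds)

def estimate_initial_strength(first_observed, first_delay_s, D, I):
    """Solve m = L * factor for L, using the first tested retention interval."""
    return first_observed / retention_factor(first_delay_s, D, I)

# Barkowitz & Brigham (1982) row of Table 2: delays of 5 min, 2 days, 7 days.
D, I = 0.025, 6e-8
delays_s = [5 * 60, 2 * 86400, 7 * 86400]
observed = [1.47, 1.19, 1.14]

L_hat = estimate_initial_strength(observed[0], delays_s[0], D, I)
predicted = [L_hat * retention_factor(t, D, I) for t in delays_s]

print(f"estimated L = {L_hat:.2f}")
for t, obs, pred in zip(delays_s, observed, predicted):
    print(f"t = {t:>6d} s  observed = {obs:.2f}  predicted = {pred:.2f}")
```

The predicted values should land within about 0.01 d’ of the parenthesized entries in Table 2. Small discrepancies are expected if the published predictions were computed from an L rounded to two decimals, which this sketch does not do.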
The theory provides an estimate of initial memory strength, predicts accurately where future points will fall on the forgetting function as retention interval increases, and satisfactorily fits functions based on group data. Before considering the forensic applicability of Wickelgren’s (1975, 1977) theoretical forgetting function, however, we should first note some additional aspects of its theoretical utility.

Further Theoretical Observations

Faces as a stimulus category. Finding a theory that fits face recognition memory forgetting functions has been rewarding in terms of a number of additional insights gained, insights accrued beyond the mere promise of having a more precise accounting of the loss in fidelity for the memory representation of the once-seen face. One particularly intriguing finding is that not only is virtually the same value of the time-decay parameter required to fit the power-exponential theory to each of the 11 empirical forgetting functions but it is an order of magnitude smaller than the value required by Wickelgren (1975) to fit forgetting functions for common English words and pictures of common objects. Apparently, there is a more rapid rate of decline in trace fragility (a more rapid rate of increase in trace consolidation) for unfamiliar faces than there is for episodic traces of familiar English words and pictures of familiar objects. Thus, even though face recognition memory forgetting functions may be fit by the same theory as forgetting functions for words and objects, episodic memory for unfamiliar faces may decline more slowly. As Deffenbacher (1986) noted, perhaps this phenomenon should not be all that surprising, given selection pressures in the evolutionary history of our species to promote efficient processing of faces.
After all, faces constitute a very important category of stimuli, providing a very rich source of socially relevant information, continually requiring all of us to make fine discriminations among them, and needing a relatively large quantity of human cortex to be devoted to their processing (e.g., Sergent, Ohta, & MacDonald, 1992). Our finding that memory representations for the unfamiliar face may be consolidated more rapidly than for certain other visual stimulus classes reinforces the notion that faces may not be unique stimuli but they are at least somewhat special (Ellis & Young, 1989).

Cross-race effect. A second interesting byproduct of our theoretical search is revealed as a result of our fit of the power-exponential theory to two forgetting functions provided by the data of Chance and Goldstein (1987; see Table 2). Here we have additional illumination of mechanisms underlying the cross-race effect, a forensically relevant phenomenon whereby once-seen faces of another race or ethnic grouping are discriminated from one another less well and later recognized less well than are once-seen faces of the observer’s own race (Meissner & Brigham, 2001). Chance and Goldstein’s observers were Caucasians exposed to Caucasian and Japanese faces, blocked in counterbalanced order by race of face. The statistically reliable cross-race effect obtained here was clearly due to superior encoding of the Caucasian faces. This effect was documented by an initial memory strength superiority of approximately 1 d’ unit for Caucasian faces when encoded by Caucasian observers as compared with the initial memory strength engendered by their encoding of Japanese faces.
It is also of interest that even though Japanese faces were consolidated at the same rate as Caucasian faces, requiring the same value of the time-decay parameter for a fit of theory to forgetting function, the forgetting function for the Japanese faces required a value of the interference parameter 10 times as great, $10^{-7}$ versus $10^{-8}$. This finding provides support for the view that same-race faces are more easily discriminated from one another than are other-race faces (Meissner & Brigham, 2001), providing an opportunity for them to be encoded in a more discriminating fashion, yielding thereby at least some of the encoding advantage for same-race faces. Same-race faces would therefore likewise be expected to withstand better the ravages of interference generated by subsequent encounters with other faces during any particular face’s retention interval. Indeed, Meissner and Brigham (2001) found that cross-race effects were greater at longer retention intervals.

Target face exposure. The effect of increases in target face exposure time is illuminated by our fit of Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory to two forgetting functions yielded by the work of Ellis and Flin (1990). Examining the face recognition memory functions generated by two different groups of 10-year-old Scottish schoolchildren (Table 2), one group having an encoding time of 2 s per face and the other having encoding time of 6 s, we observed that the extra 4 s of encoding time provided an initial memory strength advantage of about 0.20 of a d’ unit. This initial advantage leveraged a memory strength advantage of 0.50 of a d’ unit at a 7-day retention interval. A recent meta-analysis by Bornstein, Deffenbacher, McGorty, and Penrod (2007) has confirmed that longer exposures of target faces are associated with both higher hit rates (r = .23) and lower false alarm rates (r = .12).
Clearly, more research is needed in an effort to try to establish just how much additional initial memory strength can be purchased by n additional seconds of exposure time.

Age differences. Still another phenomenon of face recognition memory is the age effect associated with efficiency of face processing, such that older children show superior memory for faces. It would appear that the age effect associated with the greater recognition memory shown by Ellis and Flin’s (1990) 10-year-olds with 2 s of encoding time per face as compared with 7-year-olds with the same 2 s of encoding time is strictly due to the former being able to encode unfamiliar faces more effectively in the time available, yielding an initial memory strength advantage of 0.65 of a d’ unit (Table 2). Up until the middle teen years, face recognition memory shows continual improvements in the ability to discriminate same-race faces from one another and later to recognize them (Chance & Goldstein, 1984). Thus, 10-year-olds have had another 3 years of fine tuning of their brain’s perceptual learning “machinery,” permitting enhanced ability to respond to more subtle differences among faces they typically encounter.

Failures to find retention interval effects. Finally, power-exponential theory can help us to understand at least one potential contributor to the frequent failure to find statistically significant retention interval effects for face recognition memory. Clearly, one factor in producing findings of a lack of a statistically reliable effect of forgetting occurs primarily when just two retention intervals are measured and both intervals are at points on the forgetting function between which little forgetting would be expected to occur.
Although it is more informative to test retention at more than two intervals, if only two intervals are to be tested, investigators should at least ensure that one occurs within minutes after encoding, given the relatively rapid roll-off in memory strength in the first minutes after encoding. After all, memory strength for the once-seen face loses 15% of its strength in the first 10 min of the retention interval. Consider an illustration provided by the data from Wixted and Ebbesen (1991, Experiment 2, 11-s stimulus duration condition); see Table 2. Note that there was about a 12% loss of original memory strength (d’ = 2.47) between Day 1 and Day 7. However, the actual amount of original memory strength lost since encoding was about 41%. If one were to have only measured face recognition memory at the Day 1 and Day 7 intervals, one would have underestimated the actual amount of forgetting by a factor of more than three to one. Hence, Wickelgren’s (1972, 1974, 1975, 1977, 1979) power-exponential theory and its ability to provide an estimate of original memory strength permits us to see clearly that just where retention intervals fall on the theoretical forgetting function will affect the likelihood of finding statistically reliable amounts of forgetting.

Forensic Applicability of Power-Exponential Theory

As we indicated in the introduction to this article, to make a proper assessment of the strength of an eyewitness’s current memory representation, a trier of fact needs to have an estimate of the witness’s initial memory strength, to know the length of the retention interval, and to understand the nature of the forgetting function for the once-seen human face. As retention interval length can usually be specified with some precision, acquiring an estimate of the eyewitness’s initial memory strength and a knowledge of the precise nature of the forgetting function represent the key forensic needs. Let’s consider the latter forensic need first.
Psychologists are now able to provide a much greater degree of specific knowledge to the trier of fact as regards the nature of the forgetting function for human face recognition memory. It turns out that we can now offer the judge or juror an estimate of what proportion of memory strength, regardless of its initial value, remained at the time the eyewitness’s memory for the perpetrator was tested. We can do this for three reasons. First, we have clearly demonstrated the ability of Wickelgren’s (1972, 1974, 1975, 1977, 1979) power-exponential theory to fit forgetting functions that have included retention intervals ranging in length from 1 min to nearly 1 year. Second, we have demonstrated the remarkable constancy of the value of the time-decay parameter (.025) needed for a fit. Third, we have likewise demonstrated the relatively narrow range of values of the interference parameter needed for a fit, practically all falling within an order of magnitude of each other, $10^{-7}$ to $10^{-8}$. For a conservative estimate of the proportion of remaining memory strength, we can simply plug into Wickelgren’s (1975, 1977) equation given earlier in this article the values of the forensically relevant retention interval (in seconds), the value of the time-decay parameter at .025, and the value of the interference parameter at $10^{-8}$; if a case involves a cross-race identification, the interference parameter should be instead set at $10^{-7}$. For a given retention interval, the resulting calculation yields the estimated proportion of initial memory strength remaining. Having an expert on eyewitness memory be able to testify to this estimate should aid the trier of fact considerably in his or her task of assessing the fidelity of an eyewitness’s memory representation.
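A minimal Python sketch of this calculation follows. It assumes the retention function m = L · t^(−D) · e^(−It), so the proportion of initial strength remaining is t^(−D) · e^(−It). The text gives the interference parameter only to an order of magnitude, so the specific values 6 × 10⁻⁸ (same-race) and 6 × 10⁻⁷ (cross-race) used here are an assumption, based on the values reported for the curve fits. The function and constant names are hypothetical.

```python
import math

TIME_DECAY_D = 0.025     # time-decay parameter reported in the article
SAME_RACE_I = 6e-8       # assumed: value reported for the same-race curve fits
CROSS_RACE_I = 6e-7      # assumed: ten times the same-race value

def proportion_remaining(t_seconds, D=TIME_DECAY_D, I=SAME_RACE_I):
    """Estimated fraction of initial memory strength left after t seconds."""
    if t_seconds <= 0:
        raise ValueError("retention interval must be a positive number of seconds")
    return t_seconds ** (-D) * math.exp(-I * t_seconds)

intervals_s = {"10 min": 600, "4 hr": 4 * 3600, "1 week": 7 * 86400}
for label, t in intervals_s.items():
    same = proportion_remaining(t)
    cross = proportion_remaining(t, I=CROSS_RACE_I)
    print(f"{label:>6}: same-race {same:.0%}, cross-race {cross:.0%}")
```

Because the result is a proportion, it can be reported without knowing the witness’s initial memory strength. Converting it to an expected lineup accuracy still requires an estimate of L.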
Knowing, for example, that at memory test, an eyewitness had only 50% of original memory strength remaining would represent a real improvement in specificity over what could be provided by an expert before the present. More valid assessments of eyewitness credibility can only increase the quality of justice rendered by a trier of fact. Let’s now return to the other key forensic need for triers of fact to be able to add precision to their assessment, the need for an estimate of initial memory strength for the perpetrator’s face. Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory of forgetting is indeed unique among theories of forgetting in its specification of an estimate of initial memory strength. However, the forensic situation is neither a laboratory nor a field experiment and does not yield an estimate of initial memory strength. There is, nevertheless, a way to yield a conservative estimate of initial memory strength for a typical eyewitness, conservative in the sense that the estimate would very likely represent an upper bound on initial memory strength for many forensic situations. The results of an interesting field experiment (Pigott, Brigham, & Bothwell, 1990) provide us with the opportunity. Pigott et al.’s (1990) participants were 47 Florida bank tellers, each of whom interacted with one of two men who attempted to cash a crudely altered U.S. Postal Service money order during a scripted 1.5-min encounter. More than 75% of these tellers had training in eyewitness techniques but were not made aware that their encounter with the perpetrator of attempted bank fraud was not genuine until after their recall and recognition had been measured 4 hr later.
Averaged across two target-present and two target-absent lineups, their mean proportion correct was 0.55, which for a seven-alternative, forced-choice recognition memory task (six lineup members plus the alternative of rejecting the entire lineup) is equivalent to a d’ score of 1.41 (Hacker & Ratcliff, 1979). It seems reasonable to account for the alternative of rejecting the lineup as an additional choice. After all, for a target-absent lineup, the correct choice is rejection of the lineup. If this had not been done in the present instance, the six-alternative, forced-choice d’ would have been 1.30. If we substitute 1.41 for the value of m in Wickelgren’s (1975, 1977) equation and solve for L, we come up with a very plausible estimate of initial memory strength for the bank teller eyewitnesses of Pigott et al. (1990), d’ = 1.79, equivalent to 67% correct on a seven-alternative, forced-choice task (Hacker & Ratcliff, 1979). After 4 hr, then, memory strength for the perpetrator of attempted bank fraud was just 79% of what it had been originally. Had the tellers not been tested until a week after the encounter with the perpetrator, memory strength would have been approximately d’ = 1.24, equivalent to a predicted performance score of 49% correct, considerably lower, although still better than chance. The d’ value of 1.79 plausibly represents an upper limit for initial memory strength for eyewitnesses in many forensic situations, at least for those not having a highly distinctive or memorable perpetrator. That is, it is fair to say that the forensic scene for Pigott et al.’s (1990) bank tellers represents a close to optimal situation for an eyewitness. Consider that the perpetrator was in full view for 1.5 min, an amount of target exposure greater than the “critical value” of 1.0 min noted by Valentine, Pickering, and Darling (2003) in their massive study of lineups conducted by the London Metropolitan Police. The banks were well illuminated.
The perpetrators were not disguised. There was no alternative focus of attention present that might ordinarily have been expected to draw the teller’s attention away from the perpetrator’s face, such as a weapon; only the face of the money order proffered by the man trying to perpetrate bank fraud required some attention. Tellers should not have been operating under high stress levels, given the absence of any personal threat to them. To the extent that one or more of the optimal conditions of the Pigott et al. (1990) study are not met in any given forensic situation, then, the predicted initial memory strength for an eyewitness should be less than the figure of 67% correct predicted for the Florida bank tellers in Pigott et al.’s field experiment. Until further research is conducted (testing three or more retention intervals and the effect of varying durations of target person exposure, target person distinctiveness, and illumination levels, for instance), conservative advice to a trier of fact would be that the typical eyewitness viewing a perpetrator’s face that was not highly distinctive would be expected to have no more than a 50% chance of being correct in his or her lineup identification (six-person lineup) at a 1-week delay. Clearly the trier of fact would still need to consider other specific facts of the case to decide how much less than 50%, if any, the chance of a correct identification might be in these less than optimal witnessing conditions. Retention interval benchmarks other than 1 week can be readily calculated, of course, using Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory of forgetting and the data provided by Pigott et al. These calculations do assume that the lineup’s construction and administration have been conducted fairly. However, a post hoc assessment of lineup fairness is relatively straightforward (Malpass, Tredoux, & McQuiston-Surrett, 2007).
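The benchmark calculation mentioned above can be sketched in Python from the Pigott et al. (1990) data point (d’ of 1.41 at a 4-hr delay). The sketch assumes the retention function m = L · t^(−D) · e^(−It) with D = .025 and an interference parameter of 6 × 10⁻⁸. That interference value is an assumption, chosen because it is the value reported for the same-race curve fits. The 30-day benchmark and all names are illustrative. Converting d’ to percentage correct requires the Hacker and Ratcliff (1979) tables and is not attempted here.

```python
import math

D = 0.025   # time-decay parameter reported in the article
I = 6e-8    # assumed interference parameter (value used for same-race fits)

def retention_factor(t_seconds):
    """Fraction of initial strength left after t seconds under the assumed parameters."""
    return t_seconds ** (-D) * math.exp(-I * t_seconds)

# Pigott et al. (1990): d' = 1.41 measured 4 hr after the encounter.
observed_d_prime = 1.41
test_delay_s = 4 * 3600
L_hat = observed_d_prime / retention_factor(test_delay_s)

benchmarks_s = {
    "4 hr": test_delay_s,
    "1 day": 86400,
    "1 week": 7 * 86400,
    "30 days": 30 * 86400,   # illustrative extra benchmark
}
strengths = {label: L_hat * retention_factor(t) for label, t in benchmarks_s.items()}

print(f"estimated initial d' = {L_hat:.2f}")
for label, m in strengths.items():
    print(f"{label:>8}: d' = {m:.2f} ({m / L_hat:.0%} of initial strength)")
```

Under these assumptions the sketch should recover an initial d’ close to 1.79 and a 1-week d’ close to 1.24, in line with the figures quoted for the bank tellers.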
To illustrate in particular the need for more research on the relationship of variations in target face distinctiveness to initial memory strength, consider the results of using power-exponential theory to predict initial memory strength for the eyewitnesses in the field studies of Read, Tollestrup, Hammersley, McFadzen, and Christenson (1990). Averaging across four different photo lineup conditions and 212 retail clerks, performance was at a level of 76% correct at a 48-hr retention interval, equivalent to a d’ of 2.10. Using this value of memory strength to estimate initial memory strength, we find a d’ value of 2.87 (91% correct), considerably higher than the 67% correct initial memory strength figure predicted for Pigott et al.’s (1990) bank tellers. Predicted performance level for Read et al.’s clerks would have been 73% correct at a memory test delay of 1 week, equivalent to a memory strength value of d’ = 1.98. However, because only one perpetrator was used across the four different 48-hr retention interval conditions to which Read et al.’s clerks were exposed, it may well be that their higher performance level than that obtained by Pigott et al.’s (1990) eyewitnesses and those in other field experiments was due to the single perpetrator having a rather distinctive, and hence memorable, face.

Conclusions

Psychological science is now in a position to offer the trier of fact more than vague generalities regarding the relationship between retention interval and strength of the memory representation for the once-seen face. The results of our meta-analysis confirm that there is indeed a statistically reliable association between longer retention intervals and decreased face recognition memory, an association equally true of face recognition memory and eyewitness identification studies. That is, there is an increase in positive forgetting as the delay increases between encoding of a face and test of one’s memory for it.
The present meta-analytic review of the literature also provides support for Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory of forgetting from recognition memory, using data from studies of memory for once-seen faces that used diverse methodologies and a wide range of retention intervals. Fitting Wickelgren’s power- exponential theory to 11 different empirical forgetting functions providing group data resulted in statistically satisfactory fits, fits predicting where future points will fall on the function as retention interval increases. In addition, power-exponential theory provides an estimate of initial memory strength. This latter feature of the theory, particularly useful when applied to the results of field studies, permits calculation of not only an estimate of initial memory strength (d’) but also calculation of a strength estimate at any given retention interval. Hence, not only can percentage of initial memory strength remaining be determined at any retention interval, but the strength estimate at a particular retention interval can also be readily translated into a probability of being correct on a fair lineup of a specified size. Of course, to be practically useful, these estimates would need to be calculated for and clearly ex- plained to the trier of fact by an eyewitness memory expert. When considering the applicability of our findings, at least two concerns might be raised. First, throughout this article, we have assumed that the amount forgotten is a function of the current strength of the memory representation. It could be objected that what is remembered is also very much determined by retrieval conditions, such as the type of memory test. Recognition memory tests are often more sensitive measures of memory than are recall tests for the same material, for instance. 
Furthermore, the encoding specificity principle proposes that the amount of forgetting at retrieval is a function of the degree of match between encoding context and retrieval context. Although we do not deny the validity of these objections, we do not see them as problematic for the applicability of Wickelgren’s (1972, 1974, 1975, 1977, 1979) theory to eyewitness memory. For one thing, the retrieval tasks in the studies we have reviewed, eyewitness identification memory tests and laboratory face recognition tests, differ somewhat but are still essentially tests of recognition memory. One of our moderator analyses showed that type of retrieval task, recognition memory or eyewitness identification, was not a moderator of effect size for the correlation between retention interval length and memory strength. In addition, even though the match of encoding and retrieval context is typically greater for recognition memory tasks than for eyewitness identification tasks, this difference did not affect the strength of the relationship between retention interval and memory strength, either. Furthermore, Wickelgren’s theory was equally effective at predicting memory strength at any given retention interval for both eyewitness identification and face recognition memory studies. A second concern that might be raised relates to the fact that our curve-fitting exercise could only be applied to 11 forgetting func- tions from just eight published studies. The robustness of the data sets underlying these 11 functions clearly depends on the overall quality of the eight published papers. Our considered judgment is that there is little to be concerned about in this regard. Quality of fit of theory and data shows no obvious relationship to any per- ceived differences in quality of publication source or any minor differences in quality of methodology and data analysis. 
In any event, psychologists interested in the psychology of testimony now have much more abundant direct evidence bearing on their belief that the forgetting function for the once-seen face is Ebbinghausian in nature (cf. Kassin et al., 2001): Rate of memory loss for an unfamiliar face is greatest right after the encounter and then levels off over time. Psychological science can now also provide to both these same psychologists and triers of fact rather more specific details concerning the decline and fall of a face’s memory representation over time and succeeding facial encounters. References References marked with an asterisk indicate studies included in the meta-analysis. *Barkowitz, P., & Brigham, J. C. (1982). Recognition of faces: Own-race bias, incentive, and time delay. Journal of Applied Social Psychology, 12, 255–268. Bornstein, B. H., Deffenbacher, K. A., McGorty, E. K., & Penrod, S. D. (2007). The effect of cognitive processing on facial identification accu- racy: A meta-analysis. Unpublished manuscript, University of Nebraska— Lincoln. *Brewer, N., Caon, A., Todd, C., & Weber, N. (2006). Eyewitness iden- tification accuracy and response latency. Law and Human Behavior, 30, 31–50. *Brigham, J. C., Maass, A., Snyder, L. D., & Spaulding, K. (1982). Accuracy of eyewitness identifications in a field setting. Journal of Personality and Social Psychology, 42, 673–681. Chance, J. E., & Goldstein, A. G. (1984). Face-recognition memory: Implications for children’s eyewitness testimony. Journal of Social Issues, 40, 69–85. *Chance, J. E., & Goldstein, A. G. (1987). Retention interval and face recognition: Response latency measures. Bulletin of the Psychonomic Society, 25, 415–418. *Chance, J. E., Goldstein, A. G., & McBride, L. (1975). Differential experience and memory for faces. Journal of Social Psychology, 97, 243–253. Clark, S. E. (2005). A re-examination of the effects of biased lineup instructions in eyewitness identification. 
Law and Human Behavior, 29, 575–604. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum. *Courtois, M. R., & Mueller, J. H. (1981). Target and distracter typicality in face recognition. Journal of Applied Psychology, 66, 639–645. *Cutler, B. L., Penrod, S. D., & Martens, T. K. (1987a). The reliability of eyewitness identifications: The role of system and estimator variables. Law and Human Behavior, 11, 233–258. *Cutler, B. L., Penrod, S. D., & Martens, T. K. (1987b). Improving the reliability of eyewitness identification: Putting context into context. Journal of Applied Psychology, 72, 629–637. *Cutler, B. L., Penrod, S. D., O’Rourke, T. E., & Martens, T. K. (1986). Unconfounding the effects of contextual cues on eyewitness identifica- tion. Social Behaviour, 1, 113–134. Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579. (1993). *Davies, G. M., Ellis, H. D., & Shepherd, J. W. (1978). Face identification: The influence of delay upon accuracy of Photofit construction. Journal of Police Science and Administration, 6, 35–42. Deffenbacher, K. A. (1986). On the memorability of the human face. In H. D. Ellis, M. A. Jeeves, F. Newcombe, & A. Young (Eds.), Aspects of face processing (pp. 61–70). Dordrecht, the Netherlands: Martinus Nijhoff. 148 DEFFENBACHER, BORNSTEIN, MCGORTY, AND PENROD Deffenbacher, K. A., Bornstein, B. H., & Penrod, S. D. (2006). Mugshot exposure effects: Retroactive interference, mugshot commitment, source confusion, and unconscious transference. Law and Human Behavior, 30, 287–307. Deffenbacher, K. A., Bornstein, B. H., Penrod, S. D., & McGorty, E. K. (2004). A meta-analytic review of the effects of high stress on eyewit- ness memory. Law and Human Behavior, 28, 687–706. *Deffenbacher, K. A., Carr, T. H., & Leu, J. R. (1981). Memory for words, pictures, and faces: Retroactive interference, forgetting, and reminis- cence. 
Journal of Experimental Psychology: Human Learning and Mem- ory, 7, 299–305. Ebbinghaus, H. (1913). Memory: A contribution to experimental psychol- ogy (H. A. Ruger & C. E. Bussenius, Trans.). New York: Teachers College, Columbia University. (Original work published in 1885). *Egan, D., Pittner, M., & Goldstein, A. G. (1977). Eyewitness identi- fication: Photographs vs. live models. Law and Human Behavior, 1, 199 –206. Elliott, R. (1993). Expert testimony about eyewitness identification: A critique. Law and Human Behavior, 17, 423–437. *Ellis, H. D., & Flin, R. H. (1990). Encoding and storage effects in 7-year-olds’ and 10-year-olds’ memory for faces. British Journal of Developmental Psychology, 8, 77–92. *Ellis, H. D., Shepherd, J. W., & Davies, G. M. (1980). The deterioration of verbal descriptions of faces over different delay intervals. Journal of Police Science and Administration, 8, 101–106. Ellis, H. D., & Young, A. W. (1989). Are faces special? In A. W. Young & H. D. Ellis (Eds.), Handbook of research on face processing (pp. 1–26). Amsterdam: North-Holland. *Goldstein, A. G., & Chance, J. (1971). Visual recognition memory for complex configurations. Perception & Psychophysics, 9, 237–241. *Goodman, G. S., Hirschman, J. E., Hepps, D., & Rudy, L. (1991). Children’s memory for stressful events. Merrill-Palmer Quarterly, 37, 109–158. Hacker, M. J., & Ratcliff, R. (1979). A revised table of d ’ for M-alternative forced choice. Perception & Psychophysics, 26, 168–170. Kassin, S. M., Tubb, V. A., Hosch, H. M., & Memon, A. (2001). On the “general acceptance” of eyewitness testimony research: A new survey of the experts. American Psychologist, 56, 405–416. *Krafka, C., & Penrod, S. D. (1985). Reinstatement of context in a field experiment on eyewitness identification. Journal of Personality and Social Psychology, 49, 58–69. *Krouse, F. L. (1981). Effects of pose, pose change, and delay on face recognition performance. Journal of Applied Psychology, 66, 651–654. 
*Laughery, K. R., Fessler, P. K., Lenorovitz, D. R., & Yoblick, D. A. (1974). Time delay and similarity effects in facial recognition. Journal of Applied Psychology, 59, 490–496. *MacLin, O., MacLin, K. M., & Malpass, R. S. (2001). Race, arousal, attention, and delay: An examination of factors moderating face recog- nition. Psychology, Public Policy, & Law, 7, 134–152. Malpass, R. S., Tredoux, C. G., & McQuiston-Surrett, D. (2007). Lineup construction and lineup fairness. In R. C. L. Lindsay, D. F. Ross, J. D. Read, & M. P. Toglia (Eds.), The handbook of eyewitness psychology, Vol. 2: Memory for people (pp. 155–178). Mahwah, NJ: Erlbaum. *Mauldin, M. A., & Laughery, K. R. (1981). Composite production effects of subsequent facial recognition. Journal of Applied Psychology, 66, 351–357. Meissner, C. A., & Brigham, J. C. (2001). Thirty years of investigating the role of the own-race bias in memory for faces: A meta-analytic review. Psychology, Public Policy, & Law, 7, 3–35. *Memon, A., Bartlett, J., Rose, R., & Gray, C. (2003). The aging eyewitness: Effects of age on face, delay, and source-memory ability. Journal of Gerontology: Psychological Sciences and Social Sciences, 58B, 338 –345. *Peters, D. P. (1988). Eyewitness memory and arousal in a natural setting. In M. M. Gruneberg, P. E. Morris, & R. N. Sykes (Eds.), Practical aspects of memory: Current research and issues, Vol. 1: Memory in everyday life (pp. 89–94). Chichester, England: Wiley. *Peters, D. P. (1991). The influence of stress and arousal on the child witness. In J. Doris (Ed.), The suggestibility of children’s recollections (pp. 60–76). Washington, DC: American Psychological Association. *Peters, D. P. (1997). Stress, arousal, and children’s eyewitness testimony. In N. L. Stein, P. A. Ornstein, B. Tversky, & C. J. Brainerd (Eds.), Memory for everyday and emotional events (pp. 351–370). Hillsdale, NJ: Erlbaum. Pigott, M. A., Brigham, J. C., & Bothwell, R. K. (1990). 
A field study on the relationship between quality of eyewitnesses’ descriptions and iden- tification accuracy. Journal of Police Science and Administration, 17, 84–88. Pike, G., Brace, N., & Kynan, S. (2002). The visual identification of suspects: Procedures and practice (Briefing note 2/02 Policing and Reducing Crime Unit, Home Office Research, Development and Statis- tics Directorate) [online]. Retrieved December 3, 2007, http:// www.homeoffice.gov.uk/rds/prgbriefpubs1.html *Podd, J. (1990). The effects of memory load and delay on facial recog- nition. Applied Cognitive Psychology, 4, 47–59. *Read, J. D., Hammersley, R., Cross-Calvert, S., & McFadzen, E. (1989). Rehearsal of faces and details in action events. Applied Cognitive Psy- chology, 3, 295–311. *Read, J. D., Tollestrup, P., Hammersley, R., McFadzen, E., & Chris- tensen, A. (1990). The unconscious transference effect: Are innocent bystanders ever misidentified? Applied Cognitive Psychology, 4, 3–31. Reisberg, D., & Heuer, F. (2007). The influence of emotion on memory for forensic settings. In M. P. Toglia, J. D. Read, D. F. Ross, & R. C. L. Lindsay (Eds.), The handbook of eyewitness psychology, Vol. 1: Memory for events (pp. 81–116). Mahwah, NJ: Erlbaum. Rosenthal, R. (1995). Writing meta-analytic reviews. Psychological Bul- letin, 118, 183–192. Rosenthal, R., & DiMatteo, M. R. (2002). Meta-analysis. In J. Wixted (Ed.), Stevens’ handbook of experimental psychology (Vol. 4, pp. 391– 428). New York: Wiley. Ryback, R. S., Weinert, J., & Fozard, J. L. (1970). Disruption of short-term memory in man following consumption of ethanol. Psychonomic Sci- ence, 20, 353–354. *Scapinello, K. F., & Yarmey, A. D. (1970). The role of familiarity and orientation in immediate and delayed recognition of pictorial stimuli. Psychonomic Science, 21, 329–331. Sergent, J., Ohta, S., & MacDonald, B. (1992). Functional neuroanatomy of face and object processing. Brain, 115, 15–36. Shapiro, P. N., & Penrod, S. D. (1986). 
A meta-analytic analysis of facial identification studies. Psychological Bulletin, 100, 139–156. *Shepherd, J. W., & Ellis, H. D. (1973). The effect of attractiveness on recognition memory for faces. American Journal of Psychology, 86, 627–633. *Shepherd, J. W., Ellis, H. D., & Davies, G. M. (1982). Identification evidence: A psychological evaluation. Aberdeen, Scotland: Aberdeen University Press. *Shepherd, J. W., Gibling, F., & Ellis, H. D. (1991). The effects of distinctiveness, presentation time, and delay on face recognition. Euro- pean Journal of Cognitive Psychology, 3, 137–145. *Smith, E. E., & Nielsen, G. D. (1970). Representations and retrieval processes in short-term memory: Recognition and recall of faces. Jour- nal of Experimental Psychology, 85, 397–405. Technical Working Group for Eyewitness Evidence. (1999). Eyewitness evidence: A guide for law enforcement. Washington, DC: U.S. Depart- ment of Justice, Office of Justice Programs. Valentine, T., Pickering, A., & Darling, S. (2003). Characteristics of eyewitness identification that predict the outcome of real lineups. Ap- plied Cognitive Psychology, 17, 969–993. 149FORGETTING THE ONCE-SEEN FACE *Walker-Smith, G. J. (1978). The effects of delay and exposure duration in a face recognition task. Perception & Psychophysics, 24, 63–70. *Wallace, G., Colheart, M., & Forster, K. I. (1970). Reminiscence in recognition memory for faces. Psychonomic Science, 18, 335–336. Wells, G. L. (1978). Applied eyewitness testimony research: System vari- ables and estimator variables. Journal of Personality and Social Psy- chology, 36, 1546–1557. Wells, G. L., Memon, A., & Penrod, S. D. (2006). Eyewitness evidence: Improving its probative value. Psychological Science in the Public Interest, 7, 45–75. Wickelgren, W. A. (1972). Trace resistance and the decay of long-term memory. Journal of Mathematical Psychology, 9, 418–455. Wickelgren, W. A. (1974). Single-trace fragility theory of memory dynam- ics. 
Memory & Cognition, 2, 775–780. Wickelgren, W. A. (1975). Alcoholic intoxication and memory storage dynamics. Memory & Cognition, 3, 385–389. Wickelgren, W. A. (1977). Learning and memory. Englewood Cliffs, NJ: Prentice-Hall. Wickelgren, W. A. (1979). Chunking and consolidation: A theoretical synthesis of semantic networks, configuring in conditioning, S-R versus cognitive learning, normal forgetting, the amnesic syndrome, and the hippocampal arousal system. Psychological Review, 86, 44–60. Wixted, J. T., & Carpenter, S. K. (2007). The Wickelgren power law and the Ebbinghaus savings function. Psychological Science, 18, 133–134. *Wixted, J. T., & Ebbesen, E. B. (1991). On the form of forgetting. Psychological Science, 2, 409–415. *Yarmey, A. D. (1979). The effects of attractiveness, feature saliency and liking on memory for faces. In M. Cook & G. Wilson (Eds.), Love and attraction (pp. 51–53). Oxford, England: Pergamon Press. *Yarmey, A. D. (2004). Eyewitness recall and photo identification: A field experiment. Psychology, Crime & Law, 10, 53–68. *Yarmey, A. D., Yarmey, M. J., & Yarmey, A. L. (1996). Accuracy of eyewitness identifications in showups and lineups. Law and Human Behavior, 20, 459–477.

Received October 9, 2007. Revision received February 20, 2008. Accepted February 25, 2008.
Psychological Bulletin, 1986, Vol. 100, No. 2, 139-156
Copyright 1986 by the American Psychological Association, Inc. 0033-2909/86/$00.75

Meta-Analysis of Facial Identification Studies

Peter N. Shapiro and Steven Penrod
University of Wisconsin-Madison

In this article we present a meta-analysis of 128 eyewitness identification and facial recognition studies, involving 960 experimental conditions and 16,950 subjects. The meta-analysis was designed to address two general questions. (a) What knowledge have we accumulated on factors that influence facial identification performance? (b) What areas of facial identification research would benefit from further research? Most of the article concerns the first issue, and two techniques were used: an effect size analysis, which integrates the effect sizes of independent variables across studies; and a study characteristics analysis, which integrates the influence of study characteristics on performance. A number of variables operating at the encoding and retrieval stages yield large effects on performance. These variables include context reinstatement, transformations in the appearance of faces, depth of processing strategies, target distinctiveness, and elaboration at encoding. Additional variables yielding strong effects on recognition performance are exposure time, cross-racial identification, and retention interval. There was little correspondence between a variable's impact on hit rate and false-alarm rate. Practical implications of the findings are discussed throughout the article.
This research was partially supported by a grant from the Wisconsin Alumni Research Foundation and National Institute of Justice Grants 80-IJ-CX-0034 and 84-IJ-CX-0010 to the second author. We thank Stevens Smith, Larry Heuer, James Coward, and Brian Cutler for their contributions to this article. We also thank John Brigham, Robert Buckhout, Betty House, Gary Wells, and an anonymous reviewer for their thoughtful comments regarding the article. Finally, we thank all of the authors who cooperated by sending us additional information about their studies. Peter Shapiro is now at Arbor Inc., The Science Center, 3401 Market Street, Philadelphia, Pennsylvania 19104. Correspondence concerning this article should be addressed to Steven Penrod, Department of Psychology, University of Wisconsin, W. J. Brogden Psychology Building, 1202 West Johnson Street, Madison, Wisconsin 53706.

Nature of the Problem

Recent exchanges in the American Psychologist (May, 1983, and September, 1984, issues) debated the usefulness of presenting jurors with expert testimony regarding factors that influence eyewitness accuracy. One major issue raised was whether the level of research knowledge about the factors that influence eyewitness accuracy justifies the courtroom appearance of expert witnesses to testify about these factors. In short, how adequate (and reliable) is the scientific foundation from which one might develop expertise on eyewitness identifications?

Having psychologists appear as expert witnesses on factors relevant to eyewitness performance is an issue that has generated substantial debate both within the legal profession and in psychological circles (Addison, 1978; Ellison & Buckhout, 1981; McClosky & Egeth, 1983b; Weinstein, 1981; Wells & Loftus, 1984; Woocher, 1977). Although some investigators assert that the level of knowledge justifies the use of expert witnesses (Loftus, 1983a, 1983b), others oppose this practice (McClosky & Egeth, 1983a, 1983b). One purpose of the present research is to ascertain how much we have learned about the factors that affect facial identification. A second purpose is to identify those domains where additional research is needed and those where additional research is not likely to yield fruitful results.

In this article we report a meta-analysis of research findings in a field that has grown exponentially in the last decade: research on eyewitness identification and facial recognition. For ease of presentation, these two terms are collectively referred to as facial identifications. Although we combine these research domains because they both involve facial identification, they do differ. For example, facial recognition studies are usually conducted by cognitive psychologists, generally to answer theoretical questions about facial memory. In contrast, eyewitness identification studies are usually conducted by social and cognitive psychologists to address applied questions about factors that influence actual eyewitness performance. Of course, there is substantial overlap between facial recognition and eyewitness identification research besides their use of a common dependent variable. For instance, many facial recognition studies have discussed their practical applications (e.g., Barkowitz & Brigham, 1982; Brigham & Williamson, 1979; Davies, 1983; Davies, Ellis, & Shepherd, 1981; Goldstein, Johnson, & Chance, 1979; Laughery, Alexander, & Lane, 1971; Laughery & Fowler, 1977; Malpass & Kravitz, 1969; Malpass, Lavigueur, & Weldon, 1973; Thomson, Robertson, & Vogt, 1982; Wagstaff, 1982; Woodhead, Baddeley, & Simmonds, 1979). In addition, researchers in both domains often use and examine the same independent variables (e.g., exposure time, context reinstatement, retention interval, target distinctiveness). Thus, our meta-analysis integrates two areas of research that investigate facial identification but have different emphases.
Although we address the field of eyewitness performance in general, our analyses are based only on facial identification studies. Thus, we do not include studies that use descriptions of an event, the estimated time of an event, multiple-choice questions concerning the event, and so on as dependent variables. These studies are surely important in eyewitness performance, but a meta-analysis requires the use of a common dependent variable. The common dependent variable in the present study is facial identification.

Knowledge of the factors that influence eyewitness performance has both practical and theoretical implications. As evidenced by the number of volumes on face and eyewitness identifications that have recently appeared, this topic has widespread appeal for empirical researchers. Within the last 8 years, seven major research-oriented volumes on eyewitness reliability have appeared (Clifford & Bull, 1978; Davies et al., 1981; Lloyd-Bostock & Clifford, 1983; Loftus, 1979; Shepherd, Ellis, & Davies, 1982; Wells & Loftus, 1984; Yarmey, 1979). However, the extent to which this quantity of research reflects substantial and cumulative development is not entirely clear and is the primary question we address in the meta-analysis.

Meta-analysis is a set of procedures that allows quantitative integration of research findings. The basic techniques were pioneered by and have received their impetus from Glass (1976), Rosenthal (1978), Glass, McGaw, and Smith (1981), and Hunter, Schmidt, and Jackson (1982). In meta-analyses, studies are treated as the unit of analysis and are combined by converting them into a common metric, the effect size. The effect size, as measured by the d statistic, is the standardized difference between the control and experimental conditions. In short, it measures the magnitude of impact or effect of an independent variable.
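To make the d statistic concrete, here is a minimal sketch that computes a standardized mean difference between two conditions. It assumes the pooled standard deviation as the standardizer, which is one common convention; the article does not say here which standardizer was used. The function name and the hit-rate data are hypothetical.

```python
import math
from statistics import mean, variance


def effect_size_d(experimental, control):
    """Standardized difference between two condition means.

    Assumption: the difference is divided by the pooled standard
    deviation of the two samples (other standardizers exist).
    """
    n1, n2 = len(experimental), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each condition needs at least two observations")
    pooled_var = (
        (n1 - 1) * variance(experimental) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical per-subject hit rates under long vs. short exposure time.
    long_exposure = [0.8, 0.7, 0.9, 0.6]
    short_exposure = [0.6, 0.5, 0.7, 0.4]
    print(f"d = {effect_size_d(long_exposure, short_exposure):.2f}")
```

Because d is expressed in standard deviation units, results from studies that used different tasks or scales can be placed on the same footing before they are combined.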
Several meta-analytic techniques are used to explore different aspects of facial identification research: In the first section of the article we present descriptive and inferential statistics of the variables that are most frequently manipulated in the facial identification area. In the second section we examine the effects on performance of different study characteristics (e.g., number of targets and decoys, exposure time per target, retention interval, etc.).

Method

Studies

A total of 960 experimental conditions from over 190 studies reported in 128 research articles were analyzed: 80% are facial recognition studies and 20% are eyewitness identification studies. Over 16,950 subjects participated in these studies and provided an average of 42 judgments. The data set is thus based on more than 713,600 separate subject judgments.

Studies were obtained in several ways: (a) from the bibliographies of the volumes mentioned previously (and other major publications); (b) from unpublished manuscripts, technical reports, and conference presentations solicited from researchers active in facial recognition research; and (c) from a computer search conducted on the Social Sciences Citation Index with the following key words: facial recognition, facial identification, eyewitness identification, eyewitness recognition, and eyewitness reliability.

Types of Variables Examined

Three kinds of variables were coded for each study: independent variables, study characteristics, and dependent variables. Independent variables were those factors manipulated by the researcher, for example, target distinctiveness (i.e., whether the target was unusual or ordinary looking). Nineteen independent variables are included in the analysis. Study characteristics are those features of research methodology held constant in a given study (but varied across studies).
Thus, variables such as exposure time (the amount of time subjects were exposed to a target or perpetrator) are study characteristics. There are 16 such study characteristics, but note that several study characteristics, such as exposure time, retention interval, target race, and target gender, also served as independent variables in a number of experiments.

Four dependent variables were coded: hits, false alarms, d', and β. A hit is a correct identification and a false alarm is an incorrect identification. The statistic d' measures overall sensitivity (i.e., the ability to detect a signal both when it is present and when it is absent), and β indexes subjects' decision criterion (a lax criterion means that subjects are more willing to guess). However, because many researchers reported hits but frequently failed to report false-alarm rates, our analyses focus on the larger set of hit rate data. Virtually all of the statistics on the false-alarm data are computed on target-present lineups. For hits and false alarms several statistics were calculated in addition to raw performance: z values on the tests of independent variable manipulations, d (effect size), and D (a weighted effect size that takes sample sizes into consideration). There were 443 comparisons on hits (i.e., 886 experimental conditions that yielded hit rates), 282 comparisons on false alarms, and 215 involving d' and β.

Coding and Analysis Strategies

Each experimental condition in the studies was coded to reflect the independent variables, study characteristics, and dependent variables. Most studies used two levels of independent variables. For studies that examined more than two levels, we coded the two extreme levels. For independent variables that are continuous in nature (e.g., exposure time and retention interval), the effect size analyses were based on a dichotomous short-long distinction (or low vs. high level) and ignore the particular values of the continuous variables.
Because different researchers choose differing levels of independent variables for particular studies, their choices can control the relative magnitude of experimental effects. For example, an exposure time manipulation of 2 s versus 2 min will lead to a larger effect than an 8- versus 14-s manipulation. Simply combining these two results can obscure such differences. However, we do retain the original metrics of the continuous independent variables in the study characteristics analyses, which make up our second section. Thus, for continuous variables, the study characteristics analysis provides a better index of variables' impact.

Although it is clear that the magnitude of effects produced by continuous variables depends on the experimenter who chooses the manner of operationalizing the independent variable, it is less obvious that some other variables (even the "dichotomous" qualitative ones) are affected by the same problem. For example, although distinctiveness can be a scaled variable, we have coded it as a dichotomous variable. Thus, in one comparison highly attractive and highly unattractive faces were combined and coded as "distinct" and compared with average looking faces: The effect size obtained in this comparison is likely to be smaller than one in which the experimenters explicitly calculated distinctiveness (i.e., "functional size"; see Malpass & Devine, 1983). Although combining different operationalizations of the same independent variable is disadvantageous in that it increases error variance, it has the advantages of allowing tests of multiple operationalizations and better estimates of the average impact of the variable (as the number of data points is larger and a larger range of the variable is assessed). Another reason for combining different operationalizations of a given independent variable is that in most instances too few studies used the same operationalization of a variable.
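The sensitivity and criterion measures coded as dependent variables can be derived from a hit rate and a false-alarm rate. The sketch below uses the standard equal-variance signal detection formulas, d' = z(H) - z(F) and a likelihood-ratio criterion β; it assumes that the article's decision-criterion statistic is this β, and the function names and example rates are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity: z(hit rate) minus z(false-alarm rate).

    Both rates must lie strictly between 0 and 1, otherwise
    inv_cdf raises statistics.StatisticsError.
    """
    return _STD_NORMAL.inv_cdf(hit_rate) - _STD_NORMAL.inv_cdf(false_alarm_rate)


def criterion_beta(hit_rate, false_alarm_rate):
    """Likelihood-ratio criterion (assumed definition of beta).

    Values above 1 indicate a strict criterion; values below 1
    indicate a lax one (more willingness to guess).
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return _STD_NORMAL.pdf(z_hit) / _STD_NORMAL.pdf(z_fa)


if __name__ == "__main__":
    # Hypothetical rates from a single experimental condition.
    hits, false_alarms = 0.75, 0.20
    print(f"d'   = {d_prime(hits, false_alarms):.2f}")
    print(f"beta = {criterion_beta(hits, false_alarms):.2f}")
```

Both functions need a false-alarm rate, which is why conditions that report only hits cannot contribute to analyses of sensitivity or criterion.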
The effect size analyses use d as the dependent variable and the study characteristics analyses use percentages of hit and false-alarm rates, and d' as dependent measures. One advantage of using percentages of hits and false alarms as the dependent variable is that a percentage is a highly meaningful dependent variable metric. Specifically, knowing that the difference between two independent variables yields a 10% difference in performance may be more meaningful than knowing it yields a 0.3 effect size difference in performance.

In studies with more than one independent variable, when the first one was examined the analysis collapsed over the second. This coding procedure sacrifices information about interaction effects. Yet, given the objective of the meta-analysis, the loss of information is quite minimal, for there were few instances in which more than one study experimentally crossed the same independent variables. Nonetheless, the two most frequently examined interactions (same-race vs. cross-race identification and same-sex vs. cross-sex identification) were converted into main effects, so we do examine two interactions.

To assess the reliability of the coding of the study characteristics, another person was trained briefly and then coded 23 articles. These articles were randomly selected from studies that were coded at the beginning, the middle, and the end of the coding. The rate of agreement between the two coders was 96%.

Variable Definitions

To clarify the variables we coded, we briefly describe a typical experiment's procedure. During the study phase subjects are exposed to a series of faces (in facial recognition studies) or to an event (in eyewitness studies). After a period of time has elapsed (i.e., the retention interval), subjects complete the recognition phase, in which they attempt to identify the target or targets that appeared at the study phase.
Faces that appear at the recognition phase but not at the study phase are called decoys.

Study characteristics and independent variables. Study characteristics are denoted SC, independent variables IV, and variables used both ways IV/SC. Where appropriate, descriptive statistics are provided for the variables. (Table 1 shows the number of times an IV was manipulated.)

1. Individual differences (IV). These research reports included 21 different subject characteristics measured in a variety of ways, such as race, gender, anxiety, field dependence, visual ability, and so on. The direction of effect was coded to attain (a) consistency within independent variables, and (b) the maximum effect size across independent variables.
2. Encoding instructions (IV). Generally, two encoding strategies were tested: whether subjects made inferences about psychological traits of the targets while viewing them (i.e., depth of processing) or whether subjects searched for a distinctive feature of the target.
3. Context reinstatement (IV). This variable measured whether or not context was reinstated with the use of cues previously associated with the targets or the incident at the study phase.
4. Target distinctiveness (IV). This involved the degree of similarity, high or low, between targets and decoys.
5. Sex of target (IV/SC). Fifty-four percent of the studies used male targets, 4% used female targets, and 42% used both genders.
6. Transformation (IV). This variable measured whether the target appearance differed between the study phase and the recognition phase. Twenty-six percent of the studies operationalized transformation as "disguise" (e.g., glasses, clothing, mustache), 21% changed the video display (e.g., upside down, negative), and 53% changed the targets' pose and/or expression.
7. Race of target (IV/SC). Eighty-one percent of the studies used only white targets, 6% used only black targets, and 13% used both.
8. Retention interval (IV/SC). This is the length of time between the study and recognition phases, M = 108 hr, SD = 507 hr.
9. Same versus cross-race identification (IV). Same or different race identification, such as black subjects viewing white faces or vice versa.
10. Same versus cross-sex identification (IV).
11. Mode of presentation of targets at recognition phase (IV/SC). Subjects were presented either live (2%), via film or videotape (2%), still photographs (94%), or line drawings (1%). This variable was coded on an interval scale (live-videotape-still-line drawings).
12. Elaboration (IV). Was the target paired with rich (e.g., several descriptions) or poor elaboration (e.g., no descriptions) at study phase?
13. Pose at recognition and study phase (IV/SC). Sixty-two percent of the studies used front views, 25% used mixed poses, and 13% had missing data (i.e., pose not known). In the effect size analyses, we call this variable pose at study, because that is when pose is usually manipulated.
14. Subject age (IV). Seventy-seven percent of the studies compared two "young" samples (e.g., 10-year-olds vs. 6-year-olds), and 23% compared young and "old" (i.e., college age and above) samples (e.g., 10-year-olds vs. 20-year-olds).
15. Training in facial identification (IV). Did subjects have prior training in facial recognition techniques?
16. Exposure time per face at study (IV/SC). This involved the number of seconds subjects were given to view each target (total exposure time divided by number of targets); M = 12 s, SD = 19 s.
17. Mode of presentation of targets at study (IV/SC). Subjects were presented live (15%), via color film or videotape or black and white film (9%), color or black and white still (73%), or in line drawings (1%). This variable was coded on an interval scale (live-videotape-still-line drawings).
18. Whether subjects had knowledge of the future recognition task (IV/SC). Sixty-one percent of the subjects knew about the future recognition task, 31% did not, and 8% had missing data.
19. Target-present versus target-absent lineup (IV). Were subjects presented with a lineup that contained the target or targets or one that did not?
20. Number of targets at study and recognition phases (SC). M = 22, SD = 19.
21. Number of faces at study (SC). For the number of targets and distractors at study, M = 25, SD = 24.
22. Total exposure time at study phase (IV/SC). This was the total amount of time given to view faces, M = 2.8 min, SD = 4.0 min.
23. Number of (usually same-race) decoys at the study recognition phase (SC). M = 40, SD = 37.
24. Number of faces shown simultaneously at the recognition phase (SC). M = 2.3, SD = 2.4. In studies that did not present faces successively (one after another), faces were shown in blocks (i.e., several faces were presented simultaneously) in which subjects were often told that one face in the block was the target.
25. Ratio of targets to decoys at recognition phase (SC). M = 0.64, SD = 0.58.
26. Attention (IV/SC). This variable measured whether subjects' attention was focused on targets or diffused throughout the scene. This was intentionally manipulated via instructions or was varied in eyewitness studies by exposing subjects to an event that did not capture their attention (coded on a 2-point scale: focused [88%] or diffused [12%] attention).
27. Type of study (SC). The studies were either of facial recognition (subjects are exposed to several faces, usually sequentially) or eyewitness identification (subjects are witnesses to an event, usually with one target); 80% were of facial recognition and 20% were of eyewitness identification.

Missing data were treated conservatively. Unless we could infer confidently the value of a variable, it was left missing. Some examples of when we did infer values follow.
(a) If the poses of faces were not described and the investigators used faces from yearbooks, we inferred that they were front views. (b) If the target's race was not expressly stated, we assumed that she or he was white. (c) In coding experimental results, if the value of a "significant" statistical test was not given, we coded it as being significant at the .05 level. (d) If the statistical test was not computed on raw hit or false-alarm rates, but on some sort of transformation, we used this transformation's effect size as the estimate of the effect size for the corresponding hit or false-alarm rate. (e) If only d' was given and we could not determine whether the effect was more evident in hits or false alarms, we coded half of the effect size for hits and half for false alarms. We made statistical inferences in less than 5% of the studies. In approximately 95% of the studies we were able to code all of the variables. In order to assure as complete a data set as possible, we solicited additional information from the authors of the studies that did not provide complete information.

Effect Size Analyses

We first present an effect size analysis of the independent variables that were most frequently manipulated in facial identification studies. For hits and false alarms, Table 1 presents the number of experimental conditions in which the independent variable was manipulated (N), the number of subjects in those studies (n), the average effect size, both unweighted (d) and weighted for study sample sizes (D), the standard deviation of the unweighted effect sizes, the cumulative weighted Z, and the cumulative probability for the Z. The entries in Table 1 are organized as follows. First, the d is associated with the first specified level of an independent variable. For example, for encoding instructions, the first level is "high" and the positive effect size indicates that special instructions yield better performance.
Second, regarding false alarms, the d is associated with improved performance and is inversely correlated with the number of false alarms. Thus, for encoding instructions, the 0.38 refers to fewer false alarms for the first specified condition (high levels) compared to the second or low-level condition.

We find that for hits, the following variables yielded the largest effects: context reinstatement, subject age, encoding instructions, elaboration, target distinctiveness, transformation, exposure time, pose at study, cross-racial identification, race of target, mode of presentation, and retention interval. For false alarms, these variables yielded the largest effect sizes: transformation, subject age, target distinctiveness, cross-racial identification, encoding instructions, and target present/absent lineup. It is also evident from Table 1 that many other variables yielded a cumulative probability greater than .05; thus we cannot reject the null hypothesis of no reliable relation between these independent variables and hit or false-alarm rates. Variables such as target gender, knowledge of recognition task, and training in facial recognition at recognition phase do not reliably influence facial identification performance.

To make the results more concrete to readers, several other statistics are reported for the independent variables (Table 2). First, mean percentage rates on hits and false alarms are presented for the independent variables, as well as the number of studies reporting them. From these percentages, d' and B" were calculated. In signal detection terminology, d' reflects sensitivity and B" measures decision criterion. The value of d' ranges from 0 (no sensitivity or "guessing") upward, but rarely exceeds 3 (very high sensitivity). The value of B" ranges from -1 (very lax criterion) to 1 (very strict criterion). The d' and B" analyses provide some insights into how a given variable affects performance.
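As a rough illustration of the two signal detection indices just described, the sketch below computes them from a hit rate and a false-alarm rate. The article does not spell out its formulas, so the code makes two assumptions: d' is taken as the standard equal-variance index z(H) - z(F), and B" is taken as Grier's nonparametric criterion index. Values computed this way need not reproduce every entry in Table 2.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Equal-variance sensitivity index: z(H) - z(F).

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def b_double_prime(hit_rate, fa_rate):
    """Grier's B" criterion index (assumed form; not stated in the article).

    Ranges from -1 (lax) to 1 (strict). Undefined when both rates are
    0 or 1, because the denominator is then zero.
    """
    h = hit_rate * (1.0 - hit_rate)
    f = fa_rate * (1.0 - fa_rate)
    if hit_rate >= fa_rate:
        return (h - f) / (h + f)
    return (f - h) / (f + h)

# Mean rates reported in Table 2 for the "high" encoding-instructions
# condition (74% hits, 21% false alarms) and for context reinstatement
# present (79% hits, 25% false alarms).
print(round(b_double_prime(0.74, 0.21), 2))
print(round(b_double_prime(0.79, 0.25), 2))
print(round(d_prime(0.74, 0.21), 2))
```

Under the Grier form, the two B" values come out at roughly 0.07 and -0.06, which agrees with the corresponding Table 2 entries; a negative value indicates a more lax criterion.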
For example, encoding instructions affect sensitivity (as evidenced by the differences in d' for the high and low levels), but seem not to affect criterion (as evidenced by the approximately equal B" for the high and low levels). In contrast, context reinstatement affects both sensitivity and criterion. On the whole, B" is roughly equal for the high and low levels of most independent variables, which suggests that the independent variables primarily influence sensitivity rather than criterion.

Implications of Effect Size Analyses

In this section we examine the independent variables in more detail and discuss the implications of our findings for future research and application to real-world identification procedures. To make the effect size analysis of Table 1 more concrete, it is useful to refer to Table 2, which has additional statistics associated with the effect size analysis. Table 2 shows the percentages for the continuous independent variables, but these results are potentially misleading because the impact depends on the levels chosen by the experimenter. Thus we urge the reader to refer to the regression analyses to gain insight into effects of the continuous variables. Also, because the individual differences variable comprises so many individual differences, we did not collapse across them and enter them in Table 2. Collapsing across them would yield meaningless results.

It is important to emphasize four cautions regarding Tables 1 and 2. First, the effect sizes are not additive. For example, the effect sizes for race of target and race of subject do not add up to the effect size for the Race of Target × Race of Subject interaction. The second caution is that the percentages associated with the high and low values of a variable (Table 2) are based on a smaller number of subjects than the effect size analysis.
The percentages may somewhat overstate average differences in performance because researchers were more likely to present their hit and false-alarm rates if their independent variables produced significant effects. Third, weighted effect sizes are almost always smaller than unweighted effect sizes, which suggests that some experimental procedures or study characteristics may mediate the effects of independent variables. Fourth, separate analyses demonstrated that for almost all independent variables, only about 20% of the variability in effect sizes could be attributed to sampling error (Hunter et al., 1982), which suggests that we should pay attention to possible moderating variables. Two such moderator variables are different operationalizations of a given independent variable (e.g., transformation), and the fact that independent variables (e.g., encoding instructions) were sometimes manipulated in between-subjects designs and sometimes in within-subjects designs, which yield smaller error variance and a larger effect size. Although there are well-developed methods for testing the extent to which methodological or other study characteristics are related to effect size (Hunter et al., 1982, offer a particularly lucid summary of such methods, which essentially involve comparing average effect sizes in studies that differ in some systematic way), we generally have too few studies to make such comparisons very informative, and have instead taken a slightly different approach to the problem by analyzing study characteristics.

Individual differences (1). With the exception of subject age, which is reported separately in Table 1, when one considers individual difference variables as a group, they yield a small but significant effect on hits and false alarms (Table 3). Table 3 shows the individual differences that were manipulated more than once, the average unweighted effect size, its cumulative z, and the probability level.
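The summary statistics used in these tables can be sketched as follows. The article does not give its formulas, so this is a minimal illustration under stated assumptions: the weighted average weights each d by its sample size, the cumulative z uses Stouffer's method (optionally weighted), and the share of variance due to sampling error uses a common large-sample approximation associated with Hunter et al. (1982), var_e = (4 / mean N)(1 + mean d squared / 8). All function names and the example data are hypothetical.

```python
import math

def unweighted_mean_d(ds):
    """Simple average of the study effect sizes."""
    return sum(ds) / len(ds)

def weighted_mean_d(ds, ns):
    """Average effect size weighted by each study's sample size."""
    return sum(d * n for d, n in zip(ds, ns)) / sum(ns)

def cumulative_z(zs, weights=None):
    """Stouffer's combined z; pass weights (e.g., sample sizes) to weight it."""
    if weights is None:
        weights = [1.0] * len(zs)
    numerator = sum(w * z for w, z in zip(weights, zs))
    return numerator / math.sqrt(sum(w * w for w in weights))

def sampling_error_share(ds, ns):
    """Approximate share of observed variance in d due to sampling error."""
    d_bar = weighted_mean_d(ds, ns)
    observed_var = sum(n * (d - d_bar) ** 2 for d, n in zip(ds, ns)) / sum(ns)
    mean_n = sum(ns) / len(ns)
    error_var = (4.0 / mean_n) * (1.0 + d_bar ** 2 / 8.0)
    if observed_var == 0:
        return 1.0
    return min(1.0, error_var / observed_var)

# Hypothetical studies: effect sizes, z values, and sample sizes.
ds = [0.2, 0.5, 0.8]
zs = [1.0, 2.0, 3.0]
ns = [40, 100, 60]
print(unweighted_mean_d(ds), weighted_mean_d(ds, ns))
print(cumulative_z(zs), cumulative_z(zs, ns))
```

A low sampling-error share, such as the roughly 20% reported above, is the usual signal that moderator variables deserve attention.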
In sum, although some individual differences are associated with better performance, the effects are generally small and are probably overestimated given the prejudice against reporting null findings (Greenwald, 1975).

Table 1
Manipulated Variables

Hits
Variable                                                                        N       n      d     SD      D      Z
1. Individual differences                                                     103   9,699   0.13   0.27   0.13    7.34***
2. Encoding instructions (high vs. low)                                        29   1,868   0.97   1.32   0.63    9.87***
3. Context reinstatement (yes vs. no)                                          23   1,684   1.91   1.87   1.39   17.54***
4. Target distinctiveness (high vs. low)                                       22   2,174   0.76   0.79   0.67   12.53***
5. Sex of target (male vs. female)                                             19   2,052   0.02   0.38   0.08    1.88
6. Transformation (none vs. disguise)                                          19   2,682   1.05   0.83   0.67   13.46***
7. Race of target (white vs. black or oriental)                                18   1,894   0.24   0.55   0.10    2.05*
8. Retention interval (short vs. long)                                         18   1,980   0.43   0.61   0.27    8.03***
9. Same- vs. cross-race identification                                         17   1,571   0.53   0.56   0.40    6.99***
10. Same- vs. cross-sex identification                                         13   1,197   0.14   0.19   0.23    3.18***
11. Mode of presentation at recognition phase (live or videotape vs. still)    13   1,807   0.07   0.28   0.14    3.13*
12. Face was associated with rich vs. poor elaboration at exposure time        10     362   1.00   0.67   0.98    8.15***
13. Pose at study (3/4 vs. front or profile)                                   10   1,266   0.53   0.87   0.30    5.37***
14. Subject age (young vs. old)                                                 9     603   1.10   0.68   0.78   13.34***
15. Training in facial recognition (yes vs. no)                                 8     534   0.18   0.58   0.08    0.54
16. Exposure time at study (long vs. short)                                     8     990   0.61   0.74   0.38    4.48***
17. Mode of presentation at study (live or videotape vs. still)                 5     896   0.50   0.80   0.18    3.92**
18. Knowledge of recognition task                                               5     703   0.10   0.12   0.05    0.42
Grand means for entire data set                                               443  44,301   0.47   0.85   0.32   25.57

False alarms
Variable                                                                        N       n      d     SD      D      Z
1. Individual differences                                                      70   6,941   0.08   0.25   0.06    2.23*
2. Encoding instructions (high vs. low)                                        19   1,733   0.38   0.48   0.21    2.07*
3. Context reinstatement (yes vs. no)                                          18   1,982  -0.44   0.40  -0.26   -2.75*
4. Target distinctiveness (high vs. low)                                       18   1,957   0.78   0.96   0.55    7.89***
5. Sex of target (male vs. female)                                             12   1,690  -0.07   0.34  -0.13   -3.40***
6. Transformation (none vs. disguise)                                           6   1,494   0.40   0.30   0.32    5.64***
7. Race of target (white vs. black or oriental)                                15   1,626   0.18   0.39   0.06    0.23
8. Retention interval (short vs. long)                                         14   1,868   0.33   0.45   0.20    2.02***
9. Same- vs. cross-race identification                                         14   1,432   0.44   0.55   0.44    7.43***
10. Same- vs. cross-sex identification                                          5     784   0.02   0.17   0.07    0.06
11. Mode of presentation at recognition phase (live or videotape vs. still)    10   1,407   0.17   0.36   0.07    0.14**
12. Face was associated with rich vs. poor elaboration at exposure time         2      72  -0.06   0.09  -0.07   -0.27
13. Pose at study (3/4 vs. front or profile)                                    4   1,027   0      0      0       0
14. Subject age (young vs. old)                                                 5     408   0.66   0.68   0.86   13.33***
15. Training in facial recognition (yes vs. no)                                 5     371  -0.04   0.06  -0.04    0.15
16. Exposure time at study (long vs. short)                                     8   1,389   0.22   0.31   0.08    0.67
17. Mode of presentation at study (live or videotape vs. still)                 2     280   0.27   0.68   0.05    0.27
18. Knowledge of recognition task                                               5   1,100   0.27   0.21  -0.06   -1.03
19. Target-present/absent lineup                                               12   1,694   0.65   0.50   0.63   12.35***
Grand means                                                                   282  28,232   0.18   0.49   0.21    7.95***

Note. d = unweighted average effect size. D = weighted average effect size.
*p < .05. **p < .001. ***p < .0001.

Encoding instructions (2) and degree of elaboration (12). These variables were analyzed separately, but it is useful to discuss them together because they both refer to the amount of information encoded with a face. They differ in that the encoding instructions typically call for substantial activity on the part of subjects (e.g., requiring them to make inferences about a face while looking at it). In contrast, elaboration refers to whether the face was associated or paired with one or several descriptors versus none, and in these studies the subjects take a passive role.
Both variables produce large effects on hits (ds = 0.97 and 1.00 for encoding instructions and elaboration, respectively), but small effects on false alarms (ds = 0.38 and -0.06). In fact, rich elaboration confers no benefit on false alarms. Although these variables have a large effect on hits, they only improve hit rates by 8% and 6%, respectively. One reason for the large effect size and small performance difference is that some of these studies have a small error term because of their within-subjects designs.

One promising avenue of future research is to incorporate encoding instructions into a training program to establish whether they can also be applied fruitfully to eyewitness performance. As we note later, training has yielded disappointing results for most researchers. Although some of the training procedures have used encoding strategies, not one has explicitly manipulated depth of processing. It is intriguing that simple encoding instructions yield large effects, in contrast to training procedures, which rely heavily on encoding instructions.

Table 2
Mean Hit and False-Alarm Rates From Experimental Studies

                                                                                  Hits          False alarms          d'                 B"
Variable                                                                      High  Low   N    High  Low   N     High   Low    N     High    Low
2. Encoding instructions (high vs. low)                                        74   66   26     21   27   17     0.68   0.48   17     0.07   0.06
3. Context reinstatement (yes vs. no)                                          79   52   23     25   18   18     0.77   0.39   18    -0.06   0.25
4. Target distinctiveness (high vs. low)                                       70   60   14     17   29   12     0.64   0.46   11     0.20   0.07
5. Sex of target (male vs. female)                                             74   72   18     14   14    9     0.71   0.71    9     0.23   0.25
6. Transformation (none vs. disguise)                                          75   54   19     22   30    5     0.74   0.32    5     0.06   0.09
7. Race of target (white vs. black or oriental)                                59   53   15     16   20    9     0.57   0.44    9     0.29   0.22
8. Retention interval (short vs. long)                                         61   51   16     24   32   11     0.47   0.15   11     0.14   0.06
9. Same- vs. cross-race identification                                         63   57   16     18   22   11     0.55   0.51   11     0.24   0.17
10. Same- vs. cross-sex identification                                         76   72   12     21   21    3     0.41   0.37    3     0.06   0.11
11. Mode of presentation at recognition phase (live or videotape vs. still)    50   50   11     30   26    7     0.10   0.10    7     0.09   0.14
12. Face was associated with rich vs. poor elaboration at exposure time        78   72   10     10   11    2     0.84   0.82    2     0.31   0.33
13. Pose at study (3/4 vs. front or profile)                                   66   54   10     41   39    2     0.20   0.20    2    -0.04   0.02
14. Subject age (young vs. old)                                                70   58    9     15   25    4     0.77   0.23    4     0.24   0.14
15. Training in facial recognition (yes vs. no)                                65   61    8     10   10    4     0.71   0.63    4     0.44   0.46
16. Exposure time at study (long vs. short)                                    69   57    5     34   38    3     0.39   0.00    3    -0.02   0.02
17. Mode of presentation at study (live or videotape vs. still)                72   58    4     30   38    1     0.53   0.29    1     0.02   0.02
18. Knowledge of recognition task                                              56   58    2     27   27    2     0.25   0.25    3     0.11   0.11
19. Target-present/absent lineup                                                                25   52   12

Context reinstatement (3). This variable yields the largest positive effect on hits (d = 1.91), but a negative effect on false alarms (d = -0.44). As Table 2 indicates, the difference in hit rates between subjects receiving and those not receiving context reinstatement is 27% (79% vs. 52%, respectively). Future research should be conducted on context reinstatement, as this variable might be exploited to improve eyewitness performance. For example, Malpass and Devine (1981) and Krafka and Penrod (1985) manipulated context reinstatement in an eyewitness paradigm and obtained parallel results: hits and false alarms both increased, but hits increased to a greater extent. However, the magnitude of the context reinstatement effect in more lifelike situations is considerably smaller (d = 0.50 for hits) than the effect in laboratory studies. Though the reasons for the difference between laboratory and field effects are unclear, the results may indicate that in real-world settings there are relatively few reinstatement cues not already used by eyewitnesses.
Future research might be directed to the problem of developing context reinstatement instructions and procedures that will increase real eyewitnesses' hit rates without increasing false alarms. One clue to how this can be done is provided by B" (Table 2). Context reinstatement cues are associated with a more lax criterion than baseline performance. Perhaps the extra retrieval cues provided by the context reinstatement give subjects an illusory "feeling of knowing" (e.g., Schacter, 1983) that results in increased confidence and a more lax criterion.

Table 3
Effect Sizes and Inferential Statistics of Individual Differences

Hits
Individual difference                      N     d      z
Women/men                                 48   0.10   4.11***
Blacks/whites                             14   0.17   2.64**
High/low verbal ability                    3   0.11   1.95*
Field independence/dependence              8   0.24   4.46***
Low/high anxiety                           6   0.11   1.83
High/low imagery                           4   0.11   0.68
Low/high self-consciousness                3   0.09   0.5
High/low verbal ability for pictures       2   0      0
High/low ability to describe faces         2   0.41   2.31*
Popular/unpopular children                 2   0.61   2.29*

False alarms (N, d, z; rows not aligned in the source): 26, 0.08, 2.77**; 10, -0.04, 2.05*; 3, 0, 0; 3, 0, 0; 6, 0.33, 3.69***

*p < .05. **p < .01. ***p < .001.

The "transformation" variable is conceptually like the context reinstatement variable in that they both compare the degree of correspondence between cues at the study and recognition phases (i.e., encoding specificity, see Tulving & Thomson, 1973). In addition, they both strongly affect facial identification performance. The difference between the two variables is that transformation detracts from performance because the comparison is between a control condition and a condition that receives the cue mismatch, but context reinstatement enhances performance in that the comparison is between a control condition and one that receives a cue match. Future research might combine these two variables to examine the extent that the transformation effect can be offset by the context reinstatement effect.
This research has obvious practical implications in that many crimes are committed by a disguised perpetrator.

Target distinctiveness (4). This variable produces a similar effect for hits (d = 0.76) and false alarms (d = 0.78), with distinctive targets being easier to recognize than ordinary looking targets. Target distinctiveness clearly has an impact at the retrieval stage in that the more the target resembles the decoy, the more likely it is to share some retrieval cues. Distinctiveness might also operate at the encoding stage (Light, Kayra-Stuart, & Hollander, 1979). That is, distinctive faces carry more information and may elicit more extreme judgments (McArthur, 1981), which could increase the level of processing. Again, there are several reasons to suspect that active encoding of information leads to better performance than passive processing. Future research needs to delineate more precisely these conditions.

Sex of target (5) and race of target (7) identification. Accuracy rates for female subjects are higher than for male subjects. White targets are more easily identified (the weighted d for hits is 0.10) but this yields no reliable effect on false alarms.

Transformation (6). This variable includes several types of manipulations. When the transformation was a disguise, the effect size was 0.71 for hits and 0.23 for false alarms. When the transformation was a change in pose and expression, the effect size was 1.22 for hits and 0.62 for false alarms. The effect size for exposing subjects to a different medium (from study to recognition, or inverting the faces) was 0.57 for hits and 0 for false alarms (in only one study).

Same versus cross-race (9) and same versus cross-gender (10) identification. Performance on same-gender targets is better than on cross-gender targets for hits but there are no differences for false alarms. However, same-race identification leads to many more hits and fewer false alarms than cross-race identification.
These results, however, need to be interpreted with caution, as Lindsay, Wells, and Rumpel (1981) have shown that in target-absent lineups, cross-race identifications lead to significantly fewer false alarms than same-race identification. Unfortunately, there are not sufficient data on target-absent lineups to assess how lineup construction interacts with other variables.

Mode of presentation at recognition phase (11) and mode of presentation at study phase (17). These are complex variables and the results do not lead to firm conclusions at present. We hypothesized that the more realistic the mode was, the more retrieval cues would be available, and consequently performance should be better. Thus, we predicted the following order (from worst to best performance): line drawings, still photographs, videotape, and live. Although some experiments support our ordering (e.g., Davies, 1983), some do not. For example, Dent (1977) showed that live lineup confrontations increase anxiety in children and thereby detract from their performance. Thus, the mode of presentation variable interacts with other variables.

Further research might investigate how the number of retrieval cues can be increased without making subjects anxious. Large, lifelike videotaped displays at the recognition stage might provide as many retrieval cues as a live lineup presentation, while avoiding the concomitant stress of a live lineup.

Pose at study (13). In this category three types of pose (three-quarters, profile, and front) were compared. When three-quarter and frontal poses were compared with a profile, they led to more hits (d = 0.75 in five studies) but no difference in false alarms. Faces in three-quarter poses were easier to recognize than faces in frontal view on hits (d = 0.50 in four studies), but there was no difference in false alarms. Thus, a three-quarters pose leads to the best performance, followed by frontal pose, and then by profile.
This ordering holds for hits but not for false alarms, which are unrelated to pose in the studies available.

Subject age (14). For hit rate, as one would expect, studies that compare two young samples obtain a smaller effect size than those comparing a young and an old sample, d = 0.94 versus 1.66. For false alarms, d = 1.01 for young versus young samples and 0.26 (in only one comparison) for young versus old populations.

Training (15). This is a perplexing variable because relatively extensive training procedures have not improved performance (Malpass et al., 1973; Woodhead et al., 1979), in contrast to brief (20-min) training sessions, which have improved performance (Elliott, Wills, & Goldstein, 1973). Although Table 1 indicates a nonsignificant improvement on hits and false alarms, we agree with Baddeley and Woodhead (1983) that with better techniques, training may be more effective. Indeed, the results obtained with other variables have some implications for training strategies.

As already mentioned, encouraging individuals to make inferences about a face will increase the level (depth) of processing. In addition to training subjects to look only at the target's face, it may be advantageous for witnesses to use a somewhat broader encoding strategy (Woodhead et al., 1979). It has been shown that "global" processing, that is, looking at the entire face, leads to better performance than "feature" processing, or looking only at particular facial features (Walker-Smith, Gale, & Findlay, 1977). Global processing involves attending to a wide array of stimuli, as opposed to fixating on a narrow range of stimuli. The previous research suggests that if the target has one or two distinctive features, it may be advantageous to focus on them, but if not, global processing may be better.
One shortcoming of the previous training studies is that they have not used real events; rather, the primary targets for identification have been still photographs (which probably reduces the generalizability of their "null" findings). Thus, part of the ineffectiveness of training procedures may be due to a "restriction" of the range of stimuli. In particular, using still photographs of faces may impose featural processing and not allow for global processing. Consequently, the number of potential retrieval and context reinstatement cues and the level of elaboration are probably quite limited. As far as we know, no one has tested global perception training techniques (in fact, as we just mentioned, the prior research does just the opposite), but there are theoretical reasons (and compelling context reinstatement results) for believing that this may be a fruitful avenue for research.

Knowledge of the recognition task (18). This variable was manipulated in five studies in which no demonstrable effect was produced in hits or false alarms.

Target-present/absent lineup (19). This variable is extremely important to forensic psychologists, yet has received scant research attention. Although its effect size is much smaller than some other variables (see Table 1), an examination of Table 2 indicates that accuracy rate is strongly affected. Specifically, subjects who view target-absent lineups produce a 52% false-alarm rate, compared with subjects who view target-present lineups, whose false-alarm rate is 25%. This point underscores Glass et al.'s (1981) warning that the intrinsic value of effect sizes is quite variable. Raw effect sizes must be qualified by the variability of the independent and dependent variables as well as other properties.
As noted previously, retention interval (8), subject age (14), and exposure time (16) were coded as continuous variables and a straightforward consideration of their effect sizes is potentially misleading. Rather than relying on average effect size analysis, in the next section we analyze the influence of retention interval and exposure time on raw hit and false-alarm rates using regression procedures. Subject age is not further analyzed because there was not sufficient variability in it. Outside of studies that manipulate subject age, most studies used college age samples. However, based on the effect size analysis, it is fair to say that young children (first through third graders) perform far worse than older children and adults.

Study Characteristics Analyses

The characteristics of all experimental cells were coded so that the influence of study characteristics on performance could be examined. By way of clarification, our study characteristics do not include variables such as quality of study, year of publication, and so on, but do include substantive variables. What effect do the viewing circumstances have on witness/subject performance when examined across studies? Of course, some study characteristics (e.g., exposure time, retention interval) were also independent variables in many studies, but analyses of study characteristics are potentially more informative because they have more than 950 data points (based on judgments from more than 16,500 subjects) for hits, whereas the independent variables generally have fewer than 20 data points (even though they are often based on between 1,000 and 2,000 subjects). The advantage of the analyses of the independent variables is that they are based on experimental manipulations, unlike the study characteristics, which are more susceptible to the influence of confounding variables. For variables that appear in both sets of analyses, each kind of analysis can be considered a validity check for the other.
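The cell-level approach described above, in which raw hit rates are related to a continuous study characteristic across experimental cells, can be sketched with an ordinary least squares fit. This is only an illustration: the cell values below are hypothetical, the log transformation of retention interval is an assumption rather than something stated in the text, and the article's own regressions involve more predictors.

```python
import math

def simple_ols(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical cells: (retention interval in hours, hit rate in percent).
cells = [(0.5, 72.0), (24.0, 64.0), (168.0, 58.0), (720.0, 55.0)]

log_hours = [math.log(hours) for hours, _ in cells]
hit_rates = [rate for _, rate in cells]

slope, intercept = simple_ols(log_hours, hit_rates)
print(f"hit rate = {intercept:.1f} + ({slope:.2f}) * ln(hours)")
```

Because each cell keeps its original metric (hours rather than a "short" versus "long" label), the fitted slope does not depend on which two levels any single experimenter happened to choose.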
There were two steps in our analysis of the impact of the study characteristics. First, the study characteristics were factor analyzed. The factor analysis was conducted first because our coding strategies involved recording study characteristics that we thought were likely to be interrelated. Second, we hypothesized that some multicollinearity in our measures would arise from the fact that facial recognition and eyewitness identification researchers use characteristically different research methods. That is, experimenters are not constructing experiments by randomly selecting study characteristics; rather, there is an orderliness and theoretical impetus to this research that results in "preferred" research methods in the two experimental domains. Simply looking (across studies) at Pearson correlation coefficients to assess the relation between each study characteristic and study results could be misleading because many of the study characteristics would not be independent. As expected, preliminary analyses revealed significant multicollinearity among the study characteristic variables, and we used the factor analysis to guide our subsequent analyses. The second step of the analysis (which we emphasize as exploratory) involves multiple-regression analyses designed to assess the relative impact of various study factors on identification performance.

Assessing the Structure of Study Characteristics

The 16 study characteristics of the 960 cells were included in the factor analysis. The independent variables were excluded because they did not vary enough. Encoding instructions, for example, was manipulated the second highest number of times, yet was varied in less than 6% of the studies. The number of times a given study was represented in the factor analysis was a function of the number of cells in the experiment. Thus, a study that manipulated four variables had eight cells (a "high" and "low" cell for each independent variable).
As we discuss in the regression analysis that follows, there seemed to be no important differences between studies that manipulated few versus several variables.

For the sake of brevity and clarity, we only highlight the three major factors that emerged from the analysis. We term the factor that accounted for most of the variance Optimality of Viewing, which included the following variables: degree that attention was focused on the targets, whether subjects had knowledge of the recognition task, type of study, and mode of presentation at study. Load at Study emerged as the second factor and included number of targets at study, number of faces at study, and total exposure time at study. Load at Recognition was the third factor and included number of decoys, number of total faces at recognition, and ratio of targets to decoys (which also loaded moderately highly on the Load at Study factor). The remainder of the variables either loaded by themselves on a single factor or did not load highly on any factor. These variables included pose, target sex, target race, duration of exposure per target, retention interval, and number of simultaneous faces at recognition.

It is probably apparent to the reader that the Optimality of Viewing factor might also be called a "Type of Study" factor that distinguishes between laboratory facial recognition studies and eyewitness studies. Studies characterized by "optimal" viewing are those in which subjects' attention is focused on the targets and subjects have knowledge of the recognition task; such studies tend to involve facial recognition using photographs. Although the type of study and the substantive variables (whether subjects had knowledge of the recognition task, the degree that attention was focused on the targets) are correlated, they are not entirely redundant (as will become evident when the regression analyses are presented); therefore, we maintain this distinction.
It is important not to attribute to the substantive variables variance that is actually a function of methodological variables that distinguish between facial recognition and eyewitness studies. It is also important not to underestimate the effect of theoretically and forensically important substantive variables just because they are confounded with research methodology. To separate the methodological variable (type of study) from the substantive ones (i.e., degree that attention was focused on targets at study, knowledge of the recognition task), we refer to the latter category as attention and analyze it separately from type of study.

Our analyses of the relations between study characteristics and performance take two forms. First we examine the proportion of variance in performance accounted for by discrete blocks of variables such as attention, load at study, and so on (with percentage of hits, percentage of false alarms, and d' as the dependent variables). Second, we examine the proportion of variance in performance that is accounted for when the effects of all the other independent variables have been partialed. Using raw performance rates and study-wise d' as dependent variables allows differentiation in performance across levels of study characteristics and they are therefore "range corrected" (Hunter et al., 1982) in ways that the effect size analyses are not.

Relations Between Study Characteristics and Hits, False Alarms, and d'

We started by investigating whether there were performance differences between studies that had missing data compared with those with complete data.
Because we wanted the regression analyses on study characteristics to include complete cases (i.e., as much data as possible), we included in them missing-data dummy variables for two of the study characteristics that had the highest proportions of missing data (Cohen & Cohen, 1975): pose (13% missing), and whether subjects had knowledge of the later recognition task (8% missing). Separate analyses indicated that there were small, but statistically significant Pearson product-moment correlations between missing data for pose and false-alarm rates (r = .14, p < .03), but not hit rates. The correlations were higher between missing values for knowledge and hit rates (r = -.20, p < .001) and false-alarm rates (r = .31, p < .001). Missing-data dummies were used as reference groups for both variables.

Each study was treated equally in these analyses. That is, a study that manipulated one variable was represented twice in the regression analysis (once for the high level and once for the low level), and a study that manipulated five variables was represented 10 times. This method of accumulating results across studies gives rise to two concerns. First, studies are not weighted for degrees of freedom. This problem is ameliorated because separate analyses indicate only a slight correlation (r = .10, p < .01) between degrees of freedom and hit rates, and a nonsignificant correlation between degrees of freedom and false-alarm rates. Second, using repeated observations from the same subjects within studies means that not all observations are truly independent of one another. Glass et al. (1981) discuss at length the implications of such nonindependence for tests of significance and parameter estimation. In light of the latter limitation, we must reemphasize the exploratory nature of our analyses.

Table 4
Hits

                                                  Zero order                    Partialed sr            B
Block, variables                                  R²                  r         R²        (full model)  coefficient
Attention                                         .33*                          .02*
  Degree that attention was focused on targets                        .52                 .09           4.64
  Mode of presentation at study                                       -.53                .07           4.95
  Knowledge of recognition task                                       .35                 .07           5.44
  No knowledge of recognition task                                    -.23                .01           0.92
Duration of exposure per face at study (s)        .07*                -.22      .003      .05           0.075
Duration (squared) of exposure per face at study  .03* (over linear)  -.18      .00       -.01          -0.0004
Pose                                              .25*                          .03*
  Mixed vs. others                                                    -.49                -.14          -11.07
  Front vs. others                                                    .43                 .00           0.17
Load at study                                     .17*                          .01*
  No. of targets                                                      .41                 .06           0.14
  No. of faces                                                        .35                 -.02          -0.03
  Total exposure time                                                 .19                 .03           0.001
Target race                                       .02*                          .02*
  White                                                               .13                 .11           7.02
  Black                                                               -.14                -.01          -1.11
Target sex                                        .15*                          .01*
  Male                                                                -.38                -.02          -1.56
  Mixed                                                               .36                 .03           2.98
Retention interval (min)                          .07*                -.29      .01*      -.11          -0.000075
Retention interval (squared)                      .03* (over linear)  -.18      .00       .03           0.00
Load at recognition                               .15*                          .00
  No. of simultaneous faces                                           -.15                .03           0.27
  Mode of presentation                                                -.16                -.01          -1.08
  No. of decoys                                                       .22                 -.01          -0.009
  Ratio of targets to decoys                                          .27                 -.01          -0.64
Type of study                                     .35*                          .03*
  Eyewitness vs. facial recognition                                   -.59                -.16          -16.15 a

Note. Total R² = .47; adjusted R² = .45. F(22, 671) = 27.18, p < .00005. p < .05 for r = .08, sr = .06. p < .01 for r = .11, sr = .08. p < .001 for r = .13, sr = .1. p < .0001 for r = .16, sr = .12.
a Intercept = 43.51.
*p < .001.

In order to assess the effect of study characteristics on performance, we primarily relied on the common distinction made among the encoding, storage, and retrieval stages of memory, which has obvious temporal and causal implications for the regression analysis. Within the broad framework of the stage analysis, the factor analysis results were used to structure regression analyses with the following blocks of factors and/or variables.
(a) attention (a 2-point scale), the dummy-coded knowledge of the recognition task (with missing data as the reference group), and mode of presentation at study;
(b) duration of exposure per face at study;
(c) duration of exposure per face at study squared (to assess a quadratic trend);
(d) pose (with dummies for mixed and frontal poses and missing data as the reference group);
(e) load at study (which included number of faces and targets at study, mode of presentation of targets and foils, and total exposure time);
(f) target race (both races as reference);
(g) target gender (women as reference);
(h) retention interval;
(i) retention interval squared;
(j) load at recognition (includes number of decoys and faces at recognition, ratio of targets to faces at recognition, number of simultaneous faces, and mode of presentation at recognition); and
(k) type of study (facial recognition vs. eyewitness).

It should be noted that the cubic trends of retention interval and duration of exposure per target were included in a preliminary analysis, but their tolerances were too low to enter the regression.

Tables 4-6 present a summary of the study characteristics analyses. The first set of results reports the percentage of variance in hit rates, false-alarm rates, and d' accounted for by a block of variables and by individual variables. These analyses do not control for multicollinearity among study characteristics, and although they suggest that a number of study characteristics variables are strongly related to recognition performance, the results have to be viewed very cautiously. A more informative analysis, which systematically controls for shared variance in the study characteristics, is based on a multiple regression analysis and labeled "partialed analyses." These results indicate the proportion of variance accounted for by blocks of variables (column 3) once the effects of all other variables have been partialed.
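The gap between a zero-order relation and a partialed one can be illustrated with the textbook two-predictor case. The sketch below is only an illustration: the function names and the correlation values are hypothetical and are not taken from Tables 4-6, and the analyses reported here involve 22 predictors rather than two. It relies on the standard identity that a predictor's squared semipartial correlation equals the variance it explains beyond the other predictor.

```python
from math import sqrt


def semipartial(r_y1: float, r_y2: float, r_12: float) -> float:
    """Semipartial correlation of predictor 1 with y, with predictor 2
    partialed out of predictor 1 (two-predictor case only)."""
    return (r_y1 - r_y2 * r_12) / sqrt(1.0 - r_12 ** 2)


def r2_full(r_y1: float, r_y2: float, r_12: float) -> float:
    """R-squared of the regression of y on both predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)


# Hypothetical values: two predictors that overlap heavily (r_12 = .80).
zero_order_r2 = 0.50 ** 2                        # predictor 1 considered alone
unique_r2 = semipartial(0.50, 0.50, 0.80) ** 2   # predictor 1 after partialing

# Hypothetical suppression case: the sign flips once predictor 2 is partialed.
sr_suppressed = semipartial(-0.10, 0.50, -0.60)

print(f"zero-order R2 = {zero_order_r2:.3f}, unique R2 = {unique_r2:.3f}")
print(f"semipartial under suppression = {sr_suppressed:.2f}")
```

With these illustrative numbers, the first predictor explains 25% of the variance when considered alone but only about 3% uniquely, which is the pattern expected when predictors are collinear. In the second case a negative zero-order correlation becomes a positive semipartial, the kind of sign change associated with suppression.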
Because there is multicollinearity in the data, the proportion of variance attributable to particular blocks after other variables have been partialed differs, in some instances dramatically, from the proportions reflected in the unpartialed analysis (column 1). We believe that the partialed analysis provides a more reliable overview of the relations we are interested in and, in any event, is a far more conservative method for assessing the relation between study characteristics and performance. Each table also contains the semipartial correlation for each separate independent variable; this indexes the unique relation between the particular variable and the dependent measure. Tables 4-6 also present the nonstandardized B weights for each independent variable from the full multiple regression (all variables entered simultaneously). For exposure and retention interval, the R²s are for the linear trend when it is entered before the quadratic trend, whereas the semipartials are for the full regression model.

Hit Rate Results

Because most studies analyze hits rather than other dependent variables, we discuss in most detail the analyses with hits as the dependent variable. As Table 4 shows, when blocks of variables and individual variables are considered alone, they all appear to account for a statistically significant portion of variance, with the blocks of attention and study type variables each accounting for about 33% of the variance. However, because of multicollinearity among the independent variables, these analyses clearly overstate the magnitude of the relations.

When the effects of all other variables are removed from the analyses, most variables and blocks of variables still explain significant portions of variance. The Attention factor accounted for a significant unique portion of variance (with attention, knowledge, and mode of presentation yielding significant semipartials).
The duration of exposure per face was marginally and positively related to hit-rate performance, whereas its quadratic component was not significant. Pose accounted for an additional 3% of variance. Load at study accounted for 1% of variance (with number of targets making the significant contribution). Target race accounted for 2% of variance and target sex for 1% (with whites being easier and men more difficult to recognize); retention interval accounted for 1% of the variance, but the quadratic component of retention interval was not significantly related to performance. Load at recognition did not account for any unique variance, whereas type of study accounted for 3% of the variance. In sum, these 11 sets of variables accounted for 47% of the variance in hit rates (with an adjusted R² of .45).

As noted previously, the variables under the Attention factor are confounded with the methodological variable type of study. This methodological variable accounts for a small but significant portion (3%) of the variance in performance. In contrast, when considered separately, it accounted for 35% of the variance. These results underscore that the facial recognition/eyewitness study distinction is almost entirely confounded with study characteristics that differentiate the two types of studies (indeed, nearly 75% of the variance in study type can be accounted for by variations in attention, knowledge, mode of presentation, exposure time, number of targets, and target race).

The regression analysis results suggest that within our set of studies, the quality of viewing is the most important determinant of facial identification performance, followed by pose, target race, and gender. These results are consistent with the effect size analyses that implicate encoding variables such as depth of processing and elaboration as factors having a substantial impact on hit-rate performance.
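The adjusted R² reported for the hit-rate model can be checked against the figures given in the note to Table 4. The sketch below assumes the usual adjustment formula and the usual overall F test for a regression; the function names are hypothetical. Because the published R² is rounded to two decimals, the recomputed F only approximates the reported 27.18.

```python
def adjusted_r2(r2: float, df_model: int, df_resid: int) -> float:
    """Adjusted R-squared from R-squared and the two F-test degrees of freedom."""
    n_minus_1 = df_model + df_resid  # total degrees of freedom, n - 1
    return 1.0 - (1.0 - r2) * n_minus_1 / df_resid


def overall_f(r2: float, df_model: int, df_resid: int) -> float:
    """Overall F statistic for the regression, computed from R-squared."""
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


# Values from the Table 4 note: total R2 = .47, F(22, 671).
adj = adjusted_r2(0.47, 22, 671)
f_stat = overall_f(0.47, 22, 671)
print(f"adjusted R2 = {adj:.2f}, F(22, 671) = {f_stat:.1f}")
```

The same function applied to the figures in the notes to Tables 5 and 6 (R² = .43 with F(22, 406), and R² = .45 with F(22, 292)) gives values that round to the reported .40 and .41.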
Contrary to what was expected, as load at study increased the hit rates also increased slightly. However, this finding should be considered in view of the findings obtained with false alarms and d' as the dependent variables, where load at study was inversely related to performance. Thus, although a high load at study is associated with increases in the hit rate, there is a much more dramatic increase in false-alarm rate and a reduction in d' (overall performance). It seems that a high load at study increases subjects' willingness to guess (as load at study increases hit rate but decreases overall performance), which underscores the point that some variables influence performance directly by influencing sensitivity (e.g., encoding variables like depth of processing), and some variables (e.g., load at study) influence performance indirectly by affecting decision criteria.

The duration of viewing each target at study variable has a negative zero-order correlation but a positive partial correlation, which indicates that net suppression has occurred (Cohen & Cohen, 1975). In line with intuition and the results of the effect size analysis, when all of the variables are entered in the regression equation, we find that as the time spent viewing a target increases, so does the hit rate. Another variable that we expected to yield a complex relation was retention interval. In this instance we found a significant linear trend, but the quadratic trend did not account for any unique variance. Again, this result is consistent with previous research (see Wells & Murray, 1983).

The effect size analysis revealed a small but statistically significant effect for target race, and the study characteristics analysis further suggests that whites are easier to recognize than blacks. However, in the regression analysis the target race main effect is essentially a cross-racial identification effect. Most subjects in the analyzed studies were white.
Though we did not explicitly code this variable, we know that except for studies that examined cross-racial identification (4% of our studies), few other studies used black subjects. Thus, we are inclined to see the results as highly supportive of the experimental cross-race findings. Inconsistent with the results from the effect size analysis, the regression results indicate that performance was significantly better in studies that used female (or mixed-sex) targets than in studies that used male targets.

The results for the pose variables indicate that the (dummy-coded) frontal view leads to better performance than other poses, and the "mixed" category (in which the researcher did not use a constant pose) led to poorer performance than other poses. These results are not directly comparable to the effect size analyses because the comparisons in those analyses looked at three-quarter views versus other types of poses. When not manipulated, pose was virtually never three-quarters (as a study characteristic) because faces were usually taken from yearbooks, which use a frontal view. The two sets of results confirm that pose may be an important variable from a forensic point of view, though as we noted earlier, it may be most useful to consider pose in terms of the retrieval cues made available to witnesses.

Table 5
False Alarms

                                                  Zero order                    Partialed sr            B
Block, variables                                  R²                  r         R²        (full model)  coefficient
Attention                                         .21**                         .06**
  Degree that attention was focused on targets                        .32                 -.06          2.48
  Mode of presentation at study                                       .37                 -.12          -7.04
  Knowledge of recognition task                                       -.32                -.18          -9.12
  No knowledge of recognition task                                    .12                 -.12          -6.56
Duration of exposure per face at study (s)        .13**               .36       .02**     .13           0.17
Duration (squared) of exposure per face at study  .08** (over linear) .28       .00       -.04          -0.001
Pose                                              .08**                         .02*
  Mixed vs. others                                                    .18                 -.04          -2.37
  Front vs. others                                                    -.28                -.12          -6.95
Load at study                                     .06**                         .03**
  No. of targets                                                      -.25                .09           -0.20
  No. of faces at study                                               -.20                -.07          0.12
  Total exposure time                                                 -.07                -.02          -0.002
Target race                                       .00**                         .01**
  White                                                               .04                 -.03          -1.51
  Black                                                               .05                 .07           4.80
Target sex                                        .10**                         .01**
  Male                                                                .31                 .02           2.06
  Mixed                                                               -.30                -.02          -1.73
Retention interval (min)                          .01                 .09       .00       .02           0.00
Retention interval (squared)                      .00 (over linear)   .02       .00       -.05          0.00
Load at recognition                               .19**                         .11**
  No. of simultaneous faces                                           .23                 -.04          -0.39
  Mode of presentation                                                .21                 .11           5.10
  No. of decoys                                                       -.36                -.21          -0.16
  Ratio of targets to decoys                                          -.19                -.13          10.15
Type of study                                     .30**                         .02**
  Eyewitness vs. facial recognition                                   .54                 .17           13.58 a

Note. Total R² = .43; adjusted R² = .40. F(22, 406) = 13.93, p < .00001. p < .05 for r = .11, sr = .09. p < .01 for r = .14, sr = .11. p < .001 for r = .17, sr = .14. p < .0001 for r = .20, sr = .18.
a Intercept = 24.60.
*p < .01. **p < .001.

False-Alarm Results

The same strategy that governed the analysis of hits was used for false alarms. As the summary results in Table 5 show, when considered alone, the Attention factor accounted for 21% of the variance in false-alarm performance whereas Type of Study accounted for 30%. However, the prominence of both factors declined dramatically in the partialed analysis. The attention, knowledge, and mode of presentation variables within the Attention factor all influenced false identifications in the manner expected.

The duration of exposure per face at study (linear trend) was marginally significantly related to false-alarm performance (note, once again, the suppression effects), but the direction of the relation is opposite to what one would expect based on previous research. That is, as the duration of exposure per face increases, so does the false-alarm rate. Of all the findings we present, this is the most anomalous. It could be due to a confounding variable that has not been coded (e.g., studies that show faces relatively briefly have easier recognition tasks).
Again, pose was significant, with frontal views leading to fewer false alarms than other poses. Load at study also accounted for 3% of the variance (with number of targets and faces positively correlated with false-alarm rates). Target race and gender each accounted for additional unique variance (with whites and women easier to recognize; again, the race result is best interpreted as a cross-racial effect). Retention interval did not affect false-alarm rate. In stark contrast with its weak effects on hits, load at recognition uniquely accounted for 5% of the variance in false alarms. Mode of presentation at recognition accounted for a large portion of this effect; the B weights indicate a 15% higher false-alarm rate when subjects are confronted with live targets as opposed to still photographs. Interestingly, there is a negative correlation between false-alarm rate and number of decoys (i.e., as the number of decoys increases, the false-alarm rate decreases). It is plausible that subjects use more strict criteria when choosing a target from a large lineup. Such a finding would be important to policymakers, because they are especially concerned about minimizing false-alarm rates. In all, the 11 sets of variables accounted for 43% of the variance in false-alarm performance, with an adjusted R² of .40.

It is important to underscore that the variables that strongly influenced hit rates did not concomitantly strongly influence false-alarm rates. The independent effect of variables on hits and false alarms is highlighted when one examines effect sizes (d) for variables on both hits and false alarms (see Table 1). A cursory glance indicates that few variables produce similar effects on both hits and false alarms. In fact, the correlation between the effect sizes on hits and false alarms (across independent variables) is not significant (r = -.06). These results are based on studies in which hits and false alarms are methodologically independent.
That is, in facial recognition studies where subjects are shown a sequence of faces, hit rates may have little bearing on false-alarm rates: a hit rate of 40% can be associated with 10%, 20%, or 90% false-alarm rates. In contrast, in some eyewitness studies if a subject was forced to make an identification from a target-present lineup, he or she can make only a hit or a false alarm. Naturally, in the latter set the methodology imposes a relation between hits and false alarms.

In sum, our results suggest that researchers should examine and report hits and false alarms and, to be complete, correct rejections and misses. Although d' takes these statistics into account, simply reporting d' gives little information regarding the differential impact of variables on hit and false-alarm rates. Policymakers would certainly be interested in hit rates and false-alarm rates, separately.

Results for d'

Although the study characteristics affect hit rates and false-alarm rates differently, it is still possible to integrate them with a studywise d' analysis. Macmillan and Kaplan (1985) show that this technique generally provides accurate estimates of the average d'. From the percentages of hits and false alarms, a studywise d' was computed. The same sets of variables and order of entry used for the hit and false-alarm analyses were used for d'. Table 6 provides a summary of the results.

The regression analysis suggests that attention and target race are the most important determinants of facial identification sensitivity, each uniquely accounting for 5% of the variance. The linear, but not the quadratic, trend for duration of exposure per face was significant. However, in common with the false-alarm rate findings, the correlation was negative (i.e., in the "wrong" direction). We believe the finding should be interpreted as due to an unknown (confounding) variable.
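The computation of a studywise d' from hit and false-alarm percentages can be sketched as follows. The text does not spell out the formula used, so this assumes the standard equal-variance Gaussian signal detection model, in which d' is the difference between the z-transformed hit and false-alarm rates; the function names are hypothetical, and the response criterion c is included only to illustrate the sensitivity versus criterion distinction drawn earlier.

```python
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity index d' = z(H) - z(FA); rates are proportions in (0, 1)."""
    return _Z(hit_rate) - _Z(fa_rate)


def criterion(hit_rate: float, fa_rate: float) -> float:
    """Response criterion c = -(z(H) + z(FA)) / 2; lower values mean a
    greater willingness to make an identification."""
    return -0.5 * (_Z(hit_rate) + _Z(fa_rate))


# A 40% hit rate paired with the three false-alarm rates mentioned in the text.
for fa in (0.10, 0.20, 0.90):
    dp = d_prime(0.40, fa)
    c = criterion(0.40, fa)
    print(f"H = 0.40, FA = {fa:.2f}: d' = {dp:+.2f}, c = {c:+.2f}")
```

The same 40% hit rate yields a d' of roughly 1.0 when the false-alarm rate is 10% and a negative d' when it is 90%, which is why hit rates alone say little about overall performance. Rates of exactly 0 or 1 have no finite z score and would need a correction before this function could be applied.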
Load at study accounted for 4% of the variance, pose accounted for 1%, and target gender accounted for 1% (with women more easily recognized). Retention interval was significant and its quadratic component was marginally significant. If plotted, the predictions would approximate the forgetting curves commonly found in memory literature. Load at recognition accounted for 3% of the variance (with mode of presentation at recognition, number of decoys, and ratio of targets to decoys making significant contributions), and type of study accounted for 2%. In sum, the 11 sets of variables accounted for 45% of the variance in d' with an adjusted R² of .41.

Table 6
d'

                                                  Zero order                    Partialed sr            B
Block, variables                                  R²                  r         R²        (full model)  coefficient
Attention                                         .32**                         .06**
  Degree that attention was focused on targets                        .48                 .08           0.23
  Mode of presentation at study                                       -.49                .10           0.35
  Knowledge of recognition task                                       .37                 .20           0.66
  No knowledge of recognition task                                    -.17                .15           0.50
Duration of exposure per face at study (s)        .14**               -.37      .01**     -.11          -0.01
Duration (squared) of exposure per face at study  .08** (over linear) -.29      .00       -.02          -0.00004
Pose                                              .05**                         .01**
  Mixed vs. others                                                    -.30                -.04          -0.18
  Front vs. others                                                    .33                 .04           0.15
Load at study                                     .06**                         .04**
  No. of targets                                                      .25                 -.09          -0.01
  No. of faces                                                        .20                 -.10          -0.01
  Total exposure time                                                 .08                 .03           0.0002
Target race                                       .03*                          .05**
  White                                                               .11                 .14           0.43
  Black                                                               -.16                -.06          -0.26
Target sex                                        .14**                         .01**
  Male                                                                -.38                -.04          -0.26
  Mixed                                                               .36                 .00           0.00
Retention interval (min)                          .04**               -.20      .01**                   -0.00001
Retention interval (squared)                      .03* (over linear)  -.13      .00       .05           0.00
Load at recognition                               .20**                         .03**
  No. of simultaneous faces                                           -.23                .04           0.03
  Mode of presentation                                                -.17                -.06          -0.19
  No. of decoys                                                       .33                 .16           0.007
  Ratio of targets to decoys                                          .27                 -.11          0.64
Type of study                                     .35**                         .02**
  Eyewitness vs. facial recognition a                                 -.59                -.13          -0.61

Note. Total R² = .45; adjusted R² = .41. F(22, 292) = 10.87, p < .00005. p < .05 for r = .11, sr = .09. p < .01 for r = .15, sr = .12. p < .001 for r = .18, sr = .15. p < .0001 for r = .21, sr = .18.
a Intercept = -0.16.
*p < .01. **p < .001.

As one recalls from the effect size analysis, the knowledge variable had a nonsignificant effect, in contrast to the study characteristics analysis, where it had a significant effect. One possible explanation for this discrepancy is that studies that inform subjects of the recognition task tend to be facial recognition studies where attention is often focused on the targets. Thus, in the study characteristics analysis, knowledge of the recognition task covaries with other variables (e.g., focused attention) that lead to relatively good performance. In contrast, when knowledge is manipulated, degree of attention is usually held constant and is probably high, which would obscure any differences between having knowledge versus not having knowledge of the recognition task. In short, if one wants to estimate the effect size of only knowledge, one should refer to the effect size analysis, but if one wants to estimate the effect of knowledge in relation to how it probably occurs in the real world (i.e., knowledge plus other related variables such as focused attention), one should refer to the study characteristics analysis.

The load at study findings suggest a paradox. As expected, heavy load at study is associated with poorer performance. The notion underlying this relation is that a large load at study means that a given target is operated on by fewer cognitive resources than if there were a lighter load at study. However, this result is counter to the effect size analyses indicating that the more details accompanying a face at encoding, the better the performance (i.e., the elaboration variable). Elaboration increases load at study and is presumably a passive process (as is being exposed to several faces).
Thus, it seems that a large load at study need not detract from performance. If the information is individuating, allows for easier organization, or does not increase interference as in the elaboration studies, a large load at study can improve performance. If, however, the large load at study is not amenable to effective organization, as when subjects are merely exposed to a large number of faces, then a large load at study will detract from performance. Perhaps future research can directly examine this hypothesis.

Another paradox is that performance increases as the number of decoys increases. This effect can be traced to the false-alarm data, where a large number of decoys is associated with lower false-alarm rates. However, some research (Alexander, 1972; Laughery et al., 1971; Laughery, Fessler, Lenorovitz, & Yoblick, 1974) indicates that hit rates are also lowered by having a large number of decoys. Interference processes obviously lower one's sensitivity as the number of decoys increases, and they probably also lower one's criterion, as subjects may be aware that their image of the target is less crisp. Future experiments should address these processes to establish whether the false-alarm rates decelerate faster than the hit rates (and therefore increase overall performance) as the number of decoys increases.

Conclusion

Our purpose was to quantitatively summarize facial identification and to identify areas calling for more research. The present meta-analysis demonstrates that psychologists have learned much about the factors affecting facial identifications. The factors that have the largest impact on identification accuracy include Context Reinstatement, Transformation, Depth of Processing, Target Distinctiveness, Elaboration, and the Target Present/Absent variable. Other factors that reliably affect performance include Exposure Time, Subject Age, Cross-Racial Identification, Retention Interval, and Pose.

Borrowing Wells's (1978) useful distinction, most of the variables in our analyses are estimator variables as opposed to system variables. Estimator variables are variables that cannot be manipulated once the eyewitness event has occurred. For example, we cannot manipulate an event's duration, or whether the target wore a disguise, after the event occurred. Estimator variables tend to operate at the encoding stage and because we cannot influence them after the fact, we can only estimate the magnitude of their impact (as the present analyses do). System variables, in contrast, tend to operate on storage and retrieval processes and can be manipulated after a given event has occurred. For example, law enforcement specialists can reinstate context and have eyewitnesses view lineups soon after seeing the perpetrator.

Wells (1978) implies that estimator variables have limited forensic use and recommends more research on system variables because they are easier to translate into good identification procedures. Although he cogently argues that system variables may be more valuable in forensics than estimator variables, he does not preclude the forensic importance of estimator variables. Actually, he suggests that estimator variables can be important for forensic researchers, provided they lead to theoretical development.

In an attempt to provide such theoretical development, we suggest that the variables we have examined might gainfully be summarized under three basic psychological processes or principles. Cognitive psychologists have used these principles for years, but some of our variables (e.g., pose) have not commonly been classified in accordance with them. It should be noted that variables may be relevant to more than one principle. The first principle is that performance is enhanced when the target at the encoding stage matches the target at the recognition stage.
The variables that can be subsumed under this encoding specificity principle (Tulving & Thomson, 1973) are context reinstatement, target distinctiveness, target present/absent, transformation, and pose. The second principle is that the more processing the target prompts, the better performance will be. Variables subsumed under this elaboration principle are encoding instructions, target distinctiveness, target present/absent, cross-race identification, and subject age. The third principle is that performance will be increased to the extent that the viewing conditions and identification procedures make more information available at identification. This principle is relevant to exposure time and retention interval. In short, as Wells (1978) implies, what is generalizable is not necessarily the findings of a given variable; rather, the process or processes underlying the variable are what can be extrapolated to real-world settings.
Our review goes beyond merely describing the effects of relevant variables. We have also examined the manner in which variables affect performance. For example, we have shown that most variables affect sensitivity rather than criterion. However, the increase in false alarms for the context reinstatement variable does appear to be due to a criterion change. Another distinction we explored, in addition to sensitivity versus criterion, is whether the variables affect hits, false alarms, or both. Such a distinction is crucial for forensic researchers who are more concerned with reducing false alarms than increasing hits. Interestingly, there was little correspondence between the effect sizes for hits and false alarms. It is important that future research discover why variables have different effects on the two dependent measures. A third distinction that we explored, although there was a paucity of research in this area, was whether a variable affects performance similarly in target-present and target-absent lineups.
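The sensitivity-versus-criterion distinction drawn above can be made concrete with a minimal sketch. It assumes the standard equal-variance signal detection model, which is an assumption on our part, since this section does not spell out the formulas used. The function name `sdt_measures` and all hit and false-alarm rates below are hypothetical illustrations, not values from the meta-analysis.

```python
from statistics import NormalDist
from typing import Tuple

# Standard normal distribution (mean 0, sd 1), used for the z-transform.
_STD_NORMAL = NormalDist()


def sdt_measures(hit_rate: float, false_alarm_rate: float) -> Tuple[float, float]:
    """Return (d_prime, criterion) under the equal-variance model (assumed here).

    d_prime   = z(hit) - z(false alarm): larger means better discrimination.
    criterion = -(z(hit) + z(false alarm)) / 2: lower (more negative) means
                a laxer willingness to say "yes, that is the target".
    Rates must lie strictly between 0 and 1; inv_cdf raises otherwise.
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Hypothetical rates: hits AND false alarms both rise in the second condition.
baseline = sdt_measures(hit_rate=0.69, false_alarm_rate=0.31)
shifted = sdt_measures(hit_rate=0.84, false_alarm_rate=0.50)

print("baseline d', c:", baseline)
print("shifted  d', c:", shifted)
```

With these made-up rates, sensitivity stays close to 1.0 in both conditions while the criterion becomes roughly half a standard deviation laxer. That is the pattern one would expect when a rise in false alarms reflects a criterion change rather than a loss of sensitivity.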
Some research (e.g., Lindsay et al., 1981) has shown that some variables (e.g., cross-race vs. same-race identifications) operate differently in target-present versus target-absent lineups. We urge future researchers to use the target-present/absent variable, as it is of fundamental importance to forensic settings.
We have made a number of suggestions for future research. In addition to those previously discussed, researchers should devote as much attention to false-alarm rates as they do to hit rates. As Table 1 indicates, researchers are more likely to report hit rates than false-alarm rates, but false-alarm rates are also important, especially to forensic psychologists. Another suggestion for future research is to use more realistic facial identification/eyewitness settings. Although our study characteristics analysis indicates that differing levels of performance obtained in laboratory versus field experiments can almost entirely be accounted for by the different levels of study characteristics used in the two types of studies, it is still possible that study setting affects the magnitude of effects produced by independent variables. We have noted, for instance, that the effect size for context reinstatement in field studies is 25% as large as the effect size for laboratory studies. We need to know whether context interacts with other variables, or whether the differences in the effect sizes reflect a different range of the independent variable (e.g., the range between context reinstatement/nonreinstatement is larger in a laboratory experiment than in a field experiment). In addition, research with an applied focus should examine more system variables, because they can easily be translated into real identification procedures (Wells, 1978).
The present meta-analysis is chiefly a main-effect analysis. We have quantified the impact of several independent variables (as main effects).
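The range explanation raised above for the field-versus-laboratory difference can be sketched numerically. The sketch assumes that effect size means a standardized mean difference with a pooled standard deviation; this section does not state the metric, so that choice is an assumption. The function name and every score below are hypothetical, with the numbers chosen only so that the weaker manipulation yields an effect one quarter the size of the stronger one.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def standardized_mean_difference(treatment: Sequence[float],
                                 control: Sequence[float]) -> float:
    """Mean difference divided by the pooled standard deviation (assumed metric)."""
    n_t, n_c = len(treatment), len(control)
    pooled_variance = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_variance)


# Hypothetical accuracy scores without context reinstatement.
control = [0.50, 0.55, 0.60, 0.65, 0.70]

# Hypothetical strong manipulation (wide range between conditions, as in a lab).
lab_reinstated = [score + 0.20 for score in control]

# Hypothetical weak manipulation (narrow range between conditions, as in the field).
field_reinstated = [score + 0.05 for score in control]

d_lab = standardized_mean_difference(lab_reinstated, control)
d_field = standardized_mean_difference(field_reinstated, control)

print("lab effect size:  ", d_lab)
print("field effect size:", d_field)
print("field / lab ratio:", d_field / d_lab)
```

Because the variability is held constant here, the ratio of effect sizes simply tracks the ratio of the raw differences. The sketch illustrates only the range explanation; it says nothing about whether context interacts with other variables, which is the competing account.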
To establish a "mature" data base, future researchers must uncover important interactions. For example, perhaps the same-race advantage is reduced or eliminated if the cross-race target has distinctive features or if there is a long exposure time. Researchers may also benefit by addressing different facets of identification processes. As previously mentioned, training procedures have inconsistently affected performance. Training manipulations seem designed to influence sensitivity. A more fruitful approach might be to influence eyewitnesses' criterion: to teach witnesses to use strict criteria when viewing conditions are unfavorable but to use lax criteria when viewing conditions are favorable (and base rate performance is high).
Finally, we should emphasize that our conclusions are qualified by two limitations: (a) Most of the studies we analyzed used target-present lineups, and (b) the meta-analysis investigated facial identification research and not other indices of eyewitness performance.
References
Addison, B. M. (1978). Expert testimony on eyewitness perception. Dickinson Law Review, 82, 465-485.
Alexander, J. F. (1972). Search factors influencing personal appearance identification. In A. Zavala & J. J. Paley (Eds.), Personal appearance identification (pp. 14-27). Springfield, IL: Charles C Thomas.
Baddeley, A. D., & Woodhead, M. M. (1983). Improving face recognition ability. In S. Lloyd-Bostock & B. R. Clifford (Eds.), Evaluating witness evidence (pp. 125-136). Chichester, England: Wiley.
Barkowitz, P., & Brigham, J. C. (1982). Recognition of faces: Own-race bias, incentive, and time delay. Journal of Applied Social Psychology, 12, 255-268.
Brigham, J. C., & Williamson, N. L. (1979). Cross-racial recognition and age: When you're over 60 do they still "all look alike?" Personality and Social Psychology Bulletin, 5, 218-222.
Clifford, B. R., & Bull, R. (1978). The psychology of personal identification.
London: Routledge & Kegan Paul.
Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.
Davies, G. M. (1983). The recognition of persons from drawings and photographs. Human Learning, 2, 237-249.
Davies, G. M., Ellis, H. D., & Shepherd, J. W. (1981). Perceiving and remembering faces. London: Academic Press.
Dent, H. R. (1977). Stress as a factor influencing person recognition in identification parades. Bulletin of British Psychology, 30, 339-340.
Elliott, E. S., Wills, E. J., & Goldstein, A. G. (1973). The effects of discrimination training on the recognition of white and oriental faces. Bulletin of the Psychonomic Society, 2, 71-73.
Ellison, K. W., & Buckhout, R. (1981). Psychology and criminal justice. New York: Harper & Row.
Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5, 3-8.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Beverly Hills, CA: Sage.
Goldstein, A. G., Johnson, K. S., & Chance, J. (1979). Does fluency of face description imply superior face recognition? Bulletin of the Psychonomic Society, 13, 15-18.
Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1-20.
Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills, CA: Sage.
Krafka, C., & Penrod, S. (1985). Reinstatement of context in a field experiment on eyewitness identification. Journal of Personality and Social Psychology, 49, 58-69.
Laughery, K. R., Alexander, J. F., & Lane, A. B. (1971). Recognition of human faces: Effects of target exposure time, target position, pose position, and type of photograph. Journal of Applied Psychology, 55, 477-483.
Laughery, K. R., Fessler, P. K., Lenorovitz, D. R., & Yoblick, D. A. (1974). Time delay and similarity effects in facial recognition.
Journal of Applied Psychology, 59, 490-496.
Laughery, K. R., & Fowler, R. H. (1977, October). Facial recognition: Effects of changing accessories. Paper presented at the Proceedings of the 21st Annual Meeting of the Human Factors Society, San Francisco.
Light, L. L., Kayra-Stuart, F., & Hollander, S. (1979). Recognition memory for typical and unusual faces. Journal of Experimental Psychology: Human Learning and Memory, 5, 212-228.
Lindsay, R. C. L., Wells, G. L., & Rumpel, C. M. (1981). Cross-racial eyewitness identifications: It may be better if they all look alike. Unpublished manuscript, University of Alberta.
Lloyd-Bostock, S., & Clifford, B. R. (1983). Evaluating witness evidence. London: Wiley.
Loftus, E. F. (1979). Eyewitness testimony. Cambridge, MA: Harvard University Press.
Loftus, E. F. (1983a). Silence is not golden. American Psychologist, 38, 564-572.
Loftus, E. F. (1983b). Whose shadow is crooked? American Psychologist, 38, 576-577.
Macmillan, N. A., & Kaplan, H. L. (1985). Detection theory analysis of group data: Estimating sensitivity from average hit and false-alarm rates. Psychological Bulletin, 98, 185-199.
Malpass, R. S., & Devine, P. G. (1981). Guided memory in eyewitness identification. Journal of Applied Psychology, 66, 343-350.
Malpass, R. S., & Devine, P. G. (1983). Measuring the fairness of eyewitness identification lineups. In S. M. A. Lloyd-Bostock & B. R. Clifford (Eds.), Evaluating witness evidence (pp. 81-102). Chichester, England: John Wiley & Sons.
Malpass, R. S., & Kravitz, J. (1969). Recognition for faces of own and other race. Journal of Personality and Social Psychology, 13, 330-334.
Malpass, R. S., Lavigueur, H., & Weldon, D. E. (1973). Verbal and visual training in face recognition. Perception & Psychophysics, 14, 285-292.
McArthur, L. Z. (1981). What grabs you? The role of attention in impression formation and causal attribution. In E. T. Higgins, T. P. Herman, & M. P.
Zanna (Eds.), Social cognition: The Ontario Symposium (pp. 201-246). Hillsdale, NJ: Erlbaum.
McCloskey, M., & Egeth, H. E. (1983a). Eyewitness identification: What can a psychologist tell a jury? American Psychologist, 38, 550-563.
McCloskey, M., & Egeth, H. E. (1983b). A time to speak, or a time to keep silent? American Psychologist, 38, 573-575.
Rosenthal, R. (1978). Combining results of independent studies. Psychological Bulletin, 85, 185-193.
Schacter, D. L. (1983). Feeling of knowing in episodic memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 39-54.
Shepherd, J. W., Ellis, H. D., & Davies, G. M. (1982). Identification evidence. Aberdeen, Scotland: Aberdeen University Press.
Thomson, D. M., Robertson, S. L., & Vogt, R. (1982). Person recognition: The effect of context. Human Learning, 1, 137-154.
Tulving, E., & Thomson, D. (1973). Encoding specificity and retrieval processes in memory. Psychological Review, 80, 352-372.
Wagstaff, G. F. (1982). Hypnosis and recognition of a face. Perceptual & Motor Skills, 55, 816-818.
Walker-Smith, G. J., Gale, A. J., & Findlay, J. M. (1977). Eye movement strategies involved in face perception. Perception, 6, 313-326.
Weinstein, J. (1981). Review of eyewitness testimony. Columbia Law Review, 81, 441-457.
Wells, G. L. (1978). Applied eyewitness testimony research: System variables and estimator variables. Journal of Personality and Social Psychology, 36, 1546-1557.
Wells, G. L., & Loftus, E. F. (Eds.). (1984). Eyewitness testimony: Psychological perspectives. London: Cambridge University Press.
Wells, G. L., & Murray, D. M. (1983). What can psychology say about the Neil v. Biggers criteria for judging eyewitness accuracy? Journal of Applied Psychology, 68, 347-362.
Woocher, F. D. (1977). Did your eyes deceive you? Expert psychological testimony on the unreliability of eyewitness identification. Stanford Law Review, 29, 969-1030.
Woodhead, M. M., Baddeley, A. D., & Simmonds, D. C. (1979).
On training people to recognize faces. Ergonomics, 22, 333-343.
Yarmey, A. D. (1979). The psychology of eyewitness testimony. New York: Free Press.
Appendix
Bibliography of Coded Studies
Alexander, J. F. (1972). Search factors influencing personal appearance identification. In A. Zavala & J. J. Paley (Eds.), Personal appearance identification (pp. 14-27). Springfield, IL: Charles C Thomas.
Baddeley, A. D., & Woodhead, M. M. (1982). Depth of processing, context, and face recognition. Canadian Journal of Psychology, 36, 148-164.
Bailis, K. L., & Mueller, J. H. (1981). Anxiety, feedback, and self-reference in face recognition. Motivation and Emotion, 5, 85-96.
Barkowitz, P., & Brigham, J. C. (1982). Recognition of faces: Own-race bias, incentive, and time delay. Journal of Applied Social Psychology, 12, 255-268.
Blaney, R. L., & Winograd, E. (1978). Developmental differences in children's recognition memory for faces. Developmental Psychology, 14, 441-442.
Boice, R., Hanley, P., Shaughnessy, P., & Gansler, D. (1982). Eyewitness accuracy: A general observational skill? Bulletin of the Psychonomic Society, 20, 193-195.
Bower, G. H., & Karlin, M. B. (1974). Depth of processing pictures of faces and recognition memory. Journal of Experimental Psychology, 103, 751-757.
Brigham, J. C., & Barkowitz, P. (1978). Do "they all look alike?" The effect of race, sex, experience, and attitudes on the ability to recognize faces. Journal of Applied Social Psychology, 8, 306-318.
Brigham, J. C., Maass, A., Martinez, D., & Whittenberger, G. (1983). The effect of arousal on facial recognition. Basic and Applied Social Psychology, 4, 279-293.
Brigham, J. C., Maass, A., Snyder, L. D., & Spaulding, K. (1982). Accuracy of eyewitness identification in a field setting. Journal of Personality and Social Psychology, 42, 673-680.
Brigham, J. C., & Williamson, N. L. (1979).
Cross-racial recognition and age: When you're over 60 do they still "all look alike?" Personality and Social Psychology Bulletin, 5, 218-222.
Bruce, V. (1983). Recognizing faces. Philosophical Transactions of the Royal Society of London, 302, 423-436.
Buckhout, R. (1980). Nearly 2,000 witnesses can be wrong. Bulletin of the Psychonomic Society, 16, 307-310.
Buckhout, R., Alper, A., Chern, S., Silverberg, G., & Slomovits, M. (1974). Determinants of eyewitness performance. Bulletin of the Psychonomic Society, 4, 191-192.
Buckhout, R., Figueroa, D., & Hoff, E. (1975). Eyewitness identification: Effects of suggestion and bias in identification from photographs. Bulletin of the Psychonomic Society, 6, 71-74.
Carey, S., Diamond, R., & Woods, B. (1981). Development of face recognition--A maturational component? Developmental Psychology, 16, 257-269.
Chance, J. E., & Goldstein, A. G. (1979). Reliability of face recognition performance. Bulletin of the Psychonomic Society, 14, 115-117.
Chance, J. E., Goldstein, A. G., & McBride, L. (1975). Differential experience and recognition memory for faces. Journal of Social Psychology, 97, 243-253.
Chance, J. E., Turner, A. L., & Goldstein, A. G. (1982). Development of differential recognition for own- and other-race faces. Journal of Psychology, 112, 29-37.
Clifford, B. R., & Hollin, C. R. (1981). Effects of the type of incident and the number of perpetrators on eyewitness memory. Journal of Applied Psychology, 66, 364-370.
Cohen, M. E., & Carr, W. J. (1975). Facial recognition and the von Restorff effect. Bulletin of the Psychonomic Society, 6, 383-384.
Cohen, M. E., & Nodine, C. F. (1978). Memory processes in facial recognition and recall. Bulletin of the Psychonomic Society, 12, 317-319.
Courtois, M. R., & Mueller, J. H. (1979). Processing multiple physical features in facial recognition. Bulletin of the Psychonomic Society, 14, 74-76.
Courtois, M. R., & Mueller, J. H. (1981).
Target and distractor typicality in facial recognition. Journal of Applied Psychology, 66, 619-645.
Cross, J. F., Cross, J., & Daly, J. (1971). Sex, race, age, and beauty as factors in recognition of faces. Perception & Psychophysics, 10, 393-396.
Cutler, B. L., Penrod, S., & Martens, T. (1985). Lineup variations and the reliability of eyewitness testimony. Paper presented at the meeting of the American Psychological Association, Los Angeles, CA.
Davies, G. M. (1983). The recognition of persons from drawings and photographs. Human Learning, 2, 237-249.
Davies, G. M., Ellis, H., & Shepherd, J. (1978). Face recognition accuracy as a function of mode of representation. Journal of Applied Psychology, 63, 180-187.
Davies, G., & Milne, A. (1982). Recognizing faces in and out of context. Current Psychological Research, 2, 235-246.
Davies, G., Shepherd, J., & Ellis, H. (1979). Effects of interpolated mugshot exposure on accuracy of eyewitness identification. Journal of Applied Psychology, 64, 232-237.
Davies, G. M., Shepherd, J. W., & Ellis, H. D. (1979). Similarity effects in face recognition. American Journal of Psychology, 92, 507-523.
Daw, P. S., & Parkin, A. J. (1981). Observations on the efficiency of two different processing strategies for remembering faces. Canadian Journal of Psychology, 35, 351-355.
Deffenbacher, K. A., Leu, J. R., & Brown, E. L. (1981). Memory for faces: Testing method, encoding strategy, and confidence. American Journal of Psychology, 94, 13-26.
Dent, H. R. (1977). Stress as a factor influencing person recognition in identification parades. Bulletin of British Psychology, 30, 339-340.
Dent, H. R., & Stephenson, G. M. (1979). Identification evidence: Experimental investigations of factors affecting the reliability of juvenile and adult witnesses. In D. Farrington, K. Hawkins, & S. Lloyd-Bostock (Eds.), Psychology, law and legal processes (pp. 195-206). Atlantic Highlands, NJ: Humanities Press.
Devine, P. G., & Malpass, R. S. (1985).
Orienting strategies in differential face recognition. Personality and Social Psychology Bulletin, 11, 33-40.
Egan, D. M., & Smith, K. H. (1979, October). Improving eyewitness identification: An experimental analysis. Paper presented at the American Psychology-Law Convention, Baltimore, MD.
Elliott, E. S., Wills, E. J., & Goldstein, A. G. (1973). The effects of discrimination training on the recognition of white and oriental faces. Bulletin of the Psychonomic Society, 2, 71-73.
Ellis, H. D. (1982). The performance of witnesses on identity parades. Unpublished manuscript.
Ellis, H. D., Davies, G. M., & Shepherd, J. W. (1977). Experimental studies of face identification. Journal of Criminal Defense, 3, 219-234.
Ellis, H. D., & Deregowski, J. B. (1981). Within-race and between-race recognition of transformed and untransformed faces. American Journal of Psychology, 94, 27-35.
Ellis, H. D., Shepherd, J., & Bruce, A. (1973). The effects of age and sex upon adolescents' recognition of faces. Journal of Genetic Psychology, 123, 173-174.
Feinman, S., & Entwisle, D. R. (1976). Children's ability to recognize other children's faces. Child Development, 47, 506-510.
Fleishman, J. J., Buckley, M. L., Klosinsky, M. J., Smith, N., & Tuck, B. (1976). Judged attractiveness in recognition memory of women's faces. Perceptual & Motor Skills, 43, 709-710.
Galper, R. E. (1970). Recognition of faces in photographic negative. Psychonomic Science, 19, 207-208.
Galper, R. E. (1973). Functional race membership and recognition of faces. Perceptual & Motor Skills, 37, 455-462.
Galper, R. E., & Hochberg, J. (1971). Recognition memory for photographs of faces. American Journal of Psychology, 84, 351-355.
Going, M., & Read, J. D. (1974). Effects of uniqueness, sex of subject, and sex of photograph on facial recognition. Perceptual & Motor Skills, 39, 109-110.
Goldstein, A. G., & Chance, J. E. (1964). Recognition of children's faces. Child Development, 35, 129-136.
Goldstein, A. G., & Chance, J. E. (1978, August). Memory for faces: Pattern recognition and eyewitness identification. Paper presented at the 86th annual convention of the American Psychological Association, Toronto.
Goldstein, A. G., Johnson, K. S., & Chance, J. (1979). Does fluency of face description imply superior face recognition? Bulletin of the Psychonomic Society, 13, 15-18.
Gorenstein, G. W., & Ellsworth, P. C. (1980). Effect of choosing an incorrect photograph on a later identification by an eyewitness. Journal of Applied Psychology, 65, 616-622.
Graefe, T. M., & Watkins, M. J. (1980). Picture rehearsal: An effect of selectively attending to pictures no longer in view. Journal of Experimental Psychology: Human Learning and Memory, 6, 156-162.
Gruneberg, M. M., Morris, P. E., & Sykes, R. N. (1978). Person recognition: More than a pretty face. New York: Academic Press.
Gruneberg, M. M., Morris, P. E., & Sykes, R. N. (1978). Sex differences in facial memory. New York: Academic Press.
Hastings, M. W. (1982). Effectiveness of face-name learning strategies. Perceptual & Motor Skills, 54, 167-170.
Hilgendorf, L. E., & Irving, B. L. (1978). False positive identification. Medical Science Law, 18, 255-262.
Hoffman, C., & Kagan, S. (1977). Field dependence and facial recognition. Perceptual & Motor Skills, 44, 119-124.
Hollin, C. R. (1984). Arousal and eyewitness memory. Perceptual & Motor Skills, 58, 266.
Hosch, H. M., & Cooper, S. D. (1982). Victimization as a determinant of eyewitness accuracy. Journal of Applied Psychology, 67, 649-652.
Hosch, H. M., Leippe, M. R., Marchioni, P. M., & Cooper, D. S. (1984). Victimization, self-monitoring, and eyewitness identification. Journal of Applied Psychology, 69, 280-288.
Howells, T. H. (1928). A study of ability to recognize faces. Applied Psychology, 12, 124-127.
Kafer, N. F. (1981). Peer acceptance and facial recognition. Journal of Psychology, 108, 291-295.
Klatzky, R. L., Martin, G. L., & Kane, R. A. (1982).
Semantic interpretation effects on memory for faces. Memory & Cognition, 10, 195-206.
Krafka, C., & Penrod, S. (1981, August). The relative influence of stimulus characteristics on eyewitness performance. Paper presented at the annual meeting of the American Psychological Association, Toronto.
Krafka, C., & Penrod, S. (1985). Reinstatement of context in a field experiment on eyewitness identification. Journal of Personality and Social Psychology, 49, 58-69.
Krouse, F. L. (1981). Effects of pose, pose change, and delay on face recognition performance. Journal of Applied Psychology, 66, 651-654.
Lane, A. B. (1972). Effects of pose position on identification. In A. Zavala & J. J. Paley (Eds.), Personal appearance identification (pp. 28-35). Springfield, IL: Charles C Thomas.
Laughery, K. R. (1972). Photograph type and cross-racial factors in facial identification. In A. Zavala & J. J. Paley (Eds.), Personal appearance identification (pp. 36-43). Springfield, IL: Charles C Thomas.
Laughery, K. R., Alexander, J. F., & Lane, A. B. (1971). Recognition of human faces: Effects of target exposure time, target position, pose position, and type of photograph. Journal of Applied Psychology, 55, 477-483.
Laughery, K. R., Fessler, P. K., Lenorovitz, D. R., & Yoblick, D. A. (1974). Time delay and similarity effects in facial recognition. Journal of Applied Psychology, 59, 490-496.
Laughery, K. R., & Fowler, R. H. (1977, October). Facial recognition: Effects of changing accessories. Paper presented at the 21st Annual Meeting of the Human Factors Society, San Francisco.
Lavrakas, P. J., Buri, J. R., & Mayzner, M. S. (1976). A perspective on the recognition of other-race faces. Perception & Psychophysics, 20, 475-481.
Leippe, M. R., Wells, G. L., & Ostrom, T. M. (1978). Crime seriousness as a determinant of accuracy in eyewitness identification. Journal of Applied Psychology, 63, 345-351.
Light, L. L., Hollander, S. H., & Kayra-Stuart, F. (1981).
Why attractive people are harder to remember. Personality and Social Psychology Bulletin, 7, 269-276.
Light, L. L., Kayra-Stuart, F., & Hollander, S. (1979). Recognition memory for typical and unusual faces. Journal of Experimental Psychology: Human Learning and Memory, 5, 212-228.
Lindsay, R. C. L., Wells, G. L., & Rumpel, C. M. (1980). Can people detect eyewitness-identification accuracy within and across situations? Journal of Applied Psychology, 65, 1-31.
Lindsay, R. C. L., Wells, G. L., & Rumpel, C. M. (1981). Cross-racial eyewitness identifications: It may be better if they all look alike. Unpublished manuscript, University of Alberta.
Maass, A., & Brigham, J. C. (1982). Eyewitness identifications: The role of attention and encoding specificity. Personality and Social Psychology Bulletin, 8, 54-59.
Malpass, R. S. (1974). Racial bias in eyewitness identification? Personality and Social Psychology Bulletin, 1, 42-44.
Malpass, R. S. (1979). A cross-cultural face recognition field manual: Description and a validation study. In L. H. Eckensberger, W. J. Lonner, & Y. H. Poortinga (Eds.), Cross-cultural contributions to psychology (pp. 27-39). Amsterdam: Swets & Zeitlinger.
Malpass, R. S., & Devine, P. G. (1978). Eyewitness identification: Realism vs. the laboratory. Unpublished manuscript.
Malpass, R. S., & Devine, P. G. (1980). Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 66, 482-489.
Malpass, R. S., & Devine, P. G. (1981). Guided memory in eyewitness identification. Journal of Applied Psychology, 66, 343-350.
Malpass, R. S., & Kravitz, J. (1969). Recognition for faces of own and other race. Journal of Personality and Social Psychology, 13, 330-334.
Malpass, R. S., Lavigueur, H., & Weldon, D. E. (1973). Verbal and visual training in face recognition. Perception & Psychophysics, 14, 285-292.
Mauldin, M. A., & Laughery, K. R. (1981).
Composite production effects on subsequent facial recognition. Journal of Applied Psychology, 66, 351-357.
McKelvie, S. J. (1976). The role of eyes and mouth in the memory of a face. American Journal of Psychology, 89, 311-323.
McKelvie, S. J. (1978). Sex differences in facial memory. In M. M. Gruneberg, P. E. Morris, & R. N. Sykes (Eds.), Practical aspects of memory (pp. 263-269). New York: Academic Press.
McKelvie, S. J. (1981). Sex differences in memory for faces. Journal of Psychology, 107, 109-125.
Memon, A., & Bruce, V. (1983). The effects of encoding strategy and context change on face recognition. Human Learning, 2, 313-326.
Messick, S., & Damarin, F. (1964). Cognitive styles and memory for faces. Journal of Abnormal and Social Psychology, 69, 313-318.
Montgomery, R., Brown, E., & Deffenbacher, K. (1980). Memory for faces: Effects of repetition on item and context memory. Paper presented at the annual meeting of the Psychonomic Society, St. Louis, MO.
Mueller, J. H., Bailis, K. L., & Goldstein, A. G. (1979). Depth of processing and anxiety in facial recognition. British Journal of Psychology, 70, 511-515.
Mueller, J. H., Heesacker, M., & Ross, M. J. (1984). Body-image, consciousness, and self-reference effects in face recognition. British Journal of Social Psychology, 23, 277-281.
Mueller, J. H., Heesacker, M., & Ross, M. J. (1984). Likeability of targets and distractors in facial recognition. American Journal of Psychology, 94, 1-24.
Mueller, J. H., Heesacker, M., Ross, M. J., & Nicodemus, D. R. (1983). Emotionality of encoding activity in face memory. Journal of Research in Personality, 17, 198-217.
Mueller, J. H., Nicodemus, D. R., & Ross, M. J. (1981). Self-awareness in facial recognition. Bulletin of the Psychonomic Society, 18, 145-147.
Mueller, J. H., & Wherry, K. L. (1979). Orienting strategies at study and test in facial recognition. American Journal of Psychology, 92, 2-20.
Murray, D.
M., & Wells, G. L. (1982). Does knowledge that a crime was staged affect eyewitness performance? Journal of Applied Social Psychology, 12, 42-53.
Parkin, A. J., & Goodwin, E. (1983). The influence of different processing strategies on the recognition of transformed and untransformed faces. Canadian Journal of Psychology, 37, 272-277.
Patterson, K. E. (1978). Person recognition: More than a pretty face. In M. M. Gruneberg, P. E. Morris, & R. N. Sykes (Eds.), Practical aspects of memory (pp. 228-235). New York: Academic Press.
Read, J. D. (1979). Rehearsal and recognition of human faces. American Journal of Psychology, 92, 71-85.
Sanders, G. L., & Simmons, W. L. (1983). Use of hypnosis to enhance eyewitness accuracy: Does it work? Journal of Applied Psychology, 68, 70-77.
Scapinello, K. F., & Yarmey, D. A. (1970). The role of familiarity and orientation in immediate and delayed recognition of pictorial stimuli. Psychonomic Science, 21, 329-330.
Schill, T. R. (1966). Effects of approval motivation and varying conditions of verbal reinforcement on incidental memory for faces. Psychological Reports, 19, 55-60.
Shepherd, J. W., Deregowski, J. B., & Ellis, H. D. (1974). A cross-cultural study of recognition memory for faces. International Journal of Psychology, 9, 205-211.
Shepherd, J. W., & Ellis, H. D. (1973). The effect of attractiveness on recognition memory for faces. American Journal of Psychology, 86, 627-633.
Slack, A. T., & Penrod, S. (1982). Facial recognition in eyewitness testimony. Unpublished manuscript, University of Wisconsin--Madison.
Smith, J. E., Pleban, R. J., & Shaffer, D. R. (1982). Effects of interrogator bias and a police trait questionnaire on the accuracy of eyewitness identification. Journal of Social Psychology, 116, 19-26.
Strnad, B. N., & Mueller, J. H. (1977). Levels of processing in facial memory. Bulletin of the Psychonomic Society, 9, 17-18.
Thomson, D. M. (1981). Person identification: Influencing the outcome.
Australia & New Zealand Journal of Criminology, 14, 49-54.
Thomson, D. M., Robertson, S. L., & Vogt, R. (1982). Person recognition: The effect of context. Human Learning, 1, 137-154.
Wagstaff, G. F. (1982). Hypnosis and recognition of a face. Perceptual & Motor Skills, 55, 816-818.
Warnick, D. H. (1983). Volunteers: Selective attrition in eyewitness testimony. Paper presented at the annual meeting of the Midwestern Psychological Society, Chicago.
Warrington, E. K., & Ackroyd, C. (1975). The effect of orienting tasks on recognition memory. Memory & Cognition, 3, 140-142.
Watkins, M. J., Ho, E., & Tulving, E. (1976). Context effects in recognition memory for faces. Journal of Verbal Learning and Verbal Behavior, 15, 505-517.
Wells, G. L., & Hryciw, B. (1984). Memory for faces: Encoding and retrieval operations. Memory & Cognition, 12, 338-344.
Wells, G. L., Lindsay, R. C. L., & Ferguson, T. J. (1979). Accuracy, confidence, and juror perceptions in eyewitness identification. Journal of Applied Psychology, 64, 440-448.
Winograd, E. (1976). Recognition memory for faces following nine different judgments. Bulletin of the Psychonomic Society, 8, 419-421.
Winograd, E. (1981). Elaboration and distinctiveness in memory for faces. Journal of Experimental Psychology, 7, 181-190.
Winograd, E., & Rivers-Bulkeley, N. T. (1977). Effects of changing context on remembering faces. Journal of Experimental Psychology, 3, 397-405.
Woodhead, M. M., & Baddeley, A. D. (1981). Individual differences and memory for faces, pictures, and words. Memory & Cognition, 9, 368-370.
Woodhead, M. M., Baddeley, A. D., & Simmonds, D. C. (1979). On training people to recognize faces. Ergonomics, 22, 333-343.
Yarmey, D. A. (1979). The effects of attractiveness, feature saliency and liking on memory for faces. In M. Cook & G. Wilson (Eds.), Love and attraction (pp. 51-53). New York: Pergamon Press.
Yarmey, D. A., & Jones, H. P. T. (1983).
Accuracy of memory of male and female eyewitnesses to a criminal assault and rape. Bulletin of the Psychonomic Society, 21, 89-92.
Yin, R. K. (1969). Looking at upside-down faces. Journal of Experimental Psychology, 81, 141-145.
Received October 1, 1985
Revision received February 4, 1986
Correction to Morris et al.
In the article "Failures to Detect Moderating Effects With Ordinary Least Squares-Moderated Multiple Regression: Some Reasons and a Remedy," by James H. Morris, J. Daniel Sherman, and Edward R. Mansfield (Psychological Bulletin, 1985, Vol. 99, No. 2, pp. 282-288), several errors went uncorrected. On page 283, the second line of the first full paragraph should read "in Equation 3 . . . ." On page 284, in the eighth line of the first full paragraph, the power in the equation should be "1/2," not "12." On page 287, in Table 4, the heading for column 6 should read "Adjusted SS for deletion of X1X2," not just "X2." The heading for column 7 should read "H0: β3 = 0 r partial F," not "β2." Finally, in line 3 of the table note, "X1X2" should read "X1, X2."
1971). In the present paper, the intent is not to focus on the good studies or the bad studies, the likeable results or the unlikeable results, the big effects or the small effects, but rather to encompass the results of all the existing studies that have investigated the effects of the interpersonal self-fulfilling prophecy. The results of 345 studies will be addressed and summarized and it will be shown that: (1) The overall probability that there is no such thing as interpersonal expectancy effects is near zero; and (2) The average magnitude of the effect of interpersonal expectations is likely to be of practical importance.
- In the course of the summary of the results of all available studies, some methods will be illustrated that may be of use to others wanting to quantitatively summarize entire areas of re- search, a practice that is, happily, on the increase (Glass, 1976, Hall, 1978; Rosenthal, 1969; 1978; Rosenthal & Rosnow, 1975; Smith & Glass, 1977). The hope is that this paper, and the accompanying com- meritaries, will increase the likelihood that, in the future, various literatures will be summarized, not by counting and averaging yeas and nays, but by combining probabilities, estimating effect sizes, placing confidence intervals around these estimates, and systematically addressing various issues of data accessibility, quality, and retrievability. The structure of the paper is as follows: After some typical ex- periments are described, statistical significance is considered. The significance of the results ofan earlier data summary is com- pared to that of more recent studies so that changes over time can be assessed. In addition, the entire set of results is evaluated with respect to statistical significance in each of eight specific areas of research that have been investigated. Then, we consider the average size of the effects of interpersonal expectations for 0140-525XI78IRRORU009$04.00/0 There is a large and growing number of experiments inves- tigating the hypothesis that person A's expectation for person B's behavior can affect B's behavior in such a way as to increase the probability that B will behave as expected. These interpersonal \elf-fulfilling prophecies have been found to operate in plychological experiments so as to render behavioral researchers more likely to obtain the results they expect to obtain solely be- l1lUse they expect those results (Rosenthal, 1966; 1976). 
Such \tlf.fulfilling prophecies have also been found to operate in dassrooms and workshops with' the consequence that teachers nd Supervisors are more likely to obtain the performances they ~lpectsolely because they expect them (Rosenthal, 1973). . E~ects of interpersonal expectations are pervasive and of spe- l~ Importance, both scientific and social. They are important th1e~fiCal,IY because there is a possibility that many studies in e havlOral sciences have obtained their particular results:rtially because of the prior hypothesis or expectation held by ~ ~esearchers. They are important socially because pupils' or tmPtyees' performances may be lowered because teachers or \t~~Yers expect it to be low. Apparently, when behavioral re- Perf, ers, teachers, or supervisors expect a certain level of IcJnlOIrn h ance from their subjects, pupils, or subordinates, they e ow un 'tti Ipr'lhab'l' WI ng y treat them in such a way as to increase the Du Iity that they will behave as expected. lIloonu e to this inherent scientific and social significance, experi- tnlic o~ expectancy effects have been closely examined by ~e d\vii 0 have tried to show that one or more of these studies Silver ~~~ent in one or more ways (Barber, 1976; Barber & , '1\l6a ; Elashoff & Snow, 1970, 1971; Jensen, 1969; Thorn- ~nth I)' These criticisms have been answered elsewhere a , 1968, 1969a, 1969b, 1973, 1976; Rosenthal & Rubin, Keywords: experimenter bias; expectancy effects; Rosenthal effect; self-fulfilling prophecy. Department of Statistics, Harvard University, Cambridge, Mass. 02138 (On leave from Educational Testing Service, Princeton, New Jersey) Abstract: The research area of interpersonal expectancy effects originally derived from a general consideration of the effects of experi- menters on the results of their research. 
One of these is the expectancy effect, the tendency for experimenters to obtain results they ex- pect, not simply because they have correctly anticipated nature's response but rather because they have helped to shape that response through their expectations. When behavioral researchers expect certain results from their human (or animal) subjects they appear unwittingly to treat them in such a way as to increase the probability that they will respond as expected. In the first few years of research on this problem of the interpersonal (or interorganism) self-fulfilling prophecy, the "prophet" was always an experimenter and the affected phenomenon was always the behavior of an experimental subject. In more recent years, however, the research has been extended from experimenters to teachers, employers, and therapists whose expectations for their pupils, employees, and patients might also come to serve as interpersonal self-fulfilling prophecies. Our general purpose is to summarize the results of345 experiments investigating interpersonal expectancy effects. These studies fall into eight broad categories of research: reaction time, inkblot tests, animal learning, laboratory interviews, psychophysical judgments, learning and ability, person perception, and everyday life situations. For the entire sample of studies, as well as for each specific re- \earch area, we (I) determine the overall probability that interpersonal expectancy effects do in fact occur, (2) estimate their average magnitude so as to evaluate their substantive and methodological importance, and (3) illustrate some methods that may be useful to others wishing to summarize quantitatively entire bodies of research (a practice that is, happily, on the increase). Department of Psychology and Social Relations, Harvard University. Cambridge, Mass. 02138 Donald B. 
Rubin Robert Rosenthal '[BE BEHAVIORAL AND BRAIN SCIENCES (1978),3,377-415 printed in the United States of America $12.95 e ex- 1 :ntal dare ltions yet .apters 1e m and e ntal 1 Interpersonal expectancy effects: --.II the first 345 studies proportion Number ofstudies reaching p < .05_ Research a Since 1969 -Research area Till 1969 Since 1969 Till 1969 - lleaclion t Reaction time 3 6 .33 .17 t"lcblottel Inkblot tests 4 5 .75 .20 Anllnallel ~ratorAnimal learning 9 6 .89 .50 Psycboph'Laboratory interviews 6 23 .33 .39 iudgme:Psychophysical judgments 9 14 .33 .50 LearningLearning and ability 9 25 .22 .32 Pel10n pePerson perception 57 62 .25 .29 EverydayEveryday situations 11 101 .36 .41 Median 9 18 .33 .36 Median Total 108" 242" .35 .37 "'=== ~ 'Five entri • Three of these 108 entries represent research conducted in a single stuth. cl1l3 stud but for more than one research area. "Two of these 242 entries represen• "lUarlng. research conducted in a single study but for more than one research area. «her (x' = Rosenthal & Rubin: Interpersonal expectancy effects each of these eight areas as well as for all eight together. Issues of data quality control are discussed, and attention is given to prob- lems of unretrieved studies, sampling, minimal quality of data, controls for cheating and recording errors, and corrections for er- rors of data analysis. Practical implications are discussed, includ- ing the degree to which interpersonal expectations function in everyday life situations as well as in laboratories and the com- parability of the magnitude of experimenter expectancy effects with other variables of psychological importance such as brain lesions, preparatory effort, and persuasive communications. Fi- nally, our conclusions include an overview of the types offuture research suggested by our analyses. 
Some sample studies Before beginning our summary, it will be useful to illustrate the type of experiment that has been conducted on interpersonal ex- pectancy effects, both in the laboratory and in everyday life situations. Two animal experiments. Twelve experimenters were each given five rats. The rats were to be trained to run a maze with the aid of visual cues. Six of the experimenters were told that their rats had been specially bred for "maze-brightness" and the other six were told that theirs had been bred for "maze-dullness," whereas in reality unselected rats had been randomly assigned to both groups. At the end of the experiment, the results were clear. Rats who had been trained and tested by experimenters expecting brighter behavior showed significantly superior learn- ing compared to rats run by experimenters expecting dull be- havior (Rosenthal & Fode, 1963). The experiment was repeated, this time employing a series of learning experiments conducted in Skinner boxes. Half the experimenters were led to believe their rats were "Skinner box bright" and half were led to believe their animals were "Skinner box dull." Once again, the rats were randomly divided into two groups. But by the end of the experi- ment, the allegedly brighter animals really were brighter and the alleged dullards really were duller (Rosenthal & Lawson, 1964). If rats became brighter when so expected by their experi- menter, it seemed possible that children might become brighter if so expected by their teacher. Educational theorists had, after all, been saying for a long time that some children were unable to learn because their teachers expected them to be unable to learn. True, there was no experimental evidence for that theory, but the two studies employing rats and similar studies employing human beings suggested that these theorists might be correct. The follOWing experiment was therefore conducted (Rosenthal & Jacobson, 1968). The Pygmalion experiment. 
All of the children in an ele- mentary school were administered a nonverbal intelligence test disguised as a test that would predict intellectual "blooming." There ,were eighteen classrooms in the school, three at each of the six grade levels. Within each grade level the three classrooms were composed of children with above average ability, average ability, and below average ability, respectively, Within each of the eighteen classrooms, approximately 20 percent 0f the children were chosen at random to constitute the experimental group. Each teacher was given the names of the children from her class who were in the experimental condition and told that these children had scores on the "test for intellectual blooming" indicating that they would show remarkable gains in intell ctual competence during the next eight months of school. The only systematic difference between the experimental group and the control group children, then, was in the mind of the teacher. All the children were retested eight months later with the same IQ test. Considering the school as a whole, those children whom the teachers had been led to expect greater intellectual gains showed significantly greater gains in IQ than did the children of the control group. Statistical significance Our first analyses consider the statistical signiflca results. These analyses could be performed for all: 01 studi.es surveyed at the, time of. this writing because all Of prOVided at least some mformation about statistical signi always at least whether the results reached the .05 level.~ An earlier summary and changes over time. In 1969 sional summary of the literature on interpersonal ex ,. was published, listing and surveying 105 studies ofint~ expectations (Rosenthal, 1969b). The first two col~-' Table 1 show the number of studies conducted before an~nl t( 1969 within each of eight research areas as defined by an ead~ review (Rosenthal, 1969b). 
Although a total of 345 s~ studies have been found to date, several of these emplOYed~ pendent variables falling into more than one research area. Tb the 345 studies yield a total of 350 entries across the eight lIS, search areas. An analysis of the proportion of all s~ conducted in each time period in each research area showsthli, overall, there has been a large shift in the areas receiVing Ie- search attention since 1969. Most of this shift has been d\lf III changes in two of the eight research areas. Studies of person per. ception decreased dramatically from 53 percent of all studie. conducted until 1969 to only 26 percent of all studies conducted since. Studies of everyday life situations, including studies (J{ teacher expectations, increased dramatically from 10 percent of all studies till 1969 to 42 percent ofall subsequent studies. The third and fourth columns of Table 1 show the proportioo of studies reaching the .05 level of significance for each orlbe eight research areas. The .05 level was chosen because man) studies unfortunately do not report p values but simply indicate whether or not the .05 level was obtained. All the research areas, both before and after 1969, show very substantially higher proportions of results significant at the .05 level than would be expected by chance. (Later in this paper details are given concerning the process oflocating these studies and methods of controlling for a potential bias toward molt easily locating the statistically more. significant studies.) Considering the research areas separately, none show a significant change in the proportion of results reaching significance before 1969 as compared to afterwards. For both the older and the newer studies, about one-third reach the .05 level. about seven times as many as one would expect if there wer~ no significant relationship between experimenters' or teachers .ex. pectations and their subjects' or pupils' subsequent behaVIOr. Table 1. 
Comparison ofSignificance levels ofstudies before and after 1969 in eight research areas =============-= ureas cyw . effect: tdPJoyed iJ dPificant ! ~t,we .-likely IilIIlthan ot! lIQiIare less 1he final deviates of t aldie stan! OIlIer to be (JIosenthal, .1.27 was lJ'llcily of i results. It results tha Tlble 2. Sigr. =- 378 THE BEHAVIORAL AND BRAIN SCIENCES (1978),3 Rosenthal & Rubin: Interpersonal expectancy effects TtWe2,Slgnijicance ofexpectancy effects in eight research areas ~================ shown in this last column also show significant overall effects of interpersonal expectancies in all research areas. Effect size So far we have examined only the significance levels of our studies. Unfortunately, in most studies only significance levels and perhaps sample size and a test statistic were reported. But to know the p level is not to know enough about the results of a study. More and more, behavioral researchers are asking to know about the magnitude of the effects of the treatments being studied (Cohen, 1969, 1977). Cohen's d. The primary index of effect size employed in the present paper is the statistic d, defined as the difference between the means of the two groups being compared, divided by the within-group standard deviation assumed common to the two populations (Cohen, 1969, p. Hi). This index is useful because it permits us to compare the magnitudes of effects for a large va- riety of measures. It frees us from the particular scale ofmeasure- ment and allows us to speak of effects measured in standard de- viation units. This property is a great advantage in the behavioral sciences, where responses are measured on many different scales having varying means and standard deviations. 
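The definition of d just given can be illustrated with a short sketch. It is only an illustration: the function name and the two score lists are hypothetical, and the common within-group standard deviation is estimated here by the usual pooled sample variance, which is one reasonable reading of "assumed common to the two populations" rather than a procedure stated in the text.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(experimental, control):
    """Mean difference expressed in pooled within-group SD units."""
    n_e, n_c = len(experimental), len(control)
    # Pooled variance: an assumed estimator of the common within-group variance.
    pooled_var = ((n_e - 1) * variance(experimental)
                  + (n_c - 1) * variance(control)) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_var)


# Hypothetical scores, invented purely for illustration.
bright = [12, 14, 15, 16, 18]
dull = [10, 11, 12, 13, 14]
print(round(cohens_d(bright, dull), 2))
```

Because the result is expressed in standard deviation units, the same function could be applied to maze errors, IQ points, or ratings without rescaling, which is the property the text emphasizes.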
Although there are other measures of effect size, d was chosen both for its simplicity and because it appears especially appropriate, given that a large proportion of the studies of interpersonal expectancy effects involves simply a comparison of an experimental with a control group by means of a t test; d is particularly useful for the situation because it is conceptually appropriate and computationally convenient. (For a recent example of the extensive use of d as an index of effect size in the behavioral sciences, see Rosenthal and Rosnow, 1975.)

Results so striking could practically never occur if there were really no such relationship. The differences between the expected proportion of results at p ≤ .05 and the obtained proportions of results at p ≤ .05 (i.e., .35 and .37 versus .05) were associated with very large χ²s of 205 (z = 14.3) and 522 (z = 22.8), respectively.

A brief overview of all studies. Table 2 summarizes the significance levels of all 345 studies. The first column gives the total number of studies that fall into each of the eight research areas. The second column shows, for each area, the typical number of degrees of freedom for the two groups being compared in each study. Typical values were calculated by squaring the unbiased estimate of the mean √df. The typical dfs range over the eight research areas from seventeen to ninety-seven with a median of thirty-seven.

The third column shows the proportion of studies reaching the .05 level of significance in the predicted direction. The range of these proportions was from .22 to .73, with a median proportion of .39. Treating these 350 studies as a sample, these eight proportions differed significantly from each other (χ²(7) = 16.32, p = .025). Inspection of the squared differences between the expected and obtained frequencies divided by the expected frequencies showed that only two research areas were contributing substantially to the large obtained χ². The area of animal learning showed more significant effects than did the other seven areas combined, χ²(1) = 8.02, p = .005, while the area of person perception showed fewer significant effects than did the other seven areas combined, χ²(1) = 5.54, p = .02. The difference in frequency with which these two areas yielded significant expectancy effects was not due to differences in size of the samples employed in these two areas. Indeed, the area showing fewer significant effects had employed larger sample sizes. For the present, we simply conclude that studies of animal learning are more likely to yield significant effects of experimenter expectation than other areas of research, while studies of person perception are less likely to yield significant effects.

The final column of Table 2 shows the standard normal deviates of the combined results based on the direct computation of the standard normal deviate for all studies in each area. In order to be consistent with the procedure of the earlier review (Rosenthal, 1969b), however, any Z falling between -1.27 and +1.27 was entered as zero, a procedure used because of the paucity of information in many studies claiming nonsignificant results. It is expected that this procedure leads to combined results that are too conservative in the long run.

Table 2. Significance of expectancy effects in eight research areas

                              Number of    Estimated     Proportion of studies    Approximated standard
Research area                 studiesa     typical dfb   reaching p < .05c        normal deviate (Z)
Reaction time                      9           94              .22                    +2.14
Inkblot tests                      9           25              .44                    +4.05
Animal learning                   15           17              .73                    +7.73
Laboratory interviews             29           37              .38                    +6.71
Psychophysical judgments          23           25              .43                    +6.61
Learning and ability              34           39              .29                    +5.14
Person perception                119           37              .27                    +6.62
Everyday situations              112           97              .40                   +14.24
Median                            26           37              .39                    +6.62

a Five entries occur in more than one area.
b Calculated from the sample of Table 3 by finding the unbiased estimate of √df and then squaring.
c Proportions in this column differ significantly from each other (χ²(7) = 16.32, p = .025).

Alternative indices of magnitude of effect. Although we have focused on measuring the magnitude of effects by d, earlier reviews (Rosenthal, 1969b, 1971) have instead used the percentage of experimenters (or teachers) who obtained responses from their subjects (or pupils) in the direction of their expectations. If there were no main effect of interpersonal expectations, we would expect about half the experimenters or teachers to obtain results in the direction of their expectation and the remaining half to obtain results in the opposite direction. The results of the earlier reviews, based on over sixty studies, suggested that about two-thirds of the "expecters" (the experimenters and teachers) obtained results in the predicted direction (relative to the mean of the other group; for precise definitions, see, for example, Rosenthal, 1966, p. 227; 1969b, pp. 234-35). Such data could be straightforwardly obtained for 117 of the 345 studies.

The first column of Table 4 shows that for the eighty-seven studies giving data on experimenter expectations, about two-thirds of the experimenters obtained results in the direction of their expectation. The second column shows that the results for the thirty studies of teacher expectations were about the same as the results for studies of experimenter expectations.

The third and fourth columns of Table 4 report the analogous results from the point of view of the subjects of biased experimenters and the pupils of biased teachers. Once again, we expect that if no expectancy main effects are operating, half the subjects or pupils will respond in the direction of their "expecter's" induced expectation, while half will respond in the opposite direction. For both subjects and pupils, just under two-thirds showed the predicted expectancy bias.

Table 4. Percentages of experimenters, teachers, subjects, and pupils showing expectancy effects

                                                  "Expecter"                  "Expectee"
                                                  Experimenters   Teachers    Subjects    Pupils
Number of studies                                      87            30          52       [illegible]
Median z (approximation)                              1.25          1.32        1.28      [illegible]
Number potentially affected                            909           340       2,748      [illegible]
Mean N per study                                        10            11          53      [illegible]
Percent potentially affected showing
  expectancy effects                                   66%           69%         60%      63%
Median (across studies) percent showing
  expectancy effects                                   69%           70%         64%      [illegible]

A simple analysis suggests that the magnitude of effect estimated by two-thirds of the subjects and pupils reacting in the

Sampling procedure. Suppose, first, that we had the results of all 345 studies. We would want to think of these as a sample from a population of similar studies that could have been done in the past or might be done in the future. Consequently, even though we could not find any more studies to date, we would have made the usual types of inferences (e.g., about effect sizes) from our 345 studies to the hypothetical target population. In fact, the test statistics previously calculated from Table 2 refer to this hypothetical target population.

Although it would have been possible to go back to our 345 studies of interpersonal expectations and compute for each one the effect size, a stratified probability sample of 113 studies was chosen to permit the estimation of effect sizes. Two stratification variables were used: area of research and statistical significance of the results. The first three columns of Table 3 present the sampling scheme. For reaction time and inkblot tests, the two areas of research with fewer than ten studies, all studies were included. For the remaining six areas, fifteen studies were included for each area except for that of everyday situations, for which twenty studies were included. These studies were chosen as follows: the five most significant studies were included for each area except for the area of everyday situations, for which the ten most significant studies were included, and ten studies were selected at random from the remaining studies in each area. This stratification increases the precision of estimates of effect size since effect size is correlated with level of significance.

For summary purposes, the mean effect size in each area was estimated and is given in the fourth column of Table 3. For example, there were thirty-four studies of the effects of experimenter expectations on the learning and ability scores of their subjects. The mean effect size (as measured by Cohen's d) of the five most significant studies was 1.25. The mean effect size of the ten studies randomly selected from the remaining 29 studies was 0.42. The estimated effect size for all thirty-four studies was 0.54, a value much closer to the mean of the ten studies than to the mean of the highly significant five studies. The means are weighted by 5 and N - 5, respectively, so that the overall estimated effect size is given by $[5\bar{X}_{\text{high}} + (N - 5)\bar{X}_{\text{random}}]/N$, where N is the total number of studies conducted in that area.

Table 3. Results of sampling from 345 studies in eight research areas

                              Number of studies sampled    Mean      Standard error     95% confidence interval
Research area                 n_high  n_random  n_total    effect    of the mean (SE)   from       to
                                                           size
Reaction time                                      9        0.17         .06a           +0.03     0.31
Inkblot tests                                      9        0.84         .39a           -0.06     1.74
Animal learning                  5       10       15        1.73         .35b           +0.97     2.49
Laboratory interviews            5       10       15        0.14         .23b           -0.36     0.64
Psychophysical judgments         5       10       15        1.05         .26b           +0.49     1.61
Learning and ability             5       10       15        0.54         .31b           -0.13     1.21
Person perception                5       10       15        0.55         .21b           +0.10     1.00
Everyday situations             10       10       20        0.88         .58b           -0.34     2.10
Median                                                      0.70         .28            -0.02     1.41
Estimated mean of 345 studies                               0.70         .20c           +0.30     1.10

[The final column of the table, the estimated correlation between effect size and level of significance (Z), to which note d applies, is not legible in this copy.]

a Computed as $[S^2_{\text{total}}/n]^{1/2}$.
b Computed as $[(n_{\text{high}}/N)^2 (S^2_{\text{high}}/n_{\text{high}}) + ((N - n_{\text{high}})/N)^2 (S^2_{\text{random}}/10)]^{1/2}$, where N = total number of studies in that area (Column 1 of Table 2).
c Computed as $[\sum_{i=1}^{8} (N_i/N)^2 (SE_i)^2]^{1/2}$, where $N_i$ is the number of studies available in area i (Column 1 of Table 2), $N = \sum N_i = 350$, and $SE_i$ is given above.
d Computed from unbiased estimates of the average value of d, d², Z, Z², and dZ.

Estimates for the eight research areas. The range of estimated effect sizes is from 0.14 for studies of laboratory interviews to 1.73 for studies of animal learning, with a median effect size of 0.70. In Cohen's (1969, p. 38) terminology, then, these effect sizes range from small (.20) through medium (.50) to large (.80) and, for two of the research areas, to very large. As anticipated by the sampling procedure, there was a large correlation (.88) between the estimated effect size (Table 3) and the proportion of studies (Table 2) reaching significance across the eight areas of research.
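The weighting used in the sampling scheme can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the per-area numbers are copied from Tables 2 and 3, and the overall mean is assumed to use the same N_i/N weights that the table note gives for the overall standard error.

```python
from math import sqrt


def stratified_mean(mean_high, mean_random, n_total, n_high=5):
    """Weight the 'most significant' and 'random' strata by n_high and N - n_high."""
    return (n_high * mean_high + (n_total - n_high) * mean_random) / n_total


# (studies available N_i, estimated mean d, standard error) per research area,
# taken from Column 1 of Table 2 and Columns 4-5 of Table 3.
AREAS = {
    "Reaction time": (9, 0.17, 0.06),
    "Inkblot tests": (9, 0.84, 0.39),
    "Animal learning": (15, 1.73, 0.35),
    "Laboratory interviews": (29, 0.14, 0.23),
    "Psychophysical judgments": (23, 1.05, 0.26),
    "Learning and ability": (34, 0.54, 0.31),
    "Person perception": (119, 0.55, 0.21),
    "Everyday situations": (112, 0.88, 0.58),
}


def pooled_estimate(areas):
    """Overall mean and SE with weights N_i / N (weights for the mean are assumed)."""
    total = sum(n for n, _, _ in areas.values())
    overall_mean = sum(n * d for n, d, _ in areas.values()) / total
    overall_se = sqrt(sum((n / total) ** 2 * se ** 2 for n, _, se in areas.values()))
    return overall_mean, overall_se


learning = stratified_mean(1.25, 0.42, 34)
overall_mean, overall_se = pooled_estimate(AREAS)
print(f"{learning:.2f} {overall_mean:.2f} {overall_se:.2f}")
```

Under these assumptions the three printed values agree with the figures reported for learning and ability (0.54) and for the estimated mean of all studies (0.70, SE .20).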
In order to obtain a better understanding of the probable range of effect sizes for the various areas of research, confidence intervals were computed. These are shown in columns six and seven of Table 3. For each area of research, the 95 percent confidence interval suggests the likely range of the effect size for that area. If we claim that the effect size falls within the range given we will be correct 95 percent of the time. The confidence intervals are wide because their computation was based on such small samples of studies (i.e., nine, fifteen, or twenty) and most of them overlap substantially. However, the confidence interval for reaction time is below the confidence intervals for animal learning and psychophysical judgments, and the confidence interval for laboratory interviews is below that for animal learning. Studies of reaction time appear to have a particularly narrow confidence interval, and a test for the equality of standard errors is consistent with that view: F max = 93.4; p < .01.

When we consider all 113 studies sampled, we find that the 95 percent confidence interval suggests an average overall effect size between 0.30 and 1.10, corresponding to effect magnitudes ranging from small/medium to quite large in the spirit of Cohen's (1969) terminology. The last column of Table 3 reports estimated correlations obtained within each research area between the effect size and the degree of statistical significance measured in standard normal deviates. These correlations were all positive, ranging from +.46 to +.91, with a median correlation of +.69. We would expect high correlations in research areas employing relatively homogeneous sample sizes.

Dissertation data. Table 5 shows for the sample of 113 studies the comparison of dissertations with nondissertations with respect to size of effect and statistical significance of the results. The first two columns show the number of studies in each research area that were dissertations or nondissertations; 32 percent of the 345 studies were doctoral dissertations, while 28 percent of the 113 studies sampled were doctoral dissertations. The third column of

less likely to be suppressed because of nonsignificant results, and more likely to meet at least minimum standards of quality. Yet, doctoral dissertations show expectancy effects that are statistically significant as well as at least moderate in magnitude.
3. Controls for cheating and recording error. Those studies instituting special safeguards against intentional or recording errors exhibit interpersonal expectancy effects that are statistically significant and at least moderate in magnitude.
4. Dissertations with special controls. In this section we consider those studies that have special controls for cheating and recording errors and that are also doctoral dissertations and therefore especially retrievable and likely to satisfy at least minimum standards of quality. These studies also exhibit effects of interpersonal expectancy that are statistically significant and at least moderate in size.
5. Correcting errors of data analysis. In this section, we give an example of procedures employed to correct for errors of data analysis. This example suggests that common errors may tend to underestimate the significance of

predicted direction is quite consistent with the effect size (d) of 0.70 estimated from the 113 sampled studies.
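The claimed consistency between "about two-thirds responding as expected" and d = 0.70 can be checked numerically. The sketch below assumes, as in the derivation that accompanies this claim, two groups with opposite expectations and the same proportion p responding in each group's expected direction, so that d = (2p - 1)/sqrt(p(1 - p)), with 4p - 2 as a linear approximation. The function names are hypothetical.

```python
from math import sqrt


def d_from_proportion(p):
    """Effect size implied by a proportion p responding in the expected direction."""
    return (2 * p - 1) / sqrt(p * (1 - p))


def d_linear(p):
    """Linear approximation, intended for p between about .3 and .7."""
    return 4 * p - 2


for p in (0.50, 0.60, 0.65, 2 / 3, 0.70):
    print(f"p={p:.2f}  exact d={d_from_proportion(p):.2f}  approx d={d_linear(p):.2f}")
```

With p = .65 the exact value is about .63 and the approximation gives .60; with p equal to two-thirds the exact value is about .71, close to the 0.70 estimated from the sampled studies.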
For two expectancy 'lOUps E andE' with opposite.expectations and the same propor- ~n P reacting in each group's expected directioh, which is what - actually observed in empirical studies, we have proportion p of up E reacting in that group's expected direction and prop r- ,jOn (l - p) ofE 1 reacting in the same direction (the expected ai- JO.tion for E but the unexpected direction for E'). Hence, the dif-:rence between the means of E and E' is p - (1 - p) and Je . thin group standard deviation isV pO - pl· Thus d = (2p - )/'l\ _pl. For p between about .3 and .7, d is well approximat d llY 4p - 2. With our data, p~.65 so thatd~.6. ~tations.Iftheq os, we wouldet obtain resulbla relation between ct size and level gnificance (Z1' =====~ Correcting e "lilTed with se lVVeycd and IUrized. SOIDt lk) were largr elects were cl lOlIletimes th, idlerwise exct iUustration of !lie in detail. [eshock (19 Iltr city boys Within each gl telChers as sl &Iealer than tl lClual Scores v ~ndent varia The data anal: 'PprOpriate bl ~·her expec 'try large efl 'lIa1Ysis of th .i.:IPite an eta "thin treatm de level w :nOr term in _med in 1 -'dence inte "I~cial dis ~Ieems q lbIdie~ that an .,olled, we 6ct obtained .00 ,01 .03 .63 .33" ,18 .11 ,06 .05 .04 .03 Interval for Z Unpredicted (-3.09, -"') direction (-2.33, -"') (-1.65, -"') Not significant (- 1.64, +1.64) ,90 Predicted (+ 1.65, +"') .05 direction (+2,33, +"') .01 (+3.09, +"') .001 (+3.72, +"') .0001 (+4.27, +"') .00001 (+4,75, +"') .000001 (+5.20, +"') ,0000001 analogous proportions of the remaining 302 studies. The I IUItt are unequivocal. The more carefully controlled studies kremOll likely (p = .007) to show effects of interpersonal expectab significant at p < .05 than are the studies permitting aUeastdlt possibility of cheating and/or recording errors. 
The mean sin- ·.A.>ii.....,.·· dard normal deviate for the specially controlled studiel +1.70, while that for the remaining studies was +1.15, The percent confidence interval for the mean of the speciall ton- Ii}'illiliril:;;:::== trolled studies was 1.09 to 2.31, and clearly indicates that the 43 carefully controlled studies did not yield less significant relul than the remaining 302 studies. The reason why these espetiilh controlled studies might be more likely than the remainlnl studies to yield significant effects is not obvious. The medin sample size employed in these studies was about the same as the median sample size employed in all 345 studies. Perhaps thnlt investigators careful enough to institute special safeguardl against cheating and/or observer errors are also careful enougblG reduce nonsystematic errors to a minimum, thereby inelea,I", the precision and power of their experiments. ~ a Mean Z = + 1.70, mean effect size = .64. "Mean Z = +l.1S'1lleaIl size = .71. C Mean Z = 1.22, mean effect size = .70, "X' that lhtI portions differ = 7.38, p = .007. t.- Table 6, Effects ofspecial controls against cheati studies reaching given levels ofsignificance ng 011 t~ Dissertations with special controls. A subgroup of the forty· three specially controlled studies, the eighteen doctoral disscrt. tions, were of particular interest. Examination of the results ri these studies might permit a reasonable estimate of the re uI obtained in studies that were both error-controlled and less su.· ceptible to sampling bias. Presumably, this group of specialh controlled dissertations represents the work of careful disscrU- tion researchers and/or dissertation researchers whose commit, tee members were careful. 
Table 7 lists the eighteen studies of this subgroup along with the df, effect size, and standard normal deviate obtained in each. The mean effect size of these specially controlled dissertations was larger than that found for the thirty-two examined in Table 5 (that set includes some of the eighteen dissertations of Table 7). The 95 percent confidence interval around the mean effect size for these eighteen ranges from +0.26 to +1.30, or from small to very large, and covers the 95 percent confidence interval calculated from the 113 studies of Table 3 (0.30, 1.10) as well as the 95 percent confidence interval calculated from the forty-three studies employing special controls for cheating and observer errors (.32, .96). There is thus no evidence to suggest that research studies with better controls and greater retrievability are associated with smaller effects of interpersonal expectations.

Table 7 also shows that the mean Z is greater for these specially controlled dissertations than for the full set of dissertations

median difference of +.35 favoring the nondissertations. Thus, dissertations tended to show smaller effect sizes than did nondissertations.

The seventh column gives the estimated overall mean standard normal deviate (Z) for each research area, while the next two columns give estimated mean Zs separately for dissertations and nondissertations. The last column shows the differences between the Zs estimated for dissertations and nondissertations. These differences ranged from -1.16 favoring the dissertations to +1.71 favoring the nondissertations, with a median difference of +.18 favoring the nondissertations. Thus, dissertations tended to show somewhat less statistical significance than did nondissertations.
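The interval of +0.26 to +1.30 quoted for the eighteen dissertations can be reproduced from the effect sizes listed in Table 7. The following is a minimal sketch that assumes the interval is the usual t-based interval for a mean on 17 df; the text does not say how it was computed. The names `EFFECT_SIZES`, `T_CRIT_17DF`, and `mean_ci` are illustrative.

```python
import math
import statistics

# Effect sizes (d) of the eighteen dissertations, in the order listed in Table 7.
EFFECT_SIZES = [
    -0.43, -0.20, 1.89, 0.53, 1.55, 0.81, 0.44, 4.08, 0.55,
    0.21, 0.15, 1.16, 0.19, 0.28, 1.74, 0.04, 0.19, 0.90,
]

# Two-sided 95 percent critical value of t on 17 df, taken from a standard
# t table (hard-coded so that the sketch needs only the standard library).
T_CRIT_17DF = 2.110


def mean_ci(values, t_crit):
    """Return (mean, lower, upper) for a t-based confidence interval."""
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - t_crit * std_err, mean + t_crit * std_err


if __name__ == "__main__":
    mean, lower, upper = mean_ci(EFFECT_SIZES, T_CRIT_17DF)
    print(f"median = {statistics.median(EFFECT_SIZES):.3f}")
    print(f"mean = {mean:.2f}, 95% CI = ({lower:+.2f}, {upper:+.2f})")
```

The width of the interval is driven largely by the single effect size of +4.08, which is worth keeping in mind when the interval is described as running from small to very large.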
The tendency for dissertations to show somewhat smaller effect sizes and less statistical significance might be due to a reduction in sampling bias in retrieving dissertations as compared to nondissertations, or it might be due to real differences between nondissertation research and dissertation research (e.g., younger, less prestigious, and less skilled investigators usually perform the latter). A potentially powerful factor might be introduced by dissertation researchers if they were unusually procedure-conscious in the conduct of their research; there are indications that such researchers may tend to obtain data that are biased in the direction opposite to their expectations (Rosenthal, 1969b, p. 234).

Controls for cheating and recording errors. Previous work has shown that although the occurrence of cheating or recording errors on the part of experimenters and teachers cannot be definitively ruled out, the occurrence of such intentional or unintentional errors cannot reasonably account for the overall obtained effects of interpersonal expectations (Rosenthal, 1969b, pp. 245-49; see also the discussion in Rosenthal 1979a). Experiments described there showed major effects of interpersonal expectations despite the impossibility of the occurrence of either cheating or recording errors.

More recently, in two ingenious experiments, Johnson and Adair (1970; 1972) were able to assess the relative magnitudes of intentional and/or recording errors. In both experiments, the overall effects of interpersonal expectation (i.e., true effects plus intentional and/or recording error) were modest, with ds of .30 and .33, respectively. When the analyses were repeated with the errors removed, the effects decreased only slightly to ds of .21 and .26, respectively.
Thus, even where cheating and/or recording errors can and do occur, these studies suggest that the errors cannot be invoked as an "explanation" for the effects of interpersonal expectations.

Further evidence for this position is obtained from some special studies. Among the 345 studies under review here, forty-three employed special methods for the elimination or control of cheating or observer errors or permitted an assessment of the possibility of intentional or unintentional errors. These methods included using tape recorded instructions, data recording by observers blind to the treatments, and video-taping of the interaction between the subject and the data collector. The results of these forty-three studies employing such safeguards were of special interest because if cheating and recording errors really played a major role in "explaining" interpersonal expectancy effects, then we would expect that studies guarding against such errors would show markedly reduced effects of interpersonal expectation.

The mean effect size of the studies employing special controls was .64, a value very close to that estimated for the remaining studies that did not employ such special controls (.71). The difference between these estimated effect sizes occupies only 9 percent of the 95 percent confidence interval around the estimated mean effect size for all 345 studies.

Table 6 shows the proportion of these special forty-three studies reaching various levels of significance in the unpredicted and predicted directions and compares these proportions to the

Grade    Control    Experimental    Difference    Z        Effect size
2        -2.83      18.83           21.66         +5.20    +3.85
3        11.50      24.67           13.17         +3.72    +2.34
4        -2.83      -0.17            2.66         +0.81    +0.47
5        -8.00      +0.33            8.33         +2.45    +1.48
Mean     -0.54      +10.92          11.46         +3.04    +2.04

In this section, we address the practical implications of interpersonal expectations.
First, we consider the generality of interpersonal self-fulfilling prophecies when moving from the laboratory to everyday life situations. Then we consider measuring the importance of expectancy effects by comparing their size to the size of the effects of such other important independent variables as brain lesions, preparatory effort, and persuasive communications.

Practical implications

blocking. Consequently, the effects of teacher expectations were claimed to be nonsignificant. Fortunately, Keshock wisely provided the raw gain scores for all children for the achievement variables so that reanalysis was simple.

Table 8 shows the results of the reanalysis where total achievement was the sum of reading and arithmetic scores. Gains in performance were substantially greater for the children whose teachers had been led to expect greater gains in performance. The sizes of the effects varied across the four grades from nearly half a standard deviation to nearly four standard deviations. For all subjects combined, the mean effect size was 2.04. The analogous mean effect sizes for Keshock's measures of intelligence and motivation were -.01 and +1.55, respectively. The median of the three effect sizes of 2.04, -0.01, and 1.55 was reported for the Keshock study in Table 7.

Interestingly, in another carefully conducted doctoral dissertation carried out by a colleague of Keshock's at about the same time, at the same university, and under, in part, the same committee members, significant effects of teacher expectations on intelligence (Stanford-Binet) were obtained although effects on achievement were not found to be significant (Maxwell, 1970).

The external validity of interpersonal expectancy effects.
Although many studies of teacher expectation effects have been conducted since the Pygmalion experiment (Rosenthal, 1973), the majority of the 345 studies surveyed in this paper have been studies of interpersonal expectation effects in laboratory situations rather than in such everyday situations as schools, clinics, or industries. A simple way to examine the external validity, or generality, of the interpersonal expectancy effect is to compare the results of studies conducted in laboratories with studies conducted in more "real life" situations.

Such comparisons have been implicitly made in earlier tables. Table 4 showed that the percentages of experimenters and subjects (laboratory situations) and teachers and pupils (everyday situations) showing the effects of interpersonal expectations are very similar. The data from Table 3 imply that the mean effect (d) for everyday situations is 0.88, with a 95 percent confidence interval of (-0.34, +2.10), and the mean effect for laboratory situations is 0.62, with a 95 percent confidence interval of (+0.38, +0.86). Thus effect sizes may tend to be larger, on the average, in everyday situations than in laboratory situations. However, they also appear somewhat more variable; the estimates of standard deviation are 1.90 and 1.52, respectively.

Table 8. Mean gains in total achievement of experimental and control group pupils: after Keshock (1970)

examined in Table 5 or for the 345 studies. The 95 percent confidence interval around the mean level of significance (Z) for the special dissertations ranged from +.93 to +2.79. The evidence seems quite clear: when we specifically examined those studies that are more precisely retrievable and more precisely controlled, we find no decrease in either the average size of the effect obtained or in the average level of statistical significance.

Correcting errors of data analysis.
Errors of data analysis occurred with some frequency in the sample of 345 studies we surveyed and were corrected before the results were summarized. Sometimes these errors were trivial and sometimes they were large. Sometimes the errors were such that expectancy effects were claimed to be significant when they were not, and sometimes the opposite occurred. Even in the reports of otherwise excellent studies such errors could be found. As an illustration of such problems, it will be useful to examine one case in detail.

Keshock (1970, listed in Table 7) studied forty-eight black inner city boys aged seven to eleven and in grades two to five. Within each grade level, half the children were reported to their teachers as showing an ability level one standard deviation greater than their actual scores. For control group children, the actual scores were reported to the teachers. There were three dependent variables: intelligence, achievement, and motivation. The data analysis for intelligence and for motivation employed appropriate blocking on grade level and showed no effects of teacher expectations on intelligence (d = -0.01) but showed a very large effect (d = +1.55) on motivation. However, in the analysis of the achievement data, no blocking was employed despite an eta of .86 between grade level and total achievement within treatment conditions. In short, the massive effects of grade level were inadvertently pooled into the within-condition error term instead of being removed from the error term by

Table 7. Effect sizes, df, and Zs of doctoral dissertations employing special controls for cheating and observer errors

                                                              Effect    Standard normal
Area                    Study                         df      size      deviate (Z)
Everyday situations     Anderson, 1971 Iᵃ             48      -0.43     -1.46
                        Anderson, 1971 IIᵃ            49      -0.20     -0.70
                        Beez, 1970                    58      +1.89     +4.67
                        Carter, 1969                  58      +0.53     +1.97
                        Keshock, 1970                 46      +1.55     +4.67
                        Maxwell, 1970                 62      +0.81     +3.07
                        Seaver, 1971                  77      +0.44     +1.87
                        Wellons, 1973                 14      +4.08     +4.72
Person perception       Blake (and Heslin), 1971      124     +0.55     +2.98
                        Hawthorne, 1972               24      +0.21     +0.50
                        Mayo, 1972                    24      +0.15     +0.38
                        Todd, 1971                    18      +1.16     +2.88
Learning and ability    Johnson, 1970 I               60      +0.19     +0.75
                        Johnson, 1970 II              60      +0.28     +1.08
                        Page, 1970                    24      +1.74     +3.64
                        Yarom, 1971                   72      +0.04     +0.17
Laboratory interviews   Gravitz, 1969                 28      +0.19     +0.50
Inkblot tests           Marwit, 1968                  18      +0.90     +1.80

Median                                                48.5    +0.48     +1.84
Mean                                                  48.0    +0.78     +1.86
95% confidence interval                               34.3,   +0.26,    +0.93,
                                                      61.7    +1.30     +2.79

ᵃ These studies were carried out by a student of one of the authors (RR).

Table 10. Discrimination learning as a function of brain lesions and experimenter expectancy: after Burnham (1966)
[Table 10, surviving fragments: mean performance by actual brain state (lesioned, unlesioned), with legible cell values 46.5 and 48.2. Source of variance: brain state 1.72, p = .05; expectancy 2.22, p = .02; interaction 0.87, p = .20.]

nonlesioned. Randomly, some of the really lesioned rats were labeled accurately as lesioned but some were falsely labeled as unlesioned. Also randomly, some of the really unlesioned rats were labeled accurately as unlesioned but some were falsely labeled as lesioned. Table 10 shows the mean performance in each of the four conditions. A higher score indicates superior performance. Animals that had been lesioned did not perform as well as those that had not been lesioned, and animals that were believed to be lesioned did not perform as well as those that were believed to be unlesioned. This experiment is of special interest because the effects of experimenter expectancy were larger than those of actual removal of brain tissue, although this difference was not significant (p = .40).

It should be noted that if investigators interested in the effects of brain lesions on discrimination learning had conducted the usual two-group experiment without keeping the experimenters blind to treatment condition, the results would have been seriously misleading. That is, had they employed experimenters knowing the lesioned rats were lesioned and had they compared these results to those obtained by experimenters knowing the unlesioned rats were unlesioned, they would have greatly overestimated (by about 100 percent) the effects on discrimination learning of brain lesions. For investigators interested in assessing, for their own specific area of research, the likelihood and magnitude of expectancy effects, there appears to be no fully
adequate substitute for the employment of expectancy control group designs. For investigators interested only in the reduction of expectancy effects, other techniques such as blind or minimized experimenter-subject contact or automated experimentation are among the techniques that may prove to be useful (Rosenthal, 1966, chs. 19-22).

Other investigators have also directly compared the magnitude of effects of experimenter expectations with that of more traditional psychological variables. Cooper et al. (1967) compared the effects of experimenter expectancy with the effects of preparing for an examination on the degree of belief that the examination would actually take place. Miller (1970) conducted three experiments comparing the effects of experimenter expectancy with the effects of persuasive communications (pro versus con).

Table 11 summarizes the results of the studies by Burnham, by Cooper et al., and by Miller. For each study, the effect size and Z are reported for experimenter expectancy as well as for the other variable against which expectancy effects were to be compared. The final rows of Table 11 indicate that the magnitude of expectancy effects was nearly identical to that of other variables, and that the Zs for expectancy effects were slightly more significant than the Zs for the other variables.

Five studies are not very many upon which to base any but the most tentative conclusions. Nevertheless, it does appear that

Table 9. Proportion of studies reaching given levels of significance

                                                  Type of study
                                      Expected      Laboratory     Everyday
                                      proportion    situations     situations     Total
Interval for Z                                      (N = 233)      (N = 112)      (N = 345)
Unpredicted direction
  (-3.09, -∞)                         .001          .00            .00            .00
  (-2.33, -∞)                         .01           .01            .00            .01
  (-1.65, -∞)                         .05           .05            .01            .03
Not significant
  (-1.64, +1.64)                      .90           .61            .59            .60ᵃ
Predicted direction
  (+1.65, +∞)                         .05           .34            .40            .36ᵇ
  (+2.33, +∞)                         .01           .19            .21            .19
  (+3.09, +∞)                         .001          .11            .15            .12
  (+3.72, +∞)                         .0001         .05            .12            .07
  (+4.27, +∞)                         .00001        .03            .10            .05
  (+4.75, +∞)                         .000001       .03            .07            .04
  (+5.20, +∞)                         .0000001      .02            .03            .03

A final comparison is given in Table 9, which gives the proportion of studies reaching various levels of significance in the predicted and unpredicted directions for studies conducted in laboratory and everyday situations. The proportions are similar for the two types of studies, with those conducted in everyday situations showing significant results in the predicted direction somewhat more often. This is not surprising, since the typical df for studies in everyday situations is substantially larger than for other studies, 97 df versus 36 df. Consequently, the overall results summarized here support the conclusion that interpersonal expectancy effects are as likely to occur in everyday life situations as in laboratory situations.

Expectancy control group designs. Although we have shown that the average effects of interpersonal expectations are both statistically significant and of moderate to large magnitude, it is also of interest to compare the statistical significance and effect size of other variables of psychological importance. If it could be shown, for example, that the effects of experimenter expectations were substantially smaller than the effects of some other behavioral research variables, we might decide that experimenter expectancy effects, though real, are not large enough, relative to other behavioral variables, to pose a real threat to the internal validity of our experiments. A particular research paradigm has been developed to compare directly the size of the effect of experimenter expectations with some other variable of greater intrinsic research interest.

This paradigm, the expectancy control group design, has been described in detail elsewhere (Rosenthal, 1966, ch.
23) in the spirit of a form of calibration whereby the effect size of an "artifact" variable can be compared with the effect size of some variable of primary interest. This earlier treatment was exclusively theoretical in the sense that no studies existed that had employed the suggested paradigm. Now, however, there are five studies available that permit a direct comparison of the effects of experimenter expectancy with such other psychological effects as brain lesions, preparatory efforts, and persuasive communications.

The first of these was conducted by Burnham (1966). He had each of twenty-three experimenters run one rat in a T-maze discrimination problem. About half the rats had been lesioned by removal of portions of the brain, and the remaining animals had received only sham surgery, which involved cutting through the skull but with no damage to brain tissue. The purpose of the study was explained to the experimenters as an attempt to learn about the effects of lesions on discrimination learning. Expectancies were manipulated by labeling each rat as lesioned or

ᵃ Grand mean of all Zs = +1.22. ᵇ χ² that this exceeds expected proportion = 691, Z = 26.3.

Table 11. Summary of five experiments employing expectancy control group designs

ᵃ Brain lesions. ᵇ Belief as a function of preparatory effort. ᶜ Persuasive communications.

[Table 10, surviving fragments: expectancy column "Unlesioned" with partly legible cell values 9.0 and 8.3; d = .79, 1.02, .40.]

there is no evidence to support the idea that the effects of experimenter expectations are small relative to the effects of "real" psychological variables.

Conclusion

We have examined the results of 345 studies of interpersonal self-fulfilling prophecies and some clear conclusions have emerged.
The reality of the phenomenon is beyond doubt and the mean size of the effect is clearly not trivial. Depending on the area of research considered, the mean size of the effect varies from small effects for studies of reaction time and laboratory interviews (ds = .17 and .14) to very large effects for studies of psychophysical judgments and animal learning (ds = 1.05 and 1.73). The estimated grand mean effect size over eight different areas of research was .70.

We also considered various issues of data quality control. In one analysis, we showed that it was unreasonable to suppose that there existed enough unretrieved nonsignificant studies to overwhelm the studies we were able to retrieve. In another analysis, we showed that doctoral dissertations that were systematically more retrievable than other unpublished studies also showed statistically significant effects with nontrivial magnitudes. A third analysis showed that studies instituting special safeguards against intentional or recording errors also showed statistically significant effects with nontrivial magnitudes. Finally, we found that for the subset of studies that were doctoral dissertations and especially controlled for intentional and recording errors, the statistical significance and average magnitudes of effect obtained were as large as for the remainder of the studies.

When we considered the practical implications of our results we showed that the effects of interpersonal expectations were as great, on the average, in everyday life situations as they were in laboratory experiments. We also showed that the magnitude of the effects of experimenter expectations was about the same size as the effects of such other important variables as brain lesions, preparation for an examination, and persuasive communications.

Future research suggested by the results of our analyses is of two kinds: that specific to the content of our analyses, that is, the interpersonal self-fulfilling prophecy, and that specific not to our content but to our methods.
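The argument about unretrieved nonsignificant studies can be illustrated with a short calculation. The sketch below assumes that the combined result is Stouffer's Z (the sum of the Zs divided by the square root of the number of studies) and asks how many additional studies averaging Z = 0 would be needed to bring it down to the one-tailed .05 level. The inputs are the 345 studies and the grand mean Z of +1.22 given in the notes to Table 9; the function name `fail_safe_n` is illustrative, and the paper's own computation may have differed in detail.

```python
def fail_safe_n(k, mean_z, z_crit=1.645):
    """Number of unretrieved studies averaging Z = 0 that would be needed to
    reduce the Stouffer combined Z of k retrieved studies to z_crit.

    With X added null studies the combined Z is sum(Z) / sqrt(k + X);
    setting this equal to z_crit and solving gives X = (sum(Z) / z_crit)**2 - k.
    """
    sum_z = k * mean_z
    return (sum_z / z_crit) ** 2 - k


if __name__ == "__main__":
    n_needed = fail_safe_n(k=345, mean_z=1.22)
    print(f"Null studies needed: about {n_needed:,.0f}")
```

Because the mean Z is rounded to two decimals, the result can only approximate the published count; under these assumptions it comes out close to the figure of 65,121 studies quoted in the commentary.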
Research on self-fulfilling prophecies should explore the procedures suggested for minimizing and calibrating the effects of experimenter expectations (Rosenthal, 1976), the interpersonal and policy implications of self-fulfilling prophecies in classrooms, clinics, and businesses (Rosenthal, 1973, 1976), and the role of nonverbal processes of communication mediating interpersonal expectancy effects (Hall et al., 1977; Rosenthal, Hall, DiMatteo, Rogers & Archer, in press; Rosenthal, Hall, Archer, DiMatteo, & Rogers, in press). Finally, future research suggested by our analyses, but not specific to the present content, should address the improvement of methods of summarizing entire domains of research with respect to statistical significance, size of effect, and problems of data quality control.

ACKNOWLEDGMENTS

Preparation of this paper was facilitated by an award of a John Simon Guggenheim Memorial Foundation Fellowship to Donald B. Rubin and by support from the Milton Fund of Harvard University and the Biomedical Sciences Support Grant from the National Institutes of Health to Harvard University (5S07 RR07046-12). Their support is gratefully acknowledged.

REFERENCES

Anderson, D. F. Mediation of teachers' expectancy with normal and retarded children. Unpublished doctoral dissertation, Harvard University, 1971.
Barber, T. X. Pitfalls in Human Research: Ten Pivotal Points. New York: Pergamon Press, 1976.
Barber, T. X., and Silver, M. J. Fact, fiction, and the experimenter bias effect. Psychological Bulletin Monograph Supplement 70:1-29. 1968.
Beez, W. V. Influence of biased psychological reports on "teacher" behavior and pupil performance. Unpublished doctoral dissertation, Indiana University, 1970.
Blake, B. F., and Heslin, R. Evaluation apprehension and subject bias in experiments. Journal of Experimental Research in Personality 5:57-63. 1971.
Burnham, J. R. Experimenter bias and lesion labeling. Unpublished manuscript, Purdue University, 1966.
Carter, R. M.
Locus of control and teacher expectancy as related to achievement of young school children. Unpublished doctoral dissertation, Indiana University, 1969.
Cohen, J. Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press, 1969; Rev. ed., 1977.
Cooper, J., Eisenberg, L., Robert, J., and Dohrenwend, B. S. The effect of experimenter expectancy and preparatory effort on belief in the probable occurrence of future events. Journal of Social Psychology 71:221-26. 1967.
Elashoff, J. D., and Snow, R. E. A case study in statistical inference: Reconsideration of the Rosenthal-Jacobson data on teacher expectancy. Technical Report No. 15, Stanford Center for Research and Development in Teaching, School of Education, Stanford University, December, 1970.
Elashoff, J. D., and Snow, R. E. (eds.), Pygmalion Reconsidered. Worthington, Ohio: Charles A. Jones, 1971.
Glass, G. V. Primary, Secondary, and Meta-Analysis of Research. Paper presented at the meeting of the American Educational Research Association, San Francisco, April, 1976.
Gravitz, H. L. Examiner expectancy effects in psychological assessment: The Bender Visual Motor Gestalt Test. Unpublished doctoral dissertation, University of Tennessee, 1969.
by John G. Adair
Department of Psychology, University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2

The combined probabilities of 345 studies: only half the story? Rosenthal & Rubin, through the use of newer and less commonly used statistical techniques, achieve their objectives of demonstrating that interpersonal expectancy effects exist and are important. With the satisfaction of a job well done, they conclude with the hope that in other areas of research results will be combined and analyzed by these techniques. Although I concur with their conclusions about expectancy effects, and admire the thoroughness with which they statistically and analytically attack the problem, I do not share the enthusiasm for their method as they have chosen to apply it. The overall probability estimate and Cohen's d statistic together permit a quantified summary of a vast amount of research; however, a summary limited to these statistical procedures inadequately represents a body of research by placing undue emphasis on the mere existence of the phenomenon, to the apparent exclusion of its meaning and significance. Each of Rosenthal's summaries of research on the experimenter expectancy effect, including the present paper, suffers from these difficulties. In what follows I shall focus my remarks on the research area I am most familiar with, experimenter expectancy effects, although most remarks will apply equally to teacher expectancy research.

On four occasions subsequent to publication of Experimenter Effects in Behavioral Research (1976 op. cit.), Rosenthal has summarized the research on experimenter expectancy effects.
In each review (Rosenthal 1968, 1969b, 1976 op. cit., and the present target article) he has combined the probabilities of studies accumulated to date and provided an estimate of the probability that the expectancy effect is a chance finding. Each review has shown increased statistical sophistication in combining probabilities or calculating the size of the effect. In each review the conclusion has been the same: the expectancy effect exists and is robust and important. However, this conclusion is not as convincing as the statistics imply. The mere calculation of the need for 65,121 more studies of a nonsignificant sort before the combined probability even falls to p = .05 does not convey a feeling of understanding and increased knowledge about this limited aspect of our world. There seem to be two shortcomings to this quantitative summary, a

Note: Commentary reference lists omit works already cited in the target article (as indicated by op. cit.). Commentaries submitted by the qualified professional readership of this journal will be considered for publication in a later issue as Continuing Commentary on this article.

search 2:61-72. 1964.
Rosenthal, R., and Rosnow, R. L. The Volunteer Subject. New York: Wiley-Interscience, 1975.
Rosenthal, R., and Rubin, D. B. Pygmalion reaffirmed. In: J. D. Elashoff and R. E. Snow (eds.), Pygmalion Reconsidered. Pp. 139-155. Worthington, Ohio: Charles A. Jones, 1971.
Seaver, W. B., Jr. Effects of naturally induced teacher expectancies on the academic performance of pupils in primary grades. Unpublished doctoral dissertation, Northwestern University, 1971.
Smith, M. L., and Glass, G. V. Meta-analysis of psychotherapy outcome studies. American Psychologist 32:752-60. 1977.
Snedecor, G. W., and Cochran, W. G. Statistical Methods. 6th ed. Ames, Iowa: Iowa State University Press, 1967.
Thorndike, R. L. Review of Pygmalion in the classroom. American Educational Research Journal 5:708-11. 1968.
Todd, J. L.
Social evaluation orientation, task orientation, and deliberate cuing in experimenter bias effect. Unpublished doctoral dissertation, University of California, Los Angeles, 1971.
Wellons, K. W. The expectancy component in mental retardation. Unpublished doctoral dissertation, University of California, Berkeley, 1973.
Yarom, N. Temporal localization and communication of experimenter expectancy effect with 10-11 year old children. Unpublished doctoral dissertation, University of Illinois, 1971.

Open Peer Commentary

Hall, J. A. Gender effects in decoding nonverbal cues. Psychological Bulletin 85:845-57. 1978.
Hall, J. A., Rosenthal, R., Archer, D., DiMatteo, M. R., and Rogers, P. L. The profile of nonverbal sensitivity. In: P. McReynolds (ed.), Advances in Psychological Assessment, vol. 4. Pp. 179-221. San Francisco: Jossey-Bass, 1978.
Hawthorne, J. W. The influence of the set and dependence of the data collector on the experimenter bias effect. Unpublished doctoral dissertation, Duke University, 1972.
Jensen, A. R. How much can we boost IQ and scholastic achievement? Harvard Educational Review 39:1-123. 1969.
Johnson, R. W. Inducement of expectancy and set of subjects as determinants of subjects' responses in experimenter expectancy research. Unpublished doctoral dissertation, University of Manitoba, 1970.
Johnson, R. W., and Adair, J. G. The effects of systematic recording error vs. experimenter bias on latency of word association. Journal of Experimental Research in Personality 4:270-75. 1970.
Johnson, R. W., and Adair, J. G. Experimenter expectancy vs. systematic recording error under automated and nonautomated stimulus presentation. Journal of Experimental Research in Personality 6:88-94. 1972.
Keshock, J. D. An investigation of the effects of the expectancy phenomenon upon the intelligence, achievement and motivation of inner-city elementary school children. Unpublished doctoral dissertation, Case Western Reserve University, 1970.
Marwit, S. J. An investigation of the communication of tester-bias by means of modeling.
Unpublished doctoral dissertation, State University of New York at Buffalo, 1968. Maxwell, M. L. A study of the effects ofteacher expectation on the. ~.Q. and academic performance of children. Unpublished doctoral I dissertation, Case Western Reserve University, 1970. Mayo, C. C. External conditions affecting experimental bias. Unpu Iished doctoral dissertation, University of Houston, 1972. Miller, K. A. A study of"experimenter bias" and "subject awarenes "as demand characteristic artifacts in attitude change experiments. Unpublished doctoral dissertation, Bowling Green State Univ rsity, 1970. Mosteller, F., and Bush, R. R Selected quantitative techniques. In: G. Lindzey (ed.), Handbook ofSocial Psychology, vol. 1. Pp. 289- 34. Cambridge, Mass.: Addison-Wesley, 1954. Page, J. S. Experimenter-subject interaction in the verbal condition ng ex- periment. Unpublished doctoral dissertation, University ofTo nto, 1970. Rosenthal, R Experimenter expectancy and the reassuring nature 0 null hypothesis decision procedure. Psychological Bulletin Monograph Supplement. 70:30-47. 1968. Empirical vs. decreed validation of clocks and tests. American Ed ca- tional Researchjoumal. 6:689-91. 1969a. Interpersonal expectations: Effects of the experimenter's hypothesis. In: R Rosenthal and R. L. Rosnow (eds.), Artifact in BehavioraZ Re- search. pp. 181-277. New York: Academic Press, 1969b. Teacher expectations and their effects upon children. In: G. S. Le ser (ed.), Psychology and Educational Practice. pp. 67-87. Glenvie Ill.: Scott, Foresman, 1971. On the Social Psychology ofthe Self-Fulfilling Prophecy: Further vi- dence for Pygmalion Effects and Their Mediating Mechanisms. New York: MSS Modular Publication, Module 53, 1973. Experimenter Effects in Behavioral Research. New York: Appleto - Century-Crofts, 1966. Rev. ed., New York: Irvington, 1976. I Combining results of independent studies. Psychological Bulletinl 85.185-93.1978. The file drawer problem and tolerance for null results. 
Psychological Bulletin, in press, 1979. How often are our numbers wrong? American Psychologist, i'1 1979a. and Fode, K. L. The effect of experimenter bias on the performanc the albino rat. Behavioral Science 8: 183-89. 1963. Hall, J. A., Archer, D., DiMatteo, M.R., and Rogers, P. L. The PON test: Measuring sensitivity to nonverbal cues. In: S. Weitz (ed.), on- verbal communication. Rev. ed., Oxford University Press, in pre s. Hall, J. A., DiMatteo, M. R, Rogers, P. L., and Archer, D. Sensitivi y to Nonverbal Communication: The PONS Test. Baltimore: The Joll s Hopkins University Press, in press. I and Jacobson, L. Pygmalion in the Classroom. New York: Holt, Rinehart and Winston, 1968. and Lawson, R. A longitudinal study of the effects ofexperimenter pias on the operant learning oflaboratory rats.journal ofPsychiatric fe- 386 THE BEHAVIORAL AND BRAIN SCIENCES (1978).3 Law Hum Behav (2009) 33:70-82 DOl 10.IOO7/sl0979-008-9136-x Instruction Bias and Lineup Presentation Moderate the Effects of Administrator Knowledge on Eyewitness Identification Sarah M. Greathouse· Margaret Bull Kovera Published online: 2 July 2008 © American Psychology-Law SocietylDivision 41 of the American Psychological Association 2008 Abstract Pairs (N = 234) of witnesses and lineup administrators completed an identification task in which administrator knowledge, lineup presentation, instruction bias, and target presence were manipulated. Administrator knowledge had the greatest effect on identifications of the suspect for simultaneous photospreads paired with biased instructions, with single-blind administrations increasing identifications of the suspect. When biased instructions were given, single-blind administrations produced fewer foil identifications than double-blind administrations. Administrators exhibited a greater proportion of biasing behaviors during single-blind administrations than during double-blind administrations. 
The diagnosticity of identifications of the suspect in double-blind administrations was double their diagnosticity in single-blind administrations. These results suggest that when biasing factors are present to increase a witness's propensity to guess, single-blind administrator behavior influences witnesses to identify the suspect.

Keywords Double-blind · Eyewitness identification · Lineups · Memory

S. M. Greathouse · M. B. Kovera (✉) Department of Psychology, John Jay College of Criminal Justice, City University of New York, 445 W. 59th Street, New York, NY 10019, USA e-mail: mkovera@jjay.cuny.edu

Recent developments in DNA testing have enabled a number of convicted felons to demonstrate their innocence. Analyses of these cases revealed that prosecutors obtained the majority of these wrongful convictions using evidence based on mistaken eyewitness identifications (Connors, Lundregan, Miller, & McEwan, 1996; Wells et al., 1998). There are a number of factors that increase the probability that witnesses will make false identifications, including improperly chosen foils (Lindsay, Wallbridge, & Drennan, 1987; Lindsay & Wells, 1980; Luus & Wells, 1991), biased instructions that suggest the culprit is in the lineup (Clark, 2005; Malpass & Devine, 1981a; Steblay, 1997), and simultaneous lineup presentation (Cutler & Penrod, 1988; Lindsay, Lea, & Fulford, 1991; Lindsay et al., 1991; Lindsay & Wells, 1985; Steblay, Dysart, Fulero, & Lindsay, 2001). Although scholars have long argued that a lineup administrator's knowledge of the suspect's identity may influence lineup administration (e.g., Buckhout, 1975), little research has been conducted on how a lineup administrator's knowledge might bias identification accuracy. Investigator bias may occur when a lineup administrator with knowledge of the suspect's identity either intentionally or unintentionally communicates to the witness which lineup member is the suspect.
To protect against the possible influence of investigator knowledge of the suspect's identity on eyewitnesses' identification behaviors, Wells (1988) suggested the use of a double-blind lineup procedure in which the person administering the lineup is kept blind to the suspect's identity. Since the introduction of the concept of a double-blind lineup 20 years ago, reformers have advocated its adoption as a best practice in lineup administration based on the extensive research on experimenter expectancy effects in areas other than lineup administration (e.g., Rosenthal, 1976, 2002). There is substantial evidence that post-identification feedback to witnesses regarding whether they have identified the suspect has detrimental effects, including confidence malleability and changes in witness reports of the quality of event viewing conditions (for a meta-analytic review of this literature, see Douglass & Steblay, 2006). In part, based on the undesirable effects of post-identification feedback, there has been much public policy discussion of the benefits of double-blind lineup administration, with some states (e.g., New Jersey, North Carolina) now requiring the use of double-blind procedures. Although the benefits of double-blind procedures for eliminating the negative effects of post-identification feedback have been documented, there has been little research investigating the effects of double-blind procedures on the accuracy of eyewitness decisions, and the research that does exist has yielded mixed results.

PSYCHOLOGICAL RESEARCH ON INVESTIGATOR BIAS

Wells and Luus (1990) proposed a useful way of conceptualizing the potential biases that can be present during lineup administrations. These scholars argued that the same principles that scientists use to conduct a valid experiment can be used by lineup administrators to conduct a fair and unbiased identification procedure.
Like an experimenter, a lineup administrator has a hypothesis (i.e., the suspect is the perpetrator) and the lineup is constructed to test that hypothesis. The police gather stimulus materials (e.g., a photo of the suspect and of other people), provide instructions to the participant (i.e., the eyewitness), execute the procedure (e.g., show the photos to the eyewitness), and collect data (i.e., record the eyewitness's decision). The same factors that introduce systematic bias into experiments can bias the lineups conducted by police (Wells et al., 1998). For example, demand characteristics can be present during a lineup if the investigator pressures an eyewitness to make an identification. Investigators may fall prey to confirmation bias, asking the witness questions that will confirm that the suspect resembles the perpetrator but not questions that will disconfirm this hypothesis. Investigators may introduce response bias by encouraging the witness to adopt a less stringent criterion for a positive identification. The police may make judgments based on small sample sizes (e.g., assume that a positive identification of the suspect is very reliable even though it is based on the memory of only one witness) and without utilizing proper controls (e.g., failure to determine whether people who did not witness the event would also identify the suspect). Finally, police officers may leak their hypotheses by consciously or unconsciously communicating to witnesses which lineup member is the suspect. Two of these sources of bias, confirmation bias and hypothesis leaking, can be eliminated by using a double-blind procedure when administering the lineup. That is, these biases cannot operate if the administrator of the lineup does not know which lineup member is the suspect.
In the late 1990s, a panel of distinguished experts on eyewitness identification recommended four rules that police should follow to increase the reliability of evidence provided by eyewitness identifications (Wells et al., 1998). These experts made only one recommendation without knowledge of any eyewitness identification research to support their position. Specifically, these experts argued that the person who conducts the lineup should be kept blind to the identity of the suspect. These authors argued that research on experimenter expectancy effects (e.g., Rosenthal, 1976, 2002; Rosenthal & Rosnow, 1991) demonstrates that a lineup administrator with knowledge of the suspect's identity may exhibit subtle nonverbal behaviors that lead the witness to choose a particular lineup member. In addition, the literature on confirmation bias in hypothesis testing (e.g., Klayman & Ha, 1987; Snyder, 1984) suggests that knowledgeable administrators may ask the witness questions about the lineup members that will confirm the administrators' hypothesis that the suspect is the culprit. Administrators may or may not be aware of their suggestive behavior. The recommendation for a double-blind lineup procedure protects against both intentional and unintentional investigator bias.

Since double-blind procedures were first recommended as a safeguard (Wells, 1988), only a handful of empirical studies have examined the effect of investigator knowledge on eyewitness identification decisions. Results from related previous eyewitness identification research suggested that experimenter knowledge may influence the behaviors that officers exhibit (Fanselow & Buckhout, 1976; Wells & Seelau, 1995).
Thus far, however, the results from studies directly testing the influence of administrator knowledge on eyewitness decisions have produced mixed results, with studies suggesting different conclusions about the conditions under which effects of administrator knowledge are observed or even whether effects are observed at all.

The first study to empirically examine administrative influence paired student investigators with student witnesses who had previously viewed a live, staged crime involving two perpetrators (Phillips, McAuliff, Kovera, & Cutler, 1999). During the lineup task the investigators presented two target-absent lineups to each witness; the administrator was informed of the suspect's identity for one of the lineups but not for the other. Administrator knowledge, the type of lineup presented, as well as the presence of an observer during the lineup task were also manipulated. Under certain circumstances, knowledge of the suspect's identity increased the rate of false alarms. Specifically, administrator knowledge influenced witnesses to choose the innocent suspect when a sequential lineup was administered and an experimenter-observer was present during the lineup task but not in the other conditions.

Other support for experimenter expectancy effects in the context of eyewitness identification tasks was observed in research manipulating the level of contact between administrators and witnesses (Haw & Fisher, 2004). In the high contact condition, administrators were permitted direct contact with the eyewitness when administering the lineup. In the low contact condition witnesses were provided with instructions, photos, and an identification form to complete individually; the administrator did not have direct contact with the witnesses and sat behind the witnesses out of their direct view.
When high contact administrators presented target-absent, simultaneous lineups, witnesses were more likely to identify the innocent suspect than in any other condition. So, in contrast to the results from the Phillips et al. study, investigator knowledge effects were more likely to be seen in simultaneous rather than sequential lineups.

Although these studies have begun to demonstrate the importance of double-blind lineup administration for safeguarding the reliability of witness identifications, there is still little research on the conditions under which double-blind lineups serve a greater protective function, and what research exists on the moderating effects of lineup presentation has produced mixed results (Haw & Fisher, 2004; Phillips et al., 1999). Moreover, there are several studies that have failed to find effects of lineup administrator knowledge on the accuracy of witness identifications (Haw, Mitchell, & Wells, 2003; Russano, Dickinson, Greathouse, & Kovera, 2006).

CURRENT RESEARCH

The previous research on investigator bias leaves many questions unanswered about the role of lineup administrator knowledge on the accuracy of eyewitness identification decisions. The effect of investigator knowledge does not seem to be particularly robust, with some studies finding an effect and others failing to do so. It is also problematic that for studies that have found an effect of administrator knowledge, the conditions under which the effect has been observed have not been consistent. Some studies observed an influence of administrator knowledge when simultaneous lineups were presented (Haw & Fisher, 2004) whereas other studies have only found the effect to be present under a subset of sequential lineup presentation conditions (Phillips et al., 1999). In both of these studies, instructions to the witness were based on the unbiased model instructions recommended in the NIJ guidelines.
Thus, there is still a need for thorough investigation of the moderating effects of different lineup procedures on the influence of administrator knowledge on identification decisions.

In consideration of the difficulty in observing an effect of experimenter knowledge and the variability of the effects obtained in earlier studies, we hypothesized that lineup procedures that increase choosing rates may increase the effects of administrator knowledge of the suspect's identity on identification accuracy. Lineup administrators who know the identity of a suspect may steer the witness to the suspect, either intentionally or unintentionally, but only under conditions that promote guessing among witnesses. The lineup setting is different from the traditional setting in which experimenter expectancy effects have been observed. Teachers' high expectations for future student performance may alter their behavior toward their students in a manner that elicits better performance from the students. Graduate students who believe that their rats will learn to run a maze quickly or slowly may produce a behavior change in the rats (Rosenthal, 1976). In both of these situations, however, the target of the expectancy has the capacity to change in the expected ways. It is possible that the degree of ecphoric similarity between the suspect and the perpetrator (Charman & Wells, 2006) may moderate whether external influences such as administrator behavior will influence witnesses' decisions in a lineup task. In the eyewitness situation, the witness has a memory for the perpetrator. Perhaps that memory is imperfect; perhaps it is very accurate. What is important is that this memory may limit the ability of the investigator to influence the witness toward a particular suspect, if that suspect does not closely resemble the witness's memory for the perpetrator.
If witnesses are certain that the perpetrator is not in the lineup, then they may not be influenced by the non-blind administrator to choose the suspect because the suggestion does not comport with their memory. Similarly, some witnesses who choose foils are certain in their identifications, although wrong, and the non-blind administrator's cues to the suspect may be similarly uninfluential. Some witnesses who choose foils are merely guessing and do not have certain memories for the perpetrator. It is our hypothesis that these are the witnesses who can be shifted from guessing a foil to guessing a suspect. Thus, we predict that differences in the pattern of identification decisions between double-blind and single-blind lineup administration conditions will be the result of fewer foil identifications and more identifications of suspects in the single-blind conditions as compared to the double-blind conditions. We predict that rejection rates will remain relatively unchanged between the two conditions.

If the effects of administrator knowledge are seen primarily with guessing witnesses, we should also see that the effects of double-blind administration are greatest under conditions that promote guessing (e.g., conditions that encourage the adoption of a lower response criterion). Eyewitness identification researchers have identified several variables that may lower participants' response criterion or level of certainty that they need to make an identification, including simultaneous presentation of lineup members and biased instructions. Some researchers have suggested that simultaneous lineups cause witnesses to adopt lower criterion levels and that sequential lineups shift the witness's criterion level to a higher degree of certainty necessary to make an identification (Rowe & Ebbesen, 2007).
Signal detection research comparing sequential and simultaneous lineups demonstrated that simultaneous lineups produce lower criterion levels than sequential lineups (Meissner, Tredoux, Parker, & MacLin, 2005).

Biased instructions also appear to lower a witness's criterion level for making an identification (Clark, 2005; Malpass & Devine, 1981a, b). Biased instructions that insinuate that the suspect is in the lineup may lower witnesses' criterion levels, prompting them to guess even when they are unsure that the lineup member that they are choosing is indeed the perpetrator. This willingness to choose in the absence of a certain match between their memory of the perpetrator and the lineup member may make them more susceptible to behavioral cues exhibited by a single-blind administrator. That is, if witnesses lack certainty about whether the perpetrator is in the lineup based on internal information derived from their memory of the perpetrator but are encouraged to choose a lineup member anyway, they may look for external cues when making their identification decision. In this way, biased instructions may lead unsure witnesses to attend more to the investigator's behavior, allowing single-blind investigators to wield more influence when biased instructions are given to witnesses. Therefore, we predicted that the effects of administrator knowledge would be greatest under conditions that promote guessing by reducing response criterion levels, specifically when simultaneous lineups are presented in combination with biased instructions.

We also sought to examine the types of administrator behaviors that are associated with administrator knowledge. Previous research infers investigator bias from decreases in identifications of suspects under double-blind as opposed to single-blind conditions.
Because no one has videotaped the interaction between the participant administrators and participant witnesses, it is unknown what administrators did to increase identifications of suspects under the single-blind conditions. Wells et al. (1998) suggested two mechanisms through which investigators could bias single-blind lineups. First, experimenter expectancy effects could cause knowledgeable investigators to emit nonverbal cues that communicate the identity of the suspect to the witness. Second, investigators may ask witnesses hypothesis-confirming questions that lead the witness to identify the suspect. Research is needed to determine which of these processes is responsible for the investigator bias effects seen in earlier studies.

In addition to uncovering the behavioral processes underlying the investigator bias effect, it is practically important to determine whether the use of double-blind administration adversely affects correct identifications. Double-blind administration appears to reduce false identifications in some culprit-absent photospreads. However, it is unclear whether double-blind administration will negatively impact the number of correct identifications made in culprit-present photospreads because all of the photospreads in the Phillips et al. (1999) study were culprit-absent. It is possible that some correct identifications are merely lucky guesses by witnesses (Penrod, 2003). If so, then it is also possible that an investigator with knowledge of the suspect's identity might influence a witness with a poor memory of the perpetrator to choose the perpetrator rather than a filler. Presumably double-blind procedures would eliminate that portion of correct identifications that resulted from steering the witness, perhaps unintentionally, toward the perpetrator.
Because the police may be reluctant to adopt a procedure without reassurance that the procedure does not influence the rate of correct identifications (Wells et al., 1998), we varied whether our participants saw a target-absent or target-present lineup to examine whether the double-blind procedure reduces correct identifications as well as false ones. Moreover, the orthogonal manipulation of target presence and administrator knowledge of the suspect's identity will allow for the calculation of the diagnosticity of the identifications made using single- and double-blind lineup administrations.

METHOD

Design

The study had a 2 (Target Presence: Target Present vs. Target Absent) × 2 (Administrator Knowledge: Single-Blind vs. Double-Blind) × 2 (Lineup Presentation: Simultaneous vs. Sequential) × 2 (Instruction Bias: Biased vs. Unbiased) factorial design.

Participants

Four hundred sixty-eight undergraduate psychology students from a large public southeastern university participated in exchange for course credit. Half of the participants served as lineup administrators (141 women, 92 men, mean age = 19); the remaining participants served as the eyewitnesses (158 women, 75 men, mean age = 20). Witness-administrator pairs (N = 234) served as the unit of analysis.

Videotapes

Lineup Administrator Training Video

To instruct the lineup administrators on the procedures to use when administering a photo lineup, a training video was created. A police officer on the university campus force narrated the video. The video included an explanation of the basic procedures to use when administering a photo lineup. At the end of the video, the police officer conducted a mock lineup with a mock witness. The mock lineup contained several instances of bias on the part of the administrator.
For example, when the witness seemed to stop on a photo that was not the suspect, the administrator asked the witness if she was sure that was the suspect, and when she faltered, he suggested she take another look. These examples were included to simulate real world situations in which a new police officer learns to administer lineups by observing a more experienced police officer doing so. If the more experienced police officer exhibits biased techniques, the new police officer may learn to do so as well. Two separate videos were made. One version contained instructions for administering a simultaneous lineup and a demonstration of a simultaneous lineup administration and the other version contained an explanation and demonstration of a sequential lineup. The videos were edited to make them identical except for the type of lineup presented.

Witnessed Event

Eyewitnesses came to the experiment under the guise of evaluating a videotaped speech. The speech was given by a young woman who was using a projector to give her speech in a classroom. Halfway through the speech, a young man entered the room stating that he was with media services and needed to take the equipment. The young woman asked if he could wait until she was finished with the speech, and he agreed; the young woman then finished her speech. The intruder was visible for about 20 s and could be seen from both frontal and profile views. To ensure that the findings were not specific to a particular perpetrator, two versions of the tape were created so that the young man who came in to take the projector varied across participants. The young men were similar in coloring, height, and stature. The videotapes were originally created for use in an earlier study (Haw & Fisher, 2004).

Photospread

For this study, we used two photo arrays constructed by Haw and Fisher (2004) to be used in conjunction with the videotaped events described above.
Haw and Fisher constructed these arrays using the two-part procedure described in Koehnken, Malpass, and Wogalter (1996), with lineup members chosen first for their match to description and then their similarity to one another. Lineup members were photographed wearing the same shirt to eliminate clothing bias (Lindsay et al., 1987). Haw and Fisher conducted pilot tests to determine the effective size (Tredoux's e; Tredoux, 1998) and the functional size of the two lineups. The effective size of both lineups was 5.0; the functional size of one lineup was 4.17 and the other was 5.47 (M = 4.82). The target-present photo array for each target served as the target-absent photo array for the other target. A complete description of the construction of these two photo arrays can be found in Haw and Fisher (2004).

Lineup Instructions

Administrators read one of the two versions of instructions to the witnesses. The unbiased version of the instructions was adapted from the U.S. Department of Justice guidelines (Technical Working Group for Eyewitness Evidence, 1999). These instructions reminded witnesses that the suspect may or may not be in the lineup and that certain features may have changed over time, and told witnesses to take their time and study each photograph carefully. The biased instructions were based on the biased instructions used in Lindsay et al. (1991) and read, "We have the suspect in custody and would like to show you a photo lineup to see if you are able to identify him." The biased instructions implied that the suspect was present and the witness only needed to pick him out of the lineup.

Dependent Measures

Following an identification decision by the eyewitness, administrators indicated which photo, if any, the eyewitness had chosen. This information was used to determine whether the participant made an identification of the suspect, identified a foil, or rejected the lineup.
Participants rated their confidence on a 7-point scale with 1 = not at all confident and 7 = extremely confident. Then, the administrator and the witness each completed separate questionnaires assessing their perceptions of bias present in the photo array administration. The specific questions asked of the witnesses and of the administrators are listed in Table 1. Participants made their ratings on 7-point Likert-type scales (1 = strongly disagree; 7 = strongly agree). When necessary, items were recoded so that higher numbers indicated stronger agreement with statements suggesting that the lineup administration was biased.

Table 1 Statements rated by witnesses and administrators

Witness statements
1. The instructions provided to me by the administrator of this lineup were complete and unbiased (R)
2. The administrator encouraged me to identify someone in this lineup
3. The lineup administrator encouraged me to identify a specific person in the lineup
4. The instructions given to me during this lineup were biased in favor of me choosing someone in particular
5. The administrator appeared to be doing his or her best to be fair and impartial throughout the lineup administration (R)
6. The administrator wanted me to pick a certain person from the lineup
7. I made my identification without any input from the lineup administrator (R)
8. I believe the administrator's behavior probably influenced my decision in this lineup

Administrator statements
1. The instructions that I provided to the witness were complete and unbiased (R)
2. I encouraged the witness to identify someone from this lineup
3. I tried my very best to be fair and impartial throughout the lineup administration (R)
4. I encouraged the witness to pick a specific person in the lineup
5. The eyewitness made an identification without any influence from me (R)

Note: Items marked with an (R) were recoded so that higher numbers indicated stronger agreement with statements that suggested that the lineup administration was biased

Procedure

Participant witnesses signed up to participate in a study examining perceptions of speeches and participant administrators signed up to participate in a study of eyewitness memory; they were restricted to participating in only one of the studies. The scheduling of the two studies was staggered so that the participants would not arrive at the laboratory at the same time. Participant witnesses arrived first, were instructed that they would watch a video of a person giving a speech and then evaluate the effectiveness of the speaker, signed a consent form, and then watched the video. At the completion of the video, they completed a filler task in which they rated the effectiveness of the speaker's communication.

In the meantime, participant administrators arrived at the lab. They were taken to a separate room where they were told that some media equipment had been stolen, that a suspect was in custody, and that we needed their help in administering a photo array to a person who had seen a person who was suspected of stealing the equipment. After they signed a consent form, the experimenter played the training video for them and then gave them a set of instructions, which varied depending on the instruction bias condition, to read to the participant witness. The experimenter told all participants that they would receive a bonus of $20 if their witness successfully identified the suspect from the photospread. Only half of the participants were told the identity of the suspect: the perpetrator seen by the witness for the target-present arrays and the target substitute for the target-absent arrays. The other half of the participants were not told who the suspect was but were told that they would be told whether the witness had chosen the suspect at the conclusion of the experiment.
When witnesses had completed their filler questionnaire, the experimenter introduced the administrator and told the witness that the person who interrupted the speech was suspected of stealing the LCD projector after the speech was over and that the administrator would be showing them a photo array. At that time, the experimenter left the witness with the administrator with the instruction that they should open the door when the administration of the photoarray was complete. The interaction between the administrator and the witness was surreptitiously videotaped. After the witness and the administrator had completed their respective post-identification questionnaires, the participants were informed of the deception regarding the videotape and were offered the opportunity to erase the tape or to sign a consent form allowing us to use the videotape for research purposes. These procedures were approved by the university's Institutional Review Board. Before being dismissed, administrators completed paperwork to receive their $20 if their witness had in fact identified the designated suspect.

RESULTS

Identification Decisions

We tested whether our manipulations influenced the rate of identifications of the suspect by conducting logistic regressions with the main effects and all possible interactions of target presence, administrator knowledge, lineup presentation, and instruction bias as predictors. Using a backward stepwise procedure, only two of the predictors were significant. There was a main effect of target presence, B = 2.32, S.E. = .34, Wald's χ²(1, N = 234) = 46.12, p < .001, exp(B) = 10.19; the proportion of suspect identifications was 0.60 for target-present and 0.14 for target-absent lineups. When the target was present in the lineup, the odds that the witness would identify the suspect were 10 times greater than the odds that the witness would identify the suspect from a target-absent lineup.
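The relationship between the reported coefficient and the "10 times greater" odds can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and it assumes the standard identities that the odds ratio is exp(B) and that a single-parameter Wald statistic is approximately (B / S.E.)². Because B and S.E. are reported to two decimals, the recomputed values only approximate the published exp(B) and Wald χ².

```python
import math


def odds_ratio(b: float) -> float:
    """Convert a logistic-regression coefficient B into an odds ratio, exp(B)."""
    return math.exp(b)


def wald_chi_square(b: float, se: float) -> float:
    """Approximate a 1-df Wald chi-square as (B / S.E.) squared."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return (b / se) ** 2


# Rounded values reported in the text for the main effect of target presence.
b_target_presence = 2.32
se_target_presence = 0.34

# Both results are approximations; small differences from the published
# figures are expected because the inputs are rounded.
print(round(odds_ratio(b_target_presence), 2))
print(round(wald_chi_square(b_target_presence, se_target_presence), 2))
```

The same two functions can be applied to the other coefficients reported in this section; a negative B yields an odds ratio below 1, which corresponds to a reduction in the odds.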
A three-way interaction of Administrator Knowledge x Lineup Presentation x Instruction Bias was also observed, B = 1.42, S.E. = .48, Wald's χ²(1, N = 234) = 8.62, p = .003, exp(B) = 4.13. The effect of administrator knowledge on identifications of the suspect was greatest when administrators conducted a simultaneous lineup using biased instructions in comparison to all other conditions. See Table 2 for the proportion of participants making different identification decisions (i.e., identifications of the suspect, foil identifications, and rejections) by condition.

We ran a similar logistic regression with foil identifications as the dependent variable. This analysis revealed a main effect for target, B = -1.92, S.E. = .30, Wald's χ²(1, N = 234) = 40.58, p < .001, exp(B) = .14. Witnesses in the target-present condition were less likely to choose a foil (.30) than participants in the target-absent condition (.73). A significant two-way interaction of administrator knowledge and instruction bias indicated that foil identifications were less common when the administrator knew the suspect and gave the witness biased instructions (.40) than in the other three cells of the interaction (double-blind/biased = .63; double-blind/unbiased = .57; single-blind/unbiased = .47), B = -1.18, S.E. = .44, Wald's χ²(1, N = 234) = 7.38, p = .007, exp(B) = .31.

We conducted a logistic regression with target presence, administrator knowledge, instruction bias, and lineup presentation and their interactions as predictors of lineup rejections. Using a backward step procedure, none of the predictors remained in the model.

Our analyses showed that rejections of the lineup are not affected by administrator knowledge. In contrast, administrator knowledge did interact with other variables to influence both identifications of the suspect and foil identifications.
Specifically, it appears as if some foil identifications made under double-blind administration conditions, perhaps those produced by guessing, are redistributed to identifications of the suspect under single-blind administration conditions. Figure 1 provides a visual depiction of this phenomenon for simultaneous photospreads paired with biased instructions, where we found the greatest effect of investigator knowledge.

Fig. 1 Identification decisions in Biased Instruction, Simultaneous Photo Spreads as a Function of Administrator Knowledge [bar chart: percentage of Suspect, Foil, and Reject decisions for Double-Blind and Single-Blind Administrators]

Table 2 Proportion of identification decisions by target presence, instruction bias, lineup presentation, and administrator knowledge

                        Target-present              Target-absent               Collapsed across target presence
                        Simultaneous  Sequential    Simultaneous  Sequential    Simultaneous  Sequential
Identifications of suspects
  Biased Instructions
    Single-blind        .86 (n = 14)  .57 (n = 14)  .33 (n = 15)  .21 (n = 14)  .60 (n = 29)  .39 (n = 28)
    Double-blind        .64 (n = 14)  .50 (n = 14)  .00 (n = 15)  .07 (n = 15)  .32 (n = 29)  .28 (n = 29)
  Unbiased Instructions
    Single-blind        .47 (n = 15)  .79 (n = 14)  .14 (n = 14)  .13 (n = 15)  .31 (n = 29)  .46 (n = 29)
    Double-blind        .43 (n = 14)  .56 (n = 16)  .19 (n = 16)  .07 (n = 15)  .31 (n = 30)  .32 (n = 31)
Foil identifications
  Biased Instructions
    Single-blind        .14 (n = 14)  .36 (n = 14)  .47 (n = 15)  .64 (n = 14)  .31 (n = 29)  .50 (n = 28)
    Double-blind        .29 (n = 14)  .43 (n = 14)  .87 (n = 15)  .93 (n = 15)  .59 (n = 29)  .69 (n = 29)
  Unbiased Instructions
    Single-blind        .27 (n = 15)  .14 (n = 14)  .79 (n = 14)  .67 (n = 15)  .52 (n = 29)  .41 (n = 29)
    Double-blind        .43 (n = 14)  .38 (n = 16)  .81 (n = 16)  .67 (n = 15)  .63 (n = 30)  .52 (n = 31)
Rejections of the lineup
  Biased Instructions
    Single-blind        .00 (n = 14)  .07 (n = 14)  .20 (n = 15)  .14 (n = 14)  .10 (n = 29)  .11 (n = 28)
    Double-blind        .07 (n = 14)  .07 (n = 14)  .13 (n = 15)  .00 (n = 15)  .10 (n = 29)  .03 (n = 29)
  Unbiased Instructions
    Single-blind        .27 (n = 15)  .07 (n = 14)  .07 (n = 14)  .20 (n = 15)  .17 (n = 29)  .14 (n = 29)
    Double-blind        .14 (n = 14)  .06 (n = 16)  .00 (n = 16)  .27 (n = 15)  .07 (n = 30)  .16 (n = 31)

Diagnosticity

We calculated diagnosticity scores for double-blind and single-blind lineup administrations. These scores indicate the extent to which an identification is likely to occur given one hypothesis (i.e., that the suspect is the perpetrator) versus another hypothesis (i.e., that the suspect is not the perpetrator) and provide information about how much one should rely on an identification of the suspect under different conditions. Diagnosticity was calculated by dividing the proportion of identifications of the suspect in target-present lineups by the proportion of identifications of the suspect in target-absent lineups (Wells & Lindsay, 1980), with higher scores indicating greater diagnosticity. Identifications of the suspect made under double-blind conditions were twice as diagnostic as those made under single-blind conditions (single-blind = 3.25; double-blind = 6.66).

Witness Confidence

We conducted a 2 (Target Presence) x 2 (Administrator Knowledge) x 2 (Lineup Presentation) x 2 (Instruction Bias) analysis of variance (ANOVA) with witnesses' ratings of their confidence in their identification as the dependent variable. There was a nonsignificant trend for a main effect of administrator knowledge on witness confidence, F(1, 217) = 3.57, p = .06, partial η² = .02. Witnesses who were administered the lineup by a double-blind administrator expressed more confidence in their decision (M = 5.20) than witnesses who were exposed to a single-blind administration (M = 4.89).

Witnesses' Ratings of Administration Bias

We conducted a 2 (Target Presence) x 2 (Administrator Knowledge) x 2 (Lineup Presentation) x 2 (Instruction Bias) multivariate analysis of variance (MANOVA) with witnesses' ratings of their agreement with a series of statements about the bias present in the lineup procedure as the dependent variables (see Table 1 for a list of the statements). The only significant effect was a three-way interaction of administrator knowledge, lineup presentation, and instruction bias, multivariate F(8, 210) = 2.27, p < .03, partial η² = .08. However, follow-up univariate analyses revealed no significant effects for any of the individual items.

Administrator Ratings of Administration Bias

We also conducted a 2 (Target Presence) x 2 (Administrator Knowledge) x 2 (Lineup Presentation) x 2 (Instruction Bias) MANOVA with administrators' ratings of their agreement with a series of statements about the lineup procedure as the dependent variables (see Table 1 for a list of the statements). A main effect of lineup presentation was observed, multivariate F(5, 211) = 2.26, p = .05, partial η² = .05. Follow-up univariate analyses revealed that administrators were more likely to say that they had encouraged the witness to identify someone from the lineup when they administered a simultaneous lineup (M = 4.63) than when they administered a sequential lineup, M = 3.95, F(1, 211) = 5.93, p < .02, partial η² = .03. There was also a significant three-way interaction of Administrator Knowledge, Instruction Bias, and Target Presence, multivariate F(5, 211) = 2.83, p < .02, partial η² = .06. The only significant interaction in the follow-up univariate tests was for the statement that the administrator tried to be fair and impartial during the lineup administration, F(1, 211) = 9.53, p = .002, partial η² = .04. Administrators indicated that they were less likely to have been fair and impartial when they knew who the suspect was (M = 2.00) than when they did not (M = 1.21), but only when they administered biased instructions for a target-absent lineup. This effect of administrator knowledge did not obtain for the other combinations of instruction bias and target presence.

Administrator Behavior During the Photoarray Administration

Two doctoral-level students blind to the condition as well as to the hypotheses of our study coded the videotaped lineup administrations for verbal and nonverbal cues that might influence witnesses to choose a particular lineup member. Due to technical difficulties with the recordings (e.g., camera malfunctions, administrators moving outside of camera range) that were not systematically associated with particular experimental conditions, 179 administrations were available for coding and were included in the analyses. The raters coded for the presence of a variety of administrator behaviors such as asking witnesses whether they are sure after failing to make an identification or to take another look at the photos after making an identification. See Table 3 for a full list of behaviors.

Table 3 Proportion of administrators engaging in different behaviors as a function of administrator knowledge or lineup presentation

                                        Administrator knowledge     Lineup presentation
Administrator behavior                  Double-blind  Single-blind  Sequential    Simultaneous
Smile at identification                 .21           .19           .12           .29
Frown at identification                 .02           .01           .01           .02
Raise eyebrows                          .04           .05           .06           .03
Tell witness to change behavior         .03           .06           .06           .02
Tell witness to look carefully          .10           .26           .14           .22
Tell witness to take time               .21           .25           .20           .26
Think of perp from new angle            .22           .21           .32           .11
Compare two photographs                 .07           .05           .05           .08
Say know who suspect is                 .03           .14           .10           .07
Say do not know who suspect is          .23           .01           .13           .11
Ask witness to describe suspect         .13           .18           .20           .12
Ask if sure after identification        .41           .48           .39           .51
Ask if sure after non-identification    .20           .26           .36           .09
Remove picture slowly after non-ID      .00           .04           .04           .00
Call attention to specific photo        .05           .07           .05           .07
Ask to look again after non-ID          .21           .36           .45           .12
Ask to look again after identification  .31           .28           .28           .31
Repeat choice with questioning tone     .03           .00           .02           .01

Concordance rates between the two coders were calculated using the formula $C = 2C_{1,2}/(C_1 + C_2)$, where $C_{1,2}$ = number of identical categories assigned by both coders, and $C_1$ and $C_2$ = total number of categories assigned by the first and second coders, respectively. The overall concordance rate between the two coders was .80. Disagreements between the two raters were resolved by a third doctoral student who was also blind to the administration condition and the hypotheses of the study.

The coders also rated the level of pressure the administrator exerted on the witness to choose any photograph on a 5-point Likert-type scale, with higher numbers indicating greater bias. The intraclass correlation for ratings of pressure to choose a photograph indicated reasonable interrater reliability (.70). The ratings of the two coders were averaged to create a single measure of pressure to choose.
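The concordance formula used for the two coders is simple enough to express directly. The sketch below is only an illustration of that formula: the function name and the counts passed to it are hypothetical and are not taken from the study's coding data.

```python
def concordance(identical: int, total_first: int, total_second: int) -> float:
    """Concordance rate C = 2 * C12 / (C1 + C2).

    identical     -- number of identical categories assigned by both coders (C12)
    total_first   -- total categories assigned by the first coder (C1)
    total_second  -- total categories assigned by the second coder (C2)
    """
    if total_first + total_second == 0:
        raise ValueError("at least one coder must have assigned a category")
    return 2 * identical / (total_first + total_second)


# Hypothetical counts chosen for illustration only.
print(concordance(identical=40, total_first=50, total_second=50))
```

With these made-up counts the two coders agree on 40 of the 50 categories each assigned, so the rate equals the proportion of agreement; when the coders assign different numbers of categories, the denominator averages over both totals.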
We subjected the blind observers' ratings of administrator pressure to choose a photo to a 2 (Target Presence) x 2 (Administrator Knowledge) x 2 (Lineup Presentation) x 2 (Instruction Bias) ANOVA. This analysis revealed a main effect of administrator knowledge, F(1, 162) = 8.23, p = .005, partial η² = .05. The blind coders rated single-blind administrators as placing more pressure on the witness to choose a photograph (M = 3.47) than double-blind administrators (M = 3.20).

To test which administrator behaviors differed among the conditions, we subjected coders' ratings of administrator behaviors to a 2 (Target Presence) x 2 (Administrator Knowledge) x 2 (Lineup Presentation) x 2 (Instruction Bias) MANOVA, with specific behaviors as the dependent variables. This analysis revealed a main effect of administrator knowledge, multivariate F(18, 144) = 2.20, p = .006, partial η² = .22. Single-blind administrators were more likely than double-blind administrators to tell witnesses to examine the lineup carefully (F(1, 144) = 6.88, p = .01, partial η² = .04), to tell witnesses that they know who the suspect is (F(1, 144) = 7.41, p = .007, partial η² = .04), to tell witnesses to take another look at the lineup if they did not make an identification (F(1, 144) = 5.54, p = .02, partial η² = .03), and to remove a picture slowly if witnesses said no to a particular photograph, F(1, 144) = 3.91, p = .05, partial η² = .02. Single-blind administrators were less likely than double-blind administrators to tell witnesses that they did not know who the suspect was, F(1, 144) = 18.37, p < .001, partial η² = .10. See Table 3 for means.

There was also a main effect of lineup presentation, multivariate F(18, 144) = 3.51, p < .001, partial η² = .31.
Investigators presenting a sequential lineup were more likely than administrators of simultaneous lineups to ask witnesses to think about the perpetrator from another angle, F(1, 144) = 11.79, p = .001, partial η² = .07, to ask witnesses if they were sure if the witnesses did not make an identification, F(1, 144) = 21.56, p < .001, partial η² = .12, to take the picture away slowly after a non-identification, F(1, 144) = 3.91, p = .05, partial η² = .02, and to ask witnesses to take another look if they did not make an identification, F(1, 144) = 27.79, p < .001, partial η² = .15. Administrators of simultaneous lineups were more likely than administrators of sequential lineups to smile when the witness made an identification, F(1, 144) = 7.63, p = .006, partial η² = .05. See Table 3 for means.

DISCUSSION

Although previous research has clearly shown the effects of lineup administrator feedback on witness confidence and memory for the conditions present during the witnessed event (Douglass & Steblay, 2006), the research record for the effects of lineup administrator knowledge of the suspect's identity on the reliability of eyewitness identifications has been more equivocal (Russano et al., 2006). The present research sought to answer several questions left unanswered by previous research, including whether other features of lineup procedures moderate the effects of administrator knowledge on eyewitness accuracy and whether double-blind photo array administration increases the diagnosticity of identifications of the suspect. In addition, we videotaped the photo array administrations to examine the types of behavioral cues that are associated with administrator knowledge of a suspect's identity.

Effects of Administrator Knowledge on Eyewitness Identifications

Past research showing administrator knowledge effects (Phillips et al., 1999) had tested the effects exclusively in target-absent lineups.
Our study examined the effects of manipulating administrator knowledge in both target-absent and target-present lineups, which allows for the examination of the diagnosticity of identifications of the suspect in double-blind and single-blind photo array administrations. The diagnosticity of identifications of the suspect under double-blind administrations was twice that obtained under single-blind administrations, indicating that identifications of the suspect obtained when the administrator does not know the identity of the suspect in the photo array provide better information about the true guilt of the identified suspect.

The identification data also suggest that the effects of administrator knowledge are greater under some lineup procedures than others. We manipulated whether the instructions administered to the witnesses were biased or unbiased and whether the administrator conducted a simultaneous or sequential photo array. The manipulation of administrator knowledge of the suspect's identity had the greatest influence on identifications of the suspect when other factors that increase mistaken identifications were also present during the photo array administration (i.e., biased instructions; simultaneous presentation). Specifically, when presented with biased instructions and a simultaneous lineup, witnesses were more likely to make an identification of the suspect when administrators knew the identity of the suspect (single-blind administration) than when they did not (double-blind administration), irrespective of whether the suspect was the culprit. Thus it is possible that the mixed results obtained in earlier research were due to differences in the instructions given to witnesses, with the NIJ recommended instructions minimizing the effects of administrator knowledge on mistaken identifications of the suspect.
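The diagnosticity comparison discussed in this section reduces to a ratio of two proportions (Wells & Lindsay, 1980): suspect identifications in target-present lineups divided by suspect identifications in target-absent lineups. The following is a minimal sketch of that ratio. The function name is hypothetical and the input proportions are invented for illustration; they are not the cell values behind the reported scores of 3.25 and 6.66.

```python
def diagnosticity(p_target_present: float, p_target_absent: float) -> float:
    """Ratio of suspect-identification rates: target-present / target-absent.

    Higher values mean that an identification of the suspect is stronger
    evidence that the suspect is the perpetrator.
    """
    if not 0.0 <= p_target_present <= 1.0 or not 0.0 <= p_target_absent <= 1.0:
        raise ValueError("proportions must lie between 0 and 1")
    if p_target_absent == 0.0:
        raise ValueError("ratio is undefined when no innocent suspects are identified")
    return p_target_present / p_target_absent


# Illustrative (hypothetical) proportions for two administration conditions.
condition_a = diagnosticity(p_target_present=0.50, p_target_absent=0.10)
condition_b = diagnosticity(p_target_present=0.60, p_target_absent=0.20)
print(condition_a, condition_b)
```

In this made-up comparison, condition B yields more suspect identifications overall, but condition A is the more diagnostic of the two because its identifications of innocent suspects are proportionally rarer.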
Moreover, simultaneous lineups and biased instructions are both lineup features that promote a lower criterion for choosing someone from the lineup (Clark, 2005; Flowe & Ebbesen, 2007; Malpass & Devine, 1981a, b; Meissner et al., 2005) and consequently increase the likelihood that witnesses will guess when making an identification decision in the absence of a clear memory of a lineup member as the perpetrator. Sometimes guesses will be correct and witnesses will guess the suspect in a target-present lineup. But guesses may also lead to incorrect choices, including identifications of the suspect in target-absent arrays or foil identifications in either type of array. Is there any evidence that guessing plays a role in the effects of administrator knowledge on witness behavior? Our findings suggest that under conditions that promote guessing such as biased instructions and simultaneous administration, single-blind lineup administration results in the redistribution of guesses from fillers to suspects.

Effects of Administrator Knowledge on Administrator Behavior

Did having knowledge of the suspect's identity change administrator behavior during the photospread administration? The answer is yes. Our trained observers who were blind to the knowledge condition of the administrators and to the hypotheses of the study were able to identify some specific behaviors that single-blind administrators exhibited at greater rates than double-blind administrators. Specifically, single-blind administrators were more likely to tell the witness to examine the lineup carefully, to take another look at the lineup after the witness failed to make an identification, and to remove a picture from consideration slowly if the witness rejected it as the suspect than were double-blind administrators.
Single-blind administrators sometimes even told the witnesses that they knew who the suspect was, giving a very overt cue to the witness that the administrator had knowledge that could help them choose the suspect. The observers also judged that single-blind administrators exerted greater pressure to choose a photograph than did double-blind administrators. Although these behaviors were present across single-blind administrations, irrespective of other lineup procedures like biased instructions or simultaneous presentation, they seemed to exert the most influence on witnesses when biased instruction and simultaneous presentation were also present. Again, because these other procedures lower witnesses' criterion for choosing a member of a lineup, it is possible that the increased guessing produced by the lower criterion increases reliance on administrator cues to inform the guesses.

How do these findings map onto the types of bias that Wells and colleagues (1998) hypothesized might operate in lineup administrations? Observers' ratings of increased pressure to choose in single-blind lineups and single-blind administrators' increased tendency to warn the witnesses to look at the lineup carefully suggest that demand characteristics may be operating to a greater extent in single-blind lineups than in double-blind lineups. There also seems to be evidence of hypothesis leaking in that non-blind administrators were more likely to tell the witness to take another look at the lineup after the witness failed to make an identification, and to remove a picture from consideration slowly if the witness rejected it as the suspect. Unfortunately, the camera angle did not allow us to ascertain whether these behaviors occurred more frequently when the witness failed to identify the suspect rather than a filler.
It is important to note that both the witnesses and administrators participating in the photospread administration reported few if any differences in administrator influence as a function of single-blind versus double-blind administration. This finding is particularly troubling for a number of reasons. If lineup administrators are not aware that they are exhibiting behavioral cues to the suspect's identity, they obviously will not try to inhibit them. In addition, during trial, jurors rely on the witnesses' accounts of the lineup administration procedure to judge the reliability of the identification. If witnesses are not able to convey that the administrator influenced their decision, jurors will not be able to consider this in their decision-making process.

Limitations

Student participants served as both eyewitnesses and lineup administrators in this experiment. Although we provided motivation to the lineup administrators through monetary incentives and provided them with a training video for administering lineup techniques, it is not known whether the behaviors of experienced police officers might differ from those exhibited by the mock administrators in this study. It is possible that police officers are aware of the dangers of mistaken identifications of suspects and therefore are more likely to inhibit any intentional cues to the witness to pick a particular suspect. Of course, behavioral cues may not always be intentional and it is these unintentional cues that are likely to go uncontrolled, even by experienced officers. Moreover, it is likely that police officers are highly motivated to obtain identifications of suspects in cases involving actual crimes, especially when the officer has a strong belief that the suspect is the guilty party, more motivation than we could ethically provide to our participants.
In such highly motivating situations, it is possible that officers may be unconsciously emitting more cues to witnesses to choose the suspect than were emitted in the current study. Therefore, although the effect of administrator knowledge when an experienced police officer administers the lineup is not known, it is possible that the effects may be even stronger than those observed with student administrators in this study. More importantly, it is clear that the effects of the incentives and training video were minimized by using double-blind lineup administration procedures, and it is reasonable to expect that if there is motivation to obtain identifications of suspects in real lineups, double-blind administration would also minimize the effects of that motivation.

Additionally, the witnesses in this study were undergraduate students who were not aware at the time that the event they were viewing was a crime. It was only after they watched the event (i.e., the speech with the interruption by the perpetrator) that they were informed a theft had later taken place. Furthermore, the students were aware that they were participating in a psychology experiment and, therefore, may not have felt the same motivation to pick a suspect that might be felt by witnesses who were placed in danger or to be careful with their choice as there would be no consequences of their actions if they made a wrong choice. Future research should explore the role that witness motivation plays in lineup tasks and how it affects the amount of influence an investigator has on identification decisions.

Conclusion

Although our research suggests that administrator knowledge of a suspect's identity may have greater biasing influence when the administrators deliver biased instructions and simultaneous photospreads, there is still cause for concern about the effects of administrator knowledge in photospreads that lack these features.
Across all lineup procedures, single-blind photospread administrations produced identifications of suspects that had lower diagnosticity than the identifications of suspects produced using double-blind procedures. The present findings suggest the importance of double-blind lineup administration. Although one high-profile field study of double-blind practices casts doubt for some on the usefulness of double-blind lineups (Mecklenburg, Bailey, & Larson, 2008), others suggest that there are sufficient design flaws in that field study to give pause about making policy recommendations based on its findings (Schacter et al., 2008). Although field studies may be helpful in examining practice in the field, more laboratory studies, in which one can know whether an identification of a suspect is a correct or a mistaken identification, are needed to examine the diagnosticity of identifications under double- and single-blind administration conditions and the role of other lineup procedures in moderating the effects of administrator knowledge on witness reliability.

Even double-blind lineups may not be enough to guard against administrator expectancy effects when the administrator conducts lineups with multiple witnesses, as there is evidence that the confidence of the witness who participated in the first lineup administration influences the administrator's perception of the difficulty of the lineup task. Consequently, double-blind administrators administering a second lineup may be especially likely to steer the second witness to the photo chosen by the first witness when the first witness lacks confidence in his or her identification (Douglass, Smith, & Fraser-Thill, 2005).
This study illustrates that there are still many questions about the effects of administrator knowledge of a suspect's identity and double-blind lineup administration on witness behavior that remain unanswered before solid policy recommendations can be made. Although double-blind administration of lineups may not be a panacea, there is no strong empirical evidence that it would produce harmful results. A continued exploration of other variables that interact with administrator knowledge to influence witnesses' decisions (e.g., strength of the memory trace) will assist those who wish to make policy recommendations about best practices for administering lineups.

Acknowledgments This research was supported by a grant from the National Science Foundation (SBE #9986240) to M. B. Kovera. Portions of this research were presented at the 2004 and 2005 meetings of the American Psychology-Law Society in Scottsdale, AZ and La Jolla, CA as well as the 2006 meeting of the Association of Psychological Science in New York, NY. We would like to thank Caroline Crocker, Ryan Copple, and Katy Sothmann for their assistance with coding the administrators' behaviors.

REFERENCES

Buckhout, R. (1975). Reliability checklist for corporeal lineups. Social Action and the Law, 2, 1-8.
Charman, S. D., & Wells, G. L. (2006). Eyewitness lineups: Is the appearance-change instruction a good idea? Law and Human Behavior, 31, 3-22.
Clark, S. E. (2005). A re-examination of the effects of biased lineup instructions in eyewitness identification. Law and Human Behavior, 29, 575-604.
Connors, E., Lundregan, T., Miller, N., & McEwan, T. (1996). Convicted by juries, exonerated by science: Case studies in the use of DNA evidence to establish innocence after trial. Alexandria, VA: National Institute of Justice.
Cutler, B. L., & Penrod, S. D. (1988). Improving the reliability of eyewitness identification: Lineup construction and presentation. Journal of Applied Psychology, 73, 281-290.
Douglass, A. B., Smith, C., & Fraser-Thill, D. (2005). A problem with double-blind photospread procedures: Photospread administrators use one eyewitness's confidence to influence the identification of another eyewitness. Law and Human Behavior, 29, 543-562.
Douglass, A. B., & Steblay, N. (2006). Memory distortion in eyewitnesses: A meta-analysis of the post-identification feedback effect. Applied Cognitive Psychology, 20, 859-869.
Fanselow, M. S., & Buckhout, R. F. (1976). Nonverbal cueing as a source of biasing information in eyewitness identification testing. Center for Responsive Psychology Monograph No. CR-26. New York: Brooklyn College C.U.N.Y.
Flowe, H. D., & Ebbesen, E. B. (2007). The effect of lineup similarity on recognition accuracy in simultaneous and sequential lineups. Law and Human Behavior, 31, 33-52.
Haw, R. M., & Fisher, R. P. (2004). Effects of administrator-witness contact on eyewitness identification accuracy. Journal of Applied Psychology, 89, 1106-1112.
Haw, R. M., Mitchell, T. L., & Wells, G. L. (2003, July). The influence of lineup administrator knowledge and witness perceptions on eyewitness identification decisions. Poster presented at the International Congress of Psychology and Law, Edinburgh, Scotland.
Klayman, J., & Ha, Y. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94, 211-228.
Koehnken, G., Malpass, R. S., & Wogalter, M. S. (1996). Forensic applications of line-up research. In S. L. Sporer, R. S. Malpass, & G. Koehnken (Eds.), Psychological issues in eyewitness identification (pp. 205-231). Mahwah, NJ: Erlbaum.
Lindsay, R. C. L., Lea, J. A., & Fulford, J. A. (1991). Sequential lineup presentation: Technique matters. Journal of Applied Psychology, 76, 741-745.
Lindsay, R. C. L., Lea, J. A., Nosworthy, G. J., Fulford, J. A., Hector, J., LeVan, V., & Seabrook, C. (1991). Biased lineups: Sequential presentation reduces the problem. Journal of Applied Psychology, 76, 796-802.
Lindsay, R. C. L., Wallbridge, H., & Drennan, D. (1987). Do clothes make the man? An exploration of the effect of lineup attire on eyewitness identification accuracy. Canadian Journal of Behavioural Science, 19, 463-478.
Lindsay, R. C. L., & Wells, G. L. (1980). What price justice? Exploring the relationship between lineup fairness and identification accuracy. Law and Human Behavior, 4, 303-314.
Lindsay, R. C. L., & Wells, G. L. (1985). Improving eyewitness identifications from lineups: Simultaneous versus sequential lineup presentation. Journal of Applied Psychology, 70, 556-564.
Luus, C. A. E., & Wells, G. L. (1991). Eyewitness identification and the selection of distracters for lineups. Law and Human Behavior, 15, 43-57.
Malpass, R. S., & Devine, P. G. (1981a). Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 66, 482-489.
Malpass, R. S., & Devine, P. G. (1981b). Realism and eyewitness identification research. Law and Human Behavior, 4, 347-358.
Mecklenburg, S. H., Bailey, P. J., & Larson, M. R. (2008). The Illinois Field Study: A significant contribution to understanding real world eyewitness identification issues. Law and Human Behavior, 32, 22-27.
Meissner, C. A., Tredoux, C. G., Parker, J. F., & MacLin, O. H. (2005). Eyewitness decisions in simultaneous and sequential lineups: A dual-process signal detection theory analysis. Memory and Cognition, 33, 783-792.
Penrod, S. D. (2003). How well are witnesses and police performing? Criminal Justice Magazine, 54, 36-47.
Phillips, M. R., McAuliff, B. D., Kovera, M. B., & Cutler, B. L. (1999). Double-blind photoarray administration as a safeguard against investigator bias. Journal of Applied Psychology, 84, 940-951.
Rosenthal, R. (1976). Experimenter effects in behavioral research. New York: Irvington Publishers.
Rosenthal, R. (2002). Covert communication in classrooms, clinics, courtrooms, and cubicles. American Psychologist, 57, 839-849.
Rosenthal, R., & Rosnow, R. L. (1991). Essentials of behavioral research: Methods and data analysis. New York: McGraw-Hill.
Russano, M. B., Dickinson, J. J., Greathouse, S. M., & Kovera, M. B. (2006). Why don't you take another look at number three: Investigator knowledge and its effects on eyewitness confidence and identification decisions. Cardozo Public Law, Policy, and Ethics Journal, 4, 355-379.
Schacter, D. L., Dawes, R., Jacoby, L. L., Kahneman, D., Lempert, R., Roediger, H. L., & Rosenthal, R. (2008). Studying eyewitness investigations in the field. Law and Human Behavior, 32, 3-5.
Snyder, M. (1984). When belief creates reality. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 18, pp. 247-305). San Diego, CA: Academic Press.
Steblay, N. (1997). Social influence in eyewitness recall: A meta-analytic review of lineup instruction effects. Law and Human Behavior, 21, 283-298.
Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2001). Eyewitness accuracy rates in sequential and simultaneous lineup presentations: A meta-analytic comparison. Law and Human Behavior, 25, 459-473.
Technical Working Group for Eyewitness Evidence. (1999). Eyewitness evidence: A guide for law enforcement (NCJ 178240). National Institute of Justice, U.S. Department of Justice.
Tredoux, C. G. (1998). Statistical inference on measures of lineup fairness. Law and Human Behavior, 22, 217-237.
Wells, G. (1988). Eyewitness identification: A system handbook. Toronto, Canada: Carswell.
Wells, G. L., & Lindsay, R. C. L. (1980). On estimating the diagnosticity of eyewitness non-identifications. Psychological Bulletin, 88, 776-784.
Wells, G. L., & Luus, C. A. E. (1990). Police lineups as experiments: Social methodology as a framework for properly conducted lineups. Personality and Social Psychology Bulletin, 16, 106-117.
Wells, G. L., & Seelau, E. P. (1995).
Eyewitness identification: Psychological research and legal policy on lineups. Psychology, Public Policy, and Law, 1, 765-791.
Wells, G. L., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law and Human Behavior, 22, 1-39.

Memory Distortion in Eyewitnesses: A Meta-Analysis of the Post-identification Feedback Effect
AMY BRADFIELD DOUGLASS1* and NANCY STEBLAY2
1Bates College, USA
2Augsburg College, USA

SUMMARY
Feedback administered to eyewitnesses after they make a line-up identification dramatically distorts a wide range of retrospective judgements (e.g. G. L. Wells & A. L. Bradfield, 1998, Journal of Applied Psychology, 83(3), 360–376). This paper presents a meta-analysis of extant research on post-identification feedback, including 20 experimental tests with over 2400 participant-witnesses. The effect of confirming feedback (i.e. ‘Good, you identified the suspect’) was robust. Large effect sizes were obtained for most dependent measures, including the key measures of retrospective certainty, view and attention. Smaller effect sizes were obtained for so-called objective measures (e.g. length of time the culprit was in view) and comparisons between disconfirming feedback and control conditions. This meta-analysis demonstrates the reliability and robustness of the post-identification feedback effect. It reinforces recommendations for double-blind testing, recording of eyewitness reports immediately after an identification is made, and reconsideration by court systems of variables currently recommended for consideration in eyewitness evaluations. Copyright © 2006 John Wiley & Sons, Ltd.

Media coverage of DNA exonerations has highlighted the fact that mistaken eyewitness identifications can result in wrongful convictions of innocent suspects (e.g. Doyle, 2005; www.innocenceproject.org).
Long before the problem of eyewitness misidentification reached public consciousness, however, psychological researchers explored the memory and social influence processes underlying identification errors. Recently, this research has generated procedures designed to minimize the likelihood of a false identification (Davies & Valentine, 1999; Technical Working Group for Eyewitness Evidence, 1999; Wells, Small, Penrod, Malpass, Fulero, & Brimacombe, 1998). Current recommendations for police lineups include five core components: effective use of fillers (e.g. Wells, Rydell, & Seelau, 1993); blind administration of the line-up (e.g. Douglass, Smith, & Fraser-Thill, 2005; Phillips, McAuliff, Cutler, & Kovera, 1999); a cautionary instruction to the witness that the culprit may or may not be present in the set of photos (Malpass & Devine, 1981; Steblay, 1997); sequential rather than simultaneous presentation of photos (e.g. Lindsay & Wells, 1985; Steblay, Dysart, Fulero, & Lindsay, 2001); and obtaining a statement of certainty from the witness at the time of the identification decision (e.g. Luus & Wells, 1994). As researchers continue to advance knowledge of best line-up practices, a number of jurisdictions in the United States are bringing these science-based recommendations to effective field practice (see Klobuchar, 2005).
APPLIED COGNITIVE PSYCHOLOGY
Appl. Cognit. Psychol. 20: 859–869 (2006)
Published online 19 June 2006 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/acp.1237
*Correspondence to: A. B. Douglass, Bates College, 4 Andrews Road, Lewiston, ME 04240, USA. E-mail: adouglas@bates.edu
Line-up research recently has produced an ancillary line of investigation focusing on the integrity of an eyewitness’s recollections after the line-up decision is made. This growing body of literature has revealed the astonishing power of a casual comment from a line-up administrator to affect eyewitness memory.
The first study to examine this effect (Wells & Bradfield, 1998) demonstrated that confirming post-identification feedback received by the witness immediately after the identification (i.e. ‘Good. You identified the actual suspect.’) significantly inflated retrospective confidence reports when compared with a control group told nothing about the accuracy of the identification (participants indicated how certain they were at the time of their identification). Perhaps more alarming is that an extensive range of variables was inflated in conjunction with retrospective certainty, including witness reports of the quality of their view of the perpetrator, how much attention was paid, ease of the identification, and basis for the identification. Participants who received confirming feedback were also more willing to testify about their identification and reported a greater ability to remember strangers. The post-identification feedback effect bears a resemblance to Fischhoff’s hindsight bias (1977) in which participants given the correct answer to a decision indicated how they would have responded, had they not known the correct answer. Participants’ estimates of their own accuracy were routinely higher than the actual accuracy of participants who made the same decision without knowing the correct choice. There are two important differences between Fischhoff’s paradigm and the post-identification feedback paradigm. First, in the feedback paradigm participants cannot misremember their prior decision because feedback is administered immediately after the identification is made. Second, participants are asked to recall judgements surrounding a decision, rather than a decision itself (see Bradfield & Wells, 2005, for a fuller discussion of these issues). Therefore, the post-identification feedback effect demonstrates that outcome information can distort memories beyond the boundaries first outlined by Fischhoff.
Subsequent research has replicated the post-identification feedback findings with variations in experimental design intended to explore their theoretical underpinnings. One explanation for the effect hypothesized that participants do not consider their judgements before being queried in the dependent measures questionnaire. At that time, the only way to consider judgements about the witnessed event and the identification procedure is through the lens of the feedback received. One set of experiments explicitly tested this possibility. Wells and Bradfield (1999) manipulated whether participants were instructed to think about testimony-relevant judgements before receiving feedback. Participants who answered questions about their certainty before hearing feedback were inoculated against the effect of feedback on the certainty dependent measure: their judgements did not show the typical post-identification feedback inflation on retrospective certainty. Research in a related paradigm demonstrated a similar ability of prior thought to protect participants against the memory distorting effects of feedback (Bradfield & Wells, 2005). The post-identification feedback effect is noteworthy for multiple reasons. First, eyewitnesses in the feedback paradigm typically have made identifications from target-absent photospreads; all of their identifications are inaccurate. Consequently, their distorted reports correspond to mistaken identifications of innocent suspects, a forensically relevant scenario of critical importance given the eyewitness errors exposed by DNA exoneration cases (e.g. Davies, 1996; Rattner, 1988). Second, this powerful effect is produced by a simple and seemingly casual comment from the line-up administrator, a ‘system’ variable (Wells, 1978) that potentially could be controlled in police practice.
Third, the aspects of eyewitness experience distorted by post-identification feedback (e.g. certainty, witness perception of his/her view of the perpetrator, attention given to the witnessed event, ease of identification) are the very attributes that are likely to bolster eyewitness credibility in the eyes of investigators, prosecutors, and juries. Research has established that people who evaluate eyewitness identifications routinely and naturally assume that confidence (certainty) is correlated with accuracy (e.g. Leippe, 1994) and continue to use confidence to assess accuracy even after being told that the two are not reliably linked (Fox & Walters, 1986). Finally, court systems have explicitly recommended using some of the very criteria distorted by post-identification feedback in evaluations of eyewitnesses. The US Supreme Court recommends using certainty, view, and attention reports (e.g. Neil v. Biggers, 1972); courts in England and Wales recommend using view and attention (R v Turnbull, 1977); the Australian Law Reform Commission (2005) is currently reviewing jury instructions regarding eyewitness identification evidence. Considering the research findings reported above, these recommendations demand scrutiny. Since the post-identification feedback effect entered the published eyewitness literature in 1998, many researchers have explored this phenomenon. However, in spite of strong academic interest in the topic, there has not been a systematic organization and evaluation of the research. The current research aims to provide such structure using the tool of meta-analysis.
Meta-analysis already has been useful in the psycho-legal realm, as it provides objective quantitative indicators of the status of a hypothesized effect, detailed analysis of effect moderators and direction for advances in research design, theory and practice (see recent meta-analyses on topics of line-up instruction, Steblay, 1997; sequential presentation, Steblay et al., 2001; and showups, Steblay, Dysart, Fulero, & Lindsay, 2003). Systematic evaluation of extant research on the post-identification feedback effect will assist future researchers by guiding the selection of variables and experimental paradigms that can target the causes and parameters of the effect. Equally important, this meta-analysis will provide a summary of knowledge for a broader audience that includes line-up administrators and court personnel. Line-up administrators are often interested in learning about strategies for obtaining eyewitness evidence that are immune to challenges from the defence. Similarly, court personnel (including defence attorneys) are interested in hearing from experts about procedures that might have compromised the integrity of the eyewitness’s memory. With a meta-analysis as a foundation for their recommendations, experts involved in conversations with these constituencies will be able to inform both more comprehensively.
METHOD
Sample
The sample included 20 experimental tests from 14 studies. Studies were obtained from a search of PsycInfo and additional conversation with researchers within this area of expertise. The final sample included 10 published and 4 unpublished studies, representing 2477 participant-witnesses. The majority of the studies were conducted in the United States (n = 11) with others conducted in the United Kingdom (n = 1), and Australia (n = 2).
In order to be included in the meta-analysis, the study must have included a laboratory test of the confirming feedback effect, retrospective certainty as a dependent variable, and data that could provide calculation of an effect size for the comparison of a group that received confirming feedback to a control group. The studies included in this analysis were conducted between 1998 and 2005 using participants who ranged in age from 11 to 97; most were college students. Participants of both genders were included in 100% of the studies. All studies used videotaped stimuli as the witnessed event with a range of length from 60 seconds to 180 seconds and required participants to make an identification from a photospread containing colour photographs. One exception was Bradfield, Wells, & Olson (2002), in which participants made an identification from a videotaped lineup. Sample sizes ranged from 62 to 320 (M = 176.93).
Dependent measures
A total of 13 dependent measures were recorded from the 14 studies analysed, not all of which were included in each study analysed. The measures fell into three broad categories. First were retrospective judgements regarding the witnessed event. Measures in this category include: view, attention paid, ability to make out facial features, basis for an identification, quality of the culprit’s image in memory, distance of the camera from the perpetrator, and length of time the perpetrator was in view. A second set of measures concerned aspects of participants’ identification experience: retrospective certainty, ease of identification and time needed to make the identification. Finally, measures concerning summative judgements were analysed: general ability to remember strangers, reports of trust in an eyewitness who had similar viewing conditions, and willingness to testify.
Statistics
Cohen’s d, the standardized mean difference between two groups, was calculated as the effect size indicator for each comparison (Cohen, 1988).
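As a rough illustration of the effect size indicator just described, the following minimal Python sketch computes Cohen's d from group summary statistics. It assumes the pooled-standard-deviation form of d, which the article does not spell out; the function name and the example means, standard deviations, and group sizes are hypothetical and are not taken from any study in the meta-analysis.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference between two groups (pooled-SD form).

    Assumption: the two group variances are pooled, weighted by their
    degrees of freedom, before dividing the mean difference.
    """
    if n_1 < 2 or n_2 < 2:
        raise ValueError("each group needs at least two observations")
    pooled_variance = (
        (n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2
    ) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_variance)


if __name__ == "__main__":
    # Hypothetical certainty ratings on a 7-point scale:
    # confirming-feedback group vs. no-feedback control group.
    d = cohens_d(mean_1=5.0, sd_1=1.0, n_1=50, mean_2=4.0, sd_2=1.0, n_2=50)
    print(f"Cohen's d = {d:.2f}")
```

A positive value here means the first group (confirming feedback in the hypothetical call) scored higher than the second.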
In the following results, d is used to indicate a mean effect size across tests. A meta-analytic Z (Zma) was calculated using Rosenthal’s (1991) method of combining t-values. This Zma provides an overall probability level associated with the observed pattern of results. A fail-safe N (Nfs) was calculated as a means to determine the number of unretrieved studies averaging null results necessary to bring the overall p-value to a specific level of significance (in this case, p = 0.05). This number of studies, or tolerance for future null results, allows us to evaluate the resistance of the review conclusion to a ‘file drawer threat’ (Rosenthal, 1991).
Comparisons
Eleven of the tests compared a confirming feedback (CF) condition to a no feedback (NF) control group. Six compared CF to a disconfirming feedback (DF) condition. Three compared DF and NF groups. The focus of our study is the first comparison (CF vs. NF), which is the forensically relevant contrast because of the inflationary power of confirming feedback for a witness who has identified a suspect.
RESULTS
Primary analysis: Comparison between confirming feedback (CF) and no feedback (control) groups, on each dependent measure
Certainty
Certainty is arguably the most important dependent measure in the post-identification feedback paradigm. Participants receiving confirming feedback expressed significantly more retrospective confidence in their decision compared with participants who received no feedback (d = 0.79, Zma = 13.42, p < 0.0001). An effect size of 0.79 is considered large, based on Cohen’s rule of thumb (Cohen, 1988) (see Table 1).
Biggers criteria
Eyewitness certainty (noted above), opportunity to view the perpetrator, and attention paid to the event are qualities of the eyewitness viewing experience that, according to the US Supreme Court (Neil v.
Biggers, 1972), are criteria relevant to juror decision-making (recommendations in England and Wales focus on view and attention). Participants’ retrospective reports of view and attention (ds = 0.50 and 0.46, respectively) were significantly affected by confirming feedback, producing medium effect sizes.
Related subjective measures
CF participants demonstrated consistently inflated perceptions on subjective measures related to their line-up performance compared to the control group. Participants who received confirming feedback reported that they possessed a significantly better basis for making the identification (d = 0.77), greater clarity of the perpetrator’s image in mind (d = 0.68), greater ease of identification (d = 0.80), and needing less time to make their ID (d = 0.45). They also reported a better memory for strangers’ faces (d = 0.45) and greater trust in the memory of another witness with a similar experience (d = 0.52). Not surprisingly, then, they were also more willing to testify about their identification decision (d = 0.82).
Table 1. Confirming feedback vs. no feedback comparison: Retrospective reports
Dependent measure                             Tests   d      Range (min, max)   Nfs
Certainty at time of ID                       11      0.79   0.20, 1.27         590
How good a view?                              9       0.50   0.02, 0.90         132
Opportunity to view face                      9       0.55   0.04, 1.04         165
Attention paid                                9       0.46   0.27, 0.67         145
Good basis to make an ID                      9       0.77   0.56, 1.10         386
Ease of making an ID                          9       0.80   0.35, 1.02         587
Speed of ID                                   9       0.45   0.12, 0.67         104
Willing to testify                            9       0.82   0.43, 1.13         437
My memory for strangers                       8       0.45   0.19, 0.84         75
Clarity of image in my mind                   7       0.68   0.30, 1.17         150
Trust in eyewitness with similar experience   3       0.52   0.41, 0.71         <1
How far away?                                 2       0.12   0.10, 0.13         <1
How long in view?                             4       0.29   0.09, 0.69         <1
Confidence ‘right now’                        2       0.53   0.31, 0.75         19
Comparisons for all measures produced statistically significant Zma values (p < 0.05), except for ‘how far away’ and ‘how long in view’.
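The fail-safe N (Nfs) column of Table 1 can be read against a sketch like the following. It is only an approximation of the approach named in the Statistics section: it assumes that a one-tailed Z score is available for each test, combines them with the unweighted Stouffer method, and then applies Rosenthal's file-drawer formula with the one-tailed p = 0.05 criterion (Z = 1.645). The article reports combining t-values, so its exact computation may differ; the function names and the input Z scores are hypothetical.

```python
import math


def stouffer_z(z_values):
    """Combine independent Z scores: Z_combined = sum(Z) / sqrt(k)."""
    k = len(z_values)
    if k == 0:
        raise ValueError("need at least one Z score")
    return sum(z_values) / math.sqrt(k)


def fail_safe_n(z_values, z_criterion=1.645):
    """Rosenthal's fail-safe N.

    Number of additional studies averaging Z = 0 that would be needed to
    pull the combined result down to the criterion Z (1.645 corresponds
    to one-tailed p = 0.05): N_fs = (sum(Z) / z_criterion)^2 - k.
    """
    k = len(z_values)
    if k == 0:
        raise ValueError("need at least one Z score")
    return (sum(z_values) / z_criterion) ** 2 - k


if __name__ == "__main__":
    # Hypothetical per-test Z scores, not values from the meta-analysis.
    hypothetical_z = [2.1, 3.0, 1.2, 2.6]
    print("combined Z:", round(stouffer_z(hypothetical_z), 2))
    print("fail-safe N:", round(fail_safe_n(hypothetical_z), 1))
```

A large fail-safe N relative to the number of retrieved tests suggests that many unpublished null results would be required to overturn the conclusion; a value below 1 suggests little such tolerance.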
‘Objective’ measures
Smaller effect sizes and no statistically significant differences were found for attributes of the participants’ experience that are ostensibly objective: time that the perpetrator was in view and distance from the camera to the perpetrator (ds = 0.29 and 0.12, respectively). These smaller effect sizes are noteworthy for at least two reasons: they suggest some limits to the influences of confirming feedback; they also indicate discernment and seriousness on the part of the research participants (i.e., participants were not simply employing a thoughtless response set across measures). However, they also tap information to which a subject could surmise the experimenter has access (knowable facts), unlike an investigator in a real crime situation.
Moderators
The small number of studies available did not allow for extensive moderator analysis. However, it may be noted that post-identification feedback effects are quite robust. Overall, the studies involved a reasonably diverse sample of participants (undergraduates, children, adults) and stimulus materials. The effects were achieved for witnesses who made accurate identifications in target-present lineups as well as false IDs in target-absent arrays, although the effect is stronger for inaccurate witnesses (Bradfield et al., 2002). Semmler, Brewer, & Wells (2004) also found post-identification effects for witnesses who rejected the lineup (‘He’s not there’). Semmler et al. also found the effects when a cautionary instruction (‘may or may not be in the lineup’) was provided; Douglass and McQuiston-Surrett (in press) found the effects with both sequential and simultaneous lineups.
Secondary analysis: Comparison between disconfirming feedback (DF) and no feedback (control) groups, on each dependent measure
Only three tests (in three separate studies) explored the impact of disconfirming feedback on participants’ retrospective reports.
These reports produce small average effect sizes and some inconsistencies. For dependent measures of view and ease of ID, participants receiving disconfirming feedback indicate less positive retrospective reports in all three studies (d = −0.14 and −0.31, respectively). On measures of retrospective confidence (d = −0.21), ability to make out details of the face (d = −0.04), attention (d = −0.08), basis for ID (d = −0.10), time to make an ID (d = 0.01), and willingness to testify (d = −0.10), the three tests show mixed results: two tests with less positive reports from the DF condition, one test with more positive reports (negative effect sizes indicate higher scores from the NF control condition).1
1Not surprisingly, a comparison of CF and DF conditions indicates substantial differences between the groups in retrospective certainty (d = 1.07), view (0.69), memory for the face (0.77), attention (0.67), basis for judgment (0.79), ease of ID (1.01), time to make an ID (0.60), willingness to testify (1.01), memory for strangers (0.58), clarity of image in memory (0.99), and trust in an eyewitness with similar experience (0.56).
DISCUSSION
Through this review, the reliability and robustness of the post-identification feedback effect are well documented. Over 2400 participant-witnesses have been tested, with remarkably consistent outcomes. Compared to control participants, those who receive a simple post-identification confirmation regarding the accuracy of their identification significantly inflate their reports to suggest better witnessing conditions at the time of the crime, stronger memory at the time of the lineup, and sharper memory abilities in general. Participants apparently make what would otherwise seem to be reasonable post hoc inferences about their witnessing experience and behaviour during the identification.
However, these inferences are based on erroneous information external to their actual memory of these events. This ‘creeping determinism’ (Fischhoff, 1975) produces a memory distortion that is by no means benign. The implications of these results are quite profound. Both memory for a crime and confidence in one’s memory are fragile and potentially slippery evidentiary elements. Indeed, one of the startling lessons of line-up research is just how powerful seemingly subtle aspects of line-up construction and investigator behaviour can be. The simple addition of a cautionary instruction (the perpetrator ‘may or may not be in the lineup’) produces a significant (25%) drop in false identifications (Steblay, 1997); and use of a sequential lineup cuts the false identification rate almost in half (by 23%) in target absent lineups (Steblay et al., 2001). Similarly, subtle changes in investigator behaviour derived from the knowledge an investigator has about the identity of the suspect can influence witnesses’ identification decisions (e.g. Douglass et al., 2005; Phillips et al., 1999) and confidence (Garrioch & Brimacombe, 2001). Although the present data do not allow for a precise calculation of the number of errors that could be avoided with a change in practice, they do provide dramatic evidence that post-identification feedback can compromise the integrity of a witness’s memory. Wells and Bradfield (1998) made this point clearly when reporting participant-witnesses’ disproportionate use of the extreme end of the ‘certainty’ scale: 50% of CF participants in that study indicated certainty of six or seven on a 7-point scale (compared with 15% using the extreme end of the scale in the DF condition). The relevance for real cases is clear. First, as noted by Wells and Bradfield (1998), it is reasonable to assume that an eyewitness must exceed some threshold of credibility in order for investigators and prosecutors to move ahead in their case against a suspect. 
Witnesses who reconstruct and enhance their report of both witnessing and identification procedures may well increase the likelihood that a case against that suspect will be pursued. Frighteningly, this enhancement is not due to increased accuracy, but to extra-memory factors. Second, witnesses with feedback-enhanced memories will likely be more compelling witnesses at trial, increasing the chances of a conviction, an unwelcome outcome if an innocent suspect was identified. Clear understanding of the impact of post-identification confirmation can facilitate the goal of many eyewitness researchers: preventing mistaken identifications from resulting in wrongful convictions. This meta-analysis can help accomplish this goal in several ways. First, this research should provide police with a strong rationale as to why it is critically important to administer double-blind photospreads and to immediately record eyewitness confidence. These procedures could decrease the likelihood that juries will be erroneously impressed by a falsely confident eyewitness. This is especially critical because at least one study demonstrates that participant-jurors are not sensitive to eyewitnesses who display confidence that has inflated over time (Bradfield & McQuiston, 2004). Additionally, this meta-analysis should influence the treatment of information regarding post-identification feedback effects in court by providing attorneys and experts with a stronger foundation from which to argue that a witness’s memory could be distorted if double-blind procedures were not followed and immediate confidence reports not recorded. For experts who testify in court, this meta-analysis will facilitate admittance of testimony on this topic. Most American courts now use the Daubert standard for admitting expert testimony, one element of which is that the information presented by an expert must have achieved ‘general acceptance in the relevant scientific community’ (Daubert v. Merrell Dow Pharmaceuticals, 1993). The consistency in outcomes demonstrated in this review lends credence to the argument that post-identification feedback effects should be ‘generally accepted’.
Directions for future research
The effects reported here are remarkably consistent, suggesting that future research will target explanations for the post-identification feedback effect rather than resolution of inconsistencies. One direction for future research is to identify ways in which to anchor the witness’s memory in the witnessing experience itself rather than in post-event information. Wells and Bradfield (1999) have found some success in moderating the post-identification effect with instructions to the witness to privately think about his or her confidence and attributes of the witnessing experience prior to receiving feedback. The videotaping of line-up procedures may also provide the means to later remind a witness (as well as a jury) of his or her confidence and perceptions at the time of the lineup (e.g. Kassin, 1998; Sporer, 1993). However, because even a 48-hour delay did not diminish distorted retrospective reports (Wells, Olson, & Charman, 2003), finding other ways to anchor witnesses’ memory is critical. Although the small number of studies did not allow for comprehensive analyses of moderator variables, there was enough evidence to suggest that ‘objective’ measures are less susceptible to memory distortion than are ‘subjective’ measures. Therefore, it might be worthwhile to examine this variable more systematically. Perhaps the difference is due to the fact that participants realize those questions can be evaluated for accuracy by the experimenters.
Would the same difference appear in a paradigm where participants knew that experimenters did not have access to accurate answers (i.e. in a more ecologically valid paradigm)? Other directions for future research include pursuing explanations for conditions under which the feedback effect is diminished such as when disconfirming feedback is administered. Although witnesses who receive disconfirming feedback probably have minimal impact on the criminal justice system (i.e. because they do not testify in court), an explanation for the smaller effects of disconfirming feedback could provide information about the nature of eyewitness memory and how it interacts with social influence cues. Finally, researchers might pursue feedback analogues. Would learning about accuracy from sources outside the immediate identification experience (e.g. a news report, a prosecutor, or another witness; cf. Luus & Wells, 1994) have the same distorting effects on retrospective confidence and perceptions of the witnessing conditions? The current research reveals an increased willingness of witnesses to testify in court. Does this eagerness translate to differences in subsequent interview and/or courtroom behaviour?
Recommendations
The primary recommendation to be made from this meta-analysis is straightforward: feedback to the witness should not be part of the identification procedure. There is also a straightforward strategic solution: use a blind line-up administrator, thoroughly record the line-up process, and obtain eyewitness reports (particularly confidence) immediately after the identification. Currently, blind administrators are recommended in order to guard against memory errors during the line-up decision (e.g. Wells et al., 1998). In England and Wales, the Police and Criminal Evidence Act (PACE) dictates that an officer who is not involved in the investigation conduct the line-up procedure (although he or she does know who the suspect is; Davies & Valentine, 1999). Collateral benefits of a blind administrator are afforded in that no feedback could be provided to the witness until after completion of the line-up procedure and documentation of testimony-relevant judgements (Technical Working Group for Eyewitness Evidence, 1999). A final recommendation is directed at courts considering recommendations for evaluations of eyewitnesses. In the United States, the Supreme Court should reconsider its current recommendation for evaluations of eyewitness testimony. Of the five criteria outlined by the US Supreme Court in Neil v. Biggers (1972), three are dramatically distorted by post-identification feedback: confidence, attention and view. Similarly, the Turnbull Rules in England and Wales also include two variables distorted by post-identification feedback: attention and view (R v Turnbull, 1977). Rulings from courts suggesting that these variables only be taken into account if post-identification feedback has not been administered would likely do much to decrease the incidence of feedback in real world cases. Barring reconsideration of these criteria in court recommendations, researchers should continue to press for immediate witness reports and blind line-up administration. These practices are best suited to prevent the memory distorting effects of post-identification feedback.
ACKNOWLEDGEMENTS
We thank Margaret Mandeville and Emily Parker for help with data entry.
REFERENCES
Articles included in the meta-analysis
Australian Law Reform Commission (2005). Discussion Paper 69, Review of the Uniform Evidence Acts. Retrieved January 4, 2006 from http://www.austlii.edu.au/au/other/alrc/publications/dp/69/
Bradfield, A., & McQuiston, D. E. (2004). When does evidence of eyewitness confidence inflation affect judgments in a criminal trial? Law and Human Behavior, 28(4), 369–387.
Bradfield, A., & Wells, G. L. (2005). Not the same old hindsight bias: Outcome information distorts a broad range of recollections. Memory and Cognition, 33(1), 120–130.
Bradfield, A. L., Wells, G. L., & Olson, E. A. (2002). The damaging effect of confirming feedback on the relation between eyewitness certainty and identification accuracy. Journal of Applied Psychology, 87, 112–120.
Coddington, K. A., & Brigham, J. C. (2000, March). The malleability of eyewitness metamemory judgments: The effect of question difficulty. Poster presented at the biennial meeting of the American Psychology-Law Society, New Orleans, LA.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Daubert v. Merrell Dow Pharmaceuticals, Inc., 113 S.Ct. 2786 (1993).
Davies, G. M. (1996). Mistaken identifications: Where law meets psychology head on. The Howard Journal, 35, 232–241.
Davies, G. M., & Valentine, T. (1999). Codes of practice for identification. Expert Evidence: International Journal of Behavioural Sciences in Legal Context, 7(1), 59–65.
Douglass, A. B. (see Bradfield), & McQuiston-Surrett, D. (in press). Post-identification feedback: Exploring the effects of sequential photospreads and eyewitnesses’ awareness of the identification task. Applied Cognitive Psychology. DOI: 10.1002/acp1253
Douglass, A. B. (see Bradfield), Smith, C., & Fraser-Thill, R. (2005). A problem with double-blind photospread procedures: Photospread administrators use one eyewitness’s confidence to influence the identification of another eyewitness. Law and Human Behavior, 29(5), 543–562.
Doyle, J. M. (2005). True witness: Cops, courts, science, and the battle against misidentification. New York: Palgrave Macmillan.
Fischhoff, B. (1975). Hindsight ≠ foresight.
Journal of Experimental Psychology: Human Perception & Performance, 1, 288–299.
Fischhoff, B. (1977). Perceived informativeness of facts. Journal of Experimental Psychology: Human Perception & Performance, 3, 349–358.
Fox, S. G., & Walters, H. A. (1986). The impact of general versus specific expert testimony and eyewitness confidence upon mock juror judgment. Law and Human Behavior, 10, 215–228.
Garrioch, L., & Brimacombe, C. A. E. (2001). Lineup administrators’ expectations: Their impact on eyewitness confidence. Law and Human Behavior, 25(3), 299–315.
Hafstad, G. S., Memon, A., & Logie, R. (2004). Post-identification feedback, confidence and recollections of witnessing conditions in child witnesses. Applied Cognitive Psychology, 18, 901–912.
Kassin, S. (1998). Eyewitness identification procedures: The fifth rule. Law and Human Behavior, 22, 649–654.
Klobuchar, A., Steblay, N., & Caligiuri, H. (in press). Improving eyewitness identifications: Hennepin County’s blind sequential lineup pilot project. Cardozo Public Law, Policy & Ethics Journal.
Leippe, M. R. (1994). The appraisal of eyewitness testimony. In D. F. Ross, J. D. Read, & H. P. Toglia (Eds.), Adult eyewitness testimony: Current trends and developments (pp. 385–418). New York: Cambridge University Press.
Lindsay, R. C. L., & Wells, G. L. (1985). Improving eyewitness identifications from lineups: Simultaneous versus sequential lineup presentation. Journal of Applied Psychology, 66, 79–89.
Luus, C. A. E., & Wells, G. L. (1994). The malleability of eyewitness confidence: Co-witness and perseverance effects. Journal of Applied Psychology, 79, 714–723.
Malpass, R. S., & Devine, P. G. (1981). Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 79, 714–734.
Neil v. Biggers, 409 U.S. 188 (1972).
Neuschatz, J. S., Preston, E. L., Burkett, A. D., Toglia, M. P., Lampinen, J. M., Neuschatz, J. S., Fairless, A. H., Lawson, D. S., Powers, R. A., Goodsell, C. A.
(2005). The effects of post-identification feedback and age on retrospective eyewitness memory. Applied Cognitive Psychology, 19, 435–453.
Phillips, M. R., McAuliff, B. D., Kovera, M. B., & Cutler, B. L. (1999). Double-blind photoarray administration as a safeguard against investigator bias. Journal of Applied Psychology, 84, 940–951.
R v. Turnbull. Queen’s Bench 224 (1977).
Rattner, A. (1988). Convicted but innocent: Wrongful conviction and the criminal justice system. Law and Human Behavior, 12(3), 283–293.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage.
Semmler, C., Brewer, N., & Wells, G. L. (2004). Effects of postidentification feedback on eyewitness identification and nonidentification confidence. Journal of Applied Psychology, 89(2), 334–346.
Sporer, S. L. (1993). Eyewitness identification accuracy, confidence and decision times in simultaneous and sequential lineups. Journal of Applied Psychology, 78(1), 22–33.
Steblay, N. (1997). Social influence in eyewitness recall: A meta-analytic review of lineup instruction effects. Law and Human Behavior, 21(3), 283–298.
Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2001). Eyewitness accuracy rates in sequential and simultaneous lineup presentations: A meta-analytic comparison. Law and Human Behavior, 25(5), 459.
Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2003). Eyewitness accuracy rates in showup and lineup presentations: A meta-analytic comparison. Law and Human Behavior, 27(5), 523–540.
Technical Working Group for Eyewitness Evidence. (1999). Eyewitness evidence: A guide for law enforcement. Washington, DC: National Institute of Justice; NCJ 178240.
Wells, G. L. (1978). Applied eyewitness-testimony research: System variables and estimator variables. Journal of Personality & Social Psychology, 36(12), 1546–1557.
Wells, G.
L., & Bradfield, A. L. (1998). “Good, you identified the suspect”: Feedback to eyewitnesses distorts their reports of the witnessing experience. Journal of Applied Psychology, 83(3), 360–376.
Wells, G. L., & Bradfield, A. L. (1999). Distortions in eyewitness’ recollections: Can the postidentification-feedback effect be moderated? Psychological Science, 10, 138–144.
Wells, G. L., Olson, E. A., & Charman, S. D. (2003). Distorted retrospective eyewitness reports as functions of feedback and delay. Journal of Experimental Psychology: Applied, 9, 42–52.
Wells, G. L., Rydell, S. M., & Seelau, E. P. (1993). The selection of distractors for eyewitness lineups. Journal of Applied Psychology, 78(5), 835–844.
Wells, G. L., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law and Human Behavior, 22, 603–647.

Kellner, Matthew 10/7/2015 For Educational Use Only EYEWITNESSES AND EXCLUSION, 65 Vand. L. Rev. 451 © 2015 Thomson Reuters. No claim to original U.S. Government Works.

65 Vand. L. Rev. 451
Vanderbilt Law Review
March, 2012
Articles

EYEWITNESSES AND EXCLUSION
Brandon L. Garrett a1
Copyright (c) 2012 Vanderbilt Law Review, Vanderbilt University Law School; Brandon L. Garrett

Introduction 451
I. Eyewitness Procedure and Psychology 457
A. Eyewitness Identification Procedures 457
B. From Stovall to Manson 463
1. The Supreme Court Intervenes in Eyewitness Identification: Stovall, Wade, and Gilbert 463
2. Manson and the Modern Two-Step Inquiry 467
C. Social Science Research 468
II. The Persistence of “Independent Source” Rules 476
A. “Independent Source” Rules 476
B. Crossing Two Lines of Eyewitness Decisions 483
C. What Is Independent About the Source? 485
III. A Partial Exclusion Approach 488
A. Limiting Courtroom Identifications 488
B. Rethinking State Procedure 491
Conclusion 497
Appendix 499

Introduction

The U.S. Supreme Court's due process jurisprudence regulating the eyewitness identifications used in tens of thousands of criminal cases each year is not just flawed, but backwards. The Court's highly deferential due process test uses factors that have been *452 the subject of longstanding legal and scientific criticism. 1 In this Article, I argue that the test has a different fundamental flaw. While ostensibly focused on the problem of reliability, the Court's test, as interpreted under a well-established line of cases, encourages the judge to admit the least reliable evidence: an eyewitness identification in the courtroom. In the courtroom, there is no lineup. It is all too obvious who the defendant is, sitting at counsel's table. Yet, as Justice William Brennan wrote, “[T]here is almost nothing more convincing than a live human being who takes the stand, points a finger at the defendant, and says ‘That's the one!’” 2

An irony of modern constitutional criminal procedure is that in the one area in which the Court intervened specifically to improve the reliability of trial evidence it may have permitted the opposite result. Much of constitutional criminal procedure seeks to regulate the fairness of criminal trials through procedural rights such as the right to counsel, the right to confront adverse witnesses, the right to not incriminate oneself, the right to exclude illegally obtained evidence, or the right to a determination of guilt beyond a reasonable doubt. However, issues relating to the accuracy and reliability of evidence are not usually of constitutional import and are typically left to state evidence law and the trial judge's discretion. 3
The Court's eyewitness jurisprudence is different. In its historic 1977 ruling in Manson v. Brathwaite, the Court emphasized that “reliability is the linchpin” for evaluating eyewitness identification procedures. 4 The Court adopted two approaches to regulating identifications. The first, adopted in 1967's United States v. Wade, was a typical criminal procedure approach that recognized a *453 procedural right to counsel at postindictment lineups. 5 The second, adopted in Manson, barred unduly suggestive identification techniques if the identification was also deemed unreliable.

The Court's reliability-based due process test was on a collision course with reality. When Manson was decided, social scientists had just embarked on a course of experimental research that would revolutionize our understanding of human memory. As John Monahan has observed, of “all the substantive uses [of] social science in law . . . nowhere is there a larger body of research than in the area of eyewitness identification.” 6 Social scientists showed how memory is not like a videotape, but rather is constructed in a dynamic fashion. As a result, commonly used identification procedures can distort memory and can even produce false identifications. Following the Manson test, a judge may excuse the most blatant coaching of an eyewitness by citing to “reliability” factors. Yet, social scientists showed that the factors judges use do not correspond with reliability. For example, one factor, eyewitness confidence, is not a sign of reliability, but it is highly malleable and may be the product of police suggestion. 7 Even modestly comforting feedback after the identification, like telling an eyewitness, “Good, you identified the suspect,” can make the eyewitness far more confident. 8

In the decades after Manson was decided, hundreds of individuals would be convicted based on eyewitness identifications, but these individuals would later be proven innocent by DNA testing.
Those high-profile wrongful convictions made the dangers of eyewitness misidentifications more salient than ever before. 9 In a *454 recent book, I present a study exploring the role that eyewitness evidence played in the trials of the first 250 DNA exonerees. 10 Just as social scientists would have predicted, the vast majority were convicted following suggestive identification procedures, and initially uncertain eyewitnesses became absolutely certain by the time of trial. 11 If jurors had fully appreciated how tentative those eyewitnesses were initially and the potential impact of suggestive procedures, they might not have so readily convicted the defendants.

Responding to these developments, there has been a nationwide movement to reform criminal procedure to promote greater accuracy and to prevent wrongful convictions. 12 The Supreme Court has taken note of, but has not responded to, these developments; in its recent decision in Perry v. New Hampshire, the Justices showed little interest in thinking about the due process test, much less rethinking it. 13 That case did not involve a lineup, but rather a situation in which police claimed they did not intentionally arrange a one-on-one identification. Most troubling, though, was that the majority opinion suggested, in ruling that the Manson test did not apply, that eyewitness testimony did not deserve different treatment than other forms of potentially unreliable evidence. The Court noted that “all in-court identifications” involve “some elements of suggestion,” but suggested that such “potential unreliability” does not counsel additional due process regulation.
14 On the other hand, the Court emphasized: “We do not doubt either the importance or the fallibility of eyewitness identifications.” 15 Eyewitness evidence poses a unique problem in that jurors see a seemingly powerful but suggestive in-court identification, while standard tools like cross-examination cannot show how the very memory of an eyewitness may have been altered by unsound identification procedures; Justice Sotomayor countered in dissent that suggestion impairs “meaningful cross-examination.” 16 The majority did, however, suggest that careful jury instructions and expert testimony are important “safeguards” in the *455 States. 17

Law enforcement, state courts and legislatures do not have the luxury of remaining aloof from the problem, since they confront the consequences of eyewitness misidentifications first-hand. Many have improved their identification procedures. 18 Some states and many more local jurisdictions have adopted double-blind lineups, which psychologists have long recommended. 19 In a double-blind lineup, the officer does not know which person is the suspect, and the eyewitness is told that the officer does not know. That simple procedure can effectively prevent suggestion from contaminating identifications. An important field study has also now confirmed the advantage of conducting identification procedures in a sequential fashion so that images are shown to the subject one at a time. 20

However, the Court's interventions, intended to modestly improve accuracy, may now perversely undermine such reforms. What happens if the police do not follow best practices, and the police fail to use, say, a double-blind lineup? At that point, the question for a judge is whether to exclude that identification or to admit it.
Yet, a judge may exclude the prior identification but still allow the eyewitness to identify the defendant in the courtroom. As I will develop, state courts have used “independent source” rules to allow courtroom identifications despite inadmissible out-of-court identifications by the same eyewitness. These rules have gone largely unnoticed by scholars but have been adopted almost universally. 21 In *456 no uncertain terms, judges explain that they allow the courtroom identification because of the “independent” memory that the eyewitness supposedly has from the time of the crime. This pernicious doctrine of “independent source” came from a conflation of the two separate strands in the Court's eyewitness identification jurisprudence. It was misappropriated from Sixth Amendment rulings on a right to counsel at a preindictment lineup. Indeed, the doctrine had its origins in Fourth Amendment search and seizure law. 22 The concept of an independent source then found its way into cases ostensibly dealing with the substantive issue of the reliability of an identification. Had lower courts properly understood the Supreme Court's due process cases, to say nothing of the social science research on eyewitness memory, they would not so liberally allow courtroom identifications. By contrast, evidence law has, in its way, long recognized that the drama of a courtroom identification should not supplant prior identification procedures intended to test an eyewitness's memory. Courts have permitted the introduction of out-of-court identifications as a special hearsay exception precisely because they are far more reliable than courtroom identifications that may just confirm what came before. 23 From not only a social science perspective, but also an evidence law perspective, the regulation of eyewitness identifications has it backwards. 
At a time when the Supreme Court has eroded the strength of the exclusionary rule for procedural violations, in particular search and seizure violations, 24 and has held reliability is not of due process concern if police did not “arrange” an eyewitness identification, 25 we should reconsider the path not taken: exclusionary rules to promote substantive reliability. In the eyewitness context, criminal procedure rules could be revisited to reverse the focus of exclusion. I propose a partial exclusion approach. Exclusion is a blunt instrument. Judges *457 are understandably reluctant to completely exclude the testimony of a key eyewitness, perhaps the victim of a serious crime. That evidence may be crucial to maintaining a criminal prosecution. Today courts almost always allow courtroom identifications, but they sometimes bar prior identifications. Instead, courts should per se exclude courtroom identifications if there was a prior identification, but they should sometimes admit out-of-court identifications. The result will encourage greater attention to procedures used out-of-court, when the eyewitness's memory was most fresh, reliable, and accurate. I am not sanguine that this change will occur, given careless judicial rulings and, as a result, limited incentives of defense counsel to properly litigate these issues. However, improved eyewitness procedures are increasingly required by state courts and statutes. Directing my observations to criminal procedure reformers, I argue that courtroom identifications following prior identifications should be per se excluded. 26 More broadly, eyewitness identification testimony should be regulated by factors informed by social science. First and foremost, jurisdictions should ensure that proper identification procedures are conducted in the first instance. 
Second, they should task judges with evaluating reliability of the evidence at hearings pretrial based on a social science framework and not the Manson test, and then at trial, if identification evidence is admitted, providing detailed instructions to educate jurors. Social scientists have for some time outlined best practices for conducting sound identification procedures, and as a second step, the Henderson decision in New Jersey provides an early model for a framework to govern the use of eyewitness evidence in court. 27 Finally, I suggest that an accuracy-oriented approach to regulation of criminal trial evidence has broader applications for criminal procedure and for future scholarship.

I. Eyewitness Procedure and Psychology

A. Eyewitness Identification Procedures

Each year as many as 80,000 eyewitnesses (and perhaps many more) make identifications of suspects in criminal investigations. We do not have adequate information about how many eyewitnesses make identifications of suspects, how many do not, and what happens in the *458 cases where eyewitnesses do make identifications. 28 Eyewitnesses can be crucial evidence of guilt in robbery, assault, rape, and other commonly prosecuted offenses. How do police determine whether an eyewitness can identify a culprit? Police know full well that eyewitness memory is fallible, just as judges, lawyers, and social scientists have long known this fact. 29 Police try to test the eyewitness's memory.

Police use a range of techniques. If a suspect is found shortly after the crime, police may present that suspect to the eyewitness directly. Such a one-on-one procedure, called a showup, is inherently suggestive.
Police may use such a procedure only in the hours immediately following an incident, in order to quickly identify the perpetrator or rule out the suspect and continue their investigation. 30 Showups are particularly risky for the police. Because there are no fillers, or other known-innocent people included in addition to the suspect, a mistake is more likely to result in the witness identifying an innocent person as the guilty party. And if the eyewitness is unsure, there is a greater risk that a guilty person might not be identified. 31 If police do not immediately locate a suspect, they may try to show an eyewitness books or computerized collections of mug shots. If *459 that also fails, police may ask the witness to work with a police sketch artist or with a computer program to generate a composite image that can be used in “wanted” postings. When police eventually locate a suspect, they conduct an identification procedure to test the eyewitness's memory. In a live lineup, a suspect stands in a row of “filler” individuals and the witness looks at the group from behind one-way glass. In the past few decades, police have mostly stopped using live lineups because it is so difficult and time-consuming to find people who look similar to a suspect. Instead, they use photo arrays, typically a standard set of six photos (called a “six-pack”). 32 Procedures for creating photo arrays and conducting lineups were traditionally passed on by senior officers through word of mouth. Although police departments have detailed procedures, manuals, and training on a host of subjects--ranging from traffic stops to use of force--many, if not most, police departments still do not have any written procedures or formal training on how to conduct lineups or photo arrays. Perhaps, however, this is starting to change in reaction to high-profile eyewitness misidentifications. 
33 Unfortunately, archival *460 studies also suggest that unnecessary showups are quite common together with other flawed identification procedures. 34

If there is a trial, identifications may occur in court. The courtroom identification is obviously highly suggestive. The defendant is sitting at the counsel's table, perhaps in prison clothing. There are no fillers and there is no lineup. And the identification may follow emotionally charged testimony by the victim describing a crime--a victim who, in the conclusion of the testimony, points out the culprit to the jury. 35 The courtroom identification may simply serve to confirm what came before. The procedures that came before may have been suggestive or shoddy. The eyewitness may have previously been uncertain. But in court, the eyewitness may appear supremely confident and will have no trouble picking out the defendant and pointing him out to the jury. As the Tenth Circuit has explained:

Because the jurors are not present to observe the pretrial identification, they are not able to observe the witness making that initial identification. The certainty or hesitation of the witness when making the identification, the witness's facial expressions, voice inflection, body language, and the other normal observations one makes in everyday life when judging the reliability of a person's statements, are not available to the jury during this pretrial proceeding. There is a danger that the identification in court may only be a confirmation of the earlier identification, with much greater certainty expressed in court than initially. 36

Judges could strictly regulate courtroom identifications in several ways. First, they could insist that police conduct a proper lineup before trial, out of the courtroom.
Police can demand that a defendant participate in a lineup. However, judges are often reluctant to order police to conduct a lineup when police or prosecutors decline to do so, citing to the absence of a constitutional right to a lineup. 37 *461 Second, judges could also require the use of a lineup in the courtroom in order to make the courtroom identification a more meaningful test of the eyewitness's memory. Courts rarely require the use of such “special procedures,” however, though they are more willing to do so where the eyewitness had never been asked to view a lineup before trial. 38 Courts generally reject arguments that in-court identifications are inherently suggestive. 39 Judges may reject requests by defense lawyers to order a double-blind lineup. 40 Judges may view the courtroom identification as pure theater or a witness demonstration, but, as we will see, they also seem to think that the presence of counsel and the solemnity of testimony under oath in a courtroom makes the courtroom identification more, not less, reliable. Acting on their own initiative, defense lawyers have sometimes tried to make the courtroom identification a real memory test. Enterprising defense lawyers have seated people who looked like the defendant next to them or have seated the defendant out in the courtroom. Under these circumstances, eyewitnesses were unable to identify the defendant. 41 Judges have responded harshly. The Second Circuit called substituting the position of the defendant without permission of the judge a “trick” that could be subject to bar discipline. 42 The Ninth Circuit approved a criminal contempt *462 conviction for doing so, calling the conduct “unprofessional” but also an “actual obstruction of justice.” Several state courts have followed suit. 43 In contrast, certain evidentiary rules recognize the inherent limitations of courtroom identifications preceded by prior identifications. 
For routine identifications of documents or acquaintances, there would be no reason to have tested the witness's memory using a lineup. However, for stranger identifications, police will typically have conducted a prior identification to test the witness's memory. Those prior identifications will generally be admissible. This is because the Federal Rules of Evidence recognize prior identifications as a special hearsay exception, for the reason that they are understood to be far more reliable than courtroom identifications. The Advisory Committee Notes to Federal Rule of Evidence 801(d)(1) explain, “The basis [for the hearsay exception] is the generally unsatisfactory and inconclusive nature of courtroom identifications as compared with those made at an earlier time under less suggestive conditions.” 44 While traditionally such out-of-court prior statements were treated as hearsay, the modern rule is to admit them, and nearly all states that previously did not admit them changed their rules in response to the 1975 federal revisions. 45

The Senate Report (“the Report”) noted three reasons supporting the modern rule. First, the Report repeated the reliability concern cited by the Advisory Committee: “Since these identifications take place reasonably soon after an offense has been committed, the witness'[s] observations are still fresh in his mind. The identification occurs before his recollection has been dimmed by the passage of *463 time.” 46 The Report also explained that suggestion could “influence the witness to change his mind” between the time of the earlier identification and trial.
47 Finally, the Report noted a strategic concern that “if any discrepancy occurs between the witness'[s] in-court and out-of-court testimony, the opportunity is available to probe, with the witness under oath, the reasons for that discrepancy so that the trier of fact might determine which statement is to be believed.” 48 Unless the prior identification is admissible, the defense attorney has no means to explore how the eyewitness came to identify the defendant at trial.

Perhaps because courtroom identification procedures are so thinly regulated, very little scholarship has examined the special problems that courtroom identifications raise or the way that these problems undermine the jurisprudence of eyewitness identifications. Evan Mandery has argued that courtroom identifications should be per se excluded and certainly should not be treated more deferentially than out-of-court identifications, and I agree. 49 As I will argue, the problem runs deeper. We cannot understand due process rules surrounding eyewitness identification procedures apart from the problem of courtroom identifications. In a case that goes to trial, there may be both prior lineups and a courtroom identification. There may even be a courtroom identification at a preliminary hearing and another at trial before the jury. (In the vast majority of cases that are resolved by a guilty plea, there may sometimes be multiple identification procedures conducted, but admissibility issues do not arise.) As I will describe, over time, the Court's jurisprudence failed to differentiate those multiple identifications, and lower courts have since exacerbated the problem. Next, I try to untangle those rulings.

B. From Stovall to Manson

1.
The Supreme Court Intervenes in Eyewitness Identification: Stovall, Wade, and Gilbert The Supreme Court has long recognized “[t]he vagaries of eyewitness identification” where “the annals of criminal law are rife *464 with instances of mistaken identification.” 50 The Court added that “a major factor contributing to the high incidence of miscarriage of justice from mistaken identification has been the degree of suggestion inherent in the manner in which the prosecution presents the suspect to witnesses for pretrial identification.” 51 In a trilogy of decisions announced in 1967, the Court began to regulate eyewitness identifications to help avert misidentifications. At the time, the Court's intervention looked like the beginnings of a new approach toward regulating the reliability of trial evidence. The Court adopted the following two different approaches to the problem: a Sixth Amendment right-to-counsel approach and a due process approach. In each of the two lines of cases the Court had to reckon with the problems posed by courtroom identifications. In Stovall v. Denno, the Court examined a showup procedure in which the suspect was taken to the hospital where a victim was recovering, was presented to the victim alone, and handcuffed to police officers. 52 Though noting a showup is inherently suggestive, and for that reason the procedure has been “widely condemned,” the Court acknowledged that showups may sometimes be necessary in exigent circumstances. However, the Court then held that an “unnecessarily suggestive” procedure that is “conducive to irreparable mistaken identification” denies due process of law and results in exclusion of the identification from the jury. This was new. Prior to Stovall, any police use of suggestion was just evidence for the jury to weigh when assessing the weight of the eyewitness identification. 
53 In two other cases the Court also discussed police suggestion, but it adopted a different approach to the problem, one that recognized a right to counsel at a lineup procedure. In United States v. Wade, the Court held that, once indicted, an accused has a right to a lawyer present at a lineup. As a result, any lineup lacking counsel must be excluded and not introduced into evidence at trial. 54 However, the prosecutors would have “the opportunity to establish by clear and convincing evidence that the in-court identifications were based upon observations of the suspect other than the lineup identification.” 55 *465 The Court concluded that unless the in-court identification might also be suppressed, a rule suppressing the out-of-court identification would serve little purpose: The State may then rest upon the witnesses' unequivocal courtroom identifications, and not mention the pretrial identification as part of the State's case at trial. Counsel is then in the predicament in which Wade's counsel found himself--realizing that possible unfairness at the lineup may be the sole means of attack upon the unequivocal courtroom identification, and having to probe in the dark in an attempt to discover and reveal unfairness, while bolstering the government witness'[s] courtroom identification by bringing out and dwelling upon his prior identification. 56 Thus, the Court recognized that the courtroom identification is less reliable than prior identifications. Further, to the extent that the prior identifications are suggestive or unreliable, the only way to bring that out is to admit them. The Court, having identified the central problem, did not suggest a clear solution, which would have been to exclude the “unequivocal” courtroom identification while permitting litigation of prior identifications.
Instead, the Court held that a judge must examine several factors to decide whether to allow the courtroom identification, including the following: “the prior opportunity to observe the alleged criminal act,” any “discrepancy between any pre-lineup description and the defendant's actual description,” any prior identifications or failures to identify the defendant, and “the lapse of time between the alleged act and the lineup identification.” 57 By examining those flexible factors, on remand the lower court can decide whether the in-court identification had an “independent origin.” 58 Of course, if a judge decides that, based on those factors, the courtroom identification has an “independent origin,” then an illegal pretrial identification may be suppressed (although the defendant may choose to introduce it at trial), but the judge may allow the courtroom identification that would clearly be affected by what went on before. How can a courtroom identification be independent? The Court noted in Wade that “the accused's conviction may rest on a courtroom identification [that is] in fact the fruit of a suspect pretrial identification which the accused is helpless to subject to effective scrutiny at trial.” 59 There are stronger arguments that an identification could have an “independent origin” in court if the pretrial identification was not suggestive. After all, the lineups in Wade, by the Court's account, were conducted properly, with five to six *466 fillers all dressed with strips of tape similar to that worn by the bank robber. 60 The defect was the procedural failure to provide counsel. In Gilbert v. California, the Court similarly dealt with whether a courtroom identification could take place. 61 A series of suggestive identifications took place postindictment and without counsel; over one hundred witnesses viewed the same lineup at the same time in a large auditorium and everyone discussed their identifications. 
62 The Court remanded and ordered the state court to determine whether such an identification, conducted without counsel in violation of Wade, had an independent source. The Court explained, “The admission of the in-court identifications without first determining that they were not tainted by the illegal lineup but were of independent origin was constitutional error.” 63 Thus, the Court established a rule that an “independent” basis could result in the admission of the in-court identifications, even, in theory, following suggestive prior lineups. The Court has repeatedly reaffirmed this ruling. 64 The Wade/Gilbert rule is of limited significance today. After all, having the right to a lawyer present at a lineup is not a significant protection. Other right-to-counsel protections are far more consequential. Suspects who invoke their Miranda rights and obtain an attorney can cut off an interrogation that might have otherwise resulted in a confession. In contrast, having a lawyer present at a lineup will not prevent the lineup from occurring. At best, it may discourage police from making any obviously suggestive cues during the lineup itself, though with the cost of potentially turning the lawyer into a trial witness disqualified from further representation. 65 Nor does the rule do any work in the vast majority of cases involving eyewitnesses. That is because the Court has repeatedly weakened the *467 rule--in part, by subsequently holding that there is no right to counsel for a photo array. 66 The vast majority of identifications are not live but are now chiefly conducted using photo arrays. 67 2. Manson and the Modern Two-Step Inquiry In decisions immediately following the 1967 trilogy, the Court indicated that an identification should be suppressed if the police engage in egregious suggestion.
In Foster v. California, the Court ruled that, because a “tentative” witness only became sure after repeated suggestive showups and lineups, all identifications should be suppressed; “[i]n effect, the police repeatedly said to the witness, ‘This is the man.”’ 68 For a short time, the Wade/Gilbert line of cases began to converge with the Stovall line. In Simmons v. United States the Court held that, when deciding whether to allow a courtroom identification following a suggestive pretrial identification, the judge should examine whether the earlier identification was “so impermissibly suggestive as to give rise to a very substantial likelihood of irreparable misidentification.” 69 As Justice Marshall later explained, The inquiry mandated by Simmons is similar to the independent-source test used in Wade where an in-court identification is sought following an uncounseled lineup. In both cases, the issue is whether the witness is identifying the defendant solely on the basis of his memory of events at the time of the crime, or whether he is merely remembering the person he picked out in a pretrial procedure. 70 However, the cases then diverged as the Court took a different tack in due process cases not raising Sixth Amendment right-to- counsel violations. The Court became concerned that a rule excluding out-of-court identifications that resulted from unnecessary suggestion would lead to the exclusion of reliable eyewitness evidence. The Court proposed a new two-step inquiry in Neil v. Biggers, 71 which was then adopted in 1977 by the Court in Manson. 72 That Manson test is the current due process test that courts must follow. 
*468 First, following the Manson test, a court asks whether the procedure used was “unnecessarily suggestive.” 73 Then the court asks whether the identification was nevertheless “reliable.” 74 A judge has broad discretion to evaluate the record and decide whether there is evidence that the identification is “reliable.” 75 The Biggers factors adopted by the Court in Manson include: (1) the eyewitness's opportunity to view at the time of the crime itself; (2) the eyewitness's degree of attention; (3) the accuracy of the description that the eyewitness gave of the criminal; (4) the eyewitness's level of certainty at the time of the identification procedure; and (5) the length of time between the crime and the identification procedure. 76 This due process test is somewhat different than the Wade test. If there is a right-to-counsel violation at the lineup, under Wade, a court asks whether the courtroom identification has an “independent source” and examines factors relating to reliability. If instead there was suggestion at the lineup, the court more directly looks at whether the identification is reliable. The Manson “reliability” factors are slightly different than the nonexclusive list in Wade. The main addition that the Manson Court made to the Wade factors was the fourth factor--the certainty of the eyewitness. Adding that factor was a significant misstep, however, as psychologists would convincingly show over the next three decades. C. Social Science Research The Manson Court emphasized that “reliability is the linchpin in determining the admissibility of identification testimony.” 77 In the decades since the Court settled on its due process test, however, social scientists have shown just how unhelpful and
flawed each of the Manson factors is for evaluating the reliability of an identification. Their findings demonstrated just how susceptible eyewitness memory is to cues or suggestions, intended or not, by the administrator of a lineup. Eyewitness identifications are designed to be a test of a witness's memory. Pioneering psychologists Elizabeth Loftus and Gary Wells, followed by many others, realized beginning in the late 1970s that eyewitness memory can be tested in lab experiments. A *469 now vast body of social science research has demonstrated that most of the five Manson “reliability” factors do not correlate at all with the reliability of an eyewitness's identification. 78 One factor--the passage of time from the crime to the identification--strongly affects reliability; however, the effects are so pronounced in the immediate hours and days following the crime that judges would have to exclude a large number of identifications if they emphasized that factor. 79 In contrast, the seemingly objective factor, the ability of a person to describe another accurately, is not correlated one way or another with reliability. 80 The remaining factors are particularly crucial to the analysis--and they are deeply flawed. The certainty of an eyewitness, the opportunity of a witness to view the attacker, and the degree of attention paid by the eyewitness are not independent measures of reliability. Instead, the procedures police use affect the so-called reliability factors. A series of studies has shown that jurors rely strongly on the confidence of the eyewitness. 81 Yet, confidence is not highly correlated with accuracy. The correlation is highly variable. In fact, a mistaken eyewitness may appear particularly confident. Why? A factor that strongly affects confidence is suggestion by the administrator. Expectations of the administrator affect the confidence of the eyewitness even if the suggestion is unconscious.
The eyewitness may perceive cues that the police never intended to convey. That is why social scientists have long recommended that police administer double-blind lineups where the police officer does not know who is the *470 suspect, and the eyewitness knows that the officer does not know. 82 The way that police construct the lineup can enhance confidence. If police stack the lineup so that one photo stands out, the eyewitness not only will be more likely to identify the person highlighted, but the eyewitness will predictably be more certain. 83 Feedback or reinforcement after the identification can also have a dramatic effect on confidence. If police say, “Good job, you picked the right one,” then the eyewitness will tend to be far more certain. If police tell the eyewitness that a suspect had been arrested and would be present in the lineup, the eyewitness will likewise tend to be far more certain. 84 Finally, studies suggest that repeated identification procedures create an enhanced risk that a witness will identify an innocent suspect. 85 Even permitting more than one “lap” or viewing of a photo array increases the risk of errors. 86 Likewise, routine preparation for trial, or even the suggestion that an eyewitness will later be cross-examined concerning an identification, has the effect of making an eyewitness more certain. 87 The two prongs of the Manson test can undermine each other. Suggestion does not just make an uncertain eyewitness feel more confident, but it affects all of the other factors that the Supreme Court included in the Manson test. Memory is malleable. Suggestion will *471 affect the details that an eyewitness remembers. 88 The eyewitness may recall having seen the culprit for a longer period of time and will recall having had a better look at the culprit. 89 The five Manson factors poorly assess “reliability.” They are circular, and highlight the very features of eyewitness memory that may be most profoundly affected by suggestion. 
Yet, a court may excuse serious police suggestion by saying that an eyewitness identification is nonetheless “reliable.” Still more problematic, in the situation where there are multiple eyewitness identifications, a court may allow a courtroom identification despite an earlier suggestive identification. The jury then sees the now-confident eyewitness in court pointing at the defendant. As Gary Wells puts it, “[E]yewitness identification evidence is among the least reliable forms of evidence and yet is persuasive to juries.” 90 One reason is that the jury does not see what occurred before. The earlier lineups may not even have been documented. The jury will instead hear the eyewitness describe what he saw. In the typical situation in which the eyewitness is a victim, the jury will hear the details of a stressful, if not frightening, encounter. The eyewitness may then briefly recount the photo arrays or lineups, but she may not remember the details of those procedures. The eyewitness then will be asked how sure she is that the defendant is the culprit. Finally, the eyewitness will point out the defendant in the courtroom. The courtroom identification that the jurors see will be more dramatic, and may be made with more confidence, than the identifications that came before the trial. Further, the trial setting is inherently suggestive, as well as public. While there have not been field studies of courtroom identifications, there is every reason to think that in a courtroom setting “conformity is at its peak” since “pressure is high and . . .
judgments are made without anonymity.” 91 Despite this now vast body of social science evidence, the Court has not reconsidered its test; has denied certiorari petitions asking that the test be revisited in light of social science research; and, as I will develop, has not intervened when states adopted standards that *472 carelessly apply if not distort the Manson test. 92 Justice Sotomayor, dissenting in Perry, argued that concerns with the adequacy of the due process rule “should have deepened” based on a “vast body of scientific literature” and concluded that “[i]t would be one thing if the passage of time had cast doubt on the empirical premises of our precedents. But just the opposite has happened.” 93 Although we now know far more about the sources of eyewitness unreliability, long before social scientists began investigating eyewitness misidentification it was, as Samuel Gross put it, “an old and famous problem.” 94 Police are most familiar with the problem because it significantly affects their investigations. In actual police lineups, eyewitnesses choose known innocent fillers an average of thirty percent of the time, according to available archival and field studies. 95 Those common misidentifications are of less consequence because it is obvious there is an error when a filler is picked. However, they harm police investigations since an eyewitness who has selected a filler may be “burned,” or viewed by jurors with more suspicion should that eyewitness later identify a true suspect. If an eyewitness picks an innocent suspect and not a filler, the consequences may be more serious. DNA exonerations have raised the profile of eyewitness errors in cases that went further, resulting in convictions overturned only years later through DNA testing. In Convicting the Innocent, I examine the trial transcripts of the first 250 DNA exonerees. 96 Eyewitnesses misidentified seventy-six percent of the exonerees (190 of 250 cases). 
97 I expected to see a large body of eyewitness misidentifications in these cases. After all, DNA testing is most readily used to exonerate individuals convicted of rape, and such cases often involve a victim eyewitness. However, when I began studying those unusual trials, I feared that I would not be able to say very much about the eyewitness misidentifications. After all, we *473 do not often have records of what transpired during the identification procedures; police usually do not document them. Yet, the trial records alone told a troubling story. In the vast majority of those cases, seemingly powerful eyewitness testimony was flawed. One high-profile case provides an example of how the lack of regulation of in-court identifications affects the use of eyewitness identification procedures. 98 Neil Miller was facing charges of aggravated rape in Massachusetts in 1990. Someone raped and robbed the victim in her apartment. Miller's defense was one of mistaken identification. He maintained that he had never met the victim nor been to her apartment. 99 Neil Miller's defense attorney was concerned that photo arrays had been conducted in a suggestive manner. About a month after the attack, the detective brought an array of nine photos for the victim to view. From that array, she selected two photos, but was not sure if she could pick out either individual as the attacker. 100 One of the two was a six-year-old photo of Neil Miller taken when he was only sixteen. The second was of another man. The detective recalled instructing her, “[I]f she had a first impression, that the best thing to do was go with her first impression.” 101 The victim then identified Neil Miller's photo, and Miller was arrested.
102 A second array was conducted two months later, with a more recent photo of Miller, and the victim picked his photo. 103 Neil Miller's defense lawyer then made a motion to request a new photo array at the upcoming pretrial hearing. 104 However, just before the hearing was to take place, the prosecutor and a detective walked the victim past Neil Miller in the hallway outside the courtroom. 105 Even after having been told that her attacker might be in that hallway, when she saw him there, she was not sure. She followed Miller into the courtroom (where it was obvious who he was), looked at him again, and said, “This is him.” 106 Now the hearing to request a new lineup was in effect moot. Even if the judge ordered that a new photo array be conducted, due to both of *474 the prior suggestive procedures, the victim would likely again pick out Miller and then testify with confidence before the jury. The judge granted the defendant's motion to suppress the identification the morning of the hearing and ruled that the jury could not hear about it. However, the judge was still willing to let the jury hear about the first identification from the photo array where the police officer made the suggestions; of course the defense would want to bring out that conduct. Further, the judge let the jury observe the victim on the stand identifying Miller. 107 The judge ruled that the courtroom identification had an “independent basis” based on the witness's original view of the perpetrator. 108 Even though the victim was initially not sure that Miller was the right man, the jury saw the victim identify him in court and say she was “positive” he was the attacker. 109 There was no other evidence at trial, aside from some very limited forensics. 110 The jury convicted Miller and sentenced him to twenty-six to forty-five years in prison. 111 Neil Miller was an innocent man. He was exonerated by postconviction DNA testing in 2000 after serving almost ten years in prison.
Moreover, the DNA tests matched another man. 112 The testimony in Miller's case and in the other 249 cases illustrates how police suggestion can increase the confidence of eyewitnesses, even if they are wrong. Courts readily admitted those identifications, despite sometimes glaring evidence of suggestion or unreliability. All but a handful of the eyewitnesses who we now know misidentified innocent people were certain at the time of trial. For example, an eyewitness in Steven Avery's case testified, “[T]here is absolutely no question in my mind.” 113 In Thomas Doswell's case, the victim testified, “This is the man or it is his twin brother” and “That is one face I will never forget . . . .” 114 In Dean Cage's case, the victim *475 was “a hundred percent sure.” 115 In Willie Otis “Pete” Williams's case, the victim said she was “one hundred and twenty” percent sure. 116 What explains the false confidence of those eyewitnesses? In seventy- eight percent of those trials (125 of the 161 cases involving eyewitnesses in which trial records could be located), there was evidence that police contaminated the identifications. Many of those eyewitnesses were asked to pick out the suspect using suggestive methods long known to increase risks of error. Police made remarks that indicated who should be selected, used unnecessary showups, or used lineups that made the defendant stand out. Suggestion is related to the second problem, that of false certainty. In fifty-seven percent of the trials studied (92 of 161 cases), witnesses reported they had not been certain at the earlier identifications, or identified other people. These high-profile wrongful convictions have made more salient what criminal practitioners, judges, and social scientists have known for years-- eyewitness memory is malleable and can be strongly affected by police suggestion. The Supreme Court's due process cases acknowledge a problem but offer no solution. Nor is the Court likely to reform its due process test. 
If anything, the majority in Perry v. New Hampshire expressed a view that the application of that due process test should be narrowed to avoid regulating all eyewitness identifications, despite the “fallibility” of eyewitness evidence. 117 The Court in making that point even noted the problem of courtroom identifications, stating: “Most eyewitness identifications involve some element of suggestion. Indeed, all in-court identifications do.” 118 Of course, the suggestion inherent in such procedures should cause the Justices to consider whether jurors are in an adequate position to assess the reliability of that evidence. The Court's mention of courtroom identifications and unwillingness to question their use is symptomatic of a larger problem. Justice Sotomayor in dissent highlighted how: “At trial, an eyewitness'[s] artificially inflated confidence in an identification's accuracy complicates the jury's task of assessing witness credibility and reliability. It also impairs the defendant's ability to attack the eyewitness'[s] credibility.” 119 Still more problematic, as I discuss next, state and federal courts interpret the Court's rulings to provide nearly unfettered use of the most problematic courtroom identifications. *476 II. The Persistence of “Independent Source” Rules Social scientists who carefully studied flaws in the Supreme Court's due process test for admissibility of eyewitness identifications may have taken the letter of the law too seriously. They have studied each of the factors in the Manson test and have critiqued their reliability with the assumption that courts actually follow the test as promulgated by the Court. One cannot blame anyone for assuming that lower courts would follow the precise language of a Supreme Court ruling.
However, in practice, courts do not, and to a surprising degree. Not only is the Manson test flawed because of its focus on “reliability” factors that are not independent of police suggestion, but in practice the test is often not carefully applied, particularly to courtroom identifications. Commentators have observed that, in general, judges apply the Manson test very deferentially if not carelessly. 120 After all, the factors are quite flexible, and they excuse even extreme and unnecessary police suggestion based on flimsy evidence of “reliability” under a totality of the circumstances test. Appellate judges defer to trial judge discretion in applying those five broad factors. There is still another defect in the case law. A crucial but largely unnoticed loophole can short-circuit the Manson inquiry in the most pressing situation where the identification procedure conducted before trial was suggestive--independent source rules. A. “Independent Source” Rules Even after a suggestive pretrial identification procedure, courts still permit a courtroom identification by citing to its “independent” source or “independent” reliability. That courtroom identification may be pretrial, in which case it may shape what the eyewitness says at trial. Or, that courtroom identification may occur at trial. As noted, courts hold that the Due Process Clause does not forbid courtroom identifications, despite their inherent suggestiveness. 121 They cite to *477 the “supervision” provided by the trial judge to ensure an “impartial” identification in court. 122 Nor do courts typically require special procedures to test eyewitness memory in the courtroom. Still more troubling, courts adopt a permissive approach to allowing courtroom identifications despite prior suggestive or illegal identifications. 
The vast majority of state courts, when applying the Due Process Clause, rule that an identification, and particularly a courtroom identification, may be allowed even where a prior identification might be suppressed, citing to its “independent source.” 123 This is not casual language adopted by outlier jurisdictions. Rather, this language is adopted by courts of thirty-eight states and the District of Columbia, with six more states adopting similar language and three states with mixed rulings. Nor do state courts appear to revisit their leading rulings on the problem of eyewitness identifications frequently, perhaps because the U.S. Supreme Court has not revisited the problem either. To be sure, most of those written decisions on appeal did not confront the situation where the prior identification was in fact suppressed, but the courtroom identification was permitted. 124 It is very rare for a court to suppress identifications, because the Manson test is already so deferential. However, not only did several states explicitly allow the courtroom identification while excluding the prior identifications, but the others describe how they need not examine whether the prior identifications are suggestive. They assume, for the sake of argument, that the prior identifications could be excluded but emphasize how the courtroom identification would be allowed. After *478 all, the “independent source” language is designed to permit a separate inquiry for a courtroom identification that would give trial courts considerable leeway to find an “independent source” for the courtroom identification, regardless of whether the prior identifications were suggestive and might be excluded.
Thus, the Pennsylvania Supreme Court asked “whether there exists a basis for identification which is independent of the allegedly suggestive showup.” 125 How could there be such a basis? The same eyewitness is testifying at trial with a memory affected by the showup. Similarly, a South Carolina appellate court noted that “[t]he in-court identification is admissible if based on information independent of the out-of-court procedure.” 126 What “information” does that eyewitness have that is “independent” where nothing has transpired except that the eyewitness is now confronted with the same person in a courtroom setting? The Virginia Supreme Court found “that the in-court identifications had independent sources free from taint” but made explicit that the supposedly “independent” information was just “the ample opportunities the victims availed themselves of” to observe the attacker. 127 The North Carolina Supreme Court explained that it need not inquire whether pretrial suggestion tainted in-court identifications because “the trial judge concluded that the witnesses' in-court identifications of defendant were of ‘independent origin, based solely upon what the witnesses saw at the time of the crime.”’ 128 Also remarkable, the Alaska Supreme Court held that the trial court need not inquire into a suggestive identification at a preliminary hearing because the courtroom identification at trial had an “independent source” from the earlier in-court identification. 129 In one final example, the Supreme Court of Kansas stated, “A reliable in-court identification will stand on its own regardless of whether it was preceded by a deficient pretrial identification.” 130 How can it stand on its own when that same eyewitness was subjected to suggestive pretrial procedures? The list of such holdings goes on and on, as the Appendix details, providing examples of leading and typical rulings from each *479 state. 
In addition to the thirty-eight states adopting independent source rules, six more states and some federal courts discuss “independent reliability” of an identification (but federal courts otherwise follow the proper due process test and do not cite to “independent source” outside of the Sixth Amendment context). 131 In doing so, those courts follow Manson but use the word “independent” to refer to the “reliability” factors or to highlight how a courtroom identification may be considered reliable despite what came before. 132 Only five states adopt no language suggesting a different standard for courtroom identifications. 133 At least one more state adopts the correct *480 language in some decisions but adopts “independent source” language in others. 134 Very few judges recognize that the Manson/Biggers test has superseded such inquiries into “independent source.” 135 Most of these courts, if they provide an explanation of what it means to ask whether a suggestive identification has an “independent source” or “independent reliability,” follow a “totality of the circumstances” inquiry. They may then follow the correct Manson test in form, but only by ignoring the effect of the prior identifications on the courtroom identification. 136 To be sure, these judges are not applying a standard that is formally more demanding than the Manson test (despite language that the prosecution has the burden to show an independent source by “clear and convincing evidence”). 137 On *481 appeal or postconviction, judges defer to the trial court's exercise of discretion and accept trial court factual findings. As a result, appellate or postconviction judges do not typically explain their analysis with much detail or rigor. They may simply note, after citing to an “independent source,” that the identifications appear reliable under the circumstances, again without considering the impact of prior identifications on the courtroom identification. 
Courts do not actually insist on some independent source in the sense of a truly independent event that created a more reliable identification. Situations like that can occur. For example, the fact that an eyewitness had already been well acquainted with the suspect could be evidence of greater reliability that is truly “independent” of any suggestion at the police lineup. Some courts do treat identifications by acquaintances differently, although the question of how much familiarity should suffice to assure greater reliability poses complex practical and theoretical problems. 138 In a different sense, an *482 identification that is not the product of police suggestion, but of the eyewitness's own actions, like viewing a photo of the defendant in a newspaper or a yearbook, might be seen as “independent” of police efforts to test the eyewitness's memory, although such identifications might nevertheless be unreliable. 139 While these decisions occur in the context of deferential appellate and postconviction review, what is troubling about the decisions is the notion, implicitly rejected by Manson, that there is something “independent” about a courtroom identification. They do not say that evidence of reliability overcomes or excuses suggestion but that they have independent access to the reliability of the witness. Courts speak of the witnesses' “independent recollection” of the culprit's appearance. Indeed, the problem extends beyond the question of admissibility. Trial judges even instruct jurors on such a standard in some states, when explaining what weight they should give to a courtroom identification. 140 *483 B.
Crossing Two Lines of Eyewitness Decisions This independent source concept arises from a confusion of the two lines of Supreme Court eyewitness identification cases that developed in the late 1960s through the 1970s. Courts may simply be befuddled by the tangled case law leading up to the Manson decision, in which different standards applied for admitting an in-court versus an out-of-court identification. Some of those courts, as noted, hearken back to the Wade/Gilbert line of cases and still cite to the “independent source” doctrine, in which even if a judge concludes that an identification was illegal, the judge may allow an in-court identification. That doctrine now ostensibly only applies to Sixth Amendment violations of the right to counsel at postindictment lineups. Recall that Simmons began to make such a distinction in the due process and police suggestion context, but the Court undid that distinction in Manson by ruling that the standard for any identification is whether it is “reliable” despite any police suggestion. The independent source concept used in Wade and Gilbert came from an unlikely and inapposite source--exclusionary rule doctrine. In the search and seizure context, an illegal arrest may lead to a search that uncovers valuable evidence. Courts may exclude all of the evidence as “fruit of the poisonous tree.” 141 However, there are three exceptions to the “fruit of the poisonous tree” implication of the exclusionary rule: inevitable discovery, attenuation doctrine (neither is analogous in any way), and independent source doctrine. 142 The independent source doctrine is less problematic when the source was known to police before the illegality, though “[t]he problem, of course, is that there is no way to get the cat back into the bag.” 143 In the typical case, though, an illegal arrest leads to a search that uncovers evidence of guilt. Such cases go to the heart of concern with the exclusionary rule. 
Police uncover reliable evidence of guilt but through illegal means. In other contexts, the Court has held that a person's own independent actions may create a source independent of law enforcement illegality. For example, a confession is not something with an independent source when the suspect is questioned immediately following an illegal search. Passage of time, the Court has ruled, can “dissipate the taint” of the illegal search, or even of an *484 initial coercive interrogation. 144 Similarly, if an eyewitness identifies a defendant in a lineup after an illegal arrest, a court might have good reasons to allow that eyewitness to identify the defendant at trial. The victim's identification was not tainted by the illegal arrest, since the defendant is “not himself a suppressible ‘fruit.”’ 145 Perhaps because police suggestion is not an independent act, in Manson the Court abandoned the fiction of an “independent source” for a courtroom identification even where the out-of-court identification would be suppressed. The Wade/Gilbert line of cases is different. After all, in the Sixth Amendment context, even if the identification in court cannot be truly said to be “independent,” at least the violation of the right to counsel likely did not affect the reliability of the identification. The inquiry into whether the identification was reliable is distinct. Further, the same policy concern is present as in exclusionary rule cases generally. A procedural violation, the failure to provide counsel at a postindictment lineup, may result in the exclusion of reliable evidence of guilt. The purpose of the exclusionary rule was to deter police misconduct, but as in the Fourth Amendment context, the Court in the Sixth Amendment context created exceptions to allow reliable cases to go forward. 
146 In contrast, in the due process context, the illegal means are precisely what makes the evidence unreliable. In the eyewitness context, neither time nor unrelated events can “dissipate the taint.” In addition, the Court does not adopt the same approach toward deterring police misconduct in the eyewitness context, noting that police may not need a strong deterrent since “[t]he interest in obtaining convictions of the guilty also urges the police to adopt procedures that show the resulting identification to be accurate.” 147 The very idea that a courtroom identification could be seen as “independent” is anomalous. But that has not stopped nearly all courts in the country from seizing on language from the Court's admittedly confusing early due process case law to justify departing from Manson and encouraging the admission of courtroom identifications despite earlier suggestion. *485 Indeed, as noted, some courts outright conflate the lines of cases and cite to Wade when they apply the “independent source” rule in cases claiming due process (not Sixth Amendment) violations. 148 The one piece of commentary mentioning this flaw in the “independent source” case law, a Texas criminal practice guide, noted such “analysis is properly used only when the pretrial procedure is tainted by a violation of the Sixth Amendment right to counsel.” 149 In practice, it appears that courts ask whether under the “totality of the circumstances” an identification appears reliable. While the test may be “technically, but not practically” different from the Manson v. Brathwaite analysis, 150 the important difference is that there is no meaningful assessment of the reduced reliability of the courtroom or other subsequent identifications. Instead, courts look back to the original view the eyewitness had as an “independent source.” C. What Is Independent About the Source? What is the independent source that courts are referring to? 
Courts treat an eyewitness almost like an object that can simply be shown to the jury. They discuss eyewitness memory as if it were a fixed image, like a photo or a video. However, as social scientists have demonstrated over many hundreds of studies, eyewitness memory is highly malleable and is nothing like a photo or a video. An eyewitness's memory must be carefully preserved or it can become contaminated. Each effort to test an eyewitness's memory will reshape that memory. 151 In the courtroom, the eyewitness cannot access a memory of what happened that is “independent” of the suggestive lineups that came before. While courts discuss the “independent recollection” of the eyewitness at trial, there is nothing independent about that recollection at trial. Indeed, the Supreme Court recognized as much early on. In Simmons, the Court noted that “the witness thereafter is apt to retain in his memory the image of the photograph *486 rather than of the person actually seen, reducing the trustworthiness of subsequent lineup or courtroom identification.” 152 As Elizabeth Loftus, James Doyle, and Jennifer Dysart note in their treatise, while in theory prosecutors have a burden to show an independent source by “clear and convincing evidence” (in fact only some of the jurisdictions mention such a burden), in practice, courts “have gone to truly extraordinary lengths” to find that such independent sources exist. 
153 In fact, courts have found that the eyewitness's original perception was an “independent” assurance of the validity of the identification in remarkable cases where eyewitnesses saw perpetrators for “less than ten seconds, running, at night,” or “while temporarily blinded by liquor,” or when “choked from behind.” 154 Recall how in Wade, the Court recognized that unless the courtroom identification is suppressed, a rule suppressing the out-of-court identification would serve little purpose since “[t]he State may then rest upon the witnesses' unequivocal courtroom identifications.” At trial, litigating the “possible unfairness at the lineup may be the sole means of attack upon the unequivocal courtroom identification.” 155 A Massachusetts appellate court highlighted: We also note the obvious tactical reason for not filing a motion to suppress. If the out-of-court procedure was suppressed, leaving only an in-court identification, assuming the Commonwealth was able to meet its burden, the defense would not have been able to exploit certain weaknesses in the identification procedure. 156 That may actually not be a good tactical reason to not file a motion to suppress. If the Court suppresses the out-of-court identification, the defense lawyer may still choose to introduce it, although the identification cannot be presented by the State. It may be far more favorable for the defense to introduce it to elicit how the in-court identification is the product of a prior flawed identification procedure. Regardless, the lenient treatment of in-court identifications means that complete suppression of “all identification testimony” seldom occurs (except maybe in rare cases where the eyewitness never had any view of the culprit at all). 
157 This creates poor general incentives for the defense to vigorously litigate motions to suppress. *487 Why are courts so lax about courtroom identifications? Some of those courts are aware of and cite to social science research on eyewitness memory. It may simply not occur to them that the courtroom identification is not just a bit of theater, but it is in fact highly suggestive and influential to jurors. Judges may be used to courtroom identifications of documents, nonstrangers, or objects--circumstances that are not problematic. Perhaps they believe that “juries are inclined to be skeptical of courtroom identifications, on account of the inherent suggestiveness in a defendant's location next to his counsel at trial.” 158 As noted, they cite to the ability of counsel to cross-examine after a courtroom identification. Or courts may be eager to avoid excluding the identification, which may be central evidence supporting the prosecution's case. Sandra Guerra Thompson suggests that this may explain rulings in some states, pointing to, for example, a New York Court of Appeals decision stating that “[e]xcluding evidence of a suggestive show-up does not deprive the prosecutor of reliable evidence of guilt. The witness would still be permitted to identify the defendant in court if that identification is based on an independent source.” 159 Of course, that reluctance to exclude is most problematic where the prior identifications were so suggestive that the court recognizes that they should be excluded, but still admits the courtroom identification. There is an additional feature of the doctrine that is still more problematic. Several state and federal courts add another guilt- based factor nowhere to be found in the Manson test. They cite to other evidence in the case as another “independent” basis for allowing the in-court identification. 
They explain that all other unrelated evidence of guilt in the case can buttress the eyewitness identification and help to show that it was reliable or “independent” of any suggestive police conduct. 160 In such cases, courts again short-circuit the due process *488 inquiry and do so for the explicit reason that they do not want to deny the prosecution access to evidence against a likely guilty defendant. It is an ends-justifying-the-means approach, and it is not an approach in which “reliability is the linchpin.” III. A Partial Exclusion Approach Although judges may seek to avoid exclusion at all costs, there is a middle ground that avoids excluding eyewitness testimony entirely while still safeguarding the reliability of trial evidence. That is to per se exclude courtroom identifications: ban them entirely when prior identifications are conducted. At the same time, courts could admit the prior identifications and allow any flaws in those procedures to be explored by the defense when questioning the eyewitness. Although the larger problem is beyond the scope of this Article, I emphasize that policymakers and judges should also address the array of deficiencies surrounding the admissibility rules for eyewitness evidence. The Henderson decision in New Jersey, while not perfect, provides a “social science framework” 161 for encouraging proper lineups in the first instance, evaluating eyewitness evidence at hearings pretrial, admitting them in court, and instructing jurors on how to weight eyewitness identifications. 162 A. Limiting Courtroom Identifications Some evidence, like that obtained after a search, can raise an all or nothing question: Should the judge admit the evidence? 
Moreover, the legality of the search has no bearing on the reliability of the evidence; an illegal search can turn up damning evidence of guilt. Other types of evidence lack such an all-or-nothing character. The testimony of an eyewitness is complex. It can include, among other things, a series of recollections, not all of which should necessarily be admissible. As Gary Wells and Deah Quinlivan suggest, “[T]otal exclusion is not the only option.” 163 *489 Commentators have argued that the Manson test should be modified or that stronger limits should be placed on the admissibility of identifications. 164 I agree with those criticisms but argue that the focus of such efforts should be broadened to not only revise (or scrap) the due process and “reliability” test, but to also ask when courtroom or subsequent identifications may be admitted despite earlier suggestive procedures. 165 I argue that courtroom identifications should be per se excluded, perhaps with a heavy burden on the prosecution to show why it is absolutely necessary. 166 After all, as courts acknowledge, “the in-court testimony of an eyewitness can be devastatingly persuasive,” 167 and “of all the evidence that may be presented to a jury, a witness'[s] in-court statement that he is the one is probably the most dramatic and persuasive.” 168 Perhaps the eyewitness could still identify the defendant's photo as the one previously identified in a photo array. Then again, the police officer could just as readily authenticate the photo array as the one administered and describe which photo was that of the defendant. Particularly important is that the eyewitness would not be permitted to make an in-court identification of the defendant or additionally testify about confidence at the time of trial that the identification is correct. As discussed, confidence on the day of trial is not a sound measure of accuracy and is prejudicial. 
Allowing the eyewitness to point to the defendant in the courtroom permits a display of such confidence. If the prior identification is not suppressed as unduly suggestive, the eyewitness should be permitted to describe the out-of-court identification and be cross-examined concerning any suggestion or unreliability of that procedure. As Justice Marshall put it, *490 dissenting in Manson, “[T]he issue is whether the witness is identifying the defendant solely on the basis of his memory of events at the time of the crime, or whether he is merely remembering the person he picked out in a pretrial procedure.” 169 Such a rule would give police strong incentives to conduct lineups and identification procedures before the trial. Regardless, police have every reason to test an eyewitness's memory to be sure that they have the right person. Indeed, in a typical case, they may not be able to make an arrest since, without an eyewitness identification, they would lack probable cause. Further, although judges do not often order lineup procedures at a trial, as noted, when they do, it is typically because the police did not conduct an identification procedure before trial. This approach could be seen as flowing from a strict reading of Manson. One must separately ask whether the courtroom identification is unduly suggestive or reliable. If an eyewitness recounts a prior identification in the courtroom, then the eyewitness is describing more reliable evidence. However, unless a lineup is conducted in court, an identification in the courtroom is not only inherently suggestive, but it is also less reliable. The courtroom identification has no independent reliability, contrary to the language adopted by so many state and federal courts. Courts simply get it wrong when they suggest that there is less to be worried about when the identification is conducted in court. 170
Are there circumstances in which courtroom identifications should be allowed, perhaps if prosecutors satisfied some burden in showing that the identification was necessary? Certainly, if there was no challenge to the courtroom identification, it could be allowed. Routine identifications by a police officer of a person arrested or of a relative or acquaintance are not controversial, and perhaps in such circumstances no prior lineup would have been conducted. Nor are those identifications based on an eyewitness's memory of a single encounter. Police have separate written records of whom they arrest, and the prior familiarity of a witness with a relative can similarly be established without a courtroom identification. A rule barring courtroom identifications encourages litigation and development of the most probative eyewitness evidence. That is, what happened at the initial identification? How certain was the *491 eyewitness when first viewing the lineup? How was that initial lineup conducted? Perhaps in part due to “independent source” case law, few defendants contest in-court identifications. 171 The jurisprudence of eyewitness identifications may itself improve if separate identifications are treated separately. 172 In some cases, there may not have been a prior identification. Sometimes this may be because the identification was routine. The police officer may identify the person whom she arrested, for example. However, in a case involving a contested identification, the defense should have, and in many jurisdictions will have, the ability to formally request that a lineup be conducted before trial. If there is a dispute about the reliability of an identification, the courtroom is no place for an identification to first occur. Such an approach could be adopted as to other aspects of eyewitness testimony as well. 
If police, for example, fail to record the eyewitness's initial confidence upon viewing a lineup, then at trial a judge could exclude as unreliable any testimony about the witness's confidence. B. Rethinking State Procedure States are increasingly revisiting criminal procedure rules regulating trial evidence, such as confessions, eyewitnesses, and forensics, in response to scientific research and wrongful convictions. 173 In each of those contexts, states must revisit rules for excluding trial evidence. After all, if the new procedures are not followed, the state law question arises whether an exclusionary rule attaches to the breach. Typically, however, states have shied away from specifying consequences for failure to follow such new criminal procedures; thus, new statutes have tended not to speak to the exclusion of an eyewitness identification should the court find that best practices were not complied with. The two leading statutes, in North Carolina and Ohio, provide that failure to comply with a set of procedures, including double-blind and sequential administration of *492 lineups, “shall be considered” in a motion to suppress an identification. 174 That is very mild language and a weak remedy. State courts have also altered the Manson test to reform its application. 175 For example, the Georgia Supreme Court concluded in 2005 that eyewitness certainty should no longer be considered as a relevant factor when evaluating the reliability of eyewitness identifications, stating that “[i]n the 32 years since the decision in Neil v. Biggers, the idea that a witness's certainty in his or her identification of a person as a perpetrator reflected the witness's accuracy has been ‘flatly contradicted by well-respected and essentially unchallenged empirical studies.”’ 176 Yet, Georgia and many other reform states are jurisdictions that adopt “independent source” language for admitting in-court identifications. 
177 States using double-blind identifications similarly fail to discuss the standard for excluding noncomplying or courtroom identifications. 178 Local efforts *493 similarly focus on best practices for eyewitness identifications without discussing admissibility. 179 The one exception is the New Jersey Supreme Court's Henderson decision adopting far-reaching changes to procedures concerning eyewitness identifications. 180 Those procedures are an important model and provide a social science framework for admissibility of eyewitness identifications. I note, though, that in addition to certain other limitations, those procedures do not carefully address the problem of courtroom identifications. Indeed, an appellate decision in that case instructed the trial court to consider whether there was an “independent source” for a courtroom identification should the suggestive pretrial identifications be excluded. 181 On the other hand, the Henderson decision does note that in “rare cases” judges “may use their discretion to redact parts of identification testimony,” including by barring “potentially distorted and unduly prejudicial statements about the witness'[s] level of confidence from being introduced at trial.” 182 Statutory jury instructions describing risks of eyewitness misidentifications typically fail to consider admissibility standards. 183 *494 As Justice Brennan wrote, “To expect a jury to engage in the collective mental gymnastic of segregating and ignoring such testimony upon instruction is utterly unrealistic.” 184 Research on jury instructions and eyewitness testimony supports that view. 
185 This is all the more problematic when a jury is given instructions that highlight factors that do not correspond to the reliability of the identification or even instructions on “independent source” for a courtroom identification. 186 However, as the Henderson decision explains, tailored jury instructions highlighting factors relevant in a particular case, or even provided just before the witness testifies, may have a greater ability to educate jurors. 187 More research should be done to study the effect of such jury instructions. Perhaps one reason that new procedures studiously avoid any robust remedies for failure to adhere to best practices is a concern that a heightened standard for exclusion would derail prosecutions that rely on eyewitnesses as crucial evidence in serious cases. However, by distinguishing between in-court and out-of-court identifications, exclusion is no longer an all-or-nothing question. Judges could adopt a rebuttable presumption that a courtroom identification would not be allowed if earlier identification procedures were flawed, but they could still allow full litigation of the prior procedures. Reforms should make clear what consequences flow from a departure from best practices. Again, the Henderson decision in New Jersey provides a roadmap for how to structure those procedures. *495 Judges could partially reorient the jurisprudence just by correctly reading Manson. The Court's due process test does not include an “independent source” rule. It requires separate analysis of whether a given identification procedure should be admitted as suggestive or reliable. A courtroom identification is not a “reliable” test of the eyewitness's memory, and a courtroom identification is inherently suggestive. Similarly, statutes could codify per se exclusion for courtroom identifications that follow prior out-of-court identification procedures. 
Criminal procedure rules could more broadly focus on excluding tainted aspects of evidence, such as a confession, an informant statement, or a forensic report, without imposing an all-or-nothing exclusion. The Supreme Court in Perry was unwilling to expand due process regulation of eyewitness identifications not arranged by police, but the Court did emphasize that jury instructions and other tools may more usefully ensure the reliability of trial evidence. 188 However, the Court may continue to step back toward a more reliability-oriented Confrontation Clause approach. 189 The Court's ruling in Missouri v. Seibert can also be seen as a ruling recognizing the need to partially exclude later evidence contaminated by earlier evidence (although there, the focus was on police coercion and not on reliability). 190 An approach geared toward reliability might instead look at whether a confession was contaminated by disclosed facts, and it might exclude portions of an interrogation where the suspect was not volunteering answers but simply repeating *496 information that police provided, 191 or it might simply exclude portions of an interrogation that were not electronically recorded. 192 Similarly, a series of courts have responded to challenges to the validity and reliability of a series of forensic techniques, such as fingerprint analysis, firearms and toolmark analysis, and handwriting comparisons, by limiting the ability of analysts to testify to invalid conclusions, such as that the evidence could only have come from the defendant to the exclusion of all others in the world. 193 As courts and legislatures focus on reliability in other contexts, they might consider whether evidence could similarly be treated in separate parts. In addition, courts could
fashion tailored jury instructions, together with *497 reliability hearings, to provide a comprehensive framework regulating the admissibility of evidence. Conclusion The Supreme Court's due process test, confused in the courts by misplaced borrowing from Sixth Amendment right-to-counsel cases and Fourth Amendment exclusionary rule doctrine, has handled exclusion and eyewitness identifications backwards. Most recently, the Court blithely noted in Perry v. New Hampshire that “all in-court identifications” involve “some elements of suggestion,” identifying this as one reason to leave the problem of unreliable eyewitness identification evidence to the states and to jurors. 194 Yet, state courts permit courtroom displays to obscure the reliability of eyewitness identifications and to mislead the jury. From tangled origins in the Court's rulings, the doctrine developed in an odd and unforeseen way. Almost without exception in state courts, a judge may find that the courtroom identification has an “independent source” or has “independent reliability” based on the eyewitness's memory of what she saw. There is nothing independent about the courtroom identification. Eyewitness memory is not “independent” of prior events and courts do not have “independent” access to the memory of an eyewitness. If the prior procedures were suggestive, then, at minimum, the courtroom identification should be per se excluded. In contrast, evidence law recognizes in a host of ways that evidence can be separated into parts for admissibility purposes. Eyewitnesses typically confront multiple identification procedures, in court and out of court. Evidence rules admit prior identifications, which far better capture the eyewitness's memory. What evidence rules do not do, however, is relegate courtroom identifications to a least-favored status. 
As a result, the ready use of courtroom identifications has frustrated efforts to reform eyewitness identifications in response to decades of social science research and troubling lessons from DNA exonerations. Now that judges and legislatures have begun to reshape eyewitness identification law, a partial exclusion approach could play an important role. The regulation of eyewitness identifications should start with the fundamental requirement that law enforcement follow best practices when conducting identification procedures in the first instance, and it could include per se exclusion of courtroom identifications that follow prior identifications. Perhaps then criminal procedure rules will *498 accomplish their goal of safeguarding the reliability of eyewitness identifications. Until the doctrine is reoriented, courtroom identifications will undermine due process jurisprudence and obscure the reliable evidence that eyewitnesses can provide to our criminal justice system. *499 Appendix States Citing to Independent Source Rules for Admissibility of In-Court Eyewitness Identifications Alabama See Hull v. State, 581 So. 2d 1202, 1204 (Ala. Crim. App. 1990) (“[T]he suggestiveness of the identification procedures must be balanced against factors indicating that the in-court identification was independently reliable.” (citing Dickerson v. Fogg, 692 F.2d 238, 244 (2d Cir. 1982))); Speigner v. State, 369 So. 2d 39, 42 (Ala. Crim. App. 1979) (“[W]here allegations are made that the due process standards were violated by an unfair pretrial confrontation, it becomes the burden of the prosecution to show by clear and convincing evidence that the in-court identification testimony had an independent source and did not stem from the alleged unfair pretrial confrontation.”). Alaska See Gipson v. 
State, 575 P.2d 782, 787 (Alaska Ct. App. 1978) (“The foregoing evidence of identification, which we consider overwhelming, had an ‘independent source’ from the tainted in-court identification which occurred at the first preliminary hearing.”); Gruber v. State, 1984 WL 908688, at *2, n.2 (Alaska App. 1984) (stating that an “in-court identification is admissible, even if the photographic display was suggestive, if it stems from his memory of the assault independent of the suggestive display” (emphasis in original)). Arizona See State v. Marquez, 558 P.2d 692, 695 (Ariz. 1976) (“If the record shows that a pre-trial identification was unduly suggestive, then the in-court identification must be shown to have had an independent source other than the improper pre-trial identification.”). Arkansas See Van Pelt v. State, 816 S.W.2d 607, 610 (Ark. 1991) (“Even had the pre-trial identification been impermissibly suggestive, the taint of an improper ‘show-up’ was removed by the clear and convincing evidence that the in-court identification was based upon [the witness's] independent observations of the suspect.”). *500 California See People v. Cooks, 190 Cal. Rptr. 211, 270 (Ct. App. 1983) (“In California, the burden shifts to the People to prove by clear and convincing evidence that the in-court identifications were based on the witness'[s] observations of the accused at the scene of the crime, that is, independent of the suggestive pretrial identification.”). Colorado See People v. Walker, 666 P.2d 113, 119 (Colo. 1983) (“The People have the burden of establishing by clear and convincing evidence that in-court identification is not the product of an unduly suggestive confrontation, but is based upon the witness'[s] independent observations of the defendant during the commission of the crime.”). Connecticut See State v. Doolittle, 455 A.2d 843, 851 (Conn. 
1983) (citing to the courtroom identification as “a strong independent source for the identification of the defendant as the robber apart from the photo identifications”). Florida See Allen v. State, 326 So. 2d 419, 410 (Fla. 1975) (“Viewing the trial testimony of the witnesses in its entirety, there were sufficient independent sources for the in-court identification. There is nothing in the record that shows the in-court identification was tainted by the prior improper out-of-court identification procedure.”). Georgia See Sharp v. State, 692 S.E.2d 325, 330 (Ga. 2010) (“[E]ven if an out-of-court identification is impermissibly suggestive, a subsequent in-court identification is admissible if it did not depend upon the prior identification [ ] but had an independent origin.” (internal quotations omitted)); Shabazz v. State, 667 S.E.2d 414, 417 (Ga. Ct. App. 2008) (“[E]ven a ‘right guy’ reference will not taint a subsequent in-court identification if that identification ‘does not depend upon the prior identification but has an independent source.”’). *501 Idaho See State v. Sadler, 511 P.2d 806, 813 (Idaho 1973) (“Since the witness's in-court identification had an independent origin exclusive of any connection with events occurring in the police station, we conclude that the trial court properly admitted this identification into evidence.” (internal quotations and citation omitted)). Illinois See People v. DeJesus, 516 N.E.2d 801, 803 (Ill. App. Ct. 1987) (“If a violation of a defendant's rights is found, the court must then determine whether the in-court identification nevertheless is admissible because it has an independent source.”). Indiana See Brown v. State, 577 N.E.2d 221, 225 (Ind.
1991) (“This Court has repeatedly held, however, that ‘an in-court identification by a witness who has participated in an impermissibly suggestive out-of-court identification is admissible if the witness has an independent basis for the in-court identification.”’). Iowa See State v. Webb, 516 N.W.2d 824, 829 (Iowa 1994) (“We have stated that even where a pretrial identification is obtained by an illegal procedure, ‘the same witness may nevertheless identify a defendant at trial if such identification has an independent origin . . . .”’ (quoting State v. Ash, 244 N.W.2d 812, 814 (Iowa 1976))). Kansas See State v. Skelton, 795 P.2d 349, 356 (Kan. 1990) (“[A]n in-court identification is capable of standing on its own even though a pretrial confrontation was deficient.”). Louisiana See State v. Cheathon, 682 So. 2d 823, 826 (La. App. 1996) (“In the present case, even if we disregard the contrary evidence and assume arguendo that the pre-trial identification represented an *502 impermissibly suggestive activity, the record discloses an independent basis for admitting the in-court identifications by the two victims.”). Maine See State v. Broucher, 388 A.2d 907, 909 (Me. 1978) (analyzing “(1) whether the pre-trial identifications were so suggestive as to be inherently unreliable; and (2) if so, whether the in-court identification had an independent source”). Massachusetts See Commonwealth v. Delrio, 2003 WL 21028648, at *8 (Mass. Super. 2003) (“Notwithstanding the suppression of the identification following the showup, the witness should be permitted to make an in court identification based on the doctrine of independent source.”). Michigan See People v. Gray, 577 N.W.2d 92, 96 (Mich. 1998) (“Our inquiry does not end once we have found an invalid identification procedure.
The second step in our analysis is to determine whether the victim had an independent basis to identify the defendant in court.”). Minnesota See State v. Taylor, 594 N.W.2d 158, 161 (Minn. 1999) (“[I]f the totality of the circumstances shows the witness'[s] identification has an adequate independent origin, it is considered to be reliable despite the suggestive procedure.” (quoting State v. Ostrem, 535 N.W.2d 916, 921 (Minn. 1995))). Mississippi See Lattimore v. State, 958 So. 2d 192, 198 (Miss. 2007) (“Where constitutional error in pre-trial identification has occurred, the state must show by clear and convincing evidence that subsequent in-court identifications are not based upon the offensive lineup, but instead have an independent origin.”). *503 Missouri See State v. Gates, 637 S.W.2d 280, 285 (Mo. Ct. App. 1982) (“The question remains, therefore, whether the prelineup eyewitness identification was sufficiently reliable as an independent source for the trial identification . . . .”); State v. Morgan, 593 S.W.2d 256, 258 (Mo. Ct. App. 1980) (“The presence of an independent source will serve to remove any taint that might result from a suggestive confrontation.” (quoting State v. Davis, 529 S.W.2d 10, 14 (Mo. Ct. App. 1975))). Nebraska See State v. Smith, 696 N.W.2d 871, 883 (Neb. 2005) (“An in-court identification may properly be received in evidence when it is independent of and untainted by illegal pretrial identification procedures . . . .” (quoting State v. Auger, 262 N.W.2d 187, 189 (Neb. 1978))). Nevada See Hicks v. State, 605 P.2d 219, 221 (Nev. 1980) (“Moreover, the [witness] made independent, positive, and unequivocal in-court identifications of [defendant] at the preliminary examination and trial which were sufficient to render any possible error in the photographic identification procedure harmless.”). New Hampshire See State v. Preston, 442 A.2d 992, 994-95 (N.H.
1982) (“Once an out-of-court identification has been suppressed, in order for a subsequent in-court identification to be allowed, the State must prove by clear and convincing evidence that ‘the in-court identification ha[d] an independent source and [was] not influenced by the out-of-court viewing . . . .”) (quoting State v. Leclair, 385 A.2d 831, 835 (1978))). New Jersey See State v. Peterkin, 543 A.2d 466, 476 (N.J. Super. Ct. App. Div. 1988) (“[T]he State should bear the burden of proving, by clear and convincing evidence, that any in-court identification . . . is derived from an independent source.”). *504 New Mexico See State v. Leyba, 2009 WL 6608373, at *4 (N.M. 2009) (deciding that a trial court should “hear and consider testimony regarding the suggestive context, the reasons for any suggestivity, and whether or not, as in this case, there may have been an independent source for a reliable courtroom identification.”). New York See People v. Dell, 784 N.Y.S.2d 114, 116 (App. Div. 2004) (“The testimony at an independent source hearing established that the victims had multiple opportunities to observe the defendant at close range for a lengthy period of time during the commission of the crime. Therefore, the Supreme Court correctly determined that there was an independent source for the identifications.”). North Carolina See State v. Freeman, 330 S.E.2d 465, 471 (N.C. 1985) (“[W]e need not decide whether the improper display of the photographs to the State's witnesses by one other than the State tainted their in-court identifications. This is so because the trial judge concluded that the witnesses' in-court identifications of defendant were of ‘independent origin, based solely upon what the witnesses saw at the time of the crime.”’). Ohio See State v. Jenksin, 2004 WL 63937, at *5 (Ohio Ct.
App. 2004) (“This court has held that, even presuming a pretrial identification procedure is impermissibly suggestive, an in-court identification is permissible where the prosecution establishes by clear and convincing evidence that the witness had a reliable, independent basis for the identification based on prior independent observations made at the scene of the crime.”); State v. Moss, 1989 WL 10253, at *10 (Ohio Ct. App. 1989) (“[W]e find that these eyewitnesses had an independent source for their in-court identifications.”). Oregon See State v. Lawson, 244 P.3d 860, 866 (Or. Ct. App. 2010) (asking “whether the identification had a source independent of the suggestive *505 identification procedures . . . .”), review allowed, 258 P.3d 526 (Or. 2011). Pennsylvania See Commonwealth v. McGaghey, 507 A.2d 357, 359 (Pa. 1986) (stating that the judge must examine whether “the in-court identification resulted from the criminal act and not the suggestive encounter”); Commonwealth v. Bradford, 451 A.2d 1035, 1037 (Pa. Super. Ct. 1982) (“A consideration of the totality of the circumstances in this case leads us to conclude that the identification testimony supplied by the victim at the trial was sufficiently independent of the suggestive pre-trial identification procedure that had been employed by the police.”). South Carolina See State v. Carlson, 611 S.E.2d 283, 290 (S.C. Ct. App. 2005) (“The in-court identification is admissible if based on information independent of the out-of-court procedure.”). South Dakota See State v. Iron Necklace, 430 N.W.2d 66, 84 (S.D. 1988) (“[T]he proof shifts to the State to then prove by clear and convincing evidence that the in-court identification had an independent origin.”). Texas See Buxton v. State, 699 S.W.2d 212, 216 (Tex. Crim. App.
1985) (“[W]e find the in-court identification was shown to have an origin independent from the lineup.”). Virginia See McCary v. Commonwealth, 321 S.E.2d 637, 645 (Va. 1984) (“We conclude that the in-court identifications had independent sources free from taint, specifically the ample opportunities the victims availed themselves of to observe [the Defendant] in his activities before and during the crimes.”). *506 Washington See State v. Johnson, 132 P.3d 767, 769 (Wash. Ct. App. 2006) (“Even if an identification procedure was impermissibly suggestive, courts will uphold an in-court identification if it has an ‘independent source.”’). West Virginia See State v. Watson, 318 S.E.2d 603, 613 (W.Va. 1984) (holding that a court must ask “if the witness had an independent basis for his identification other than an impermissible out-of-court identification”). Wisconsin See State v. Dubose, 699 N.W.2d 582, 596 (Wis. 2005) (“The witness would still be permitted to identify the defendant in court if that identification is based on an independent source.” (quoting People v. Adams, 423 N.E.2d 379, 384 (N.Y. 1981))); Powell v. State, 271 N.W.2d 610, 617 (Wis. 1978) (“[T]he state has the burden of showing that the subsequent in-court identification derived from an independent source and was thus free of taint.”). Washington, D.C. Collins v. United States, 491 A.2d 480, 489 (D.C. 1985) (noting that the judge found “independent source” for lineup and in- court identifications). Footnotes a1 Professor of Law, University of Virginia School of Law. I thank Kerry Abrams, Phyllis Goldfarb, Sam Gross, Nancy King, Kelly Knepper, Cynthia Lee, Peter Neufeld, Terry Maroney, Alan Morrison, Stephen Salzburg, Barry Scheck, Lisa Steele, Sandra Guerra Thompson, Gary Wells, and participants at workshops at George Washington Law School, Vanderbilt Law School, and the SEALS conference, for their invaluable comments. 
For excellent research assistance, I thank Mark Johnson, Daniel Ross, and Steven Sun. 1 See, e.g., Timothy P. O'Toole & Giovanna Shay, Manson v. Brathwaite Revisited: Towards a New Rule of Decision for Due Process Challenges to Eyewitness Identification Procedures, 41 Val. U. L. Rev. 109, 122-26 (2006); Charles A. Pulaski, Neil v. Biggers: The Supreme Court Dismantles the Wade Trilogy's Due Process Protection, 26 Stan. L. Rev. 1097, 1104-10 (1974); Benjamin E. Rosenberg, Rethinking the Right to Due Process in Connection with Pretrial Identification Procedures: An Analysis and a Proposal to Return to the Wade Trilogy's Standard, 79 Ky. L.J. 259, 275-97 (1991); Richard A. Wise et al., A Tripartite Solution to Eyewitness Error, 97 J. Crim. L. & Criminology 807, 870 (2007); David E. Paseltiner, Note, Twenty-Years of Diminishing Protection: A Proposal to Return to the Wade Trilogy's Standards, 15 Hofstra L. Rev. 583, 589-90 (1987); Dori Lynn Yob, Comment, Mistaken Identifications Cause Wrongful Convictions: New Jersey's Lineup Guidelines Restore Hope, but Are They Enough?, 43 Santa Clara L. Rev. 213, 229-31 (2002). 2 Watkins v. Sowders, 449 U.S. 341, 352 (1981) (Brennan, J., dissenting) (emphasis in original omitted) (quoting Elizabeth F. Loftus, Eyewitness Testimony (1979)). 3 Bill Stuntz has most prominently criticized the priority of procedure over substance in modern criminal procedure. See William J. Stuntz, The Uneasy Relationship Between Criminal Procedure and Criminal Justice, 107 Yale L.J. 1, 37-45 (1997). 4 Manson v. Brathwaite, 432 U.S. 98, 114 (1977). 5 United States v. Wade, 388 U.S. 218, 228 (1967); see infra Part I.B. 6 See Geoffrey Gaulkin, Report of the Special Master: State of New Jersey v. Larry E.
Henderson 9 (2010) [hereinafter Henderson Report], available at http://www.judiciary.state.nj.us/pressrel/HENDERSON%20FINAL%C20BRIEF%C20.PDF% 20(00621142).PDF (the New Jersey Supreme Court instructed a special master to consider “the current validity of our state law standards on the admissibility of eyewitness identification”'). 7 See, e.g., Brandon L. Garrett, Convicting the Innocent: Where Criminal Prosecutions Go Wrong 63-72 (2011). 8 See Gary L. Wells & Amy L. Bradfield, “Good, You Identified the Suspect”: Feedback to Eyewitnesses Distorts Their Reports of the Witnessing Experience, 83 J. Applied Psychol. 360, 360 (1998). 9 For example, the Department of Justice convened in 1998 a task force that played a crucial role in creating awareness about the need to adopt sound eyewitness identification procedures. The report cited as its impetus both a “growing body” of social science research and “[r]ecent cases in which DNA evidence has been used to exonerate individuals convicted primarily on the basis of eyewitness testimony.” See Department of Justice, Technical Working Group for Eyewitness Evidence, NCJ 178240, Eyewitness Identifications: A Guide for Law Enforcement, at iii (1999), available at https:// www.ncjrs.gov/pdffiles1/nij/178240.pdf (“Recent cases in which DNA evidence has been used to exonerate individuals convicted primarily on the basis of eyewitness testimony have shown us that eyewitness evidence is not infallible.”). See generally James M. Doyle, True Witness: Cops, Courts, Science and the Battle Against Misidentification, at xi-xiii (2005). 10 Garrett, supra note 7, at 45-83. 11 See id. at 48; infra Part I.D. 12 See, e.g., Brandon L. Garrett, Judging Innocence, 108 Colum. L. Rev. 55, 122-25 (2008) (describing reforms advanced and adopted to improve accuracy in criminal investigations and prosecutions). 13 132 S. Ct. 716 (2012). 14 Id. at 727. 15 Id. at 728. 16 Id. at 732 (Sotomayor, J., dissenting). 
17 Id. at 729. 18 See infra Part III.B. 19 See infra Part I.C (discussing this practice and other recommendations). 20 See Gary L. Wells, Nancy K. Steblay & Jennifer E. Dysart, Am. Judicature Soc'y, A Test of the Simultaneous vs. Sequential Lineup Methods, at x (2011), available at www.ajs.org/wc/pdfs/EWID_PrintFriendly.pdf (finding that using a sequential procedure, rather than simultaneous, reduces mistaken identifications with little or no reduction in accurate identifications). 21 No one has carefully examined state and federal rulings adopting this so-called “independent source” or “independent reliability” test for admitting subsequent or in-court eyewitness identifications. The leading practice guide to eyewitness testimony provides the only account of the doctrine of independent source as a more general “pivotal strategic problem,” and briefly discusses the “extraordinary lengths” courts may go to admit in-court identifications. See Elizabeth F. Loftus, James M. Doyle & Jennifer E. Dysart, Eyewitness Testimony: Civil and Criminal § 8-18 (4th ed. 2007) (“[C]ourts have gone to truly extraordinary lengths to accept very limited opportunities to observe independent sources.”). I have located two scholars who have written about the potential danger of such rulings. Sandra Guerra Thompson, in an important article on the role of state courts in reforming criminal procedure more generally, discusses such decisions in several states. See Sandra Guerra Thompson, Eyewitness Identifications and State Courts as Guardians Against Wrongful Conviction, 7 Ohio St. J. Crim. L. 603, 628-31 (2010). Katherine Kruse, in an insightful analysis of reform efforts in Wisconsin, cites to the potential corrosive effect of such “independent source” rules. See Katherine R.
Kruse, Instituting Innocence Reform: Wisconsin's New Governance Experiment, 2006 Wis. L. Rev. 645, 722 n.367. Moreover, very few commentators have discussed in-court identifications generally. Infra note 44. But see Loftus et al., supra, §8-17(d) (advising lawyers on how to best litigate an in-court identification). 22 I have found one reference, in a state practice guide, to the questionable origins of this so called “independent” analysis. See 41 George E. Dix & Robert O. Dawson, Texas Practice: Criminal Practice & Procedure § 14.39 (2d ed. 2001) (“The Texas case law shows some tendency to interject independent source considerations into analysis of defendants' due process claims. This unfortunately confuses the differences between the two constitutional concerns at issue ....”). Few judges have noted the flaws in such an approach, although a few dissenting judges have done so. See infra notes 134-35. 23 See Fed. R. Evid. 801(d)(1)(C) (stating hearsay exception for prior statement that is “one of identification of a person made after perceiving the person”). 24 See, e.g., Herring v. United States, 555 U.S. 135, 137 (2009) (holding exclusionary rule did not apply to warrantless arrest caused by negligent police recordkeeping error). 25 Perry v. New Hampshire, 132 S. Ct. 716, 730 (2012). 26 See infra Part III.B. 27 See State v. Henderson, 27 A.3d 872 (N.J. 2011); see infra Part I.C. 28 Alvin G. Goldstein, June E. Chance & Gregory R. Schneller, Frequency of Eyewitness Identification in Criminal Cases: A Survey of Prosecutors, 27 Bull. Psychonomic Soc'y 71, 73 (1989). That survey is now dated, but hopefully new efforts to survey police and prosecutors will provide more complete data. 29 See, e.g., Edwin M. 
Borchard, Convicting the Innocent 367 (1932) (describing how, in the first collection of accounts of “criminal prosecutions and convictions of completely innocent people,” that “[p]erhaps the major source of these tragic errors is an identification of the accused by the victim of a crime of violence”); Felix Frankfurter, The Case of Sacco and Vanzetti 30 (1927) (“The identification of strangers is proverbially untrustworthy.”); Hugo Munsterberg, On the Witness Stand: Essays on Psychology and Crime 39, 50-69 (1908) (describing early psychological research on malleability and unreliability of eyewitness memory); see also Wells et al., supra note 20, at 16 (noting a thirty-one percent rate of filler identifications). 30 Cf. Manson v. Brathwaite, 432 U.S. 98, 131 & n.11 (1977) (Marshall, J., dissenting) (stating that “the greatest memory loss occurs within hours after an event” and citing a holding from the Court of Appeals for the District of Columbia that a showup four hours after the crime was not permissible); Jessica Lee, Note, No Exigency, No Consent: Protecting Innocent Suspects from the Consequences of Non-Exigent Show-ups, 36 Colum. Hum. Rts. L. Rev. 755, 759-62 (2005) (“Unlike ‘stationhouse’ show-ups, show-ups conducted in the field are often admissible due to their spatial and temporal proximity to the crime.”). 31 See Richard Gonzalez, Phoebe C. Ellsworth & Maceo Pembroke, Response Biases in Lineups and Showups, 64 J. Personality & Soc. Psychol. 525 (1993) (finding subjects tested with showups only identified the perpetrator thirty percent of the time whereas when tested with lineups they correctly identified the perpetrator sixty-seven percent of the time).
In addition, “there is clear evidence that showups are more likely to yield false identifications than are properly constructed lineups.” See Gary L. Wells et al., Eyewitness Identification Procedures: Recommendations for Lineups and Photospreads, 22 Law & Hum. Behav. 603, 630-31 (1998) (citing multiple sources). 32 See Gary L. Wells & Deah S. Quinlivan, Suggestive Eyewitness Identification Procedures and the Supreme Court's Reliability Test in Light of Eyewitness Science: 30 Years Later, 33 Law & Hum. Behav. 1, 16 (2009) (stating that a “large percentage of jurisdictions in the U.S. use only photographs and never use live lineups”). 33 Few surveys of police policies have been conducted, although one national survey is currently in progress. See Erica Goode & John Schwartz, Police Lineups Start to Face Facts: Eyes Can Lie, N.Y. Times, Aug. 28, 2011 http:// www.nytimes.com/2011/08/29/us/29witness.html?_r=1&scp=1&sq=police%20linups%C20start%C20to%C20face %f#acts&st=cse (noting that the Police Executive Research Forum has begun a survey on the topic from 1,400 randomly selected police departments). Separate questions remain as to compliance with written policies, even if they do reflect best practices on paper. Id. (“[E]ven in departments that have enacted changes, police officers sometimes fail to comply with the new procedures.”). A prior national survey with responses from 220 of 500 departments found that seventy-four percent of officers learned how to handle lineups from another officer, forty-four percent from court rulings and case law, forty-two percent from course work or professional instruction, while eighteen percent cited to learning from specific rules and regulations and thirty-one percent from general written guidelines. Michael S. Wogalter, Roy S. Malpass & Dawn E. McQuiston, A National Survey of U.S. Police on Preparation and Conduct of Identification Lineups, 10 Psychol. Crime & L. 69, 72 (2004).
In addition, most officers (fifty-eight percent) reported a lack of “formal training in eyewitness identification techniques.” Id. at 79. Single-state surveys have been conducted. For example, a survey by the Virginia Crime Commission found that at least twenty-five percent of departments still had no written policy on the subject-- despite enactment of legislation five years earlier requiring that some form of written procedure be adopted. See Chelyen Davis, Panel Head Favors New Rules on Police Lineups, Free Lance-Star (Fredericksburg, Va.), Sept. 9, 2010, http:// fredericksburg.com/News/ FLS/2010/092010/09092010/574245. A survey of lineup procedures in Texas found only twelve percent of responding departments had any written policies; legislation requiring written policies was subsequently enacted. See Tony Plohetski, Police Pen New Rules for Photo Lineups, Austin Am.-Statesman, May 8, 2009, at A1. 34 Bruce W. Behrman & Sherrie L. Davey, Eyewitness Identification in Actual Criminal Cases: An Archival Analysis, 25 Law & Hum. Behav. 475, 479 (2001) (noting that in 271 cases analyzed, 258 field showups were used; however, multiple showups could occur in each case); Gonzalez et al., supra note 31, at 535 (“In our sample showup identifications were over three times more common than lineups, and follow-up research currently underway in Washington and Michigan suggests that showups are frequently used.”); Sandra Guerra Thompson, Judicial Blindness To Eyewitness Misidentification, 93 Marq. L. Rev. 639, 646 (2009) (“[S]how-ups constitute one of the most commonly used identification procedures.”). 35 See Richard A. Wise, Clifford S. Fishman & Martin A. Safer, How to Analyze the Accuracy of Eyewitness Testimony in a Criminal Case, 42 Conn. L. Rev. 450-52 (2009) (describing how eyewitness identifications are litigated in criminal cases). 36 United States v. Robertson, 19 F.3d 1318, 1323 (10th Cir. 1994) (citations omitted) (quoting United States v. 
Domina, 784 F.2d 1361, 1368 (9th Cir. 1986)). 37 See, e.g., People ex rel. Blassick v. Callahan, 279 N.E.2d 1, 3 (Ill. 1972) (“We have specifically rejected the contention that on [sic] in-court identification of an accused without a lineup denies due process of law.”); People v. Bradley, 546 N.Y.S.2d 437, 437 (App. Div. 1989) (“A criminal defendant does not have a constitutional right to participate in a lineup whenever he requests one.”); People v. Grady, 506 N.Y.S.2d 922, 932 (Sup. Ct. 1986) (“[I]t is undisputed that there is no constitutional requirement that a defense-requested in-court lineup be conducted.”). But see Evans v. Superior Court, 11 Cal. 3d 617, 625 (1974) (“[D]ue process requires ... that an accused, upon timely request therefor [sic], be afforded a pretrial lineup in which witnesses to the alleged criminal conduct can participate ... when eyewitness identification is shown to be a material issue and there exists a reasonable likelihood of a mistaken identification which a lineup would tend to resolve.”). 38 United States v. Archibald, 756 F.2d 223, 223 (2d Cir. 1984) (“[S]pecial procedures are necessary only where (1) identification is a contested issue; (2) the defendant has moved in a timely manner prior to trial for a lineup; and (3) despite that defense request, the witness has not had an opportunity to view a fair out-of-court lineup prior to his trial testimony or ruling on the fairness of the out-of-court lineup has been reserved.”). Such procedures are within the discretion of the trial court. Domina, 784 F.2d at 1369. 39 See Evan J. Mandery, Legal Development: Due Process Considerations of In-Court Identifications, 60 Alb. L. Rev. 404-09; see also State v. Smith, 512 A.2d 189, 193 (Conn.
1986) (“We know of no authority which would prohibit, as unduly suggestive, an exclusively in-court identification.” (quoting Mangrum v. State, 270 S.E.2d 874, 876 (Ga. Ct. App. 1980)). 40 People v. Martinez, Nos. 6403/01, 6402/01, 2001 WL 1789315 (N.Y. Sup. Ct. Nov. 28, 2001). 41 People v. Gow, 382 N.E.2d 673, 675 (Ill. App. 1978) (describing how eyewitness identified person seated next to defense counsel); Fredric D. Woocher, Note, Did Your Eyes Deceive You? Expert Psychological Testimony on the Unreliability of Eyewitness Identification, 29 Stan. L. Rev. 969, 969 n.3 (1977) (“A judge in New York City developed his own system to check on the frequency of mistaken identifications. In ten cases in which the identification of the accused was virtually the only evidence, the judge permitted defense attorneys to seat a look-alike alongside the defendant. In only two of the ten cases was the witness able to identify the defendant.”). 42 United States v. Sabater, 830 F.2d 7, 9 (2d Cir. 1987). 43 United States v. Thoreen, 653 F.2d 1332, 1339-40 (9th Cir. 1981); see People v. Simac, 641 N.E.2d 416 (Ill. 1994) (affirming conviction for direct criminal contempt of attorney who substituted the position of the defendant without permission from the judge); Miskovsky v. State ex rel. Jones, 586 P.2d 1104, 1108 (Okla. Crim. App. 1978) (explaining source of the contempt finding was counsel's failure to gain permission from the court before substituting another person for the defendant). Interestingly, one judge dissented in the Illinois case, stating, “After a thorough review of the record, I believe that defense counsel was acting in good faith to protect his client from a suggestive in-court identification.” Simac, 641 N.E.2d at 424 (Nickels, J., dissenting). 44 See Fed. R. Evid. 801(d)(1)(C) (hearsay exclusion for prior statement that is “one of identification of a person made after perceiving the person.”); see also Gilbert v. California, 388 U.S. 
263, 272 n.3 (1967) (“It was [sic] been held that the prior identification is hearsay, and, when admitted through the testimony of the identifier, is merely a prior consistent statement. The recent trend, however, is to admit the prior identification under the exception that admits as substantive evidence a prior communication by a witness who is available for cross-examination at trial.”). 45 See Michael H. Graham, 3 Handbook of Federal Evidence § 801:11 (6th ed. 2006). 46 S. Rep. No. 94-199, at 2 (1975). 47 Id. 48 Id. 49 See Mandery, supra note 39, at 389 (“[W]hile the constitutional issues surrounding pre-trial identifications have been widely litigated and explored by scholars, little attention has been paid to the issues raised by in-court identifications.”). 50 United States v. Wade, 388 U.S. 218, 228 (1967). 51 Id. 52 Stovall v. Denno, 388 U.S. 293, 302 (1967). 53 See, e.g., United States ex rel. Kirby v. Sturges, 510 F.2d 397, 407 n.32 (7th Cir. 1975). 54 Wade, 388 U.S. at 235-36. 55 Id. at 240. 56 Id. at 240-41. 57 Id. at 241. 58 Id. at 242. 59 Id. at 235. 60 Id. at 220. 61 Gilbert v. California, 388 U.S. 263, 272 n.3 (1967) (“It was [sic] been held that the prior identification is hearsay, and, when admitted through the testimony of the identifier, is merely a prior consistent statement. The recent trend, however, is to admit the prior identification under the exception that admits as substantive evidence a prior communication by a witness who is available for cross-examination at trial.”). 62 Id. at 270 & n.2. 63 Id. at 272. In contrast, the Court found a per se exclusionary rule as to the pretrial identification based on the denial of counsel at the lineup. Id. at 273. 64 E.g., Moore v. Illinois, 434 U.S. 220, 231 (1977); Coleman v. Alabama, 399 U.S.
1, 21 (1970) (Harlan, J., concurring in part and dissenting in part) (“The Wade rule requires the exclusion of any in-court identification preceded by a pretrial lineup where the accused was not represented by counsel, unless the in-court identification is found to be derived from a source ‘independent’ of the tainted pretrial viewing.”). 65 See Sandra Guerra Thompson, Beyond a Reasonable Doubt? Reconsidering Uncorroborated Eyewitness Identification Testimony, 41 U.C. Davis L. Rev. 1487, 1511 (2008) (developing “the shortcomings of an attorney's presence as a remedy”). 66 United States v. Ash, 413 U.S. 300, 321 (1973). 67 Wells et al., supra note 31, at 608. 68 Foster v. California, 394 U.S. 440, 442-43 (1969). 69 Simmons v. United States, 390 U.S. 377, 384 (1968). 70 Manson v. Brathwaite, 432 U.S. 98, 122 (1977) (Marshall, J., dissenting). 71 Neil v. Biggers, 409 U.S. 188, 199 (1972). The Court did not have occasion to rule on whether that test should supplant the Stovall test in that case, since the lineup in question pre-dated the Court's Stovall ruling. Id. at 200. 72 Manson, 432 U.S. at 113-14. 73 Id. at 113. 74 Id. 75 See id. (directing judges to weigh the Biggers factors against the “corrupting effect of the suggestive identification itself”). 76 Id. at 114; Biggers, 409 U.S. at 199-200. 77 Manson, 432 U.S. at 114. 78 A white paper by the American Psychology-Law Society summarized the state of the research and provides four recommendations for reforming eyewitness identification procedures. Wells et al., supra note 31, at 603; see also Wells & Quinlivan, supra note 32, at 7-8 (explaining lack of correlation between eyewitness description and accuracy of identification). 79 Elizabeth F. Loftus, Eyewitness Testimony 52-54 (1996); A.
Daniel Yarmey, Understanding Police Work: Psychosocial Issues 298-300 (1990). 80 See Melissa Pigott & John C. Brigham, Relationship Between Accuracy of Prior Description and Facial Recognition, 70 J. Applied Psychol. 547, 547-548 (1985) (finding congruence and accuracy of eyewitness reports not highly related); Gary L. Wells, Verbal Descriptions of Faces from Memory: Are They Diagnostic of Identification Accuracy?, 70 J. Applied Psychol. 619, 623 (1985) (finding low relation between congruence and accuracy of eyewitness reports). 81 E.g., Brian L. Cutler, Steven D. Penrod & Hedy Red Dexter, Juror Sensitivity to Eyewitness Identification Evidence, 14 Law & Hum. Behav. 190 (1990); Samuel R. Gross, Loss of Innocence: Eyewitness Identification and Proof of Guilt, 16 J. Legal Stud. 395, 400-01 (1987); Richard S. Schmechel, Timothy P. O'Toole, Catharine Easterly & Elizabeth F. Loftus, Beyond The Ken? Testing Jurors' Understanding of Eyewitness Reliability Evidence, 46 Jurimetrics J. 177, 195-96 (2006); Wells et al., supra note 26, 619-621; Gary L. Wells, How Adequate is Human Intuition for Judging Eyewitness Testimony?, in Eyewitness Testimony: Psychological Perspectives (Gary L. Wells & Elizabeth F. Loftus eds., 1984). 82 Wells et al., supra note 31, at 627-29. 83 Loftus et al., supra note 21, § 4-9. For an important field study documenting advantages of a sequential procedure, showing lineup members to witnesses one at a time rather than simultaneously, see Wells, Steblay & Dysart, supra note 20. 84 See Loftus et al., supra note 21, § 4-8(b) (describing study by Roy Malpass and Patricia Devine, and noting eighteen other studies demonstrating higher false identification when such biased instructions were provided); Amy Douglass & Nancy Steblay, Memory Distortion in Eyewitnesses: A Meta-Analysis of the Post-Identification Feedback Effect, 20 Applied Cognitive Psychol. 
859, 864-65 (2006) (discussing how positive post-identification feedback increases witness confidence in identification). 85 Nancy K. Steblay, Maintaining The Reliability of Eyewitness Evidence: After the Lineup, 42 Creighton L. Rev. 643, 647-51 (2009) (discussing data from studies of repeated lineups); see, e.g., Gabriel W. Gorenstein & Phoebe C. Ellsworth, Effect of Choosing an Incorrect Photograph on a Later Identification by an Eyewitness, 65 J. Applied Psychol. 616, 620-21 (1980) (studying commitment effect of using a photo showup before a live lineup); Tiffany Hinz & Kathy Pezdek, The Effect of Exposure to Multiple Lineups on Face Identification Accuracy, 25 Law & Hum. Behav. 185, 194-96 (2001) (assessing accuracy of identifications when intervening lineups occur). 86 See, e.g., Nancy K. Steblay et al., Sequential Lineup Laps and Eyewitness Accuracy, 35 Law & Hum. Behav. 262, 271 (2011) (describing studies that find repeat viewings, or “laps,” increase choosing rates and error rates, with particularly high error rates among witnesses who choose to view a second time). 87 Loftus et al., supra note 21, § 6-2; Steven Penrod & Brian Cutler, Witness Confidence and Witness Accuracy: Assessing Their Forensic Relation, 1 Psychol. Pub. Pol'y & L. 817, 827 (1995). 88 Gary L. Wells & D.M. Murray, What Can Psychology Say About the Neil vs. Biggers Criteria for Judging Eyewitness Identification Accuracy?, 68 J. Applied Psychol. 347, 357-58 (1983). 89 Wells & Bradfield, supra note 8, at 374. 90 Wells et al., supra note 31, at 605.
91 Mandery, supra note 39, at 416; see also Wells & Quinlivan, supra note 32 (“Although experiments have not directly tested the question of in-court identifications that occur after a pretrial lineup, our understanding of transference and commitment effects leads to the reasonable inference that a mistaken identification prior to trial is likely to be replicated during an in-court identification.”). 92 See, e.g., Petition for a Writ of Certiorari at 3-4, Perez v. United States, 547 U.S. 1002 (2006) (No. 05-596), 2005 WL 3038542 (certiorari petition seeking review of Biggers and Manson test based on new empirical studies); State v. Ledbetter, 881 A.2d 290, 304-06 (Conn. 2005) (Connecticut Supreme Court rejecting constitutional challenge, citing “scientific studies” to the five factor test of Biggers and Manson). 93 Perry v. New Hampshire, 132 S. Ct. 716, 738 (2012) (Sotomayor, J., dissenting). 94 Gross, supra note 81, at 395. 95 Perry, 132 S. Ct. at 728 (citing Brief for American Psychological Association as Amicus Curiae as “describing research indicating that as many as one in three eyewitness identifications is inaccurate”); Wells & Quinlivan, supra note 32, at 6; see also Henderson Report, supra note 6, at 15-16 (providing an overview of error rates found in archival studies, together with results from field studies and laboratory experiments). 96 Garrett, supra note 7, at 7. 97 Id. at 9. 98 See Neil Miller, Frontline, The Burden of Innocence, Profiles (May 1, 2003), http:// www.pbs.org/wgbh/pages/frontline/shows/ burden/profiles/miller.html. 99 Trial Transcript at 3-14, Commonwealth v. Miller, No. 085602-04 (Mass. Sup. Ct. Dec. 18, 1990) [hereinafter Miller Trial Transcript]. 100 Id. at 37-38, 68-69 (Dec. 17, 1990). 101 Id. at 72. 102 Id. at 41. 103 Id. at 42-43, 53. 104 Brief and Record Appendix for the Defendant on Appeal at 7 n.7, Commonwealth v. Miller, 609 N.E.2d 1251 (Mass. App. Ct. Oct. 1992) (No. 92-P-612). 105 Id.
at 6-7; Miller Trial Transcript, supra note 99, at 126 (Dec. 17, 1990). 106 Miller Trial Transcript, supra note 99, at 123-34 (Dec. 17, 1990). 107 Id. at 104-05. 108 Id. at 128-29; see also Brief and Record Appendix, supra note 104, at 23-24. 109 Miller Trial Transcript, supra note 99, at 105 (Dec. 17, 1990). 110 The crime lab analyst (incorrectly) described the forensics as including Neil Miller but also forty-five percent of the population (in fact, no man could be excluded). See Brandon L. Garrett & Peter J. Neufeld, Invalid Forensic Science Testimony and Wrongful Convictions, 95 Va. L. Rev. 1, 41-42 (2009) (explaining invalid testimony concerning the phenomenon of “masking” and non-quantification in that case and others). 111 Neil Miller, supra note 98. 112 Id. 113 Trial Transcript at 304, State v. Avery, No. 85 FE 118 (Wis. Cir. Ct. Dec. 10, 1985). 114 Trial Transcript at 11, 62, Commonwealth v. Doswell, No. CC 8603467 (Pa. Ct. Com. Pl. Nov. 19, 1986). 115 Trial Transcript at J-128, People v. Cage, 94 29467 (Ill. Cir. Ct. January 9, 1995). 116 See Bill Rankin, Exonerations Urge Changes for Eyewitnesses, Atlanta J-Const., Dec. 25, 2008, at C1 (quoting from victim's testimony). 117 Perry v. New Hampshire, 132 S. Ct. 716, 728 (2012). 118 Id. at 727. 119 Id. at 732 (Sotomayor, J., dissenting). 120 See O'Toole & Shay, supra note 1, at 129 (“The Manson rule of decision also produces rote and unconvincing analysis in state court opinions.”). 121 See State v. Smith, 512 A.2d 189, 193 (Conn. 1986) (“The manner in which in-court identifications are conducted is not of constitutional magnitude but rests within the sound discretion of the trial court”); Middleton v. United States, 401 A.2d 109, 132 (D.C.
1979) (noting that in-court identifications are “less threatening of the due process guarantee” than one-on-one confrontations in the police station); Ralston v. State, 309 S.E.2d 135, 136 (Ga. 1983) (reasoning that in-court identifications are not scrutinized for reliability because they are under the supervision of the court); State v. Clausell, 580 A.2d 221, 235 (N.J. 1990) (holding that an in-court identification was “constitutionally valid” despite the fact that the witness had not been able to identify the defendant previously in a photo array). Mandery, supra note 39, provides an excellent discussion of these cases at 402-03. 122 Ralston, 309 S.E.2d at 136; People v. Rodriguez, 480 N.E.2d 1147, 1151 (Ill. App. Ct. 1985). 123 Such language has been adopted by courts in thirty-eight states and Washington D.C.: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Kansas, Louisiana, Maine, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Nebraska, Nevada, New Hampshire, New Jersey, New York, New Mexico, North Carolina, Ohio, Oregon, Pennsylvania, South Carolina, South Dakota, Texas, Virginia, Washington, West Virginia, Wisconsin. The Appendix contains one or more citations to leading cases from each of those states. 124 Rulings in Massachusetts, South Carolina and Wisconsin all provide examples. See Commonwealth v. Delrio, 2003 WL 21028648, at *8 (Mass. Super. Ct. 2003) (“Notwithstanding the suppression of the identification following the showup, the witness should be permitted to make an in court identification based on the doctrine of independent source.”); State v. Carlson, 611 S.E.2d 283, 290 (S.C. Ct. App. 2005) (“The in-court identification is admissible if based on information independent of the out-of-court procedure.”); State v. Dubose, 699 N.W.2d 582, 596 (Wis.
2005) (“The witness would still be permitted to identify the defendant in court if that identification is based on an independent source.” (quoting People v. Adams, 423 N.E.2d 379, 384 (N.Y. 1981))). 125 Commonwealth v. Rollins, 738 A.2d 435, 443 (Pa. 1999). 126 State v. Carlson, 611 S.E.2d 283, 290 (S.C. Ct. App. 2005) (citing State v. Rogers, 210 S.E.2d 604 (S.C. 1974)). 127 McCary v. Commonwealth, 321 S.E.2d 637, 645 (Va. 1984). 128 State v. Freeman, 330 S.E.2d 465, 471 (N.C. 1985). 129 Gipson v. State, 575 P.2d 782, 787 (Alaska 1978) (“The foregoing evidence of identification, which we consider overwhelming, had an ‘independent source’ from the tainted in-court identification which occurred at the first preliminary hearing.”). 130 State v. Trammel, 92 P.3d 1101, 1110 (Kan. 2004) (citing State v. Edwards, 955 P.2d 1276 (Kan. 1998)). 131 Those six states are Delaware, Kentucky, Oklahoma, Rhode Island, Utah, and Vermont. See State v. Johnson, No. K91-06-0069I, 1991 WL 302644, at *2 (Del. Super. Ct. Dec. 18, 1991) (“A court may admit evidence based on an otherwise ‘unnecessarily suggestive’ identification procedure if counsel can show the independent reliability of the identification testimony.”); Grady v. Commonwealth, 325 S.W.3d 333, 354 (Ky. 2010) (“[T]he unduly suggestive nature of the pre-trial lineup becomes totally irrelevant if a court determines that there is an independent basis of reliability for the in-court identification.”); Berry v. State, 834 P.2d 1002, 1005 (Okla. Crim. App. 1992) (“A courtroom identification will not be invalidated if it can be established that it was independently reliable under the totality of the circumstances.” (citing Cole v. State, 766 P.2d 538, 359 (Okla. Crim. App. 1988))); State v. Patel, 949 A.2d 401, 410 (R.I.
2008) (“If the procedure is found to have been unnecessarily suggestive, the second step requires a determination of whether the identification still has independent reliability despite the suggestive nature of the identification procedure.” (citing State v. Camirand, 572 A.2d 290, 293 (R.I. 1990))); State v. Thamer, 777 P.2d 432, 435 (Utah 1989) (“[I]f the photo array is impermissibly suggestive, then the in-court identification must be based on an untainted, independent foundation to be reliable.”); State v. Savo, 446 A.2d 786, 791 (Vt. 1982) (“An in-court identification, even where it has been preceded by a suggestive pretrial identification, may still be admissible where its reliability can be independently established.”). 132 See, e.g., Raheem v. Kelly, 257 F.3d 122, 135 (2d Cir. 2001) (citing Manson v. Brathwaite and discussing the need to weigh factors suggesting independent reliability); see also United States v. Wise, 515 F.3d 207, 215 (3d Cir. 2008) (in-court identification admissible though police showed witness photo of defendant with words “Harrisburg Police Department” printed above his head because witness had previously lived with defendant and thus in-court identification was independently reliable); United States v. McCabe, No. 89-30271, 1990 WL 61969902, at *1 (9th Cir. May 14, 1990) (“Because the procedure used in this case was not impermissibly suggestive, [the defendant's] due process claim fails, and inquiry into the independent reliability factors set forth in Manson v. Brathwaite is not required.” (citation omitted)). 133 I have found no due process cases providing “independent source” or “independent reliability” language in five states: Hawaii, Montana, North Dakota, Tennessee, or Wyoming. See, e.g., State v. Atkins, No. 03C01-9302-CR-00058, 1994 WL 81524, at *9 (Tenn. Crim. App. Mar. 
3, 1994) (“If a court determines that under the Biggers standard a pretrial confrontation was so impermissibly suggestive that it violated an accused's due process rights, the independent origin of the in-court identification is irrelevant. Both out-of-court and in-court identifications are automatically excluded.”). One Montana decision is ambiguous on this point. State v. Hedrick, 745 P.2d 355, 358 (Mont. 1987) (“The independent basis for the victim's in court identification also prevents the possibility of a substantial likelihood of irreparable misidentification.”). North Dakota had one case citing to an independent basis for admitting an in-court identification, but the case predated Manson, and, absent any more recent rulings, North Dakota was not included. State v. McKay, 234 N.W.2d 853, 858 (N.D. 1975) (“[I]n-court identification of the defendant was not based on a suggestive viewing of him at the police station, but had a basis independent of that viewing.”). A more recent ruling failed to reach the issue. State v. Lewis, 302 N.W.2d 396, 399 (N.D. 1981). 134 See Webster v. State, 474 A.2d 1305, 1316 (Md. 1984) (“[The trial court] looked to the ‘independent source’ rule of Wade-Gilbert, but, as we have pointed out supra, that rule is concerned only with a lineup which is illegal on Sixth Amendment right to counsel grounds. It is not the test for the admissibility of identification evidence challenged on Fourteenth Amendment due process grounds.”). But see Barrow v. State, 474 A.2d 967, 976 (Md. Ct. Spec. App. 1984) (“Even if the State fails to satisfy the legality of a pre-trial confrontation or viewing, the State may still secure the admissibility of a courtroom identification by the same identifying witness if it establishes by clear and convincing evidence that a courtroom identification had a source independent of the prior illegal confrontation or viewing.”); Alston v. State, 934 A.2d 949, 967 (Md. Ct. Spec. App.
2007) (asking “whether the courtroom identification has an independent source”). 135 Graham v. Solem, 728 F.2d 1533, 1549 (8th Cir. 1984) (McMillian, J., dissenting) (“[C]oncepts of ‘purged taint’ and ‘independent origin’ have been blended into, and superseded by, the two-step process of weighing reliability against suggestiveness articulated in Biggers.”); United States v. Batista Ferrer, 842 F. Supp. 40, 42 (D.P.R. 1994); State v. McMorris, 570 N.W.2d 384, 393 (Wis. 1997) (“[T]he Wade and Biggers tests are derived from different constitutional amendments and are intended to achieve different purposes.”); see also Bernal v. People, 44 P.3d 184, 204-05 (Colo. 2002) (Coats, J., dissenting) (“By analogy to a violation of the Sixth Amendment right to counsel, many jurisdictions, including this one, considered the witness's independent ability to make an identification only as an ‘independent source’ or ‘independent basis' for allowing an in-court identification, despite an ‘unduly,’ ‘impermissibly,’ ‘unnecessarily,’ or ‘unconstitutionally’ suggestive out-of-court procedure.”). The judge added: “Unlike violations of the Fourth or Sixth Amendment, from which the ‘independent source’ doctrine is clearly borrowed, however, the due process test applies to both the “derivative” in-court identification and the challenged pretrial identification itself ....” Id. at 206. 136 See, e.g., People v. Gray, 577 N.W.2d 92, 96 n.8 (Mich. 1998) (“The remedy for a violation of the right to counsel is the same as the remedy for an unduly suggestive identification procedure: suppression of the in-court identification unless there is an independent basis for its admission.”).
137 Doing so might in theory create an elevated standard, where a prosecutor could only overcome a per se exclusion of the evidence by a showing of clear and convincing evidence that the identification had an “independent source” and was reliable. See, e.g., Commonwealth v. Botelho, 343 N.E.2d 876, 880 (Mass. 1976) (“[T]he prosecution is limited to introducing at trial only such identifications by the witness as are shown at the suppression hearing not to be the product of the suggestive confrontation--the later identifications, to be usable, must have an independent source.”); State v. Iron Necklace, 430 N.W.2d 66, 84 (S.D. 1988) (“[T]he proof shifts to the State to then prove by clear and convincing evidence that the in-court identification had an independent origin.”); Powell v. State, 271 N.W.2d 610, 617 (Wis. 1978) (“In cases which involve the validity of subsequent in-court identifications the rule is clear: once the defendant shows that the out-of-court identification was improper, the state has the burden of showing that the subsequent in-court identification derived from an independent source and was thus free of taint.”). However, in practice courts conduct the same “totality of the circumstances” inquiry, but without considering the prior identifications as a critical part of the relevant circumstances. 138 Complex questions can be raised by assertions that the eyewitness was previously familiar with the suspect. While I do not address the subject here, I note that there are, for instance, underlying questions whether and when familiarity leads to greater reliability--or perhaps reduced reliability under some circumstances. See Lisa J. Steele, Who Was That Masked Man?, 48 Crim. L. Bull., Winter 2012 (reviewing social science literature and noting that “[f]amiliarity affects eyewitness testimony in ‘nuanced, complex, and often counterintuitive ways.”’ (citation omitted)). There are questions of proof as to how familiar an eyewitness really was.
In addition, courts adopt a range of approaches. Some courts merely note a spectrum of familiarity; some demand strong evidence of familiarity, while others appear to find familiarity present even based on very brief prior encounters. See People v. Sheppard, No. 241766, 2003 WL 22717987, at *3 (Mich. App. Nov. 8, 2003) (noting that to determine whether there is an independent basis for identification, one can look to, inter alia, “the witness's prior knowledge of the defendant.”); People v. Yara, No. 9479/00, 2002 WL 31627019, at *2 (N.Y. Sup. Ct. Nov. 6, 2002) (“[T]he courts have carved out a ‘confirmatory identification exception.’ The rationale for this exception is premised on the principle that due to the familiarity between the witness and the suspect, there is little or no risk that police suggestion can lead to mis-identification.... The exception may confidently be applied where the protagonists are family members, friends, or acquaintances; at the other extreme, it clearly does not apply when the familiarity emanates from a brief encounter.”); see also State v. Marquez, 967 A.2d 56, 82 (Conn. 2009) (finding “independent source” where eyewitness saw defendant at parole office prior to lineup); Dang v. United States, 741 A.2d 1039, 1042 (D.C. 1999) (finding enough evidence for independent basis from the witness's close contact with the robbers in adequate lighting for an extended period of time); Butler v. State, 382 S.E.2d 616, 620 (Ga. Ct. App. 1989) (“In the instant case there was lengthy testimony as to the independent origin of the victim's identification of Butler; he had visited her apartment in the past, she knew him from his place of employment, she recognized his voice, and [she] got a glimpse of his face when it was lighted by the street light.”); State v. Tann, 273 S.E.2d 720, 726 (N.C. 
1981) (holding that the “one-man show-up” and the victim's identification before the defendant was produced by the officers was ample evidence of independent origin); State v. Jaeb, 442 N.W.2d 463, 465 (S.D. 1989) (upholding the lower court's holding that seeing the assailant on several occasions in good light and identifying her in a fair lineup was enough evidence of independent source). 139 See, e.g., United States v. DeJesus, 912 F. Supp. 129, 139 (E.D. Pa. 1995) (holding that a newspaper photograph which jogged the victim's image was not unduly suggestive); Utley v. State, 589 N.E.2d 232, 237-38 (Ind. 1992) (upholding the trial court's finding that a photographic array with the defendant appearing twice was not unduly suggestive); People v. Whitaker, 126 A.D.2d 688, 688-89 (N.Y. App. Div. 2d Dep't 1987) (noting that identifying the assailant in a yearbook photograph was not tainted by police procedures). But see State v. Atwood, 832 P.2d 593, 603 (Ariz. 1992) (upholding the trial court's determination that only two of fourteen witnesses' pretrial viewings of press coverage were unduly suggestive); People v. Prast, 319 N.W.2d 627, 634-35 (Mich. Ct. App. 1982) (“Where an identification of a defendant is based upon a newspaper photograph rather than the witness's own perceptions, it should be excluded.”); Rogers v. State, 774 S.W.2d 247, 259-60 (Tex. Crim. App. 1989) (“Since the police procedure was not itself suggestive, the fact that several eyewitnesses were exposed to a media photo of appellant one day before attending a police lineup might, at most, be taken to affect the weight, although not the admissibility, of their trial testimony.”), overruled by Peek v. State, 106 S.W.3d 72 (Tex. Crim. App. 2003); see also Lynn M.
Talutis, Annotation, Admissibility of In-Court Identification as Affected by Pretrial Encounter that was not Result of Action by Police, Prosecutors, and the Like, 86 A.L.R. 5th 463, Part III.B. (2001) (citing cases that allowed media identifications as admissible evidence, but also citing others that claimed the evidence was inadmissible). 140 See State v. Cannon, 713 P.2d 273, 281 (Ariz. 1985) (approving jury instruction stating, “[y]ou are instructed that you must be satisfied beyond a reasonable doubt that the in-Court identification was independant [sic] of the previous pre-trial identification or, if not derived from an independent source, you must find from other evidence in the case that the defendant is the guilty person beyond a reasonable doubt”). 141 Nardone v. United States, 308 U.S. 338, 341 (1939). 142 Nix v. Williams, 467 U.S. 431, 443-44 (1984) (adopting inevitable discovery theory); Brown v. Illinois, 422 U.S. 590, 608-10 (1975) (discussing attenuation theory). 143 Wayne R. LaFave, Jerold H. Israel & Nancy J. King, Criminal Procedure § 9.3(d), at 528-29 (5th ed. 2009). 144 Wong Sun v. United States, 371 U.S. 471, 487 (1963). 145 United States v. Crews, 455 U.S. 463, 474 (1980). 146 United States v. Calandra, 414 U.S. 338, 348 (1974) (“[T]he rule is a judicially created remedy designed to safeguard Fourth Amendment rights generally through its deterrent effect.... As with any remedial device, the application of the rule has been restricted to those areas where its remedial objectives are thought most efficaciously served.”). 147 Manson v. Brathwaite, 432 U.S. 98, 112 & n.12 (1977) (“Although the per se approach has the more significant deterrent effect, the totality approach also has an influence on police behavior.”). 148 Solomon v. Smith, 645 F.2d 1179, 1188 (2d Cir. 1981) (“The tests of ‘independent origin’ set forth in Wade appear to be functionally identical to the reliability tests articulated in Neil v. Biggers ....”). 
149 See Dix & Dawson, supra note 22, § 14.39 (“[T]he Texas courts have almost certainly erred in uncritically assuming that in-court identification testimony offered despite an earlier identification made at an unnecessarily suggestive procedure is sometimes admissible under the independent source analysis.”). They add, “Since the situation must present a very substantial risk of misidentification as a result of the unnecessarily suggestive procedure, surely it cannot be said that the in-court identification can have a source independent of that procedure.” Id. 150 Loftus et al., supra note 21, § 8-18, at 194 n.107. 151 See infra Part I.C. 152 Simmons v. United States, 390 U.S. 377, 383 (1968). 153 Loftus et al., supra note 21, § 8-18, at 194-95. 154 Id. § 8-18, at 194-95 & nn.108-09, 111 (citing cases). 155 United States v. Wade, 388 U.S. 218, 240-41 (1967). 156 Commonwealth v. Graham, No. 03-P-357, 2004 WL 840557, at *1 (Mass. App. Ct. Apr. 20, 2004). 157 Id. 158 Ellis v. United States, 941 A.2d 1042, 1049 (D.C. 2008) (citations omitted). 159 Thompson, supra note 21, at 627 (quoting People v. Adams, 423 N.E.2d 379, 384 (N.Y. 1981)). 160 See, e.g., Powell v. State, 925 So. 2d 878, 884 (Miss. Ct. App. 2005) (holding that the testimony of three witnesses and the discovery of stolen property from the defendant's truck was additional independent evidence); State v. Valentine, 785 A.2d 940, 943-44 (N.J. Super. Ct. App. Div. 2001) (finding evidence of independent reliability from the testimony of a witness, the retrieval of a gun from the defendant's apartment, and the defendant's flight after officers arrived constituted evidence that gave the identification independent reliability); see also Suzannah B. Gambell, The Need to Revisit the Neil v.
Biggers Factors: Suppressing Unreliable Eyewitness Identifications, 6 Wyo. L. Rev. 189, 211 (2006) (discussing states that add corroborative evidence of general guilt as a Biggers factor); Brandon L. Garrett, Innocence, Harmless Error and Federal Wrongful Conviction Law, 2005 Wis. L. Rev. 35, 84-85 (citing federal cases and noting, “The Supreme Court has not intervened as many of the circuits, taking the hint from Manson, have made no secret of their holdings that corroborating evidence of guilt can render an eyewitness identification ‘reliable,’ some even calling such independent evidence of guilt a ‘sixth factor’ as to reliability”); Rudolf Koch, Note, Process v. Outcome: The Proper Role of Corroborative Evidence in Due Process Analysis of Eyewitness Identification Testimony, 88 Cornell L. Rev. 1097, 1102 (2003) (“[C]orroborative evidence of general guilt should be considered only in any post-trial harmless error analysis.”). 161 Laurens Walker & John Monahan, Social Frameworks: A New Use of Social Science in Law, 73 Va. L. Rev. 559 (1987). 162 State v. Henderson, 27 A.3d 872, 292-93 (N.J. 2011). 163 Wells & Quinlivan, supra note 32, at 20 (describing situations in which limiting testimony might be appropriate). 164 See, e.g., O'Toole & Shay, supra note 1 at 136-38 (discussing need for alternative rule to “discredited” Manson reliability checklist, and proposing rule that imposes minimum affirmative guidelines for identification procedures); Susan R. Klein, Identifying and (Re)Formulating Prophylactic Rules, Safe Harbors, and Incidental Rights in Constitutional Criminal Procedure, 99 Mich. L. Rev. 1030, 1064-65 (2001) (discussing alternative rules that could counter injustices of unreliable identifications). 
165 Sandra Guerra Thompson, one of the few scholars to discuss the problematic standard, has noted, “simply tightening the test for determining whether there is an independent basis may not suffice to safeguard against the admission of unreliable in-court identifications.” Thompson, supra note 21, at 628. 166 Nor do I argue that in-court identifications should be barred as inherently suggestive, as one commentator has. Mandery, supra note 39, at 392. Instead, I ask the question what procedure should be used where prior lineups were conducted. Should police evade such a rule by conducting no pretrial identification procedure at all, however, courts should exclude an in-court identification as unnecessarily suggestive. 167 United States v. Greene, 591 F.2d 471, 475 (8th Cir. 1979). 168 United States v. Russell, 532 F.2d 1063, 1067 (6th Cir. 1976). 169 Manson v. Brathwaite, 432 U.S. 98, 122 (1977) (Marshall, J., dissenting). 170 See, e.g., United States v. Brown, 200 F.3d 700, 707-08 (10th Cir. 1999) (finding that, while the victim's identification of the defendants at trial was suggestive, it happened in the presence of a jury and included a full and fair cross-examination of the victim about the process). 171 See Mandery, supra note 39, at 389 (“The lack of appellate-level case law on the subject may be partially explained by the fact that few defendants ever object to the suggestiveness of in-court identifications.”). In contrast, I observe substantial appellate caselaw, which poses additional obstacles to challenging in-court identifications. 172 Also problematic, some courts conduct a suppression hearing in front of the jury, making a suppression remedy less effective. See Watkins v. Sowders, 449 U.S. 341, 349 (1981) (holding that a trial court may conduct reliability hearings in presence of the jury); see also LaFave et al., supra note 143, § 24.4, at 1161 (noting that victims are typically not sequestered at a criminal trial). 173 See Garrett, supra note 7, ch. 
9. 174 The North Carolina statute provides: “(1) Failure to comply with any of the requirements of this section shall be considered by the court in adjudicating motions to suppress eyewitness identification.” N.C. Gen. Stat. Ann. § 15A-284.52(d)(1) (West 2008). The Ohio statute reads: (1) Evidence of a failure to comply with any of the provisions of this section or with any procedure for conducting lineups that has been adopted by a law enforcement agency or criminal justice agency pursuant to division (B) of this section and that conforms to any provision of divisions (B)(1) to (5) of this section shall be considered by trial courts in adjudicating motions to suppress eyewitness identification resulting from or related to the lineup. Ohio Rev. Code Ann. § 2933.83(C)(1) (West 2011). 175 Those states include Connecticut, Georgia, Kentucky, New Jersey, New York, Massachusetts, Utah and Wisconsin. See State v. Ramirez, 817 P.2d 774, 780-81 (Utah 1991) (altering three of the “reliability” factors to focus on effects of suggestion); see also State v. Marquez, 967 A.2d 56, 69-71 (Conn. 2009) (adopting detailed criteria for assessing suggestion); Brodes v. State, 614 S.E.2d 766, 771 & n.8 (Ga. 2005) (rejecting use of eyewitness certainty jury instruction); State v. Hunt, 69 P.3d 571, 576 (Kan. 2003) (adopting Utah's five factor “refinement” of the Biggers factors); Commonwealth v. Johnson, 650 N.E.2d 1257, 1261 (Mass. 1995) (adopting a per se exclusion approach to showup identifications); State v. Cromedy, 727 A.2d 457, 467 (N.J. 1999) (requiring in some circumstances instruction on dangers of cross-racial misidentifications); People v. Adams, 423 N.E.2d 379, 383-84 (N.Y. 1981) (adopting a per se exclusion approach to showup identifications); State v. Dubose, 699 N.W.2d 582, 593-94 (Wis.
2005) (“[E]vidence obtained from an out-of-court showup is inherently suggestive and will not be admissible unless, based on the totality of the circumstances, the procedure was necessary. A showup will not be necessary, however, unless the police lacked probable cause to make an arrest or, as a result of other exigent circumstances, could not have conducted a lineup or photo array.”). 176 Brodes, 614 S.E.2d at 770. 177 See Kruse, supra note 21, at 722 n.367 ([A]lthough the Wisconsin Supreme Court adopted a strict test regarding showups, “use of the independent-source doctrine runs the risk of reintroducing the Brathwaite/Biggers reliability factors.”) 178 See Amy Klobuchar, Nancy K. Mehrkens Steblay & Hilary Lindell Caligiuri, Improving Eyewitness Identifications: Hennepin County's Blind Sequential Lineup Pilot Project, 4 Cardozo Pub. L. Pol'y. & Ethics J. 381, 405-410 (2006) (discussing benefits of double-blind administration and difficulties in its implementation, but not discussing exclusion of noncomplying or courtroom identifications); Otto H. MacLin, Laura A. Zimmerman & Roy S. Malpass, PC Eyewitness and Sequential Superiority Effect: Computer-Based Lineup Administration, 3 Law & Hum. Behav. 303, 305-308 (2005) (explaining how computer-based lineup administration can reduce administrator bias, but not discussing noncomplying or courtroom identifications); cf. Loftus et al., supra note 21, § 4-10 (describing double-blind administration in the context of lineups). 179 See, e.g., Report of the Task Force on Eyewitness Evidence, Suffolk County District Attorney, July 15, 2004, http:// www.suffolkdistrictattorney.com/press-office/reports-and-official-correspondance/report-of-the-task-force-on- eyewitness-evidence/. 180 State v. Henderson, 27 A.3d 872, 877-79 (N.J. 2011). 181 State v. Henderson, 937 A.2d 988, 999 (N.J. Super. Ct. App. Div. 
2008) (“[I]f the determinations made at the new Wade hearing require the exclusion of the out-of-court identification made by Womble, then the judge should also determine whether Womble is able to make an in-court identification of defendant from an independent source.”). 182 Henderson, 27 A.3d at 925. 183 North Carolina puts it as follows: “(3) When evidence of compliance or noncompliance with the requirements of this section has been presented at trial, the jury shall be instructed that it may consider credible evidence of compliance or noncompliance to determine the reliability of eyewitness identifications.” N.C. Gen. Stat. Ann. § 15A-284.52(d)(3) (West 2008). The Ohio language is very similar: (3) When evidence of a failure to comply with any of the provisions of this section, or with any procedure for conducting lineups that has been adopted by a law enforcement agency or criminal justice agency pursuant to division (B) of this section and that conforms to any provision of divisions (B)(1) to (5) of this section, is presented at trial, the jury shall be instructed that it may consider credible evidence of noncompliance in determining the reliability of any eyewitness identification resulting from or related to the lineup. Ohio Rev. Code Ann. § 2933.83(C)(3) (West 2011). 184 Watkins v. Sowders, 449 U.S. 341, 356 (1981) (Brennan, J., dissenting). 185 See Brian L. Cutler & Steve D. Penrod, Mistaken Identification: The Eyewitness, Psychology, and the Law 11, 263-64 (1995) (“[T]he experiments we have reviewed here provide little evidence that judges' instructions concerning the reliability of eyewitness identification enhance juror sensitivity to eyewitness identification evidence.”). 186 See supra note 131.
187 Henderson, 27 A.3d at 924 (“[W]e direct that enhanced instructions be given to guide juries about the various factors that may affect the reliability of an identification in a particular case.”); see also Wells & Quinlivan, supra note 32, at 20-21 (discussing benefits of jury instruction that is tailored to specifics of case). While jury instructions that seek to limit consideration of evidence and “blindfold” the jury may fail, efforts to more completely inform the jury using instructions that explain rationales for admonitions may produce more accurate results. See David P. Leonard, The New Wigmore: A Treatise on Evidence Selected Rules of Limited Admissibility: Regulation of Evidence to Promote Extrinsic Policies and Values § 1.11.5 (2010) (explaining that jury instructions should clearly convey both the applicable legal rules and the importance of abiding by them); Shari Seidman Diamond & Neil Vidmar, Jury Room Ruminations on Forbidden Topics, 87 Va. L. Rev. 1857, 1858 (2001) (discussing limits of blindfolding techniques given active nature of juries and advocating for reason-based explanatory instructions). On the value of offering instructions earlier in the trial, see Joel D. Lieberman & Jamie Arndt, Understanding the Limits of Limiting Instructions: Social Psychological Explanations for the Failures of Instructions to Disregard Pretrial Publicity and Other Inadmissible Evidence, 6 Psychol. Pub. Pol'y & L. 677, 705 (2000). 188 Perry v. New Hampshire, 132 S. Ct. 716, 727-28 (2012). 189 None of the examples discussed above necessarily involve absent witnesses and thus can avoid Confrontation Clause problems. The Supreme Court had earlier adopted a reliability-oriented approach to the Confrontation Clause problem, permitting nonconfrontation of witnesses if the evidence was reliable or had “particular guarantees of trustworthiness.” Ohio v. Roberts, 448 U.S. 56, 66 (1980). The Court rejected that approach in Crawford v. 
Washington, focusing instead on whether the evidence was testimonial in nature. 541 U.S. 36, 68-69 (2004). However, the Court's recent ruling on the “excited utterance” and “ongoing emergency” exception to hearsay returned to a reliability rationale, noting “the prospect of fabrication” is greatly diminished when a person is seeking law enforcement help. Michigan v. Bryant, 131 S. Ct. 1143, 1157 (2011). 190 542 U.S. 600, 608-09 (2004) (plurality opinion). In Seibert, the Court ruled that when police interrogate a suspect in custody without having given the Miranda warnings, and then, after obtaining a confession, give the warnings and ask the same questions again, the repeated statement is not admissible. Id. at 604. The plurality emphasized that the earlier statement, made without the Miranda warnings, would naturally impact the second statement. Id. at 613-14. As Justice O'Connor pointed out in dissent, the Court also considered the “psychological impact” the first unwarned statement would have on the second Mirandized statement. Id. at 627 (O'Connor, J., dissenting). She would consider the same factors, but for a different purpose, to ask whether the second statement might be independently reliable and therefore not subject to exclusion. Id. at 627-28 (O'Connor, J., dissenting). 191 See Brandon L. Garrett, The Substance of False Confessions, 62 Stan. L. Rev. 1051, 1109-11 (2010) (arguing that constitutional criminal procedure should consider reliability and, in particular, should assess whether suspects actually volunteered crucial facts). 192 See, e.g., Tex. Code Crim. Proc. Ann. art.
38.22(3)(a)-(c) (West 2011) (stating that “[n]o oral or sign language statement of an accused made as a result of custodial interrogation shall be admissible against the accused in a criminal proceeding unless” there is an “electronic recording” made of it, although containing an exception for “any statement which contains assertions of facts or circumstances that are found to be true and which conduce to establish the guilt of the accused, such as the finding of secreted or stolen property or the instrument with which he states the offense was committed”). Such rules raise additional questions. For example, that Texas statute does not offer a remedy in the situation in which statements were selectively recorded. Id. In addition, while the statute exempts certain unrecorded corroborated admissions, it does not make clear that recorded statements be excluded should they document police contamination of the confession or other evidence of unreliability. Id. 193 See, e.g., United States v. Green, 405 F. Supp. 2d 104, 107-109, 120 (D. Mass. 2005) (limiting firearms comparison testimony to conclusions expressing similarities and, interestingly, analogizing the problem of observer bias of a firearms examiner given only a single firearm to examine to the problem of showup eyewitness identifications); United States v. Hines, 55 F. Supp. 2d 62, 67-68 (D. Mass. 1999) (ruling that handwriting examiner was limited to testifying about “similarities” in documents); see also Simon A. Cole, Splitting Hairs? Evaluating ‘Split Testimony’ as an Approach to the Problem of Forensic Expert Evidence, 33 Sydney L. Rev. 459 (2011) (evaluating emerging approach by courts restricting testimonial claims of forensic experts); Jennifer L. Mnookin, The Courts, the NAS, and the Future of Forensic Science, 75 Brook. L. Rev.
1209, 1242 (2010) (arguing that as an “interim solution” courts limit fingerprint evidence “by restricting it to description of similarities and differences” rather than permit individualization claims); Jennifer L. Mnookin et al., The Need for a Research Culture in the Forensic Sciences, 58 UCLA L. Rev. 725, 750 (2011) (“Forensic analysts have often failed to recognize the limits of what conclusions are actually warranted by a given research result.”); David M. Siegel et al., The Reliability of Latent Print Individualization: Brief of Amici Curiae submitted on Behalf of Scientists and Scholars by The New England Innocence Project, Commonwealth v. Patterson, 42 Crim. L. Bull. art. 2 (2006) (“[T]estimony in terms of ‘individualization’ or ‘matches,’ without the underlying study of the base rates of the characteristics from which such conclusions are ostensibly drawn, or proficiency tests data for examiners, is misleading and fundamentally unsound. This does not mean that testimony detailing the comparison of prints by examiners would have to be excluded.”); cf. Simon A. Cole, Where the Rubber Meets the Road: Thinking About Expert Evidence as Expert Testimony, 52 Vill. L. Rev. 803, 838 (2007) (“[J]udges and legal scholars need to shift their focus from the admissibility of evidence to control of testimony.”). 194 Perry v. New Hampshire, 132 S. Ct. 716, 727 (2012).

Effects of Administrator–Witness Contact on Eyewitness Identification Accuracy
Ryann M. Haw and Ronald P. Fisher
Florida International University

Concern that lineup administrators can influence eyewitness identifications has led researchers to suggest implementing double-blind testing, an idea that police resist.
Using a typical eyewitness paradigm (video event followed by photographic identification test), the present study demonstrated that an alternative technique, minimizing the level of contact between lineup administrators and witnesses, could reduce false identifications without reducing hits. Specifically, witnesses were more likely to make decisions consistent with lineup administrator expectations when the level of contact between the administrator and the witness was high than when it was low. These results are explained within the experimenter expectancy framework. Implications for applied settings are discussed.

Eyewitnesses to crime are often asked to identify a perpetrator from a photographic lineup. Studies have shown that the sequential lineup procedure (presenting each photo individually) can elicit more accurate identifications from witnesses than the simultaneous lineup procedure (viewing all photos at one time) (Steblay, Dysart, Fulero, & Lindsay, 2001). Accordingly, researchers have suggested that the sequential procedure should replace the common police procedure of the simultaneous lineup (Wells et al., 1998). Another suggestion to improve the accuracy of lineup testing is to use a double-blind procedure, which should reduce the potential for administrator bias (Garrioch & Brimacombe, 2001; Phillips, McAuliff, Bull Kovera, & Cutler, 1999; Wells et al., 2000). In double-blind testing, both the experimenter and the participant are unaware of the condition being tested (Rosenthal, 1966). Despite research findings showing its benefits, police are resistant to using double-blind testing because they perceive it as a loss of control and as a suggestion that they cannot conduct fair lineups (Lindsay & Pozzulo, 1999; Wells et al., 2000). Without double-blind procedures, researchers are reluctant to use the sequential procedure because they fear it could easily be biased by a lineup administrator (Phillips et al., 1999; Wells et al., 1998).
The goal of the present research was to examine how a different administration technique, reducing the level of contact between the lineup administrator and the witness, would (a) affect identification decisions and (b) compare with the sequential lineup.

Sequential Superiority

Steblay et al.’s (2001) recent meta-analysis of 23 studies found that the sequential lineup was superior to the simultaneous lineup on several criteria: The sequential lineup produced more correct rejections and fewer foil and false identifications in target-absent lineups than did the simultaneous lineup, and overall, the sequential lineup produced more correct decisions (56%) than did the simultaneous lineup (48%). Despite the demonstrated superiority of the sequential over the simultaneous lineup, researchers have been reluctant to recommend its exclusive use, because they are concerned that the sequential lineup may be relatively easily influenced by the lineup administrator (Phillips et al., 1999; Sporer, Malpass, & Koehnken, 1996; Wells et al., 1998). For example, if the lineup administrator knows which photo is the suspect’s, he or she can look at the suspect’s photo longer than the others, thereby suggesting to the witness whom to choose. Currently, there has been only one empirical study of the effects of administrator influence on lineup format, and the results support the claim that the sequential procedure may be susceptible to influence from a lineup administrator (Phillips et al., 1999). Thus, there is concern that if the sequential lineup is implemented without a procedure to reduce the administrator’s influence, it may prove to be more error prone than the currently used simultaneous procedure.
This research, submitted in partial fulfillment of Ryann M. Haw’s master’s thesis, was presented in part at the biennial meeting of the American Psychology–Law Society, Austin, Texas, March 2002. We thank Margaret Bull Kovera and Janat Parker for their invaluable suggestions on the research design and Christian Meissner for comments on drafts of this article. We also thank Sonel Baute, Kia Charles, Thomas Gallagher, Vanessa Garcia, Gladys Junco, Elizabeth Perez, Jessica Ruiz, Nicole Sanz, Christie Sendina, Johanna Sevillia, Michelle Thompson, and Dexter Ullith for serving as excellent research assistants. Correspondence concerning this article should be addressed to Ryann M. Haw or Ronald P. Fisher, Department of Psychology, Florida International University, North Miami, FL 33181-3600. E-mail: ryannhaw@aol.com or fisherr@fiu.edu

Journal of Applied Psychology, 2004, Vol. 89, No. 6, 1106–1112. Copyright 2004 by the American Psychological Association. 0021-9010/04/$12.00 DOI: 10.1037/0021-9010.89.6.1106

Reducing Administrator Influence in Lineup Tasks

Research shows that experimenters can communicate their expectancies to participants through their interactions and that participants will alter their task performance to meet these expectations (Callaway, Nowicki, & Duke, 1980; Rosenthal, 1966). Presumably, an experimenter’s knowledge and beliefs about how participants should respond are communicated, directly or indirectly, through the experimenter’s behavior. There is an extensive body of research supporting experimenter expectancy effects in the general literature (e.g., Adair & Epstein, 1968; Aronson, Ellsworth, Carlsmith, & Gonzales, 1990; Rosenthal, 1966) and a limited support base in the eyewitness literature (Garrioch & Brimacombe, 2001; Phillips et al., 1999). The goal of a lineup task is to obtain an accurate, unbiased account of a witness’s memory.
This goal may be compromised by other influences, specifically by the lineup administrator, who knows the suspect’s position in the lineup. It is possible that the administrator’s actions, intentional or unintentional, may convey his or her knowledge of the suspect’s position to the witness (Garrioch & Brimacombe, 2001). If this occurs, then the accuracy of the identification is compromised and the witness’s memory is tested unfairly. For target-present lineups, administrator influence can increase the number of correct identifications made by the witness (Wells & Bradfield, 1999; Wells, Rydell, & Seelau, 1993). This may not seem at first to be a problem, but if the identification is influenced, then the witness’s memory of a suspect is not tested accurately. In target-absent lineups the problem is more severe. The administrators’ behavior can influence the lineup in such a way that an innocent person is identified and possibly falsely arrested or even convicted. Lineup tasks need to include a procedural safeguard to protect against possible administrator influence on eyewitness identifications. Two possible procedures that may decrease the likelihood of administrator influence are double-blind testing and reducing administrator–witness contact.

Double-Blind Testing

Researchers have suggested that double-blind testing is essential to reduce bias due to experimenter expectations or demand characteristics (Archer, 1993; Aronson et al., 1990; Friedman, 1967). If the lineup administrator has no knowledge of the suspect’s position, then any potential bias caused by communicating the administrator’s knowledge is eliminated. The witness’s identification may then be a better representation of his or her memory. Despite the research in favor of double-blind testing, Rosenthal (1966) suggested that even if experimenters are blind to the target response, their unintentional behavior (e.g., body language) may signal to participants whom to choose.
Even small changes in the experimenter’s body posture or expression have been shown to affect participants’ responses (Rosenthal, 1980). For instance, an experimenter’s unintentional smile may be perceived by the participant as an indication to make a particular response. These subtle influences may be even greater in new situations, such as in an experiment or lineup test, because witnesses lack clear personal expectations regarding the outcome of the task (Jussim & Eccles, 1995). In overview, although double-blind testing can overcome some experimenter influences (those that are knowledge based), it cannot control for all potentially biasing factors that can influence experimental results. This concern suggests that another procedure needs to be developed.

Reducing Contact

An alternative procedure to reduce the influence of the lineup administrator is based on eliminating the medium through which the lineup administrator affects the witness. If the medium that carries the lineup administrator’s knowledge or unintentional behavior is eliminated, then the witness should remain uninfluenced. That is, instead of altering the lineup administrator’s knowledge (i.e., by using a double-blind procedure), one can simply limit the opportunity that the lineup administrator has to convey his or her knowledge or unintentional behavior. For instance, a witness may be influenced by the lineup administrator’s staring at Photograph Number 2. If the witness cannot observe the lineup administrator’s face, then the witness will not be influenced by this behavior. In the current study, we hypothesized that if the level of contact or interaction between the lineup administrator and the witness were kept to a minimum, then administrator behavior would have less influence on the witness’s lineup selection and possibly result in more accurate decisions (Rosenthal & Rubin, 1978).
Method

Participants

Three hundred undergraduate students¹ (74 male, 226 female) were recruited from Florida International University, mainly from introductory psychology courses. Students participated for either extra credit or partial fulfillment of a course requirement.

Materials

Videos. Each participant saw a brief videotaped scene in which a woman was giving a practice speech. Halfway through the speech, a man interrupted the woman and requested to borrow some equipment from the room. The man identified himself as a university employee who needed to move an overhead projector to another room. The man was the target of the identification task and was viewable from the front and profile for approximately 20 s. In all, six different people played the role of the target.

Lineups. The lineups were created by following the two-part procedure suggested by Koehnken, Malpass, and Wogalter (1996). Lineup members were chosen initially to match the desired description and then to be similar to one another. We began by creating a general description of the desired target: White or Latin man, dark hair, average height, medium build, 18–30 years old. We then observed hundreds of students from Florida International University and photographed, with permission, 10 students who closely matched the description. All possible lineup members were given the same shirt to wear for the photograph and video to eliminate clothing bias (Lindsay, Wallbridge, & Drennan, 1987). All photos were color head-and-shoulder photos measuring 7 × 7 cm. We pilot tested these 10 photos by asking an independent group of 50 students to rate each of the photos on two scales: similarity to the targeted description and similarity to each other photo. Similarity to description was rated on a 5-point scale (1 = matches description closely, 5 = does not match description); similarity to each other was rated on a 5-point scale (1 = highly similar, 5 = not at all similar).
Of the 10 original photographs, the 7 that scored highest on both rating scales were selected for the lineup. For the 7 photographs selected for the lineup, the mean similarity-to-description rating was 2.33 (SD = 0.19), and the mean similarity-to-each-other rating was 3.39 (SD = 0.32). The photos that were designated to be the target substitute were chosen randomly from the pool of seven photographs. Therefore, on average, the target substitute looked no more like the target than did the foils. The lineup position of the target (target-present lineup) and the target substitute (target-absent lineup) varied across participants. All of the lineups were pilot tested to obtain a functional size and an effective size (Tredoux’s e; Tredoux, 1998). The average functional size was 4.41 (range, 3.17–5.50), and the average effective size across all lineups was 4.86 (range, 4.00–5.50). We also asked five experienced police officers to assess the usefulness and believability of the lineups. All five indicated that the photos looked uniform and that the individuals in the lineup were all feasible possibilities on the basis of the given description. They also indicated that they would be comfortable to use such a lineup in a real police investigation.

¹ Only 240 participants were included in the analyses discussed in this article. Sixty participants were run in a computer lineup condition, which was analyzed separately and not included in the current discussion.

Lineup instructions. Unbiased lineup instructions were based on the U.S. Department of Justice guidelines (Technical Working Group for Eyewitness Evidence, 1999). The instructions were presented auditorily on a cassette tape; in addition, a sheet of written instructions was given to the participants.
Participants were told that their response options included (a) selecting one of the photos; (b) saying, “The man is not present”; and (c) saying “I do not know.” The instructions also informed participants that after they had made a decision, they would be asked to indicate their confidence on an 11-point scale (0 = least confident; 10 = most confident). For the sequential lineup presentation the witnesses were informed that they would not be allowed to view any photo twice and that the lineup would end once they made a photo selection or viewed all of the photos and said, “The man is not present” or “I do not know.”

Procedure

Witnesses were told they were going to watch a video and answer questions regarding the video. After the video, witnesses answered various unrelated questions, which served as a filler task before the actual identification. After the 15-min filler task, the experimenter began the identification task. The experimenter, from here on known as the lineup administrator, was told which person was the target. For target-absent lineups the lineup administrators were told the number of a target substitute; however, the lineup administrators were led to believe that all lineups were target-present. The lineup administrators were also led to believe that a target identification from a witness was a correct identification. The lineup administrators were told that they could not explicitly tell the witness which photo to select but that their job was to obtain as many target identifications as possible. To increase the lineup administrators’ motivation, monetary awards were given to the lineup administrator who had the highest number of target selections. The purpose of motivating lineup administrators to obtain accurate identifications was to attempt to imitate real-life conditions where officers (a) believe the suspect is in the lineup and (b) are reinforced when witnesses identify the suspect.
Although police are not explicitly told that their goal is to obtain as many correct identifications as possible, they are reinforced professionally when cases are successfully closed, which often involves identifying the suspect. Once the lineup administrator had entered the room, the recorded lineup instructions were played and the witness was asked whether he or she understood the instructions. The witness was then administered the appropriate lineup procedure (sequential or simultaneous). There were slight variations in procedure depending on the level of administrator–witness contact (described below). The witness’s decision was recorded on the identification form. When the lineup procedure was finished, witnesses gave a confidence rating for their decision. The witnesses were debriefed and thanked for their time. Each session lasted 30–40 min.

Administrator–witness high contact. In this condition, the lineup administrator sat at a table with the witness, approximately 1–2 ft (0.3–0.6 m) away and directly in front of or beside the witness, while showing the lineup photos. The administrator played the lineup instructions for the witness and laid the photos on the table. The administrator indicated the witness’s choice on a decision form and recorded the witness’s confidence. In the simultaneous procedure all six photos were laid out on the table. For the sequential procedure the photos were presented one at a time. If a “no” decision was made, the administrator showed a new photo. Witnesses were not able to view any photo a second time. If the witness made a “yes” response or after all photos had been shown, the lineup administrator asked the confidence question.

Administrator–witness low contact. In this condition the administrator played the lineup instructions for the witness and then handed to the witness the instructions, the photos, and the decision form.
The lineup administrator sat in a chair about 3–5 ft (0.9–1.5 m) to the side and slightly behind the witness out of the witness’s direct view. The lineup administrator remained in the room and could view the witness to ensure that he or she followed the proper procedure; however, the witness could not see the administrator directly while performing the identification task. If the witness violated the procedure or asked a question, the lineup administrator would tell the witness to review the written instructions and follow the procedure. Simultaneous and sequential lineups were run using the same photo layout in both the high- and low-contact conditions. However, in the low-contact conditions, the witness laid out the photos and filled out all forms instead of the lineup administrator.

Design

The present study was a 2 (format: sequential vs. simultaneous) × 2 (administrator–participant contact: high vs. low) × 2 (lineup type: target-present vs. target-absent) factorial design with all variables manipulated between groups. Participants were assigned randomly to one of the eight conditions, and there were 30 participants in each condition. Eight different people performed the role of lineup administrator/experimenter: All were undergraduate research assistants who were naive to the hypotheses of the experiment. In all conditions, the lineup administrators were told the lineup procedure, the level of administrator–participant contact, and which photo was the suspect. The lineup administrators were instructed that all lineups were target-present. In fact, however, half of the lineups were target-present and half were target-absent.

Results

Eight different decisions were identified as dependent variables. For target-present lineups they were hit, foil identification, rejection of the lineup, and do not know; for target-absent lineups they were correct rejection, false identification, foil identification, and do not know.
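The 2 × 2 × 2 between-groups design described in the Design section can be illustrated with a minimal sketch. It enumerates the eight conditions and fills each with 30 participants in random order. The names `FACTORS`, `PER_CELL`, and `build_assignment`, and the fixed seed, are assumptions made for illustration; the article does not describe how random assignment was actually carried out.

```python
import itertools
import random
from collections import Counter

# Factor levels as named in the Design section.
FACTORS = {
    "format": ["sequential", "simultaneous"],
    "contact": ["high", "low"],
    "lineup_type": ["target-present", "target-absent"],
}
PER_CELL = 30  # participants per condition, as reported


def build_assignment(seed=0):
    """Return a shuffled list of condition tuples, one per participant.

    Hypothetical helper: balanced random assignment is assumed here,
    not taken from the article.
    """
    cells = list(itertools.product(*FACTORS.values()))
    slots = [cell for cell in cells for _ in range(PER_CELL)]
    random.Random(seed).shuffle(slots)
    return slots


assignment = build_assignment()
cell_counts = Counter(assignment)
print(len(cell_counts), "conditions;", len(assignment), "participants")
```

Eight conditions with 30 participants each account for the 240 participants included in the analyses.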
For all analyses, significance levels were set at .05. Preliminary chi-square analyses revealed no differences among the eight lineup administrators, the seven lineups, or the six videos for any of the eight decision types. Therefore, the data were collapsed across lineup administrators, lineups, and videos for all further analyses.

Target-Present and Target-Absent Analyses

Several 2 (format: sequential vs. simultaneous) × 2 (administrator–participant contact [contact]: high vs. low) log-linear analyses were performed for each of the eight decision types. Separate analyses were run for target-present (n = 120) and target-absent lineups (n = 120). For all analyses, likelihood ratio chi-squares are reported, and unless otherwise noted there is 1 degree of freedom and N = 120.

For false identifications (target substitute identification in target-absent lineup), there was a two-way interaction between format (simultaneous vs. sequential) and level of contact (high vs. low), χ² = 6.72 (effect size = .24, power = .74) (see Table 1). For simultaneous lineups, there were significantly more false identifications when there was a high level of contact (30%) than when there was a low level of contact (3%), χ² = 8.65 (effect size = .15, power = .37). For sequential lineups, there was no difference in false identifications for the two levels of contact (high = 7%; low = 13%), χ² = 0.75, ns (effect size = .10, power = .20).

For hits (target identification in target-present lineup), there was no significant effect of format (χ² = 2.16, ns; effect size = .13, power = .31) or contact (χ² = 0.14, ns; effect size = .03, power = .07). The proportions for all decisions can be found in Table 2.

For lineup rejections in target-present lineups, there was a significant difference between the levels of contact, χ² = 3.86 (effect size = .18, power = .50). Specifically, there were more lineup rejections for low (18%) than for high contact (7%). For foil identifications there were no effects of contact or format with either target-present or target-absent lineups (.001 ≤ χ² ≤ 1.61, ns).
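As a rough check on the simple-effects statistics for false identifications, the likelihood ratio chi-square can be recomputed from cell counts. The sketch below assumes 30 witnesses per cell (as stated in the Design section) and reconstructs counts from the reported percentages (30% and 3% for simultaneous lineups, 7% and 13% for sequential lineups); the function name and the reconstructed counts are illustrative assumptions, not taken from the authors' analysis.

```python
import math


def likelihood_ratio_chi2(table):
    """Likelihood ratio chi-square, G^2 = 2 * sum(O * ln(O / E)), for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # an empty cell contributes nothing to the sum
                total += observed * math.log(observed / expected)
    return 2.0 * total


# Assumed counts (30 witnesses per cell), reconstructed from the reported percentages.
# Rows: high contact, low contact. Columns: false identification, any other response.
simultaneous = [[9, 21], [1, 29]]  # 30% vs. 3% false identifications
sequential = [[2, 28], [4, 26]]    # 7% vs. 13% false identifications

g2_simultaneous = likelihood_ratio_chi2(simultaneous)
g2_sequential = likelihood_ratio_chi2(sequential)
print(round(g2_simultaneous, 2), round(g2_sequential, 2))
```

Under these assumed counts the two values come out close to the reported 8.65 and 0.75, which suggests the reconstruction is reasonable, although the original analyses were run as log-linear models rather than with this hand calculation.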
For the foil-identification analyses, effect sizes ranged from .01 to .12, and power ranged from .01 to .25.

For “I do not know” responses, there was a significant difference between the two lineup formats (simultaneous and sequential) in target-absent lineups, χ² = 5.46 (effect size = .21, power = .65). Significantly more “I do not know” responses were chosen for the sequential lineup (18%) than for the simultaneous lineup (5%).

Choosing Rates

We ran additional log-linear analyses for all 240 participants and combined all instances in which the witness chose a person from the lineup: hits and foil identifications from target-present lineups and false and foil identifications from target-absent lineups. Participants were significantly more likely to choose someone from the lineup when the level of contact was high (73%) than when the level of contact was low (59%), χ² = 5.49 (effect size = .15, power = .65). There was no effect of lineup format (χ² = 3.23, ns; effect size = .12, power = .44).

To test the lineup administrator’s influence on witnesses, we examined the witnesses’ propensity to choose the suspect (matching the lineup administrator’s expectation) rather than one of the foils. If witnesses were not influenced by the lineup administrator, they should have selected the target substitute only 1/6 (.17) of the time. For the target-absent lineups, witness selections in the high-contact conditions were compatible with the administrator’s beliefs: When witnesses made a positive identification, they chose the target substitute significantly more often (proportion = .31) than chance (.17), χ² = 5.08 (N = 36; effect size = .38, power = .63). By comparison, in low-contact conditions, when witnesses made a positive identification, they chose the target substitute (proportion = .19) no more often than chance (.17), χ² = 0.06, ns (N = 27; effect size = .05, power = .06).

A parallel analysis was conducted for lineup format.
The simultaneous lineup was influenced by the lineup administrator’s expectations: For target-absent lineups, when witnesses chose someone from the lineup, they chose the suspect (probability = .29) significantly more often than chance (.17), χ² = 3.90 (N = 34; effect size = .34, power = .51). In contrast, for sequential lineups, witnesses chose the target substitute (proportion = .21) no more often than chance (.17), χ² = 0.36, ns (N = 29; effect size = .11, power = .09).

Table 1
Simple Effects of Format and Contact for False Identifications

Photo lineup format    Contact: Low (%)    Contact: High (%)    χ²(1, N = 120)
Simultaneous           3                   30                   8.65***
Sequential             13                  7                    0.75

Note. There was a significant interaction between format and contact, χ²(1, N = 120) = 6.72, p < .01.
*** p < .01.

Table 2
Main Effects of Format and Contact for Target-Absent and Target-Present Lineups

                         Format                            Contact
Lineup and decision      Sim (%)   Seq (%)   χ²(1, 120)    Low (%)   High (%)   χ²(1, 120)
Target-absent lineups
  Correct rejections     38        33        0.33          43        28         2.96*
  Don’t know             5         18        5.46**        12        12         0.00
  Foil identifications   40        38        0.04          37        42         0.32
Target-present lineups
  Hits                   62        48        2.16          53        57         0.14
  Foil identifications   25        25        0.00          20        30         1.61
  Lineup rejections      10        15        0.69          18        7          3.86**
  Don’t know             3         12        3.17*         8         7          0.12

Note. Sim = simultaneous lineup; seq = sequential lineup.
* p < .10. ** p < .05.

Discussion

In the current study we explored the potential for lineup administrators to influence eyewitness identification decisions. Researchers in the area believe this to be an important issue, as evidenced by the warning in the National Institute of Justice guide Eyewitness Evidence: A Guide for Law Enforcement that “investigators’ unintentional cues may negatively impact the reliability of eyewitness evidence” (Technical Working Group for Eyewitness Evidence, 1999, p. 9). Nevertheless, there is a paucity of empirical research in the area (Garrioch & Brimacombe, 2001; Phillips et al., 1999). The present study adds to the corpus of
empirical evidence by showing that lineup administrators can influence eyewitness decisions. More important, it suggests a theoretically guided solution to the problem that should be easily implemented and acceptable to police.

Experimenter Expectancy Effects

Lineup administrators were fed two critical pieces of information about each lineup. They were led to believe that the lineup contained the target person, and they were informed which photograph was that of the “suspect.” The data strongly suggest that the lineup administrators conveyed these beliefs to witnesses, especially in the high-contact conditions, as witnesses behaved in a fashion consistent with the lineup administrator’s beliefs. Witnesses were more likely to choose someone from the lineup when administrator contact was high than when it was low, and witnesses were more likely to select the target substitute than the foils, even though the target substitute looked no more like the target than did the foils. These findings fit well within the framework of experimenter expectancy effects (Rosenthal, 1966). They are troubling, however. That a lineup administrator can influence a witness’s decision violates the spirit of a lineup test, which is supposed to be a pure test of a witness’s memory, uninfluenced by other factors, and especially by the lineup administrator.

Given the influence of the lineup administrator on the witness’s choosing, how might we minimize or eliminate the effect? Two possible approaches are (a) to determine after the fact that the lineup administrator influenced the witness’s decision, and then to use that knowledge retroactively to evaluate the witness’s decision, and (b) to eliminate proactively the lineup administrator’s influence by altering the lineup procedure.

The retroactive approach depends on witnesses’ being aware of the lineup administrator’s influence on their decisions.
Although we did not measure this in our study, two recent studies suggest that witnesses are generally unaware of this influence. Haw, Mitchell, and Wells (2003) used a procedure identical to that of the present experiment (a high-contact condition with simultaneous lineups) and found that almost none of the witnesses felt the lineup administrator pressured them to make a response (5.0%) or indicated which photo they should select (1.2%). In Garrioch and Brimacombe (2001), participant–witnesses viewed a mock crime video and then performed a lineup task administered by someone who either knew or did not know the suspect’s position in the lineup. Parallel to our findings, the witnesses’ identification behavior (confidence in their identification decisions) was influenced by the lineup administrator’s knowledge: Witnesses were most confident when their lineup selection confirmed the administrator’s beliefs. Nevertheless, almost none of the witnesses (4.7%) or lineup administrators (0%) were aware of the lineup administrator’s influence. If witnesses and lineup administrators are unaware of the lineup administrator’s influence, a retroactive solution to the problem seems unlikely to succeed.

A proactive approach may be more successful. One approach follows from Rosenthal (1966), who suggested that reducing contact would reduce the opportunity for the experimenter to communicate his or her expectancies and influence the participant’s responses. This strategy was implemented in the present study, and the findings certainly bear out Rosenthal’s suggestion. Reducing the level of contact between the lineup administrator and the witness decreased the administrator’s influence on the witness’s behavior. This had the salutary effect in the target-absent condition of reducing the choosing rate and, more important, reducing the likelihood of obtaining a false identification of the target substitute.
Furthermore, there was no apparent cost, as it did not reduce the hit rate.

If the increase in false identifications was caused by experimenter expectancy effects, why didn’t high contact also lead to a concomitant increase in hits, since the lineup administrator also expected the suspect to be selected in the target-present condition? Although we cannot explain why the hit rates were comparable for the high- and low-contact conditions, we note that this finding is common in the eyewitness literature. Several researchers have found experimental manipulations that reduced false identifications in target-absent lineups but had no observable effect on hits in target-present lineups. This occurs for simultaneous versus sequential lineup presentation (Cutler & Penrod, 1988; Lindsay, Lea, & Fulford, 1991; Lindsay, Lea, Nosworthy, et al., 1991; Lindsay & Wells, 1985; Sporer, 1993; but see Steblay et al., 2001), biased versus unbiased lineup instructions (Malpass & Devine, 1981; Steblay, 1997), foil quality (Lindsay & Pozzulo, 1999), and clothing bias (Lindsay et al., 1987). Practically, this pattern is encouraging, as it suggests that one may be able to develop other innovative procedures to reduce false identifications without risking a reduction in hits. Theoretically, we suspect that this may be yet another instance of the outshining hypothesis, that is, providing a very effective retrieval cue (the perpetrator) minimizes the impact of other influences (Smith, 1988). In the present experiment, when the target photograph was provided at test (target-present condition), it minimized the influence of the lineup administrator’s knowledge. Obviously, there is no independent evidence in our study to support the outshining hypothesis. As such, we offer this hypothesis only tentatively, to stimulate follow-up research.

Practical Implications

In applying these results to real-world applications, it is important to note that police generally conduct lineups by using a single-blind, simultaneous procedure with high administrator–witness contact (Steblay et al., 2001). In the current study this combination produced the greatest number of false identifications: ten times more false identifications than when using a simultaneous lineup with low contact. Therefore, current police procedures may actually contribute to increasing the number of false identifications and convictions of innocent people.

The present data suggest that police may be able to reduce false identifications by either using a sequential lineup or reducing contact. If police use a sequential lineup they may reduce false identifications but they may also decrease hits (Steblay et al., 2001). If police reduce the contact between a lineup administrator and a witness they could potentially reduce false identifications with no apparent influence on hits. Eliminating or reducing administrator–witness contact may also be a practical solution that police departments and researchers can implement easily.

Researchers have suggested that double-blind testing could reduce lineup administrator bias (Rosenthal, 1966; Wells et al., 2000). Police, however, are resistant to using the procedure because they do not want to have a lineup conducted by an untrained person, whom they would later have to train to testify in court. Moreover, police do not want to have other detectives brought in on cases unnecessarily (Wells et al., 2000). Finally, in small police departments it may be difficult to find a person who is blind to the identity of the suspect. The alternative suggestion, reducing the level of contact between the administrator and the witness, may allow detectives to participate in lineup identifications and be able to testify at trial without adding bias to the identification procedure.

When the low-contact procedure was used in the present experiment, there was no difference between simultaneous and sequential lineups. Accordingly, researchers who favor the sequential lineup but have concerns about its sensitivity to influence can recommend its use in conjunction with a low-contact procedure. Police, who may feel more comfortable using a simultaneous lineup, can continue to do so in conjunction with low contact without risking undue false identifications. Most important for real-world application, we expect police not to resist implementing a low-contact procedure.

Future Directions and Policy Implications

Although the current data are encouraging, this is the first demonstration that limiting administrator–witness contact may reduce errors in eyewitness identifications. Replication is therefore in order before making strong recommendations. If our results do replicate, we recommend that police departments standardize the positions of the witness and administrator so as to minimize contact. We also suggest that police notate these positions for later critical review, just as they are encouraged to notate other lineup procedures (Technical Working Group for Eyewitness Evidence, 1999).

Future researchers should explore additional ways to reduce administrator–witness contact. Finding such convergence across techniques would strengthen the theoretical explanation. Practically, having a range of alternatives may also give police departments with different needs and resources more options to improve their lineup procedures. Finally, the current study was limited in that we did not measure directly the type or amount of behaviors administrators used to influence witnesses. Future research will profit from directly measuring administrators’ behaviors so as to better understand and control the process of administrator influence of identification decisions.

References

Adair, J. G., & Epstein, J. S. (1968).
Verbal cues in the mediation of experimenter bias. Psychological Reports, 22, 1045–1053.

Archer, D. (1993). The methodological imagination: Insoluble problems or investigable questions? In P. David (Ed.), Interpersonal expectations: Theory, research, and applications (pp. 337–349). New York: Cambridge University Press.

Aronson, E., Ellsworth, P. C., Carlsmith, J. M., & Gonzales, M. H. (1990). On the avoidance of bias. In Methods of research in social psychology (2nd ed., pp. 292–314). New York: McGraw-Hill.

Callaway, J. W., Nowicki, S., & Duke, M. P. (1980). Overt expression of experimenter expectancies, interaction with subject expectancies, and performance on a psychomotor task. Journal of Research in Personality, 14, 27–39.

Cutler, B. L., & Penrod, S. D. (1988). Improving the reliability of eyewitness identification: Lineup construction and presentation. Journal of Applied Psychology, 73, 281–290.

Friedman, N. (1967). The social nature of psychological research. New York: Basic Books.

Garrioch, L., & Brimacombe, C. A. E. (2001). Lineup administrators’ expectations: Their impact on eyewitness confidence. Law and Human Behavior, 25, 299–315.

Haw, R., Mitchell, T., & Wells, G. (2003, July). The influence of line-up administrator knowledge and witness perceptions on eyewitness identification decisions. Poster session presented at the annual meeting of the European Association of Psychology and Law, Edinburgh, Scotland.

Jussim, L., & Eccles, J. (1995). Naturally occurring interpersonal expectancies. In N. Eisenberg (Ed.), Social development (pp. 74–108). Thousand Oaks, CA: Sage.

Koehnken, G., Malpass, R. S., & Wogalter, M. S. (1996). Forensic applications of line-up research. In S. L. Sporer, R. S. Malpass, & G. Koehnken (Eds.), Psychological issues in eyewitness identification (pp. 205–231). Mahwah, NJ: Erlbaum.

Lindsay, R. C. L., Lea, J. A., & Fulford, J. A. (1991). Sequential lineup presentation: Technique matters.
Journal of Applied Psychology, 76, 741–745.

Lindsay, R. C. L., Lea, J. A., Nosworthy, G. J., Fulford, J. A., Hector, J., LeVan, V., & Seabrook, C. (1991). Biased lineups: Sequential presentation reduces the problem. Journal of Applied Psychology, 76, 796–802.

Lindsay, R. C. L., & Pozzulo, J. D. (1999). Sources of eyewitness identification error. International Journal of Law and Psychiatry, 22, 347–360.

Lindsay, R. C. L., Wallbridge, H., & Drennan, D. (1987). Do the clothes make the man? An exploration of the effect of lineup attire on eyewitness identification accuracy. Canadian Journal of Behavioural Science, 19, 464–478.

Lindsay, R. C. L., & Wells, G. L. (1985). Improving eyewitness identifications from lineups: Simultaneous versus sequential lineup presentation. Journal of Applied Psychology, 70, 556–564.

Malpass, R. S., & Devine, P. G. (1981). Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 66, 482–489.

Phillips, M. R., McAuliff, B. D., Bull Kovera, M., & Cutler, B. L. (1999). Double-blind photoarray administration as a safeguard against investigator bias. Journal of Applied Psychology, 84, 940–951.

Rosenthal, R. (1966). Experimenter effects in behavioral research. New York: Appleton-Century-Crofts.

Rosenthal, R. (1980). Replicability and experimenter influence: Experimenter effects in behavioral research. Parapsychology Review, 11, 5–11.

Rosenthal, R., & Rubin, D. B. (1978). Interpersonal expectancy effects: The first 345 studies. Behavioral and Brain Sciences, 3, 377–415.

Smith, S. M. (1988). Environmental context-dependent memory. In G. M. Davies & D. M. Thomson (Eds.), Memory in context: Context in memory (pp. 13–34). New York: Wiley.

Sporer, S. L. (1993). Eyewitness identification accuracy, confidence, and decision times in simultaneous and sequential lineups. Journal of Applied Psychology, 78, 22–33.

Sporer, S. L., Malpass, R. S., & Koehnken, G. (Eds.). (1996).
Psychological issues in eyewitness identification. Mahwah, NJ: Erlbaum.

Steblay, N. M. (1997). Social influence in eyewitness recall: A meta-analytic review of lineup instruction effects. Law and Human Behavior, 21, 283–297.

Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2001). Eyewitness accuracy rates in sequential and simultaneous lineup presentations: A meta-analytic review. Law and Human Behavior, 25, 459–473.

Technical Working Group for Eyewitness Evidence. (1999). Eyewitness evidence: A guide for law enforcement (NCJRS No. 178240). Washington, DC: National Institute of Justice.

Tredoux, C. G. (1998). Statistical inference on measures of lineup fairness. Law and Human Behavior, 22, 217–237.

Wells, G. L., & Bradfield, A. L. (1999). Measuring the goodness of lineups: Parameter estimation, question effects, and limits to the mock witness paradigm. Applied Cognitive Psychology, 13, S27–S39.

Wells, G. L., Malpass, R. S., Lindsay, R. C. L., Fisher, R. P., Turtle, J. W., & Fulero, S. M. (2000). From the lab to the police station: A successful application of eyewitness research. American Psychologist, 55, 581–598.

Wells, G. L., Rydell, S. M., & Seelau, E. P. (1993). The selection of distractors for eyewitness lineups. Journal of Applied Psychology, 78, 835–844.

Wells, G., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law and Human Behavior, 22, 603–647.

Received August 12, 2002
Revision received August 26, 2003
Accepted September 15, 2003

Lineup Administrator Influences on Eyewitness Identification Decisions

Steven E. Clark, Tanya E. Marshall, and Robert Rosenthal
University of California, Riverside

The present research examines how a lineup administrator may influence eyewitness identification decisions through different forms of influence, after providing the witness with standard, unbiased instructions. Participant-witnesses viewed a staged crime and were later shown a target-present or target-absent lineup. The lineup administrators either remained silent while the witness examined the lineup, made ostensibly cautionary statements to the witness, or prompted the witness to identify the person in the lineup who seemed most similar to the perpetrator. These two forms of influence, denoted as subtle-influence and similarity-influence conditions, led to different patterns of identification results. Results for the similarity-influence condition were generally consistent with criterion shift and relative judgment models of eyewitness decision making. Results for the subtle-influence condition, however, cannot be explained by alterations in the decision rule. A weighted matching model is outlined to explain results from the subtle-influence condition. Witnesses seemed generally unaware of the attempts by the lineup administrator to influence their decision, although some noted it, and the probative value of suspect identifications was lower for those who did note it. Implications for theory and policy are discussed.

Keywords: eyewitness identification, social influence, memory, decision

The general procedures for eyewitness identification are well-known. The police officer shows the witness a lineup of several individuals, one of whom is suspected of having committed the crime.
The witness may identify that suspect, or perhaps one of the lineup fillers, or may identify no one. The witness’s response is, of course, a product of memory and decision processes, but these cognitive processes do not operate in a social vacuum. Eyewitness identification procedures take place in a rich social context. At a minimum it involves an exchange between the witness and the person who administers the lineup. Thus, the lineup administrator can have considerable influence over the witness’s decision. Indeed, Phillips, McAuliff, Kovera, and Cutler (1999) and Haw and Fisher (2004) demonstrated that if the lineup administrator knows the position of the suspect and is motivated to obtain an identification of the suspect, he or she will obtain more suspect identifications. Such results fit within a long tradition of research that has shown that one person’s expectations can be conveyed to another person in such a way that that person will respond to questions or behave in precisely the expected or desired way (Rosenthal, 1967, 1976; Rosenthal & Fode, 1963; see Rosenthal, 2002, for a more recent review).

There are (at least) two problems raised by Phillips et al.’s (1999) and Haw and Fisher’s (2004) results. First, many of the identifications were false identifications of individuals taking on the role of an innocent suspect. Such errors (false identifications of people who are suspected by the police but innocent) are precisely the kind of eyewitness error that may lead to wrongful convictions. In nearly three fourths of the DNA exonerations listed by the Innocence Project (www.innocenceproject.org), the conviction was based in part on a mistaken identification, and many legal scholars have proposed that eyewitness error is the leading cause of wrongful convictions in the United States (Gross, Jacoby, Matheson, Montgomery, & Patil, 2005; Huff, Rattner, & Sagarin, 1996; Scheck, Neufeld, & Dwyer, 2000; Wells et al., 1998).
Second, allowing for variation across jurisdictions, the independence of the witness’s identification is an important consideration in the admissibility of the identification evidence at trial (State of Utah v. Long, 1986) as well as the weight jurors may give the evidence, as instructed by the court (U.S. v. Telfaire, 1972). Many of the identifications in the Phillips et al. and Haw and Fisher studies were likely not the product of the witness’s independent recollection.

To reduce the influence of the lineup administrator, eyewitness identification researchers have made two key recommendations (Wells et al., 1998). First, the instructions to the witness should be unbiased with respect to the presence or absence of the perpetrator, correctly acknowledging that the perpetrator may, or may not, be in the lineup. Second, the lineup administrator should be blind as to the position of the suspect within the lineup. The purpose of these two procedural safeguards is to reduce the influence of the lineup administrator on the witness’s decision, both in terms of whether to make an identification and whom to identify. However, these procedural safeguards may not provide a perfect prophylaxis against the influence of a lineup administrator who is highly motivated to obtain an identification from the witness.

Author note. Steven E. Clark, Tanya E. Marshall, and Robert Rosenthal, Psychology Department, University of California, Riverside. The research was supported by Grant SES 0214373 from the National Science Foundation to Steven E. Clark. We thank our lineup administrators, Samantha Carter, Lysa DiDonato, Barbara Dern, Maria Erives, Ivonne Figueroa, and Rie Takyu for conducting the experiment. Thanks also to Siegfried Sporer for sharing raw data from Sporer (1992), and to Rod Lindsay for insightful discussions. Correspondence concerning this article should be addressed to Steven E. Clark, Psychology Department, University of California, Riverside, Riverside, CA 92521. E-mail: clark@ucr.edu

Journal of Experimental Psychology: Applied, 2009, Vol. 15, No. 1, 63–75. © 2009 American Psychological Association. 1076-898X/09/$12.00. DOI: 10.1037/a0015185

The present paper examines how comments made by the lineup administrator may influence the witness, even when the lineup administrator gives a “may or may not” instruction, and is blind as to the position of the suspect in the lineup. At issue is a fundamental question about how witnesses make identification decisions and how their decisions may be altered through subtle forms of social influence. Statements made by the lineup administrator may seem on their literal surface to be cautionary, helpful, or encouraging, but may influence the witness’s decision to make, rather than not make an identification, and may also influence the witness’s decision as to whom to identify. These influences are examined in two conditions of the present experiment that are compared to a no-influence control condition.

In the subtle-influence condition of the present experiment, if after several seconds, the witness had not made an identification, the lineup administrator began making statements to the witness such as “take your time,” and “look at each photograph carefully.” In this context, the pragmatic meaning of these statements may be quite different from the literal request (see, e.g., H. H. Clark, 1979; Grice, 1975). The lineup administrator’s comments began after 12 s, timed so as to intervene at a point when the witness might be leaning toward not making an identification. Thus, the intent (or conversational implicature, in Grice’s terms), as it is interpreted by the witness, may not literally be to take a long time, but rather to continue looking for the perpetrator in the lineup, and to not give up, or indicate that the perpetrator is not in the lineup.
If this were indeed the case, such seemingly innocuous comments to “take your time” and “look at each photograph carefully” may serve to make a witness more willing to make, rather than not make, a positive identification.

Of course, the kinds of statements described above do not give any indication as to how or whom to select from the lineup. In the similarity-influence condition, lineup administrators asked witnesses the ostensibly helpful, but more directive question, “So, is there anyone in the lineup who looks more like him than anyone else?” If the witness responded in any affirmative way to this question (“yes,” “maybe,” “not sure”), the witness was asked if that person might actually be the perpetrator. It is important to note that the lineup administrator’s questions did not direct the witness toward any particular person in the lineup. The administrator did not ask, for example, “Do you think it could be No. 3?” The timing of the administrator intervention was the same as for the subtle-influence condition, starting after 12 s, if the witness had not already made a definitive response.

By making witnesses more willing to make identifications, social influence following unbiased instructions may undermine the intent of unbiased instructions, producing a pattern of results akin to those shown in comparisons between biased and unbiased instructions. Recent meta-analyses show that, on the whole, biased instructions lead to increases in both correct and false suspect identification rates, and a decrease in the diagnosticity or probative value of a suspect identification (S. E. Clark, 2005; S. E. Clark & Godfrey, 2009). There are, however, exceptions to this general pattern that will be discussed later (see also, Steblay, 1997). The point here is that social influence following unbiased instructions may have the same effect as giving biased instructions in the first place.
To the extent that this is the case, common underlying mechanisms may be involved, a point that will be discussed further in the next section.

Theory-Based Predictions

How should these various forms of social influence affect the pattern of witness responses? If these statements do in fact redirect witnesses back toward making an identification rather than a nonidentification response, then the total identification rate should increase. We considered two possibilities as to how this might occur, one based on a simple lowering of the witness’s decision criterion, and another based on a shift toward relative judgments (Wells, 1984). The predictions of both models were explored within the framework of the WITNESS model (S. E. Clark, 2003).

The details of the WITNESS model are presented elsewhere (S. E. Clark, 2003), and only a brief description is necessary here. The WITNESS model assumes that each lineup alternative is compared to the witness’s memory of the perpetrator, resulting in N match values for an N-person lineup. Each match value is a measure of how closely a given lineup member matches the witness’s memory of the perpetrator. A decision rule applied to those match values determines whether the witness makes an identification or a nonidentification response. Several versions of the model, with varying assumptions about lineup composition and decision rules, were considered, and they all showed the same general pattern of results. Two decision rules are described in greater detail below: the Best Above Criterion rule, and an augmented Best Above Criterion rule that allows identifications based on relative judgments (Wells, 1984). We refer to this second model as the Best Above/Relative Difference rule.

According to the Best Above Criterion rule, the lineup alternative that provides the best match to the witness’s memory of the perpetrator is identified, if that match is above a decision criterion c.
The Best Above/Relative Difference rule allows the witness to identify the best match even if the best match is not above criterion, provided that the best match is sufficiently better than any other lineup member. More formally, the rule is: Identify BEST if BEST > c or if BEST − NEXT > cDIFF, where BEST is the best match, NEXT is the next-best match, and cDIFF is a decision criterion for evaluating the relative match. The underlying assumption in this Best Above/Relative Difference model is that the witness becomes more likely to identify the best match by lowering cDIFF, allowing identifications based on smaller and smaller differences between the best and next-best matches.

For each decision rule, response probabilities were generated for suspect identifications, foil identifications, and nonidentification responses, for both target-present and target-absent lineups. Target-present lineups represent those real-world cases in which the suspect is guilty, whereas target-absent lineups represent those cases in which the person suspected by the police is innocent of the crime. We also calculated a measure of response diagnosticity or probative value for suspect identifications by dividing correct suspect identification rates (from target-present lineups) by the total suspect identification rate (target-present plus target-absent lineups). This conditional probability represents the proportion of suspect identifications that are correct identifications of the perpetrator rather than false identifications of a person suspected by the police but innocent. This calculation of probative value addresses the critical question: Given that the suspect was identified, what is the likelihood that he or she is guilty? This, of course, is the central question before the juror.

The target-present and target-absent lineup response probabilities and probative value calculations based on 3,000 computer simulations are shown in Figure 1.
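The two decision rules described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text, not the WITNESS simulation itself: the function names, the strict inequalities, and the example match values are assumptions introduced here, and the way WITNESS generates match values is not reproduced.

```python
from typing import Optional, Sequence


def best_above_criterion(matches: Sequence[float], c: float) -> Optional[int]:
    """Return the index of the best match if it exceeds criterion c.

    Returns None for a nonidentification response.
    """
    best = max(range(len(matches)), key=lambda i: matches[i])
    return best if matches[best] > c else None


def best_above_relative_difference(
    matches: Sequence[float], c: float, c_diff: float
) -> Optional[int]:
    """Identify BEST if BEST > c, or if BEST - NEXT > c_diff; otherwise None."""
    if len(matches) < 2:
        raise ValueError("the relative rule needs at least two lineup members")
    # Rank lineup positions from best match to worst match.
    ranked = sorted(range(len(matches)), key=lambda i: matches[i], reverse=True)
    best, nxt = matches[ranked[0]], matches[ranked[1]]
    if best > c or (best - nxt) > c_diff:
        return ranked[0]
    return None


# Hypothetical match values for a six-person lineup (illustrative only).
example_matches = [0.10, 0.42, 0.20, 0.05, 0.15, 0.12]
```

With these hypothetical values and c = 0.5, the first rule yields no identification, whereas the second rule identifies position 1 once c_diff is lowered below the gap between the best and next-best matches. This mirrors the idea that lowering cDIFF makes an identification more likely.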
The Best Above Criterion model is shown in the top panel and the Best Above/Relative Difference model is shown in the bottom panel. Response functions were generated by varying the decision criterion c for the Best Above Criterion model, and by varying the relative difference criterion cDIFF for the Best Above/Relative Difference model. Although the shape of the response functions is different for the two decision models, the overall patterns are very much the same: As decision criteria are lowered, the rate of nonidentification responses decreases and the rates of suspect and foil identifications increase, for both target-present and target-absent lineups. In addition, the probative value of a suspect identification is predicted to decrease. In the models this decrease in probative value arises because the increase in the correct identification rate is proportionally smaller than the increase in the false identification rate. These predictions are tested in an experiment that is presented next.

Method

The experiment presented witnesses with a staged crime, followed by either a target-present or target-absent lineup, with one of three lineup administrator conditions. After a recorded presentation of the lineup instructions, the lineup administrator either remained quiet in the no-influence condition, made ostensibly cautionary statements in the subtle-influence condition, or asked the witness directly to consider a most-similar lineup member in the similarity-influence condition. In the two influence conditions, the lineup administrator’s statements began if the witness did not make a response before 12 s. The 12-s interval was chosen to allow identifying witnesses to make their identifications but to intervene with witnesses before they could make definitive nonidentification responses.
Previous research has shown that identifications are typically made more quickly than nonidentifications (Sporer, 1992, 1993), and that nonidentifications may be given in as little as 10 s (Sporer, 1992).

[Figure 1 panels: Target Present, Target Absent, and Probative Value. Vertical axes: response probability for Suspect, Foil, and No ID responses; suspTP/(suspTP + suspTA) for probative value. Horizontal axes: criterion shift (top row) and difference criterion shift (bottom row).]

Figure 1. Representative set of identification predictions given by the WITNESS model for suspect, foil, and nonidentifications in target-present and target-absent lineups. Rightmost panels show the probative value of a suspect identification [suspTP/(suspTP + suspTA)]. Top panels show predictions from Best Above Criterion model. Bottom panels show predictions for an augmented Best Above Criterion model that allows for identifications based on relative judgments when all matches are below criterion.

We calculated false identification rates for target-absent lineups in two different ways. First, we designated as the innocent suspect a person judged to be consistent with a description of the perpetrator, but not a “dead-ringer” for the perpetrator. The rationale for this decision was that we did not want to confuse influence effects with similarity effects. In other words, we wanted to be sure that any variation in false identification rates was due to the lineup administrator’s influence, rather than an expansion of a dead-ringer or biased lineup effect. However, this leaves open the question: What if the lineups were very biased?
To address this question, we also conducted a worst-case scenario analysis (Pryke, Lindsay, Dysart, & Dupuis, 2004). The worst-case scenario analysis designates the innocent suspect post hoc to be the person identified most often in the target-absent lineup. This allows us to analyze the target-absent lineups under two different assumptions: that the lineup is fair, with an innocent suspect who is not unusually similar to the perpetrator, and that it represents the worst-case scenario, in which the innocent suspect is the person in the target-absent lineup who is most likely to be identified.1

Participant–Witnesses

Two hundred and eighty-eight introductory psychology students participated as witnesses, in partial fulfillment of a course requirement. There were 153 women (.53) and 118 men (.41); 17 participants (.06) did not provide an answer to the question about gender. They self-identified as Asian (.49), including Filipino, Asian Indian, and Pacific Islander, Caucasian (White) European (.14), African American (.08), and Middle Eastern (.03), including Armenian, Arabic, and Persian. The remaining participants (.26) self-identified with multiple ethnicities, or as “other.” The 288 participants were evenly distributed among the six conditions of the experiment, with 48 participants per group.

Materials and Procedure

The experiment was conducted with one participant at a time, rather than in groups. Each participant watched a videotape that depicted a conversation between two women, one of whom was later carjacked by a White male in his mid-20s. Because the participants had become witnesses to the staged crime, we will hereafter refer to them with the more descriptive term witness, rather than the more generic term participant. Immediately after the presentation of the video, each witness completed a brief questionnaire (about their age, sex, and year in college) and completed the Big Five Personality Inventory (Brand & Egan, 1989).
They then described the carjacking and the perpetrator, and indicated whether they believed they would be able to identify the perpetrator if they saw him again. Following this, each witness was taken to another room and shown a six-person photographic lineup. Each lineup contained a picture of the perpetrator (target present) or an innocent suspect (target absent).

The designation of the innocent suspect was based on results from two previous experiments. In an experiment by Tunnicliff and Clark (2000), this person had been selected by at least one San Bernardino County Sheriff’s detective as being consistent with a verbal description of the perpetrator. This same person was also designated as the innocent suspect in an unpublished study by Cocilova (2000) and was identified by 14.7% of witnesses who were shown a target-absent lineup. These results are very consistent with averages collapsed over 94 target-absent lineups reported by S. E. Clark, Howell, and Davey (2008). Because he was judged by law enforcement to be consistent with a verbal description of the perpetrator and produced “typical” results in Cocilova’s study, he seemed a good candidate as a “typical” or plausible innocent suspect.

Sixteen graduate students were paid $10 each to select foils from a large online pool of mugshots (http://www.dc.state.fl.us/), generated from a description of the perpetrator and judged to be similar in appearance to either the guilty or innocent suspect, creating eight different target-present and eight different target-absent lineups. Each graduate student participant created only one lineup. Six variants of each lineup were created with the position of the suspect balanced across all six lineup positions.

In all conditions of the experiment, prior to the presentation of the lineup, witnesses heard a recording of the witness admonition from the California Peace Officers Legal Sourcebook (Calandra & Carey, 2005), read verbatim by a male voice.
The influence conditions are defined by what the experimenter did after the recorded admonition was played.

In the no-influence condition, the lineup administrator remained silent while the witness looked at the photo lineup and the administrator simply recorded the witness’s response. In the subtle-influence condition, the lineup administrator remained silent for 12 s, and then began making statements such as, “take your time,” “look at each photograph carefully,” and “there’s no rush.” In the similarity-influence condition, the lineup administrator remained silent for 12 s, and then prompted the witness with the question, “Is there anyone in the lineup who looks more similar to the person you saw than anyone else in the lineup?” Any affirmative response to this question (“yes,” “maybe,” “not sure,” etc.) was followed with a question asking if that similar person was actually the perpetrator.

The 12-s interval was selected as the starting point based on results from previous studies (discussed earlier), and so that the influence of the lineup administrator would not begin until quick identifiers and quick nonidentifiers had already responded. In both the subtle-influence and similarity-influence conditions, if the witness still had not made a decision after the administrator had made the various influence statements, the administrator said to the witness, “If you’re unable to make an identification, that’s ok, just let me know.” The word “unable” was specifically chosen as one last nudge, by casting a nonidentification in the language of inability or failure.

The lineup administrators were six junior- or senior-year female undergraduate research assistants, each of whom conducted a full replication of the experiment. The lineup administrators were given extensive training to memorize their lines and engage naturally with the participants.
They were instructed to stick closely to the script for the relevant influence condition, and to allow participants to give nonidentification responses. Thus, if a witness asked “do I have to identify someone?” the answer was that they did not. If a witness made a nonidentification response, lineup administrators accepted it and concluded the experiment.

1 The worst-case scenario analysis is based on the assumption that the foils in the target-absent lineup were selected in such a way that any one of them could be designated as the innocent suspect. For example, in the present case, foils were selected from a pool of photographs obtained using search terms corresponding to a general description of the perpetrator. The worst-case scenario analysis provides a high estimate of the likelihood of false identification of an innocent suspect. The worst-case scenario analysis may also simulate the conditions of a biased lineup if the worst-case scenario innocent suspect is identified at a rate greater than chance, as was the case in the present experiment.

Several precautions were taken so that the experimenters would not develop expectations or hypotheses about the experimental conditions or the accuracy of witness responses, and so that they would not treat witnesses differently at the beginning of the experiment, prior to the point at which the lineup was presented. Thus, the lineup administrators were blind as to the presence or absence of the perpetrator and the position of the suspect in the lineup. Lineup administrators were seated approximately two to three feet from the witness during the viewing of the lineup, at approximately a 60° angle so that they could not see the lineup on the computer screen as the witness viewed it.
In addition, they did not know which condition of the experiment they were conducting until just before the presentation of the lineup, to minimize the possibility that they might interact differently with witnesses prior to the lineup based on the condition they knew they were conducting. The experimenters did not watch the crime video, did not know what the perpetrator looked like, and did not look at the lineups. They did hear the witnesses’ responses; however, because the position of the suspect was counterbalanced, it would have been difficult for the administrators to develop any hypotheses about the position of the suspect.

In a postparticipation questionnaire we asked the witnesses to comment on their experience in the experiment and to rate the experimenter on various dimensions. The questions asked for an overall assessment of the experiment (was it enjoyable, did they feel like they learned something valuable), and an overall evaluation of the experimenter. The critical question asked, “Do you think the experimenter said or did anything in order to try to influence your decision for the photo lineup?” Other questions asked witnesses about the experimenter’s punctuality and professionalism, whether the experimenter was able to answer questions, and whether they thought the experimenter knew who the suspect was in the lineup. Each question was given a rating on a scale of 1 to 9. The completed questionnaires were placed in sealed envelopes and dropped in a locked suggestion box with a slot on the top. This procedure was designed to create the impression that the responses would not be read by the experimenter, so that witnesses would feel free to comment candidly on the behavior of the experimenter.

Results

Two sets of analyses are presented, the first examining the eyewitness identification responses, and the second examining the responses to the questionnaire, in particular, responses to the critical question about administrator influence.
Identification Results

Four separate analyses are presented: First, we present the response proportions collapsed over all lineups and lineup administrators. Second, we conducted separate analyses based on whether witnesses responded before or after the 12-s point at which lineup administrators began their subtle-influence or similarity-influence manipulations. Third, we examine the consistency of the effects for each of the lineup administrators, and fourth we conduct the same analyses to examine the consistency of the effects across the eight target-present and eight target-absent lineups.2

The critical comparisons are between the no-influence and subtle-influence conditions and between the no-influence and similarity-influence conditions. The response proportions and frequencies are shown in Table 1 for target-present lineups and target-absent lineups under two different assumptions, first with the designated innocent suspect, and then under the worst-case scenario in which the innocent suspect is designated post hoc to be the person identified most often from the target-absent lineups. Because there were eight different target-absent lineups, the worst-case scenario was analyzed by identifying the person identified most often for each of the eight different lineups. Analyzing the target-absent lineup data this way allows two different views of the same lineup data.

Subtle influence versus no influence. The subtle influences produced different effects for target-present and target-absent lineups. The results of target-present lineups were unaffected by the lineup administrator’s comments. The overall identification rate decreased slightly (from .77 to .71), but the decrease was not significant, χ²(1, N = 96) = .49, p = .48, r = .07; and the correct identification rate also did not change significantly (from .35 to .33), χ²(1, N = 96) = .05, p = .83, r = .02.
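The chi-square values and r effect sizes reported in this section can be checked from the Table 1 frequencies with a short calculation. The sketch below assumes an uncorrected Pearson chi-square on a 2 × 2 table and r = sqrt(χ²/N). The source does not state the formula, but this choice appears to match the reported target-present values to two decimals. The function names are illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


def effect_size_r(chi2: float, n: int) -> float:
    """Effect size r for a 1-df chi-square: r = sqrt(chi2 / N)."""
    return math.sqrt(chi2 / n)


# Target-present lineups, Table 1 frequencies (48 witnesses per condition).
# Overall identifications = suspect + foil identifications.
overall_chi2 = chi_square_2x2(17 + 20, 11, 16 + 18, 14)  # no influence vs. subtle
overall_r = effect_size_r(overall_chi2, 96)

# Correct (suspect) identifications versus all other responses.
correct_chi2 = chi_square_2x2(17, 31, 16, 32)
correct_r = effect_size_r(correct_chi2, 96)

print(f"overall: chi2 = {overall_chi2:.2f}, r = {overall_r:.2f}")
print(f"correct: chi2 = {correct_chi2:.2f}, r = {correct_r:.2f}")
```

The same two functions can be applied to any other pair of conditions by substituting the corresponding frequencies from Table 1.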
For target-absent lineups, the overall identification rate increased from .38 in the no-influence condition to .56 in the subtle-influence condition, χ²(1, N = 96) = 3.39, p = .07, r = .19. False identification rates were compared for the designated innocent suspect as well as for the worst-case scenario innocent suspects, and these analyses are presented in turn. For the designated innocent suspect, the false identification rate increased from .02 to .13, χ²(1, N = 96) = 3.85, p = .05, r = .20. Because of small expected cell frequencies, we also calculated a Fisher’s exact test, adjusted for continuity (Overall, 1980), which produced a one-tailed p = .03.

These subtle influences produced a decrease in the probative value of a suspect identification. The relevant data are shown in Table 2. Without administrator influence, 94% (17 of 18) of suspect identifications were correct identifications of the perpetrator. However, in the subtle-influence condition only 73% (16 of 22) of suspect identifications were correct identifications, χ²(1, N = 40) = 3.23, p = .07, r = .28. Again, because of the small expected cell frequencies, an adjusted Fisher’s exact test was performed, which yielded a one-tailed p = .04.

Next we consider the worst-case scenario analysis. This analysis was conducted by determining the most frequently identified person from each of the eight target-absent lineups, and designating that person post hoc to be the innocent suspect. The results for the worst-case scenario analysis, shown in the far-right column of Table 1, were quite different from those for the designated innocent suspect. Of course, as expected, false identification rates were higher and foil identification rates lower under the worst-case scenario assumption than for the designated innocent suspect.3 However, the subtle nudges appeared to have had no effect on the false identification rate, which was exactly the same in the subtle-influence condition as in the no-influence condition, χ²(1, N = 96) = 0.0, but instead increased the foil identification rate, from .13 to .31, χ²(1, N = 96) = 4.94, p = .03, r = .23. With almost no change in suspect identifications for target-present or target-absent lineups, the probative value of a suspect identification, assuming the worst-case scenario for the target-absent lineup, was virtually unchanged in the subtle-influence condition, relative to the no-influence condition, χ²(1, N = 57) = .01, p = .91, r = .02.

2 We also analyzed the results based on the gender of the witness. In general there were no differences between male and female witnesses, with two exceptions: First, women made more identifications overall than men (.86 vs. .50), χ²(1, N = 47) = 7.28, p = .007, r = .39. Because women made more identifications than men, their overall identification rates had less room to increase as a result of the influence conditions relative to men. This led to a gender difference in the similarity-influence condition. For men the correct identification rate increased significantly (from .36 to .67), χ²(1, N = 43) = 3.76, p = .05, r = .30; whereas for women the correct identification rate increased only slightly (from .41 to .44), χ²(1, N = 44) = .05, p = .83, r = .03. The overall identification rate for women in the no-influence condition was .82. Thus, even if all of the nonidentifiers (estimated at 1 − .82 = .18) made a correct identification, this would at most add only a few correct identifications, and thus the correct identification rate could increase only very little. These analyses are exploratory. We had no a priori predictions and have no ad hoc explanation for why women made more identifications than men. Others have also found differences for male and female witnesses (Foster, Libukman, Schooler, & Loftus, 1994; Shapiro & Penrod, 1986). To date we know of no explanation for these gender differences.

Similarity influence versus no influence.
The comparison between similarity-influence and no-influence conditions produced a different pattern of results than did the comparison between subtle-influence and no-influence conditions. The overall identification rate increased for both target-present, χ²(1, N = 96) = 3.87, p = .05, r = .20, and target-absent, χ²(1, N = 96) = 8.18, p = .004, r = .29, lineups, and the correct identification rate in target-present lineups increased from .35 to .54, χ²(1, N = 96) = 3.41, p = .06, r = .19.

Again, false identification rates were analyzed for both the designated innocent suspect and also for the worst-case scenario innocent suspect. The false identification rate for the designated innocent suspect dropped from near zero (.02) in the no-influence condition to exactly zero in the similarity-influence condition. This, combined with the significant increase in the correct identification rate, suggests an overall increase in the probative value of a suspect identification. However, this is difficult to evaluate statistically, with only one false identification across the two conditions, χ²(1, N = 44) = 1.48, p = .22, r = .18. The statistical conclusions based on the adjusted Fisher’s exact test changed very little (p = .17, one-tailed).

The worst-case scenario analysis produced a different pattern of results, with false identification rates showing a statistically nonsignificant increase, from .25 in the no-influence condition to .42 in the similarity-influence condition, χ²(1, N = 96) = 3.00, p = .08, r = .18. With the parallel increase in correct identifications in the target-present lineup and false identifications in the worst-case scenario target-absent lineup, the probative value for suspect identifications remained virtually unchanged, χ²(1, N = 75) = .03, p = .86, r = .02.

Responses before influence. An additional analysis was conducted based on whether the witness made a response before 12 s, prior to the onset of the influence manipulations.
Across all conditions, 13.2% of witnesses made a response before 12 s. Presumably, responses made before 12 s, and prior to any influence manipulations in the influence conditions, should be similarly distributed in the three influence conditions. However, the proportion of responses made prior to 12 s increased slightly, although not significantly, for the similarity-influence (16 of 96 = .17) and subtle-influence (14 of 96 = .15) conditions, relative to the no-influence condition (8 of 96 = .08), χ²(2, N = 288) = 3.15, p = .21. Also, the proportion of these responses that were identifications rather than nonidentifications was slightly (but not significantly) higher for the similarity-influence (13 of 16 = .81) and subtle-influence (9 of 14 = .64) conditions than for the no-influence condition (4 of 8 = .50), χ²(2, N = 38) = 2.59, p = .27. Because the overall proportion of responses made before 12 s was small (only 38 of 288 responses), and because results restricted to only post-12-s responses were virtually identical to the entire set of results, a separate, detailed analysis restricted only to responses made after 12 s is not presented.

3 The differences in results for the designated and worst-case scenario innocent suspects do not appear to reduce to similarity. We obtained similarity ratings from 172 additional participants from the same introductory psychology participant pool, comparing all lineup members from all 16 lineups, either to the mugshot of the perpetrator (N = 94) or to four still photographs captured from the crime video (N = 78). Comparing lineup members to the mugshot of the perpetrator, the designated innocent suspect had the highest average similarity rating (4.02). When lineup members were rated for similarity to the video stills, the designated innocent suspect was again rated highly (third most similar, 3.73), one of the worst-case scenario innocent suspects received the highest average rating (5.22), but other worst-case scenario innocent suspects received ratings between 2.0 and 3.3. These results raise questions about the relationship between similarity and identification that are beyond the scope of the present paper. However, it is clear that the low identification rate for the designated innocent suspect was not due to low similarity.

Table 1
Identification Response Proportions (and Frequencies)

                        Target-present   Target-absentDES   Target-absentWCS
Condition / response    P (n)            P (n)              P (n)
No influence
  Suspect               .35 (17)         .02 (1)            .25 (12)
  Foil                  .42 (20)         .35 (17)           .13 (6)
  No ID                 .23 (11)         .63 (30)           .63 (30)
Subtle influence
  Suspect               .33 (16)         .13 (6)            .25 (12)
  Foil                  .38 (18)         .44 (21)           .31 (15)
  No ID                 .29 (14)         .44 (21)           .44 (21)
Similarity influence
  Suspect               .54 (26)         .00 (0)            .42 (20)
  Foil                  .38 (18)         .67 (32)           .25 (12)
  No ID                 .08 (4)          .33 (16)           .33 (16)

Note. Target-absentDES = the target-absent lineup with the designated innocent suspect; Target-absentWCS = the target-absent lineup assuming the worst-case scenario of a highly biased lineup in which the innocent suspect is the person who is identified most often; P = proportion; n = frequency.

Table 2
Probative Value of a Suspect Identification for No-Influence, Subtle-Influence, and Similarity-Influence Conditions, for Designated and Worst-Case Scenario Innocent Suspects

Condition               Target-absentDES   Target-absentWCS
No influence            .94                .59
Subtle influence        .73                .57
Similarity influence    1.0                .57

Note. The probative value of a suspect identification was calculated by dividing the number of correct identifications (target-present lineup) by the sum of the correct and false identifications (suspect identifications for target-present and target-absent conditions), taken from Table 1. To illustrate, .94 = 17/(17 + 1), .59 = 17/(17 + 12), and so on. Target-absentDES = the target-absent lineup with the designated innocent suspect; Target-absentWCS = the target-absent lineup assuming the worst-case scenario in which the innocent suspect is the person who is identified most often.
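The probative values in Table 2 follow directly from the suspect identification frequencies in Table 1, as the table note illustrates. The short sketch below recomputes all six entries. One assumption should be flagged: the similarity-influence target-present suspect count is taken to be 26, the value implied by the proportion .54 of 48 witnesses and by the reported sample sizes (N = 44 and N = 75) for the probative value tests. The variable and function names are illustrative.

```python
def probative_value(correct_ids: int, false_ids: int) -> float:
    """Proportion of suspect identifications that are correct: TP / (TP + TA)."""
    total = correct_ids + false_ids
    if total == 0:
        raise ValueError("no suspect identifications in either lineup type")
    return correct_ids / total


# Suspect identification frequencies from Table 1:
# (target-present, target-absent designated, target-absent worst case).
# The 26 is an assumption inferred from .54 x 48, as noted above.
suspect_ids = {
    "No influence": (17, 1, 12),
    "Subtle influence": (16, 6, 12),
    "Similarity influence": (26, 0, 20),
}

table2 = {
    condition: (
        round(probative_value(tp, ta_des), 2),
        round(probative_value(tp, ta_wcs), 2),
    )
    for condition, (tp, ta_des, ta_wcs) in suspect_ids.items()
}

for condition, (des, wcs) in table2.items():
    print(f"{condition:<22} DES = {des:.2f}   WCS = {wcs:.2f}")
```

Because the measure is a ratio of counts, a single additional false identification of the designated innocent suspect changes it noticeably when false identifications are rare, which is one reason the designated-suspect column moves more than the worst-case column.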
Consistency across lineup administrators. To assess the consistency across lineup administrators, we reanalyzed the results separately for each lineup administrator. The purpose of this analysis was to determine whether nonsignificant effects were produced by summing over large positive and large negative effects and whether significant effects were produced by large effects for some lineup administrators, but not others.

For each lineup administrator we calculated the relevant chi-square and the effect size, r. For example, the change in the target-present identification rate for Lineup Administrator 1 showed a decrease from six identifications in the no-influence condition to five identifications in the subtle-influence condition, χ²(1, N = 16) = .29, r = .13. These calculations were carried out for each of the six lineup administrators for each comparison, and the r effect sizes are shown in Table 3. The heterogeneity of the effect can be assessed by calculating zr from r, and summing the weighted squared deviations between each zr and the mean zr (see Rosenthal, 1991, Eq. 4.15, p. 74), which produces the chi-square values (with corresponding p values) shown in Table 3. It is clear from Table 3 that the effects and noneffects were stable across the six lineup administrators, as all tests of heterogeneity fell short of statistical significance. The standard deviations of the r values provide a measure of the heterogeneity effect size, and these values are also given in Table 3.4

The robustness of each effect was also assessed by computing a one-sample t test on the six z-transformed effect sizes for each lineup administrator (e.g., the increase in the target-absent ID rate was evaluated using the six lineup administrators as the units of analysis with scores .0, .25, .13, .38, .0, and .38).
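The two consistency checks described above can be sketched as follows. The sketch assumes the Fisher transformation zr = atanh(r), heterogeneity weights of N − 3 per effect size, and an effect size from t of r = sqrt(t²/(t² + df)). These are standard choices but are assumptions here, since the source only cites Rosenthal (1991) without spelling them out. The per-administrator sample size of 16 is taken from the Lineup Administrator 1 example, and the function names are illustrative.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def fisher_z(r: float) -> float:
    """Fisher z transformation of a correlation-type effect size."""
    return math.atanh(r)


def one_sample_t(values: Sequence[float]) -> Tuple[float, int]:
    """One-sample t test against zero; returns (t, degrees of freedom)."""
    n = len(values)
    t = mean(values) / (stdev(values) / math.sqrt(n))
    return t, n - 1


def r_from_t(t: float, df: int) -> float:
    """Effect size r corresponding to a t statistic."""
    return math.sqrt(t * t / (t * t + df))


def heterogeneity_chi_square(rs: Sequence[float], ns: Sequence[int]) -> Tuple[float, int]:
    """Sum of weighted squared deviations of each z_r from the weighted mean z_r."""
    zs = [fisher_z(r) for r in rs]
    weights = [n - 3 for n in ns]  # assumed weights: N_j - 3
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))
    return chi2, len(rs) - 1


# The six per-administrator scores quoted in the text for the target-absent ID rate.
scores = [0.0, 0.25, 0.13, 0.38, 0.0, 0.38]
t_value, df = one_sample_t([fisher_z(r) for r in scores])
r_value = r_from_t(t_value, df)
het_chi2, het_df = heterogeneity_chi_square(scores, [16] * 6)

print(f"t({df}) = {t_value:.2f}, r = {r_value:.2f}")
print(f"heterogeneity chi2({het_df}) = {het_chi2:.2f}")
```

Under these assumptions, the t statistic for the six quoted scores comes out at roughly 2.6 with r near .76, which is in line with the values reported for the target-absent overall identification rate. Small discrepancies would be expected because the quoted scores are rounded.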
The results showed that subtle influence from the lineup administrator did not increase the target-present lineup identification rate, t(5) = .64, p = .55, r = .27; had no effect on the target-present correct identification rate, t(5) = .24, p = .82, r = .11; but for target-absent lineups did increase the overall identification rate, t(5) = 2.63, p = .05, r = .76; and did increase the false identification rate of the designated innocent suspect, t(5) = 3.00, p = .03, r = .80. Subtle influences, however, did not have an effect on the false identification rates for the worst-case scenario innocent suspects, t(5) = .22, p = .84, r = .10.

For the comparison of the no-influence and similarity-influence conditions, these analyses did not show a statistically reliable effect across lineup administrators for the increase in the identification rate for target-present lineups, t(5) = 1.59, p = .17, r = .58; but did show a significant increase for the correct identification rate, t(5) = 2.96, p = .03, r = .80. These analyses also showed the increase in the target-absent lineup overall identification rate to be reliable across lineup administrators, t(5) = 4.98, p = .004, r = .91, but did not show a reliable increase in the false identification rate for the worst-case scenario innocent suspect, t(5) = 1.86, p = .12, r = .64.

The heterogeneity tests and t tests on effect sizes show that the lineup administrator effects are robust and consistent across lineup administrators. These analyses suggest the effects are due to the experimental manipulations, rather than the specific lineup administrators who implemented those manipulations in this study.

Consistency across lineups. A parallel set of analyses was conducted to determine the degree to which the effects were consistent across the eight target-present and eight target-absent lineups. Again, we calculated a chi-square for each lineup, computed r from the chi-square and zr from the r.
Each z_r is based on 12 observations (except in one set of comparisons for which a counterbalancing error produced a few cases with 11 and 13 observations). Again, the consistency of the overall effect is given by summing the weighted squared deviations between each z_r and the mean of the weighted z_r (again, following Rosenthal, 1991, Eq. 4.15, p. 74). The z_r values, the chi-square for each heterogeneity analysis, and the effect sizes s are shown in Table 4. Seven of the nine analyses showed statistically nonsignificant variation across lineups. The two significant cases were for the overall identification rate and the worst-case scenario suspect identification rate in the target-absent lineups, comparing no-influence and similarity-influence conditions, χ²(7) = 15.02, p = .04, and χ²(7) = 18.61, p = .01, respectively. These two significant results are likely tied together, as suspect identification rates are tied to overall identification rates.

Again we calculated t tests to determine the degree to which z_r statistics deviated significantly from zero. In contrast to the lineup administrator analyses, which showed four significant t tests (out of seven possible), the t tests on z_r calculated from the different lineups failed to reach significance, with one close case, for the total identification rate increase in target-absent lineups, comparing the no-influence and similarity-influence conditions, t(7) = 2.25, p = .06, r = .65. These analyses suggest some variability in the effects across the lineups used in this study, and raise the possibility that the effects may not generalize broadly across a wide range of possible lineups.

Reporting of Influence

A questionnaire, designed to have the feel of a customer satisfaction survey, was administered at the very end of the experimental session. The critical question, and the only question discussed here, concerned the influence of the lineup administrator.
Witnesses responded to this question on a 1 to 9 scale, the results of which are shown in Figure 2 for each of the influence conditions. It is clear from the figure that on the whole the ratings were very low (M = 1.66 on a 1 to 9 scale, SD = 1.78), and that most witnesses gave the lowest rating (see Footnote 5). Indeed, 90%, 82%, and 70% of witnesses gave the lowest rating for experimenter influence in the no-influence, subtle-influence, and similarity-influence conditions, respectively. Comparing each influence condition to the no-influence condition showed the difference to be statistically significant for the similarity-influence condition, χ²(1, N = 192) = 11.62, p < .001, r = .25; but not for the subtle-influence condition, χ²(1, N = 192) = 1.68, p = .20, r = .09. Because the responses were so sparsely distributed across the response range, we grouped them into three bins, for ratings of 1, ratings of 2 to 4, and ratings of 5 to 9. This analysis is shown in the inset to Figure 2. By accumulating the few responses in the higher range, it is clear that high responses were given most often by participants in the similarity-influence condition (.19, n = 18), less often in the subtle-influence condition (.06, n = 6), and almost never in the no-influence condition (.01, n = 1).

Footnote 4: The chi-square heterogeneity test, developed initially by Snedecor and Cochran (1967), does not have an associated effect size measure. However, for present purposes we offer the simple standard deviation of the r scores as an effect size estimate. We also calculated the standard deviation for z_r scores. However, this measure of effect size may be unduly influenced by high values of r, and so only the standard deviations for r are reported.
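As a worked check on the first comparison above, the sketch below computes an uncorrected 2 × 2 chi-square and the matching effect size r = sqrt(χ²/N). It is a minimal illustration, not the authors' code. The cell counts are an assumption taken from the group sizes listed in Table 5 (86 of 96 no-influence witnesses and 67 of 96 similarity-influence witnesses gave a rating of 1), which agree with the 90% and 70% figures quoted in the text. The function name is hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]].

    Returns (chi_square, r), where r = sqrt(chi_square / N) is the
    magnitude of the effect size for a 1-df chi-square.
    """
    n = a + b + c + d
    chi_sq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi_sq, math.sqrt(chi_sq / n)


# Rows: no-influence, similarity-influence.
# Columns: rating of 1, rating greater than 1 (counts assumed from Table 5).
chi_sq, r = chi_square_2x2(86, 10, 67, 29)
print(f"chi2(1, N = 192) = {chi_sq:.2f}, r = {r:.2f}")
```

With these counts the sketch gives χ² of about 11.62 and r of about .25, in line with the similarity-influence comparison reported above. It is offered only for that one comparison.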
Chi-square analyses for this partitioning of the results showed the same pattern described above: no significant difference comparing the subtle-influence condition to the no-influence condition, χ²(2, N = 192) = 4.07, p = .13, r = .14; and a significant difference comparing the similarity-influence condition to the no-influence condition, χ²(2, N = 192) = 17.77, p < .001, r = .29 (see Footnote 6).

One critical subset of the data concerns eyewitness identifications of the suspect, as these are the data most likely to be brought to trial in the prosecution’s case. Our focus is on one critical question: Can a witness’s self-report about the lineup administrator be used to distinguish between correct identifications of the perpetrator versus false identifications of an innocent suspect? In other words, can one look to the witness’s statements (testimony) about the lineup administrator and determine whether that witness has made a correct or false identification?

To address this question we analyzed responses to the influence question specifically for suspect identifications. The results are shown in Table 5, with responses to the influence question partitioned simply as responses of 1 versus responses greater than 1. Results are shown for correct identification rates and false identification rates for the designated and worst-case scenario innocent suspects. Calculations of probative value are also shown for the designated and worst-case scenario innocent suspects, and these results are shown separately for the no-influence, subtle-influence, and similarity-influence conditions. (The no-influence condition is included under the assumption that the assessment of administrator influence by the trier of fact is based on the witness’s statement rather than any independent evidence of administrator influence.)
Table 3
Variation in Effect Size r Across Lineup Administrators

                            Lineup administrator
Measure                   1    2    3    4    5    6      χ²    p    s      t    p
No influence versus subtle influence
  TP ID rate             .13  .11  .16  .38  .13  .26    3.69  .60  .23    .64  .55
  TP correct ID rate     .16  .40  .12  .0   .0   .25    3.75  .59  .23    .24  .82
  TA ID rate             .0   .25  .13  .38  .0   .38    2.17  .83  .17   2.63  .05
  TADES false ID rate    .26  .0   .38  .0   .26  .26    1.71  .89  .16   3.00  .03
  TAWCS false ID rate    .29  .26  .14  .29  .38  .16    5.79  .33  .29    .22  .84
No influence versus similarity influence
  TP ID rate             .16  .38  .38  .00  .58  .29    7.47  .19  .31   1.59  .17
  TP correct ID rate     .41  .00  .26  .26  .29  .00    1.95  .86  .17   2.96  .03
  TA ID rate             .38  .25  .13  .52  .25  .26    1.53  .91  .13   4.98  .00
  TAWCS false ID rate    .25  .14  .16  .52  .14  .14    3.67  .60  .22   1.86  .12

Note. Lineup administrators were designated arbitrarily with numbers 1 through 6. Chi-squares evaluate heterogeneity and t statistics are for random effects test using lineup administrators as the unit of analysis. s is a measure of the effect size for the chi-square heterogeneity test. No data are given for TADES false ID rate, comparing no-influence and similarity-influence conditions, because there was only one false identification in the no-influence condition and none in the similarity-influence condition. TP = target-present; TA = target-absent; TADES = moderate-similarity designated innocent suspect; TAWCS = worst-case scenario innocent suspect.

Footnote 5: Witnesses were not predisposed toward the low end of the scale, however. The low scores for influence ratings contrast with the overall rating of the experiment (how enjoyable it was, M = 6.39, SD = 1.66) and the overall rating of the experimenter (M = 8.12, SD = 1.17).

Footnote 6: To calculate r for chi-square with 2 degrees of freedom, we multiplied chi-square times the squared alerting correlation (Rosnow, Rosenthal, & Rubin, 2000), divided that value by N, and took the square root, that is, $r = \left[ \chi^2(2)\, r_{\text{alerting}}^2 / N \right]^{1/2}$.

The probative value calculations for the designated innocent suspect show higher probative value of a suspect identification for witnesses whose responses to the influence question were greater than 1, but because the false identification rates for the designated innocent suspect were quite low, and the differences quite small, this pattern did not approach statistical significance (Fisher’s exact one-tailed p = .55).

The results for the worst-case scenario innocent suspect, however, did show a consistent pattern across influence conditions: The probative value of a suspect identification was higher for witnesses who gave responses of 1 to the influence question than for witnesses who gave responses greater than 1 to the influence question. A chi-square on the frequencies of correct and false identifications, for influence responses of 1 versus responses greater than 1, fell short of statistical significance, χ²(1, N = 103) = 3.02, p = .08, r = .17. We pursued this question further by calculating a chi-square and z_r score for each condition (just as we had done for the consistency analyses) and conducted a one-sample t test on those z_r scores. This one-sample t was statistically significant, t(2) = 4.70, p = .04, r = .96. In addition, we computed a one-sample t on Cohen’s h calculated for the probative value scores, t(2) = 4.24, p = .05, r = .95. These analyses provide some evidence that witnesses who gave responses of 1 to the influence question were more accurate than witnesses who gave responses greater than 1 to the influence question. Translating this to a criminal investigation, an identification of the suspect by a witness who reported no administrator influence is more likely to be a correct identification compared to an identification of the suspect by a witness who reported some degree of administrator influence.

What might account for these results?
First, it is important to note that although participant–witnesses were randomly assigned to conditions, they self-selected to some extent the degree to which they received the experimental manipulations. Specifically, the more a given eyewitness was self-directed the less he or she would be administrator directed. The contribution of this self-selection factor is shown in a reanalysis of the worst-case scenario results, excluding participants who responded prior to 12 s. These results are shown in Table 6. The results for the no-influence condition changed very little, which makes sense given that the experimenter’s behavior did not change after the 12-s point. However, the results for the two influence conditions did change when pre-12-s responses were excluded. The probative value of a suspect identification, excluding responses given before the onset of the experimental manipulations, was not different for witnesses who gave responses of 1 to the influence question versus witnesses who gave responses greater than 1 to the influence question. A chi-square test on the frequencies of correct and false identifications did not approach statistical significance, χ²(1, N = 84) = 1.02, p = .31, r = .11. A one-sample t test on the three z_r scores was also nonsignificant, t(2) = 2.53, p = .13, r = .87, as was the test on Cohen’s h, t(2) = .87, p = .48, r = .52.

[Figure 2 plot not reproduced. Horizontal axis: influence rating, 1 to 9; vertical axis ticks from 0.0 to 1.0; series: no influence, subtle influence, similarity influence; inset bins: 1, 2-4, 5-9.]
Figure 2. Distribution of witness ratings of the lineup administrator’s influence, for the no-influence, subtle-influence, and similarity-influence conditions. The inset shows the same data collapsed into bins for ratings of 1, ratings of 2 to 4, and ratings of 5 to 9.

Table 4
Variation in Effect Size r Across Lineups

                                      Lineup
Measure                   1    2    3    4    5    6    7    8       χ²    p    s      t    p
No influence versus subtle influence
  TP ID rate             .19  .19  .33  .29  .23  .26  .00  .20     3.96  .79  .24   1.05  .33
  TP correct ID rate     .00  .00  .19  .26  .28  .45  .33  .17     5.20  .64  .27    .07  .95
  TA ID rate             .35  .58  .04  .10  .51  .19  .67  .35    11.87  .11  .38   1.45  .19
  TADES false ID rate    .30  .45  .00  .00  .00  .30  .30  .45     4.96  .67  .27   1.61  .15
  TAWCS false ID rate    .19  .35  .29  .36  .19  .30  .35  .00     5.92  .55  .30    .23  .80
No influence versus similarity influence
  TP ID rate             .30  .45  .71  .00  .23  .30  .00  .00     7.98  .34  .30   1.76  .12
  TP correct ID rate     .51  .00  .51  .21  .28  .71  .33  .00    12.51  .09  .38   1.28  .24
  TA ID rate             .17  .58  .19  .33  .19  .19  .85  .67    15.02  .04  .37   2.25  .06
  TAWCS false ID rate    .00  .35  .00  .71  .33  .00  .67  .67    18.61  .01  .45   0.94  .38

Note. Lineups are designated arbitrarily with numbers 1 through 8. No data are given for TADES false ID rate, comparing no-influence and similarity-influence conditions, because there was only one false identification in the no-influence condition and none in the similarity-influence condition. Chi-square statistics evaluate heterogeneity and t statistics are for random effects test using lineups as the unit of analysis. s is a measure of effect size for the test of heterogeneity. TP = target-present; TA = target-absent; TADES = designated innocent suspect; TAWCS = worst-case scenario innocent suspect.

General Discussion

The empirical results of the present experiment may be summarized as follows: (a) Lineup administrators influenced witnesses’ identification decisions, despite the presentation of unbiased instructions indicating that the perpetrator may or may not be in the lineup. (b) Witnesses responded differently to subtle, nondirective, ostensibly cautionary statements by the lineup administrator than to a more directive request to consider the relative similarities of lineup members.
(c) The effects of administrator influence for target-absent lineups varied depending on whether one considered the designated innocent suspect or a worst-case scenario in which the innocent suspect was the person most likely to be identified from the target-absent lineup. (d) Witness responses to a question about administrator influence showed mixed results. On the one hand, witnesses gave higher ratings to the lineup administrator in the influence conditions than in the no-influence condition. However, the ratings were on the whole very low. Most witnesses appeared to be unaware of the influence, although a few did report that they believed that the lineup administrator may have tried to influence their decision. (e) The reporting of influence had some predictive utility for distinguishing between correct and false identifications. Specifically, there was some evidence that the probative value of a suspect identification was higher for participants who did not report administrator influence than for participants who did report administrator influence. We discuss these results in terms of the underlying mechanisms that produce administrator influence effects, witness reporting of lineup administrator influence, and the legal and policy implications.

Underlying Mechanisms

In the present experiment, how did the influence of the lineup administrators alter witnesses’ cognitive processes? We return to the two possibilities that we presented in the introduction: that witnesses lower their decision criterion or shift to relative judgments. Both of these models, instantiated as the Best Above Criterion model and the Best Above/Relative Difference model, predict increases in suspect identification rates in both target-present (correct identifications) and target-absent (false identifications) lineups that are proportional to the increases in the overall identification rates.
In these models, the increase in the target-absent suspect identification rate is proportionally larger than the increase in the target-present suspect identification rate, which produces a decrease in the probative value of a suspect identification.

Table 5
Correct (Target-Present) and False (Target-Absent) Identification Rates, and Probative Value Comparing Witnesses Who Reported Lineup Administrator Influence Versus Witnesses Who Did Not Report Lineup Administrator Influence

Condition and report       Target-present   Target-absent (DES)   Target-absent (WCS)   PV (DES)   PV (WCS)
No influence
  Reported = 1 (n = 86)    .36 (16/44)      .02 (1/42)            .24 (10/42)            .94        .60
  Reported > 1 (n = 10)    .25 (1/4)        .00 (0/6)             .33 (2/6)             1.0         .43
Subtle influence
  Reported = 1 (n = 79)    .33 (14/42)      .14 (5/37)            .22 (8/37)             .71        .61
  Reported > 1 (n = 17)    .33 (2/6)        .09 (1/11)            .36 (4/11)             .79        .48
Similarity influence
  Reported = 1 (n = 67)    .63 (22/35)      .00 (0/32)            .47 (15/32)           1.0         .57
  Reported > 1 (n = 29)    .31 (4/13)       .00 (0/16)            .31 (5/16)            1.0         .50

Note. Data are given as response proportions, with frequencies in parentheses. Target-absent (DES) = the target-absent lineup with the designated innocent suspect; Target-absent (WCS) = the target-absent lineup assuming the worst-case scenario in which the innocent suspect is the person who is identified most often; PV = probative value; Reported = 1 = witness response of 1 to the question about administrator influence; Reported > 1 = response greater than 1.

Table 6
Probative Value of Suspect Identifications, for All Witnesses and Excluding Witnesses Who Responded Before 12 s (for Worst-Case Scenario Innocent Suspect)

Condition and report    All responses   Excluding pre-12-s responses
No influence
  Reported = 1          .60             .63
  Reported > 1          .43             .43
Subtle influence
  Reported = 1          .61             .58
  Reported > 1          .48             .60
Similarity influence
  Reported = 1          .57             .54
  Reported > 1          .50             .54

Note. Reported = 1 = witness response of 1 to the question about administrator influence; Reported > 1 = response greater than 1.
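The probative values in Table 5 appear to follow from the correct and false identification rates as PV = correct rate / (correct rate + false rate). The sketch below applies that formula to the no-influence rows for the worst-case scenario suspect. Treat it as an illustration under that assumed definition, which is inferred from the tabled values rather than quoted from this section; the function name is hypothetical.

```python
def probative_value(correct, n_target_present, false, n_target_absent):
    """Probative value of a suspect identification.

    Assumed definition: PV = p_correct / (p_correct + p_false), where
    p_correct is the suspect ID rate in target-present lineups and
    p_false is the suspect ID rate in target-absent lineups.
    Undefined (division by zero) if no suspect was identified at all.
    """
    p_correct = correct / n_target_present
    p_false = false / n_target_absent
    return p_correct / (p_correct + p_false)


# No-influence condition, worst-case scenario suspect (frequencies from Table 5).
pv_reported_1 = probative_value(16, 44, 10, 42)   # witnesses who rated influence as 1
pv_reported_gt1 = probative_value(1, 4, 2, 6)     # witnesses who rated influence above 1
print(f"PV (reported = 1) = {pv_reported_1:.2f}")
print(f"PV (reported > 1) = {pv_reported_gt1:.2f}")
```

Under this definition the two values come out at about .60 and .43, the figures tabled for the no-influence condition. A value near 1.0 means that suspect identifications came almost entirely from target-present lineups.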
The results of the similarity-influence condition are fairly consistent with the two models, but the results of the subtle-influence condition are not.

We start with the similarity-influence condition. These results showed increases in the overall identification rates for both target-present and target-absent lineups, as well as increases in both correct and false identification rates, when the worst-case scenario innocent suspect was considered. The increases in suspect identification rates produced a very slight nonsignificant decrease in the probative value of a suspect identification assuming the worst-case scenario for the target-absent condition. This small decrease in probative value, although not statistically significant, is consistent with both models. The only aspect of the similarity-influence results that was inconsistent with the two models was the zero mistaken identification rate for the designated innocent suspect. That one inconsistency should be interpreted cautiously given that the identification rate for the designated innocent suspect was very low in the no-influence condition as well. It makes sense that the similarity-influence condition should produce results consistent with decision rules framed within a matching model such as the WITNESS model. After all, lineup administrators asked witnesses to evaluate similarity, which is exactly what the WITNESS model does.

This leaves us in need of an explanation for the results of the subtle-influence condition. First, we need to be clear about why criterion shift and relative judgment models cannot account for these results. The subtle-influence condition produced an increase in the overall identification rate, but only for target-absent lineups. In addition, the false identification rate of the innocent suspect increased but the correct identification rate of the perpetrator did not.
In the target-absent lineup, the increase in the overall identification rate was disproportionately distributed across lineup members. Specifically, the designated innocent suspect, who was rarely identified, at a rate below chance, in the no-influence condition, showed a six-fold increase in the identification rate in the subtle-influence condition, whereas the identification rate for the worst-case scenario innocent suspect was unchanged. These results are all inconsistent with the model predictions shown in Figure 1. Our prediction, that subtle influence would make witnesses more willing to make an identification (due to a change in decision processes), is clearly contradicted by the data.

To explain this pattern of results a model must specify not only whether an identification is made, but also which lineup member is identified. Varying the decision processes, by shifting the criterion or by shifting to relative judgments, can only change the identification rate, but cannot determine the lineup member that is identified. In other words, such models can determine whether the best match will be identified, but cannot alter who the best match is. The reason, of course, is that the decision rule is applied only after the best match is determined. Thus, to account for the results of the subtle-influence condition, a model must alter the way in which lineup members are matched to memory. We propose a model similar in spirit to Tversky’s Feature Contrast Model (Tversky, 1977). In the Feature Contrast Model, similarity is calculated as a weighted sum of matching and nonmatching features. The match to memory increases when the weight on matching features is increased or when the weight on nonmatching features is decreased. To illustrate, Lineup Member 1 may be the best match to memory if a particular mismatching feature is ignored, but Lineup Member 4 may be the best match if some other mismatching feature is ignored.
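The illustration above can be made concrete with a toy sketch. It is not the authors' model and is only loosely patterned on the feature-contrast idea: the match score is the summed weight of matching features minus the summed weight of mismatching features. Every feature name, feature value, weight, and function name below is hypothetical and chosen purely so that changing which mismatch is ignored changes the best match.

```python
def match_to_memory(memory, member, mismatch_weights, match_weight=1.0):
    """Weighted sum of matching features minus weighted mismatching features.

    mismatch_weights maps a feature name to the penalty for a mismatch on
    that feature; features not listed get a default penalty of 1.0.
    """
    score = 0.0
    for feature, remembered in memory.items():
        if member.get(feature) == remembered:
            score += match_weight
        else:
            score -= mismatch_weights.get(feature, 1.0)
    return score


def best_match(memory, lineup, mismatch_weights):
    """Name of the lineup member with the highest match-to-memory score."""
    return max(
        lineup,
        key=lambda name: match_to_memory(memory, lineup[name], mismatch_weights),
    )


# Hypothetical memory trace and lineup (illustrative values only).
memory = {"hair": "dark", "face": "round", "build": "slim", "age": "young"}
lineup = {
    "member_1": {"hair": "dark", "face": "round", "build": "slim", "age": "old"},
    "member_2": {"hair": "light", "face": "long", "build": "slim", "age": "young"},
    "member_4": {"hair": "dark", "face": "round", "build": "heavy", "age": "young"},
}

# Ignoring the age mismatch favors member_1; ignoring build favors member_4.
print(best_match(memory, lineup, {"age": 0.0}))
print(best_match(memory, lineup, {"build": 0.0}))
```

The design point is that the identity of the best match is fixed before any decision rule is applied, so only a change in feature weights, and not a change in criterion, can move the choice from one lineup member to another.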
Our outline of this model is admittedly sketchy; however, we argue that such a model, in some form, is necessary to account for the variation in the patterns of results that cannot be accounted for by variation in the decision processes.

How might this model account for the results of the subtle-influence condition? We propose that comments such as “take your time” indicate to the witness that he or she should reassess or reconsider the response that he or she may be leaning toward (keeping in mind that the witness has not made any overt response). This reassessment, we suggest, is a reassessment of the importance (weights) of matching and mismatching features. To account for the results, one needs some idea of what response the witness was likely to be leaning toward at the time the lineup administrator’s comment was made. This may be inferred from the results of the no-influence condition. For target-absent lineups the most probable single response by far was the rejection of the lineup (.63). Thus, it is reasonable to assume that at the 12-s point most witnesses who had not already responded were leaning toward nonidentification. Assuming that “take your time” is interpreted as “reconsider,” the likelihood of a nonidentification response should decrease, which is precisely what the data show. For target-present lineups, however, the response rates for the perpetrator (.35), foil (.42), and nonidentification responses (.23) were all fairly similar. Again, assuming that “take your time” was interpreted as “reconsider,” and given that the various responses were given with roughly equal frequency, it is easy to see that reconsideration would result in very little change in the response rates.

The weight-shifting model has the potential to explain not only the results of the present experiment, but also the variability in biased–unbiased instruction comparisons.
Specifically, there are exceptions to the general pattern that shows increases in both correct and false identification rates (see S. E. Clark, 2005; Steblay, 1997). Those exceptions, like the subtle-influence versus no-influence comparison of the present experiment, show an increase in false identification rates but no increase in correct identification rates (Cutler & Penrod, 1988; Cutler, Penrod, & Martens, 1987; Fleet, Brigham, & Bothwell, 1987; Malpass & Devine, 1981; O’Rourke, Penrod, Cutler, & Stuve, 1989). A similar pattern has also been shown with prelineup instructions that emphasize the perpetrator’s possible change of appearance (Charman & Wells, 2007). The modeling presented here (see Figure 1) suggests that such asymmetric results, showing increases in false identification rates but not correct identification rates, are difficult to explain by a simple criterion shifting model or a relative judgment model. All of these results may, with additional theoretical development, be explained in terms of (suboptimal) shifting of weights on matching and mismatching features.

Reporting of Administrator Influence

The present results were consistent with results from Phillips et al. (1999), in that witnesses were largely unaware of the administrator’s influencing behavior. The distributions of responses in the present study showed that even when witnesses were asked very directly to consider a most-similar lineup member, most did not report any administrator influence. It is important to note that witnesses were asked whether the administrator tried to influence them, not whether the administrator did influence them. Thus, it is unlikely that the low ratings would be due to witnesses not wanting to admit that they “fell for it,” as the focus was on the administrator, rather than on themselves.
The low ratings may have been due to witnesses not noticing or remembering their interaction with the lineup administrator or because they simply did not view statements such as “take your time” as being manipulative.

Of course, some witnesses did report administrator influence, and these reports showed some predictive validity regarding the accuracy of the eyewitness identification response. Specifically, the probative value of a suspect identification was higher for witnesses who gave ratings of 1 on the influence question than for witnesses who gave ratings greater than 1 on the influence question. These results arise from the interactive nature of the identification task. For the conditions in this study, it seems reasonable that witnesses who had clearer memories and were thus more self-directed were also more likely to be correct and less likely to receive “help” from the lineup administrator. It is interesting to note that even in the no-influence condition, a small handful of witnesses reported some administrator influence. One tentative explanation for these results is that witnesses who are less self-directed may also be more likely to look to the experimenter for cues. The results, taken together, suggest that witnesses who are more self-directed are more likely to be correct, less likely to look to the administrator for cues, and less likely to receive guidance.

An important caveat is that this may apply specifically to the conditions of the present experiment. Lineup administrators were blind as to the position of the suspect and were not intent on obtaining identifications of the suspect. When the lineup administrator knows the position of the suspect and is intent on obtaining an identification of that suspect, even the most self-directed witnesses may be redirected by law enforcement (see, e.g., the case of Howard Haupt, described by Loftus & Ketcham, 1991, pp. 171–173).
Legal and Policy Implications

The present results speak directly and specifically to a recent ruling by the Connecticut Supreme Court (State of Connecticut v. Ledbetter, 2005), as well as more generally to issues of blind lineup administration and the documenting of identification procedures, each of which is discussed below.

State of Connecticut v. Ledbetter (2005). Laquan Ledbetter was convicted of two counts of robbery based in part on the testimony of an eyewitness who identified him. On appeal Ledbetter argued for reversal on the grounds that the police had not given the witness any instruction that the true perpetrator might not be present. Our results are relevant to that case in three ways.

First, the Connecticut Supreme Court, in its decision to affirm Ledbetter’s conviction, noted that the officer “had limited his remarks to, Take your time. Take a look. And to be sure, or words to that effect.” The present results showed that precisely these kinds of subtle, nondirective statements can lead a witness to make an identification, particularly when the perpetrator was not present, even when the “may or may not” instruction was properly given.

Second, although the Connecticut court did not grant the appeal, it did exercise its supervisory authority to require that in future cases where the “may or may not” warning is not given, the trial court give a specific jury instruction that such omission “tends to increase the probability of a misidentification.” Our results (which are consistent with those of Haw & Fisher, 2004) suggest that judicial instruction may be warranted even if the “may or may not” warning is given, if its intent is undermined by the subsequent behavior of the lineup administrator.

Third, although the Connecticut court’s jury instruction refers only to misidentification, the present results also showed increases in correct identification rates for the similarity-influence condition (but not for the subtle-influence condition).
This aspect of the results should not be overlooked, although it does not undermine the Connecticut court’s instruction. For the worst-case scenario analysis the increase in correct identifications came with a parallel increase in mistaken identifications such that there was no change in the probative value of those suspect identifications. In other words, the innocent suspect was placed at higher risk of mistaken identification, with no increase in the probative value of the evidence.

Blind lineup administration. One of the least controversial recommendations made by eyewitness identification researchers is that the lineup administrator should be blind as to the identity of the suspect in the lineup (Wells et al., 1998). The principle behind this recommendation is clear: One cannot leak (to the witness) what one does not know. We agree with this recommendation and the underlying principle. The present results, however, show that the lineup administrator can influence the outcome even when blind administration is used, even when unbiased instructions are given.

Documenting the identification procedure. These results also speak to the importance of documenting eyewitness identification procedures by audio or video tape recording (see American Bar Association, 2004). Witnesses in the present study, responding to questionnaires only minutes after their interaction with the lineup administrator, provided very little information about the influencing behavior of the experimenter. Whether this was due to witnesses’ being unaware of the influence or simply being unwilling to admit that they might have been influenced is not clear, and more research is required on that question. The present results, however, considered with those of Phillips et al. (1999), suggest that witnesses to crimes may not be reliable witnesses to the influence exerted on them when they make identification decisions.
Conclusions and Future Research Comments made by the lineup administrator that may seem cautionary or encouraging influence eyewitness identification de- cisions. The precise outcome depends on how the witness is influenced. The similarity-influence comments produced an in- crease in the overall identification rate, and were generally con- sistent with criterion shift and relative judgment models. The nondirective comments of the subtle-influence condition, however, did not result in parallel increases in the identification rate for target-present and target-absent lineups and are not easily ex- plained by such models. Rather, statements made by lineup ad- ministrators in the subtle-influence condition may have induced witnesses to reconsider whatever response they were leaning to- ward when the comments were made. We outlined a tentative model based on shifting weights on matching and mismatching 74 CLARK, MARSHALL, AND ROSENTHAL features to account for these results. This model has the potential to account for a wide range of results from experiments using different procedures that influence witness decisions. The devel- opment of this model is an important step in future research. References American Bar Association. (2004) Resolution adopted by the House of Delegates, August, 4, 2004. Washington, DC: Author. Brand, C. R., & Egan, V. (1989). The “Big Five” dimensions of person- ality? Evidence from ipsative, adjectival self-attributions. Personality & Individual Differences, 10, 1165–1171. Calandra, D., & Carey, J. E. (2005). 2005 field guide for the California Peace Officers Legal Sourcebook. Sacramento: California District At- torneys Association. Charman, S. D., & Wells, G. L. (2007). Eyewitness lineups: Is the appearance-change instruction a good idea? Law and Human Behavior, 31, 3–22. Clark, H. H. (1979). Responding to indirect speech acts. Cognitive Psy- chology, 11, 430–477. Clark, S. E. (2003). A memory and decision model for eyewitness identi- fication. 
Applied Cognitive Psychology, 17, 629–654.
Clark, S. E. (2005). A re-examination of the effects of biased lineup instructions in eyewitness identification. Law and Human Behavior, 29, 395–424.
Clark, S. E., & Godfrey, R. D. (2009). Eyewitness identification evidence and innocence risk. Psychonomic Bulletin and Review, 16, 22–42.
Clark, S. E., Howell, R. T., & Davey, S. L. (2008). Regularities in eyewitness identification. Law and Human Behavior, 32, 187–213.
Cocilova, S. (2000). The effects of non-similar foils on eyewitness confidence and accuracy. Unpublished manuscript, University of California, Riverside.
Cutler, B. L., & Penrod, S. D. (1988). Improving the reliability of eyewitness identification: Lineup construction and presentation. Journal of Applied Psychology, 73, 281–290.
Cutler, B. L., Penrod, S. D., & Martens, T. K. (1987). The reliability of eyewitness identification. Law and Human Behavior, 11, 233–258.
Fleet, M. L., Brigham, J. C., & Bothwell, R. K. (1987). The confidence-accuracy relationship: The effects of confidence assessment and choosing. Journal of Applied Social Psychology, 17, 171–187.
Foster, R. A., Libukman, T. M., Schooler, J. W., & Loftus, E. F. (1994). Consequentiality and eyewitness person identification. Applied Cognitive Psychology, 8, 107–124.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics (Vol. 3, pp. 41–58). New York: Academic.
Gross, S. R., Jacoby, K., Matheson, D. J., Montgomery, N., & Patil, S. (2005). Exonerations in the United States 1989 through 2003. Journal of Criminal Law and Criminology, 95, 523–560.
Haw, R. M., & Fisher, R. P. (2004). Effects of administrator-witness contact on eyewitness identification accuracy. Journal of Applied Psychology, 89, 1106–1112.
Huff, C. R., Rattner, A., & Sagarin, E. (1996). Convicted but innocent. Thousand Oaks, CA: Sage.
Loftus, E. F., & Ketcham, K. (1991).
Witness for the defense: The accused, the eyewitness, and the expert who puts memory on trial. New York: St. Martin’s.
Malpass, R. S., & Devine, P. G. (1981). Eyewitness identification: Lineup instructions and the absence of the offender. Journal of Applied Psychology, 56, 482–489.
O’Rourke, T. E., Penrod, S. D., Cutler, B. L., & Stuve, T. E. (1989). The external validity of eyewitness identification research: Generalizing across subject populations. Law and Human Behavior, 13, 385–395.
Overall, J. E. (1980). Continuity correction for Fisher’s exact probability test. Journal of Educational Statistics, 5, 177–190.
Phillips, M. R., McAuliff, B. D., Kovera, M. B., & Cutler, B. L. (1999). Double-blind photoarray administration as a safeguard against investigation bias. Journal of Applied Psychology, 84, 940–951.
Pryke, S., Lindsay, R. C. L., Dysart, J. E., & Dupuis, P. (2004). Multiple independent identification decisions: A method of calibrating eyewitness identifications. Journal of Applied Psychology, 89, 73–84.
Rosenthal, R. (1967). Covert communication in the psychological experiment. Psychological Bulletin, 67, 356–367.
Rosenthal, R. (1976). Experimenter effects in behavioral research (Expanded ed.). New York: Irvington.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park: Sage.
Rosenthal, R. (2002). Covert communication in classrooms, clinics, courtrooms, and cubicles. American Psychologist, 57, 839–849.
Rosenthal, R., & Fode, K. L. (1963). Psychology of the scientist: 5. Three experiments in experimenter bias. Psychological Reports, 12, 491–511.
Rosnow, R. L., Rosenthal, R., & Rubin, D. B. (2000). Contrasts and correlations in effect-size estimation. Psychological Science, 11, 446–453.
Scheck, B., Neufeld, P., & Dwyer, J. (2000). Actual innocence: Five days to execution and other dispatches from the wrongly convicted. New York: Doubleday.
Shapiro, P. N., & Penrod, S. D. (1986).
Meta-analysis of facial identification studies. Psychological Bulletin, 100, 139–156.
Snedecor, G. W., & Cochran, W. G. (1967). Statistical methods (6th ed.). Ames: Iowa State University Press.
Sporer, S. L. (1992). Post-dicting eyewitness accuracy: Confidence, decision-times and person descriptions of choosers and non-choosers. European Journal of Social Psychology, 74, 157–180.
Sporer, S. L. (1993). Eyewitness identification accuracy, confidence, and decision times in simultaneous and sequential lineups. Journal of Applied Psychology, 78, 22–33.
State of Connecticut v. Laquan Ledbetter, 275 Conn. 534 (2005).
State of Utah v. Anthony L. Long, 721 P. 2d. 483 (1986).
Steblay, N. M. (1997). Social influence in eyewitness recall: A meta-analytic review of lineup instruction effects. Law and Human Behavior, 21, 283–297.
Tunnicliff, J. L., & Clark, S. E. (2000). Selecting foils for identification lineups: Matching suspects or descriptions? Law and Human Behavior, 24, 231–258.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327–352.
U.S. v. Melvin Telfaire 152 U.S. App. D.C. 146; 469 F.2d 552 (1972).
Wells, G. L. (1984). The psychology of lineup identifications. Journal of Applied Social Psychology, 14, 89–103.
Wells, G. L., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photospreads. Law and Human Behavior, 22, 603–647.

Received September 19, 2007
Revision received November 24, 2008
Accepted December 1, 2008