Wherever two people communicate, deception is a reality. It is present in our everyday social and professional lives and its detection can be beneficial, not only to us individually but to our society as a whole. For example, accurate deception detection can aid law enforcement officers in solving a crime. It can also help border control agents to detect potentially dangerous individuals during routine screening interviews. Currently, the most successful and widespread system is the polygraph which monitors uncontrolled changes in heart rate and electro-dermal response, as a result of the subject’s arousal to deceit. Unfortunately, its widespread use does not necessarily mean it is a perfect system. Firstly, in order for it to take the necessary measurements, it needs to be continuously connected to the subject’s body. This means that the subject must be cooperative and in close proximity to the device. Secondly, it requires accurate calibration at the beginning of every session, so that a baseline of measurements can be established.
Occasionally, it may still fail to give accurate readings despite the calibration step, for example if the subject’s heart rate increases for reasons unrelated to deception. Furthermore, the polygraph is an overt system, which means that the subject knows they are being monitored and also knows what measurements are being made. As a result, they may devise techniques to trick the machine, such as remaining calm in an attempt to control their heart rate, or being excited during the calibration phase so that any excitement due to deception that the polygraph later registers will mistakenly be regarded as a normal response. Lastly, the polygraph requires a trained operator, whose skills and abilities control both the likelihood of human error in the interview and the length of the interview itself. Unlike computers, humans will get tired and will eventually need a break. Therefore, what is needed is an automatic and covert system, which can continuously and unobtrusively detect deception, without requiring the subject’s cooperation.
In response to this need, researchers have long been trying to decode human behavior, in an attempt to discover deceptive cues. These would aid them in designing systems for automatic deception detection or for training others to detect it. Some deceptive behaviors fall into one of two groups: over-control and agitation. In an attempt to hide their deception, liars who are aware of possible deceptive behavioral cues may exert extra effort in hiding any behavior, particularly by reducing movements of their hands, legs and head while they are being deceptive. At the other extreme are liars who show signs of agitated behavior triggered by nervousness and fear. As a result, their speech tends to be faster, or they may engage in undirected fidgeting. Nevertheless, it is incorrect to assume that agitated or over-controlled behavior is always a sign of deception.
One should also consider the normal behavior of a person, as well as the tone and context of the communication taking place. It may be the case that some subjects tend to behave in an over-controlled manner when interrogated by strangers. Others may seem agitated during an interrogation because they have just returned from their morning jog. According to Burgoon’s Expectancy Violations Theory (EVT), if in a communication the observed behavior deviates considerably from the expected behavior, then this is a cause for suspicion. For example, an interrogator may become suspicious of a suspect who is relaxed at the beginning of the interrogation but becomes agitated as soon as they are questioned about a crime. Furthermore, in their Interpersonal Deception Theory (IDT), Buller and Burgoon state that deception is a dynamic process, whereby liars adjust their behavior according to how much they believe they are suspected of being deceitful. It is likely that during their interaction, liars will unintentionally reveal some behavioral cues as a result of their deception and suspicion.
Early Attempts to Detect Deception
People who were involved in the assessment of credibility looked for accurate detection methods. Many of the early attempts to detect deception were founded on magic and mysticism. Actually, all the ancient methods could be classified as trial by either torture, combat or ordeal. Torture was used to make the presumed guilty person confess his guilt. In ancient Greece torture was reserved for slaves and for strangers. Free citizens were not tortured. In Rome, torture was a widespread procedure that spared no one. The torture of freemen accused of treason or of other crimes against the state became an admitted principle of Roman law. Torture was used for the investigation of truth, not as punishment. Therefore, it was used for witnesses as well. When the testimony of slaves was required, it was necessarily accompanied by torture to confirm it. However, there were some limitations to the use of torture. Women were spared torture during pregnancy. Judges were not allowed to torture more than necessary, and it was only used when the accused was about to confess. Marcus Aurelius ordered the exemption of patricians and of the higher imperial officers.
Diocletian forbade torture of soldiers, and after the adoption of Christianity, Theodosius directed that priests should not be subjected to torture. The notorious Spanish Inquisition used torture to detect the hidden crimes of those who were unfaithful to the church. In fact, the whole system of the Inquisition was built on torture. The presumption of guilt led to the use of cruel methods to force the accused to confess. Hence, all the odds were turned against the accused and only a few escaped. Trial by combat was based on the adoption of divine judgment, and contestants let the outcome of the battle decide who was truthful and who was not. Such a belief assumes that the adversaries should defend their claims themselves. However, in cases where the accused was unfit to fight, for example when a woman was accused, she was allowed to employ a champion to fight for her. At first, the champion was some member of the family. Later, it became the custom to substitute the contestant with a skilled champion, and professional champions sold their skill to the highest bidder.
Champions had an interest not to inflict injuries, and they agreed on rules such as not to use teeth and hands in the fight. In medieval Italy, champions were recognized as a class with an established institution consisting of selected individuals. To enhance fairness, efforts were made to select champions who were equal in age, size and strength. However, the efficacy of judicial combat is questionable. For example, in about the year 1100 Anselm stole the sacred vessels from the church of Leon. The merchant to whom he sold the vessels revealed his name to the church authorities. Anselm denied the accusation, offered to battle the merchant, and defeated him. Anselm was, therefore, proclaimed innocent. Later, Anselm confessed to his crime.
The injustice of both torture and trial by combat, their fallibility, and the advance of civilization encouraged the search for more peaceful modes to detect guilt. Torture and combat were abandoned. The ordeal is another way by which people cast their doubts on a higher power. It was based on the belief that God will protect the innocent and punish the guilty person. For example, in 592 a bishop, who was accused of a crime, took an oath on the relics of St. Peter. It was evident that the accused exposed himself to immediate danger, if guilty. However, he performed the ceremony unharmed and this was a proof of his innocence.
The literature refers to the ordeal of boiling water, the ordeal of cold water, the fire ordeal, the ordeal of balance, the ordeal of rice chewing, and a variety of other ordeals. Unlike torture and combat, ordeals are presently practiced to detect deception. In Tahiti the priest digs a hole in the clay floor, fills it with water and stands over it praying to God. God is supposed to lead the spirit of the thief over the water and the priest, who constantly looks at the water will see it. It seems that the thief who stands with others in a circle is more anxious than others to see if the priest detected his spirit. He approaches the pool and the water reflects his image.
The ordeal of the red-hot iron is applied among Bedouin tribes. The suspect licks a gigantic duly heated spoon and if he does not burn his tongue he is acquitted. One would expect that many innocent suspects will thus burn their tongue. However, many suspects are proclaimed innocent after the test. This can be explained by the trust of the truthful suspect in the ordeal. The innocent suspect behaves normally, and the saliva on his tongue spares him the injury. The fear of the guilty suspect, on the other hand, reduces the activity of the salivary glands, and when the dry tongue touches the hot iron it is injured. Similar to trial by combat, ordeals rely heavily on the belief of the suspect in God or in other mystical powers that control the outcome of the test. In an era of skepticism, ordeals cannot provide the solution. Hence, they have to be replaced by methods that better reflect the spirit of the time.
The Myths of Hypnosis and Narcoanalysis
Gradually, methods for detecting deception became more sophisticated and used more advanced technologies. Hence, it was thought that the use of hypnosis would assist in detecting deception. There are two major views of hypnosis. One holds that hypnosis represents a special form of consciousness which permits access to hidden parts of the mind. The other major view explains hypnosis by social psychological mechanisms and suggests that the hypnotized individual is affected by the social situation. Both views agree that hypnosis has no truth-compelling capacity. The person under hypnosis retains control and is able to judge events, and therefore can lie. Another myth is that of narcoanalysis (‘truth drugs’). Narcoanalysis was first used in psychiatric proceedings to facilitate communication with the emotionally disturbed patient. Drugs such as sodium amytal and sodium pentothal induced relaxation, ease, confidence, and a marked verbal release. It seemed that the patient under the influence of the drug did not stop talking. The relief from inhibitions and the decreased self-protective censorship of speech led to the idea that using drugs would reveal the hidden truth. However, it soon turned out that guilt-ridden people confessed under narcoanalysis to offenses they had imagined but had not committed, and others denied crimes that objective signs indicated they had committed.
Paper and Pencil Integrity Tests
Paper and pencil tests were developed to predict future thefts and other counterproductive behaviors of employees in the workplace. The tests are used in pre-employment screening and for other selection purposes. Some are overt integrity tests that ask applicants about their attitudes towards theft and other dishonest activities and about their own involvement in such behavior. Other tests disguise their purposes and include questions about dependability, conscientiousness, social conformity, trouble with authority and hostility. Integrity tests assume that honest people will be honest in all situations, and dishonest people are consistent in showing dishonest behavior.
Integrity tests are controversial. The controversy revolves around their effectiveness and the consequences of their use. Opponents claim that there is no such trait as honesty. The US Office of Technology Assessment (OTA) reviewed the research on integrity tests and concluded that there is no scientific support for the claim that integrity tests can predict dishonest behavior. The report asserts that integrity tests are biased against the applicant yielding a high rate of false positive errors (classifying an honest applicant as dishonest). These people will suffer from the stigma and be denied employment. The American Psychological Association (APA) published another report which provided a generally favorable conclusion regarding the use of paper-and-pencil integrity tests in personnel selection. It was suggested that properly documented integrity tests can predict a number of dishonest behaviors, and their validity is comparable to many other tests used in pre-employment selection. It seems that at present there is insufficient evidence to reach definite conclusions on the validity and the applicability of the integrity tests. However, there are indications that this direction may be promising.
Detection of Deception Through the Voice
Several devices were invented for the detection of emotional stress in the voice. The PSE (Psychological Stress Evaluator) has received more attention than others. The PSE was said to detect changes in the infrasonic frequency modulations that vary between 5 Hz and 20 Hz. These modulations are controlled by the central nervous system and disappear during stress. The theoretical basis of the PSE was criticized as invalid, and experimental studies failed to establish its validity. The popularity of the PSE declined, and today it is rarely used. However, the interest in the voice as a potential channel for the detection of deception remains intact. New technologies were developed, and recently a computerized system called the ‘TrusterPro’ was introduced. At present its validity is unknown.
Other methods which were invented to detect deception are based on the analysis of statements. Transcripts of statements made by suspects or by witnesses in which they detail what they did or what they saw are analyzed. The methods assume that truthful statements differ from false ones in both content and quality. One method, which was developed for the assessment of child witnesses in sexual abuse cases, is known as ‘Statement Validity Analysis’ (SVA). The SVA consists of two components: (1) a criteria-based content analysis (CBCA) in which 19 criteria have been proposed to reflect qualitative and quantitative differences between truthful and untruthful reports; and (2) examination of other evidence in the case. It is thought, for example, that a detailed report, which contains the exact description of the place, vivid description of people, and a step by step description of the events would suggest that the statement reflects the truth. Research conducted to validate the SVA yielded positive results for three CBCA criteria.
It was found that deceptive accounts contained fewer details and were rated as less coherent than truthful reports. Dishonest suspects were also less likely to admit not remembering aspects of the event under consideration. Another method through which deception may be detected is the Scientific Content Analysis (SCAN). The SCAN assumes that deceptive suspects will show more deviations in pronoun usage, such as replacing ‘I’ with ‘you’. Furthermore, deceptive suspects will present long introductions or omit the introduction altogether. Deceivers will also use many unnecessary connectors, such as ‘after I left’, and ‘and then’. Intuitively, the SCAN may work but to date there is no scientific evidence on the validity of the SCAN. A third method suggested that the number of distinct words in a statement (types) should be divided by the total number of words (tokens). It was suggested that a higher type-token ratio may indicate deception. This is explained by the cautious approach of deceptive suspects who try not to reveal self-incriminating information. Therefore, they phrase their testimony with higher lexical diversity. Further research is required to support the validity of lexical diversity.
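The type-token ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated scoring procedure: the function name, the tokenization rule (lowercased runs of letters) and the two example statements are assumptions made here, and the source specifies no threshold for what counts as a 'higher' ratio.

```python
import re


def type_token_ratio(statement: str) -> float:
    """Return distinct words (types) divided by total words (tokens)."""
    # Assumed tokenizer: lowercase the text and treat each run of letters as a word.
    tokens = re.findall(r"[a-z]+", statement.lower())
    if not tokens:
        raise ValueError("statement contains no words")
    return len(set(tokens)) / len(tokens)


# Invented example statements, for illustration only.
repetitive = "I left and then I left and then I left"
varied = "After dinner I walked home along the quiet river"

print(type_token_ratio(repetitive))  # 4 types over 10 tokens
print(type_token_ratio(varied))      # 9 types over 9 tokens
```

Because the ratio tends to fall as a text gets longer, comparisons are only meaningful between statements of similar length.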
Psychophysiological Detection of Deception (Polygraph)
When we speak about detection of deception the first thing that comes to mind is the polygraph. The polygraph is a device that continuously measures and records physiological responses from an examinee who answers a series of questions. Recordings are made of respiration, palmar sweating and relative blood pressure. Changes in respiration are obtained from two pneumatic rubber tubes positioned around the thoracic area and abdomen. The palmar sweating, or electrodermal skin response, is obtained from stainless-steel electrodes attached to the volar side of the index and fourth fingers of the examinee’s hand. Blood pressure is obtained from an inflated pressure cuff positioned around the upper portion of the examinee’s contralateral arm (Fig. 1). Another measurement that is used less frequently is the peripheral blood flow obtained from a photoelectric plethysmograph placed on a finger.
After the test, the recordings are analyzed. The analysis emphasizes inhibition of respiration, electrodermal response amplitude and changes in the blood pressure and blood volume (Fig. 2). The most common psychophysiological detection methods are the variety of procedures known as the Control Question Technique (CQT). Basically, the CQT contains three types of questions. (1) Relevant questions refer to the crime under investigation in the ‘did you do it?’ form (e.g. ‘Did you break into Mr Jones’s store last Saturday?’). Relevant questions are typically answered ‘no’.
(2) Control questions deal with undesirable acts in the past and pertain to matters similar to the crime being investigated but with a larger scope (e.g. ‘between the ages of 12 and 16 have you ever taken something valuable without permission?’). The control questions are formulated during the pretest interview with the intention that the examinee will remain with some doubt about the veracity of his ‘no’ answer. (3) Irrelevant questions focus on completely neutral issues to which the affirmative answer is a known truth (e.g. ‘are you sitting on a chair?’). Irrelevant questions are intended to absorb the initial orienting response evoked by any opening question, and to enable rest periods between the more loaded questions. Typically, the whole question series is repeated three or four times.
Figure 1 The polygraph attachments.
The inference rule underlying the CQT is based on a comparison of the responses evoked by the relevant and control questions. Deceptive individuals are expected to show more pronounced responses to the relevant questions, whereas truthful individuals are expected to show larger responses to the control questions. The CQT is available and convenient to use. Actually, all that is needed is a suspect who denies involvement with the crime and is willing to take the test. However, the CQT is also highly controversial. The controversy revolves around the plausibility of the rationale, the standardization of the test, the contamination by background information, the problem of countermeasures, and finally, the test’s validity.
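The comparison at the heart of the CQT inference rule can be written down as a small sketch. It is a simplification made for illustration only: the function name, the use of mean response amplitudes, the `margin` parameter and the inconclusive band are all assumptions introduced here, not a scoring system described in the text.

```python
from statistics import mean


def cqt_outcome(relevant, control, margin=0.0):
    """Compare mean responses to relevant and control questions.

    `relevant` and `control` hold response amplitudes (arbitrary units)
    collected over the repetitions of the question series.
    `margin` is an assumed width for an inconclusive band.
    """
    diff = mean(relevant) - mean(control)
    if diff > margin:
        return "deception indicated"      # larger responses to relevant questions
    if diff < -margin:
        return "no deception indicated"   # larger responses to control questions
    return "inconclusive"


# Invented amplitudes, for illustration only.
print(cqt_outcome([5, 6, 7], [2, 3, 4], margin=1.0))
print(cqt_outcome([2, 3, 4], [5, 6, 7], margin=1.0))
print(cqt_outcome([3, 4], [3, 4.5], margin=1.0))
```

The sketch makes one point of the controversy visible: the outcome depends entirely on how strongly the control questions happen to arouse the examinee.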
Actually, the vigorous controversy around the polygraph can be traced to the Frye case in 1923. Frye was charged with murder and denied the charge. He was tested with a systolic blood pressure test and was found truthful. The court did not admit the test’s results as evidence and in a landmark decision stated that: ‘while courts will go a long way in admitting expert testimony deduced from a well-recognized scientific principle or discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the particular field in which it belongs’. Since 1923 the polygraph has changed. However, the Frye decision reinforced the reluctance of most courts to admit polygraph results. In addition, the scientific community also ignored polygraph testing. Hence, practice outpaced research and the polygraph was operated by practitioners who lacked training and expertise in psychology. This resulted in low standards and severe ethical problems, which further heated the controversy. Only recently the Daubert case overruled the Frye decision.
Figure 2 Illustration of respiration, electrodermal and cardiovascular patterns of recording on a chart.
The Guilty Knowledge Test
The Guilty Knowledge Test (GKT), also known as the Concealed Knowledge Test (CKT), is a less controversial polygraph test. The GKT is used in applied settings to detect information that a suspect cannot or does not wish to reveal. The test utilizes a series of multiple-choice questions, each having one relevant alternative (e.g. a feature of the crime under investigation) and several neutral (control) alternatives, chosen so that an innocent suspect would not be able to discriminate them from the relevant alternative. The following is an example of a real GKT test which led to the apprehension of the culprits. At 22.45 h, four masked men entered the home/parish of a Nazareth priest. They gagged him and two other residents and locked them in a remote room, after which they stole the contents of the parish safe. Intelligence information led to the suspect, who was a member of the priest’s congregation. Twelve days after the robbery the suspect was brought in for a polygraph test. After denying knowledge of the correct answers, the suspect was asked the following questions:
(a) ‘What was the object used to gag the priest?’ The following alternatives were presented to the suspect: Scarf, shirt, handkerchief, towel, sock, diaper, undershirt. The relevant item was a sock. (b) ‘Which name was uttered by one of the robbers?’ The following alternative answers were presented: Jacob, Salomon, David, Abraham, Samuel, Benjamin, Daniel. The relevant name was Abraham. It is assumed that only a guilty suspect will be able to single out and respond differentially to both the object that gagged the priest and the name the robber uttered. Innocent suspects, who have no guilty knowledge, are unable to distinguish crime-related information from other alternatives.
Inferences are made on the basis of the GKT by comparing the responses elicited by the relevant item with the responses to irrelevant items. Only if the responses to the relevant item are consistently larger, is guilty knowledge inferred. This provides a proper control against false positive outcomes, inasmuch as the likelihood that an innocent examinee might show consistently greater responsiveness to the correct alternative just by chance can be reduced to a low level by adding irrelevant items and by utilizing more GKT questions. In the robbery case, the guilty suspect, who later confessed, responded to both relevant items.
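The protection against false positives can be made concrete with a short calculation. The sketch below assumes that an examinee without guilty knowledge is equally likely to give the largest response to any alternative, and that questions are independent; both are simplifying assumptions, and the function name is chosen here for illustration. With the seven alternatives per question listed in the robbery example, the chance of responding most strongly to the relevant item on both questions is (1/7)^2 = 1/49.

```python
from fractions import Fraction


def chance_all_hits(n_questions: int, n_alternatives: int) -> Fraction:
    """Chance that an unknowing examinee shows the largest response to the
    relevant item on every question, assuming all alternatives are equally
    likely to evoke the largest response and questions are independent."""
    return Fraction(1, n_alternatives) ** n_questions


# Seven alternatives per question, as in the robbery example.
for n in (1, 2, 4, 5):
    p = chance_all_hits(n, 7)
    print(f"{n} question(s): {p} (about {float(p):.5f})")
```

Both levers named in the text appear directly in the formula: more alternatives shrink the base of the power, and more questions raise the exponent.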
A recent review of 15 GKT mock crime experiments revealed that the rate of correct detection reported in simulated GKT experiments is quite impressive. It was found that across these 15 studies, 80.6% of 299 guilty examinees and 95.9% of 291 innocent examinees were correctly classified. Furthermore, in eleven of the studies, no false positives were observed. This supports the notion that the GKT can protect the innocent from false detection. To establish the accuracy of the GKT in real-life criminal investigations, two field studies were designed. In both studies, the amplitude of the electrodermal response was used as an index of guilty knowledge. Excluding inconclusive outcomes, a very high detection rate for innocent suspects (97.9% and 97.4%, respectively) has been obtained. However, both studies reported a considerably lower detection rate for guilty suspects (50% and 53.3%, respectively). The low detection rate obtained for guilty suspects may be attributed to the differences between simulated and true examination conditions.
First, the police investigator in the field cannot be sure that the guilty suspect noticed all the relevant details and remembered them at the time of the test. Therefore, an appropriate use of the GKT procedure requires the use of at least four or five GKT questions. In that case, the recognition of the other relevant items will compensate for overlooking one item by the guilty suspect. In any case, the small number of questions that were used in the field studies (mean number of 2.04 and 1.8 questions, respectively) may have contributed to the false negative error rate. Second, the use of a single electrodermal measure as the dependent variable undermines the test’s accuracy. The second field study revealed that the addition of a respiration measure enhanced detection of guilty suspects to 75.8% while keeping the false-positive error rate relatively low (5.9%). Thus, a proper integration of two efficient measures increases the likelihood that the guilty suspect, who is responsive to at least one measure, will be detected.
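Why more questions compensate for an overlooked detail can be illustrated with a simple binomial model. Everything numeric here is an assumption made for illustration: the model treats each relevant item as noticed and remembered independently with the same probability, and the value 4/5 is invented, not an estimate from the field studies.

```python
from fractions import Fraction
from math import comb


def prob_recognizes_at_least(n_questions: int, recall: Fraction, min_items: int) -> Fraction:
    """Probability that a guilty suspect recognizes at least `min_items`
    relevant items out of `n_questions`, under an assumed binomial model."""
    return sum(
        comb(n_questions, k) * recall**k * (1 - recall) ** (n_questions - k)
        for k in range(min_items, n_questions + 1)
    )


recall = Fraction(4, 5)  # illustrative assumption only

two_q = prob_recognizes_at_least(2, recall, 2)   # both of two items needed
five_q = prob_recognizes_at_least(5, recall, 2)  # any two of five items suffice

print(f"2 questions: {float(two_q):.5f}")
print(f"5 questions: {float(five_q):.5f}")
```

With only about two questions, as in the field studies, a single forgotten detail already removes the evidence of recognition; with five questions the suspect can miss several items and still be detectable. The decision rule must of course still be strict enough to keep chance hits by innocent examinees rare.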
An important issue related to polygraphic testing concerns the extent to which a high accuracy level can be maintained when guilty suspects use specific point countermeasures in an effort to distort their physiological responses and yield favorable outcomes. There are two major types of specific countermeasures: physical activities such as muscular movements, self-induced pain or controlled respiration, and mental activities such as thinking relaxing or exciting thoughts. The present evidence suggests that both countermeasure types may be efficient in distorting electrodermal responses, and it is possible for some deceptive individuals to beat the GKT in an experimental situation where the conditions are optimal and the participants receive information and training about the effective use of countermeasures. However, in real life conditions, when the test is administered immediately after the interrogation, the suspect will not have the time to prepare, and the countermeasures may not be effective.
Furthermore, the respiration measure is more resistant to the effects of specific countermeasures. This may suggest that more weight should be given to respiration measures in the GKT. Another way to deal effectively with specific countermeasures is the use of slow-wave components of the event-related brain potentials (ERP). ERPs are measures of brain electric activity obtained from electrodes attached to the person’s scalp. The P300 component of the ERP refers to a positive change in voltage of the peak which emerges at about 300 ms after presentation of a stimulus. The P300 signals special cognitive processing of recognized or infrequent stimuli. Therefore, it is ideal for detecting concealed knowledge.
It has been shown that guilty participants produce P300 responses to stimuli recognized as guilty knowledge. The short latency of the P300 response is likely to undermine the effectiveness of specific countermeasures. To conclude, the GKT is a standard psychological test. The questions are determined by the features of the crime and do not depend on examiner or examinee factors. The experience in Japan, where the GKT is widely applied and the criminal justice system accepts its outcomes as evidence in court, suggests that the GKT might meet the requirements of a legally accepted form of evidence.
Polygraph tests were developed to help criminal investigators in situations where they are required to decide whether or not to believe the suspect. There are two major polygraph procedures, the CQT and the GKT. The more convenient and easy to operate CQT is widely used by law enforcement agencies. However, there is major controversy revolving around its rationale and inference rule, as well as around the empirical question of its validity. Opponents of the CQT suggest it should be replaced with the more standardized and scientifically based GKT. However, in many cases the GKT cannot be an alternative for the CQT. Furthermore, GKTs are considered unimportant by federal law enforcement agencies in the US and are almost never used in criminal cases. A survey of FBI polygraph investigations estimated that the GKT might have been used in only 13.1% of them.
The reason for this poor estimate lies in the difficulties involved in obtaining a sufficient number of proper GKT questions (i.e. questions that use items that can be identified by the guilty suspects but not by innocent people). Furthermore, the relevant information should be kept away from the public, the press, and the suspects during their interrogation. This can be achieved only by highly disciplined police investigators. In the United States most relevant case information is revealed either through the mass media or during the interrogation. A contrasting view about the GKT prevails in Japan where the GKT is applied almost exclusively and its outcomes are accepted in court. Effort should be invested in applying the GKT more frequently, but at the same time, the CQT should not be abandoned.
The GKT can be used immediately after the initial interrogation as an elimination method. The CQT would be more useful later on, after the suspicion against the suspect has been elaborated. The CQT is especially effective in cases where one suspect accuses the other of committing an offense and the other claims to be falsely accused (e.g. complaints of police brutality and sex offenses). If both sides are being examined by two independent examiners and yield conflicting outcomes, even a modestly valid polygraph test would be sufficient. Proper use of the various polygraph methods may facilitate police work and contribute to the welfare of innocent suspects. In many cases the polygraph is the only means by which innocent examinees can prove their innocence and free themselves from further interrogation.
This report evaluates the effectiveness of the polygraph method of lie detection. In this technique, the physiological responses of a person being interrogated are observed to provide a basis for inferring whether or not an attempt has been made to deceive the interrogator. The major finding is that, although the method of lie detection has been used extensively and is regarded favorably by its practitioners, the degree of its validity is still not known. This situation is the result of a failure to collect objective data necessary to assess the effectiveness of this method of interrogation. The report describes the methodological problems which must be faced in order to collect meaningful data, recommends research which should be undertaken to increase our understanding of this technology, and makes suggestions for improving professional standards in this area.
Objective data to demonstrate the degree of effectiveness of the polygraph as an instrument for the detection of deception has not been compiled by the agencies that use it in the Department of Defense. This is true despite the fact that about 200,000 such examinations have been performed over the last 10 years. Up to the present time, it has proved impossible to uncover statistically acceptable performance data to support the view held by polygraph examiners that lie detection is an effective procedure. There can be no doubt that the measurement of physiological responses in the context of a structured interview provides a basis for the detection of deception by objective means. Extensive research by physiologists and psychiatrists shows that humans exhibit many physiological responses in stressful situations; however, such research was not performed to explore its relevance to lie detection. Thus, we do not know at present the increment in effectiveness which the polygraph brings over an interrogation without a polygraph.
There is a lack of professional standards for the regulation of lie detection activities throughout the Department of Defense. Many aspects of the technology of lie detection are inadequately developed. Areas which require study are the reliability and validity of lie detection in laboratory and real life situations, the incremental value of new physiological indicators, improvement of the interview procedures, application of automatic data processing to polygraph records, and examination of the possibility that individuals exhibit unique patterns of autonomic response. Recent developments in medical electronics provide more reliable and convenient sensors than those now used in lie detection.
The research problems in lie detection are straightforward and there is every reason to believe that a research program would achieve its objectives. There is evidence that training, possibly supported by drugs and hypnosis, can be used to introduce spurious effects into test records. The extent to which such methods could succeed or an examiner could counteract them is unknown. Improvements in the art of lie detection would be useful not only for its present applications to security and criminal interrogations, but for screening foreign personnel and as one means of inspection in an arms control agreement.
Establish a program for research and development in the technology of lie detection: This program should include studies on the validity of lie detection, improvement of interview procedures, the development of improved sensors, the effectiveness of adding new physiological indicators to the polygraph, and automatic data processing of test records. There is a need to study measures that could be taken to avoid detection on the polygraph and, if they are shown to be effective, to develop suitable countermeasures. The program should also include studies on the effect of cultural and political influences on the value of lie detection, if it were considered as one means of inspection in an arms control agreement. Establish a program to develop professional standards for polygraph interrogation throughout the Department of Defense: This program should consider selection, training, and certification of examiners; methods of supervision; methods of maintaining competence; reorient d performance evaluation; and relation of operating personnel to research and visions in this area.