Halo Effect Physical Attractiveness Research Paper

The halo effect is a type of cognitive bias in which our overall impression of a person influences how we feel and think about his or her character. Essentially, your overall impression of a person ("He is nice!") impacts your evaluations of that person's specific traits ("He is also smart!").

One great example of the halo effect in action is our overall impression of celebrities. Since we perceive them as attractive, successful, and often likable, we also tend to see them as intelligent, kind, and funny.

Definitions of the Halo Effect

  • "Also known as the physical attractiveness stereotype and the "what is beautiful is good" principle, the halo effect, at the most specific level, refers to the habitual tendency of people to rate attractive individuals more favorably for their personality traits or characteristics than those who are less attractive. Halo effect is also used in a more general sense to describe the global impact of likeable personality, or some specific desirable trait, in creating biased judgments of the target person on any dimension. Thus, feelings generally overcome cognitions when we appraise others."
    (Standing, L. G., in The SAGE Encyclopedia of Social Science Research Methods, Volume 1, 2004)
  • "In a study made in 1915 of employees of two large industrial corporations, it appeared that the estimates of the same man in a number of different traits such as intelligence, industry, technical skill, reliability, ect., etc., were very highly correlated and very evenly correlated... Ratings were apparently affected by a marked tendency to think of the person in general as rather good or rather inferior and to color the judgments of the qualities by this general feeling. This same constant error toward suffusing ratings of special features with a halo belonging to the individual as a whole appeared in the ratings of officers made by their superiors in the army."
    (Thorndike, E. L., "A Constant Error in Psychological Ratings," 1920)

The History of the Halo Effect

Psychologist Edward Thorndike first coined the term in a 1920 paper titled "The Constant Error in Psychological Ratings." In the experiment described in the paper, Thorndike asked commanding officers in the military to evaluate a variety of qualities in their subordinate soldiers.

These characteristics included such things as leadership, physical appearance, intelligence, loyalty, and dependability.

Thorndike's goal was to determine how ratings of one quality bled over to assessments of other characteristics. He found that high ratings of a particular quality correlated to high ratings of other characteristics, while negative ratings of a specific quality also led to lower ratings of other characteristics.

"The correlations were too high and too even," Thorndike wrote. "For example, for the three raters next studied the average correlation for physique with intelligence is .31; for physique with leadership, .39; and for physique with character, .28."

So why do our overall impressions of a person create this halo that influences our evaluations of specific traits? Researchers have found that attractiveness is one factor that can play a role.

Several different studies have found that when we rate people as good-looking, we also tend to believe that they have positive personality traits and that they are more intelligent. One study even found that jurors were less likely to believe that attractive people were guilty of criminal behavior.

However, this attractiveness stereotype can also be a double-edged sword.

Other studies have found that while people are more likely to ascribe a host of positive qualities to attractive people, they are also more likely to believe that good-looking individuals are vain, dishonest, and likely to use their attractiveness to manipulate others.


  • "In the classroom, teachers are subject to the halo effect rating error when evaluating their students. For example, a teacher who sees a well-behaved student might tend to assume this student is also bright, diligent, and engaged before that teacher has objectively evaluated the student's capacity in these areas. When these types of halo effects occur, they can affect students' approval ratings in certain areas of functioning and can even affect students' grades."
    (Rasmussen, Encyclopedia of Educational Psychology, Volume 1, 2008)
  • "In the work setting, the halo effect is most likely to show up in a supervisor's appraisal of a subordinate's job performance. In fact, the halo effect is probably the most common bias in performance appraisal. Think about what happens when a supervisor evaluates the performance of a subordinate. The supervisor may give prominence to a single characteristic of the employee, such as enthusiasm, and allow the entire evaluation to be colored by how he or she judges the employee on that one characteristic. Even though the employee may lack the requisite knowledge or ability to perform the job successfully, if the employee's work shows enthusiasm, the supervisor may very well give him or her a higher performance rating than is justified by knowledge or ability."
    (Schneider, F.W., Gruman, J. A., & Coutts, L. M., Applied Social Psychology, 2012)

The Halo Effect at Work in the Real World

As you read above, the halo effect can influence how teachers treat students, but it can also impact how students perceive teachers. In one study, researchers found that when an instructor was viewed as warm and friendly, students also rated him as more attractive, appealing, and likable.

Marketers take advantage of the halo effect to sell products and services. When a celebrity spokesperson endorses a particular item, our positive evaluations of that individual can spread to our perceptions of the product itself.

Job applicants are also likely to feel the impact of the halo effect. If a prospective employer views the applicant as attractive or likable, they are more likely to also rate the individual as intelligent, competent, and qualified.

So, the next time you trying to evaluate another person, whether it is deciding which political candidate to vote for or which movie to see on a Friday night, consider how your overall impressions of an individual might influence your evaluations of other characteristics.

Does your impression of a candidate being a good public speaker lead you to feel that she is also smart, kind, and hard-working? Does thinking that a particular actor is good-looking also lead you to think that he is a compelling actor?

Being aware of the halo effect, however, does not make it easy to avoid its influence on our perceptions and decisions.


Rasmussen, K. Halo Effect. In N. J. Salkind & K. Rasmussen (Eds.), Encyclopedia of Educational Psychology, Volume 1. Thousand Oaks, CA: Sage Publications, Inc.; 2008.

Schneider, F.W., Gruman, J. A., & Coutts, L. M. Applied Social Psychology: Understanding and Addressing Social and Practical Problems. London: SAGE Publications, Inc.; 2012.

Standing, L. G. Halo Effect. In M. S. Lewis-Black, A. Bryman, & T. F. Liao (Eds.), The SAGE Encyclopedia of Social Science Research Methods, Volume 1. Thousand Oaks, CA: SAGE Publications, Inc.; 2004.

Thorndike, E. L. The Constant Error in Psychological Ratings. Journal of Applied Psychology. 1920;4, 25-29.


A review by Langlois et al. [1] suggested that people regularly make judgements based on appearance and argued that “if humans were not biased to judge others on their appearance, they would not need to remind their children not to judge books by their covers” (p. 408). While frequently warned against ‘judging a book by its cover’, the field of face perception is filled with evidence that suggests that the face does contain a substantial amount of information for evaluators to infer traits. For instance, Kramer and Ward [2] found that four of the Big Five personality traits, as well as physical health, were perceived with some limited accuracy from internal facial features alone and three of the Big Five traits were accurately perceived (just above chance) from just one side of the face. Similarly, Penton-Voak, Pound, Little, and Perrett [3] found that there was some limited accuracy in perceptions of extraversion, emotional stability, and openness to experience when presented with images of composite faces (combining the faces of people with the same personality). Also, research by Little, Burt, Penton-Voak and Perrett [4] found that evaluators were differentially attracted to faces depending on personality traits desired in a partner; that is, “if a trait is desired then faces perceived to possess that trait are found more attractive than faces which do not possess that trait” (p. 1107). Such research highlights potential accuracy in face perception and the relationship between limited accuracy in perceived traits and attractiveness.

Indeed, when investigating the accuracy of perceived intelligence [5] and of perceived health [6] in faces it was found that accuracy was improved to a level above chance when controlling for attractiveness bias. The ‘attractiveness halo effect’ in which desired personality traits are ascribed to attractive people over unattractive people [7] seems to influence the use of attractiveness as a cue when attempting to accurately perceive health or intelligence in faces and is in turn, limiting people’s accuracy. The relationship seems to reflect a suppression effect, in which the suppressor (perceived attractiveness) is correlated with the other predictor variable (perceived health or intelligence), but is not related to the dependent variable (actual health or intelligence), so when this noise (relationship between attractiveness and perceived health or intelligence) is controlled for the accuracy in perceptions of actual health or intelligence is increased [8–12].

Accurate Perceptions of Intelligence and Attractiveness Halo

Kleisner, Chvatalova, and Flegr [5] reported accurate perceptions of intelligence in men’s but not women’s faces. It is important to note that a significant relationship between perceived and actual intelligence was only evident after statistically controlling for perceived attractiveness, though perceived attractiveness itself was not found to be a valid cue to actual intelligence. Kleisner et al. [5] argue that one of the reasons accurate estimations of intelligence are demonstrated in men but not women may be due to the stronger effect of the attractiveness halo in perceptions of female intelligence. These findings highlight the pervasive and detrimental influence of attractiveness on accuracy in attributions.

For decades researchers have debated the accuracy in perceived intelligence and whether attractiveness is a valid cue to actual intelligence [1,5,13–15]. A study by Zebrowitz, Hall, Murphy, and Rhodes [16] found that judgments of intelligence from faces were more accurate than chance for images from childhood, puberty, and middle adulthood, but not more accurate than chance in adolescence or late adulthood. Zebrowitz et al. [16] discussed how facial attractiveness might relate to actual intelligence based on various potential paths: (a) biological, with good genes being inherited; (b) environmental, including the impact of nutrition and healthcare; (c) influence of intelligence on grooming and health decisions; (d) and a self-fulfilling prophecy, in which attractive people are expected to be smarter and given greater opportunities to become smarter. A later study by Zebrowitz and Rhodes [17] investigated the relationship between facial attractiveness and actual intelligence in the upper and lower halves of the attractiveness distribution and reported that, consistent with the ‘bad genes hypothesis’, facial attractiveness was a valid cue to actual intelligence only in the lower half of the attractiveness distribution. Consistent with the ‘anomalous face overgeneralization hypothesis’, attractiveness was used (spuriously) as a cue to intelligence across the entire attractiveness distribution [17]. Thus, participants were accurate in judging intelligence based on attractiveness, but only because faces perceived as unattractive were judged as having low intelligence. These findings are consistent with the ‘bad genes’ hypothesis, which implies that faces perceived as very unusual or unattractive may be an indicator of poor genetic fitness.

A more recent study by Mitchem et al. [18] highlights several problems in previous research investigating attractiveness and intelligence, namely publication bias, inconsistencies in definitions of intelligence and attractiveness, research design flaws, and small sample sizes. They conducted research on the largest sample to date, utilizing a twin dataset and independently collected measures of facial attractiveness and general intelligence. They found no support for a relationship between actual intelligence and perceived facial attractiveness.

Attractiveness and Academic Performance

Research has also investigated the potential relationship between perceived attractiveness and actual academic performance, with no clear consensus. Some investigations have showed that students who are perceived as more attractive achieve higher grades and higher scores on standardized achievements tests (e.g. [19–21]). Other studies failed to find any relationship (e.g. [22,23]).

Nonetheless, the relationship between perceived attractiveness and perceptions of academic performance is clear. A meta-analysis conducted by Dusek and Joseph [24] scrutinized fourteen studies investigating physical attractiveness and its relation to teacher expectancy. The review concluded that perceived facial attractiveness is significantly correlated with teacher expectations of academic performance and positive personality attributes. For example, a cornerstone study by Clifford and Walster [25] indicated a significant correlation between physical appearance and teacher expectations. A similar study also suggested a positive correlation between teachers’ ratings of attractiveness and expectations of children’s skills [26] showing that teachers judged children rated as more attractive as more social, confident, popular, academically strong, and more likely to become leaders than students who were rated as less attractive.

Another meta-analytic review by Ritts, Patterson, and Tubbs [27] found that students perceived as attractive are more likely than students perceived as unattractive to be ascribed positive educational traits. Specifically, students perceived as attractive were judged as more intelligent, having more academic potential, and having better grades. It was also noted that other variables such as gender, race, and knowledge of past performance also influenced expectations, but were not significant moderators to the attractiveness influence [27]. Consequently, while there is little consensus and weak supporting evidence for a relationship between perceived attractiveness and actual intelligence or academic performance, there is convincing research documenting the relationship between perceived attractiveness and perceived intelligence or academic performance.

Accuracy in Face Perception

Research suggests extroversion can be accurately perceived after only a 50-ms exposure to a face [28], strength can be accurately estimated from faces independent of height, weight, and age [29] and the dark triad of personality (Machiavellianism, narcissism, and psychopathy) can be accurately perceived in composites of expression-neutral facial images [30]. Note here that accuracy does not imply a large effect size; accuracy may be significant, but with performance only slightly above chance. Nonetheless, this limited accuracy is still somewhat impressive given the lack of conventional information (i.e. information about behaviour) that we typically think affects such judgements; thus, the effects may be small but they are still noteworthy. Todorov, Olivola, Dotsch, and Mende-Siedlecki [31] suggests that little time is needed to arrive at a consensus on social attributions from faces, however many studies overstate the validity of these attributions. There are various perspectives on why and how such social attributions from faces are made that explain the potential both for accuracy and for limitations in accuracy.

Biological cues may shed light on how people are rating social judgements at above-chance accuracy from neutral-expression facial images alone. For instance, research suggests the shape of a face is related to the current [32–34] and prenatal [35] levels of testosterone. Research has also suggested that facial adiposity is closely associated with circulating testosterone [36] and that facial adiposity has been shown to be related to perceived health and attractiveness, as well as measures of actual cardiovascular health and proneness to respiratory illness [37]. Further, facial symmetry, and sex typicality in face shape has been shown to be related to disease resistance [38]. Similarly, an average face shape may signal health, as abnormalities that make a face look slightly different from the average may be caused by genetic or environmental stress [17]. Carotenoid coloration in the face has also been found to signify quality of current diet [39]. The face can also provide clues to recent sleep history, with those who are sleep deprived having less eyelid-openness and more downward mouth curvature than those that are well rested [40].

Health, Attractiveness and Over-generalization

Clearly, the face provides a variety of cues to hormones, health, and sleep status. One thing all of these cues have in common is their relationship to attractiveness. Namely, research investigating attractiveness and the ‘good genes’ theory has argued that facial symmetry [41], averageness [42], sexual typicality [43], eyelid-openness and mouth-curvature [44], carotenoid coloration in the face [39], and adiposity [37] may be attractive because of their relationship to health [45–47].

The link between potential cues to health in the face and perceived attractiveness is one explanation for the ‘attractiveness halo effect’. Research suggests this preference for attractive (or healthy looking) individuals appears early in infancy, with infants as young as two-months old gazing longer at attractive faces over unattractive or unusual looking faces [48,49]. It is unknown whether or not such preferential looking reflects early learning [50–52]. Further, Langlois, Roggman, and Reiser-Danner [53] found that twelve-month-old infants would play longer, have more involvement, experience less distress and withdrawal, and seem to exhibit more pleasure when interacting with attractive people as compared to unattractive people. Also noteworthy is the degree of agreement regarding facial attractiveness. Specifically, studies have shown consistency between men and women regarding opinions of facial attractiveness [45]. Surprisingly, agreement on facial attractiveness is apparent even across different countries [1,54].

In an attempt to investigate whether facial attractiveness provides evidence of actual health, which may partially explain this positive bias towards attractive people, Kalick, Zebrowitz, Langlois and Johnson [6] found that evaluators’ perceptions of attractiveness are actually poor predictors of current or future actual health. While attractive faces were mistakenly rated as healthier than their peers, the correlation between perceived health and actual health increased when attractiveness was statically controlled, implying that attractiveness suppresses the accurate recognition of health.

This improvement in accuracy of health judgments after controlling for attractiveness is similar to the improved accuracy of intelligence judgments when the attractiveness halo is statistically controlled [5]. Indeed there is evidence to suggest a relationship between various health factors and cognitive or intellectual performance. Specifically, it has been found that phobic anxiety [55], trait anxiety [56,57], drug use [58], diabetes [59,60], poor sleep [61], and frailty [62] have been negatively associated with both health and cognitive function in older individuals. Similarly, exposure to chronic aircraft noise [63], infection with parasitic worms [64] and food insufficiency [65] have been found to negatively impact health and cognitive performance in children. Given the close relationship between actual health, actual cognitive performance, and perceived attractiveness, facial cues to health might also be cues to both attractiveness and cognitive ability, leading to correlations between attractiveness and perceived competence. Such correlation might lead to overgeneralization and inaccurate perceptions of academic ability in healthy individuals based spuriously on attractiveness. Hence we explore whether or not the ‘blinded by beauty’ phenomenon found in perceptions of health [6] and intelligence [5] also applies to the perception of academic performance from first impressions of neutral-expression static facial images.

Theories of Intelligence and Academic Performance

Given the controversy over definitions of intelligence and differences in theories of intelligence [66] it is likely that, in addition to being limited by the attractiveness halo, accurate perceptions of intelligence are also limited by variation in understanding on the meaning of the term ‘intelligence’. While someone who agrees with a fixed theory of intelligence believes there is little a person can do to change their actual intelligence, someone with a growth theory of intelligence argues that intelligence can change over time with the appropriate environment [67–69].

Perceptions of academic performance from faces are likely to suffer similar inconsistences in evaluator perspectives of what factors most influence academic performance. While research has consistently shown that intelligence predicts academic performance [70], it is well documented that the personality trait of conscientiousness is a stronger predictor of academic performance than intelligence [70,71]. Hence, it could be argued that asking evaluators to assess academic performance from faces would yield just as much ambiguity as attributions of intelligence, as consensus would be adversely affected both by disagreement in fixed vs. growth theories of intelligence, and by different perspectives on how much academic performance relies on intelligence versus conscientiousness.

Research on the Intelligence Competence Theory (ICT) further undermines consensus of perceived academic performance by suggesting that people who are less intelligent compensate by becoming more conscientious to reach their goals [72,73]. Thus, some might think a person with a less intelligent looking face is more academically able because the person may work harder to get better grades. Previous studies have highlighted consensus and accuracy of perceptions of most of the big five personality traits from face, yet conscientiousness is sometimes [74], but not always correctly detected [3,75]. Given the relationship between actual conscientiousness and academic performance (compared to intelligence), we explore whether perceptions of conscientiousness are more likely to predict actual academic performance than perceptions of intelligence.

Research Questions

Research investigating perceptions of academic performance has primarily been concerned with exploring the potential of attractiveness to be a valid predictor of academic performance [19–23,76] and exploring the effects of perceived academic performance on students’ actual performance in the future [24–27]. No research that we are aware of has investigated the potential accuracy of perceptions of actual academic performance from faces when controlling for the attractiveness halo. Given the different perspectives and theories of the term ‘intelligence’ [66,67,77] and the varying perspectives on how much intelligence predicts academic performance compared to conscientiousness [72,73], we hypothesize that evaluators will be more accurate in perceiving actual academic performance when specifically asked to rate conscientiousness than when asked to rate the more ambiguous terms ‘intelligence’ or ‘academic performance’.

Further, it is possible that attractiveness detracts from accuracy in perceptions of academic performance much as attractiveness can detract from accuracy in perceptions of health and intelligence. While there are various seemingly logical explanations for why attractiveness could be a valid cue to academic performance, the empirical evidence for a link between the two is extremely weak and perhaps only existing in the lower half of the distribution (i.e. driven by potential outliers with genetic or developmental problems affecting both appearance and cognitive ability). We hypothesise an ‘attractiveness halo’ in which attractiveness is not linked to actual academic performance but is significantly correlated with perceptions of academic performance. Further, we hypothesise that controlling for the misperceptions about attractiveness may improve accuracy in perceptions of academic performance.

We argue that this effect of controlling for attractiveness takes the form of a classic type of suppression (see [8–12]). In classical suppression, the suppressor is unrelated to the variable of interest but is related to the predictor, and therefore the shared variance between the predictor (in this case, perceived conscientiousness, intelligence or academic performance) and the suppressor (attractiveness) is unrelated to the outcome measure (actual academic performance). By controlling for this irrelevant variance in the predictor, the strength of the association between the predictor and outcome variable increases. In other words, controlling for attractiveness may reveal a ‘blinded by beauty’ phenomenon similar to that found in health [6] and intelligence [5] judgments.


All data collection was approved by UTREC and the School of Psychology and Neuroscience ethics committee (PS1087), University of St Andrews. All participants provided informed written consent and were debriefed accordingly. The individuals in this manuscript have also given written informed consent to blend their facial photographs to create average faces and publish these case details. Written consent was recorded via both electronic submission and on hard copies. The ethics committee approved this consent procedure.

Facial Stimuli

Students from the University of St Andrews were recruited to take part in an experiment called “Influences in the perception of intelligence in faces” as part of a larger data collection. One-hundred of the most standardized (e.g. clean shaven, neutral expression and head posture) Caucasian faces between the ages of 18 and 24 (Mage = 20.85, SD = 2.15; 67 females, 33 males) were chosen as stimuli. The original image collection contained more women than men and removal of males with beards enhanced the gender bias. Nonetheless, we maximised the number of stimuli available for judgments to maintain power in the analysis. Selection of standardized faces was done blind to their academic performance. Todorov and Porter [78] highlight significant differences in person impressions within multiple facial photos of the same person due to random variation and discuss how this can influence accuracy of personality inferences based on faces. Thus, it was important to select the most standardized stimuli. All of the stimuli photographs of participants used were taken under standardized lighting conditions and camera set-up; individuals had their hair pulled back, did not wear any kind of make-up or jewellery, and were instructed to pose with a neutral facial expression. Face images were aligned on left and right pupils. Images were then resized and cropped (1608 x 2584 pixels) so that an equal proportion of hair and neck was exposed in each.

Academic Performance Measures

All participants consented to releasing their academic performance records for the purpose of this research. Academic records were accessed via the Universities database. Academic performance at the University of St Andrews is marked on a 20-point scale reported to one decimal place for final module grades. An average academic performance was calculated by taking the Grade Point Average (GPA) across every year weighted by every module credit completed by the student. Participants varied in their course of study and the number of modules completed based on their year and semester of study (63 in Sciences, 37 in Arts; 44 first and second year undergraduates, 39 third and fourth year undergraduates, and 17 in postgraduate courses). Accordingly, methods of evaluation (e.g. exam, essay, and dissertation) varied.

Face Ratings

Four separate groups of participants were recruited and paid via Amazon Mechanical Turk to obtain ratings of perceived attractiveness, intelligence, conscientiousness, and academic performance (no other face ratings obtained for this study). Table 1 shows the demographics of each participant group. Differences in sample sizes were based on differences in the number of participants completing the task while the link was live on Amazon Mechanical Turk and number of exclusions. Participants who reported their ethnicity as different from ‘white Caucasian’ were excluded when calculating the average ratings of perceived attractiveness, intelligence, conscientiousness, and academic performance, as stimuli presented were Caucasian and judgments of other ethnicities may be more susceptible to stereotypes [79]. Analysis was re-run with all participants and there were no differences in the pattern of findings; i.e., all significant results remain significant, and all non-significant results remain non-significant.

Evaluators first previewed all stimuli with each image displayed for one second. The stimuli were then re-presented so that participants could rate the face on the focal trait for each sample: perceived attractiveness, intelligence, consciousness, or academic performance. Faces were presented in random order. To ensure the paid participants were not quickly and hastily clicking through images, images were presented for at least one second before participants were allowed to continue to the next image, but no maximum response time was enforced. Evaluators then completed a questionnaire inquiring about their age, gender, and ethnicity.

Facial ratings were done on a 7-point scale with endpoints according to the face rating task: attractiveness endpoints were not at all attractive to very attractive; perceived intelligence endpoints were not at all intelligent to very intelligent; perceived conscientiousness endpoints were not at all conscientious to very conscientious.; and perceived academic performance endpoints were very low academic performance to very high academic performance.

Participants who rated perceived academic performance were presented with a statement at the top of each facial image presented asking “Please rate how well you think this person does in University compared to the other people presented”. Participants who rated perceived conscientiousness were presented with a statement at the top of each facial image presented that read “Conscientiousness is the personality trait of being thorough, careful, or vigilant–with the desire to do a task well. Based on the definition of conscientiousness provided–how conscientious do you perceive this face to be compared to the other faces presented”.


An average score of perceived attractiveness, intelligence, academic performance and conscientiousness was calculated for each of the 100 faces based on the average of all the evaluator ratings. Table 2 gives the zero order correlations between ratings and academic performance and demographic variables. There was a significant correlation between older age and higher actual academic performance and female faces were perceived as more attractive (see Table 2).

As predicted, there was no relationship between attractiveness and actual academic performance (r = 0.03), but a strong positive correlation between attractiveness and perceived intelligence (r = 0.81), attractiveness and perceived academic performance (r = 0.74) and attractiveness and perceived conscientiousness (r = 0.81).

Given the high correlations between rated attributes (perceived attractiveness, perceived conscientiousness, perceived intelligence and perceived academic performance), we wanted to ensure that any statistical controls were based on sufficiently reliable measures and discriminability valid constructs. Cronbach’s alphas were calculated for perceived attractiveness (32 ratings; α = 0.94), intelligence (25 ratings; α = 0.86), academic performance (20 ratings; α = 0.73), and conscientiousness (47 ratings; α = 0.91). After correcting for attenuation due to measurement error [80,81] the relationships between attractiveness and perceived intelligence (r = 0.90), between attractiveness and perceived academic performance (r = 0.89), and between attractiveness and perceived conscientiousness (r = 0.88) were all marginally higher but do not indicate redundancy.

We explored any potential issues with multi-collinearity, as research has suggested high VIF calculations may raise concerns over interpretations [82]. The test to see if the data met the assumption of collinearity indicated that multi-collinearity was not a concern (VIF scores over 10 are seen as problematic; [82]). In this study none of the VIF values were a concern: perceived attractiveness, tolerance = .243, VIF = 4.11; perceived academic performance, tolerance = .236, VIF = 4.24; perceived intelligence, tolerance = .152, VIF = 6.58; perceived consciousness, tolerance = .193 VIF = 5.18).

Partial correlations were conducted in which the influence of age of face, sex of face, and perceived attractiveness were controlled for. Partial correlations revealed (see Fig 1) a significant correlation between perceived conscientiousness and actual academic performance (r = 0.22, p = 0.035). The partial correlations reveal no relationship between actual academic performance and perceived academic performance (r = 0.13, p = 0.191) or perceived intelligence (r = 0.06, p = 0.544). Findings do not change when controlling for only attractiveness in the partial correlation. Nor do they change when controlling for the combination of attractiveness and age or the combination of attractiveness and sex of face.

Fig 1. Partial Correlations.

This bar graph shows the increased accuracy of the different perceived competence variables when controlling for perceived attractiveness. The same pattern emerges when controlling for the additional variables of sex and age of face.


We investigated the predictive power of perceived conscientiousness over attractiveness and the other perceived competence variables with a multiple linear regression model. In a simple regression model perceived conscientiousness significantly predicted actual academic performance (B = 1.59, SE = 0.712, 95% CI [0.18, 3.00], p = 0.027, β = 0.48), but perceived academic performance (B = -0.03, SE = 0.843, 95% CI [-1.70, 1.64], p = 0.969, β = -0.01), perceived attractiveness (B = -0.89, SE = 0.51, 95% CI [-1.89, 0.11], p = 0.082, β = -0.34), and perceived intelligence (B = -0.07, SE = 0.849, 95% CI [-1.75, 1.62], p = 0.939, β = -0.02) did not significantly predict actual academic performance (overall model: adjusted R2 = 0.08, F(4, 106) = 2.13, p = 0.082).

In a multi-step hierarchical model (1st step independent variables: perceived attractiveness, perceived intelligence, perceived academic performance, 2nd step independent variable: perceived conscientiousness) predicting actual academic performance, the second step in the model (perceived conscientiousness) showed a significant increase in variance explained (R2 change = 0.45 p = 0.027).

Facial Averages

Facial averages of faces were created to help the reader visualize perceptions of conscientiousness and the attractiveness halo. All face images were manually delineated with 188 points. The averaging (a) computes the average coordinate values for 188 facial landmarks within the set of face images, (b) warps each shape of each facial image into these average coordinates, and then blends the warped component images [83,84]. Facial averages (see Fig 2) were synthesized from the top 25% male and female faces (8 male and 16 female faces) and bottom 25% male and female faces with the highest and lowest scores on perceived conscientiousness [85,86]. These average images were then made symmetrical (see [41]).

Fig 2. Composite images of percieved consientiousness.

The images presented reflect the top and bottom 25% of faces percieved as most (left) and least (right) conscientious. The attractiveness halo would suggest that faces percieved as most conscientious (left) would be more attractive than the faces rated as least conscientious (right).



There are three main findings. First, there was no first-order relationship between perceptions of conscientiousness, academic performance or intelligence and actual academic performance. Second, when controlling for the expected influences that age, sex and perceived attractiveness on perceptions of competence (perceived conscientiousness, academic performance and intelligence), then the relation between perceived competence and actual academic performance increased in strength. Third, perceived conscientiousness was the single best face perception predictor of actual academic performance (outperforming perceived intelligence and perceived academic performance), and again accuracy was significantly improved when controlling for the suppressor variable of attractiveness.

As we expected, the form of the relationship is one of classic suppression in which there is some factor (perceived attractiveness) that is correlated with perceptions of conscientiousness, but not correlated with actual academic performance [8–12]. When this factor is controlled, the relationship between perceived conscientiousness and actual academic performance is increased (see Fig 3). It should also be noted that, although some previous literature suggests weak correlations between attractiveness and cognitive performance measures [16], in our study perceived attractiveness was not a valid cue to actual academic performance. These results suggest that we are ‘blinded by beauty’ in a way in which we would be more accurate in our perceptions of academic performance from faces if we were not influenced by the ‘attractiveness halo’ effect.

Fig 3. Attractiveness Suppression.

This figure shows the noise in perceived conscientiousness (the overlap between perceived attractiveness and perceived conscientiousness) and how by suppressing this noise results in an improved predictor of actual academic performance (greater overlap between the remaining perceived conscientiousness and actual academic performance).


Given the amount of research on higher expectations and desired educational traits being ascribed to attractive students over unattractive students, it is not surprising that faces that were rated as more intelligent, having better academic performance and being more conscientious were also rated as more attractive (see composite faces in Fig 2). As predicted, there were high correlations between perceptions of attractiveness and perceptions of intelligence, conscientiousness, and academic performance, likely reflecting the strength of the attractiveness halo, as well as the similarities among these perceived competence measures [87]. While there is less evidence to suggest perceptions of intelligence and academic performance are unique constructs, the possibility that perceived conscientiousness and perceived attractiveness are not distinguishable empirically is dealt with in two ways: face validity of the items for which evaluators were clearly rating conscientiousness or attractiveness (the measures were unambiguous to the evaluators); and we calculated inter-evaluator reliabilities for conscientiousness and attractiveness ratings and even after correcting for attenuation due to measurement error, the correlations between these variables remained distinct (i.e. they were imperfectly correlated). Taken together, these elements suggest that these measures can be treated here as distinct constructs, and that they are measured with sufficient reliability to be distinguished empirically in this study. The high correlations do create potential for interpretative difficulties in multiple regression, and under such circumstances we find it important to emphasize the role of suppression in their relationship in a way that reflects the traditional understanding of the attractiveness halo.

Findings suggest that accuracy in perceptions of academic performance also increases with the clarity and validity of the question proposed. When controlling for attractiveness, age and sex, perceptions of conscientiousness in faces yielded above chance accuracy in predicting academic performance, but accuracy in predicting actual academic performance did not reach levels of statistical significance with perceptions of intelligence or perceptions of academic performance. Given the high correlations between these perceived competence measures, it is difficult to say for certain whether perceptions of conscientiousness are unique in their capacity to predict actual academic performance over and above perceptions of intelligence or academic performance. Rather, it seems perceptions of conscientiousness predicts actual academic performance because, in comparison, it may be the least ambiguous competence construct. As previously argued, it is likely that individual differences in theories and understandings of intelligence can lead, on average, to less accurate perceptions of intelligence in faces. Likewise, perceived academic performance is possibly confounded by a combination of the ambiguities in the term intelligence (fixed vs. malleable) and the limited consensus on how much intelligence (in relation to conscientiousness) is necessary for high academic performance; hence the limited accuracy of perceived academic performance compared to perceived conscientiousness in predicting actual academic performance.

The improved accuracy in perceived conscientiousness predicting actual academic performance over perceived intelligence is also consistent with research that suggests that actual conscientiousness is a stronger predictor of academic performance than actual intelligence [70]. Further, the Intelligence Compensation Theory (ICT) suggests that conscientiousness acts as a coping strategy for relatively less intelligent people. While evidence for ICT is limited, some studies have found significant negative correlations between fluid intelligence and conscientiousness [72,73]. Other studies have found a significant negative correlation between crystalized intelligence and conscientiousness [88]. Thus, our findings of perceived conscientiousness better predicting actual academic performance in faces than perceived intelligence is consistent with literature suggesting actual conscientiousness is a better predictor than intelligence in predicting actual academic performance. Nonetheless, given the high correlations amongst the perceived competence variables explored (perceived intelligence, perceived academic performance and perceived conscientiousness), we must be cautious in claiming that only perceived conscientiousness is related to actual academic performance; rather we argue that the specificity in rating tasks and the influence of attractiveness bias are worth considering when exploring validity of judgements based on faces.

The increased accuracy of academic performance in faces after controlling for attractiveness has important implications. Indeed, Olivola and Todorov [89] showed that judges overweigh aspects of appearance and would be more accurate in judging personality if face perception was ignored. However, facial impressions have consistently been shown to influence our opinions as well as bias decisions in politics [90], leadership [91], law [92], parental expectations and punishments on children [93], military rank promotion [94], and teacher evaluations [95]. Clearly, the power of first impressions is critical and has repeatedly been shown to influence our opinions about a person.

Furthermore, research has found that femininity is considered more attractive than masculinity [43] and that females perform better academically and stay in education longer than males [96], which likely leads to females being ascribed more desired educational traits over men. It is also well documented that older students do better on intelligence tests [97,98] and do better academically than younger students. Moreover, crystalized intelligence and perceptions of wisdom have shown to increase linearly with age [99,100], which would influence impressions of competence in older students (hence the intentionally limited university age range for facial stimuli presented). Our research suggests that when controlling for biases of attractiveness, age and sex, independently or collectively, accuracy of perceived academic performance is significantly improved.

Perhaps one of the most alarming consequences of using insufficient information to guide first impressions is the expectancy effect in education. The classic Pygmalion study conducted by Rosenthal and Jacobson [101] suggests that expectations alone are capable of influencing the targets’ actual performance. Specifically, the Pygmalion study found that students who were arbitrarily assigned the label ‘bloomers’ (i.e., anticipated to show future promise) eventually scored higher on future tests than other students, even though the students labelled as ‘bloomers’ were a random sample and not any more intelligent than the other students in the class. More recent research on expectancy effects by Sorhagen [102] found that teachers’ inaccurate expectations of students in first-grade was associated with students’ academic performance in high-school and that students from lower income families were especially influenced by this bias. Likewise, De Boer, Bosker, and Van Der Werf [103] defined expectation bias as the difference between observed and predicted teacher expectation and found a significant relationship between teacher’s expectation bias of students’ performance and actual performance 5 years later. Hence, perceptions of conscientiousness, intelligence and academic performance may play a vital role in the classroom environment and in the success of a child’s education.

Future research in face perception can benefit from noting the significant differences in perception accuracy based on different theories of intelligence or competence. Perhaps more importantly, given the well documented effects of expectations of academic performance on actual academic performance, our findings help emphasize the biased effects of perceived attractiveness on expectations of academic performance. While it seems unlikely that another person’s attractiveness can be filtered out when attempting to accurately perceive academic performance, the mere knowledge of the negative influence attractiveness has on accuracy may encourage less biased practice; for perhaps the best antidote to deter unconscious bias is to make conscious the possibility of bias.


We thank Martin Campbell for his very helpful feedback on earlier versions of the manuscript. We also thank William Peter for his assistance in participant recruitment.

Author Contributions

Conceived and designed the experiments: ST DP. Performed the experiments: ST. Analyzed the data: ST DP KM. Contributed reagents/materials/analysis tools: ST DP KM. Wrote the paper: ST DP KM.


  1. 1. Langlois JH, Kalakanis L, Rubenstein AJ, Larson A, Hallam M, Smoot M. Maxims or myths of beauty? A meta-analytic and theoretical review. Psychol Bull. 2000; 126(3):390–423. Available: http://doi.apa.org/getdoi.cfm?doi=10.1037/0033-2909.126.3.390. Accessed 14 December 2013. pmid:10825783
  2. 2. Kramer RSS, Ward R. Internal facial features are signals of personality and health. Q J Exp Psychol. 2010 Nov;63(11):2273–87.
  3. 3. Penton-Voak IS, Pound N, Little AC, Perrett DI. Personality judgments from natural and composite facial images: More evidence for a “kernel of truth” in social perception. Soc Cogn. 2006 Oct;24(5):607–40.
  4. 4. Little AC, Burt DM, Penton-Voak IS, Perrett DI. Self-perceived attractiveness influences human female preferences for sexual dimorphism and symmetry in male faces. Proc R Soc B Biol Sci. 2001 Jan 7;268(1462):39–44. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1087598&tool=pmcentrez&rendertype=abstract. Accessed 29 January 2014.
  5. 5. Kleisner K, Chvatalova V, Flegr J. Perceived intelligence is associated with measured intelligence in men but not women. PLoS One. 2014;9(3):e81237. pmid:24651120
  6. 6. Kalick SM, Zebrowitz LA, Langlois JH, Johnson RM. Does human facial attractiveness honestly advertise health? longitudinal data on an evolutionary question. Psychol Sci. 1998;9(1):8–13.
  7. 7. Dion K, Berscheid E, Walster E. What is beautiful is good. J Pers Soc Psychol. 1972;24(3):285–90. pmid:4655540
  8. 8. Tzelgov J, Henik A. Suppression situations in psychological research: Definitions, implications, and applications. Psychol Bull. 1991;109(3):524–36.
  9. 9. MacKinnon DP, Krull JL, Lockwood CM. Equivalence of the mediation, confounding and suppression effect. Prev Sci. 2000;1(4):173–81. pmid:11523746
  10. 10. Mavor KI, Macleod CJ, Boal MJ, Louis WR. Right-wing authoritarianism, fundamentalism and prejudice revisited: Removing suppression and statistical artefact. Pers Individ Dif. 2009;46(5–6):592–7.
  11. 11. Conger AJ. A revised definition for suppressor variables: A guide to their identification and interpretation. Educ Psychol Meas. 1974;34(1):35–46.
  12. 12. Maassen G, Bakker A. Suppressor variables in path models: Definitions and interpretations. Sociol Methods Res. 2001;30(2):241–70.
  13. 13. Kanazawa S. Intelligence and physical attractiveness. Intelligence. Elsevier Inc.; 2011 Jan;39(1):7–14. Available: http://linkinghub.elsevier.com/retrieve/pii/S0160289610001315. Accessed 4 January 2014.
  14. 14. Denny K. Beauty and intelligence may or may not be related. Intelligence. 2008;36(6):616–8.

0 thoughts on “Halo Effect Physical Attractiveness Research Paper”


Leave a Comment

Your email address will not be published. Required fields are marked *