A Critique of the Myers-Briggs Type Indicator (MBTI), Part I: One Expert’s Review

“Although the MBTI is an extremely popular measure of personality, I believe that the available data warrant extreme caution in its application as a counseling tool, especially as consultants use it in various business settings” (Dr. David J. Pittenger, psychometric researcher and Dean of the College of Liberal Arts, Marshall University, in “Cautionary Comments Regarding the Myers-Briggs Type Indicator”, Consulting Psychology Journal: Practice and Research, summer 2005)

THE MBTI FUNNEL/Image: Michael Moffa

The Myers-Briggs Type Indicator, exclusively distributed by California-based CPP, Inc., is perhaps the most popular diagnostic self-test used to identify personality type as an adjunct to counseling and to selecting and placing staff. In all likelihood, many jobs have been won or lost because of it. Despite its revisions, defenders and immense popularity, it has its critics, and lots of them.

Among the comprehensive, authoritative and negative reviews are those of Professor David J. Pittenger, currently Dean of the College of Liberal Arts at Marshall University, and veteran researcher in the area of social science statistics and psychometrics. In his published research from 1993 to 2005, and in a recent conversation with me, Dr. Pittenger has stated and reiterated his doubts about the MBTI.

(In Part II of this article, I will add my own layman’s concerns, voiced from the perspective of a prospective MBTI end-user, with an emphasis on purely conceptual issues.)

MBTI: Popular Because It Is Popular?

Dr. Pittenger maintains that, despite revisions to the MBTI over the years, the test retains major shortcomings and limitations, which he identified and discussed in studies published between 1993 and 2005. His research and reviews by other critics invite the conjecture that the MBTI remains popular with test-takers, recruiters and counselors mostly because it has been popular for so long and has been so aggressively and effectively marketed.

Apart from being critiqued, the MBTI is also embarrassingly ignored where it hurts. Despite strong grass-roots interest in the MBTI, it seems not to have resonated with the prestigious American Psychological Association (APA): my search of its site turned up only one article mentioning it, a 2002 study that explicitly excluded the MBTI from its research.

To be fair and to illustrate how controversial the MBTI is, it must be noted that a search at PubMed (a comprehensive U.S. government archive of medical and psychological abstracts and articles) quickly reveals a significant number of recent studies that used the MBTI, including an abstract of a 2010 project at Stanford. Moreover, there are counter-studies that attempt to respond to the test’s critics on an ongoing basis.

Anyone motivated to delve even more deeply into these issues should explore the extant research on both sides, including Dr. Pittenger’s and that of the Myers & Briggs Foundation, qualified test administrators and other researchers.

Funneling and Other Failures

Caveats in place, here are some of Dr. Pittenger’s 1993 and 2005 criticisms, all of which he believes still serve to red-flag the MBTI:

“Several studies, however, show that even when the test-retest interval is short (e.g., 5 weeks), as many as 50 percent of the people will be classified into a different type.” This is to say that the test fails to meet standards of ‘test-retest’ reliability. (“Measuring the MBTI…And Coming Up Short”, Journal of Career Planning and Employment, 1993, 54: pp. 48-53.)

“…across a 5-week retest period, 50% of the participants received a different classification on one or more of the (MBTI) scales” (“Cautionary Comments Regarding the Myers-Briggs Type Indicator”, Consulting Psychology Journal: Practice and Research, summer 2005)

What makes this awkward for MBTI true-believers is that, on the MBTI’s own background assumptions, your type, unlike your mood or life circumstances, is never supposed to change. Such variability over time is as problematic for the MBTI as large swings in a supposedly stable, innate IQ would be for IQ testing.

This means that any conclusion you as a recruiter or as a candidate might reach on the basis of the MBTI typing, about, e.g., how good the job-candidate match is, might be wrong as much as 50% of the time—assuming that either of the two MBTI conflicting typing results is correct in the first place.
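A rough calculation shows how easily such reclassification can arise once continuous scores are cut into two categories. The Python sketch below is an illustration only, not Dr. Pittenger’s analysis. It assumes that test and retest scores on each scale are jointly normal, that the cut-off sits at the average score, and that the four scales behave independently; the retest correlations fed into it are hypothetical.

```python
import math

def flip_probability(retest_correlation: float) -> float:
    """Chance that one scale lands on the other side of the cut-off at retest.

    Assumption: test and retest scores are bivariate normal and the cut-off
    is at the mean, so the chance of opposite signs is 1/2 - arcsin(r)/pi.
    """
    return 0.5 - math.asin(retest_correlation) / math.pi

def any_letter_changes(retest_correlation: float, n_scales: int = 4) -> float:
    """Chance that at least one of n independent scales flips at retest."""
    same_side = 1.0 - flip_probability(retest_correlation)
    return 1.0 - same_side ** n_scales

# Hypothetical per-scale retest correlations, chosen only for illustration.
for r in (0.95, 0.85, 0.75):
    print(f"r = {r:.2f}: one letter flips {flip_probability(r):.0%}, "
          f"four-letter type changes {any_letter_changes(r):.0%}")
```

Under these assumptions, a per-scale retest correlation of 0.85, which would be respectable for a continuous score, already changes at least one letter for a little over half of test-takers.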

“Because the MBTI uses an absolute classification scheme for people, it is possible for people with relatively similar scores to be labeled with much different personalities… This means that although one person may score as an E, his or her test results may be very similar to those of another person’s, who scores as an I.” (Ibid., 1993)

“As a generality, (it can be said that) using dichotomous scores reduces the predictive power of continuous scales.” (Ibid., 2005)

This problem is analogous to the kind of inaccurate classification of police academy applicants that would result if they were grouped into only the categories “strong” or “weak”, based on how many push-ups they can do. A difference in one push-up would completely change the category an applicant would end up in and the chances of being recruited as a police cadet.

Moreover, a tested applicant who completed only one push-up more than the minimum, and who would therefore be classified as “strong”, would not be distinguished from another “strong” candidate who completed twice as many push-ups—thereby misrepresenting huge objective differences between them as non-existent and irrelevant and, as Dr. Pittenger suggests, making precise predictions about performance very difficult, if not impossible.
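The loss of predictive power in the push-up example can be sketched with a small simulation. Everything in the Python code below is hypothetical: the 30 push-up minimum, the spread of applicants, and a made-up “performance” score that depends on actual push-up count plus chance. It simply compares how well the raw count and the “strong”/“weak” label each track that score.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

MIN_PUSHUPS = 30          # hypothetical cut-off for the "strong" label
rng = random.Random(0)    # fixed seed so the sketch is repeatable

# Hypothetical applicants: push-up counts scattered around the cut-off.
pushups = [max(0, round(rng.gauss(30, 10))) for _ in range(5000)]
# Hypothetical later performance: driven by actual strength plus chance.
performance = [count + rng.gauss(0, 10) for count in pushups]
# The dichotomy: 1 = "strong", 0 = "weak".
labels = [1 if count >= MIN_PUSHUPS else 0 for count in pushups]

r_count = pearson(pushups, performance)
r_label = pearson(labels, performance)
print(f"raw push-up count vs performance: r = {r_count:.2f}")
print(f"strong/weak label vs performance: r = {r_label:.2f}")
```

In this made-up population the label tracks performance less closely than the raw count does, because applicants with 30 and 60 push-ups are treated as identical while applicants with 29 and 30 are treated as opposites.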

This is not to say that strict cut-offs, for performance or anything else, are never appropriate or necessary, or that they should not be rigidly applied in many instances, e.g., a minimum military recruitment age of 18, a voter eligibility age or police cadet testing.

However, it is one thing to have an unavoidable, even if somewhat arbitrary, cut-off; it is quite another to use it misleadingly, indeed distortingly, to ignore clear-cut, important and germane objective differences among test-takers, including the degree to which they display a trait, such as being “strong” or “introverted”, that the label treats as all-or-none.

This point is easily made if the focus is shifted from the MBTI cut-off points to voter registration cut-offs: Would anybody imagine that the difference between an 18-year-old and a 17-year-old indicates anything more than the difference between being “eligible” and “ineligible” according to the rule (not according to nature)? Can you imagine that it clearly demarcates “youth” and “non-youth”, or “child” and “adult”, except by legal or administrative conventions and ultimately or partially arbitrary rules?

Given the MBTI’s own methodology and declared purposes, the cut-off points between its test types should not be viewed as such “eligibility” markers or as mere conventions and rules, inasmuch as the test types are supposed to distinguish real empirically distinguished psychological types by identifying real psychological differences among test-takers.

Although the MBTI is not—unlike the military recruitment age evaluation—supposed to be an eligibility test, its artificial sharp demarcations by type labels, e.g., “ESTJ”, that appear to be precise and accurate, may tempt client companies to misapply them as criteria of eligibility for a job—a temptation to which they may succumb given human nature and despite any disclaimers from the MBTI designers, administrators and advocates.

MBTI-type black-and-white dichotomies, such as the MBTI’s “E-I” (extravert-introvert), “N-S” (intuitive-sensing), “T-F” (thinking-feeling) and “P-J” (perceiving-judging), should be spectrums, not disjoint points. Yes, they are continuous scales in one sense. However, they are spectrums only in that they situate a given candidate’s trait score alongside other candidates’ scores on one “inter-personal”, comparative score axis.

In fact, they should also be interpreted as “intra-personal” continuous scales along which a single candidate can be identified with respect to not only other test-takers, but also to himself or herself, as mood, situation, priorities and other variables change or reverse themselves.

Obviously, many of us will think intensely in some situations, feel intensely in others and do neither in most of the rest. Likewise, even though most of us will have pronounced tendencies toward being either introverted or extroverted (the MBTI, following C.G. Jung, spells the latter “extraverted”), we will often reverse ourselves, depending on factors like how much we enjoy the company we are in on any given occasion, whether we’re having a bad-hair day, or how tired we are.

Moreover, to the extent that we are mixed types, say, introverted 80% of the time and extraverted 20% of the time (or in 20% of our social encounters), the rigid MBTI “introvert/extravert” dichotomy will not distinguish someone who is a 90%-10% “introvert-extravert” from someone who is 55%-45%, which is professionally and psychologically a potentially huge difference. (These percentages can refer to self-ratings, to the percentage of 100 listed situations in which one must choose to behave as one or the other, etc.)

Consider my case, for example: When asked which I am, “extrovert” or “introvert”, I reply that I am a “tactical extrovert”, but a “strategic introvert”. That means I will go to a house party, mix and mingle, talk and joke, while trying to meet and leave ASAP with one special woman I can then cocoon with in a cozy and otherwise empty all-night café.

In business, I will unhesitatingly and extrovertedly organize, interact with and even lead groups in order to make my generally preferred and customary introvert-friendly solitary work proceed much more smoothly. So, what does that make me—an “E” or an “I”?

As Dr. Pittenger maintains, disregard of interpersonal differences among test-takers tagged with the same label is a serious limitation of a test that funnels a continuum of performance or traits into a much narrower and absolute dichotomy.

To this misgiving must be added the aforementioned concerns about disregard of intra-personal differences within the single individual’s personality as mood and circumstance change.

“Specifically, 83 percent of the differences among the students could not be accounted for by the MBTI. The results led the authors to conclude that the factors found in the statistical analysis were inconsistent with the MBTI theory.” (Ibid., 1993)

What this means is that if the MBTI theory of types is both accurate and precise, it should be both sufficient and superior to alternatives for the purpose of describing and explaining personality differences among those it tests. Dr. Pittenger is saying that in this crucial respect the MBTI fails.

“Although bimodality appears to be an essential characteristic of the distributions of scores, it appears to be conspicuously absent” (Ibid., 2005)

For the MBTI to be a valid test, the scores of test-takers should, when plotted on a graph, look like an Asian Bactrian camel—two humps, i.e., bimodal. Instead, the real data look like the African dromedary camel—only one hump, just like the normal curve for IQ.

Getting the normal curve as a result means there will actually be very few people scoring at the extremes, e.g., very low or very high IQs and, analogously, very low or very high “introversion” scores, while most people cluster in the middle, close to the cut-off. This is not good for claims made for the MBTI, since the test critically depends on sorting everyone into one of two distinct types, e.g., “introverted” or “extraverted”.
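The difference between the two camels can be put in numbers. The Python sketch below is a minimal illustration under stated assumptions, not an analysis of real MBTI data: the one-hump population is a standard normal curve with the cut-off at its center, and the two-hump population is a hypothetical equal mix of two normal curves centered 2 units either side of the cut-off. It computes the share of people whose scores fall within half a unit of the cut-off in each case.

```python
import math

def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Cumulative probability of a normal distribution up to x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def share_near_cutoff_one_hump(band: float) -> float:
    """Share within +/- band of a cut-off at 0, single normal curve (SD 1)."""
    return normal_cdf(band) - normal_cdf(-band)

def share_near_cutoff_two_humps(band: float, separation: float = 2.0) -> float:
    """Same share for an equal mix of normals centred at -/+ separation (SD 1)."""
    left = normal_cdf(band, -separation) - normal_cdf(-band, -separation)
    right = normal_cdf(band, separation) - normal_cdf(-band, separation)
    return 0.5 * (left + right)

BAND = 0.5  # hypothetical "too close to call" zone around the cut-off
print(f"one hump:  {share_near_cutoff_one_hump(BAND):.0%} near the cut-off")
print(f"two humps: {share_near_cutoff_two_humps(BAND):.0%} near the cut-off")
```

With one hump, well over a third of the population sits in the narrow zone where a slightly different answer or mood could flip the label; with two well-separated humps, only a small fraction does. Two distinct types would call for the second picture, while the data Dr. Pittenger describes resemble the first.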

“The proportion of ESTJs in the teaching profession is the same as the proportion of ESTJs in the general population, or 12 percent. This similarity suggests that there is nothing special about the (MBTI) type of person who becomes an elementary school teacher.” (Ibid., 1993)

To see Dr. Pittenger’s point clearly, imagine a test for identifying and matching left-handers for job placement, based on personality traits. Those who score in the range for having the traits of a left-hander are called “L”s, as opposed to “R”s (right-handers) and “A”s (the ambidextrous).

Suppose it is further argued that being an “L” indicates that you are most suited to being a teacher, if not already inclined to become one as well. If the percentage of teachers who are “L”s is the same as the percentage of “L”s in the general population, which is in fact close to 12%, and if only “L”s are truly best-suited to becoming teachers, one or more of the following must be true:

1. Despite being perfectly matched for teaching, the vast majority of “L” types somehow ended up in other professions (which raises the question of how the test designers came to associate “L” so strongly with teaching).

2. Despite being mismatched for teaching, the majority of teachers, who are “R”s or “A”s, became teachers (which raises the question of how the test designers came to so strongly exclude “R” and “A” types as well-suited for teaching).

3. Being an “L” does not explain why any, many or most teachers became teachers.

Now, replace “L” with “ESTJ” and re-read the last three points. What Dr. Pittenger suggested is that whatever is quantitatively and/or qualitatively special about the kinds of people who become teachers, it isn’t being an “ESTJ type”.
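The arithmetic behind this argument can be made explicit with Bayes’ rule. In the minimal Python sketch below, the 12 percent figures come from the passage quoted above; the 2 percent share of teachers in the workforce and the 30 percent comparison figure are hypothetical numbers chosen only for illustration.

```python
def p_teacher_given_type(p_type_given_teacher: float,
                         p_type: float,
                         p_teacher: float) -> float:
    """Bayes' rule: P(teacher | type) = P(type | teacher) * P(teacher) / P(type)."""
    return p_type_given_teacher * p_teacher / p_type

P_TEACHER = 0.02  # hypothetical share of the workforce who are teachers

# Case from the quoted passage: the type is equally common among teachers
# and in the general population (12 percent in both).
same_rate = p_teacher_given_type(0.12, 0.12, P_TEACHER)

# Hypothetical contrast: the type is over-represented among teachers.
over_represented = p_teacher_given_type(0.30, 0.12, P_TEACHER)

print(f"chance of being a teacher, knowing nothing:        {P_TEACHER:.1%}")
print(f"chance given the type (12% vs 12%):                {same_rate:.1%}")
print(f"chance given the type (hypothetical 30% vs 12%):   {over_represented:.1%}")
```

When the type is exactly as common among teachers as in the general population, knowing someone’s type leaves the chance that he or she is a teacher unchanged. Only an over-representation like the hypothetical one in the last line would make the type informative.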

In suggesting that teachers are or should be a homogeneous ESTJ group that is at the same time different from the general population, the MBTI has to confront the very problematic factual data Dr. Pittenger cites.

“…there is no evidence to show a positive relation between MBTI type and success within an occupation.” (Ibid., 1993)

Ouch…That’s gotta hurt. If one’s MBTI type is at best a weak predictor, if any, of success within an occupation, then of what use is it?—as an indicator only of “good-fit”, if even that? If you are a teacher in a job interview, somehow explaining how you were a great, but unsuccessful teacher is probably going to take up more of your interview than you might want it to—and waving your MBTI test-result print-out in the face of a recruiter probably won’t help.

(In Part II: a prospective test-taker’s take on the MBTI.)

By Michael Moffa