Screening vs. Profiling: Are You Confusing Them?

THE LINE/Image: Michael Moffa

Screening and profiling are supposed to be different. Merriam-Webster.com distinguishes them along these lines:

Screening: examination, usually methodically, in order to separate into different groups.

Profiling: the act or process of extrapolating information about a person based on known traits or tendencies, e.g., consumer profiling.

Description vs. Inference: Not the Same

Notice how this distinction corresponds precisely to the difference between “descriptive statistics” and “inferential statistics”. The former entails grouping, classifying, sorting and otherwise describing data, whereas the latter involves making inferences from those data. Screening is supposed to be just like doing descriptive statistics, not inferential statistics. For example, if I report that in an entire population of software job applicants on file in my office, 40% have had 1-4 years of experience, that is a descriptive statistic. I can then proceed to sort applicants into three groups: those who have 1-4 years of experience, those who have more and those who have less. All of these results are descriptive statistics only.
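To make the descriptive step concrete, here is a minimal sketch in Python. The list of years is invented for illustration and is not real applicant data. The code only counts and sorts what is on file and makes no claim about anyone outside the list.

```python
# Hypothetical years-of-experience values for the applicants on file
# (illustrative data only, not taken from any real office).
years_on_file = [0, 2, 3, 6, 1, 8, 0.5, 4, 10, 7]

def experience_group(years):
    """Assign an applicant to one of three descriptive groups."""
    if years < 1:
        return "less than 1 year"
    if years <= 4:
        return "1-4 years"
    return "more than 4 years"

# Sort every applicant on file into exactly one group.
groups = {"less than 1 year": [], "1-4 years": [], "more than 4 years": []}
for years in years_on_file:
    groups[experience_group(years)].append(years)

# A descriptive statistic: it summarizes only the data actually examined.
share_1_to_4 = len(groups["1-4 years"]) / len(years_on_file)
print(f"{share_1_to_4:.0%} of applicants on file have 1-4 years of experience")
```

Nothing in this sketch estimates anything about the wider software industry. That step would be inference and would need a sampling argument the code does not contain.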

If I then (probably incorrectly) infer that, in the general population, 40% of all people in the U.S. software industry have 1-4 years of experience, that would be an exercise in inferential statistics, because the conclusion makes claims about an unexamined population outside the sample on which it is based, and therefore carries a measurable risk of statistical error.

Because an inferential statistic carries or embodies a greater risk of being wrong than a descriptive statistic does, you have to be careful not to equate or confuse them. (It is true, however, that a descriptive statistic can be incorrect, e.g., at the data gathering stage, if the information is incorrectly recorded, defined, classified, measured or prioritized as to relevance. But erroneous inferences from the data will not contaminate such data at this purely descriptive stage.)

The Big Mistake

The big mistake is to fail to correctly distinguish your screening of applicants, which is simple sorting based on the known data, from your profiling of them, which is extrapolating from those data to infer other data or reach conclusions about the applicants. Ideally, this means that when you are initially reviewing resumes, your appropriate task is to merely gather information and data about each applicant, and not to make inferences from the data or about the applicant.

In practice, this can be extraordinarily difficult, psychologically speaking—especially when you are busy or otherwise tempted or pressured to jump to conclusions. It is also a manifest risk posed by your having to notice, note and juggle the numerous diverse parameters and information packed into each and every resume.

If you were a statistician, you would face no such risks: You would collect, classify and sort your data at the prerequisite descriptive statistics gathering stage. Then, and only then, you would proceed to make inferences based on them. Simple.

But you are not a statistician; so, you are at much greater risk of mixing these two processes. Suppose, for example, you are to sort applicants by several criteria, including highest degree attained, the minimum being a B.A., into two piles—“acceptable” for further screening and “not acceptable”. Then you find two applicants with master’s degrees, one from Princeton and one from Wherezits University. While juggling all the other parameters on their resumes and in your head, you instinctively put the Princeton application into the preliminary “acceptable” or “for further screening” pile and banish the Wherezits University application to the reject, “unacceptable”, “No” pile, because you’ve never heard of Wherezits University.

At that point, you have crossed the line from screening to profiling. That’s because although you were presumably sorting into piles based on satisfying/not satisfying the minimum criterion of ownership of a B.A., you were tempted or (you thought) farsighted enough to conclude that the Wherezits applicant would be unlikely to make it to the finals, much less beat the Princeton guy.

However valid that inference may have been, it amounted to a confusion of your tasks, because it transformed descriptive data gathering and sorting into statistical inference, viz., the inference that this Princeton M.A. will have been earned by an applicant who is better than the applicant with the Wherezits M.A., based on whatever “samples” you have had access to or prior (or imagined) probabilities associated with possession of a Princeton degree vis-à-vis a Wherezits degree.
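The degree example can be sketched as a screening rule that looks only at the stated criterion. This is an illustrative Python sketch, not a recommended hiring tool. The degree ranking, the field names and the `screen` function are all assumptions made for the example.

```python
# Hypothetical ranking of degrees; the stated minimum criterion is a B.A.
DEGREE_RANK = {"none": 0, "B.A.": 1, "M.A.": 2, "Ph.D.": 3}
MINIMUM_DEGREE = "B.A."

def screen(applicant):
    """Sort on the stated criterion only: highest degree attained."""
    if DEGREE_RANK[applicant["degree"]] >= DEGREE_RANK[MINIMUM_DEGREE]:
        return "acceptable"
    return "not acceptable"

princeton = {"degree": "M.A.", "school": "Princeton"}
wherezits = {"degree": "M.A.", "school": "Wherezits University"}

# The "school" field is recorded but never consulted by screen().
print(screen(princeton), screen(wherezits))
```

Because the rule never reads the school name, both master's degrees land in the same pile. Rejecting the Wherezits resume would require a criterion that the screening protocol does not contain.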

Sloppy Descriptive Sorting

Moreover, you are at risk of mixing up, losing track of and inconsistently prioritizing your descriptive criteria. For example, you are sorting a pile of resumes submitted by software engineers, according to some key criteria, e.g., years of experience and programming languages mastered. Statistically, these candidates could be descriptively grouped by either criterion and ranked in one “experience” file and in a separate “programming languages” file, for further, perhaps joint review later. A more sophisticated approach would record and assign a relative weight to these two parameters as categories and a weight to the actual “scores” within the categories, for the purpose of a descriptive ranking.
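One way to picture the weighted approach is the short Python sketch below. The category weights, the applicant names and the scores are all hypothetical, chosen only to show the mechanics. The resulting order is still a description of recorded data, not a prediction about anyone's performance.

```python
# Hypothetical relative weights for the two categories (illustrative only).
CATEGORY_WEIGHTS = {"experience": 3, "languages": 2}

# Hypothetical scores within each category, as recorded from the resumes.
applicants = {
    "A": {"experience": 2, "languages": 9},
    "B": {"experience": 8, "languages": 3},
    "C": {"experience": 5, "languages": 5},
}

def weighted_score(scores):
    """Combine the recorded category scores using the fixed weights."""
    return sum(CATEGORY_WEIGHTS[c] * scores[c] for c in CATEGORY_WEIGHTS)

# Rank applicants from highest to lowest weighted score.
ranking = sorted(applicants, key=lambda name: weighted_score(applicants[name]), reverse=True)
print(ranking)
```

Fixing the weights before reading any resume is the point of the exercise: it keeps the prioritization consistent from one applicant to the next.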

If your task at this preliminary stage is merely to screen by sorting resumes into clearly acceptable/unacceptable piles, when you are juggling multiple criteria, you may, in reading Applicant A’s resume, be so dazzled by his programming language expertise that you don’t notice his limited experience, whereas Applicant B’s appeal may be precisely the reverse.

What happened is that, because you were dazzled, both ended up in the “acceptable” pile, the result of your unconsciously switching your task from looking for “experience AND programming languages” to looking for “experience OR programming languages”. This result was the consequence of your switching from sorting them by joint descriptive criteria to sorting them through the inference that if the applicants possess one of these qualifications, the lack of the second can be compensated for down the road, after hiring, or otherwise offset.

In effect, you extrapolated from experience to expertise with one applicant and the reverse with the other, rather than simply sticking to the protocol that sorted descriptively by possession of both qualifications. You switched from sorting and prioritizing descriptive data and information to unconsciously making inferences from those data.
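The unconscious switch from “AND” to “OR” described above can be sketched in a few lines of Python. The thresholds and applicant records are hypothetical and serve only to show how the two rules fill the “acceptable” pile differently.

```python
# Hypothetical minimum thresholds (illustrative values only).
MIN_YEARS = 3
MIN_LANGUAGES = 3

applicants = {
    "A": {"years": 1, "languages": 6},  # strong on languages, short on experience
    "B": {"years": 7, "languages": 1},  # the reverse
    "C": {"years": 5, "languages": 4},  # meets both criteria
}

def meets_both(a):
    """The intended protocol: experience AND programming languages."""
    return a["years"] >= MIN_YEARS and a["languages"] >= MIN_LANGUAGES

def meets_either(a):
    """The accidental protocol: experience OR programming languages."""
    return a["years"] >= MIN_YEARS or a["languages"] >= MIN_LANGUAGES

accepted_and = [name for name, a in applicants.items() if meets_both(a)]
accepted_or = [name for name, a in applicants.items() if meets_either(a)]
print(accepted_and, accepted_or)
```

Under the joint rule only C is accepted. Under the accidental rule A and B are accepted too, which is the outcome the dazzled reviewer produced without noticing the change.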

Jumping the Gun and the Line

Resisting the temptation and pressures to jump the gun and cross a statistical line is not only difficult, but also perhaps not always necessary or desirable. Imagine you were an auto parts inspector on a manufacturing line at which you were tasked only with looking for chipped paint and noticed a dent. You would be helping out the dent inspectors by flagging that part for the dent, even though no paint was chipped. Reviewing resumes can be like that: organic, multi-tasking, foresight-based and simultaneous, rather than purely sequentially conducted “inspection”.

The hypothetical parts inspector knows that the dent is eventually going to get the part yanked from the line when it reaches the dent-inspection stage and will therefore flag it (to ensure it is not overlooked). Likewise, you may be predisposed to and justified in flagging a candidate who has an attribute or credential that will, for better or worse, become very germane at a later, post-preliminary screening stage.

However, this multi-tasking, foresight-oriented approach is still only a descriptive statistical exercise: You are sorting parts or people on the basis of their noted, prioritized characteristics, not on those you infer, given that what is described as a “dent” in a part or a resume is sufficient to reject it.

Despite the obvious inference that a dented part or an applicant lacking experience will be substandard, that inference is not yours. It is built into the criterion that identifies dents or lack of experience as substandard. Your job at the screening stage is to apply that standard, not to make additional inferences from it. Unless assigned the job of profiling and making inferences about individual candidates or parts, or about batches of them, those charged with screening should limit themselves to merely sorting the dented from the undented, or flagging the dents (while screening for other things), if they are not to cross the line…

… on the factory floor or in the office.

By Michael Moffa