Bad Science: Mine is Bigger than Yours?

Ecole Polytechnique

Ecole Polytechnique is hiring full-time, permanent faculty members in computer science (and related fields, including networking). We are specifically looking for junior, but tenure-track, applicants; in France, that position is called "Maître de Conférences", and at Ecole Polytechnique it entails a two-year "trial period" before tenure, i.e., before the position becomes permanent.

First off: any interested candidates with a profile fitting the research interests of my team, please do reach out to me, so that I can help you prepare your application and answer any questions you may have.

Secondly, every recruitment round for (potentially) permanent faculty comes with complications: for each open position, we get 50+ qualified applicants, many of them incredibly so. How to pick?

One (in my opinion, extremely) unfortunate dogma in the computer science department at Ecole Polytechnique is the stubborn refusal to "recruit into a given thematic area" for open positions. The utopia behind this dogma is "we always want to hire the best of the best among the candidates who show up, without being constrained by some a priori decision".

… but, realistically, it just means that we have to compare an experimental scientist to a theoretician … an image-recognition researcher to a data scientist … telecoms to semantics, bioinformatics to quantum computing … and somehow decide, between any pair of candidates, "which is better".

Given that in academia we (i) are each specialised in our own narrow fields, and therefore (almost by construction) are ill-equipped to fairly evaluate the qualities of candidates from other fields, (ii) for the same reason, likely also carry some bias of the form "my domain is more interesting/relevant", and (iii) cling to a belief in objectivity and rational decision-making …

… several metrics have been proposed over the years for estimating "the objective value of a researcher". The number of publications is one; alas, the prevalence of junk journals and conferences, which will accept even bad science, makes it almost worthless.

Citation count is another proposed metric: how many other publications cite your work? The main objection to that metric is that a single publication with thousands of citations may not be indicative of a "good scientist": for example, according to Google Scholar, one of my own publications has received in excess of 5000 citations to date, but it is not necessarily my best, nor my most significant, work; it is just (somehow) the most popular.

Then, there's the infamous h-index, which is supposed to factor in "how consistent" a scientist is: an h-index of h means having (at least) h publications, each of which has been cited at least h times. Thus, an h-index of 1 means one publication cited at least once; an h-index of 2 means two publications, each cited at least twice; an h-index of 3 means three publications, each cited at least thrice, etc. The h-index is a metric supposed to dampen the impact of a single "popular publication", and to reward a continuous publication stream.
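To make the definition concrete, here is a minimal sketch in Python of how an h-index can be computed from a list of per-publication citation counts (the function name and the sample data below are hypothetical, chosen purely for illustration):

```python
def h_index(citations):
    """Return the h-index: the largest h such that at least
    h publications have at least h citations each."""
    # Rank citation counts in descending order, so that position i
    # (1-based) holds the i-th most-cited publication.
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:   # the i-th most-cited paper has >= i citations
            h = i
        else:
            break    # later papers can only have fewer citations
    return h

# Hypothetical publication record: one very popular paper, a few others.
print(h_index([5000, 120, 45, 3, 3, 1]))  # -> 3
```

Note how the single 5000-citation paper raises the h-index no more than a modestly cited one would: exactly the dampening effect the metric is designed to have.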

So does all this allow us to compare scientific researchers in a meaningful way? As a point of comparison, a (much more senior) colleague and friend of mine, whom I respect immensely, let's call him Bruce, who in his domain is an outstanding and world-leading scientist, has fewer than 200 citations for his most cited paper, about 2700 citations in total, and (according to Google Scholar) an h-index of 29.

My own h-index (again, according to Google Scholar) is 28, my most cited paper has >5000 citations, and I have approximately 10500 citations in total.

Given that Bruce is much more senior than me, a bean-counter just looking at the numbers would draw a conclusion in my favour, and in Bruce's disfavour.

Now, I picked this particular Bruce as a point of comparison because he's a leading researcher whom I respect immensely, a good friend, and an internationally acclaimed scientist. More importantly, he's in an entirely different scientific field (though we're in the same department), and of an entirely different scientific tradition from mine.

See, this Bruce is (mostly) a theoretician. I am (mostly) an experimentalist. When Bruce proves a theorem, it is the result of months or years of meticulous work, and it results in one publication. When I conduct an experiment, it is also the result of months or years of meticulous work, but it (usually) results in multiple publications. See the discrepancy? It mechanically leads to very different publication profiles.

Consequently, of course, comparing the publication profiles, the bibliometrics, of this Bruce and myself is "comparing apples to oranges", and yields no insight into our respective "scientific value".

Recruitment decisions in academia are always contentious, simply due to the frightening scarcity of (tenure-track) positions. When open positions are not earmarked for a specific thematic area (or set of thematic areas), it's simply impossible to use bibliometrics in a meaningful fashion as a tool of comparison. Yet bibliometrics often end up being trotted out as arguments for arbitrating between candidates (and as starting points for long, drawn-out arguments in recruitment committees), which ends up boiling down to a "whose ****** is bigger" discussion. We're talking h-indexes, publication counts, citation counts, of course …

Does that mean that I think bibliometrics are useless? Not exactly. However, they do put an extra onus on candidates for (non-thematically-earmarked) faculty positions to explain the context in which their bibliometrics are to be interpreted, both in their application and in their interview talk(s).

 

Featured image “Big Data” courtesy of Timo Elliott – http://timoelliott.com – and used with his kind permission