Open Source Science

All Researchers To Be Allocated Unique IDs

ananyo writes with information on a new scheme to help uniquely identify authors in the face of ambiguous names. From the article: "In 2011, Y. Wang was the world's most prolific author of scientific publications, with 3,926 to their name — a rate of more than 10 per day. Never heard of them? That's because they are a mixture of many different Y. Wangs, each indistinguishable in the scholarly record. The launch later this year of the Open Researcher and Contributor ID (ORCID), an identifier system that will distinguish between authors who share the same name, could soon solve the problem, allowing research papers to be associated correctly with their true author. Instead of filling out personal details on countless electronic forms associated with submitting papers or applying for grants, a researcher could also simply type in his or her ORCID number. Various fields would be completed automatically by pulling in data from other authorized sources, such as databases of papers, citations, grants and contact details. ORCID does not intend to offer such services itself; the idea is that other organizations will use the open-access ORCID database to build their own services."
This discussion has been archived. No new comments can be posted.

  • by girlintraining ( 1395911 ) on Wednesday May 30, 2012 @12:24PM (#40156399)

    Hmm. A new program to uniquely track and identify scientists springs up in the middle of an all out war between science and the idiocracy. Totally coincidental. *adjusts tin foil hat*

    • I hear they also get these spiffy armbands they can^H^H^H are required to wear.
    • There has always been a war between science and the idiocracy. It's just that we previously attached a different prefix to "-cracy".
    • by eldavojohn ( 898314 ) * <eldavojohn@noSpAM.gmail.com> on Wednesday May 30, 2012 @12:28PM (#40156473) Journal

      Hmm. A new program to uniquely track and identify scientists springs up in the middle of an all out war between science and the idiocracy. Totally coincidental. *adjusts tin foil hat*

      No need to adjust your tinfoil hat. I read this article and thought "Oh, great, now Virginia's Attorney General can conduct more accurate witch hunts [slashdot.org]." (he was unable to properly identify over 30 scientists and researchers)

    • Alex Jones, is that you?

      Is Y. Wang the inventor of Wang computers? I hear women in offices love their Wangs.

      Why not solve this problem by just using the full name?
      Oh well. (shrug)

      • Nope, that was An Wang [invent.org]
      • by LostOne ( 51301 )

        Full names are not necessarily unique either.

        • by jc42 ( 318812 )

          Full names are not necessarily unique either.

          Indeed. A few years ago, I ran across a US Census Bureau web page that gave the number of people with specific first or last names, and an estimate (likely from multiplying the fractions) of the number of people in the US with a given first+last name. It said that there are about 1800 people in the US with the same name as me, and my family name isn't even Smith or Jones or any of the other top 100.

          Through the years, I've seen a number of bibliographies that list things that I've written, intermixed wi

        • by rgmoore ( 133276 )

          Full names are not necessarily unique either.

          Or constant, for that matter. Even in academia, some people (mostly women) still change their names when they marry, which can add to the confusion. Imagine tracking all the papers by Mary Jane Smith nee Jones. Having a unique personal ID would solve the changing name problem as well as the non-unique name problem.

          • by dtmos ( 447842 ) * on Wednesday May 30, 2012 @03:00PM (#40158543)

            . . .some people (mostly women) still change their names. . .

            . . .not to mention the difficulties faced by, e.g., Lynn Conway [wikipedia.org]. Being able to generate a new identity for oneself can have advantages [umich.edu].

            Lynn was fired, and forced to leave the field of computer science/engineering after telling her bosses at IBM that she was to undergo sex reassignment surgery in 1968. She could re-enter the field only because she could create a new identity (this time as a woman), starting a new career all over again at the bottom of the ladder.

            Who is this person, you may ask? As a man, in the 1960s, she invented processor multiple-issue out-of-order dynamic instruction scheduling. After her transition? Oh, nothing . . . only co-authoring (with Carver Mead [wikipedia.org]) Introduction to VLSI Systems which, by promoting the use of standard cells, automated design tools, and silicon foundry services (e.g., MOSIS [wikipedia.org]), revolutionized the field of digital integrated circuits. Virtually every digital chip today is designed in this way; and there are many in the field who cannot conceive of any other way to do design ("Wasn't it always done this way?").

            If Lynn could not have generated a new identity and re-entered the field as she did, these critical advances may have been delayed for years.

      • Re:Unique IDs eh? (Score:5, Insightful)

        by jdgeorge ( 18767 ) on Wednesday May 30, 2012 @12:55PM (#40156869)

        Why not solve this problem by just using the full name?

        Because it wouldn't solve the problem at all. There are many researchers with the exact same full name. One reason we have Social Security numbers in the US is because full names have a strong tendency to be similar.

        That said, I'm sure the Wangs can come up with a solution. huh-huh...

    • Middle of an all out war? I don't really think it's the middle of anything. Science is always going to be opposed to ignorance and superstition. The proponents of ignorance and "My religion says your facts are wrong!" aren't burning scientists at the stake or arresting them too much anymore, so I'd say the battle has gotten more civil anyway.
  • by bakuun ( 976228 ) on Wednesday May 30, 2012 @12:26PM (#40156425)
    ... - one of them, for example, is ResearcherID at http://www.researcherid.com/ [researcherid.com] . None of them have really taken off so far, and there is nothing to say that this one will. I am skeptical.
    • Individual researchers will be able to get an ORCID number for free as of later this year, whereas universities, companies and other organizations will pay tiered-subscription charges. So far, the scheme has been sustained by members working in kind, as well as by donations of US$574,000 and loans of $1.2 million. Once membership fees begin flowing, they are expected to raise $2.5 million each year.

      With $1.2 million of debt already, and with the expectation that ORCID will become a tiered paid subscription service, I don't see any reason why anyone would want to use ORCID instead of researcherID [researchid.com]. What will happen to your ID if ORCID goes bankrupt?

      Also, what happens when you're a prolific researcher in two different fields of study that usually do not mix very well? Can the new system assign you two different IDs?

  • 16-digit ID (Score:4, Insightful)

    by Alomex ( 148003 ) on Wednesday May 30, 2012 @12:28PM (#40156461) Homepage

    I'm so glad they made the ID a fixed length 16-digit number. Experience shows that we are very good at predicting the total number of IDs ever to be needed.

    Plus 54 bits should be more than enough, so no need to make the number extensible, thus wasting one precious bit as a field extension identifier.
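The 54-bit figure in the comment above can be checked with a few lines of Python. This is only a quick sketch; the helper name `bits_needed` is made up for illustration, and it assumes the ID is treated as a plain 16-digit decimal number.

```python
def bits_needed(decimal_digits: int) -> int:
    """Return the number of bits required to store the largest
    value that fits in the given number of decimal digits."""
    largest = 10 ** decimal_digits - 1  # e.g. 9,999,999,999,999,999 for 16 digits
    return largest.bit_length()


if __name__ == "__main__":
    print(bits_needed(16))
```

Since 2^53 is roughly 9.0e15 and 2^54 is roughly 1.8e16, a 16-digit number does not quite fit in 53 bits, which is where the 54 comes from.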

    • How big of a population explosion are you expecting?
      • Organizations and their sub-organizations, as well as devices themselves (e.g. Watson) may have research IDs. Imagine a Beowulf cluster of machine researchers!
        • The article said the number is assigned to the researcher ( a person ) and not the institutions or its sub-organizations. I did see a tiered subscription rate being mentioned to pay for its employees to be a member. I didn't see any references to artificial intelligence machines.
        • How on earth did such "failed at reading comprehension" junk get up-modded?
      • by Lumpy ( 12016 )

        "How big of a population explosion are you expecting?"

        It depends if T Wang's Cloning project is a success.

        It could be a massive explosion.

    • Re:16-digit ID (Score:4, Insightful)

      by Kenja ( 541830 ) on Wednesday May 30, 2012 @12:33PM (#40156531)
      Don't worry. The number of researchers, scientists and engineers is going down, not up.
      • The number of researchers, scientists and engineers is going down, not up.

        In terms of proportion of population, that may be true; in terms of absolute numbers, I'm pretty sure it's not. The number of papers published continues to grow at a more-or-less exponential rate, and while it's true that the "publish or perish" mentality forces researchers to have their names on more papers now than ever before (which is easier than it used to be, because author lists are also getting longer; it's not unusual for papers in biology to have ten or more authors listed) I have a hard time bel

        • which is easier than it used to be, because author lists are also getting longer; it's not unusual for papers in biology to have ten or more authors listed

          So true. Here's [nature.com] a paper with 27 authors listed.... here's [nature.com] another one with a whopping 80 authors!

      • A sum of numbers that are decreasing may still be infinite.

    • by Jeng ( 926980 )

      If this program runs long enough for this to be an issue then it would have to be very very successful for many hundreds of years.

      • If this program runs long enough for this to be an issue then it would have to be very very successful for many hundreds of years.

        With a 16 digit number, it would be successful for at least 100,000,000 years, assuming EVERYONE was a researcher....

        With a more realistic estimate of the number of researchers, the 16 digit number ought to be good for well beyond the lifetime of the Universe....

    • If the Machines take over and use researchers as power cells with their unique 16-digit number to identify them, each researcher could take up 1.5 square feet and they would still run out of land area on Earth before they ran out of IDs.

    • by simonbp ( 412489 )

      Are you predicting that there are going to be 10^16 scientists anytime soon? If they are all on Earth, that would be just less than 20 scientists per square meter (or about 65 scientists per square meter if you just count land area).
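The density estimate above is easy to reproduce. The sketch below assumes round figures of about 5.1e14 square meters for Earth's total surface and about 1.49e14 square meters for its land area; those constants and the function name are assumptions for illustration, not values from the article.

```python
# Approximate areas in square meters (assumed round figures).
EARTH_SURFACE_M2 = 5.1e14
EARTH_LAND_M2 = 1.49e14

# Number of distinct values in a 16-digit decimal ID.
ID_SPACE = 10 ** 16


def ids_per_square_meter(id_count: int, area_m2: float) -> float:
    """Spread id_count researchers evenly over area_m2 square meters."""
    return id_count / area_m2


if __name__ == "__main__":
    print(round(ids_per_square_meter(ID_SPACE, EARTH_SURFACE_M2), 1))
    print(round(ids_per_square_meter(ID_SPACE, EARTH_LAND_M2), 1))
```

With these inputs the whole-surface figure lands just under 20 per square meter and the land-only figure in the mid-60s, in line with the comment.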

      • by kbg ( 241421 )

        Well, scientists have a tendency to die after 100 years or less, so it is possible to cram 20 or more on a square meter when they are under ground and made of ash :)

    • 9 999 999 999 999 999
      I have no idea what number that is. What comes after trillions? Anyway that's 10,000 trillion people that can be identified with this system..... more than the total number of humans that have ever lived on earth. More than the 40 billion that lived on Asimov's metal world/capital planet called Trantor. (Or when Lucas ripped it off: Coruscant.)

    • by tlhIngan ( 30335 )

      I'm so glad they made the ID a fixed length 16-digit number. Experience shows that we are very good at predicting the total number of IDs ever to be needed.

      You know, financial payment cards (credit cards, debit cards, etc.) are 16 digits in length. The first 6 are special, as is the last, which means there are 9 unique ID digits in each. Yet we don't seem to be running out of numbers, even though when a bunch get "liberated" from a payment processor, most financial institutions simply issue a new number to you.
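The "special" last digit on a payment card is a check digit computed with the Luhn algorithm. As a rough illustration of how such a digit is verified (the function name is made up, and this says nothing about which checksum ORCID itself uses), a minimal sketch:

```python
def luhn_valid(number: str) -> bool:
    """Check a string of decimal digits against the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test card number.
    print(luhn_valid("4111111111111111"))
    print(luhn_valid("4111111111111112"))
```

A check digit catches most single-digit typos, but it also means one of the 16 digits carries no identifying information, which is the point the comment is making.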

      • by plover ( 150551 ) *

        No, we're constantly running out of account numbers. It's such a common problem that several stop-gap solutions have become commonplace, which is why you never noticed.

    • hmm, 10^16 = 10,000,000,000 * 1,000,000
      So enough researcher IDs for everyone on earth to get a new one every year for the next million years.... Somehow I suspect the system will be replaced by something else for completely unrelated reasons long before they start running out of available IDs.
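The arithmetic above can be written out directly. This sketch assumes the comment's own round numbers (10 billion people, one new ID per person per year); the function name is hypothetical.

```python
ID_SPACE = 10 ** 16            # distinct 16-digit decimal IDs
PEOPLE = 10_000_000_000        # the comment's round figure of 10 billion


def years_until_exhausted(id_space: int, ids_issued_per_year: int) -> int:
    """Whole years the ID space lasts at a constant issuing rate."""
    return id_space // ids_issued_per_year


if __name__ == "__main__":
    # Everyone on Earth gets a fresh ID every year.
    print(years_until_exhausted(ID_SPACE, PEOPLE))
```

Changing `ids_issued_per_year` to a more realistic issuing rate shows how quickly the horizon stretches further out.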

      • by Alomex ( 148003 )

        There were equally convincing arguments when we chose seven digit phone numbers, 16 digit account numbers, and 32 bit IP numbers.

        Yet we ended up running out of each one of those. The reason why is that once an identifier succeeds, its use gets extended beyond its original purpose. For example, phone numbers were supposed to be one per household, yet my household with only two adults has seven phone numbers attached to it.

  • by nutgirdle ( 2640927 ) on Wednesday May 30, 2012 @12:29PM (#40156479)
    is researcher M.Y. Wang. He does mostly the same experiment once or twice a day.
    • Re: (Score:2, Funny)

      by Anonymous Coward

      He is working on the new handshaking protocol, right?

  • public key (Score:5, Insightful)

    by ags1 ( 1883204 ) on Wednesday May 30, 2012 @12:32PM (#40156511)
    Can't we just sign docs with a private key? The public key's fingerprint can be your unique id. Or are we still attached to paper?
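To make the fingerprint idea concrete: a fingerprint is just a hash of the public key material, so anyone holding the key can recompute the same ID. The sketch below assumes the key is available as raw bytes and uses SHA-256 from the standard library; real OpenPGP or SSH fingerprints are computed over specific key encodings, so treat this only as an illustration. The sample key bytes and function name are placeholders.

```python
import hashlib


def key_fingerprint(public_key_bytes: bytes) -> str:
    """Derive a stable hex identifier from public key bytes (illustrative only)."""
    digest = hashlib.sha256(public_key_bytes).hexdigest().upper()
    # Group into blocks of four characters for readability.
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


if __name__ == "__main__":
    # Placeholder bytes standing in for a real exported public key.
    sample_key = b"-----BEGIN PUBLIC KEY----- example only -----END PUBLIC KEY-----"
    print(key_fingerprint(sample_key))
```

The same key always yields the same fingerprint, and no central registry is needed to hand it out, which is the appeal of the proposal.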
    • by plover ( 150551 ) *

      The problem isn't attribution as much as it is cross referencing. Say I want to refer to Y. Wang's paper on network theory. I wouldn't use his signature in my paper (signatures are not searchable.)

      • by ags1 ( 1883204 )
        Cross referencing would be done on name and the public key's fingerprint, not the key itself.

        Anyone can generate a public/private key, so we don't need an organization to manage (collect fees) the handing out of numbers. Or deciding who is a scientist and who deserves to get a number.

        Attribution would be a nice bonus.
      • You could use his key fingerprint to reference him.

        The problem with doing that kind of thing though is that over time people move to new keys, either because they want a stronger key or because the old one is lost or compromised. And of course some people may deliberately create multiple personas for themselves (whether this is a good thing or a bad thing is a matter of opinion).

  • Is there a serious problem with authors sharing names? I am sure it happens, but (a) it seems unlikely that they would be in the same field and (b) it seems even less likely that they would be at the same institution and (c) even less likely that their contact information would be the same so are there really cases where there is confusion over who wrote a paper?
    • Re:Problem? (Score:5, Insightful)

      by JaredOfEuropa ( 526365 ) on Wednesday May 30, 2012 @12:42PM (#40156669) Journal
      I have a last name that is very uncommon in the Netherlands, even more so because it is capitalized differently than usual, and it is "misspelled" to boot. Even so, there's a guy (not family) who shares my first and last name, went to the same university, same department, and graduated on a topic similar to mine. We've published on overlapping topics. So yes, confusion does happen, and I've often been contacted by someone looking for the other guy. Sounds nice since I have just four publications to my name whereas he went into research and has many more, but of course I can't take credit...
      • There goes my slight advantage in academia due to a unique last name (a fairly common one in the Ukraine but spelled in an unusual way in English). I always hoped that it would make up for being trivial to cyber-stalk but oh well.
      • by martas ( 1439879 )
        Maybe you two should just give up and pretend to be the same person? Twice the productivity, for free!
    • by Anonymous Coward

      Actually, there is. My wife is a PhD who studies stroke and epilepsy. There is another scientist in Germany who has the same first initial and last name who also studies stroke. He's been writing papers ~20 years longer than my wife has, but when you search for her papers, you get craploads(SAE standard measure) of his, as well.

    • by ceoyoyo ( 59147 )

      Yes. And people switch institutions, and fields. At the moment, if someone has a common name, looking up their papers is an exercise in AI. With a unique identifier you'd be able to tell Google Scholar "get me all the other papers by this author in the last ten years."

    • Re:Problem? (Score:5, Insightful)

      by Daniel Dvorkin ( 106857 ) on Wednesday May 30, 2012 @12:55PM (#40156873) Homepage Journal

      I am sure it happens, but (a) it seems unlikely that they would be in the same field

      I have a few name collisions just in my own reference database (i.e., list of papers to which I've referred in my own work.) I can pretty much guarantee that if you look at the author lists for any major single-subject journal, you'll find a whole bunch of identical $FIRST_INITIAL $LAST_NAME entries which are not, in fact, the same people.

      Hell, I have a pretty rare (in the US, at least) last name -- and occasionally I still get e-mails from people who think I'm the Daniel Dvorkin who wrote a paper on psoriasis in 1989. It's not entirely unreasonable, since my name appears on a couple of papers related to inflammatory disease, but I'm a grad student in Colorado, not a dermatologist in Pennsylvania ...

      and (b) it seems even less likely that they would be at the same institution and (c) even less likely that their contact information would be the same so are there really cases where there is confusion over who wrote a paper

      True enough, but people who are looking at author names are not necessarily looking at the entire paper (where contact information is usually given.) A related problem is that journal publications are increasingly subject to various kinds of text data mining, and rightly or wrongly, the format for fields like author institution and contact information isn't standardized from journal to journal -- and in academia, both institutions and e-mail addresses are subject to frequent change. If you published a paper five years ago while at the University of East Dakota and your e-mail in the corresponding author field was given as betterunixthanunix@eastdak.edu, and you're now at South Virginia State with the address butu@svs.edu, good luck getting any database to make that connection without human assistance.

    • by godrik ( 1287354 )

      We recently developed a web service for recommending papers, reviewers and journals based on the citations of a paper ( http://theadvisor.osu.edu/ [osu.edu] ). Name conflicts can be problematic. Many paper recommendation algorithms rely on the assumption that if two papers share the same authors, they must be somewhat related. Name conflicts lower the quality of that assumption. That said, some databases are already disambiguated. For instance, DBLP adds an ID to the name in case there is more than one. (but it

    • by Hatta ( 162192 )

      Think of the birthday paradox [wikipedia.org]. It's quite unlikely that any given researcher has the same name as any other given researcher. But there are n!/(2(n-2)!) pairs to consider, which gets big really fast, so you have to adjust your expectations for multiple comparisons.
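The pair count n!/(2(n-2)!) simplifies to n(n-1)/2, and the birthday-style collision probability follows from it. The sketch below assumes names are drawn uniformly from a fixed pool, which real names are not (common names make collisions even more likely); the function names are made up for illustration.

```python
import math


def pair_count(n: int) -> int:
    """Number of distinct pairs among n researchers: n! / (2 (n-2)!)."""
    return math.comb(n, 2)


def collision_probability(n: int, name_space: int) -> float:
    """Probability that at least two of n people share a name,
    assuming each name is picked uniformly from name_space options."""
    p_all_unique = 1.0
    for already_taken in range(n):
        p_all_unique *= (name_space - already_taken) / name_space
    return 1.0 - p_all_unique


if __name__ == "__main__":
    # Classic birthday setting: 23 people, 365 possible values.
    print(pair_count(23))
    print(round(collision_probability(23, 365), 3))
```

Swapping in a journal-sized author list and a plausible number of distinct initial-plus-surname combinations shows the same effect at scale.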

    • People move between institutions and their contact information changes. So if you insist on a match on all those fields you will reduce the chances of incorrectly indicating papers as being by the same author but you will increase the chance of incorrectly indicating papers as being by different authors.

    • by mikael ( 484 )

      The funny thing is, similar names seem to have similar levels of achievement. Perhaps parents were in similar social circles.

      • For some reason growing up, everyone I knew named "Ryan" was a trouble maker, even though they were completely unrelated and came from different families/backgrounds. I always thought the name was cursed.
    • > Is there a serious problem with authors sharing names?

      For the FIRST and LAST time, names are NOT unique; they are just a convenient label, as history clearly shows: Henry I, Henry II, Henry III, ... Henry VIII, etc.

      Unique numbers are the only proper way to solve this problem once and for all.

    • I have often been contacted via email by correspondents under the misguided impression that I am an Italian author and/or an Israeli public intellectual. I am not sure if these two are the same man, but I know from childhood that someone with my name once won some kind of prestigious prize for his fiction writing (was it a Pulitzer? I don't remember).

      For the record, I'm the computer programmer by this name from Northeastern America.

  • I changed my name to Steve Supercalifragilisticexpialidocious for nothing?!?!?
  • by acidradio ( 659704 ) on Wednesday May 30, 2012 @12:39PM (#40156609)

    The Writers Guild of America requires that all members have unique names. No two members can share a name, so as to prevent confusion. This is evident with David X. Cohen, well known as a writer for The Simpsons and Futurama. His real name is David S. Cohen but the Writers Guild of America already had a David S., so he took David X. Cohen.

    • France used to require government approval for children's names when registering births. This was a francophone thing, not a uniqueness thing. But it could have been expanded to use a uniqueness check. Corporation and D/B/A names have to be unique within their jurisdiction.

      Names in China used to be disambiguated by asking "What is your village?" This is no longer very helpful.

    • David H. Lawrence XVII, the guy who played the Puppetmaster on Heroes, had a similar problem. His Wikipedia entry [wikipedia.org] explains: "The 'XVII' in his name was a way for Lawrence to distinguish himself from previous David Lawrences already registered with SAG. At the time, he was the 17th David Lawrence listed on IMDB, and appended the number to his name upon his own registry."

    • by Daniel Dvorkin ( 106857 ) on Wednesday May 30, 2012 @01:13PM (#40157105) Homepage Journal

      There are many, many more scientists than there are members of WGA.

    • I've always wondered why people name their kids names that already exist anyway. It's a nightmare for database administration to say the least and then on top of that I think the human race has just given up when I see this. I mean - at one point all the names that exist were new names that never existed before. People made them up. So when exactly did the creation of new names stop and why? Have we evolved imagination right out of our brains or just become lazy?

  • by DeeEff ( 2370332 ) on Wednesday May 30, 2012 @12:42PM (#40156667)

    Overall, I thought having multiple researchers with the same name was a good thing.

    Then we could each take credit for one another's work, and we'd all collectively be the biggest badass in science. It'd sure make research funding easier, in any case.

  • There is already a profusion of similar ID systems operated by the big players in the field. For instance, Web of Knowledge http://apps.webofknowledge.com/ [webofknowledge.com] and Scopus http://www.scopus.com/ [scopus.com] already have some kind of an automated author sorting system behind their paywall. I think that Thomson is also behind ResearcherID. Plus there is ResearchGate which creates a profile for you without asking you anything and computes a totally non-transparent metric of your impact as a scientist. In the end, I think that
  • Their position on this is interesting: we will create the method but someone else will have to create the database and maintain it. What I see here is that they see a bag of worms when it comes to privacy issues and do not want to touch that part of it. If an issue results in some aspect of the collection of such information, ORCID’s only involvement will be the DB structure. They had better include some template or recommended best-known practices on how a collection of this data should be hand

    • by slew ( 2918 )

      Their position on this is interesting: we will create the method but someone else will have to create the database and maintain it. What I see here is that they see a bag of worms when it comes to privacy issues and do not want to touch that part of it. If an issue results in some aspect of the collection of such information, ORCID’s only involvement will be the DB structure. They had better include some template or recommended best-known practices on how a collection of this data should be handled.

      Creating it is one thing, operating such a creation should also be addressed before unintended consequences happen.

      FWIW, I think privacy and its evil twin identity theft are probably issues that they shouldn't be solving since they are proposing some sort of cheesy author validated "password" system. They don't seem to have any way to address anything about keeping people from taking credit for publications of others with the same "name" (CV fraud), or publishing crap papers under someone's name to ruin their reputation, or other types of identity theft, so hopefully they aren't trying to do this. Although they are m

  • by Lumpy ( 12016 )

    Why can't they just do "Researchername,DOB"?

    If you have 20 researchers all named I.P. Freely and are all born on 12/13/1992 then I think there is a bigger problem here.

    • If you have 20 researchers all named I.P. Freely and are all born on 12/13/1992 then I think there is a bigger problem here.

      Yeah, definitely a potential overflow problem.

    • by jc42 ( 318812 )

      Why can't they just do "Researchername,DOB"?

      There have been numerous reports from many countries about duplicate government ID numbers [usatoday.com] due to schemes like this. There was a recent story about a similar case in Canada, with two people born the same day in the same hospital that were given identical names.

      Yes, the probabilities are low, but they aren't zero. If the money has any legal or financial impact, duplicates inevitably lead to lawsuits, lost time, etc, etc.

      If the ID number is important, you need to guarantee that two people don't get ass

  • This would actually be a huge boon for students looking for a research mentor or PI. I spent months trawling through Google and WoS looking through faculty and it was a gigantic mess trying to separate out who was who. The professors of Asian origin were by far the worst to get through as they had 200 other guys with the same name boosting their publication counts to absurd levels. It's made worse by the habit of moving around the country and name abbreviations. Algorithms and narrowing the search criteria c
  • Authors can generate their own unique id: keyword UUID (http://en.wikipedia.org/wiki/Universally_unique_identifier)

    So no central database is needed. Just some conventions.
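For what it's worth, generating such an ID takes one call in most languages. A minimal Python sketch using the standard library's random (version 4) UUIDs; the function name is only for illustration:

```python
import uuid


def new_author_id() -> str:
    """Generate a random (version 4) UUID to use as a self-assigned author ID."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(new_author_id())
```

Uniqueness here is probabilistic rather than guaranteed, and nothing stops one person from generating several IDs, so the "conventions" would still have to cover who claims which ID.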

    S

  • ANOTHER Unique number!

    Besides:
    http://en.wikipedia.org/wiki/Virtual_International_Authority_File [wikipedia.org] , NDL Authorities, http://en.wikipedia.org/wiki/Library_of_Congress_Control_Number [wikipedia.org] , or http://en.wikipedia.org/wiki/Universal_Authority_File [wikipedia.org]

    And these are only the ones I found assigned to a single author in Wikipedia.

    Why not just use one of these?

  • This sounds similar to my idea for establishing a special top-level domain for scientists to register a permanent domain name [ideationizing.com], which I posted back in February. Except, with my system the ID incorporates the scientist's name and birth date, identifying information that is already commonly used when referring to historical figures. (OK, all the Wangs may need to include the exact time of their birth, down to the second, in order to get a unique ID.) With my system the ID is itself an IRI so it can be used in RD
