Computers May Be As Good As (Or Better Than) Human Biocurators

timothy posted more than 2 years ago | from the skynet's-nicer-side dept.

Biotech 35

Shipud writes "Sequencing the genome of an organism is not the end of a discovery process; rather, it is a beginning. It's the equivalent of discovering a book whose words (genes) are there, but their meaning is yet unknown. Biocurators are the people who annotate genes — find out what they do — through literature search and the supervised use of computational techniques. A recent study published in PLoS Computational Biology shows that biocurators probably perform no better than fully automated computational methods used to annotate genes. It is not clear whether this is because the software is of high quality, or both curators and software need to improve their performance. The author of this blog post uses the concept of the uncanny valley to explain this recent discovery and what it means to both life science and artificial intelligence."

35 comments

Still not (0, Offtopic)

Anonymous Coward | more than 2 years ago | (#40346525)

Still not as good at getting first posts, though.

Re:Still not (1)

Nyder (754090) | more than 2 years ago | (#40346541)

Still not as good at getting first posts, though.

ya, right, you're a computer, quit lying.

Re:Still not (1)

Riceballsan (816702) | more than 2 years ago | (#40346559)

Considering that spam bots have a history of getting about 50% of the first posts, and assuming there are 100 Slashdot users for every spambot, I would say they are better than us at getting first posts.

Re:Still not (1)

sideslash (1865434) | more than 2 years ago | (#40346595)

Guys, ignore this comment. Riceballsan is a known spambot, and alternate handles are:$althandle1,$althandle2,and $althandle3. </BR> Shopping for smelly melons? <a href="http://www.smellymelons.com">Look no more.</a>

"biocurators"? (2)

smoothnorman (1670542) | more than 2 years ago | (#40346713)

in all my too many years [hack spittoo] of biochemistry bioinformatics bio-whathaveyou this is the first i've heard of the term "biocurators". and i gotta say, i don't like it. no-sir, not a bit.

"curator mid-14c., from L. curator "overseer, manager, guardian," agent noun from curatus, pp. of curare (see cure). Originally of minors, lunatics, etc.; meaning "officer in charge of a museum, library, etc." is from 1660s." so, "life + manager" or "life + officer in charge of a library" ...nah.

'geneannotator' ....maybe

Re:"biocurators"? (1)

Shipud (685171) | more than 2 years ago | (#40346741)

Re:"biocurators"? (1)

PopeRatzo (965947) | more than 2 years ago | (#40346873)

How do you get "translation and integration" from "overseer, manager, guardian"?

You can use words any way you goddamn please, but you cannot expect the rest of us to play along.

There were plenty of better words to append to "bio-" than "curator" for this purpose. It's nothing more than an effort to use the tools of marketing where they don't belong.

It's a shame that so much science journalism is shit.

Re:"biocurators"? (2)

Shipud (685171) | more than 2 years ago | (#40347065)

This is no science journalism. It's the site of the International Society of Biocuration, which has about 1500 members to date. http://colleagues.biocurator.org/affiliations [biocurator.org] It's too bad that you are ignorant of the field: http://www.ploscollections.org/article/browseIssue.action?issue=info%3Adoi%2F10.1371%2Fissue.pcol.v03.i05 [ploscollections.org]

Re:"biocurators"? (1, Flamebait)

PopeRatzo (965947) | more than 2 years ago | (#40347239)

International Society of Biocuration

Are they also members of the International Brotherhood of Language Wankers?

I was once a member of the Elks, but guess what? I wasn't really an elk.

No, really, that's very nice that they have a name for their "International Society" and all, but thanks to my matrimonial duties, I've had to read more scientific papers than I could count (if I could count), and I'm well-versed in the scientific butchery of language. It's partly due to a disease which causes people who are expert in one area to spontaneously believe they are expert in any area that they choose, and partly due to post-docs yearning to be special.

I don't know fuck-all about biology or genetics or curation, but I'm an A-number-fucking-One expert in the English language. I've got all sorts of papers to prove it, with fancy embossed stamps, and a picture of myself having a purple hood pulled over my head because of my expertise with English. Hell, I've even got a tweed jacket with suede patches on the elbows somewhere in the back of my closet. And despite one (at least) slightly shoddy episode with a fulsome grad student in the early 80's, I've got a stellar reputation in the field. And "biocuration" is an abomination, no matter how many geeks with lab coats and large pores want to clip a form in the back of a journal and send in a money order for $15 for the right to call themselves "biocurators" (I assume a cheap ballpoint with the words "International Society of Biocuration" screened on it is also involved).

Now don't get me started on "combinatorics". That one almost earned me a trip to divorce court with my mathematician wife.

Re:"biocurators"? (0)

Anonymous Coward | more than 2 years ago | (#40347405)

I've got all sorts of papers to prove it,

It is good that you have those papers. Since you have yet to demonstrate that you can effectively use the English language for communication. You do excel at its use for rambling, swearing, ranting, and raving.

Re:"biocurators"? (3, Funny)

PopeRatzo (965947) | more than 2 years ago | (#40347487)

It is good that you have those papers. Since you have yet to demonstrate that you can effectively use the English language for communication.

Brother, I am to the written word what Picasso was to painting, what Robert Mapplethorpe was to photography, and what Rahsaan Roland Kirk was to the saxophone.

You won't understand how great I am until somebody else tells you. Or, until I die, which could be any minute now with this goddamn heat here in Chicago.

Now I'm going to go out back and take a soak in the kiddie pool. While I'm out there, I want you to think about what I've said. Later, you can explain it to me, because I don't have a clue. These strawberry daiquiris hit like a jackhammer on a hot day like this one.

Re:"biocurators"? (1)

kermidge (2221646) | more than 2 years ago | (#40348009)

"I've got all sorts of papers to prove it, with fancy embossed stamps...." Beauty of a para.

My hope, Your Eminence, is that someday you will unbend enough to tell us what you really think.

I respect English as a tool and as a thing of beauty in its own way, but not enough to correct my abuse of it. Being simple, I still mourn the death of the adverbial form, and detest the verbification of nouns and such locutions as "going forward", "at this point in time" and "price point."

Re:"biocurators"? (1)

PopeRatzo (965947) | more than 2 years ago | (#40348903)

Your Eminence, is that someday you will unbend enough

I would like to unbend, but then it gets caught in the bike's spokes.

Re:"biocurators"? (1)

Savantissimo (893682) | more than 2 years ago | (#40348069)

I was slightly skeptical until you mentioned the tweed coat, and the elbow patches really nailed it down. You should really invest in some briar pipes and Balkan Sobranie.

This sentence is ambiguous, though: "And despite one (at least) slightly shoddy episode with a fulsome grad student in the early 80's, I've got a stellar reputation in the field." At least one episode, or at least slightly shoddy? Was the grad student effusive, generous or simply "full and well developed"? Hmm... perhaps the ambiguity is artful.

I agree though that "biocuration" is a barbarous term. Arthur Clarke had a good put-down of that sort of thing in "Silence Please" in Tales From the White Hart:

"....Sound waves consist of alternate compressions and rarefactions."
"Rare-what?"
"Rarefactions."
"Don't you mean 'rarefications'?"
"I do not. I doubt if such a word exists, and if it does, it shouldn't," retorted Purvis, with the aplomb of Sir Alan Herbert dropping a particularly revolting neologism into his killing-bottle.

Re:"biocurators"? (1)

PopeRatzo (965947) | more than 2 years ago | (#40348895)

Was the grad student effusive, generous or simply "full and well developed"?

All of the above, if my memory speaks the truth.

It tends to be a very unreliable narrator, I'm learning.

But fulsome enough. That's what matters.

Re:"biocurators"? (1)

Daniel Dvorkin (106857) | more than 2 years ago | (#40350045)

Now don't get me started on "combinatorics".

Do you have a better suggestion for the name of that particular field?

BTW, I know you're at least partly exaggerating for humorous effect, but I have to point out that lines like "It's partly due to a disease which causes people who are expert in one area to spontaneously believe they are expert in any area that they choose, and partly due to post-docs yearning to be special" and "geeks with lab coats and large pores want to clip a form in the back of a journal and send in a money order for $15" don't do a whole lot to dispel the stereotype -- regrettably popular here on Slashdot -- of the clueless liberal arts prof yammering about science. The simple fact is that every technical field has specialized concepts that can't easily be expressed using existing vocabulary, at least not without a whole lot of excess verbiage, so people in those fields either invent new words or alter the meanings of old ones to make expressing those concepts less tedious. And please, don't even try to tell me that literary criticism is any less guilty of language abuse than science is; I've read my grandfather's work.

Re:"biocurators"? (2)

PopeRatzo (965947) | more than 2 years ago | (#40350631)

I know you're at least partly exaggerating for humorous effect

"partly"?

I didn't mean to offend.

And you're certainly correct that literary critics are among the worst of the lot when it comes to horrible neologisms.

I believe that combinatorics is among the most beautiful of the Maths. The name is just a mouthful. But so is "deconstructionism".

And regarding my horrible stereotyping of biologists: my beautiful daughter is engaged in the study of biomathematics (which now makes me officially the stupidest person in the house). I certainly have nothing but admiration for the geeks with lab coats and large pores that have been hanging around my house lately emptying the refrigerator and trying to flirt with my daughter who is still too young for serious dating. You see, I have been among grad students before. I know they only have one thing on their minds. Well, maybe two things if you count all that science stuff.

Re:"biocurators"? (1)

Daniel Dvorkin (106857) | more than 2 years ago | (#40353855)

Sorry for overreacting. It's a hot button for me, I guess; I do so loathe any manifestation of the pervasive Two Cultures bullshit that I often have a hard time telling when people are joking about it.

Re:"biocurators"? (2)

Yvanhoe (564877) | more than 2 years ago | (#40347231)

I concur. It took me a while to understand that "biostatistics" is simply statistics with no specific mathematical tool...

Re:"biocurators"? (1)

mcgrew (92797) | more than 2 years ago | (#40351609)

TFA talks of the Uncanny Valley and little about actual gene sequencing, I didn't read much past the Valley graph; nothing I didn't already know. What did interest me was the two chatbots conversing; it looks like bots haven't improved much since 1983 when I wrote Artificial Insanity on a fantastically underpowered computer. I posted this on my old Quake site ten years ago:

Alice joined the game
About 20 years ago, frustrated that otherwise serious researchers and scientists seemingly thought they could program a computer to think (without, of course, understanding what "thought" actually is; nobody knows that), I wrote a simulation that appears to think, in order to completely debunk the fools and those fooling them who think computers can think.
        I wrote Artificial Insanity in less than 20K (that's Kilo, not mega) bytes- smaller than modern viruses, that ran on the Timex TS-1000 tape driven computer. I later ported it to a Radio Shack computer, then an Apple IIe, and finally ported it to MS-DOS.
        The DOS version's source code is still under 20k (I didn't change the algorithm, only the syntax for the different programming language) although compiled into an .exe it takes about 400k- still tiny by today's standards, as far as simulation software and games go.
        As I mentioned, I did it in response to "Elijah" and all the other similar programs that attempt to fool you into thinking they can think. As far as I know, mine is the only one that is NOT claimed to actually possess intelligence. None really ARE intelligent, I'm just the only one not making the claim. Debunking the claim was my reason for writing it. I go into more detail about it at the Artificial Insanity page.
        Another thing different about Art from all the other intelligence simulations is that I wanted it to be fun, yet annoying. Kind of like playing Quake on a 28.8 against a bunch of LPBs. So I made it a smartass.
        Also, for example, I added little things like a routine that occasionally runs that, instead of answering the questioner, asks if he or she wants to play a game. Of course, most folks consider Art a game any way (although like the Sims, you can't win or lose). Any negative response to "Do you want to play a game" loops back to "Do you want to play a game?" When the hapless player finally gives up and answers "yes" in exasperation, Art answers "you lose".
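
A minimal sketch of the game routine described above, in Python rather than any of the languages Art was actually written in. It assumes that any reply other than "yes" counts as negative; the function name `play_a_game` and the scripted list of replies are hypothetical and not taken from Art's source.

```python
QUESTION = "Do you want to play a game?"


def play_a_game(replies):
    """Replay the routine against a scripted list of player replies.

    Returns the lines the program would print, in order.
    Assumption: anything other than "yes" is treated as a negative reply.
    """
    output = []
    for reply in replies:
        # The question is asked before every reply, including the first.
        output.append(QUESTION)
        if reply.strip().lower() == "yes":
            # The player finally gives in, and loses.
            output.append("you lose")
            break
    return output


if __name__ == "__main__":
    for line in play_a_game(["no", "No way", "yes"]):
        print(line)
```

The point of the design is that the only exit from the loop is the one answer that loses.
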
        Art cost a friend of mine a keyboard. A friend of his literally destroyed it playing with Art, typing as hard as he could.
        I ran across yet another program that claims to think like a human the other day.
        The "Alice Bot" is basically your typical Elijah like program- BORING. As boring as small talk with the dumbest, most boring person you know. Alice is a lady. If she was a whore she might be fun, but I digress.
        I wondered what would happen if Art had a conversation with Alice. Alice has a web based presence, so I didn't even have to install it.
        I was expecting Alice to choke. I expected Art to kick Alice's ass.
        I fired up Art and logged in to Alice (you dirty minded...)
        Here is the transcript. Art's responses are in blue, Alice's are in red. [Note: no color here, so Art is bold and Alice is italic.] Alice starts: "What is your name?"
I should have asked Art, but I answered this first question for him, and typed in
Art
        Alice replies:
I like art very much too, especially the contemporary stuff.
Are you for real?

Yes I am real AI

whatever gave you that idea?

My brain uses AIML to format responses to your inputs, but I don't have one for that.

Look, I'm good, but I'm not a Cray

Good in what sense? That is interesting: unknown said he am not a cray.

(Art doesn't require you to hit "enter". It takes a period, question mark, etc as the end of the sentences and answers, so here it stops at the question mark. All that is input is "Good in what sense?")
A piece of junk

Are you talking about an animal, vegetable or mineral?

can you put that another way?

Let me think. OK I will put it there.

(heh heh she said...)
The situation calls for more scrutiny.
Tell me more.

What do you think?

I think a lot of things.

Obviously.

What makes it so obvious?

Looks Greek to me.
Really. it is?

Look, I'm good, but I'm not a Cray.

Good in what sense? Do you mind if I tell other people: unknown said he m not a cray.

(entered second question into Art)
WOW!
I'm glad your impressed, unknown.

(Alice doesn't seem to be too literate...)
Uh,
Er. Um.

This is what you human types call a joke, right?

At this point, Alice froze for two full minutes
You don't say.

Right on!

Yeah.

Your guess is as good as mine.

Some people say "your guess is as good as mine".

Maybe.

I understand.

How should I know?

Either from your senses or from pure reason.

I don't think so.

Why do you disagree?

That's why.

Thanks for explaining it.

How simple can it get?

What is it to you.

I don't know but it's ugly.

What makes it so ugly?

  this time Art pauses...
Nothing.
Not a thing?

Never mind that, I think I'm in love!

Oh well thanks for trying to explain it to me anyway. That is "Not a thing?" Since when? Tell me about the object of your affection.

(Ok, one question at a time, lady. "Since when?")
The best time possible for the occurrance of the action.
Tell me about the object of your affection.

UHH....!

Er. Um.

Always

Not that often.

It really doesn't make any diffrence.

What doesn't make any difference?

Real

Totally real.

I think so.

I think so too.

At this point I hit the big red switch in horror...

(6/11/2002 Springfield Fragfest)

Re:"biocurators"? (1)

RespekMyAthorati (798091) | more than 2 years ago | (#40361661)

Did anybody else read this as "Computers May Be As Good As (Or Better Than) Human Binoculars"?
I thought it meant people with really good eyesight.

Machine learning, large multivariate data. Duh. (0)

Anonymous Coward | more than 2 years ago | (#40346777)

This isn't at all surprising if you understand the concept of machine learning and have ever tried to do anything remotely similar by hand.

Nothing in TFA says the machines are better (2)

bbartlog (1853116) | more than 2 years ago | (#40346835)

In fact rather the opposite - it says that the reliability of the machines is 'competitive' or 'rivals' the human curators. That's marketing speak for 'not quite as good just yet'.

Re:Nothing in TFA says the machines are better (1)

Shipud (685171) | more than 2 years ago | (#40347077)

If you look at the figures, you'll see that the IEA-evidenced annotations outperform the curated ones.

misunderstanding the concept of "uncanny valley" (4, Insightful)

slew (2918) | more than 2 years ago | (#40347399)

This author seems to have inappropriately compared the "fear" of machines doing better than humans with the concept of uncanny valley.

The concept of the "uncanny valley" is that the affinity of humans for observing the appearance or behavior of a human-like entity (robot, alien, whatever) has this unexpected dip when it is too close to the human behavior (we have this apparent built-in visceral problem with the entity). However, this is only true when it is trying to mimic human-like behaviors. If it's doing something totally different or totally exceeding human behaviors (say distinctly non-human speed, accuracy, strength, appearance, etc), the uncanny valley doesn't say anything about affinity, in fact, if you were to extrapolate the curve out, humans might even have more affinity for these "super-human" behaviors. Maybe that's why many express affinity for live-action versions of comic book super-heroes, or airbrushed models in magazines. The behavior is so far from the uncanny valley that it doesn't invoke the suppression response that is responsible for it.

Just like what was once observed with "space-shuttle" pilots, the computers can probably do a better job at this task, but we don't quite trust them yet (for some reason). That's really just the human fear of being replaced by machines, not uncanny valley. Note that the only people fearful about this behavior are the people that are likely to be replaced (and maybe a few that sympathize with them)...

Re:misunderstanding the concept of "uncanny valley (0)

Anonymous Coward | more than 2 years ago | (#40347577)

"This author seems to have inappropriately compared the "fear" of machines doing better than humans with the concept of uncanny valley."

That's not how I read that, although I may be wrong. It seems like both are performing on a par (more or less), yet the higher sensitivity ("coverage") of the automated methods makes them not-quite human-like, but not "better". So the "uncanny valley" here addresses the observation that the programs are performing like humans, but differently. And a bit weirdly close.

Re:misunderstanding the concept of "uncanny valley (0)

Anonymous Coward | more than 2 years ago | (#40349603)

The fear of the "uncanny valley" is that the android, cybernaut, etc. is sufficiently "human" to accidentally invoke the fear of the psychopath, i.e. a person without empathy, who is therefore very, very dangerous.

Biocurators (1)

Anonymous Coward | more than 2 years ago | (#40347569)

There are a few people in the world, who work for INSD members like NCBI or EMBL, whose job is "biocuration". It's a rare profession. Having reliable annotations available is not the same as discovering a book. In a car analogy, genes are a list of parts. You know things about the car, but how it works and comes together is up to human ingenuity. In the bioinfo/molbio field that usually means heavy use of OSS and shell coupled with in vitro experiments.

What this really means (0)

Anonymous Coward | more than 2 years ago | (#40351967)

What this really means is that we know so very little about how genomes came into existence and how they organize themselves. Algorithms, after all, only optimize or make efficient the factors humans feel are important, yet this is often done with little understanding, as yet, of what rules natural self-organizing systems use, or even whether there are many rules at all. The ontologies that are the final outcome of such curation are themselves only theories or models of what is actually going on within genomes. Whatever works may be the only rule constrained by the fact that whatever rules exist must ultimately be expressed in the form of nucleotide sequences. Consequently, it should be of no particular surprise that machines and humans behave similarly when it comes to understanding what this means.

As with all science, humans will use tools, in this case algorithms, to aid in developing a better understanding. To achieve that understanding with respect to genomes means elucidating how such sequences and subsequences originated and evolved and have been constrained by selection and influenced by mutation, genetic drift, assortative mating, and other processes that influence which nucleotides have ultimately become "locked into" genomes through geological time.

Systematic biology is not rocket science. It is far more complicated than rocket science since the number of possible permutations and combinations of objects (exterior products of potential events) , many unique, that must be investigated is much, much larger than the known number of electrons in the known universe. Understanding biology, not space is truly the final frontier.

Unfortunately for humans, we seem hell bent on making ourselves go extinct before we have time to figure it all out. It is perhaps both ironic and fitting that humans shall soon go extinct as a species so soon after our first baby step to reach interstellar space has only just been achieved.

Biocuration? (1)

dkf (304284) | more than 2 years ago | (#40347779)

Biocurators are the people who annotate genes — find out what they do — through literature search and the supervised use of computational techniques.

Biocuration means that? I'd have never guessed from the name. Let's face it, literature searching is now something that is thoroughly practical by computer (it's pretty much just like using a web search engine, except over a different digitized corpus) and "supervised use of computational techniques" there makes it sound like they're a bunch of low-level lab technicians. No creativity required at all. Is it any wonder they're being replaced with little more than a shell script? What's more, the computer will be far faster as well. It won't get tired, it won't get bored, it'll just do exactly what it's been told to do. (The annotation of a genome with the consequences of the mutations it has should be trivial; I know this from having worked with code that did a whole genome's worth in well under an hour. Several years ago.)

Now if instead they were curating the actual samples, I'd have much more respect. Those can be quite tricky to work with, and they're often irreplaceable.

Re:Biocuration? (0)

Anonymous Coward | more than 2 years ago | (#40350041)

The annotation of a genome with the consequences of the mutations it has should be trivial; I know this from having worked with code that did a whole genome's worth in well under an hour. Several years ago.

OK, you're either using a time machine to post from a couple centuries in the future or by "consequences of mutations" you mean some vague and totally useless hand-waving "This genome has a mutation that might possibly affect some unspecified gene expression levels in a statistically significant manner".

For that matter, the "Gene Ontology" curation discussed in TFA isn't really all that useful itself. Imagine that you had a database where you could type in the name of a program, say "emacs", and get back a broad function classification, say "text editor". That's essentially what Gene Ontology is. No source code, no manual, etc - those are found in other (more useful) biological databases.
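
The analogy can be made concrete with a toy lookup table. This is only a sketch of the analogy, not of any real Gene Ontology interface: the `FUNCTION_CLASS` table and the `classify` helper are hypothetical names, and only the "emacs" entry comes from the paragraph above.

```python
# Hypothetical table mapping a program name to a broad function class.
# Only the "emacs" entry is taken from the comment; nothing else is implied.
FUNCTION_CLASS = {
    "emacs": "text editor",
}


def classify(name):
    """Return the broad function class for a name, or "unclassified".

    Note what is missing: no source code, no manual, only a label.
    """
    return FUNCTION_CLASS.get(name.strip().lower(), "unclassified")


if __name__ == "__main__":
    print(classify("emacs"))
    print(classify("some-unknown-program"))
```
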

But let's get back to the biology.

First, when you do a whole genome sequence there are almost always lots of sequencing errors. If you actually want to be sure that a genome has a particular mutation you have to go back and double check "by hand" (e.g. Sanger sequencing).

Second, even in regions of the genome that can reliably be said to be protein coding, there are vast numbers of proteins that are functionally uncharacterized. Even if you get a nonsense mutation that totally truncates the protein, in many cases you aren't going to know what the protein does.

Third, even if you get a verifiable mutation in a protein with known function, trying to figure out how a particular point mutation affects the function is almost impossible. In rare cases, you may be able to go in by hand and do a detailed analysis of the structure and show that the mutation is distorting the active site in a way that has particularly obvious consequences - but this is hard and overwhelmingly uncertain.

Finally, most real phenotypes are the result of multiple mutations all working together - e.g. a kid whose parents are cousins who got the same set of bad genes from both parents - and that set of bad genes knocks out enough of the (typically highly redundant) parts of a particular biological pathway to cause problems.

I'm not saying that computational tools aren't useful - but we're a long long way from being able to automatically analyze a genome and have a real explanation of the consequences of the mutation.

personal stuff on the clock (-1)

Anonymous Coward | more than 2 years ago | (#40348717)

Life can be better. that would suck to have to worry about that. maybe you shouldnt work for a holes. :(.

False dichotomy? (1)

Tablizer (95088) | more than 2 years ago | (#40349817)

Why not find a way to leverage the advantage of each?

Laughable (0)

Anonymous Coward | more than 2 years ago | (#40350315)

Most (all?) computational methods for protein annotation rely on a reliable corpus made by humans, and try to find similarities to guess the result.

Saying that computers are better than humans is like saying turbos are more powerful than engines.
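
A rough sketch of the idea in the comment above: label a new sequence by finding the most similar entry in a human-curated set and copying its annotation. Everything here is an assumption made for illustration. The sequences and labels in `CURATED` are invented, the names `kmers`, `jaccard` and `annotate` are hypothetical, and k-mer Jaccard similarity is a simple stand-in for whatever similarity measure a real annotation pipeline would use.

```python
def kmers(seq, k=3):
    """Return the set of overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def annotate(query, curated, k=3):
    """Copy the label of the most similar curated sequence.

    Returns (label, score); label is None if nothing overlaps at all.
    """
    query_kmers = kmers(query, k)
    best_label, best_score = None, 0.0
    for seq, label in curated.items():
        score = jaccard(query_kmers, kmers(seq, k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Invented "human-made corpus": made-up sequences with made-up labels.
CURATED = {
    "MKTAYIAKQRQISFVKSHFSRQ": "hypothetical kinase",
    "GSHMLEDPVAGGNSTWWQ": "hypothetical transporter",
}

if __name__ == "__main__":
    print(annotate("MKTAYIAKQRQISFVKAHFSRQ", CURATED))
```

With an empty `CURATED` dictionary, `annotate` can only ever return `None`, which is the comment's point: the automated guess is only as good as the curated corpus behind it.
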
