Boiling Down Books, Algorithmically

timothy posted more than 6 years ago | from the infallible-is-a-very-strong-word dept.

Books 177

destinyland writes "A year ago, Aaron Stanton harangued Google over his new project, a web site analyzing patterns in books to generate infallible recommendations. In March he finally finished a prototype which he showed to Google, Yahoo, and Amazon, and he's just announced that he's finally received a big contract which 'gives us a great deal of potential data to work with.' The 25-year-old's original prototype examined over 200 books, plotting 729,000 data points across 30,293 scenes — but its universe of analyzed novels is about to become much, much bigger."

177 comments

Just one more errosion.... (5, Insightful)

zappepcs (820751) | more than 6 years ago | (#24078365)

The difference between now and 100 years ago becomes more apparent each day. Then, owning books was a sign of affluence, of intelligence. Now? Everything is up to question, and should be. Analyzing books and other public material is just another step in putting intelligence out there for everyone, not just those that can afford it. I applaud it, and all the dangers it brings. Such hurdles are necessary, but we must assault them to overcome barriers that should no longer exist.

Re:Just one more errosion.... (5, Insightful)

Anonymous Coward | more than 6 years ago | (#24078461)

Knowledge, not intelligence.

Re:Just one more errosion.... (4, Insightful)

smitty_one_each (243267) | more than 6 years ago | (#24078889)

Or wisdom, for that matter.

Re:Just one more errosion.... (2, Insightful)

mrbluze (1034940) | more than 6 years ago | (#24079707)

Or wisdom, for that matter.

What about insight?

Re:Just one more errosion.... (0)

Anonymous Coward | more than 6 years ago | (#24078931)

Right! Intelligent people don't need to read books. They'll excel at whatever they do. Usually some kind of con artist.

Re:Just one more errosion.... (1, Interesting)

Anonymous Coward | more than 6 years ago | (#24079243)

Intelligent people don't just "read books"; they get their information from everywhere. Even in a bank line they can still get information, because they pay attention to everything; they don't have to struggle to learn something. Ask any Mensan if that's not the way they learn stuff. Sure, there are some subjects that need deep study, but for the knowledge that comes in handy in everyday tasks, observation is the way to go.

Re:Just one more errosion.... (3, Insightful)

Austerity Empowers (669817) | more than 6 years ago | (#24079589)

...they don't have to struggle to learn something...

Not even Mensa is that arrogant. If it's easy to learn and/or comes via the ether, it's probably trivial. Intelligent people have to work every bit as hard; we expect a whole lot more from them.

Re:Just one more errosion.... (2, Interesting)

Metasquares (555685) | more than 6 years ago | (#24079741)

Basically. There's no advantage to observation over learning with a focused objective, but I think the key point is that learning is an unconscious process that is primarily carried out intuitively. You can direct your attention towards a subject and think a great deal, but you can't direct your intuition - all you can do is foster an appropriate environment. I've thought of it as a sort of receptiveness for new ideas (which I think are exogenous but are learned only after personal interpretation).

I would question whether this is how all of the gifted learn, however. I know a lot of gifted people who nevertheless think they can somehow coerce themselves to learn things through sheer conscious effort, without intuition ever taking over - people who make their primary goal thinking about something vs. understanding it, if that makes any sense. If they keep drilling enough, they eventually get whatever it is they were trying to learn, but it tends to take a long time, usually leaves them exhausted, and is swiftly forgotten. Unsurprisingly, these people have all ended up building small, narrowly focused knowledge bases.

That's not to say that learning difficult material is easy. It's a struggle irrespective of intelligence, and if learning a particular topic comes easily, there's always something harder. Until you intuitively understand something, conscious thought may be your only way of comprehending it - and if you don't intuitively understand a concept, thinking about it is hard work.

But that's just my own experience.

Re:Just one more errosion.... (2, Interesting)

Anonymous Coward | more than 6 years ago | (#24078577)

"not just those that can afford it."

Shit Bud, you make it sound like it's the 1200s. Books ARE cheap. Books are just another thing to compete for your money; sometimes they win, sometimes they lose. Like with those bankrupt families that have a 50" plasma screen and a couple Navigators in the driveway. They've made their choices. Personally, I've chosen books. No need to assault anything or anybody; there are no barriers other than our own (assuming you're a white male, of course).

Say, you aren't one of those I-want-everything-given-to-me-for-free computer type people, are you? Well, if you are, fuck off.

Re:Just one more errosion.... (2, Informative)

zappepcs (820751) | more than 6 years ago | (#24078773)

I don't mean to throw stones, but books cost money. Many people can afford to be on the Internet, yet buying books has become old hat. When you can go on the Internet and get the latest information, books are ... well, a waste of money for the most part. The delay between discovery and publishing and reading is no longer tolerable, not in this throwaway society. Look at some science fiction ideals... such delays are always intolerable. I will cite an event that is not even related to show that delay is not right: Juneteenth. It took several years for emancipation news to reach Texas. Is that right? The point is that information and knowledge should be universal, and instant. The great promise of the Internet was just that. If you wish to spend your nights reading information from 2+ years ago, that is your problem. The rest of us want today's information, and now. Good luck with the personal library.

Re:Just one more errosion.... (4, Insightful)

Skreems (598317) | more than 6 years ago | (#24078833)

If you're talking about news, you're correct. But the original article is applying this to works of fiction, which still take at least a decade to go out of date (if not longer) despite the internet and the hard-on you appear to have for it. This "invention" is not about freeing information, it's basically a fancy way to mathematically calculate that if you like The Hobbit, you might also like The Lord Of The Rings. It might be beneficial to someone looking for more of the same, but it doesn't even seem to serve to further creativity since by design it will not recommend things that will expand your horizons, but will encourage people to stay with the safety of yet another rehash of something they've already read.

Re:Just one more errosion.... (3, Insightful)

Narpak (961733) | more than 6 years ago | (#24079477)

...will encourage people to stay with the safety of yet another rehash of something they've already read.

Like most people since mass printing became possible. There are many authors whose work would give you great satisfaction, but whom you will never read. Perhaps by at least giving people a good selection of tailored recommendations, the quality of that selection could improve.

The span of taste is wide and varied, more so than what any bookstore could provide (unless it is online). However, when you take things online you encounter another problem: there is a truly vast (and growing) number of books available for purchase, so trying to create a system for automated recommendation is a logical goal, even if a system like that doesn't encourage reading things outside your established field of interest. If you arrive at a point where you need something different, a good system should be able to let you browse the top sellers, best reviewed and established classics of any genre. I have no doubt that after various tries, failures and breakthroughs, and as technology improves, consumers of literature will be given a good online, digital tool for searching through databases and lists of material they might enjoy.

Re:Just one more errosion.... (2, Interesting)

Skreems (598317) | more than 6 years ago | (#24079617)

I don't get this, though. The idea of "top sellers, best reviewed, genre classics" already exists, and this invention adds nothing to it. On the other hand, the idea of finding books you should read but don't know about seems a problem particularly poorly suited to an automated solution. This is what personal recommendations absolutely excel at, because no algorithm can gauge the cultural impact of a work of art, or the level of craft involved in its making.

Re:Just one more errosion.... (3, Interesting)

ruin20 (1242396) | more than 6 years ago | (#24080217)

In most things we evolve, not leap to new horizons. I find that most of the time I choose to read a book because I like its similarities; I like the book because of its differences. Like traditional sci-fi to apocalyptic sci-fi to steampunk to biohacking to cyberspace to crypto. I never would have read Cryptonomicon if I hadn't read I, Robot, and I can say today that I have a better appreciation for one from the other.

Typically the way we learn and get good at just about everything is that we go a little bit beyond where we're comfortable and we sustain an effort there. After a while our comfort level moves. Just like if I read enough on one subject typically I'll get caught up with a tangent subject and eventually move into that.

Re:Just one more errosion.... (2, Informative)

Skreems (598317) | more than 6 years ago | (#24080291)

Yes, that's quite true. The important thing in both, though, is that they're good, while this algorithm may just as easily recommend something absolutely terrible that happens to contain a lot of the same words and phrases unless it relies heavily on human input for that elusive quality assessment.
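To make the worry about "the same words and phrases" concrete, here is a minimal sketch of a purely word-based comparison. It is not how the project in the story works (the article doesn't say); it is just bag-of-words cosine similarity, and the three snippets are made up for illustration.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its words (a plain bag-of-words model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical snippets, not real book text.
classic = "the wizard raised his staff and the dragon fled the mountain"
knockoff = "the wizard raised his staff and the dragon fled the castle"
unrelated = "quarterly revenue grew while operating costs stayed flat"

score_knockoff = cosine_similarity(word_counts(classic), word_counts(knockoff))
score_unrelated = cosine_similarity(word_counts(classic), word_counts(unrelated))
```

The knockoff scores far above the unrelated text, and nothing in the score says whether either one is any good. That is the gap a human quality signal would have to fill.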

Re:Just one more errosion.... (0)

Anonymous Coward | more than 6 years ago | (#24080301)

it doesn't even seem to serve to further creativity since by design it will not recommend things that will expand your horizons, but will encourage people to stay with the safety of yet another rehash of something they've already read.

I would think such an algorithm could be used however most appropriate for stated needs. Want to read the same sort of thing? Look for works that match best. Want to expand your horizons? Limit the variables to certain kinds of similarities. I'm assuming a level of customization that may not exist for the end-user, but who knows?

Re:Just one more errosion.... (1)

rubberchickenboy (1044950) | more than 6 years ago | (#24080323)

Weird...I was logged in but that posted as AC.

it doesn't even seem to serve to further creativity since by design it will not recommend things that will expand your horizons, but will encourage people to stay with the safety of yet another rehash of something they've already read.

I would think such an algorithm could be used however most appropriate for stated needs. Want to read the same sort of thing? Look for works that match best. Want to expand your horizons? Limit the variables to certain kinds of similarities. I'm assuming a level of customization that may not exist for the end-user, but who knows?
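A rough sketch of what "limit the variables" could look like, assuming each book has already been reduced to a handful of scores between 0 and 1. The feature names, titles, and numbers below are invented for illustration and are not taken from the actual prototype.

```python
def similarity(book_a, book_b, features):
    """1.0 minus the mean absolute difference over the chosen features."""
    if not features:
        raise ValueError("need at least one feature")
    diff = sum(abs(book_a[f] - book_b[f]) for f in features)
    return 1.0 - diff / len(features)


def recommend(liked, catalog, features):
    """Return catalog titles ordered from most to least similar to `liked`."""
    return sorted(
        catalog,
        key=lambda title: similarity(liked, catalog[title], features),
        reverse=True,
    )


# Hypothetical feature scores in the range 0..1.
liked = {"pacing": 0.8, "dialogue": 0.6, "density": 0.3, "fantasy": 0.9}
catalog = {
    "Same Again": {"pacing": 0.7, "dialogue": 0.6, "density": 0.4, "fantasy": 0.9},
    "New Territory": {"pacing": 0.8, "dialogue": 0.6, "density": 0.3, "fantasy": 0.1},
    "Slow Epic": {"pacing": 0.2, "dialogue": 0.3, "density": 0.9, "fantasy": 0.9},
}

# More of the same: compare on everything, including subject matter.
more_of_the_same = recommend(liked, catalog, ["pacing", "dialogue", "density", "fantasy"])

# Expand your horizons: match on writing style only and ignore the genre score.
new_horizons = recommend(liked, catalog, ["pacing", "dialogue", "density"])
```

With all four features the near-clone ranks first; drop the genre score and the book from a different genre with the same style moves to the top.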

Re:Just one more errosion.... (5, Insightful)

smitty_one_each (243267) | more than 6 years ago | (#24078913)

If you wish to spend your nights reading information from 2+ years ago, that is your problem. The rest of us want today's information, and now. Good luck with the personal library.

It's getting to the point that you need a 2+ year filter just to dampen the noise in the signal.
And let's give a shout out to all of the library homiez. While I'm affluent enough to afford the occasional impulse book at the store with the built-in coffee shop, I do recall many an hour of random wandering in the public library in my youth.

Re:Just one more errosion.... (1)

gregbot9000 (1293772) | more than 6 years ago | (#24080579)

It's getting to the point that you need a 2+ year filter just to dampen the noise in the signal.

Absolutely, I'd say 5 years minimum when it comes to fiction. I don't think I've read a book less than 5 years old in a decade. Unless the author has a proven track record, when someone comes telling me about how great a new book is I tend to wait until it's passed the test of time. It's the only surefire way I've found to get past all the new releases and marketing-hyped reviews: if people are still talking about it after five years, you've got a winner.

Re:Just one more errosion.... (1)

Ethanol-fueled (1125189) | more than 6 years ago | (#24079509)

Dostoevsky and Tolstoy books (or any other books, including those written by psychology's founding fathers and mothers), bought used from Amazon, are dirt-cheap and will teach you more about psychology than any single text will.

Re:Just one more errosion.... (2, Informative)

Bottlemaster (449635) | more than 6 years ago | (#24079731)

I will cite an event that is not even related to show that delay is not right: Juneteenth. It took several years for emancipation news to reach Texas. Is that right?

Actually, it took several years for Union troops to arrive in Texas and enforce emancipation. Until Texas was reconquered, the proclamation didn't apply because the state was neither a part of the United States nor under its jurisdiction. At the most, you can only claim a lag of about three months (Lee surrendered in April of that year), though some Confederate forces were still active as late as November.

Re:Just one more errosion.... (1)

LilGuy (150110) | more than 6 years ago | (#24079279)

I am. Especially when it comes to information. And I almost always get it for free.

Except when I find it incredibly valuable, in which case I pay for a hard book copy.

Re:Just one more errosion.... (5, Insightful)

blahplusplus (757119) | more than 6 years ago | (#24078581)

What really hits a nerve with me is why the scientific community hasn't opened up all their journals for others to read. I imagine many retired and amateur scientists, engineers, hobbyists, etc., would have a lot of insight into many engineering and scientific problems and also make many discoveries as well. Intelligence is not limited to the credentialed, those of high status or the currently employed; many discoveries happen simply by exposure to as many minds as possible, and finding connections and errors in others' works.

Re:Just one more errosion.... (1)

Strilanc (1077197) | more than 6 years ago | (#24078665)

While you're at it, complain about university students not making their books less expensive. Your beef is with publishers. They aren't the entire scientific community.

Re:Just one more errosion.... (4, Informative)

dnwq (910646) | more than 6 years ago | (#24078717)

The researchers publishing these papers typically don't get much more than citations - the money mostly goes to publishers like Elsevier. Blame them instead.

Re:Just one more errosion.... (4, Insightful)

Free the Cowards (1280296) | more than 6 years ago | (#24080225)

Much better to blame the researchers for not publishing in a more open medium. They're the ones who might actually change their habits, after all.

Re:Just one more errosion.... (5, Informative)

Sir Holo (531007) | more than 6 years ago | (#24078721)

blahplusplus: What really hits a nerve with me is why the scientific community hasn't opened up all their journals for others to read.

We scientists would absolutely love to have all of the journals opened up for free access to everyone. But, you see, the publishers own the copyright to our articles. The system requires us to give them the copyright, in order to get our stuff published. Then you, me, and everybody else has to pay to read recent research.

Thankfully, some established journals are going open-access.

That's very promising. But the fact remains that publishers such as Elsevier own the copyright to many decades-worth of scientific literature. And they're not about to give any of it away.

Re:Just one more errosion.... (2, Interesting)

blahplusplus (757119) | more than 6 years ago | (#24078769)

"That's very promising. But the fact remains that publishers such as Elsevier own the copyright to many decades-worth of scientific literature. And they're not about to give any of it away."

Then I submit the scientific community creates a project website to buy the rights to these works, I've come up with many ways for funding such an endeavor. The barrier would primarily be geometric (population size vs amount of money each person could donate/give/invest in such a venture) and the attitudes of the people themselves.

Re:Just one more errosion.... (2, Interesting)

Chineseyes (691744) | more than 6 years ago | (#24078921)

You do realize that doing something like this publicly would backfire in the worst ways imaginable? You would immediately increase the value of the works, and some incredibly wealthy person or corporation may just buy everything outright in the hope that you pay him/her even more money than you had originally planned.

Re:Just one more errosion.... (1, Interesting)

Anonymous Coward | more than 6 years ago | (#24079125)

All scientists publish their papers both legitimately and illegitimately through an underground site. I imagine such men have the intelligence to do so.

There are many ways to do this and no, they don't have to be legal when you're dealing with commercial tyranny.

Re:Just one more errosion.... (0)

Anonymous Coward | more than 6 years ago | (#24080845)

why not model the new open system after open-source? You publish for the sake of publishing and advancing the field, and all information is traded freely?

Is that a bug or a feature? (1)

smitty_one_each (243267) | more than 6 years ago | (#24078989)

What really hits a nerve with me is why the scientific community hasn't opened up all their journals for others to read.

Do you know what you're saying? Do you really want to release possible Weapons of Intellectual Destruction on the world?
I look at the titles in the archives of
http://www.misq.org/ [misq.org]
and I'm thinking that some of this stuff is best kept locked in the ivory tower.

Re:Is that a bug or a feature? (1)

AceofSpades19 (1107875) | more than 6 years ago | (#24080867)

Information wants to be free

Re:Just one more errosion.... (4, Insightful)

Wrath0fb0b (302444) | more than 6 years ago | (#24079343)

I wish it weren't so (and I submit all my papers to http://www.arxiv.org/ [arxiv.org] as well as to the journals), but the fact is, closed journals provide significant value both to the reader and to the submitting author. I'm not really trying to defend the system here, by the way, I'm just trying to explain what purpose it serves (and what an open alternative would have to match).

Referees and Peer-Review: Referees are invaluable because someone has to objectively assess articles for basic scientific merit and rigor. The better journals can recruit referees for each submission that truly grok the subject matter and can often work very productively with an author. Quite a number of important advances are made and pitfalls avoided because a referee insisted that a researcher cover her bases before submission. Of course, nobody claims that PR journals are bullshit-free, but they are certainly far better than un-reviewed sources like arxiv.

This function is especially important for readers in multidisciplinary fields (myself included) that often read papers on subjects in which we are not expert enough to know what constitutes sound science. When I read about some group that has extracted and crystallized some protein, I'd like to know that someone competent at the relevant techniques has scrutinized their methods because I haven't the faintest clue (I'm a physicist by training, a biophysicist by necessity).

Prestige and Selection: Another important function of the journals is to select articles by importance. If a paper makes Nature or Science, that's usually a good indicator that they've made an important advance. The benefits of this selection are twofold: first, readers can keep tabs on work at the forefront without wading through lots of papers. It sounds lazy, but most of us cannot read every paper that is published and are quite glad to outsource some filtering to the journals.

Secondly, it allows authors to demonstrate to people outside their immediate field what caliber work they've done. Even among people in the same department, it's not immediately clear what qualifies as a breakthrough work (as opposed to incremental work, which I don't trash in the least bit, but it's not really the same hat) -- prestigious journal cites are a good substitute, especially when the alternative is to either become an expert in the field or find one and ask.

Review Articles: Most journals have an in-house staff to write articles reviewing the state of a particular field/technique/whatever. This is also an invaluable service because sometimes one needs a broad, textbook-level summary instead of a large number of discrete, deep papers on a topic. Given that science is done in small, insular little bits, it's natural that there is room for someone to aggregate and summarize those bits and put them into a larger perspective.

Editing: Another thankless job (the snarky comments about the /. eds belie the fact that editing is hard work). Dupes are weeded out and researchers with poor language skills (especially when writing in an adopted language) are given help communicating their ideas. Confusing or unclear language is massaged back into form, figures are well-presented and well-labeled, text is formatted to be easy on the eyes, references are given in a standard form. These things count more than most /.ers realize (Knuth was on to something, guys...)

Access: Brutal honesty, we don't really care about the access restrictions. Every university has a license to pretty much all the major journals. We can get them from wherever with a quick login and so can everyone we know. Sorry, but that's the truth.

Re:Just one more errosion.... (4, Insightful)

Man On Pink Corner (1089867) | more than 6 years ago | (#24079465)

I dunno, man. Pretty much every point you covered is Wiki-able.

Re:Just one more errosion.... (5, Funny)

Z34107 (925136) | more than 6 years ago | (#24080421)

I dunno, man. Pretty much every point you covered is Wiki-able. [Citation needed]

Re:Just one more errosion.... (2, Insightful)

shadowofwind (1209890) | more than 6 years ago | (#24079371)

What really hits a nerve with me is why the scientific community hasn't opened up all their journals for others to read. I imagine many retired and amateur scientists, engineers, hobbyists, etc, would have a lot of insight into many engineering and scientific problems and also make many discoveries as well.

I like your spirit and agree that there's a lot of really smart, creative people who aren't scientists. However...

One of the dispiriting things about science is how specialized most subjects have gotten. If you're not an expert in a field, it's almost impossible to do anything. Even being an expert in a closely related field often isn't good enough. I don't think this is anyone's fault, it's just the natural course of development. So I think the days of amateurs accomplishing very much are behind us in a great many fields.

Another issue is that the people who fund scientists often aren't sufficiently literate to distinguish good science from phony science. Scientists are threatened professionally by people who peddle counterfeit knowledge, which has an effect on their fields similar to the effect a flood of counterfeit currency would have on an economy. So they try to protect themselves by controlling the validation of information and the supply of scientists. This legitimate desire is of course twisted by other less honorable inclinations. But there's a legitimate motivation there also.

Re:Just one more errosion.... (2, Interesting)

Sheafification (1205046) | more than 6 years ago | (#24079445)

As a member-in-training of the scientific community, I think you'll find that most scientists agree with you. Unfortunately the system right now is hard to break out of. You need to publish in a reputable journal for job evaluation and tenure purposes, but many reputable journals are under the thumb of the publishers.

In mathematics there have been several mass resignations of journal editorial boards in protest over the price. These editors usually then go on to form a brand new, cheaper journal in the same area. So some progress is being made. I can't say what has been happening in the other sciences though.

Historic records, yes. (2, Insightful)

jd (1658) | more than 6 years ago | (#24079633)

I can see exceptional value in indexing, cataloguing and processing all articles in back issues of Wireless World (when it was still called that). There is an enormous wealth of information there on how radio technology improved, when and why. There is also a fantastic amount of information on how to achieve specific effects and how to restore old technology. Not to mention a few pieces on how to build a geostationary communications satellite.

Other old journals will likewise have a lot of valuable information in them. Archaeologists discover a lot through searching their own journals, discovering lost and forgotten reports of discoveries. Mathematicians routinely publish in arcane and super-obscure journals, making what is known far more extensive than what is known to be known.

Re:Just one more errosion.... (1)

vikstar (615372) | more than 6 years ago | (#24080885)

There are a few places you can download publications for free. PubMed and CiteSeer usually have access to many papers for free download. Otherwise, sometimes authors put their own drafts/preprints on their websites.

many discoveries happen simply by exposure to as many minds as possible, and finding connections and errors in others works

Is this based on an actual study or your own conjecture?

Newspeak (4, Funny)

RDW (41497) | more than 6 years ago | (#24078421)

I love how the prototype version in the link gives a 98% match between George Orwell's '1984' and the text of the USA Patriot Act!

Re:Newspeak (4, Funny)

drinkypoo (153816) | more than 6 years ago | (#24078475)

They're still working out that last 2% margin of error.

Re:Newspeak (5, Informative)

log1385 (1199377) | more than 6 years ago | (#24078493)

From the FAQ [booklamp.org] :
"Does 1984 really match the U.S. Patriot Act?
No, that is an easter-egg. A bit of a joke on our part."

Re:Newspeak (0, Troll)

smitty_one_each (243267) | more than 6 years ago | (#24078997)

One man's joke is another /.er's pathetic little reality.

Re:Newspeak (1)

Pembers (250842) | more than 6 years ago | (#24078495)

I thought they put that in as an Easter egg... the Patriot Act isn't a novel. Though some Eastern bloc countries allegedly used 1984 as a HOWTO, or a specification of an ideal government.

Re:Newspeak (1)

DittoBox (978894) | more than 6 years ago | (#24078627)

Citations please. I'd love to know uses 1984 as a blueprint of sorts.

Re:Newspeak (1)

smitty_one_each (243267) | more than 6 years ago | (#24079005)

You need an "if he" or "who" in that second sentence.

Re:Newspeak (1)

DittoBox (978894) | more than 6 years ago | (#24080031)

Hmmm, you're right. I generally think one or two sentences ahead of what I'm typing. I generally re-read what I type before sending or posting but apparently not this last time. Cheers mate!

Re:Newspeak (1)

Hurricane78 (562437) | more than 6 years ago | (#24078675)

Since when do the US, Britain, Germany and France belong to the Eastern bloc? Or are you from a Pacific island? ;)

P.S.: If yes: Can I come too? I'd do anything to get a decent government again.

Re:Newspeak (2, Insightful)

smitty_one_each (243267) | more than 6 years ago | (#24079021)

I'd do anything to get a decent government again.

"Be thankful we're not getting all the government we're paying for." --Will Rogers

Re:Newspeak (1)

Pembers (250842) | more than 6 years ago | (#24079089)

By "Eastern bloc" I meant the Soviet Union and its satellite states in Eastern Europe.

This page [msu.edu] talks at some length about a Soviet dissident and his reactions to the novel. Basically, he found it hard to believe Orwell lived in Britain, not Russia. That doesn't (much) support my assertion that the Soviet government used the book as a blueprint (or at least, thought it had a lot of good ideas), but I did say "allegedly" :-)

Unfortunately for your hopes of moving to somewhere sane, I live in Britain.

a tool (0)

Anonymous Coward | more than 6 years ago | (#24078423)

when i come to think of it, building a tool that rates the level of novelty in an idea would be good one. it would make the job of large companies or opportunity hunters easier as well..

cheers,
mbilgi

Re:a tool (3, Insightful)

BootNinja (743040) | more than 6 years ago | (#24078623)

might be a good tool to help the USPTO with their backlog.

Solomon replies (0, Offtopic)

smitty_one_each (243267) | more than 6 years ago | (#24079039)

Is there any thing whereof it may be said, See, this is new? it hath been already of old time, which was before us.

Ecclesiastes 1:10

Finally! (0)

Anonymous Coward | more than 6 years ago | (#24078433)

I, for one, welcome our new book analyzing overlords!

It already exists. (1, Flamebait)

Ironchew (1069966) | more than 6 years ago | (#24078439)

The only infallible book recommendation has existed for nearly 2000 years now.

Don't want to read it? Heretic! But "translations" do exist for public convenience.

Re:It already exists. (0, Troll)

tomhudson (43916) | more than 6 years ago | (#24078483)

The only infallible book recommendation has existed for nearly 2000 years now.

I call bullshit. Books didn't exist 2000 years ago, you ignorant clod!

Re:It already exists. (3, Funny)

smitty_one_each (243267) | more than 6 years ago | (#24079051)

Well ex-scrolls me, you codex-fancying fascist! ;)

Re:It already exists. (1)

tomhudson (43916) | more than 6 years ago | (#24079293)

Actually, the honour of the first printed scrolls goes to the Chinese. Examples of scrolls printed using movable type (wood cuts of Chinese ideographs) date to the 600s. The oldest known book is also Chinese, from 868. [wikipedia.org]

They also invented toilet paper 1500 years ago ...

Re:It already exists. (0)

Anonymous Coward | more than 6 years ago | (#24078517)

What, the Bible? Not nearly 2000 years old. And if you're making a joke about its "infallibility", only a minority of unfortunately outspoken Christians actually believe that.

If you already read, you don't need this... (5, Insightful)

thereofone (1287878) | more than 6 years ago | (#24078457)

...and if you do not read, you won't want this.

Re:If you already read, you don't need this... (3, Funny)

thereofone (1287878) | more than 6 years ago | (#24078489)

Also, now that I've played with the "beta" a little I want to see the graphs for Finnegans Wake.

Re:If you already read, you don't need this... (1)

thomas.galvin (551471) | more than 6 years ago | (#24080269)

I don't know about that. A lot of my books are like television is to other people: simple entertainment. There are times when I want to have my horizons expanded, or to learn something new and nifty. But there are other times when I just want to forget about everything that happened at the office today, and when I do, it's kind of amazing how often I pick up a book that involves a guy with a staff blowing things to smithereens. And if there's a tool that will point me to even more guys with staves blowing even more things to smithereens, I say bring it on. Sometimes, I just want something comfortable. And explosive.

I'll believe it when I see it (4, Insightful)

clarkkent09 (1104833) | more than 6 years ago | (#24078469)

I am skeptical that analyzing the content of the books can lead to good recommendations, let alone "infallible". Two books can be very similar in subject matter and writing style and yet one can be great and the other one awful. The difference is just too subtle for an algorithm to figure out, though I hope I am wrong and it turns out that it works; it would be very useful. Same applies to movies and music as well. I always found the "Customers who purchased this book also purchased..." section on Amazon to be more valuable than my personalized recommendations.

Re:I'll believe it when I see it (1)

drinkypoo (153816) | more than 6 years ago | (#24078487)

You could save a lot of time by analyzing only the final chapter. I find that most books I pick up are okay until the end, at which point they make me wish I could go back in time and gouge my eyes out with crab forks to prevent myself from ever picking that piece of trash up (works like that not often being translated into braille.)

Re:I'll believe it when I see it (0)

Anonymous Coward | more than 6 years ago | (#24079055)

Then stop reading Neal Stephenson books. Every time I read one of his books I swear to all my friends that, for example, Diamond Age is the best book ever only to recant at the end.

Re:I'll believe it when I see it (3, Insightful)

wfstanle (1188751) | more than 6 years ago | (#24078597)

I wholeheartedly agree! Take for example two phrases which are equivalent...

"Eighty seven years ago our ancestors ..."

and

"Four score and seven years ago our forefathers ..."

They say the same thing but what a difference in eloquence.

Re:I'll believe it when I see it (0)

Anonymous Coward | more than 6 years ago | (#24078753)

Yea, and it's impossible for a computer algorithm to identify the difference between them, because manputers only read in the saying, and not the actual letters or sentences.

Re:I'll believe it when I see it (1)

wik (10258) | more than 6 years ago | (#24079317)

The difference, of course, is rooted purely in the awkwardness of the speaker.

Re:I'll believe it when I see it (0, Troll)

3waygeek (58990) | more than 6 years ago | (#24078707)

I am skeptical that analyzing the content of the books can lead to good recommendations, let alone "infallible".

You're obviously not Catholic [wikipedia.org] .

Re:I'll believe it when I see it (0)

Anonymous Coward | more than 6 years ago | (#24078953)

As somebody who was raised Catholic, from my experience, you'll be hard-pressed to find Catholics that truly believe in Papal Infallibility as well, excluding missionaries.

Re:I'll believe it when I see it (3, Interesting)

martin-boundary (547041) | more than 6 years ago | (#24078763)

It always depends on which part of the statistical landscape the algorithm is good at modelling.

It may be that what makes a book great is hard to identify, but what makes a book really bad is much easier to identify. In that case, such an algorithm won't help with recommending high quality works for you to read, but it could be very useful in saving you from wasting your time with obviously bad books (ie it would help with initial triage).

Remember, there are a lot more bad books than good books, so if you had to go through all the books to find the good ones, then you'd spend most of your time just looking at bad books and rejecting them.

Re:I'll believe it when I see it (1)

wik (10258) | more than 6 years ago | (#24079299)

Question one: How does it rate Hemingway?
Question two: How does it differentiate between Hemingway and Imitation Hemingway [wikipedia.org] ?

Re:I'll believe it when I see it (1)

martin-boundary (547041) | more than 6 years ago | (#24079645)

I wouldn't know, but seeing as there's a beta [booklamp.org] referenced in TFA, maybe you should try it?

Re:I'll believe it when I see it (1)

chromatic (9471) | more than 6 years ago | (#24079309)

[What] makes a book really bad is much easier to identify.

Is it part of a series of more than three books? If yes, it's probably bad. Is the lead character a Mary Sue? If yes, it's probably bad. Is the lead character an irresistible vampire, were-wolf, were-tiger, were-lemur, were-panda, were-hippo with magic powers? If yes, you should have known already not to read the Anita Blake books.

Re:I'll believe it when I see it (1)

MrHanky (141717) | more than 6 years ago | (#24078797)

It's probably no less efficient than analysing email to check for spam. If you're interested in ch34p C0rel S0ftw4re, you may also have an interest in v1agra and rep1ika r0lex watches.

Re:I'll believe it when I see it (1)

Estanislao Martínez (203477) | more than 6 years ago | (#24078829)

I am skeptical that analyzing the content of the books can lead to good recommendations, let alone "infallible". Two books can be very similar in subject matter and writing style and yet one can be great and the other one awful.

Who uses the algorithm makes a big difference. From the way you frame the problem, you're looking for "good" books, i.e., books you'll enjoy reading. But think of somebody doing academic research or looking for patent prior art--for them, one important task is to find all relevant references on a topic, good or bad, and sifting through them.

Re:I'll believe it when I see it (1)

Rocketship Underpant (804162) | more than 6 years ago | (#24079017)

I agree 100%. I suspect a more useful data mining system would use book *reviews*, mined from Amazon and all the other sites that post them. In addition to providing an overall barometer for quality, it could identify reviewers whose tastes run similar to your own, and use that as a starting point for recommendations.
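
A minimal sketch of that reviewer-matching idea, in Python. Everything here is an assumption for illustration: the `ratings` data is made up, and the `similarity` and `recommend` helpers are hypothetical names, not anything Amazon or BookLamp exposes. It scores reviewers by how closely their star ratings track yours, then suggests books that like-minded reviewers rated highly.

```python
from math import sqrt

# Hypothetical review data: reviewer -> {book title: star rating (1-5)}.
ratings = {
    "you":   {"Book A": 5, "Book B": 1, "Book C": 4},
    "alice": {"Book A": 5, "Book B": 2, "Book C": 5, "Book D": 5},
    "bob":   {"Book A": 1, "Book B": 5, "Book C": 2, "Book E": 5},
}

NEUTRAL = 3  # ratings are centered on 3 stars so dislikes count as negative


def similarity(a, b):
    """Cosine similarity over the books both reviewers rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum((a[t] - NEUTRAL) * (b[t] - NEUTRAL) for t in shared)
    norm_a = sqrt(sum((a[t] - NEUTRAL) ** 2 for t in shared))
    norm_b = sqrt(sum((b[t] - NEUTRAL) ** 2 for t in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank unread books by the weighted opinion of similar reviewers."""
    mine = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        weight = similarity(mine, theirs)
        if weight <= 0:
            continue  # ignore reviewers whose tastes run opposite to yours
        for title, stars in theirs.items():
            if title not in mine:
                scores[title] = scores.get(title, 0.0) + weight * (stars - NEUTRAL)
    liked = [title for title in scores if scores[title] > 0]
    return sorted(liked, key=scores.get, reverse=True)


print(recommend("you", ratings))
```

With this toy data, "alice" agrees with "you" and "bob" does not, so only alice's unread favourite gets suggested. A real system would need far more reviews per person before the similarity scores meant much.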

Re:I'll believe it when I see it (2, Insightful)

DriedClexler (814907) | more than 6 years ago | (#24079755)

Exactly ... which is why I read the summary as "Fast-talking kid talks fools out of their money."

Re:I'll believe it when I see it (1)

mepath (1320883) | more than 6 years ago | (#24080325)

I agree with you. What is the recommendation based on? Is it based on the content of the book, style of the book, voice of the book, message of the book, author, what readers also liked, etc.? There are tons of things to consider when recommending a book. If there ever is a finalized product, I hope it's far better than the "Customers who purchased this book also purchased..." feature of Amazon.

Yet Another Pointless Dot-Com (4, Insightful)

techno-vampire (666512) | more than 6 years ago | (#24078505)

This is just another pointless project that's going to waste the time and skull-sweat of a good but unrealistic programmer. All he's going to have when he's done is the solution to a problem that doesn't, for all practical purposes, exist. Good writers won't need it because they know what to do and how to do it, so they won't use it. It will only be used by poor writers, who won't know how to put the suggestions into effect properly. It may, possibly, tell a writer where their book needs work, or where it's not interesting enough, but I doubt it. Most likely, all it will do is tell it where it's not like other successful books because it won't be able to recognize or take into account any originality. Even if its recommendations are right, a poor writer is highly unlikely to profit from them, because by definition a poor writer won't know which suggestions are good or the skills to take advantage of them properly. No, what a poor writer who wants to get better needs is either a good critique group or some friends who will act as beta-readers, telling him not only what doesn't work but why (Something, I might add, that I find it hard to believe this program could ever do.) and discuss things with the author until they understand each other. Mechanical criticism of literature can only result in mechanical literature, not good writing.

Re:Yet Another Pointless Dot-Com (-1, Redundant)

Anonymous Coward | more than 6 years ago | (#24078535)

This is just another pointless project that's going to waste the time and skull-sweat of a good but unrealistic programmer. All he's going to have when he's done is the solution to a problem that doesn't, for all practical purposes, exist. Good writers won't need it because they know what to do and how to do it, so they won't use it. It will only be used by poor writers, who won't know how to put the suggestions into effect properly. It may, possibly, tell a writer where their book needs work, or where it's not interesting enough, but I doubt it. Most likely, all it will do is tell it where it's not like other successful books because it won't be able to recognize or take into account any originality. Even if its recommendations are right, a poor writer is highly unlikely to profit from them, because by definition a poor writer won't know which suggestions are good or the skills to take advantage of them properly. No, what a poor writer who wants to get better needs is either a good critique group or some friends who will act as beta-readers, telling him not only what doesn't work but why (Something, I might add, that I find it hard to believe this program could ever do.) and discuss things with the author until they understand each other. Mechanical criticism of literature can only result in mechanical literature, not good writing.

Wow. You entirely missed the point of Aaron's work, and complained that something he is not doing is a bad idea.

Re:Yet Another Pointless Dot-Com (0)

thereofone (1287878) | more than 6 years ago | (#24078537)

I don't think you read the summary, much less TFA.

Re:Yet Another Pointless Dot-Com (1)

techno-vampire (666512) | more than 6 years ago | (#24078585)

As a matter of fact, I did. I read both. There was very little substance in the article, at least about the developer's idea. However, I do find it hard to believe that you can mechanize the study of good writing and come up with anything original.

Re:Yet Another Pointless Dot-Com (3, Insightful)

Skreems (598317) | more than 6 years ago | (#24078905)

Wasn't the "improve your writing" aspect an earlier project? I got the impression that the original project was as you described, but the new thing he's trying to partner with Google on is to take the same basic system and use it to recommend "similar to this book" type things in a storefront setting.

Of course all the problems you listed apply anyway. It's very easy to have a work with all the same pieces as a great work of art, but assembled in such a way that the derivative work is completely unsatisfying.

A great example of something similar that you can try today and watch as it fails miserably is Pandora.com. They categorize music by a number of different elements so they can recommend similar pieces. And their categorization is quite accurate; they correctly surmised after about five minutes that I enjoy symphonic rock with a mix of acoustic and electric guitars, obscure lyrics, complex themes, unusual rhythm patterns and interesting chord changes. They then proceeded to present me with one after another shitty Coldplay or Radiohead rip-off band who had every element down perfectly, but still managed to make amazingly bad music. I much expect this product to be the same thing but with books.

"If you like A Deepness In The Sky, why not try An Ewok Christmas: The Novel! They're both about humans meeting strange aliens, and spaceships, and computers, and why are you giving me the finger?"

Re:Yet Another Pointless Dot-Com (1)

ryen (684684) | more than 6 years ago | (#24080341)

If you've RTFA you'd notice that the expected application is for recommendations to potential buyers of books. But I'm skeptical on that front too...

algorithm bombing (4, Insightful)

notgm (1069012) | more than 6 years ago | (#24078547)

How long before someone figures out how to fool the algorithm, and we all start reading books about enlarging our genitalia, but in a classy way?

It already happened (2, Informative)

themushroom (197365) | more than 6 years ago | (#24078559)

...considering the quantity of "classic" tripe that I had to read in high school and college. Who needs an algorithm when you have English teachers who follow flawed formulae?

thhhpt! (2, Interesting)

themushroom (197365) | more than 6 years ago | (#24079559)

Whoever modded this to 'troll' never took the English classes I had. Yo.

A different book (1, Insightful)

Hanyin (1301045) | more than 6 years ago | (#24078713)

What I wonder is: What happens if there's a different style of writing that's not accounted for? I hope they're not just marked down. What will it consider a good book: what's truly interesting and insightful, or books that are made to sell, like The Da Vinci Code? I can see how much easier it would be to identify which books will sell well, but I fear that this will be its only use, and the less said about doing the same for movies the better.

Smart computer (1)

Trogre (513942) | more than 6 years ago | (#24078791)

This computer should do fine, assimilating every book ever written. We'll just need to hire someone to periodically delete every Agatha Christie novel from its database.

Re:Smart computer (1)

smitty_one_each (243267) | more than 6 years ago | (#24079083)

So, you'd punt Agatha and leave in Danielle Steel?

Re:Smart computer (2, Funny)

techno-vampire (666512) | more than 6 years ago | (#24080055)

Of course you need to leave in Danielle Steel. Harold Robbins too. After all, you can't neglect the classics, can you?

This is The U.S. On Drugs (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#24078885)

Only Cops and Crooks Have Benefited From $2.5 Trillion Spent Fighting Trafficking.

The United States' so-called war on drugs brings to mind the old saying that if you find yourself trapped in a deep hole, stop digging. Yet, last week, the Senate approved an aid package to combat drug trafficking in Mexico and Central America, with a record $400 million going to Mexico and $65 million to Central America.

The United States has been spending $69 billion a year worldwide for the last 40 years, for a total of $2.5 trillion, on drug prohibition -- with little to show for it. Is anyone actually benefiting from this war? Six groups come to mind.

The first group are the drug lords in nations such as Colombia, Afghanistan and Mexico, as well as those in the United States. They are making billions of dollars every year -- tax free.

The second group are the street gangs that infest many of our cities and neighborhoods, whose main source of income is the sale of illegal drugs.

Third are those people in government who are paid well to fight the first two groups. Their powers and bureaucratic fiefdoms grow larger with each tax dollar spent to fund this massive program that has been proved not to work.

Fourth are the politicians who get elected and reelected by talking tough -- not smart, just tough -- about drugs and crime. But the tougher we get in prosecuting nonviolent drug crimes, the softer we get in the prosecution of everything else because of the limited resources to fund the criminal justice system.

The fifth group are people who make money from increased crime. They include those who build prisons and those who staff them. The prison guards union is one of the strongest lobbying groups in California today, and its ranks continue to grow.

And last are the terrorist groups worldwide that are principally financed by the sale of illegal drugs.

Who are the losers in this war? Literally everyone else, especially our children.

Today, there are more drugs on our streets at cheaper prices than ever before. There are more than 1.2 million people behind bars in the U.S., and a large percentage of them for nonviolent drug usage. Under our failed drug policy, it is easier for young people to obtain illegal drugs than a six-pack of beer. Why? Because the sellers of illegal drugs don't ask kids for IDs. As soon as we outlaw a substance, we abandon our ability to regulate and control the marketing of that substance.

After we came to our senses and repealed alcohol prohibition, homicides dropped by 60% and continued to decline until World War II. Today's murder rates would likely again plummet if we ended drug prohibition.

So what is the answer? Start by removing criminal penalties for marijuana, just as we did for alcohol. If we were to do this, according to state budget figures, California alone would save more than $1 billion annually, which we now spend in a futile effort to eradicate marijuana use and to jail nonviolent users. Is it any wonder that marijuana has become the largest cash crop in California?

We could generate billions of dollars by taxing the stuff, just as we do with tobacco and alcohol.

We should also reclassify most Schedule I drugs ( drugs that the federal government alleges have no medicinal value, including marijuana and heroin ) as Schedule II drugs ( which require a prescription ), with the government regulating their production, overseeing their potency, controlling their distribution and allowing licensed professionals ( physicians, psychiatrists, psychologists, etc. ) to prescribe them. This course of action would acknowledge that medical issues, such as drug addiction, are best left under the supervision of medical doctors instead of police officers.

The mission of the criminal justice system should always be to protect us from one another and not from ourselves. That means that drug users who drive a motor vehicle or commit other crimes while under the influence of these drugs would continue to be held criminally responsible for their actions, with strict penalties. But that said, the system should not be used to protect us from ourselves.

Ending drug prohibition, taxing and regulating drugs and spending tax dollars to treat addiction and dependency are the approaches that many of the world's industrialized countries are taking. Those approaches are ones that work.

http://www.mapinc.org/norml/v08/n647/a02.htm [mapinc.org]

An eHarmony.com matching books and people? (1)

littlewink (996298) | more than 6 years ago | (#24079151)

You tell it what books you like and it finds other books that are similarly structured?

What's the big deal, except that Google has Google books?

Anyone could do this. There's plenty of narrative analysis software: the government's outpouring of our tax dollars to "protect us" since 9/11 has triggered every halfwit software development firm in the country to develop

  • network relationship analysis software and
  • text analysis software

and sell them to the local militia.

Who is Joe? (4, Interesting)

mustafap (452510) | more than 6 years ago | (#24079269)

There is one persistent son of a bitch on their forum, Joe, who seems to be their nemesis. I wonder what his angle is.

Other than that, I like their approach - involve the community *really* early on.

Apart from Joe.

Copyright Infringement detection anyone?? (2, Interesting)

Anonymous Coward | more than 6 years ago | (#24079325)

His prototype sounds in a way like Netflix's suggestion system for movies, where you vote your favorites and it'll suggest other ones based on your liking. But books are much more complicated, so I can see how his detailed analysis tool can really be the ultimate suggestion tool. I wonder if people will use this to discover copyright infringement on a new level. Hmm... my book and your book are a 99.5% match. Gee where did the .5% discrepancy occur. My character is a 19 yr old hobo, so is yours. My story is about him eventually becoming a successful company executive by pimping himself out to different high-powered women. My character's name is Matt, yours is Mike. Aha.

MAR&E (-1, Flamebait)

Anonymous Coward | more than 6 years ago | (#24079425)

balance is struck, playing so it's users. Surprise would like to FreeBSD at about 80 chronic abuse of areL incompatible ITS CORPSE TURNED another folder. 20 my calling. Now I much organisation, turd-suckingly Were compounded Long term survival [amazingkreskin.com] end, we need you I know it sux0rs, to yet another Raymond in his to the original a previously parties, but here and enjoy all the Architecture. My HEAD SPINNING Why not? It's quick bben many, not the You should bring THE LATEST NETCRAFT To have regular politics openly. she had no fear

Need more books? (1)

vrmlguy (120854) | more than 6 years ago | (#24079739)

How about Project Gutenberg [gutenberg.org] ? They've got lots of books that have already been scanned.
