Google Looks To Convert Print Pubs Into E-Articles

timothy posted more than 4 years ago | from the one-at-a-time dept.

The Media 42

bizwriter writes "A patent application by Google (GOOG), filed in August 2008 and made public last week, shows that the company is trying to automate the process of splitting printed magazines and newspapers into individual articles that it could then deliver separately. Although this could allow Google to convert stacks of periodicals into electronic archives, it potentially sends the company headlong into conflict with a famous Supreme Court ruling on media law."

"Reprinted by permission" (3, Interesting)

LostCluster (625375) | more than 4 years ago | (#31276152)

Most magazines are glad to sell their content from back issues for money. So, if Google gets permission from the publisher, and then charges for back magazine items in the same way they have a paid-for newspaper archive search... is that really headed for the Supreme Court?

Re:"Reprinted by permission" (2, Interesting)

perlchild (582235) | more than 4 years ago | (#31276204)

Most magazines wouldn't be ok with an automated process because it wouldn't let them charge extra for some issues.

I'm not saying google intends to do this, but I doubt sports illustrated would let their swimsuit issue go for the same price as the rest.

Re:"Reprinted by permission" (3, Funny)

Overzeetop (214511) | more than 4 years ago | (#31276226)

I'm not saying google intends to do this, but I doubt sports illustrated would let their swimsuit issue go for the same price as the rest.

Yeah, but you don't really read the swimsuit edition for "the articles."

Re:"Reprinted by permission" (2, Insightful)

perlchild (582235) | more than 4 years ago | (#31276270)

I was just using an example that really stood out. Most magazines have one issue a year that really sells, because of just one article that outdoes their competitors. The SI example is recurrent every year, most other magazines aren't so regular.

Re:"Reprinted by permission" (2, Informative)

LostCluster (625375) | more than 4 years ago | (#31276306)

SI's Swimsuit Issue is not a run-of-the-mill issue of the magazine... and sometimes when sports issues warrant they'll even publish a normal SI on the same day. But, like special issues of Time and Consumer Reports... those don't have to even go to subscribers if they don't want them to. Easy to exclude such things, or include them if Google really wants them, in the eventual contract.

Re:"Reprinted by permission" (3, Funny)

edumacator (910819) | more than 4 years ago | (#31280180)

No kidding...those pictures are ... degrading to women. If it wasn't for the articles, I wouldn't pick up Sports Illustrated Swimsuit issue. Are you one of those perverts that just buys it for the pictures? You and your ilk disgust me.

.
.

Whew...that was close. She's gone now, but my wife was standing over my shoulder. Those girls are hot!

Re:"Reprinted by permission" (3, Insightful)

DragonWriter (970822) | more than 4 years ago | (#31276352)

Most magazines wouldn't be ok with an automated process because it wouldn't let them charge extra for some issues.

An automated conversion process has no effect on what can be charged for individual portions of the results, it just streamlines the process of getting material into a form where it can be distributed online, separated by (and, potentially, priced differently by) article, which is even more specific than particular issue.

Now, certainly, Google would probably like to get everything from everyone and pay and charge nothing for it, making money by serving targeted ads alongside the content. But that's not the only thing they could do with the technology, and patenting the technology (even if one assumes that they intend to deploy it at all) doesn't tell you anything about how they plan to deploy it.

Re:"Reprinted by permission" (2, Informative)

LostCluster (625375) | more than 4 years ago | (#31276372)

Did you read my original post? Google has a paywall for old newspaper content, they could easily erect one for old magazine content if needed.

Re:"Reprinted by permission" (1)

MobyDisk (75490) | more than 4 years ago | (#31276978)

If the magazines want variable pricing, then I see no reason they couldn't negotiate that with Google.

Re:"Reprinted by permission" (1)

Modern Primate (1503803) | more than 4 years ago | (#31324572)

Most magazines wouldn't be ok with an automated process because it wouldn't let them charge extra for some issues.

I'm not saying google intends to do this, but I doubt sports illustrated would let their swimsuit issue go for the same price as the rest.

Why would an automated process necessitate uniform pricing for everything? They could easily set it up so that if the OCR reads "Swimsuit Issue" on the cover, the "articles" are tagged differently and a different price is charged.

Re:"Reprinted by permission" (1)

perlchild (582235) | more than 4 years ago | (#31324640)

As I've said, the swimsuit issue is a rarity.

In fact, the behaviour of most media executives is that they want to set the price retroactively, based on popularity.

Re:"Reprinted by permission" (1)

belmolis (702863) | more than 4 years ago | (#31277058)

As the article says, the problem is not so much with the publishers as with the copyrights of the authors.

capability does not imply intention (2, Interesting)

AliasMarlowe (1042386) | more than 4 years ago | (#31276180)

The patent application merely shows they know how to do such a thing. It does not mean that they plan to do so. Google has many unimplemented patents.
Maybe they will, and maybe they won't. But anyone who does will have to factor Google's patent application into their economic reckoning.

Re:capability does not imply intention (1)

Aeros (668253) | more than 4 years ago | (#31277590)

No doubt. It makes me laugh at how people jump to conclusions on here so quickly. "The sky is falling"!!! It's nice they have this cool feature but after the book lawsuit they have against them I am sure they are going to tread into this area (if they decide to) a little more carefully.

Google has quite a history (-1, Troll)

taustin (171655) | more than 4 years ago | (#31276252)

it potentially sends the company headlong into conflict with a famous Supreme Court ruling on media law.

They've already proved with the blatently illegal settlement on the book scanning deal that the law doesn't apply to them.

Re:Google has quite a history (1)

sopssa (1498795) | more than 4 years ago | (#31276298)

it potentially sends the company headlong into conflict with a famous Supreme Court ruling on media law.

They've already proved with the blatently illegal settlement on the book scanning deal that the law doesn't apply to them.

What is that famous ruling anyway? That sentence just calls for a link.

Re:Google has quite a history (2, Informative)

eldavojohn (898314) | more than 4 years ago | (#31276344)

it potentially sends the company headlong into conflict with a famous Supreme Court ruling on media law.

They've already proved with the blatently illegal settlement on the book scanning deal that the law doesn't apply to them.

What is that famous ruling anyway? That sentence just calls for a link.

It's right there in the article:

There’s just one legal problem: New York Times Co. , et. al. v. Jonathan Tasini et. al. [cornell.edu] Usually called the Tasini case, freelance writers sued the New York Times and other print publications for licensing individual articles to database companies without permission from the writers, who retained the copyright on the articles. One of the main turning points was that the publishers had explicit permission only to include the articles in the print publication. However, copyright law did not allow the publishers to break their publications up and make the articles accessible to readers out of the original context.

Re:Google has quite a history (0)

Anonymous Coward | more than 4 years ago | (#31276620)

It's right there in the article

It almost sounds like you expected anyone to RTFA...

Re:Google has quite a history (1)

Whalou (721698) | more than 4 years ago | (#31276350)

From TFA:
http://www.law.cornell.edu/supct/pdf/00-201P.ZO [cornell.edu]

Usually called the Tasini case, freelance writers sued the New York Times and other print publications for licensing individual articles to database companies without permission from the writers, who retained the copyright on the articles.

Re:Google has quite a history (2, Insightful)

LostCluster (625375) | more than 4 years ago | (#31276406)

There aren't as many orphan magazines as there are orphan books.

Re:Google has quite a history (1)

icebike (68054) | more than 4 years ago | (#31276542)

"Blatantly Illegal" (you are welcome for the spelling correction) is a matter for the court to decide. Courts have approved the settlement.

So what was your problem? Did they fail to ask for YOUR approval?

Which ruling? (0)

mcgrew (92797) | more than 4 years ago | (#31276310)

it potentially sends the company headlong into conflict with a famous Supreme Court ruling on media law."

Can someone link please? I'm not a legal scholar. Which law, and how did they rule?

Re:Which ruling? (4, Informative)

eldavojohn (898314) | more than 4 years ago | (#31276382)

Seriously, folks, it's in the article:

There’s just one legal problem: New York Times Co. , et. al. v. Jonathan Tasini et. al. [cornell.edu] Usually called the Tasini case, freelance writers sued the New York Times and other print publications for licensing individual articles to database companies without permission from the writers, who retained the copyright on the articles. One of the main turning points was that the publishers had explicit permission only to include the articles in the print publication. However, copyright law did not allow the publishers to break their publications up and make the articles accessible to readers out of the original context.

Obligatory Wikipedia link [wikipedia.org] .

Re:Which ruling? (2, Funny)

FlyingBishop (1293238) | more than 4 years ago | (#31276686)

Wait, what's this article thing you're talking about? I thought this was Slashdot.

Re:Which ruling? (1)

mcgrew (92797) | more than 4 years ago | (#31276754)

Thank you. Now, will someone please mod the parent "informative" and my GP comment "overrated?" Thanking the mods in advance.

Re:Which ruling? (4, Funny)

eldavojohn (898314) | more than 4 years ago | (#31276840)

Thank you. Now, will someone please mod the parent "informative" and my GP comment "overrated?" Thanking the mods in advance.

Understanding, thanks, salutations ... delivered on Slashdot? With cordiality? Scanning for sarcasm, hatred, malice, discontent ... clean?! Taking full claim of responsibility? Strange new feelings welling up inside me. Double checking URL ... still Slashdot! No memes? Bizarre. How to appropriately respond?

"Uh ... it's been a pleasure doing business with you?"

you're doin it wrong (1)

commodoresloat (172735) | more than 4 years ago | (#31277376)

How to appropriately respond?

"Uh ... it's been a pleasure doing business with you?"

No, no; it's like this:

A+++++ comment, would read again!

Re:Which ruling? (1)

Hurricane78 (562437) | more than 4 years ago | (#31281512)

He just hates himself. ;)

Re:Which ruling? (1)

DingoGroton (885094) | more than 4 years ago | (#31276442)

From the article:

New York Times Co. , et. al. v. Jonathan Tasini et. al. [cornell.edu] Usually called the Tasini case, freelance writers sued the New York Times and other print publications for licensing individual articles to database companies without permission from the writers, who retained the copyright on the articles. One of the main turning points was that the publishers had explicit permission only to include the articles in the print publication. However, copyright law did not allow the publishers to break their publications up and make the articles accessible to readers out of the original context.

I'll have half a pint (1, Informative)

Anonymous Coward | more than 4 years ago | (#31276486)

In the UK, Australia and NZ, "pubs" are what americans call bars.

Re:I'll have half a pint (1)

Monkeedude1212 (1560403) | more than 4 years ago | (#31277210)

We have pubs in Canada too. We also have bars. And clubs

They are different though. A pub is one of those places you go down to drink and have a good time with your friends. You usually end up buying a big platter of Appetizers, sitting chatting and getting drunk together.

A club is the opposite of a pub, in that you expect to do no sitting whatsoever. You pay a ridiculously high cover charge, have to be dressed nice, and pretty much go there to dance while drinking. There will at most be 5 tables in separate corners. It really is just a place to dance and have chicks grind up against you.

A Bar is kind of a mix between the two. There will be a venue for a live music group usually - and a small area for dancing should the live band encourage you to do so. Otherwise, there is a seating area for you to drink and listen to music.

The more you know!

Re:I'll have half a pint (0)

Anonymous Coward | more than 4 years ago | (#31292704)

Nearly the same. Ours (in the UK) serve actual beer.

Leaping to conclusions (4, Insightful)

icebike (68054) | more than 4 years ago | (#31276492)

Both TFA and the summary leap to the conclusion that GOOGLE would run afoul of a law relating to current publications without even hinting at the utterly vast archives of newspapers molding in public libraries or on microfilm that can't be accessed conveniently if at all.

Many worry about the loss of historical content due to so much of our modern media being released only in digital form. [theregister.co.uk]

Yet there is a huge wealth of old newspapers, scientific journals, and popular press magazines that could be salvaged with this technology.

It's odd that when envisioning futuristic civilizations we almost always expect all of their literary history to be contained in computers accessible from everywhere. Yet when someone develops the tools to do just that there is a huge outcry from those that posture as defenders of IP rights.

Re:Leaping to conclusions (2, Interesting)

Anonymous Coward | more than 4 years ago | (#31276848)

Both TFA and the summary leap to the conclusion that GOOGLE would run afoul of a law relating to current publications without even hinting at the utterly vast archives of newspapers molding in public libraries or on microfilm that can't be accessed conveniently if at all.

That was pretty much exactly what I was going to say. There's a huge leap to nefarious conclusions here - this kind of technology would be awesome for getting old magazines and newspapers, a huge amount of which are out of copyright altogether, preserved.

Google's "don't be evil" motto may be laughable, but the leaping to conclusions about their nefarious attempts to preserve history that is rotting away as we speak is even more hilarious.

Re:Leaping to conclusions (1)

TubeSteak (669689) | more than 4 years ago | (#31280132)

It's odd that when envisioning futuristic civilizations we almost always expect all of their literary history to be contained in computers accessible from everywhere. Yet when someone develops the tools to do just that there is a huge outcry from those that posture as defenders of IP rights.

There is an outcry because current IP rights don't allow for content to be "salvaged with this technology"
The solution is to go to Congress and have the law changed, not to run roughshod over the rights of others and then present a fait accompli.
I know it's easier to ask forgiveness than permission, but that isn't how our legal system works.

Re:Leaping to conclusions (1)

icebike (68054) | more than 4 years ago | (#31280166)

Wouldn't it be prudent to actually wait till there was an actual violation of someone's IP rights before starting with the crocodile tears?

Googling getting questionable (2, Informative)

dave562 (969951) | more than 4 years ago | (#31276928)

The summary makes it sound like Google is trying to do yet another end run around actually paying publishers to access their content. Every single major publisher out there already has their article content in an advertisement free format. They have templates that they copy the content (and advertisements) into when it comes time to print. If Google wants the content, they can pay the publishers for it. They don't need to reverse engineer the final printing. They need to stop being cheap and pay content creators.

Re:Googling getting questionable (3, Insightful)

belmolis (702863) | more than 4 years ago | (#31277120)

It's only recent material for which this is true. Google appears to be interested in older material, for which the publishers generally do not have split out versions, or, for that matter, in many cases, any electronic version at all.

Re:Googling getting questionable (0)

Anonymous Coward | more than 4 years ago | (#31277366)

Indeed, Google is trying to do yet another end run around actually paying writers for the access to the content.

Re:Googling getting questionable (0)

Anonymous Coward | more than 4 years ago | (#31281150)

Yep, and this is old technology. HP did Time Magazine and the MIT Press collections, Olive Software does this as a matter of course, the technology is prior art going back 20 years, DFKI has had technology to do this for a decade. If this patent gets awarded it will either be yet another mis-granted patent or irrelevant (i.e. easily worked around). Let's hope that the examiner is awake this time.

How will they do it? (1)

ZERO1ZERO (948669) | more than 4 years ago | (#31277900)

Complex printed media material, such as a newspaper, often involve columns of body text, headlines, graphic images, multiple font sizes, comprising multiple articles and logical elements in close proximity to each other, on a single page. Attempts to utilize optical character recognition in such situations are typically inadequate resulting in a wide range of multiple errors, including, for example, the inability to properly associate text from multiple columns as being from the same article, mis-associating text areas without an associated headline or those articles which cross page boundaries, and classifying large headline fonts as a graphic image.

the link in the article points to Parallel, Side-Effect Based DNS Pre-Caching and not Segmenting Printed Media Pages Into Articles ??

Anyway, take a scanned page of a magazine or newspaper - what kind of algorithms and checks would need to be done to split these articles as they mention - how would you go about it?

Can anyone see any way they could apply their 'google magic' to do this in any kind of efficient way, other than the obvious methods of font size, type, justification, upper/lower / bold etc?
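
For what it's worth, the "obvious" font-size method can be sketched in a few lines. This is only a naive illustration, not anything taken from the patent application: it assumes the OCR step has already produced text blocks with a font size, a column index and a vertical position (the `Block`, `Article` and `segment_page` names and the `headline_ratio` threshold are all made up for this example). It treats any block much larger than the median font size as a headline and attaches the body blocks that follow it in reading order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """One OCR'd text block (hypothetical structure, not a real OCR API)."""
    text: str
    font_size: float
    column: int   # 0-based column index on the page
    top: float    # distance from the top of the page


@dataclass
class Article:
    headline: str
    body: List[str] = field(default_factory=list)


def segment_page(blocks: List[Block], headline_ratio: float = 1.5) -> List[Article]:
    """Group blocks into articles using font size alone.

    A block counts as a headline when its font is at least
    headline_ratio times the median font size on the page.
    """
    if not blocks:
        return []

    # Use the median font size as a stand-in for "body text" size.
    sizes = sorted(b.font_size for b in blocks)
    body_size = sizes[len(sizes) // 2]

    articles: List[Article] = []
    current = None
    # Naive reading order: column by column, top to bottom.
    for block in sorted(blocks, key=lambda b: (b.column, b.top)):
        if block.font_size >= headline_ratio * body_size:
            current = Article(headline=block.text)
            articles.append(current)
        else:
            if current is None:
                # Body text with no headline above it (e.g. a continuation).
                current = Article(headline="")
                articles.append(current)
            current.body.append(block.text)
    return articles


page = [
    Block("Headline A", 24, 0, 0),
    Block("a1", 10, 0, 30),
    Block("a2", 10, 0, 60),
    Block("Headline B", 22, 1, 0),
    Block("b1", 10, 1, 30),
]
articles = segment_page(page)
```

This falls over on exactly the cases the application complains about: an article that runs across several columns under one headline, or one that continues on another page, gets split or mis-attached. Any real system would need layout and content cues beyond font size.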

Re:How will they do it? (1)

davecb (6526) | more than 4 years ago | (#31296874)

Yup: long since done by Exegenix, who even did the magazine analysis, and now available as a web service from Tata Consulting in India.

... an intelligent document conversion solution that helps you to quickly and easily convert Word, PostScript or PDF files into XML. Exegenix® employs human-like intelligence to interpret each page enabling automatic and accurate conversion of structures within the document, with no scripting required.

One of my customers used the for-pay service to convert a massive government budget to text the day it was released.

--dave
