Xerox Photocopiers Randomly Alter Numbers, Says German Researcher

timothy posted about a year ago | from the we-think-you-meant-9 dept.

Bug 290

First time accepted submitter sal_park writes "According to a report from German computer scientist D. Kriesel, some Xerox WorkCentre copiers and scanners may alter numbers that appear in scanned documents. Having analyzed the output of two such devices, the Xerox WorkCentre 7535 and 7556, Kriesel found that "patches of the pixel data are randomly replaced in a very subtle and dangerous way": in particular, some numbers appearing in a document may be replaced by other numbers when it is scanned."

These numbers are not the true numbers (4, Funny)

hawkinspeter (831501) | about a year ago | (#44485185)

So, it has come to this.

Re:These numbers are not the true numbers (-1)

Anonymous Coward | about a year ago | (#44485195)

Must have been made in Africa with cheap outsourced labor. Fuckin niggers. They're not content just making our cities uninhabitable.

Re:These numbers are not the true numbers (-1)

Anonymous Coward | about a year ago | (#44485255)

So, is that made in Africa, but outsourced elsewhere?

If you Muricans didn't want ethnic diversity in your cities, why did your ancestors import so many of them?

Re:These numbers are not the true numbers (-1, Troll)

nedlohs (1335013) | about a year ago | (#44485471)

They are not their ancestors. Which should be pretty damn obvious to anyone with at least two brain cells to rub together.

Re:These numbers are not the true numbers (1)

somersault (912633) | about a year ago | (#44485505)

What exactly are you referring to with your "they" and "their"? Because his post was implying that a bunch of Europeans imported a bunch of Africans. Though actually the Europeans constitute a fair amount of diversity too, so bringing slavery into it was just trolling.

Re:These numbers are not the true numbers (3, Funny)

durrr (1316311) | about a year ago | (#44485343)

The dark lord is touching the world, and he's doing it through photocopy machines.

I would've expected printers or those cheap ISP-provided routers to be his preferred way of evildoing, though I guess even he/it couldn't get those to work properly.

Mission Impossible 4? (-1)

Anonymous Coward | about a year ago | (#44485197)

Tom cruise is NOT a role-model! Shame on you Xerorx!

Re:Mission Impossible 4? (5, Funny)

Entropius (188861) | about a year ago | (#44485241)

That's Xenu, not Xerox.

Re:Mission Impossible 4? (0)

Anonymous Coward | about a year ago | (#44485299)

Tom cruise is NOT a role-model! Shame on you Xerorx!

you actually watch that shit? hahahahaha. no wonder you posted ac.

They hashed the scanned image blocks? (0)

Anonymous Coward | about a year ago | (#44485199)

OOPS

Anti-counterfeiting (-1)

Anonymous Coward | about a year ago | (#44485203)

It's long been known that scanners and copiers contain code to prevent them from being used to duplicate currency. It must be triggering in this case.

Re:Anti-counterfeiting (4, Insightful)

J'raxis (248192) | about a year ago | (#44485269)

Maybe you should read the article.

Re:Anti-counterfeiting (3, Funny)

Anonymous Coward | about a year ago | (#44485361)

Huh?

I'm sorry. I understand those 6 words individually. But when you put them in that order, they don't make any sense.

Read? The? Article? You are not making any sense, man!

Re:Anti-counterfeiting (3, Funny)

Anonymous Coward | about a year ago | (#44485363)

I lack the proper attention span to read the article. Let's make a deal: I quickly skim through it, and soon return here with another completely wrong conclusion. Be back in 30 seconds.

Re:Anti-counterfeiting (1)

niftydude (1745144) | about a year ago | (#44485715)

That is asking too much of him - maybe if he just looked at the pictures in the article?

The Pentium Bug strikes again (1)

Anonymous Coward | about a year ago | (#44485205)

Now, in a more subtle way.

Slashdot affected as well (5, Funny)

Anonymous Coward | about a year ago | (#44485211)

Kriesel found that âoepatches of the pixel data are randomly replaced in a very subtle and dangerous wayâ

Slashdot users are advised not to use Xerox copiers for submissions.

Re:Slashdot affected as well (4, Informative)

J'raxis (248192) | about a year ago | (#44485245)

That bug is caused by Slashdot still refusing to implement this 20-year-old technology [wikipedia.org] . I mean, this being some sort of cutting-edge tech blog and all, who'd expect them to properly support a character-encoding technology that came out two decades ago?

Re:Slashdot affected as well (5, Funny)

intermodal (534361) | about a year ago | (#44485295)

Especially with such an international audience.

Re:Slashdot affected as well (-1, Flamebait)

stewsters (1406737) | about a year ago | (#44485339)

Smart-quotes are a blight on online publishing. Either convert them on upload, or turn them off on your word processor. http://www.techrepublic.com/blog/msoffice/turn-off-words-smart-quotes/ [techrepublic.com]

Re:Slashdot affected as well (2)

NatasRevol (731260) | about a year ago | (#44485685)

Sorry, but you have that exactly backwards.

Online publishing is a blight on smart quotes.

If your publishing can't handle smart quotes, then stop publishing. All they are is a different character. Deal with it properly or GTFO.

Re:Slashdot affected as well (0)

Anonymous Coward | about a year ago | (#44485387)

They used to support unicode. But people used to make goatse "ascii" art from the unicode characters.

Re:Slashdot affected as well (0)

Anonymous Coward | about a year ago | (#44485545)

So? is that somehow more offensive than goatse "art" in normal ascii characters?

Re:Slashdot affected as well (4, Informative)

Mr Z (6791) | about a year ago | (#44485579)

No, just significantly harder to filter effectively. Also, there were a rash of troll accounts with names that looked like the various Slashdot editors, only using accented variants of letters, such as 'tÍmothy'. All those shenanigans added up to where we are today.

Re:Slashdot affected as well (1)

xaxa (988988) | about a year ago | (#44485653)

No, just significantly harder to filter effectively. Also, there were a rash of troll accounts with names that looked like the various Slashdot editors, only using accented variants of letters, such as 'tÍmothy'. All those shenanigans added up to where we are today.

So filter usernames and email addresses for ASCII, perhaps filter comments for UTF8 basic type 'Graphic' and \n.

Problem solved? http://slashdot.jp/ [slashdot.jp] supports Unicode.
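
A minimal sketch of the filtering idea above, assuming "Graphic" means any character outside the Unicode "Other" categories (control, format, surrogate, private use, unassigned). The function names `username_ok` and `clean_comment` are made up for illustration; this is not how Slashdot handles input.

```python
import unicodedata

def username_ok(name):
    """Accept only non-empty, printable ASCII usernames."""
    return name != "" and name.isascii() and name.isprintable()

def clean_comment(text):
    """Keep newlines plus every character whose Unicode general
    category does not start with 'C' (Cc, Cf, Cs, Co, Cn)."""
    kept = []
    for ch in text:
        if ch == "\n" or unicodedata.category(ch)[0] != "C":
            kept.append(ch)
    return "".join(kept)

print(username_ok("timothy"))        # plain ASCII name
print(username_ok("t\u00cdmothy"))   # lookalike with an accented I
print(repr(clean_comment("caf\u00e9\u202e ok\x07\n")))
```

A filter like this would reject the lookalike usernames but would still let combining marks through in comments, so it only covers part of the abuse described above.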

Re:Slashdot affected as well (1)

NatasRevol (731260) | about a year ago | (#44485695)

That sounds like a whole lot of whining on the editors' part.

Re:Slashdot affected as well (1)

Mr Z (6791) | about a year ago | (#44485753)

Quite possibly. *shrug* I find it very difficult to actually care.

Re:Slashdot affected as well (0)

Anonymous Coward | about a year ago | (#44485409)

I mean, this being some sort of cutting-edge tech blog

Slashdot is a news service, not a blog!

Anyway, completely agree with the Unicode thing. Millions of websites implement full character support - it's not that hard or expensive to implement.

Re:Slashdot affected as well (0)

Anonymous Coward | about a year ago | (#44485589)

Meanwhile, that whooshing sound you just heard was caused by the joke sailing over your head. ;)

Re:Slashdot affected as well (0)

slashmydots (2189826) | about a year ago | (#44485691)

Because some people are viewing with open source, half working, alpha release hippie browsers that don't support UTF-8 so they can't implement it.

oh man, what a mess (5, Informative)

Trepidity (597) | about a year ago | (#44485213)

Some of these machines have been used for digitizing documents whose originals were later shredded, so some people now have subtly wrong "original" digitals. It's particularly problematic because of the nature of degradation; usual lossy degradation of images is in a non-semantic way, just produces blurring or blocking or other kinds of artifacts, not OCR-error style mistakes.

The issue here seems to be the lossy mode of JBIG2 [wikipedia.org] , which tries to find patches of the image that approximately match, and consolidates them. The idea seems to be that if the letter "e" appears 5000 times in a document in the same typeface, you just store some version of it once, and then reference it everywhere it appears. But now you get OCR-style errors, if you end up matching some patches to incorrect partners. You have your lightly printed "8" replaced by the "0" patch now and then, that kind of thing. And unlike people doing OCR, who know they need to take this into account, the operators of these machines likely had no idea this was even a possible failure mode to watch for, so who knows how many numbers are wrong in miscellaneous documents (letters are a little less problematic, because most random letter mutations don't destroy meaning).
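
To make the failure mode concrete, here is a toy sketch of lossy pattern matching and substitution. It is not the JBIG2 codec and not Xerox's implementation; the 5x7 bitmaps, the Hamming-distance test, and the `max_diff` threshold are all assumptions chosen for illustration.

```python
# Toy illustration of lossy pattern matching and substitution.
# Glyphs are tiny bitmaps written as strings; '#' is ink, '.' is paper.

def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(x != y for x, y in zip(a, b))

def compress(patches, max_diff):
    """Return (dictionary, refs). A patch reuses an earlier symbol when
    it differs from that symbol by at most max_diff pixels."""
    dictionary, refs = [], []
    for patch in patches:
        for index, symbol in enumerate(dictionary):
            if hamming(patch, symbol) <= max_diff:
                refs.append(index)        # reuse the stored symbol
                break
        else:
            dictionary.append(patch)      # no close match: store a new symbol
            refs.append(len(dictionary) - 1)
    return dictionary, refs

def decompress(dictionary, refs):
    """Rebuild the page from the stored symbols."""
    return [dictionary[i] for i in refs]

ZERO = (
    ".###."
    "#...#"
    "#...#"
    "#...#"
    "#...#"
    "#...#"
    ".###."
)
# An "8" whose middle stroke printed faintly: only one pixel of it survived.
FAINT_EIGHT = (
    ".###."
    "#...#"
    "#...#"
    "..#.."
    "#...#"
    "#...#"
    ".###."
)

page = [ZERO, FAINT_EIGHT]
dictionary, refs = compress(page, max_diff=3)
restored = decompress(dictionary, refs)
print(refs)                   # both patches point at the same symbol
print(restored[1] == ZERO)    # the faint 8 comes back as a clean 0
```

With `max_diff=0` the same code stores both glyphs and the round trip is exact, which is the difference between the lossless and lossy behaviour described above.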

Blargh.

Re:oh man, what a mess (0)

zAPPzAPP (1207370) | about a year ago | (#44485251)

From the article (yeah, i know...):

"This is not an OCR problem (as we switched off OCR on purpose), it is a lot worse"

The machines are altering the scanned pictures.
And they seem to do this in locations where there are numbers in the picture.
AND they seem to do it so that the altered image still contains numbers at the same location. Just different ones.

Re:oh man, what a mess (5, Informative)

Trepidity (597) | about a year ago | (#44485297)

Yeah, it's not OCR per se, but it operates on a somewhat similar principle to OCR, identifying which numbers are which and consolidating things it thinks are the same glyph. I agree it's much worse, because it alters the actual image. And it does so in a way that still looks plausible and "clean". Really bad lossy compression that just produced a lot of artifacts so that certain numbers were unreadable would at least telegraph that you shouldn't trust the result, but the numbers here look clean and artifact-free, they just happen to be wrong.

Re:oh man, what a mess (0, Flamebait)

Anonymous Coward | about a year ago | (#44485347)

He said "ocr style" not "ocr". God damn you and everyone else who thinks your reading comprehension problem is someone else's mistake. Wastes so many posts.

Other posts are right there on your screen. You can read them ten times if that's what it takes for your stupid ass to comprehend what they do and don't say.

How's it feel to miss something that schoolchildren are expected to get right? Does it make you feel stupid? It should. It really should.

Re:oh man, what a mess (0)

Anonymous Coward | about a year ago | (#44485575)

I found his approach amusing: read the article carefully and skim the highest ranked post to find a place where he can "correct" it to appear even more deserving of mod points.

Re:oh man, what a mess (4, Insightful)

iguana (8083) | about a year ago | (#44485317)

Could also be a problem with an overly aggressive hole filling algorithm. http://www.mathworks.com/help/images/ref/imfill.html [mathworks.com]

I'd expect there's nothing nefarious going on. It's very likely an overly aggressive image processing algorithm.

Re:oh man, what a mess (5, Interesting)

Anonymous Coward | about a year ago | (#44485529)

While it isn't nefarious so far as a deliberate plot to destroy documents and their integrity, it is a bug that is of concern for those who want to preserve documents for long-term storage in an archival situation.... such as was the case with the architectural documents being scanned.

Keep in mind that in some archival situations, the original paper documents are destroyed, so the scanned versions in these files are all that remains of those documents. Ultimately, having the numbers change like this, regardless of why it is happening, throws serious doubt on the validity of any of the numbers in that document. This can have an enormous set of consequences if you are using this scanned document as a receipt, for banking purposes (e.g. the check amount might have a different number than was originally used) or other similar kinds of situations. Engineering offices, banks, and a great many other businesses are shredding mountains of paper and archiving those documents electronically, so it is a big deal.

I guess it really boils down to understanding the limitations of compression algorithms, and not buying into a vendor's hype that you can save all kinds of storage space with this incredible algorithm, only to find out that all of your documents are worthless when you try to submit them to a judge & jury in a lawsuit as evidence. Perhaps an engineer needs to find the dimensions and tolerance limits of a bolt in an obscure subsystem... and the numbers change? Do you really want to fly in an airplane where the parts specifications have changed because of an error like this? Do you mind if a few hundred or even thousand dollars are taken out of your bank account that you didn't authorize?

Re:oh man, what a mess (5, Funny)

Hatta (162192) | about a year ago | (#44485543)

That's what she said.

Re:oh man, what a mess (2, Insightful)

sh00z (206503) | about a year ago | (#44485381)

The issue here seems to be the lossy mode of JBIG2

combined with the fact that he's complaining about errors in scans of a 7-point font. At that size, it probably only takes two erroneous pixels to change a 6 to an 8.

Re:oh man, what a mess (4, Informative)

Trepidity (597) | about a year ago | (#44485449)

Ran some numbers to check, and with some assumptions your estimate seems pretty close.

The modern standard "postscript point" is 1/72 in, so a 7-point font has a height 7/72 inches. The stroke distinguishing the 6 from the 8 is maybe 1/4 of the height, so let's say ~0.025 inches. If the print/scan cycle roundtrips at somewhere in the range 75-150 dpi, that's 2-4 pixels. If you can manage a professional-standard 300 dpi, you get more like 7-8 pixels, but that's a fairly optimistic case.
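
The estimate above can be reproduced in a few lines. The 1/4-of-height figure for the distinguishing stroke is the assumption from the post, and `stroke_pixels` is just an illustrative helper name.

```python
POINT_IN = 1 / 72                  # one PostScript point, in inches
font_height_in = 7 * POINT_IN      # 7-point font, about 0.097 in
stroke_in = font_height_in / 4     # assumed distinguishing stroke, about 0.024 in

def stroke_pixels(dpi):
    """Pixels available to represent the stroke at a given resolution."""
    return stroke_in * dpi

for dpi in (75, 150, 300):
    print(dpi, round(stroke_pixels(dpi), 1))
# 75 1.8
# 150 3.6
# 300 7.3
```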

Re:oh man, what a mess (3, Interesting)

dj245 (732906) | about a year ago | (#44485563)

Ran some numbers to check, and with some assumptions your estimate seems pretty close.

The modern standard "postscript point" is 1/72 in, so a 7-point font has a height 7/72 inches. The stroke distinguishing the 6 from the 8 is maybe 1/4 of the height, so let's say ~0.025 inches. If the print/scan cycle roundtrips at somewhere in the range 75-150 dpi, that's 2-4 pixels. If you can manage a professional-standard 300 dpi, you get more like 7-8 pixels, but that's a fairly optimistic case.

Why wouldn't you use at least 300dpi?

Most "serious" office printers print at 600dpi or better, so the information is there. Even my $100 Brother laser printer defaults to 600dpi. Every recent office multifunction I have seen can scan at 200, 300, or 600dpi, but every single one defaults to 200dpi. 200dpi scans are hard on the eyes. I always scan at 600dpi, the file size isn't bad in the age of 300GB laptop hard drives, and if I need to send it to someone external to the company, I can always reduce the size.

Re:oh man, what a mess (1)

N1AK (864906) | about a year ago | (#44485687)

I always scan at 600dpi, the file size isn't bad in the age of 300GB laptop hard drives, and if I need to send it to someone external to the company, I can always reduce the size.

In which case it raises the question: why bother using an algorithm that substitutes the real content to save space if space isn't an issue regardless of what DPI you use? Clearly space saving was a consideration for someone ;)

Re:oh man, what a mess (1)

Anonymous Coward | about a year ago | (#44485403)

According to his later posts, the HTML settings page for the scanner warns that character substitution can occur with the default compression setting. Of course, with it being the default, you'd only ever see that warning if you were going in there to change it to something else...

Re:oh man, what a mess (0)

Anonymous Coward | about a year ago | (#44485477)

I am not far from Xerox, they are an evil incompetent company that has outsourced everyone and their mother. I have no idea what they really make anymore other than the CEO's ugly mug is constantly sucking BO's C**K. I've interviewed at that hell hole and it was the rudest most annoying interview I've ever been on. Xerox the outsourced affirmative action company.

Re:oh man, what a mess, Obama birth certificate to (0)

Anonymous Coward | about a year ago | (#44485501)

This problem showed up a while ago, when Obama's birth certificate was released. Some doofus scanned it using some overblown Adobe product, which probably without asking, did OCR on it and added layers of gray OCR'ed text.

That set off a spitstorm of wingnuts posting smarmy YouTube videos where they showed how "intelligent" they were at "detecting" that the image was so, so, so "manipulated".

Re:oh man, what a mess (2)

N1AK (864906) | about a year ago | (#44485663)

I have to admit I'm actually really surprised by this. The idea and technology are good but I would think it fundamentally breaks a key feature of digitising a document: removing the need to keep the hard copy. The moment the digitised copy is more than an electronic representation of the physical document then the authenticity of anything in the digitised document is in doubt. Can it really be used to prove what someone read and signed for example, even if the chance of an error in any case is 1/10,000?

Re:oh man, what a mess (1)

nine-times (778537) | about a year ago | (#44485701)

Thanks for the quick explanation. This is kind of hilariously unfortunate, since it has the potential to undermine the reliability of lots of documents.

JBIG2 (5, Insightful)

Anonymous Coward | about a year ago | (#44485217)

Caused by misconfigured JBIG2 compression. When pixel error rate is low enough, similar looking features get printed with the same subimage.

Some image smoothing algorithm... (0)

Nutria (679911) | about a year ago | (#44485227)

which kicks in when saving to PDF, and doesn't handle low image resolution very well?

Re:Some image smoothing algorithm... (4, Informative)

Sponge Bath (413667) | about a year ago | (#44485257)

This is not smoothing, distortion or individual bad pixels. Entire image patches are copied incorrectly, essentially repeating a scanned section containing one number over another part of the image containing a different number.

Really? (-1, Redundant)

guytoronto (956941) | about a year ago | (#44485237)

Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?

Re:Really? (0)

Anonymous Coward | about a year ago | (#44485249)

Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?

when other brands don't do that, yes, yes we are.

Re:Really? (5, Insightful)

Sponge Bath (413667) | about a year ago | (#44485273)

Scanning an article without comprehension and you're complaining about your misinterpretation. Really?

Re:Really? (0)

Anonymous Coward | about a year ago | (#44485275)

Did you even read the blog post?

Re:Really? (5, Informative)

fuzzyfuzzyfungus (1223518) | about a year ago | (#44485305)

Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?

These 'errors' are substantially worse than ordinary scanner suckitude or lossy-compression legovision: JBIG2's pixel-block matching creates the potential for a block containing one character to be mis-identified and replaced with a block containing a different character.

The replaced character will be exactly as legible as text elsewhere on the page, just entirely incorrect.

If it were just the scan quality being lousy, or somebody turning, say, JPEG compression up to the point of pain, mangled characters would be obviously mangled. Not as good as being legible; but the issue is obvious. In this case, the errors will look as good as the rest of the document.

Re:Really? (1)

jeffmeden (135043) | about a year ago | (#44485647)

Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?

These 'errors' are substantially worse than ordinary scanner suckitude or lossy-compression legovision: JBIG2's pixel-block matching creates the potential for a block containing one character to be mis-identified and replaced with a block containing a different character.

The replaced character will be exactly as legible as text elsewhere on the page, just entirely incorrect.

If it were just the scan quality being lousy, or somebody turning, say, JPEG compression up to the point of pain, mangled characters would be obviously mangled. Not as good as being legible; but the issue is obvious. In this case, the errors will look as good as the rest of the document.

After actually looking at the images in TFA, it does seem like there is a problem with the way 6/8 and 4/7 are interpreted. However, you can't say that the results aren't quite noisy; I would look at a scan like that with a squinty eye and be super annoyed at the jerk who couldn't just procure the *original* electronic format. Just because the scanner "seems to do ok" on other equally tiny numbers doesn't make it right. Get the goddamn original file.

Re:Really? (4, Informative)

xaxa (988988) | about a year ago | (#44485345)

Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?

Consumer level? This isn't a home, or even home-office, machine. It's sold on the website [xerox.co.uk] under the office section.

Re:Really? (2)

Atzanteol (99067) | about a year ago | (#44485425)

A $12,000 scanner/printer is "consumer level?"

Re:Really? (1)

Anonymous Coward | about a year ago | (#44485673)

A $12,000 scanner/printer is "consumer level?"

What are we? Farmers?

Re:Really? (4, Informative)

UnknowingFool (672806) | about a year ago | (#44485649)

If you read the article you would see it's not a simple case of scan error where a "13" appears blurry and looks like "B". Whole numbers are changed: 21.11--> 17.43. This is a major issue if it was on a construction drawing for example. A beam 4m too short would be a problem. Even if caught the engineer signing off might have to go through a whole audit process.

Problem with JBIG2, not OCR (3, Insightful)

Anonymous Coward | about a year ago | (#44485283)

Before anyone spreads wrong information: The problem is with the JBIG2 image compression algorithm used when scanning to PDF format. OCR has nothing to do with this. Also, TIFF format images are not affected as they don't use JBIG2.

Re:Problem with JBIG2, not OCR (0)

Anonymous Coward | about a year ago | (#44485395)

Also, TIFF format images are not affected as they don't use JBIG2.

Given that TIFF is just a container format for many image compression formats (similar to how .AVI and .MKV are just container formats for many audio-video compression formats) I wouldn't make that assumption if I were you. TIFF portability has always been a pain-in-the-ass with different vendors using different sets of compression formats in their TIFF implementations.

Re:Problem with JBIG2, not OCR (2)

barlevg (2111272) | about a year ago | (#44485439)

He's not making an assumption--it says so right in the article.

Re:Problem with JBIG2, not OCR (1)

Charles Duffy (2856687) | about a year ago | (#44485553)

That's raw TIFFs. TIFF also supports compression, including JBIG2. Whether these devices support JBIG2 in TIFF is less clear, though indeed, as it says in the article, they definitely support raw TIFFs, which come out clean.

Re:Problem with JBIG2, not OCR (1)

Charles Duffy (2856687) | about a year ago | (#44485567)

...anyhow -- the parent didn't say that these devices didn't exhibit the problem in TIFF, but that TIFF itself was innately immune to the problem. That's a considerably more sweeping -- and, frankly, unfounded -- claim.

Machine Awakening (0)

Anonymous Coward | about a year ago | (#44485303)

It's the first subtle warning of the machine awakening. ...It's coming...

see the Xerox user manual (5, Informative)

mejustme (900516) | about a year ago | (#44485311)

Quote: "Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals"

Source: http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf [unc.edu]

Re:see the Xerox user manual (1, Insightful)

Racemaniac (1099281) | about a year ago | (#44485375)

thanks for mentioning where in the 328 page document you linked that is written :)

Re:see the Xerox user manual (3)

mejustme (900516) | about a year ago | (#44485437)

That is why keyboards have CTRL+F. (Top of page 107.)

Re:see the Xerox user manual (0)

Anonymous Coward | about a year ago | (#44485489)

Ctrl + F you

Re:see the Xerox user manual (1)

mwvdlee (775178) | about a year ago | (#44485511)

Page 107.

It literally took longer to download the PDF than it took to find the page by Ctrl+S.

Re:see the Xerox user manual (1)

h4rr4r (612664) | about a year ago | (#44485559)

Try searching for that phrase. Should be pretty simple.

Re:see the Xerox user manual (3, Informative)

Anonymous Coward | about a year ago | (#44485481)

Interesting, since as far as I remember from reading about this issue yesterday, Xerox had not yet responded to this issue. Strange, since it's in the documentation.

But then, reading the manual in context, the quote appears on pages 107, 129, and 179, which are in the chapters "Fax", "Workflow Scanning", and "Save and Reprint Jobs" respectively.

It's not in the chapter "Copying" (pages 39..63), so there is no excuse that this issue occurs in simple copy mode.

Re:see the Xerox user manual (1)

NatasRevol (731260) | about a year ago | (#44485729)

If their response is anything other than RTFM, they're dying.

RTFM? WTF? (0)

Anonymous Coward | about a year ago | (#44485513)

Quote: "Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals"

Source: http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf [unc.edu]

Page 129 for those incapable of searching a PDF.

But, seriously dude, this is scientific research! You can't seriously expect the man to RTFM.

Re:see the Xerox user manual (1)

timeOday (582209) | about a year ago | (#44485519)

Seriously, how did you happen to know about that?

Re:see the Xerox user manual (1)

Rob the Bold (788862) | about a year ago | (#44485571)

Quote: "Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals"

Source: http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf [unc.edu]

Very interesting find, although that warning only appears in the "Fax" section of the manual, and not in the "Copy" or "Workflow Scanning" sections.

Re:see the Xerox user manual (4, Informative)

Rob the Bold (788862) | about a year ago | (#44485713)

Very interesting find, although that warning only appears in the "Fax" section of the manual, and not in the "Copy" or "Workflow Scanning" sections.

AND I'd be wrong, it's in all three sections. Ctrl-F'ing in Ocular only finds "character substitution" when the words are side-by-side, not split by a line break as they appear in the copying and scanning sections.

That's way worse. Xerox knows about this, and just puts in a little note, rather than a big old: "WARNING: Normal/Small mode may produce undetectable text errors."

And that type of warning should be defined in the beginning of the manual as "operations that may cause data transcription errors resulting in financial harm, damage to property, injury or death".
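
The search miss described above (a phrase split across a line break) is easy to reproduce and to work around by collapsing whitespace before comparing. A minimal sketch; `find_phrase` is a hypothetical helper, not a feature of any particular PDF viewer.

```python
import re

def normalize(s):
    """Collapse every run of whitespace, line breaks included, to one space."""
    return re.sub(r"\s+", " ", s).strip().lower()

def find_phrase(text, phrase):
    """True if phrase occurs in text once whitespace differences are ignored."""
    return normalize(phrase) in normalize(text)

# Text as it might be extracted from a page where the phrase wraps.
extracted = "some quality degradation and character\nsubstitution errors may occur"

print("character substitution" in extracted)             # False: naive search misses it
print(find_phrase(extracted, "character substitution"))  # True
```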

Re:see the Xerox user manual (5, Insightful)

Atzanteol (99067) | about a year ago | (#44485593)

That's "Normal" quality? That could be *very* misleading. If you have an option that has negative side-effects such as this then the option should be titled something to indicate the risk - "Super-compressed", "dangerously small" or the like.

Though I'm surprised Xerox would even allow such a compression if such an obvious issue occurs. People would expect image quality to suffer - but full character substitution?

Re:see the Xerox user manual (2)

petermgreen (876956) | about a year ago | (#44485725)

The problem is that most people only read the manual when they discover something is wrong and there is no immediately obvious problem with the results of these scans. The problem only gets noticed much later when someone tries to work with the scanned information and discovers that it is readable but doesn't make sense.

I also notice that the manual says that the other options give larger files with better image quality but does not state clearly whether compression algorithms that can cause character substitution are disabled in those modes or whether substitution is just less likely due to higher quality settings.

When a development of a technology introduces new failure modes great care needs to be taken to inform users of those modes. Just burying it deep in a manual that people only read when things go wrong is not sufficient.

Proofreading @ Xerox Development? (1)

BoRegardless (721219) | about a year ago | (#44485315)

How could Xerox make copiers for this length of time and not have a proofreading algorithm that works with a super-resolution scan & no interpolation to "machine check" the final commercial copier as a way of quickly finding errors?

Internatlly, Xerox engineering had to know they were "correcting" pixels, rather than just "copying" them, so how did they verify their software?

Re:Proofreading @ Xerox Development? (2)

Fnord666 (889225) | about a year ago | (#44485591)

How could Xerox make copiers for this length of time and not have a proofreading algorithm that works with a super-resolution scan & no interpolation to "machine check" the final commercial copier as a way of quickly finding errors?

Internatlly(sic), Xerox engineering had to know they were "correcting" pixels, rather than just "copying" them, so how did they verify their software?

They do know [slashdot.org] about it.

It's the NSA (0)

Anonymous Coward | about a year ago | (#44485319)

NSA strikes again.

my guess (1)

Anonymous Coward | about a year ago | (#44485335)

my guess is that, since digitized copies of documents that are later destroyed are treated as originals, this will be used to bring uncertainty and doubt to information that would otherwise be essential to bringing accountability to large organizations that used these machines.

People: 0 , Big brother: 999999999999999999999999

Free Speech (4, Funny)

BradyB (52090) | about a year ago | (#44485341)

Hey, even photo copiers and faxes need freedom of speech.

Makes copies that are better than the original (0)

Anonymous Coward | about a year ago | (#44485349)

This Xerox product was popular on Wall Street a few years ago, especially those dealing in mortgage-backed securities.

Known Xerox Issue..... in documentation (5, Informative)

Anonymous Coward | about a year ago | (#44485351)

If you read the documentation from XEROX... it claims that on scanning it is a known problem that "Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals." page 107 from http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf

Also on page 129 we have the following:

"Quality / File Size
The Quality / File Size settings allow you to choose between scan image quality and file size. These settings allow you to deliver the highest quality or make smaller files. A small file size delivers slightly reduced image quality but is better when sharing the file over a network. A larger file size delivers improved image quality but requires more time when transmitting over the network. The options are:
Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals."

Re:Known Xerox Issue..... in documentation (4, Insightful)

Chris Mattern (191822) | about a year ago | (#44485507)

Now the question becomes: what moron made this setting the default? Maybe a setting that can undetectably corrupt your data can be provided if appropriate warnings are given, but it sure as hell should never be the default. I would've thought that was obvious.

Re:Known Xerox Issue..... in documentation (2)

Nimey (114278) | about a year ago | (#44485659)

So you're telling us this is a problem caused by a user not RTFMing and Slashdot sensationalized it?

Surely you're joking. :P

Re:Known Xerox Issue..... in documentation (3, Insightful)

MozeeToby (1163751) | about a year ago | (#44485759)

Substitution errors shouldn't happen in corporate level scanning hardware, even if you bury a warning about it 107 pages into the 350 page manual. You can't have something that fundamentally makes your product not fit for purpose and claim that it's ok just because it's a known issue.

Wub fur (1)

Errol backfiring (1280012) | about a year ago | (#44485377)

They probably have some parts made of wub fur [wikipedia.org] . Those machines are more advanced than I thought!

No need to be a scientist.... (0)

Anonymous Coward | about a year ago | (#44485405)

to RTFM

I recognise the algorithm that gives those errors. (1)

SuricouRaven (1897204) | about a year ago | (#44485447)

I just spent ten minutes describing exactly how JBIG works here before noticing someone already realised what is happening and put it up on the page.

ImageRunner (3, Funny)

poofmeisterp (650750) | about a year ago | (#44485483)

OMG, my Canon ImageRunners are doing the same thing! It must be a virus!

I'd better write up a research document on this and request some grant money.

Interesting (3, Interesting)

jones_supa (887896) | about a year ago | (#44485527)

The things you learn. I never knew before about JBIG2 and how scanners use it to repeat pieces of image. Seems to me that the JBIG2 parameters are tuned incorrectly in these scanners.
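
To make the "repeat pieces of image" idea concrete, here is a toy sketch of symbol reuse with a similarity threshold. It is not JBIG2 itself and not Xerox's implementation; the 3x5 glyph bitmaps, the `compress`/`decompress` helpers, and the `threshold` parameter are all made up for illustration. The point is only that if "close enough" is set too loosely, a stored 6 can be reused where the page had an 8.

```python
# Toy illustration of lossy symbol-dictionary compression.
# Hypothetical glyphs and helpers; real JBIG2 encoders are far more involved.

SIX = (
    "###",
    "#..",
    "###",
    "#.#",
    "###",
)
EIGHT = (
    "###",
    "#.#",
    "###",
    "#.#",
    "###",
)


def hamming(a, b):
    """Count the pixels that differ between two same-sized glyphs."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def compress(glyphs, threshold):
    """Store each glyph once; reuse a stored glyph if it is 'close enough'."""
    dictionary = []  # unique symbols kept in the file
    refs = []        # per-glyph index into the dictionary
    for g in glyphs:
        for i, sym in enumerate(dictionary):
            if hamming(g, sym) <= threshold:
                refs.append(i)  # reuse: this is where substitution can happen
                break
        else:
            dictionary.append(g)
            refs.append(len(dictionary) - 1)
    return dictionary, refs


def decompress(dictionary, refs):
    """Rebuild the page from the stored symbols."""
    return [dictionary[i] for i in refs]


page = [SIX, EIGHT, SIX]
strict = decompress(*compress(page, threshold=0))  # exact matches only
loose = decompress(*compress(page, threshold=1))   # one differing pixel allowed
print(strict == page)               # strict matching round-trips the page
print(loose == [SIX, SIX, SIX])     # loose matching turned the 8 into a 6
```

In this toy the two digits differ by a single pixel, so any threshold of 1 or more silently swaps them, and the output still looks like a crisp, plausible digit.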

Corporate decision (3, Funny)

Dunbal (464142) | about a year ago | (#44485537)

This was a decision by Xerox to get around ever being sued for copyright violations...

NSA BUG (1, Funny)

Sentrion (964745) | about a year ago | (#44485547)

It's just a bug in the NSA eavesdropping algorithm.

I can't understand (2)

joh (27088) | about a year ago | (#44485639)

how a compression that may lead to documents altered in such a way (numbers replaced by other numbers) can be considered fit for use in a photocopier. This can lead to very real, expensive and even dangerous problems down the line.

digital is not always better (0)

Anonymous Coward | about a year ago | (#44485755)

duh..

But it's really the embedded serial numbers in scanned and printed documents that are getting in the way.
