
ESR to Shred SCO Claims?

michael posted more than 10 years ago | from the woodchipper dept.

Caldera 554

webmaven writes "According to this article in eWEEK, ESR has released a utility called comparator for analyzing the similarity of source code trees. The technical details are interesting, in that ESR says he is using an implementation of a refined version of the 'shred' algorithm, with higher performance (on machines with enough RAM) than other versions. ESR won't say whether he intends the comparator to be used to compare older Unix code to Linux so as to be able to refute SCO's claims, but it's obviously well suited for such a purpose. Interestingly, as the shred algorithm can run reports on source trees using only the MD5 signature shreds (once generated), it is possible to use it to compare trees without direct access to the source code itself, leading to a possible use in comparing various proprietary source trees with each other and with Freely available code bases such as Linux and *BSD without requiring actual disclosure of the proprietary source code (a neutral third party could generate the shreds on a company's premises, and leave without taking a copy of the source with them). I'll be interested to see if (or which of) the proprietary vendors allow their source trees to be 'shredded' for such comparisons, and whether this becomes a standard forensic technique in source-code copyright and trade-secret disputes."
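As a rough illustration of the idea in the summary above, here is a minimal Python sketch of shred-style comparison: hash every run of consecutive lines, then compare only the digests. This is not ESR's comparator code; the function names `shred` and `common_shreds`, the window size of 3, and the sample "trees" are all assumptions made for the example.

```python
import hashlib

def shred(lines, window=3):
    """Map the MD5 digest of each run of `window` consecutive lines
    to the 1-based line number where that run first starts."""
    digests = {}
    for start in range(len(lines) - window + 1):
        chunk = "\n".join(lines[start:start + window])
        digest = hashlib.md5(chunk.encode("utf-8")).hexdigest()
        digests.setdefault(digest, start + 1)
    return digests

def common_shreds(shreds_a, shreds_b):
    """Report matching runs using only the digests, never the source text."""
    shared = shreds_a.keys() & shreds_b.keys()
    return sorted((shreds_a[d], shreds_b[d]) for d in shared)

# Hypothetical stand-ins for two source files.
tree_a = ["int a;", "a = 1;", "b = a + 2;", "return b;"]
tree_b = ["void f() {", "a = 1;", "b = a + 2;", "return b;", "}"]

matches = common_shreds(shred(tree_a), shred(tree_b))
print(matches)  # pairs of (start line in tree_a, start line in tree_b)
```

Because `common_shreds` only ever sees digests, a third party could in principle generate the table returned by `shred` on site and hand over nothing else, which is the scenario the summary describes.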


554 comments

Let us lobeth a grenade at SCO and be done. (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915130)

I think he is just going to replace the 'offending' code with a big ASCII middle finger. I thought there was something fishy about all of this. I've got it figured out. And we certainly don't want to mash the SCO executives into a bloody pulp, either. You wouldn't want that to happen, would you?

This Comment was generated with the Comment-O-Matic for SCO Stories. [rageagainst.net]

Re:Let us lobeth a grenade at SCO and be done. (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6915148)

In Soviet Amerika, ESR to shred SCO [goatse.cx] !

Re:Let us lobeth a grenade at SCO and be done. (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915168)

The penis: mightier than the sword!

But SCO's no ordinary rabbit! (3, Funny)

Anonymous Coward | more than 10 years ago | (#6915298)

Bruce Perens:
Three. Three. And we'd better not risk another frontal assault. Their legal team is dynamite.
Linus:
Would it help to confuse it if we run away more?
Bruce Perens:
Oh, shut up and go change your firewall!

Alan Cox:
Let us taunt it! Darl may become so cross that he will make a mistake.
Bruce Perens:
Like what?
Alan Cox:
Well... ooh.
ESR:
Have we got bows?
Bruce Perens:
No.
ESR:
We have the Holy Hand Grenade.
Bruce Perens:
Yes, of course! The Holy Hand Grenade of Antioch! 'Tis one of the sacred relics Brother Richard carries with him.
Brother Richard! Bring up the Holy Hand Grenade!
MONKS: [chanting]
Pie Iesu domine, dona eis requiem.
Pie Iesu domine, dona eis requiem. Pie Iesu domine, dona eis requiem. Pie Iesu domine, dona eis requiem.

Bruce Perens: How does it, um-- how does it work?
ESR:
I know not, my liege.
Bruce Perens:
Consult the Book of Armaments!
RMS:
Armaments, chapter two, verses nine to twenty-one.
OPEN SOURCE ZEALOT:
And Saint Attila raised the hand grenade up on high, saying, 'O Lord, bless this Thy hand grenade that, with it, Thou mayest blow Thine enemies to tiny bits in Thy mercy.'
And the Lord did grin, and the people did feast upon the lambs and sloths and carp and anchovies and orangutans and breakfast cereals and fruit bats and large chu--
RMS:
Skip a bit, Brother.
OPEN SOURCE ZEALOT:
And the Lord spake, saying, 'First shalt thou take out the Holy Pin. Then, shalt thou count to three. No more. No less. Three shalt be the number thou shalt count, and the number of the counting shall be three. Four shalt thou not count, nor either count thou two, excepting that thou then proceed to three. Five is right out. Once the number three, being the third number, be reached, then, lobbest thou thy Holy Hand Grenade of Antioch towards thy foe, who, being naughty in My sight, shall snuff it.'
Richard:
Amen.
KNIGHTS:
Amen.
Bruce Perens:
Right!

One!... Two!... Five!
Alan Cox:
Three, sir!
Bruce Perens:
Three!
[sco dies]

In Soviet Russia... (-1, Redundant)

Anonymous Coward | more than 10 years ago | (#6915146)

source code shreds ESR!

Re:In Soviet Russia... (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915193)

source code shreds ESR!

Too complex. It really works better as, "IN SOVIET RUSSIA, Source Code Shreds YOU!"

Actually, if you read the title too quickly, it looks like "ISR to Shred SCO Claims?" Of course, the answer to that question is, "ISR, SCO Claims Shred YOU!"

Re:In Soviet Russia... (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6915427)

In Soviet Russia, you cause bad ISR jokes to have bad karma?

maybe... (4, Funny)

b17bmbr (608864) | more than 10 years ago | (#6915153)

microsoft can just shred their source tree and start anew. maybe...

Re:maybe... (5, Interesting)

jmv (93421) | more than 10 years ago | (#6915194)

Actually, combine this with the "shared source" program from MS and it would be easy to see if MS did (or did not) copy GPL code into Windows as some suggest.

Re:maybe... (0)

Anonymous Coward | more than 10 years ago | (#6915351)

Well, first they'll be more than happy to publish this crap [msn.com] .

I'm really annoyed that CNBC would even relay this info, considering everything that's going on. If it wasn't for Maria Bartiromo, I wouldn't watch at all...

SCO! (3, Funny)

scovetta (632629) | more than 10 years ago | (#6915156)

Of course, we can just trust SCO to show the right hashes. Why would they lie?

Re:SCO! (2, Interesting)

jmv (93421) | more than 10 years ago | (#6915229)

The thing is that the hashes could be generated by any organisation that has access to the SysV source code. There are many of them (IBM being one).

Re:SCO! (5, Insightful)

mik (10986) | more than 10 years ago | (#6915300)

The point is that we don't need SCO to do anything. Presumably any of the many people with legal rights to SCO source code can publish the hash list without divulging any of SCO's (ahem) "IP". Even more interesting is the theoretical possibility of comparing historical releases of SCO trees against GPL-licensed code, thus (perhaps) demonstrating that SCO has illegally violated the IP of OSS developers. Of course, hash comparisons alone would be unlikely to convince a judge/jury of anything. They ought to be sufficient grounds for some embarrassing subpoenas, and maybe some really neat cease-and-desist orders, though.

The real question is: (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6915371)

Will this technique be strong enough to see through the code obfuscation techniques used in Linux?

Re:The real question is: (0)

Anonymous Coward | more than 10 years ago | (#6915398)

Shut up Darl

This Is The Same Idiotic Grandstander wrote this! (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915160)

A few hours ago, I learned that I am now (at least in theory) absurdly rich.

I was at my machine, hacking, when I got email congratulating me on the success of the VA Linux Systems IPO. I was working on my latest small project -- a compiler for a special-purpose language I've designed called Scriptable Network Graphics, or SNG. SNG is an editable representation of the chunk data in a PNG. What I'm writing is a compiler/decompiler pair, so you can dump PNGs in SNG, edit the SNG, then recompile to a PNG image.

"Congratulations? That's interesting," said I to myself. "I didn't think we were going out till tomorrow." And I oughtta know; I'm on VA's Board of Directors, recruited by Larry Augustin himself to be VA's official corporate conscience, and it's a matter of public record that I hold a substantial share in the company. I tooled on over to Linux Today, chased a link -- and discovered that Larry Augustin had taken the fast option we discussed during the last Board conference call. VA had indeed gone out on NASDAQ -- and I had become worth approximately forty-one million dollars while I wasn't looking.

Well, that didn't last long. In the next two hours, VA dropped from $274 a share to close at $239, leaving me with a stake of only thirty-six million dollars. Which is still a preposterously large amount of money.

You may wonder why I am talking about this in public. The first piece of advice your friends and family will give you, if it looks like you're about to become really wealthy, is: keep it quiet. It's nobody else's business -- you don't want to look like you're gloating, and you don't want to be deluged with an endless succession of charity appeals, business propositions, long-lost best friends, and plain bald-faced mooching.

Trouble with the "keep it quiet" theory is that I've made my bucks in a very public way. When you're already a media figure, and your name is on the S-1 of a hot IPO, and email from friends and journalists starts coming in like crazy as the stock breaks first-day-gain records, playing it coy swiftly ceases to look like a viable option.

Besides, it wouldn't be fair to dissemble. I serve a community. I'm wealthy today because my efforts to spread the idea of open source on behalf of that community helped galvanize the business world, and earned the respect and the trust of a lot of hackers. Larry thought that respect was an asset worth shelling out 150,000 shares of VA for. Fairness to the hackers who made me bankable demands that I publicly acknowledge this result -- and publicly face the question of how it's going to affect my life and what I'll do with the money.

This is a question that a lot of us will be facing as open source sweeps the technology landscape. Money follows where value leads, and the mainstream business and finance world is seeing increasing value in our tribe of scruffy hackers. Red Hat and VA have created a precedent now, with their directed-shares programs designed to reward as many individual contributors as they can identify; future players aiming for community backing and a seat at the high table will have to follow suit. In this and other ways (including, for example, task markets) the wealth is going to be shared.

So while there aren't likely to be a lot more multimillion-dollar bonanzas like mine, lots of hackers are going to have to evolve answers to this question for smaller amounts that will nevertheless make a big difference to individuals; tens or hundreds of thousands of dollars, enough to change your life -- or wreck it.

(Gee. Remember when the big question was "How do we make money at this?")

The first part of my answer is "I'll do nothing, until next June". Because I'm a VA board member, under SEC regulations there's a six-month lockout on the shares (a regulation designed to keep people from floating bogus offerings, cashing out, and skipping to Argentina before the share price crashes). So it's not strictly true that I'm wealthy right now. I will be wealthy in six months, unless VA or the U.S. economy craters before then. I'll bet on VA; I'm not so sure about the U.S. economy :-).

Assuming the economy does not in fact crater, how is wealth going to affect my life in six months? Honestly, I think the answer is "not much". I haven't spent the last fifteen years doing open source for the money. I'm already living pretty much exactly the way I want to, doing the work that matters to me. The biggest difference the money will make to me personally is that now I should be able to keep doing what I love for the rest of my life without worrying about money ever again.

So I expect I'll just keep on as I've been doing. Hacking code. Thinking and spreading subversive thoughts. Traveling and giving talks. Writing papers. Poking various evil empires a good one in the eye whenever I get a chance. Working for freedom.

I expect most other hackers confronted with sudden wealth will make similar choices. Reporters often ask me these days if I think the open-source community will be corrupted by the influx of big money. I tell them what I believe, which is this: commercial demand for programmers has been so intense for so long that anyone who can be seriously distracted by money is already gone. Our community has been self-selected for caring about other things -- accomplishment, pride, artistic passion, and each other.

OK, so maybe I'll break down and finally get a cell phone. And cable broadband so I can surf at smokin' speed. And a new flute. And maybe a nice hotrodded match-grade .45 semi for tactical shooting. But really, I don't want or need a lot of stuff. I'm kind of Buddhist that way; I like to minimize my material attachments. (My family gripes that this makes me hell to buy Christmas presents for.)

I'm not going to minimize my attachments by giving it all away, though, so you evangelists for a zillion worthy causes can just calm down out there and forget about hitting me up for megabucks. I am *not* going to be a soft touch, and will rudely refuse all importunities.

I'm not copping this harsh attitude to protect my money, but rather to protect the far more precious asset of my time. Because I don't want to have to become a full-time specialist in deciding whose urgent pitch to buy, I'm going to turn everybody down flat in advance. Anyone who bugs me for a handout, no matter how noble the cause and how much I agree with it, will go on my permanent shit list. If I want to give or lend or invest money, *I'll* call *you*. (Sigh...)

And yes, there are causes I'll give money to. Worthy hacker projects. Free-speech activism. Firearms-rights campaigns. Tibet, maybe. I might buy a hunk of rainforest for conservation somewhere. Megabucks are power, and with power comes an obligation to use it wisely. I'll give carefully, and in my own time, and only after doing my homework -- too much charity often kills what it means to nurture. And enough about that.

Ironically enough, one result of my getting rich is that I will probably start charging for speaking appearances, now that nobody can plausibly accuse me of doing it for the money. I won't charge open-source user groups or schools, but I will cheerfully extract a per diem from all the business conferences that keep wanting me to boost their box office. Charging a price for my time will separate the expensive conferences that attract powerful people from the marginal events where the hacker community would get less leverage from my presence.

For the same reason, I'm still going to insist that anybody who wants me to give a talk has to cover my expenses and eliminate hassles. But I also expect I'll still carry my own luggage. And I'll never get too proud to crash on somebody's daybed when the local user group is too broke to cover a hotel.

But enough trivialities; I'm going to get back to work. I've got the SNG compiler stage almost done. Next up, I need to refactor the pngcheck code so I can give it a report-format option that generates SNG syntax. Then, I need to think about supporting MNG...
--
Eric S. Raymond

ESR: Surprised by cock! (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915243)

A few hours ago, I learned that I am now (at least in theory) absurdly gay.

I was at my machine, my 386 with 4 megs of RAM running Linux, masturbating to pictures of RMS, when I got an email congratulating me on the success of Slashdot. I was working on my latest small project-- a clever little text parser that takes input from the user and puts it in a little cartoon-style word balloon coming out of-- get this!-- a giant, erect ASCII penis's bulging head! Hahaha! It's called COCKSAY. You can download it here.

"Congratulations? That's interesting," said I to myself. "I didn't think Slashdot was coming out until tomorrow." And I oughtta know; I'm on VA's Board of Directors, recruited by Larry Augustin himself, to be VA Linux's "corporate conscience," and it's public record that I hold a substantial share in the company's semen pool. I tooled on over to Linux Today, chased a link like it was a naked hippy's ass-- and discovered that Rob Malda had taken the fast action we had discussed at the last board meeting. Slashdot had come out first thing that morning with a headline on its own site-- and I had become the figurehead of the Gay Faggot Slashdot Empire while I wasn't looking.

Well, that didn't last long. In the next two hours, 369 VA employees also disclosed that they had AIDS, leaving me with a bit of the proverbial semen on my face.

You may wonder why I am talking about this in public. The first piece of advice your friends will give you, if it looks like you're about to come out of the closet, is: keep quiet! It's really nobody else's business-- you don't want to look like you're lusting for cock, though you may want to be deluged by an endless succession of men dressed up as Navy sailors demanding blowjobs from you; fat, hairy men (the bears) wanting to fuck you in the ass; and sweet, young, hairless boys offering you the beauty of their youth.

Trouble with the "keep it quiet" theory is that I've always solicited gay male faggot sex in a very public way. When you're already a media figure, like myself, and your name is on the Faggot Manifesto your whole organization chose to use to come out, and email from friends and journalists starts coming in like crazy as the gayness of your empire breaks records even on the first day, playing it coy swiftly ceases to look like a viable option.

But it wouldn't be fair to dissemble. I serve the gay community. I'm wealthy today because my efforts to spread faggotry and venereal diseases on behalf of that community helped infiltrate the business world and earned the trust of a lot of young, naive boys. Fairness to the twinks

Re:ESR: Surprised by cock! (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915364)

You're quite the comedian Mr. Gates, or is this Mr. McBride? You silly scamp!

Sorry Chaps... (-1)

SCO$699FeeTroll (695565) | more than 10 years ago | (#6915161)

...you are still responsible for the $699 licensing fee ($32 for embedded devices.) I also must inform you that you are still a bunch of cock-smoking teabaggers. In time, you will become anal-fisting surrender-monkeys. Good Day.

Is there really that much data there? (4, Funny)

More Karma Than God (643953) | more than 10 years ago | (#6915173)

If there is, why couldn't MD5 shreds be used as a lossy compression scheme for code?

Re:Is there really that much data there? (2, Insightful)

Sterling Christensen (694675) | more than 10 years ago | (#6915208)

Because lossy compression would be useless. When decompressed the source code wouldn't work anymore.

Source code isn't loss-tolerant (or whatever)

Re:Is there really that much data there? (0)

Anonymous Coward | more than 10 years ago | (#6915264)

It's always funny reading the comments of the humor-challenged.

Re:Is there really that much data there? (3, Informative)

Paradox (13555) | more than 10 years ago | (#6915236)

No. Hashes are one-way functions, so it'd be kinda pointless. Further, comparing two hashes for anything but equality is meaningless with most good hashing schemes (unless you're a cryptographer).

Re:Is there really that much data there? (0, Redundant)

RexHowland (71795) | more than 10 years ago | (#6915367)

But what would the hash be of? Would each line of code be a separate hash, or would lines be combined?

Wouldn't altering one letter in the code completely change the hash? If so, all you would need to do to avoid detection would be to make a few changes to minor things, and you would appear to have different hashes, even if the source code were essentially the same.
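The question above can be explored with a small experiment. The sketch below is an assumption-laden illustration, not comparator's actual behaviour: it hashes overlapping three-line windows (the helper `window_digests` and the window size are made up for the example) and checks which windows still match after a single-character edit to one line.

```python
import hashlib

def window_digests(lines, window=3):
    """MD5 digest of every run of `window` consecutive lines, in order."""
    return [
        hashlib.md5("\n".join(lines[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - window + 1)
    ]

# Eight distinct hypothetical lines; one character is inserted into line index 4.
original = [f"statement_{n}();" for n in range(8)]
edited = list(original)
edited[4] = "statement_4( );"

before = window_digests(original)
after = window_digests(edited)

# Window start positions whose digests are unchanged by the edit.
surviving = [i for i, (a, b) in enumerate(zip(before, after)) if a == b]
print(surviving)
```

Under these assumptions, the edit does change every digest whose window covers the altered line, but windows that do not touch it still match. Hiding a copy would therefore take changes spread throughout the copied region, not just one or two.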

Re:Is there really that much data there? (2, Funny)

Anonymous Coward | more than 10 years ago | (#6915237)

Umm... why would you want lossy compression for code? Perhaps if it only lost the bugs?

Re:Is there really that much data there? (0)

Anonymous Coward | more than 10 years ago | (#6915246)

What good is lossy compression for code? What will you do when you uncompress your project and find it full of bugs?

Re:Is there really that much data there? (1)

(startx) (37027) | more than 10 years ago | (#6915325)

The key is in your own sentence: lossy. That implies you are losing information. It would be bad to lose any piece of anything you want to keep. Image and sound compression can be lossy because they toss out parts we don't see or hear anyway, but do you really want to lose pieces of code?

ESR ADMITS TO ENRON PRACTICES (5, Funny)

Anonymous Coward | more than 10 years ago | (#6915175)

This will only serve as another black eye on the Open Source community. ESR should know better than to shred SCO material prior to a trial.

Re:ESR ADMITS TO ENRON PRACTICES (0)

Anonymous Coward | more than 10 years ago | (#6915249)

I know that SCO practices Texas style accounting, but I never heard of Enron practices? Is that when you kill the person who knows what happened?

But the Important Question is... (2, Funny)

BlackBolt (595616) | more than 10 years ago | (#6915178)

Did he write it in Python? And did he complete it in under 6 hours?

Answered My Own Question.. (3, Funny)

BlackBolt (595616) | more than 10 years ago | (#6915221)

From the article:

"...has two advantages: one, it's amazingly fast..."

Guess not. ;-)

Re:But the Important Question is... (2, Informative)

TMB (70166) | more than 10 years ago | (#6915377)

From the README...

Besides the production C code, the distribution also includes working Python versions. These were used to prototype the concept.

No word on the latter... but it's ESR... so of course! ;-)

[TMB]

I'm more interest in time saved by programmers (1)

2TecTom (311314) | more than 10 years ago | (#6915181)

... then the number of lawyers it'll retire.

Re:I'm more interest in time saved by programmers (0)

Anonymous Coward | more than 10 years ago | (#6915287)

It's "than" not "then".

They're different words you know.

Doubt it will help (5, Insightful)

Brahmastra (685988) | more than 10 years ago | (#6915190)

I think the question here is not about whether there is common code between SCO and Linux. There is no doubt that there will be common code because of the common origins. The issue here is that SCO does not own that code.

Re:Doubt it will help (4, Insightful)

djh101010 (656795) | more than 10 years ago | (#6915234)

If there's going to be a line-by-line comparison, this is the tool to do it. Once those lines are identified, *then* it's simply a matter of finding out the origins of them; that's where we can roll it back to a textbook published in 1973 or whatever.

Until the lines that are common are identified, it's impossible to defend against the accusations. Because of that, I bet Darling Darl won't allow it to be used. The question is, how to turn the inevitable refusal into something that shuts him (up|down).

Re:Doubt it will help (1)

jmv (93421) | more than 10 years ago | (#6915263)

Well, when you find a common segment in the code, you can always look at the Linux code and see where it comes from. This works the same way as the malloc code from the SCO presentation that was eventually traced to BSD code (or at least something which SCO does not own).

Nah... (4, Insightful)

SargeZT (609463) | more than 10 years ago | (#6915195)

This shouldn't be relied upon in the court of law. Although I acknowledge that SCO likely has no IP claim over Linux, it should have a fair case. A program that would rule out code similarities does not rule out code that is based on the SCO code. There are hundreds of ways to do a single thing, and if the GNU/Linux took ideas from the SCO kernel, SCO may be as eligible for compensation as if it were directly copied from SCO.

Re:Nah... (3, Interesting)

jmv (93421) | more than 10 years ago | (#6915307)

That's true in general. However, SCO has explicitly stated that thousands of lines of code have been illegally copied *verbatim* from System V. This tool could at least prove that they lied (because of the verbatim copy allegation).

Re:Nah... (4, Informative)

jonabbey (2498) | more than 10 years ago | (#6915318)

if the GNU/Linux took ideas from the SCO kernel, SCO may be as eligible for compensation as if it were directly copied from SCO.

IANAL, but I don't believe this is so in the general case. Copyright protects only specific expression of ideas, not the ideas themselves.

If SCO had valid patents on some of this stuff, they'd have a point of legal leverage, but they don't from all reports.

Re:Nah... (1, Insightful)

Anonymous Coward | more than 10 years ago | (#6915321)

Copyright covers expressions, not ideas. You can't "take" ideas under American law; ideas are just there waiting to be reimplemented.

Re:Nah... (1)

nate1138 (325593) | more than 10 years ago | (#6915343)

Umm, no. How would this prevent a fair case? It wouldn't. It is simply a tool to compare files. If SCO is being truthful, they should allow the comparison. If they refuse, it is probably because it would expose them for the liars they are. Additionally, the patents on that sysV code expired a long time ago, so reimplementing something that they do is not a violation of anything. Copying the code directly would be, however.

Eh, what? (1)

Spamalamadingdong (323207) | more than 10 years ago | (#6915349)

Although I acknowledge that SCO likely has no IP claim over Linux, it should have a fair case.
A fair case of what? Proving use of trade secrets (which had been widely distributed and taught in universities for decades)? SCO has made no claims of copyright or patent infringement.

SCO is going to get its corporate head handed to it on a platter, and I hope that the courts allow the corporate veil to be pierced so that McBride and company have to bear the cost of their misdeeds personally (and not just the duped stockholders).

The truth is out there (2, Interesting)

Teahouse (267087) | more than 10 years ago | (#6915196)

The truth is out there, we will finally get to it without signing a SCO NDA. This should end the case before it begins. SHRED ON!

Re:The truth is out there (1)

Usquebaugh (230216) | more than 10 years ago | (#6915252)

Clueless.

The question is not if there is common code, there is, but rather where the code came from?

Re:The truth is out there (1)

StenD (34260) | more than 10 years ago | (#6915394)

But the first step to answering the question of the origins of the code is determining which code needs to be investigated. That's why identifying the duplicated code is critical.

Re:The truth is out there (1)

Teahouse (267087) | more than 10 years ago | (#6915425)

Clueless

Where did I say otherwise? If you require me to spell it out, how bout this...FINDING COMMON CODE THAT IS SOLELY SYSTEM V IP.

You can always correct my spelling next if you want, I am sure you could find hours of anal retentive pleasure there too.

Re:The truth is out there (1)

DaveAtFraud (460127) | more than 10 years ago | (#6915426)

You seem to be under the mistaken impression that SCO is interested in establishing the truth.

Breaking News! (3, Funny)

TexVex (669445) | more than 10 years ago | (#6915216)

This just in. SCO to sue ESR for patent infringement over "comparator", a software package that performs comparison between different sets of source code to determine if any code is copied between them.

In response, SCO hires Iraqi Information Minister (-1, Redundant)

DCowern (182668) | more than 10 years ago | (#6915228)

Darl's head must be spinning so fast that he doesn't know which way is up any more. Several scattered thoughts come to mind, among them "chutzpah", "pump and dump", and "someone's going to jail when this is all over." I think he is just going to replace the 'offending' code with a big ASCII middle finger. Ok, I'll stop now.

This Comment was generated with the Comment-O-Matic for SCO Stories. [rageagainst.net]

I can decipher it! (0)

normal_guy (676813) | more than 10 years ago | (#6915337)

What do we get? It's like SCO is holding a handgrenade and people are slowly moving away from the madman. Shhh! You are breaking my concentration! I'm trying to shed a bitter tear for them. You mean this whole lawsuit thing is for real?

This Comment was generated with the Comment-O-Matic for SCO Stories. [rageagainst.net]

Can Someone Explain? (2, Interesting)

Klync (152475) | more than 10 years ago | (#6915230)

If you're comparing two sets of code via their MD5 sums, then won't that miss matching lines that differ by even one character - like, say, a space?

Re:Can Someone Explain? (4, Interesting)

stratjakt (596332) | more than 10 years ago | (#6915284)

Perhaps if you parsed them both, and compared the resulting object code, right before compilation?

That way if your variable is called numOfPorts and mine is called countOfPorts, the parsed code is the same for both, when stuff like that becomes meaningless.
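A full parse is beyond a short example, but a crude approximation of the idea above is to replace every identifier with a placeholder before hashing. The sketch below is only that: the keyword set is deliberately incomplete, the regex is naive, and `normalize_identifiers` is a name invented here. Nothing in this thread says comparator does anything like this.

```python
import hashlib
import re

# Deliberately incomplete keyword list, for illustration only.
C_KEYWORDS = {"int", "char", "void", "for", "while", "if", "else", "return"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def normalize_identifiers(line):
    """Replace every non-keyword identifier with the placeholder 'ID'."""
    return IDENTIFIER.sub(
        lambda m: m.group(0) if m.group(0) in C_KEYWORDS else "ID", line
    )

def digest(line):
    """MD5 of the line after identifier normalization."""
    return hashlib.md5(normalize_identifiers(line).encode("utf-8")).hexdigest()

mine = "int countOfPorts = 0;"
yours = "int numOfPorts = 0;"
print(normalize_identifiers(mine))    # both lines normalize to the same text
print(digest(mine) == digest(yours))
```

The obvious cost is false positives: once names are erased, many unrelated lines such as simple assignments collapse to the same digest, so matches would need to span longer runs before they mean much.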

Even if not, SCO seems to be saying that much of the code is copy-n-paste anyways.

Re:Can Someone Explain? (5, Informative)

Sterling Christensen (694675) | more than 10 years ago | (#6915327)

From its manual:
"The -w causes all whitespace in the file (including blank lines) to be ignored for comparison purposes (line numbers in the output report will nevertheless be correct). This is recommended for comparing C code; among other things it means the comparison won't be fooled by differences in indent style."

Re:Can Someone Explain? (1)

jonabbey (2498) | more than 10 years ago | (#6915352)

You'd have to transform the segments, basically boiling them down to a canonical form before generating the MD5 hash.

So you might turn all contiguous whitespace (tabs, spaces, etc.) into a single space char before generating the hashes, for instance.
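To make the canonical-form idea concrete, here is a small Python sketch with two possible normalizations: collapsing each whitespace run to a single space, as suggested above, and dropping whitespace entirely, which is roughly what the quoted -w description says. The helper names and sample snippets are assumptions, and this is not comparator's implementation.

```python
import hashlib
import re

def collapse_whitespace(text):
    """Turn each run of whitespace into one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()

def strip_whitespace(text):
    """Remove all whitespace, including newlines."""
    return re.sub(r"\s+", "", text)

def digest(text, canonicalize):
    """MD5 of the text after applying the chosen canonicalization."""
    return hashlib.md5(canonicalize(text).encode("utf-8")).hexdigest()

# The same hypothetical statement in two indent styles.
kr_style = "if (x) {\n\treturn 1;\n}"
gnu_style = "if (x)\n  {\n    return 1;\n  }"

print(digest(kr_style, collapse_whitespace) == digest(gnu_style, collapse_whitespace))
print(digest("if (x){", collapse_whitespace) == digest("if (x) {", collapse_whitespace))
print(digest("if (x){", strip_whitespace) == digest("if (x) {", strip_whitespace))
```

Collapsing handles differences in indentation and line breaks but still distinguishes `){` from `) {`; stripping everything is more aggressive and treats those as equal.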

Re:Can Someone Explain? (2, Insightful)

Anonymous Coward | more than 10 years ago | (#6915400)

While you might be able to deal with whitespace, you do still have the problem that you're really only looking at whole-file matches for identity. You can't find one function lifted from some other source. You can't find code that's had even minimal cosmetic surgery on the variable names.

While a high degree of exact matching between two trees would demonstrate related code, lack of a high degree of identical files as determined by this method does not demonstrate that two code trees are unrelated. It's perhaps an interesting metric for comparing two projects that you already know are related, like two forks of a project or two versions of one project. But this technique is nearly useless as an anti-SCO defense.

Fighting machines with machines is pointless (-1, Redundant)

Anonymous Coward | more than 10 years ago | (#6915232)

how long until SCO claim rights to this piece of software???

Other uses? (4, Interesting)

Not_Wiggins (686627) | more than 10 years ago | (#6915235)

It might be interesting to see how different families of Linux/Unix compare... maybe generate a veritable "family tree" of relationships.

Of course, that also depends more on how differences are actually calculated. Still, could make an interesting project to relate OSes based on how much shared code they still retain and show it in a graphical tree format, ala "family tree." 8)
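One hedged way to put a number on "how much shared code they still retain" is the Jaccard similarity between sets of line-window digests. The sketch below uses made-up toy trees and invented helper names (`shred_set`, `jaccard`), and it stops at pairwise scores; building an actual tree from them would need a clustering step that is not shown.

```python
import hashlib
from itertools import combinations

def shred_set(lines, window=2):
    """Set of MD5 digests over every run of `window` consecutive lines."""
    return {
        hashlib.md5("\n".join(lines[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - window + 1)
    }

def jaccard(a, b):
    """Shared digests divided by total distinct digests (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical source trees, each reduced to a short list of lines.
trees = {
    "parent": ["a", "b", "c", "d"],
    "fork": ["a", "b", "c", "x"],
    "unrelated": ["p", "q", "r", "s"],
}

shreds = {name: shred_set(lines) for name, lines in trees.items()}
similarity = {
    (m, n): jaccard(shreds[m], shreds[n]) for m, n in combinations(trees, 2)
}
for pair, score in similarity.items():
    print(pair, score)
```

Pairwise scores like these could feed any standard hierarchical clustering routine to draw the "family tree", though whether the result reflects real lineage depends heavily on window size and normalization.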

Re:Other uses? (2, Funny)

JeffTL (667728) | more than 10 years ago | (#6915306)

Yeah, that'd be great. In my anthropology class we've been studying that sort of stuff, but with DNA...there are some tree diagrams of primates, so why not Unices?

Genius (2, Insightful)

seldolivaw (179178) | more than 10 years ago | (#6915245)

ESR shows us once again why exactly he has so much respect from the community. Well done, that man.

What respect? (3, Interesting)

Anonymous Coward | more than 10 years ago | (#6915283)

Most people *I* know consider ESR to be a bloated windbag with a penchant for fanatical gunrights. He's regarded as pretty much being on the same level as the late Jon Katz.

fire the "laser" (0, Offtopic)

La Temperanza (638530) | more than 10 years ago | (#6915260)

I can't decide which is funnier - the point about IBM orchestrating all the outrage, or the point that SCO is somehow more "relevant" to the tech community because they've filed a bunch of press releases! SCO has committed the most vile of sin. Ok, I'll stop now.

This Comment was generated with the Comment-O-Matic for SCO Stories. [rageagainst.net]

Re:fire the "laser" (3, Interesting)

be-fan (61476) | more than 10 years ago | (#6915374)

You know the sad thing about all this? I can't tell the difference between the auto-generator and your average Slashdotter. Does this mean that the auto-generator passes the Turing Test, or that the average Slashdotter doesn't?

Who cares? (3, Insightful)

Otter (3800) | more than 10 years ago | (#6915262)

OK, I admit that a) the guy annoys the hell out of me, b) his yapping about "one of us" DOS'ing SCO is yet another case of him embarrassing Linux while aggrandizing himself and c) just the quotes in this article alone make me want to slap him. So if someone else had been involved with this, I probably wouldn't bother to care.

Anyway -- who cares? There's no question there are plenty of common chunks between Linux and SCO-owned source. And that there are ways to find them. The question is what they are (which SCO isn't saying) and what their common origin is and where that origin falls in the murky history of the Unix codebase. It's not as if anyone has been saying, "We're helpless in the face of this computational problem. If only there were a way to compare large bodies of text for common elements!"

Never mind that there are probably people who can compare both codebases in their heads.

Maybe he's made some major algorithmic breakthrough. (I doubt it, but I'll leave that to the experts.) But this story is just him yapping again.

Re:Who cares? (2, Informative)

jmv (93421) | more than 10 years ago | (#6915354)

I think the difference is that a 3rd party that has access to the SysV source can compute the hashes and make them public without violating copyright. That way anyone can look for common lines with Linux and see where they came from (legal or not).
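A minimal sketch of that workflow, assuming 3-line windows hashed with MD5 as described elsewhere in this thread (the helper `shred_index` and both code samples are hypothetical, not comparator's real interface):

```python
import hashlib

def shred_index(text, size=3):
    """Map the MD5 of each overlapping `size`-line window to its first line number."""
    lines = text.splitlines()
    index = {}
    for i in range(len(lines) - size + 1):
        window = "\n".join(lines[i:i + size])
        digest = hashlib.md5(window.encode("utf-8")).hexdigest()
        index.setdefault(digest, i + 1)  # 1-based line number of first occurrence
    return index

# Step 1: done by the third party on site; only the digests are published.
proprietary = "x = 1;\ny = 2;\nz = x + y;\nsecret();"
published = set(shred_index(proprietary))

# Step 2: done by anyone, using only the published digests and public source.
public = "init();\nx = 1;\ny = 2;\nz = x + y;\ndone();"
matches = sorted(line for digest, line in shred_index(public).items()
                 if digest in published)

print(matches)  # starting lines in the public file whose 3-line window matched
```

The person running step 2 never sees `proprietary`, only the set of digests, yet still learns which lines of the public file to go and investigate.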

Finally ESR stops yapping and does some hacking (2, Funny)

Anonymous Coward | more than 10 years ago | (#6915266)

ESR is ok you know, but lately he has just been doing lots of ranting and soapboxing and no hacking.

Finally he comes out with some hack action. About time man, I was beginning to view him as just some big windbag who hacked a little back in the day. Well I still sorta do, but this is at least pretty cool, you know.

SCO may not know origin of code (5, Informative)

Malfourmed (633699) | more than 10 years ago | (#6915267)

The Sydney Morning Herald continues its mainstream coverage of the SCO vs IBM roadshow by posting an article where Dr Warren Toomey, a Unix historian, says that SCO may not know the origin of their own code [smh.com.au].

Article text follows:

SCO may not know origin of code, says Australian UNIX historian

By Sam Varghese
September 9, 2003

More doubts have been cast on the heritage of System V Unix code, which the SCO Group claims as its own, by an Australian who runs the Unix Heritage Society. [tuhs.org]

Dr Warren Toomey, now a computer science lecturer at Bond University, said today: "I'd like to point out that SCO (the present SCO Group) probably doesn't have an idea where they got much of their code. The fact that I had to send SCO (the Santa Cruz Organisation or the old SCO) everything up to and including Sys III says an awful lot."

He said that even though SCO owned the copyright on Sys III, a few years ago it did not have a copy of the source code. "I was dealing with one of their people at the time, trying to get some code released under a reasonable licence. I sent them the code as a gesture because I knew they did not have a copy," he said with a chuckle.

Dr Toomey's statements come a few days after Greg Rose, an Australian Unix hacker from the 1970s, raised the possibility that there may be code contributed by people, including himself, which has made its way into System V Unix and is thus being used by companies like the SCO Group.

Dr Toomey said this was one reason why the code samples which the SCO Group had shown at its annual forum had turned out to be widely published code.

SCO was unaware of the origins of much of the code and this "explains how they could wheel out the old malloc() code and the BPF (Berkeley Packet Filter) code, not realising that both were now under BSD licences - and in fact they hadn't even written the BPF code," Dr Toomey said.

He said that there was lots of code which had been developed at the University of New South Wales in the 70s which went to AT&T and was incorporated into UNIX without any copyright notices.

"At that time the development that was going on was similar to open source - the only difference was that the developers all had to have copies of the code licensed from AT&T," he said.

Dr Toomey, who served 12 years with the Australian Defence Force Academy, an offshoot of the University of New South Wales, before joining Bond University, said he had source code for Unices from the 3rd version of UNIX which came out in 1974 to the present day. "I don't have Sys V code but there are people with licences for that code who are members of the Unix Heritage Society. We can compare code samples any time," he said.

He agreed that the codebase of Sys V was a terribly tangled mess. "It is very difficult to trace origins now. There is an awful lot of non-AT&T and non-SCO code in Sys V. There is a lot of BSD code there," he said.

In March, the SCO Group filed a billion-dollar lawsuit against IBM, for "misappropriation of trade secrets, tortious interference, unfair competition and breach of contract."

SCO also claimed that Linux was an unauthorised derivative of Unix and warned commercial Linux users that they could be legally liable for violation of intellectual copyright. SCO later expanded its claims against IBM to US$3 billion in June when it said it was withdrawing IBM's licence for its own Unix, AIX.

IBM has counter-sued SCO while Red Hat Linux has sued SCO to stop it from making "unsubstantiated and untrue public statements attacking Red Hat Linux and the integrity of the Open Source software development process."

-----

Wordforge writing contest now open: deadline 2003-03-28

Be careful... (4, Interesting)

nolife (233813) | more than 10 years ago | (#6915295)

The more points you discover and disprove now in SCO's claims, the higher-quality, more refined, and more detailed SCO's evidence will be when this setup finally gets to a court in front of a judge. If they had gone to court two months ago, or even today, they would have been sent home quickly with basically easy-to-disprove evidence. With the help of the open source community, they are slowly changing their weapon of choice from a shotgun to a rifle.

Slashdot to present all headlines as questions? (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915309)

just wondering....

Could Linux users BE any gayer? (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6915311)


Constantly referring to Eric Raymond as "ESR" and Richard Stallman as "RMS".

How totally gay.

No wonder Linux has replaced MacOS as THE gay operating system.

Re:Could Linux users BE any gayer? (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915365)

You sir are 100% correct! Mod parent up!

Re:Could Linux users BE any gayer? (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6915409)

yessthhh thirrr!

Unfortunate name (2, Insightful)

tordon (176098) | more than 10 years ago | (#6915312)

Upper management in most enterprises has a low level of technical knowledge. To them the thought of something called shredding coming anywhere near the 'voodoo' of software development would be abhorrent.

Is this really as useful as it seems? (1)

typobox43 (677545) | more than 10 years ago | (#6915314)

What I see from the article is that it can only compare whether two code snippets are exactly alike (which makes sense from the standpoint of MD5 - they're really only useful for equality checks) - and from the claims that are being thrown around about obfuscating the supposedly legal code, that isn't going to help much of anything.

Ridiculous (1)

ckimyt (159096) | more than 10 years ago | (#6915323)

Interestingly, as the shred algorithm can run reports on source trees using only the MD5 signature shreds (once generated), it is possible to use it to compare trees without direct access to the source code itself, leading to a possible use in comparing various proprietary source trees with each other and with Freely available code bases such as Linux and *BSD without requiring actual disclosure of the proprietary source code
That's just plain dumb:

[evilscobox] /usr/codebase> find . \( -iname '*.c*' -o -iname '*.h*' \) -exec sh -c 'echo "/* I'\''m defeating MD5 */" >> "$1"' sh {} \;

Duh!

Re:Ridiculous (1)

jonabbey (2498) | more than 10 years ago | (#6915392)

Nice job on not reading the story. Shred calculates a panoply of MD5 hashes for each text file: essentially, it creates an MD5 hash of each 3-line segment of code, starting at every line offset.

Re:Ridiculous (1)

Another MacHack (32639) | more than 10 years ago | (#6915414)

Shred works by taking checksums of groups of three lines. Adding a line at the end won't keep it from matching every other group of three lines.
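To make that concrete, here is a minimal sketch of the windowing as this thread describes it (3-line groups, one per starting line). It is not ESR's implementation, and the sample lines are invented:

```python
import hashlib

def shreds(text, size=3):
    """Return the MD5 of every overlapping `size`-line window in `text`."""
    lines = text.splitlines()
    return {hashlib.md5("\n".join(lines[i:i + size]).encode("utf-8")).hexdigest()
            for i in range(len(lines) - size + 1)}

original = "a();\nb();\nc();\nd();\ne();"
padded = original + "\n/* I'm defeating MD5 */"  # the 'attack': append one line

common = shreds(original) & shreds(padded)
print(len(shreds(original)), len(common))
```

Appending a comment only creates one new window at the end of the file; every window that existed before the change is still present and still matches.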

Furthermore, the Linux code in question is a matter of historical record. The only code which could be usefully changed would be the code which is being released only via shred-indexes. In that case, though, the person trying to modify the code would only decrease the match-likelihood rating by adding random crap.

Dibs on naming the KDE GUI (4, Funny)

Eberlin (570874) | more than 10 years ago | (#6915332)

KDE GUI version should be called Krang since Shredder would obviously be used from the command line (shell). Maybe it should have helper apps called Bebop and Rocksteady. And if the need should arise, the project shouldn't fork...it should splinter.

This is actually a darn good idea (5, Informative)

RocketRick (648281) | more than 10 years ago | (#6915335)

By computing MD5 hashes of consecutive (overlapping) line triplets, the shred algorithm makes it easy to identify copied code, without ever seeing the actual code. This might be a perfect way for companies to allow a third party to compare code, without giving away any trade secrets in the process.

Of course, since MD5 is a very good cryptographic hash function, *any* one-bit change in the source will result in, on average, half of the bits in the result being flipped. So, this method of identifying copied code would only work if the code had never been run through an obfuscator. It would also be defeatable by running the source through a script to have its variable names search-and-replaced with similar names (such as replacing every variable name with a new name consisting of the old name plus "_newname")....

In short, this might be a useful technique for allowing a third party to look for trivial wholesale copying of code, but it would be useless for finding a motivated miscreant, determined to steal code without being caught.
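The renaming weakness can be seen in a small sketch (toy C-like input; the `_newname` suffix follows the example above, everything else is assumed for illustration):

```python
import hashlib
import re

def md5_bits(text):
    """Return the 128-bit MD5 digest of `text` as an integer."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")

chunk = "int count = 0;\ncount += n;\nreturn count;"
renamed = re.sub(r"\bcount\b", "count_newname", chunk)  # search-and-replace rename

# XOR the digests and count the set bits to see how far apart they are.
differing = bin(md5_bits(chunk) ^ md5_bits(renamed)).count("1")
print(f"{differing} of 128 digest bits differ")
```

Any nonzero count means the two three-line chunks no longer match, even though they compile to the same thing.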

SCOs IP violations will be obvious... (0)

Anonymous Coward | more than 10 years ago | (#6915339)

This will also find all the places SCO has violated the terms of other organizations' IP...

Ups and downs (4, Informative)

autocracy (192714) | more than 10 years ago | (#6915342)

Upside: we can maybe help catch more stolen code.
Downside: Uh... it just came out... and it's making some big, big claims involving fuzzy logic. I think it's gonna need some testing first, eh?

Also, anybody else think it only works on larger sections of code than just say 10 lines?

Bad for Students (2, Funny)

chicagoan (670650) | more than 10 years ago | (#6915347)

I'm just glad that I finished college before they had this technology otherwise I might have been caught for cheating. Although I was really good at renaming variables.

ANNOUNCEMENT! GNOME 2.4 IS A PIECE OF SHIT! (-1, Offtopic)

anonymous coword (615639) | more than 10 years ago | (#6915359)

I am sorry to announce this, but the eagerly awaited gnome 2.4 is a piece of SHIT! Please either stick to Gnome 2.2 or use the new KDE 3.2 [kde.org] alpha!.

Here is why GNOME 2.4 is shit!

  • Still has that shitty file dialog!
  • File roller has removed the extract here option from the right click menu, now you have go through about 5 menus just to extract a file
  • Metashitty still doesn't support button reordering
  • Nautilus still doesn't have split pane support!
  • The new panel architecture sucks! It's harder to use and configure
  • Epiphany, which was once going to be a lean browsing machine, has been turned into an AOL clone! It's a shameful 1.0 release, and they removed Bookmark folders, now all your bookmarks will be cluttered
  • Still no colour scheme changer
  • Wanda the fish stil looks gay
  • Totem still crashes on avi files
  • The documentation is still half assed
  • The anti feature nazis have taken over 100 features away from Gnome since 2.2, I can't list them all, figure it out for yourself
  • bonobo-slay still owns your dialogs
  • The smelly foot is still there, with NO WAY TO change it, since the feature police took it away.


Please boycott gnome until they put the features back, A REALLY PISSED GNOME USER! WHEN KDE 3.2 ALPHA COMES OUT I'M SWITCHING! WILL YOU?

Interesting implementation, but flawed (1)

mTor (18585) | more than 10 years ago | (#6915366)

A simple re-indentation or a variable change would fool comparator. What someone needs to do is to implement a parse tree comparison tool which would be able to compare files on a semantic level.
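A full parse-tree comparison is more than fits in a comment, but a much cruder token-level stand-in shows the direction: tokenize, replace every non-keyword identifier with a placeholder, and hash the result. This is only a sketch under those assumptions; the keyword list is partial and the regex tokenizer ignores strings, comments, and the preprocessor:

```python
import hashlib
import re

C_KEYWORDS = {"if", "else", "for", "while", "return", "int", "char", "void"}  # partial
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")  # identifiers, integers, single symbols

def normalized_md5(code):
    """Hash the token stream with identifiers collapsed and whitespace dropped."""
    tokens = []
    for tok in TOKEN.findall(code):
        if re.match(r"[A-Za-z_]", tok) and tok not in C_KEYWORDS:
            tok = "ID"  # every non-keyword identifier becomes the same placeholder
        tokens.append(tok)
    return hashlib.md5(" ".join(tokens).encode("utf-8")).hexdigest()

a = "int total = 0;\nfor (i = 0; i < n; i++)\n    total += buf[i];"
b = "int sum=0; for(k=0;k<len;k++) sum+=data[k];"  # renamed and re-indented

print(normalized_md5(a) == normalized_md5(b))
```

The trade-off is more false positives: two unrelated loops with the same shape now hash identically, so a match is a hint to look closer rather than proof of copying.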

from the Bouncing legs dept. (0)

bobdotorg (598873) | more than 10 years ago | (#6915373)

When I saw the headline 'shred SCO', 'from the woodchipper department' a vision of a corporate purge 'Fargo Style' popped into my head.

Nonsensical idea (1, Interesting)

YU Nicks NE Way (129084) | more than 10 years ago | (#6915376)

Great. So cool. And so stupid.

First, IBM, Sequent, SGI and Linux wouldn't be off the hook if the provenance of each line of code were proven to have come from other sources. There are a number of trade secret issues that still could crop up.

But let's assume that Raymond's work was actually run on the SCO source and on Linux. Would the results be meaningful?

No.

Suppose I have a routine that comes originally from source B. I work for a company which has the right to copy B, but which redistributes the results of its work under a closed license. Call that new source S. It so happens that the code my company got from B had a nasty bug in it, and I spent a month finding a fix for that bug. Suppose also that the fix is quite small relative to the original code, as is usually the case. A shredder is going to find significant similarities between that routine as implemented in source B and in source S. Now, suppose source L comes along. The authors of L had the right to copy from B, but not from S. They have a very similar routine, originally derived from B. After shredding, the routines in B, S, and L will all look similar -- but whether there's an infringement between S and L will depend solely on a tiny fragment of the code. Without disclosing that fragment, there is no way to determine if there's an infringement or not.

Re:Nonsensical idea (4, Insightful)

El (94934) | more than 10 years ago | (#6915431)

Comparing the hashes doesn't give you a definitive answer; it does, however, tell you where to look. Or which submitters to ask for clarification on the origins of potentially infringing code. That's more than we have now!
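One way the hashes could narrow down where to look in the B/S/L scenario, sketched with set arithmetic on shreds (the four code samples are invented, and this assumes shreds of B are available alongside those of S and L):

```python
import hashlib

def shreds(text, size=3):
    """MD5 of every overlapping `size`-line window in `text`."""
    lines = text.splitlines()
    return {hashlib.md5("\n".join(lines[i:i + size]).encode("utf-8")).hexdigest()
            for i in range(len(lines) - size + 1)}

B = "open();\nread();\nclose();\nlog();"
S = "open();\nread();\ncheck();\nclose();\nlog();"      # B plus a one-line fix
L_clean = "open();\nread();\nclose();\nlog();\nexit();"  # derived from B only
L_suspect = "init();\nopen();\nread();\ncheck();\nclose();"  # contains the fix

# Shreds that exist in S but not in its legitimate ancestor B.
s_only = shreds(S) - shreds(B)

print(len(s_only & shreds(L_clean)), len(s_only & shreds(L_suspect)))
```

Shreds shared by all three trees are filtered out by the subtraction, so whatever remains points at the small fragment that actually matters. It still takes a human with access to the source to decide what a hit means.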

Slim to None (5, Insightful)

tomRakewell (412572) | more than 10 years ago | (#6915382)

Chances are slim to none that a software company would allow its "shredded" source code to be publicly released. What happens if the proprietary source is found to violate the GPL?

Proprietary (closed) source companies have a tremendous advantage over open source software when it comes to violating intellectual property. Who will ever know if they did it? A source code "comparator" eliminates that crucial advantage.

Results Will Appear "Tainted" (5, Insightful)

zapf (119998) | more than 10 years ago | (#6915386)

While I fully support ESR and the rest of the open source movement's defense of Linux against SCO, I have a feeling that this tool's results will not immediately be accepted by established media simply because of ESR's bias. A reporter looking into the SCO story who knows little about open source wouldn't trust a tool made by one side of the disagreement.

It seems very important to me that "third parties" and experts who are not an integral part of the open-source movement validate that comparator works as intended and is effective at detecting code similarities. Hopefully we'll see some articles on respected sites in the next week or so with conclusive analyses of comparator. Not to mention a chance for someone to use it on SCO's code!

Oh, and "Yes, I'm being deliberately vague and tantalizing" is quite funny.

IBM has a project called History Flow (5, Interesting)

TedTschopp (244839) | more than 10 years ago | (#6915397)

This is perhaps a better project and it would be interesting to see this tool run against the source.

History Flow [ibm.com] The following is from their website:

history flow
visualizing dynamic, evolving documents and the interactions of multiple collaborating authors:

Motivation
Most documents are the product of continual evolution. An essay may undergo dozens of revisions; source code for a computer program may undergo thousands. And as online collaboration becomes increasingly common, we see more and more ever-evolving group-authored texts. This site is a preliminary report on a simple visual technique, history flow, that provides a clear view of complex records of contributions and collaboration.

Would this really work? (3, Insightful)

Paradox (13555) | more than 10 years ago | (#6915399)

Well, I was looking at ESR's description of the code (I haven't read the code yet), and it seems to say that he takes 3 line slices, MD5s them, then compares them for identical points. I'm sure he compensates for funky whitespace and whatnot like diff and patch do...

But if even one bit of the source is different, the MD5 hash will be quite different. So, the code slices have to be IDENTICAL. This is not a very good system because a simple find-replace could defeat it. A variable's name changed by one letter, or even capitalization, will defeat it.

Unless the code reveals much more complex tricks than ESR describes in the help file, this tool wouldn't be much use in the SCO case. Hell, it wouldn't be much use catching college class cheaters even.

Next Step? (1)

Bilbo (7015) | more than 10 years ago | (#6915410)

Do I smell a Court Order in the works? If you really can do this without divulging the original code, then someone could conceivably convince a judge to issue an order to have a "neutral third party" create the MD5 sums on SCO's codebase, giving us a chance to look for pirated GPL code hidden inside of SCO proprietary products, without having to look directly at the SCO code.