Wikipedia and Plagiarism

CmdrTaco posted more than 7 years ago | from the less-than-college-papers dept.

Spo22a writes "Daniel Brandt found the examples of suspected plagiarism at Wikipedia using a program he created to run a few sentences from about 12,000 articles against Google Inc.'s search engine. He removed matches in which another site appeared to be copying from Wikipedia, rather than the other way around, and examples in which material is in the public domain and was properly attributed. Brandt ended with a list of 142 articles, which he brought to Wikipedia's attention.... 'They present it as an encyclopedia,' Brandt said Friday. 'They go around claiming it's almost as good as Britannica. They are trying to be mainstream respectable.'"
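
The summary describes the approach only in outline: take a few sentences from each article and look for them elsewhere. A minimal sketch of that idea follows. It is not Brandt's program; the search-engine query is replaced by a plain string comparison against a candidate source text, and all function names are made up for illustration.

```python
import re

def sample_sentences(text, count=3, min_words=8):
    """Return up to `count` sentences long enough to be distinctive."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    long_enough = [s for s in sentences if len(s.split()) >= min_words]
    return long_enough[:count]

def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not hide a match."""
    return " ".join(text.lower().split())

def matching_sentences(article, candidate_source):
    """Return the sampled sentences that appear verbatim in the candidate source."""
    source = normalize(candidate_source)
    return [s for s in sample_sentences(article) if normalize(s) in source]
```

A match found this way only says the two texts share a sentence. Deciding who copied whom, and whether the material is public domain or properly attributed, is the manual filtering step the summary mentions.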

267 comments

But (1)

JesseBikman (1002865) | more than 7 years ago | (#16724991)

wikipedia is free.

Re:But (1)

The MAZZTer (911996) | more than 7 years ago | (#16725037)

So are my term papers.

Re:But (0, Troll)

NineNine (235196) | more than 7 years ago | (#16725301)

wikipedia is free.
 
... so is shit.

Re:But (1)

JesseBikman (1002865) | more than 7 years ago | (#16725375)

but if shit is knowledge, wouldn't it be good to know your shit?

Re:But (1)

Dunbal (464142) | more than 7 years ago | (#16726111)

but if shit is knowledge,

      Then tubgirl is a smart woman...

Re:But (1)

albertost (1019782) | more than 7 years ago | (#16725701)

not shit, but cancer, Mr Ballmer

Re:But (1)

Tim C (15259) | more than 7 years ago | (#16726165)

But copyright infringement isn't; just being non-commercial won't necessarily save it, if infringement is indeed taking place.

Linux FAQ (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#16724995)

The Linux FAQ


Here's a list of some questions frequently asked and answered here
and elsewhere that you may find useful in your quest to try Linux.
Read these carefully before you decide to invest time in Linux; you
may find that you have better things you can do instead.


SECTION ONE - INSTALLATION
--

1.1 Q: I heard linux was easy to install, is it?
        A: That depends on what distro you try. Most of them will have
              trouble detecting all your hardware. Most new hardware devices
              are not supported. If you're lucky you might be able to find
              something that someone threw together on the net. But that's
              after spending a couple hours searching and probably won't take
              advantage of your hardware to its fullest capability.

1.2 Q: Once I get it installed, then what?
        A: Then you get the joy of making sure everything is configured
              right. Plan on a minimum of two hours per device to get it to
              work. That's if the device is even supported.

1.3 Q: What happens if I'm in the middle of an install and the
              installation freezes or just stops?
        A: You get to reboot and start all over again. :) This happens
              every so often with Linux. It seems like it's buggy install
              routines or something. Ain't Linux grand? :)

1.4 Q: What's the deal?! I installed Linux and it took up almost 2GB
              hard drive space!
        A: The Linux distros usually install a LOT of never-used programs
              on the default install. You can pick and choose what you want,
              but good luck figuring out what programs are needed and what is
              useless, obscure tools. Linux usually installs stuff like 10
              different editors, 12 different mail clients, and so on.

        (more to come...)

SECTION TWO - CONFIGURATION
--

2.1 Q: What's with all these cryptic files?
        A: All of Linux is configured with cryptic text files. Some of
              the more user-friendly distros have configuration utilities
              that claim to do it for you, but these work
              sometimes and other times don't, so sometimes you have to
              edit them by hand. With Linux's spotty reliability in UI
              programming, you might as well get used to it.

2.2 Q: What is killall, HUP, ls, cat, rm, which, etc and why are
              these programs telling me to do them? Arggg!!
        A: These are command line programs that do things within the
              system. It's what makes Linux a powerful OS for those that
              are experienced with it. But it's also what makes it a pain
              in the arse to use and inefficient as a desktop system. Who
              wants to type all the time when they can just click?!

              (more to come...)

SECTION THREE - APPLICATIONS
--

3.1 Q: Where can I get some programs to run on linux?
        A: Good question. Because Linux doesn't have a large user base
              on the desktop,(I think it's about 0.24%, less than 1%)
              companies that make software won't write their programs for
              Linux. There's a lot of community created programs out there,
              and some are fairly good, but those are few and far between.
              Most of the Linux software that tries to mimic its Windows
              counterpart is substandard. It's usually slow and buggy and
              early in development.

3.2 Q: I tried to install an RPM but I got 'failed dependencies', what
              is that?
        A: That's Linux's version of DLL hell. Different versions and
              distros use different libraries. So unlike windows where
              programs will run on many different versions, Linux programs
              will fail if they're not made for your specific version.

3.3 Q: What is compiling and configure, make and make install? And
              what is a makefile?
        A: This is a way to build the programs from the source code
              under Linux. When the question above fails, you can always
              build it yourself. The advantage is that it works most of
              the time. The disadvantage is that it takes forever to build
              large programs, you need to know some cryptic commands and
              you have to do all this on a command line. Unlike Windows
              where you just double click and you are done.

3.4 Q: Can I go to my local store to buy any Linux applications?
        A: Not really. You can buy Linux itself at various stores. But
              not too many commercial companies make applications for Linux,
              there's no profit in it with 0.24% of the desktop market.

              (more to come...)

SECTION FOUR - SPEED ISSUES
--

4.1 Q: Why is Linux so slow?
        A: Linux is built on the technology of the old UNIX OS's. Even
              the graphical user interface of Linux is a separate program that
              is the same type they used back in the older UNIX days. So
              working with old technology will give you the old technology
              responsiveness. Also, a lot of the GUI's, although nice to look
              at, are still not mature. Using them is slow and sluggish
              compared to, say, Windows.

              (more to come...)

SECTION FIVE - CONSISTENCY
--

5.1 Q: Why are the windows different looking?
        A: Since Linux isn't built by one company or group and has no
              governing body, programs and interfaces can vary dramatically.
              You can have everything from the nice look of KDE, to something
              as ugly as Gnome and everything in between. You'll usually see some
              varying UI style in Linux.

5.2 Q: Should I buy Suse Linux?
        A: No. They make it difficult to get it for free. All the other
              distros provide free ISO's to download. Suse is the only one
              that doesn't provide them but instead has a FTP install that's
              hard to get to work. Why should they make it easy? The more
              people that can't get the download to work have to spend $80
              or more for the boxed set. And on top of all that although
              it might have a few more user friendly tools, it's still the
              same base Linux system that's in development and that all the
              other distros are using. In other words, they're all on about
              the same level of struggling to catch up to Windows, so you're
              not going to find any earth-shattering features in one compared
              to another.

              (more to come...)

SECTION SIX - LINUX COMMUNITY
--

6.1 Q: What is RTFM?!
        A: This is an acronym for Read The Fuc*ing Manual. This is a common
              answer you'll get when asking for help in the Linux community.
              It's meant to make you feel inadequate while boosting the Linux
              person's ego at the same time. See, Linux enthusiasts consider
              themselves to be guru-like and above helping out the simple
              newbie. You have to earn your respect by spending countless
              hours becoming a kernel hacker before you're worthy of getting
              any help.

6.2 Q: Why does everyone think they are better than you when using Linux?
        A: Same as above. When people use Linux they believe since it takes
              a little more knowledge to use Linux, they are technically superior,
              and see themselves as an elite group that doesn't have time for the
              pathetic little Windows people.

              (more to come...)

SECTION SEVEN - LINUX ADVOCACY, HELPING OR HURTING?
--

7.1 Q: Everyone in here says linux is perfect, why would they say that
              if it isn't?
        A: We really don't know. Maybe they've used Linux so long that
              they've gotten used to it. Some of these people haven't used
              Windows in years so they are comparing Linux to the last windows
              they used, maybe Windows 3.1 or 95.

7.2 Q: Why does everyone call you a troll when you ask something that
              questions linux?
        A: Most of the people here (slashdot) think of Linux more like a
              religion than an OS. They mostly are MS haters and feel that
              Linux is the greatest thing to ever hit computing. So when
              someone questions Linux it's like questioning their belief
              system. Instead of looking at it with some logic and reasonable
              judgment, they will lash out at you and claim you are a troll
              or a paid MS supporter.

              (more to come...)

SECTION EIGHT - LINUX EVANGELISM, ZEALOTS
--

8.1 Q: There are some people that call this FAQ lies and seem to treat it
              like it's a conspiracy against them, and post all sorts of links to
              anti-microsoft articles. Why are they reacting so strongly?
        A: The people that are reacting so strongly are most likely the Linux
              extremists that believe everything negative that is said about Linux
              comes from Microsoft. Like many cult-like groups, the people that
              belong to them don't have the ability to see things rationally or
              outside of their view. If someone replies to the FAQ, or anything
              questioning a non-favorable view on Linux, that seems a little
              "over the edge", do a google search on the person
              (http://groups.google.com/) and look at his/her posting history then
              decide for yourself if the person is credible or not.

              (more to come...)

Re:Linux FAQ (0)

Anonymous Coward | more than 7 years ago | (#16726115)

Anti-Linux trolls are so 1990's.

That doesn't seem like alot (2, Insightful)

NinjaFarmer (833539) | more than 7 years ago | (#16725019)

Doesn't Wikipedia have over a million articles (not in English alone, I know)? That would mean that less than 0.1% of the articles are plagiarized. It seems reasonable to me that that amount would get by unnoticed. All it takes is for the original author then to deal with it.

Re:That doesn't seem like alot (2, Insightful)

sprins (717461) | more than 7 years ago | (#16725051)

Apparently Wikipedia has over 1.5 million English articles alone. So your calculation of the percentage of 'problematic' articles is even more favourable. Of those 142 allegedly 'problematic' articles only a few really seem to be a problem as the others originated from the public domain to begin with.

Sounds like much ado about nothing once more. *yawn*

Re:That doesn't seem like alot (4, Insightful)

aquaepulse (990849) | more than 7 years ago | (#16725101)

Well, that 142 was found in his search of 12,000. If his methodology was sound, you could expect the number plagiarized within the 1.5 million to be about 17,750, or about 1.18%.
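
The arithmetic behind that extrapolation can be checked in a few lines. This is only a sketch using the figures quoted in the thread (142 flagged, about 12,000 checked, about 1.5 million English articles), and it assumes the checked articles are representative of the whole, which the story does not establish. The variable names are illustrative.

```python
flagged = 142                 # articles on Brandt's list
sampled = 12_000              # articles he checked (approximate)
total_articles = 1_500_000    # English articles, per the thread (approximate)

rate = flagged / sampled                            # share of checked articles flagged
estimated = flagged * total_articles // sampled     # scaled to the whole site

print(f"{rate:.2%}")   # 1.18%
print(estimated)       # 17750
```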

Re:That doesn't seem like alot (2, Informative)

tomhudson (43916) | more than 7 years ago | (#16725325)

... and after an investigation of some of those by Wikipedia, it was found that some were in the public domain, some were culled from government sites, and some were copied from the wiki, and not the other way around. Of those 12,000, we can now say that the wiki is at least as clean as Ivory soap (99.44%).

Re:That doesn't seem like alot (1)

sbaker (47485) | more than 7 years ago | (#16725451)

Some are also instances of people writing something on their own web site and then later deciding to put it on Wikipedia - so even the instances where the other web site predates the Wiki article may not be copyright violations. Without discussing the matter with every single original author, it's hard to know.

I guess the only thing this study tells us is that an UPPER limit on the number of plagiarisms is of the order of 1%. That's still an alarmingly high number.

Re:That doesn't seem like alot (2, Insightful)

tomhudson (43916) | more than 7 years ago | (#16725607)

Considering that an audit of dead-tree encyclopedias hasn't been done, we can't say. What we CAN say is that it's foolish to make a comparison with Britannica, when an audit of Britannica found 10% of 600 articles to be non-factual. The sources cited in those 10% disavowed the articles' contents.

This isn't all that surprising either, when you think about it. People cite people who cite people, and someone somewhere will mis-interpret what someone else wrote, or come to different conclusions while still citing the original author.

Re:That doesn't seem like alot (2, Interesting)

kkwst2 (992504) | more than 7 years ago | (#16725711)

Alarmingly high? You find it alarming that 1 of every 100 articles on a free web-based encyclopedia has plagiarized material? You are clearly much less cynical than I am. I would have guessed at least 5%, probably more.

 

Re:That doesn't seem like alot (2, Insightful)

Jazon Bladen (938809) | more than 7 years ago | (#16725493)

It's a wiki. If you find a problem with it, you fix it. Complaining about a wiki is like yelling at a puzzle with a single piece not in place: you can easily solve the problem, you're just stupid.

Re:That doesn't seem like alot (0)

Anonymous Coward | more than 7 years ago | (#16725901)

...sigh... where is BadAnalogyGuy when you really need him?

Re:That doesn't seem like alot (1)

nomadic (141991) | more than 7 years ago | (#16725091)

Except the story specifically says he checked only about 12,000 of wikipedia's articles, so that would make it about 1% are plagiarized if you extrapolated. Still not horrible, but I'm guessing it's a lot higher than Britannica.

Re:That doesn't seem like alot (1)

Yvanhoe (564877) | more than 7 years ago | (#16725223)

Right now we can just watch and see how this story ends. I doubt this automated procedure could take into account contributors copying their own copyrighted materials into wikipedia. I think this has already happened. I don't say that 100% of the 142 articles are in this case, but I think he raises an interesting point. Let's now see how this ends.

Re:That doesn't seem like alot (1)

DragonWriter (970822) | more than 7 years ago | (#16725819)

Except the story specifically says he checked only about 12,000 of wikipedia's articles, so that would make it about 1% are plagiarized if you extrapolated.

Which would make sense to do if it was a systematic random sample, rather than a selection conducted by someone who has been on an anti-Wikipedia crusade for quite some time, as this one is. Of course, there is the question of the trustworthiness of the original number, as well, as the material was never independently reviewed, and Wikipedia's own reviews (as TFA notes) found some cases that Brandt did not eliminate where the other site appears to have copied Wikipedia rather than the other way around.

Re:That doesn't seem like alot (0)

Anonymous Coward | more than 7 years ago | (#16725161)

Not only that but for some reason this guy thinks material in the public domain needs to be properly attributed. It does not. And often it can not.

I hope you're not contributing... (1)

NineNine (235196) | more than 7 years ago | (#16725347)

...especially to any math articles. 142 is 1.183...% of 12000. Not "less than 0.1%"

142 out of ~12000 articles (0)

Anonymous Coward | more than 7 years ago | (#16726043)

It's just slightly over 1%.

Any statisticians able to say whether that's a valid sample?
How was the list of 12000 constructed? Completely random? Most popular articles? Most likely to contain plagiarism in this nutter's opinion?
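
If the 12,000 had been a simple random sample (the story does not say they were, and other comments here doubt it), the uncertainty from sample size alone could be estimated with a standard interval for a proportion. The sketch below uses the Wilson score interval; the function name is made up for illustration.

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = hits / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

low, high = wilson_interval(142, 12_000)
print(f"{low:.2%} to {high:.2%}")
```

With these inputs the interval is narrow, roughly 1.0% to 1.4%, so sample size is not the weak point. How the 12,000 were chosen, and whether each flagged article really is plagiarism, matter far more, and no interval can account for either.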

How is this news? (1)

JanusFury (452699) | more than 7 years ago | (#16725041)

Really, how is this news? I don't get it.

Re:How is this news? (1)

Klaidas (981300) | more than 7 years ago | (#16725061)

You must be new here...

Re:How is this news? (1)

LeRaldo (983244) | more than 7 years ago | (#16725235)

I bet that's news to him.

Re:How is this news? (1)

pasamio (737659) | more than 7 years ago | (#16725917)

especially since his user id is half of the child, but you'd expect your parents to be older than you in some cases

Impressive (3, Interesting)

Solder Fumes (797270) | more than 7 years ago | (#16725045)

Wow. Only 142 articles in which average Joe Wiki forgot the proper way to attribute a source. I'm actually amazed there were so few occurrences. This article has the effect of heightening my opinion of Wikipedia's quality.

Re:Impressive (1)

porkThreeWays (895269) | more than 7 years ago | (#16725227)

In high school while doing term papers at least 1/3 of most of my papers weren't written by me. They were quotes from other sources. What's the difference? It's only plagiarized if you don't cite the source properly. Legally you are allowed to take small quotes and use them in a publication as long as you cite sources. I'm guessing many of those offenders could go legit just by citing the source alone without removing the quote.

Are you going to the prom? (1)

goombah99 (560566) | more than 7 years ago | (#16725327)

I ask because apparently you did not actually graduate high school yet if you can't understand what the difference is between cited and uncited text.

Re:Impressive (1)

Salmar (991564) | more than 7 years ago | (#16725593)

Please read more carefully. That 142 was the number of articles found in the stated sample of 12000 articles.

Re:Impressive (1)

multisync (218450) | more than 7 years ago | (#16725779)

This article has the effect of heightening my opinion of Wikipedia's quality.


I agree. Plagiarism is a reality that all publications have to deal with, and Wikipedia's response seems to be a reasonable one. They have removed the questionable content pending review.

A while back one of my local papers became aware that a columnist was copying material from another paper. They fired her and printed an apology to their readers and the publication she stole from, and moved on.

This Daniel Brandt apparently has an axe to grind against Wikipedia because he was unhappy with an article that was written about him. Among other things, he feels people who write and edit articles should make their identities public, basically so he can sue them if he doesn't like what they write. Reminds me of something else [slashdot.org] I read here recently.

Not shocking, but not a big deal (2, Interesting)

Chairboy (88841) | more than 7 years ago | (#16725049)

What's missing from the summary is that almost immediately upon getting the list, the articles in question were dealt with and the offenders were blocked or warned.

Wikipedia is written by a large community, and people make mistakes. I have read about other reference tomes that have been caught plagiarizing (for example, some encyclopedias or atlases will put in a fake piece of data or a fake street so that they can easily determine if they're being copied from), and the turnaround time for fixing it can be years depending on the publishing cycle.

This isn't a condemnation of Wikipedia, despite Mr. Brandt's best efforts; it's a confirmation of why WP works.

Only 142?! (1)

thelamecamel (561865) | more than 7 years ago | (#16725053)

142 articles is bugger-all, not all of these cases were actually plagiarism, and the biggest cited example in TFA is "An entire paragraph in Alonzo Clark's entry". Surely there has been much greater, and more significant plagiarism in Wikipedia than this? Why is this number so low?!

Pfizzle. (1)

Etherwalk (681268) | more than 7 years ago | (#16725059)

142 out of 12,000, some of which aren't really a problem, and that's numbers generated by a critic?

Yes, it's a problem, but that's actually not a bad score at all. You probably get more plagiarism than that on college papers at good schools. How many of these articles cite what they "plagiarize," even if they don't put it in quotes? Also, to make it legal plagiarizing, all you have to do is re-write each paragraph in your own words.

I see 1.18% of articles as potentially having text lifted from somewhere else as a serious issue for the maintainers of Wikipedia, sure. But I don't think it has a major negative impact on its reliability, or on the quantity or quality of information contained within it. And reliable information is what I care about when I go to wikipedia. If it worked only by having mass excerpts of other sites, I'd call it "GOOGLE," and I'd still use it.

Re:Pfizzle. (1)

Daniel Rutter (126873) | more than 7 years ago | (#16725441)

142 out of 12,000, some of which aren't really a problem, and that's numbers generated by a critic?

And a very... dedicated critic, too [crank.net].

I must admit there's a certain recursive appeal to the idea of someone being notable enough for a Wikipedia entry purely because of his vehement attempts to avoid being mentioned on Wikipedia.

As usual, the talk page [wikipedia.org] has lots of entertaining dirt.

(Uncyclopedia has the real low-down [uncyclopedia.org], of course.)

Re:Pfizzle. (0)

Anonymous Coward | more than 7 years ago | (#16725449)

"Also, to make it legal plagiarizing, all you have to do is re-write each paragraph in your own words."
You'd better do more than that. You need to cite your sources for IDEAS too. If you're in my class and I find your paper has not cited ideas, you'll face the same consequences as someone who simply copied the text.

The only time you don't have to cite ideas is when they are common knowledge and easily deduced. Even then, you ought to do your homework and find out who first thought of the theory / idea that you are copying.

Re:Pfizzle. (1)

Etherwalk (681268) | more than 7 years ago | (#16725579)

Legal != ethical.

A school's honor code may be very different from a nation's copyright laws. (As they should be.) Ideally, if you come up with an idea in conversation with a few friends around a coffee table, and they contribute meaningfully to the genesis of the idea, you'll cite, thank them, or credit them in the finished product. But from a copyright status, while you can copyright the form of an idea, you can't usually copyright the idea itself--which is why you can write a new horror novel, or a new formulaic fantasy or soap opera. It's also why you can write any work about history. The copyright of the people who wrote the books you used to research (and even if you had primary sources, you almost certainly got information from copyrighted materials as well,) doesn't apply to your work, even if you're conveying the same information.

Re:Pfizzle. (1)

pasamio (737659) | more than 7 years ago | (#16725977)

See, this is the thing that gets me about academics. Unless you have the relevant sized pole shoved in your preference of orifice and can point to it accordingly you cannot have your own opinion or new idea. It has to be someone else's. It shits me off because I have so many strange ideas that I'm not going to bother looking in case some retard had them before. The concept that you have to be some brilliant person to have an idea just annoys me. I remember back to studies on ancient history and the development of farming. Around the globe around the same time different independent civilizations developed the concept in varying degrees (some also developed irrigation earlier as well due to needs or different methods of irrigation to resolve problems). Do they all need to reference God for showing them how to tend the ground?

1% plagarism! (0, Flamebait)

goombah99 (560566) | more than 7 years ago | (#16725073)

Any Journal article comprised of 1% plagiarism would be subject to lawsuits, apologies and the journal would face ostracism. This is intellectual theft somehow made possible by the anonymity of the Wiki. We do tolerate this in less professional venues. For example, amateur reader comments are not subject to this kind of scrutiny. Comment sites like slashdot are protected from that sort of thing. But a formal identifiable entity that generates citable articles in itself, which has pervasive plagiarism at the 1% level needs to be shut down or its citations fixed. This is a terrible day for the otherwise marvelous wikipedia concept. Deep thought is needed to figure out how to create some process of assured attribution. It's a shame. Even with the plagiarism Wikipedia is still informative. It's just that we can't become permissive about plagiarism even if it is for a good cause.

Re:1% plagarism! (1)

Solder Fumes (797270) | more than 7 years ago | (#16725123)

Plagiarizing on Wikipedia has to be one of the more victimless "crimes" I can think of, especially since entries are essentially anonymous and no one else is really getting quantifiable credit for using someone else's text in a wiki article.

Re:1% plagarism! (0)

Anonymous Coward | more than 7 years ago | (#16725163)

Solder fumes Writes

Plagiarizing on Wikipedia has to be one of the more victimless "crimes" I can think of, especially since entries are essentially anonymous and no one else is really getting quantifiable credit for using someone else's text in a wiki article.
Plagiarizing on Wikipedia has to be one of the more victimless "crimes" I can think of, especially since entries are essentially anonymous and no one else is really getting quantifiable credit for using someone else's text in a wiki article.

Re:1% plagarism! (1)

Solder Fumes (797270) | more than 7 years ago | (#16725263)

Imitation is the sincerest form of flattery...are you coming on to me?

Mod Up PARENT (0)

Anonymous Coward | more than 7 years ago | (#16725257)

Not Flamebait. how about an intelligent balanced comment worth discussing.

Re:1% plagarism! (1)

cddale (1023093) | more than 7 years ago | (#16725405)

Not true. It is estimated that at least 13% of articles in first-tier journals (NEJM, JAMA, etc.) listed on PubMed contain "ghostauthored" papers--written by drug companies for promotional purposes and where the named authors had little or nothing to do with the study, but were instead paid to front as the authors in order to remove the appearance of bias that would result from drug company authorship, add credibility on the basis of the phony author's reputation, and to promote off-label drug use (that is, for indications beyond what the FDA has approved) which is otherwise illegal. Some of these papers are actually published multiple times. There is a famous example where essentially the same paper was published three times by three different and non-overlapping groups of authors. In one version the sole author's name was even misspelled. The matter was brought to the University of Washington, where one of the authors was on the faculty (and, in fact, the former dean of the medical school), and the University of Washington held that it did not meet the definition of plagiarism (arguing that consent from the original source, which was granted here by the ghostauthor, was a requirement for plagiarism) and did not force a retraction, a printed correction, or even discipline the so-called author of this paper.

Re:1% plagarism! (1)

Zeinfeld (263942) | more than 7 years ago | (#16725411)

Any Journal article comprised of 1% plagiarism would be subject to law suits, apologies and the journal would face ostracism.

There is a big difference between plagiarised articles and articles with plagiarised passages. Pretty much every medium has a significant plagiarism rate, including scholarly journals.

The methodology in this case is more than a little suspect. At least 50% of Wikipedia is utter crap. There is fancruft, stubs, POV peddling forks. Anyone who is involved with Wikipedia will admit as much. The fact is that it does not matter if the article on the garage band 'Frog the Bustards' is plagiarised or not, only twelve people will read it before it gets deleted, albeit that's five more than have heard the band. The similarity to the official biography is because both were written by the lead singer's girlfriend.

The Britannica comparisons are plain silly. There are 1.5 million articles in Wikipedia of which something like 200,000 could be considered competition to Britannica. OK the Harry Potter pages are interesting and useful but that's not what Britannica claims to provide. That still leaves Britannica in the dust with a mere 100,000 articles.

Fact is that Britannica is not much use on most of the things I need an online source for and equally useful for the things I would use Britannica for. No encyclopedia is 100% trustworthy, the information is inevitably out of date in Britannica. There is no entry at all in Britannica for what I use it most often - tracking the latest computing neologisms.

The most valuable aspect of Wikipedia is precisely the fact that its pages come with 'caveat lector' written on every page. If you read Wikipedia without being aware of possible POV peddling you are an idiot, if you read Britannica without being aware of possible POV peddling you are also an idiot, if you watch Fox News without being aware that it is POV peddling 24 hours a day you are an utter fool.

Re:1% plagarism! (0)

Anonymous Coward | more than 7 years ago | (#16725537)

But if you view CNN with a jaded eye, you will enrage the I hate Bush crowd.

Re:1% plagarism! (0)

Anonymous Coward | more than 7 years ago | (#16725643)

Oh of course you can view CNN with a jaded eye. Just so long as your jaded eye realizes that CNN is nowhere near liberal enough to be considered unbiased like the pure sources of the one Truth: Indymedia, the Village Voice, the Daily Kos and Air America Radio.

Great (0)

Anonymous Coward | more than 7 years ago | (#16725081)

As an occasional minor contributor and frequent user of Wikipedia, I have to say well done and thanks to Mr Brandt. It's a shame (but entirely predictable) that some lamebrains might feel the need to gain (imaginary, virtualised) kudos for work that isn't their own. Interesting, really - I get nothing for my little contributions apart from feeling a little smug (in a good way) every now and then. The status is completely disconnected from my meatspace existence (except in my head, I suppose, without wanting to come over all Dennett ;)

unthinkable. (0)

Anonymous Coward | more than 7 years ago | (#16725113)

plagiarism on the internet?

Brandt is a Republican (0, Troll)

packetmon (977047) | more than 7 years ago | (#16725127)

"Brandt, who has long sparred with Wikipedia over an unflattering biography of himself, called on Wikipedia to conduct a thorough review of all its articles...." Sounds like he's a politician: "So what if I'm a jackass, I molest underage boys, send troops to die... Wiki is plagiarizing."

Re:Brandt is a Republican (1)

MostAwesomeDude (980382) | more than 7 years ago | (#16726183)

I'll bite, mostly because people might actually believe what you're saying.

Daniel Brandt doesn't like Wikipedia. His article there was started 'against his wishes,' and although he managed to get it deleted once with a few choice threats, it was quite rapidly created again. Ironically, the community now agrees that his anti-Wikipedia rantings have made him notable enough to be included in the encyclopedia.

Mr. Brandt is certainly not a nice person. While your words "politician" and "Republican" are completely unfounded, it is true that Mr. Brandt maintains a web page chock-full of personal data, including the names and addresses of any Wikipedians who he feels have been mean to him in the past.

The interesting part of all this is that Brandt does not have the authority to order Wikipedia to remove content. That kind of copyright enforcement can only be carried out by the copyright holder. However, he is well aware that Wikipedia's "no copyright violations" policy requires users to immediately quash plagiarized content.

The proof of the pudding (1)

GerardM (535367) | more than 7 years ago | (#16725151)

The proof of the pudding is in the eating: Mr Brandt comes up with a computer-generated list of potentially problematic articles. These are scrutinized and, where needed, problematic content is removed. The wiki methodology works, thanks to Mr Brandt.

Conclusion: the best way of improving Wikipedia is by showing where it has a problem. Mr Brandt disproved his own opinion. Live and learn. :)

Thanks,
        GerardM

Re:The proof of the pudding (0)

Anonymous Coward | more than 7 years ago | (#16725921)

The proof of the pudding is in the eating: Mr Brandt comes up with a computer-generated list of potentially problematic articles. These are scrutinized and, where needed, problematic content is removed. The wiki methodology works, thanks to Mr Brandt.

Conclusion: the best way of improving Wikipedia is by showing where it has a problem. Mr Brandt disproved his own opinion. Live and learn. :)


I reached a somewhat different conclusion: Find an obsessive-compulsive paranoid with a tenuous grip on reality, write an unflattering article in Wikipedia on him, and let him use his free time looking for flaws in the submissions.

Link to Brandt's Site on Topic (0)

Anonymous Coward | more than 7 years ago | (#16725179)

Plagiarism by Wikipedia editors [wikipedia-watch.org]

Re:Link to Brandt's Site on Topic (1)

Fnkmaster (89084) | more than 7 years ago | (#16725827)

Sorry, but Brandt is a fucking nutjob. Just look around on his sites. That is not a stable, coherent person.

Re:Link to Brandt's Site on Topic (1)

remembertomorrow (959064) | more than 7 years ago | (#16725953)

Wow, you're right.

This guy is almost on the same level as Jack Thompson in terms of stupidity/ignorance.

Re:Link to Brandt's Site on Topic (1)

mabu (178417) | more than 7 years ago | (#16725833)

The guy's got a 501(c)(3) corporation dedicated to bashing Wiki. My guess is it's funded by media and other encyclopedia makers. Follow the money, and what you will probably find out about these people is much more disgusting than any transgression on the part of Wikipedia.

Daniel Brandt, valuable Wikipedia contributor (4, Insightful)

alienmole (15522) | more than 7 years ago | (#16725195)

Brandt is doing a great service to Wikipedia -- checking for and reporting plagiarism. That takes dedication and hard work. It's ironic that he feels the need to present it as criticism of Wikipedia's model, when in fact he's demonstrating the power of contributions from many people with different motivations. Even if the motivation is anti-Wikipedia, Wikipedia just absorbs the input and grows stronger.

"If you strike me down, I shall become more powerful than you could possibly imagine..." -- Obi Wiki-nobi

Re:Daniel Brandt, valuable Wikipedia contributor (1)

sjwest (948274) | more than 7 years ago | (#16725811)


If memory serves me, http://en.wikipedia.org/wiki/Charles_Van_Doren [wikipedia.org] didn't cheat either (game show 21), and he then worked on the Britannica.

Funny thing 'cheating'.

Re:Daniel Brandt, valuable Wikipedia contributor (0)

Anonymous Coward | more than 7 years ago | (#16726099)

I agree. In fact, Brandt is doing a great service to Wikipedia -- checking for and reporting plagiarism. That takes dedication and hard work. It's ironic that he feels the need to present it as criticism of Wikipedia's model, when in fact he's demonstrating the power of contributions from many people with different motivations. Even if the motivation is anti-Wikipedia, Wikipedia just absorbs the input and grows stronger.

I don't know (1)

khallow (566160) | more than 7 years ago | (#16726167)

Is it such a good idea, checking for and reporting plagiarism? While that takes dedication and hard work, it's notable that he feels the need to present it as criticism of Wikipedia's model, because in fact he's demonstrating the power of plagiarism from many people with different motivations. Even if the motivation is anti-Wikipedia, Wikipedia just absorbs the input and grows stronger. That doesn't seem a good thing.

From an ex-wikipedia administrator (1)

BMIComp (87596) | more than 7 years ago | (#16725211)

I used to be a wikipedia administrator, before resigning due to time constraints. However, we would catch a lot of the copyright issues. I mean, when you're reading an article, and part of its plagerized, it's usually really obvious. The plagarized part usually doesn't fit into the rest of the article.. and you can just tell that the average editor didn't write that copy. (Just as I'm sure a teacher can tell one of his/her students didn't write a plagerized essay) Once you found the possibly infringing content, you could google parts of the suspect text, and see if it appears anywhere else. If it does, you'd either report the problem or remove it yourself.

I used to run into these all the time... but the thing is... a lot of them are caught and removed. Wikipedia has a system to deal with such infrigements, and the users that post them. (See Wikipedia's policy [wikipedia.org] and their copyright problems reporting page [wikipedia.org] ) The truth is that you're going to find copyright problems wherever there is user-submitted content (look at YouTub!, for example).
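The "google parts of the suspect text" step described above can be roughed out in a few lines. This is only a sketch of the idea, not how Brandt's program or any Wikipedia tool actually works: the function names, the six-word minimum and the naive sentence splitter are all assumptions made for illustration, and no search engine is actually called; the output is just a list of quoted strings you could paste into one.

```python
import random
import re


def split_sentences(text):
    """Naively split text into sentences; keep only reasonably long ones.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    Short fragments (fewer than 6 words) make poor search phrases.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if len(p.split()) >= 6]


def exact_phrase_queries(text, k=3, seed=0):
    """Pick up to k sentences and wrap each in quotes for exact-phrase search."""
    sentences = split_sentences(text)
    rng = random.Random(seed)  # fixed seed so the choice is repeatable
    chosen = rng.sample(sentences, min(k, len(sentences)))
    # Strip embedded double quotes so the query stays a single quoted phrase.
    return ['"' + s.replace('"', "") + '"' for s in chosen]


if __name__ == "__main__":
    article = (
        "The quick brown fox jumps over the lazy dog today. Too short. "
        "Another sufficiently long sentence appears right here for testing."
    )
    for query in exact_phrase_queries(article, k=2):
        print(query)
```

A hit on one of these phrases only says the same wording exists elsewhere; a human still has to work out who copied whom, which is the judgment call the summary mentions.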

URL please! (0)

Anonymous Coward | more than 7 years ago | (#16725691)

I know tubgirl... but what is YouTub?

Re:From an ex-wikipedia administrator (0)

Anonymous Coward | more than 7 years ago | (#16726067)

when you're reading an article, and part of its plagerized... The plagarized part usually doesn't fit...

Try typing your post in Word with the spell check on... saves a good deal of embarrassment...

142 out of 12,000? (1)

MMC Monster (602931) | more than 7 years ago | (#16725269)

142 articles out of 12,000 is certainly a problem, but actually not much of one. I'm sure if he made his script public (I have no idea if he did so. In the /. tradition, I did not RTFArticle) and Wikipedia were to use it, it would be of benefit. Not to automatically tag articles as plagiarism, but at least tag them for further evaluation by an editor.

But, hey, 142/12000 is less than 2%. I would have thought the percentage would have been at least 5%.
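For what it's worth, the percentage is easy to check. A minimal sketch using only the two figures quoted in the summary (142 flagged articles out of about 12,000 checked); the variable names are just illustrative:

```python
flagged = 142      # articles Brandt ended up listing
sampled = 12_000   # articles checked ("about 12,000" per the summary)

rate = flagged / sampled
print(f"{flagged}/{sampled} = {rate:.2%}")  # prints "142/12000 = 1.18%"
```

So the raw figure is a bit over 1%, comfortably under the 2% mentioned above.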

You inSensitive clod. (-1, Troll)

Anonymous Coward | more than 7 years ago | (#16725311)

US Gov copyright? (1, Insightful)

julesh (229690) | more than 7 years ago | (#16725423)

Articles with offending passages have been stripped of most text. An entire paragraph in Alonzo Clark's entry, for instance, was deleted, leaving the article with the bare-bones: "Alonzo M. Clark (August 13, 1868-October 12, 1952) was an American politician who was Governor of Wyoming from 1931 to 1933."

The original article, Brandt said, was copied from a biography on the Wyoming state government site.


Err... I thought works of the US Government were generally free from copyright...?

Re:US Gov copyright? (1)

Microlith (54737) | more than 7 years ago | (#16725623)

Citations are still required, even for the work of Government officials.

Re:US Gov copyright? (1)

osu-neko (2604) | more than 7 years ago | (#16725749)

Citations are still required, even for the work of Government officials.

Especially so. It's always important to know the source of your information to evaluate potential bias, and particularly when the source has a long track-record of fudging the truth for self-serving purposes.

Re:US Gov copyright? (1)

AxelBoldt (1490) | more than 7 years ago | (#16725831)

Citations are still required, even for the work of Government officials.
By (often ignored) Wikipedia policy, which requires sourcing of all statements, but not by law.

Re:US Gov copyright? (1)

athmanb (100367) | more than 7 years ago | (#16725805)

Only those of the federal government. Those of most states aren't.

Re:US Gov copyright? (3, Insightful)

DragonWriter (970822) | more than 7 years ago | (#16725857)

Err... I thought works of the US Government were generally free from copyright...?


(1) The Wyoming state government is not the US government: state government works are not generally free from copyright.

(2) Plagiarism is separate from copyright violation, anyway. Using material that is not subject to copyright or is in the public domain that is from one unique identifiable source without crediting the source is plagiarism, as is using copyright material in a way that does not violate copyright without attribution (say, fair use.) Plagiarism isn't a violation of the law, but a violation of commonly accepted standards of integrity when it comes to not claiming others' work as your own.

Biographical articles. (4, Funny)

Anonymous Coward | more than 7 years ago | (#16725463)

It's very lazy of the Wikipedia authors to enter the same biographical information as other sites.
They should write new and interesting histories for all these people rather than using the same old worn out ideas that are on so many places on the net.
All it takes is a little imagination.
A new birth place, better achievements (why could Hitler not have discovered the cure for cancer and been the first man on the moon? It's better than the depressing story on Wiki at the moment.) and some creative editing would solve this problem once and for all.

Some Wiki articles are already better and contain things about people that have never happened, but sadly these often get put back to the same old boring stories almost as soon as the changes are made.

ok methodology, bad analysis (1)

fermion (181285) | more than 7 years ago | (#16725465)

In this kind of study, basing the conclusion on the presence of few hits would characterize the study as faith based science.

First, the sample size was 12,000. Where did that number come from? Were the samples picked randomly? Assuming so, is 12,000 a statistically effective sample size? And if the samples are random, and the size is sufficient, are those 142 articles statistically significant, that is, is the number of matches outside the margin of error? In other words, do the sample size, selection, and methodology merit a margin of error around 1%?

And then we get to the fact that sometimes Wikipedia text is copied to other sites. This in itself leads to the conclusion that Wikipedia has some credibility, even if unfounded. I found it interesting that we are not told how many articles off Wikipedia were plagiarized. I also wonder what 'Wikipedia appeared to be the one plagiarized' means, and what systematic errors were introduced by that subjective judgement. Perhaps 1%?

There is no question that plagiarism is a big issue, and we all must watch for it. I am on the side that plagiarism is no more an issue than in the past, but with better communication and distribution, we catch it more. At some level, because it is so easy to plagiarize now, we perhaps see more egregious cases of it.

What gets me is that an analysis of such low analytical value is news. I am once again amazed at how little people seem to know or care about proper logic. In the end all we know is that some study with questionable methodology produced 142 hits. Not a huge revelation, even if we stipulate the study is of even minimal value.
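One rough way to put numbers on the sample-size question raised above. This assumes, and it is a big assumption since the summary doesn't say how the articles were picked, that the 12,000 articles were a simple random sample. Under that assumption a normal-approximation (Wald) 95% interval for the flagged proportion can be sketched like this; the function name is made up for illustration:

```python
import math


def wald_interval(hits, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    Assumes a simple random sample of size n; z=1.96 gives roughly 95%.
    """
    p = hits / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se


low, high = wald_interval(142, 12_000)
print(f"{low:.2%} to {high:.2%}")
```

That works out to roughly 1.0% to 1.4%, so sampling noise alone is on the order of a fifth of a percentage point. It says nothing about the systematic errors mentioned above (false matches, missed copies, or wrong guesses about who copied whom), which could easily be larger.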

lets see (0)

Anonymous Coward | more than 7 years ago | (#16725475)

If you place 100 people in a room and tell them all to write several paragraphs about a specific subject, I wonder how many of them would have the very same sentence within their paragraphs. There are only so many ways to say something.

*CUM (-1, Redundant)

Anonymous Coward | more than 7 years ago | (#16725523)

FreeBSD went out

Respect level (1)

pboyd2004 (860767) | more than 7 years ago | (#16725543)

"They are trying to be mainstream respectable."
And plagiarism is going to change the mainstream's respect level for Wikipedia?

That is a very unreal scenario (1)

cucucu (953756) | more than 7 years ago | (#16725609)

Daniel Brandt wrote a script that pinpoints all the plagiarized pages. That is a very unlikely scenario in real life. He should have selected a sample of random pages and checked which are plagiarized. By the same logic, he could:
  • Claim surfing the web is risky because his firewall only gives access to phishing sites
  • Say sex is dangerous, because he frequents a nightclub where all members have STDs
  • Assert numbers don't have square roots because his population is made of negative numbers.

    Re:That is a very unreal scenario (1)

    Yetihehe (971185) | more than 7 years ago | (#16725757)

    Erm, he didn't select random pages, because he probably checked ALL the pages. And then selected those plagiarized. So if he had selected only a random subset, he would have had a smaller sample.

    Either this, or I'm totally uninformed because I didn't RTFA (I'm just lazy).

    Confused? (1)

    superstick58 (809423) | more than 7 years ago | (#16725627)

    I'm confused by the concept of plagiarism on Wikipedia. For example, the article describes a biography copied from a government website. Isn't the point of Wikipedia to catalog and assemble information? How is copying an openly published biography from a government website considered plagiarism? Wikipedia is not being sold. No one is taking credit for the articles. In most cases, the original info is cited anyway. Anyway, please let me know what I'm missing here (which is probably a lot).

    Re:Confused? (1)

    AxelBoldt (1490) | more than 7 years ago | (#16726029)

    Plagiarism is not a legal term, it's a term used in journalism and academia to describe taking somebody else's words or ideas and presenting them as your own, without attribution. In these realms, it is considered unethical.

    If you copy somebody's words, and these words are not in the public domain (as they would be if, for instance, the author is long dead or works for the U.S. government), and you can't defend the use as "fair use", then it's a civil offense and they can sue you (in some countries and severe cases it's even a criminal offense). Whether you attribute the copied material or not is irrelevant for the legal status of copyright infringement.

    Wikipedia is extremely vigilant in removing copyright infringements. Plagiarisms of public domain works are not considered that big a deal; Wikipedia policy generally requires sources for all statements but that is rarely enforced.

    Even Virus authors contribute (1)

    tmk (712144) | more than 7 years ago | (#16725629)

    Authors of malware are trying to exploit the good reputation of Wikipedia to infect PCs with their malicious software. In a mass e-mail, recipients were told to download a "security update" for Windows from a Wikipedia site.

    The attackers had used a Wikipedia feature that archives all previous versions of articles when changes have been made. The malicious page thus continued to exist in the archive, and the attackers were able to point to it in mass emails.

    See here [heise.de] , here [techworld.com] and here [theregister.co.uk] .

    Okay, Brandt is learning. (1)

    WWWWolf (2428) | more than 7 years ago | (#16725631)

    This is how you fix the problems with Wikipedia: Point them out in a way that makes the problems easy to fix. Okay, it's probably still harder to get criticism of user conduct and policies acted upon, but the way Wikipedia works, the content is still easy to fix. Especially in the case of plagiarism.

    I really wish people would conduct accuracy and plagiarism studies a bit more often - especially when it's easy to fix, like this.

    And by the way, Wikipedia recently got a bot that finds suspected plagiarism [wikipedia.org] , which is pretty cool.

    How works the Wherebot? (1)

    tmk (712144) | more than 7 years ago | (#16725713)

    Is there a description how this bot identifies plagiarism? Does he search for random edits?

    Re:How works the Wherebot? (1)

    WWWWolf (2428) | more than 7 years ago | (#16726013)

    Is there a description how this bot identifies plagiarism? Does he search for random edits?

    I don't know if there's a really detailed description anywhere, and I'm not coffeed enough to find anything more, but the bot's user page [wikipedia.org] says it searches for phrases found in new articles through Yahoo search API. So it may not be good for finding plagiarism that's been inserted to articles that are more than a few days old, I suppose, but it does help to find cases where people just copy-paste web stuff to new articles. Basically, this helps the new-page patrollers a lot.

    Perhaps something to vet every edit would be cool =)

    Turns out they weren't plagiarized... (1)

    cliveholloway (132299) | more than 7 years ago | (#16725689)

    They were just authored by Roland Piquepaille. His articles are always all his own work, so it must be a mistake in the program.

    Statistically insignificant (1)

    shareme (897587) | more than 7 years ago | (#16725695)

    Someone needs to brush up on stats. Get back to us when the results have been statistically analyzed to determine whether they are actually anything to worry about.

    Plagiarism or Copyright? (1)

    Absolut187 (816431) | more than 7 years ago | (#16725699)

    Is plagiarism an issue for Wikipedia?
    Wikipedia is not a PhD candidate. Wikipedia's job is to provide accurate information.
    Of course, sources should be provided as well.
    But legally, the real issue here is Copyright, isn't it?
    "Plagiarism" and "Copyright infringement" are not synonyms.

    There is no copyright in facts.
    Therefore, nonfiction works are open to have the facts used in Wikipedia.
    Where a verbatim transcription would not be fair use, someone needs to paraphrase.

    Re:Plagiarism or Copyright? (1)

    DragonWriter (970822) | more than 7 years ago | (#16725891)

    Is plagiarism an issue for Wikipedia?
    Yes.
    But legally, the real issue here is Copyright, isn't it?
    Not all issues are legal issues.
    There is no copyright in facts. Therefore, nonfiction works are open to have the facts used in Wikipedia. Where a verbatim transcription would not be fair use, someone needs to paraphrase.
    The issue here is verbatim use, anyway. An automated script is going to have more trouble finding use of "facts" from another source that aren't verbatim copies of the presentation.

    Bitch bitch bleh...... (0)

    Anonymous Coward | more than 7 years ago | (#16725707)

    While plagiarism is bad, Wikipedia is free, and it's good that this information is mirrored at more than one web site. Anyway, if they start to present false information/blatant lies as fact, then that is when the bitch sessions need to really start.

    Wikipedia bashing du jour (1)

    mabu (178417) | more than 7 years ago | (#16725785)

    It seems to be th3 c00l3ss to bash Wiki lately, but the bottom line is there is no encyclopedic reference that comes close. The media and other pseudo-pundits seem to regard any influential source of information that doesn't have obvious corporate influence (read: money-based control) as a major threat, and they do whatever they can to discredit Wikipedia. Aside from a tiny subset of controversial articles that routinely get vandalized, and another tiny subset of plagiarism, this issue is likely to be blown way out of proportion by those who have a vested interest in destroying any information resource they cannot control.

    Citations? (1)

    Ash-Fox (726320) | more than 7 years ago | (#16725823)

    I tend to check the citations on Wikipedia. If there is no citation and I can't find a somewhat reliable source on Google related to the information I'm looking at -- I know I can't trust that information.

    These people who ramble on that Wikipedia is inaccurate almost appear to me like they never sat through a history class in high school, where you have to verify your sources.

    I've also never heard of citing encyclopedias in research projects, ever. Nor have I seen good-grade coursework cite encyclopedia entries (it may cite sources that some encyclopedias also cite).

    I find it funny... (1)

    Net_fiend (811742) | more than 7 years ago | (#16725949)

    ...that people are worried about wiki plagiarizing *other* sites. I'd love to see how many students plagiarize off of the wiki. Let alone double check all the supposed facts they get off of wikipedia. Personally I would *never* use wikipedia to help write a paper. Granted it is checked by people, but the fact that anyone and everyone can post to it leaves me with concerns. There have already been examples of abuse from Politicians and other parties with their own agendas that abuse the system.

    With that said, I had an English Lit teacher in my college class print out stuff from wiki and use it as handouts. I was surprised any teacher would use wiki as a source for educational information. Sure, it's a great idea, but there is just too much information that can't be processed for authenticity at every turn.

    Here is the link to my report (1)

    Everyman (197621) | more than 7 years ago | (#16725971)

    Why is this news? Maybe because the Associated Press says it's news, and it's in hundreds of newspapers?

    Why should Slashdotters care? Because while AP doesn't use links, Slashdot should have the courtesy of linking to the original sources that AP used to generate the report. (Plus AP also checked with Jimmy Wales for a reply, which is expected from professional reporters.)

    The report is at http://www.wikipedia-watch.org/psamples.html [wikipedia-watch.org]

    Wikipedia's own newsletter reports on it here:
    http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2006-10-30/Plagiarism_cleanup [wikipedia.org]

    The efforts of Wikipedia administrators to clean up the mess are chronicled here: http://en.wikipedia.org/wiki/User:W.marsh/list [wikipedia.org]

    Of course, Slashdotters may continue shooting from the hip if they choose. It's what they do best.

    Brandt vs. Wikipedia (1)

    mako1138 (837520) | more than 7 years ago | (#16726009)

    Brandt has a long-standing (well, year-old) beef with Wikipedia. You can read about it, ironically enough, in the Wikipedia article about him [wikipedia.org] .

    He got into a dispute because he didn't like having his biography on WP (though it was constructed from publicly available news sources). He was generally combative and belligerent, and so was blocked and banned various times; check out the Talk archives for details. Afterwards he started a webpage where he attempted to list the real-world identities of the editors involved in the dispute.

    Brandt is also the guy responsible for outing the anonymous editor in the Seigenthaler controversy.

    Not an unflattering biography (1)

    iabervon (1971) | more than 7 years ago | (#16726093)

    Daniel Brandt is against Wikipedia's portrayal of him not because of it being unflattering (it is, in my opinion, if anything oddly sympathetic to his position, despite his position being that it shouldn't exist at all), but because of his privacy concerns. He's a privacy activist with a particular focus on the actions of information organizing sites, and so he's not unexpectedly against the existence of unauthorized widely-available detailed biographies. He's gone so far as to complain about CIA and NSA websites using cookies, so it's not surprising that he wouldn't be happy about a vast conspiracy to produce reports on unwilling individuals, regardless of the merits of the reports.

    Wikipedia is now digg.com, without the credit! (1)

    dotancohen (1015143) | more than 7 years ago | (#16726127)

    This isn't surprising, seeing how _anybody_ can edit Wikipedia. The inability to verify has always been an issue with Wikipedia. Furthermore, I'm sure that most of these 'incidents' could be rectified by simply changing a few words and then referencing the source webpage. Then, instead of it being plagiarism, it would be accountable reference work.

    Bah.

    http://what-is-what.com/what_is/digg.html [what-is-what.com]

    Brandt's paper and Wikipedia's response (1)

    AxelBoldt (1490) | more than 7 years ago | (#16726173)

    Brandt's original paper is here [wikipedia-watch.org] , explaining his methodology and giving the complete list of articles he found. Wikipedia's response is here [wikipedia.org] , where people go through the list one by one and also check the other contributions of users who have added copyrighted content. Wikipedia also has a bot [wikipedia.org] which aims to detect newly added copyright violations by searching Google.