Google Open Sources Its Data Interchange Format

kdawson posted more than 6 years ago | from the it's-fast-that's-why dept.

Google 332

A number of readers have noted Google's open sourcing of its internal data interchange format, called Protocol Buffers (here's the code and the doc). Google's elevator statement for Protocol Buffers is "a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more." It's the way data is formatted to move around inside of Google. Betanews spotlights some of Protocol Buffers' contrasts with XML and IDL, with which it is most comparable. Google's blogger claims, "And, yes, it is very fast — at least an order of magnitude faster than XML."

An order of magnitude over XML? (5, Funny)

Anonymous Coward | more than 6 years ago | (#24105175)

So is, well, just about anything.

metaspam (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#24105341)

Spam (electronic)
From Wikipedia, the free encyclopedia
This article is about electronic spam. For other uses, see Spam.
An email box folder of spam messages.

Spamming is the abuse of electronic messaging systems to indiscriminately send unsolicited bulk messages. While the most widely recognized form of spam is e-mail spam, the term is applied to similar abuses in other media: instant messaging spam, Usenet newsgroup spam, Web search engine spam, spam in blogs, wiki spam, mobile phone messaging spam, Internet forum spam and junk fax transmissions.

Spamming remains economically viable because advertisers have no operating costs beyond the management of their mailing lists, and it is difficult to hold senders accountable for their mass mailings. Because the barrier to entry is so low, spammers are numerous, and the volume of unsolicited mail has become very high. The costs, such as lost productivity and fraud, are borne by the public and by Internet service providers, which have been forced to add extra capacity to cope with the deluge. Spamming is widely reviled, and has been the subject of legislation in many jurisdictions.[citation needed]

Persons who create electronic spam are called spammers.[1]

        * 1 Spamming in different media
                    o 1.1 E-mail spam
                    o 1.2 Instant Messaging and Chat Room spam
                    o 1.3 Chat spam
                    o 1.4 Newsgroup spam and forum spam
                    o 1.5 Mobile phone spam
                    o 1.6 Online game messaging spam
                    o 1.7 Spam targeting search engines (spamdexing)
                    o 1.8 Blog, wiki, and guestbook spam
                    o 1.9 Spam targeting video sharing sites
        * 2 Noncommercial spam
        * 3 Geographical origins of spams
        * 4 History
                    o 4.1 Pre-Internet spam
                    o 4.2 Origin of the term "spam"
                    o 4.3 History of Internet "spam"
        * 5 Trademark issues
        * 6 Costs of spam
                    o 6.1 General costs of spam
        * 7 In crime
        * 8 Political issues
        * 9 Court cases
                    o 9.1 United States
                    o 9.2 United Kingdom
        * 10 References
        * 11 Newsgroups
        * 12 See also
                    o 12.1 History
        * 13 External links

Spamming in different media

E-mail spam

        Main article: E-mail spam

E-mail spam, also known as unsolicited bulk email (UBE) or unsolicited commercial email (UCE), is the practice of sending unwanted e-mail messages, frequently with commercial content, in large quantities to an indiscriminate set of recipients.

Spam in e-mail started to become a problem when the Internet was opened up to the general public in the mid-1990s. It grew exponentially over the following years, and today comprises some 80 to 85% of all the email in the world, by conservative estimate;[2] some sources go as high as 95%.

Pressure to make e-mail spam illegal has been successful in some jurisdictions, but less so in others. Spammers take advantage of this fact, and frequently outsource parts of their operations to countries where spamming will not get them into legal trouble.

Increasingly, e-mail spam today is sent via "zombie networks", networks of virus- or worm-infected personal computers in homes and offices around the globe; many modern worms install a backdoor which allows the spammer access to the computer. At the same time, it is becoming clear that malware authors, spammers, and phishers are learning from each other, and possibly forming various kinds of partnerships.

E-mail is an extremely cheap mass medium, and professional spammers have automated their processes to a high extent. Thus, spamming can be very profitable even at what would otherwise be considered extremely low response rates.

An industry of e-mail address harvesting is dedicated to collecting email addresses and selling compiled databases.[3] Millions of email addresses can be cheaply purchased.[4]

Instant Messaging and Chat Room spam

        Main article: Messaging spam

Instant Messaging spam, sometimes termed spim (a portmanteau of spam and IM, short for instant messenger), makes use of instant messaging systems, such as AOL Instant Messenger, Xfire, ICQ, Yahoo Messenger or Windows Live Messenger. Many IM systems offer a user directory, including demographic information that allows an advertiser to gather the information, sign on to the system, and send unsolicited messages. Sending instant messages to millions of users requires scriptable software and the recipients' IM usernames. Spammers have similarly targeted Internet Relay Chat channels, using IRC bots that join channels and bombard them with advertising.

Messenger service spam has lent itself to spammer use in a particularly circular scheme. In many cases, messenger spammers send messages to vulnerable machines consisting of text like "Annoyed by these messages? Visit this site." The link leads to a Web site where, for a fee, users are told how to disable the Windows messenger service. Though the messenger service is easily disabled for free, the scam works because it creates a perceived need and offers a solution. Often the only "annoying messages" the user receives through Messenger are ads to disable Messenger itself. The scheme often uses a false identity to obtain money or credit card numbers. People also spam, or get spammed, on online social networks such as Myspace and Bebo.

Chat spam

Chat spam can occur in any live chat environment, such as IRC, the in-game multiplayer chat supplied through online games and gaming systems, and any other form of chat that is publicly visible. It consists of repeating the same word or sentence many times to get attention or to interfere with normal operations. It is generally considered very rude and may lead to the user's swift exclusion from the chat service by its owners or moderators.

The application of the name "Spam" to unwanted communication originates in Chat-room spam. Specifically, it was developed in the chat-rooms of People-Link in the early 1980s as a technique for getting rid of unwelcome newcomers. When someone would enter a chat-room full of friends who were in mid-conversation, and when the newcomer tried to turn the conversation in an unwelcome direction, two veteran members of the room would begin typing in the Monty Python "Spam" routine at high speed. They would fill the screen with "Spam Spam Spam eggs Spam Spam and Spam" etc, and make all other communication impossible. The other members of the room would just wait quietly until the newcomer got disgusted and moved on to a different room.

Newsgroup spam and forum spam

        Main article: Newsgroup spam
        Main article: Forum spam

Mobile phone spam

        Main article: Mobile phone spam

Mobile phone spam is directed at the text messaging service of a mobile phone. This can be especially irritating to customers not only for the inconvenience but also because of the fee they may be charged per text message received in some markets. The term "SpaSMS" was coined at the adnews website Adland in 2000 to describe spam SMS.

Online game messaging spam

Many online games allow players to contact each other via player-to-player messaging, chatrooms, or public discussion areas. What qualifies as spam varies from game to game, but usually this term applies to all forms of message flooding, violating the terms of service contract for the website.

In this context, spam is sometimes perceived as a backronym for stupid, pointless, annoying message (sometimes the A is thought to stand for anonymous).[citation needed]

Spam targeting search engines (spamdexing)

        Main article: Spamdexing

Spamdexing (a portmanteau of spamming and indexing) refers to the practice on the World Wide Web of modifying HTML pages to increase the chances of them being placed high on search engine relevancy lists. These sites use "black hat search engine optimization techniques" to unfairly increase their rank in search engines. Many modern search engines modified their search algorithms to try to exclude web pages utilizing spamdexing tactics.

Blog, wiki, and guestbook spam

        Main article: Spam in blogs

Blog spam, or "blam" for short, is spamming on weblogs. In 2003, this type of spam took advantage of the open nature of comments in the blogging software Movable Type by repeatedly placing comments to various blog posts that provided nothing more than a link to the spammer's commercial web site.[5] Similar attacks are often performed against wikis and guestbooks, both of which accept user contributions.

Spam targeting video sharing sites

Video sharing sites, such as YouTube, are now being frequently targeted by spammers. The most common technique involves people (or spambots) posting links to sites, most likely pornographic or dealing with online dating, on the comments section of random videos or people's profiles.

Another frequently used technique is using bots to post messages on random users' profiles to a spam account's channel page, along with enticing text and images, usually of a suggestive nature. These pages may include their own or other users' videos, again often suggestive. The main purpose of these accounts is to draw people to their link in the home page section of their profile.

YouTube has blocked the posting of links, but people can still get their message across by replacing every period with the word "dot." For instance, typing out "example dot com" instead of "example.com" bypasses the filter. In addition, YouTube has implemented a CAPTCHA system that makes rapid posting of repeated comments much more difficult than before, a response to past abuse by mass-spammers who would flood people's profiles with thousands of repetitive comments.
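As a rough illustration of the bypass described above, the following Python sketch shows how a naive link filter misses the spelled-out form, and how normalizing the word "dot" before matching catches it. The pattern, the function names, and the short list of top-level domains are all hypothetical; nothing here reflects how YouTube's filter actually works.

```python
import re

# Hypothetical naive filter: flags text containing something shaped like a
# domain name. Only three top-level domains are listed, for brevity.
LINK_PATTERN = re.compile(
    r"\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|net|org)\b",
    re.IGNORECASE,
)


def contains_link(text):
    """Return True if the text contains a domain-like string."""
    return LINK_PATTERN.search(text) is not None


def normalize_dots(text):
    """Rewrite a standalone word 'dot' (surrounded by whitespace) as '.'."""
    return re.sub(r"\s+dot\s+", ".", text, flags=re.IGNORECASE)


def contains_link_normalized(text):
    """Check for links after undoing the 'dot' substitution."""
    return contains_link(normalize_dots(text))
```

A normalizer this crude also rewrites innocent uses of the word "dot", so it is a sketch of the idea rather than something a real filter could rely on.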

Another form of such spam is posting a message which claims to elicit an occurrence, such as an easter egg, the loss of a loved one, or being haunted by a ghost, unless a demand is met by copying and pasting the message a certain number of times within a time limit. A prime example is as follows: "Post this in 5 videos in an hour or you shall die." Such posts target the gullible, but those who are more familiar with them usually respond with derision or simply ignore them. Some sites include a feature that allows users to mark certain comments as spam or rate unwelcome comments with a low score, with the intent that spam posts will receive a negative rating.

Yet another kind is actual video spam: the uploaded video is given a name and description featuring a popular figure or event likely to draw attention, or a certain image is timed to appear as the video's thumbnail to mislead the viewer. The actual content of the video ends up being totally unrelated, sometimes offensive, or just features on-screen text of a link to the site being promoted.

Others may upload videos presented in an infomercial-like format selling their product which feature actors and paid testimonials, though the promoted product or service is of dubious quality and would likely not pass the scrutiny of a standards and practices department at a television station or cable network.

Noncommercial spam

E-mail and other forms of spamming have been used for purposes other than advertisements. Many early Usenet spams were religious or political. Serdar Argic, for instance, spammed Usenet with historical revisionist screeds. A number of evangelists have spammed Usenet and e-mail media with preaching messages. A growing number of criminals are also using spam to perpetrate various sorts of fraud,[6] and in some cases have used it to lure people to locations where they have been kidnapped, held for ransom, and even murdered.[7]

Geographical origins of spams

Experts from SophosLabs analysed spam messages which were caught by some companies' spam filters, these being a part of the Sophos global spam monitoring network. They found that during the third quarter of 2007 the USA was the leader in the number of spam messages around the world. According to Sophos experts 28.4% of global spam comes from the U.S. The second place in the list of spammer-countries is South Korea, bringing 5.2% of global spam.

The list of top 12 countries that spread spam around the globe is presented below:

      1. USA - 28.4%;
      2. South Korea - 5.2%;
      3. China (including Hong Kong) - 4.9%;
      4. Russia - 4.4%;
      5. Brazil - 3.7%;
      6. France - 3.6%;
      7. Germany - 3.4%;
      8. Turkey - 3.%;
      9. Poland - 2.7%;
    10. Great Britain - 2.4%;
    11. Romania - 2.3%;
    12. Mexico - 1.9%;

        * Other countries - 33.9%[8]

History

Pre-Internet spam
A possible 19th century mass telegraph

In the late 19th Century Western Union allowed telegraphic messages on its network to be sent to multiple destinations. The first recorded instance of mass unsolicited commercial telegram is from May 1864.[9] Up until the Great Depression wealthy North American residents would be deluged with nebulous investment offers. This problem never fully emerged in Europe to the degree that it did in the Americas, because telegraphy was regulated by national post offices in the European region.

Origin of the term "spam"

It is widely believed the term spam is derived from the 1970 SPAM sketch of the BBC television comedy series "Monty Python's Flying Circus".[10]

The sketch is set in a cafe where nearly every item on the menu includes SPAM luncheon meat. As the waiter recites the SPAM-filled menu, a chorus of Viking patrons drowns out all conversations with a song repeating "SPAM, SPAM, SPAM, SPAM... lovely SPAM, wonderful SPAM", hence "SPAMming" the dialogue. The excessive amount of SPAM mentioned in the sketch is a reference to British rationing during World War II.[citation needed] SPAM was one of the few meat products that avoided rationing, and hence was widely available.

In the 1980s the term was adopted to describe certain abusive users who frequented BBSs and MUDs, who would repeat "SPAM" a huge number of times to scroll other users' text off the screen.[11] In early chat room services like PeopleLink and the early days of AOL, they actually flooded the screen with quotes from the Monty Python Spam sketch. This was used as a tactic by insiders of a group that wanted to drive newcomers out of the room so the usual conversation could continue. It was also used to prevent members of rival groups from chatting; for instance, Star Wars fans often invaded Star Trek chat rooms, filling the space with blocks of text until the Star Trek fans left.[12] This act, previously called flooding or trashing, came to be known as spamming.[13] The term was soon applied to a large amount of text broadcast by many users.

It later came to be used on Usenet to mean excessive multiple posting: the repeated posting of the same message. The unwanted message would appear in many if not all newsgroups, just as SPAM appeared in all the menu items in the Monty Python sketch. The first usage of this sense was by Joel Furr[14] in the aftermath of the ARMM incident of March 31, 1993, in which a piece of experimental software released dozens of recursive messages onto the news.admin.policy newsgroup [1]. This use had also become established: to spam Usenet was to flood newsgroups with junk messages. The word was also attributed to the flood of "Make Money Fast" messages that clogged many newsgroups during the 1990s.[citation needed]

In 1998, the New Oxford Dictionary of English, which had previously only defined "spam" in relation to the trademarked food product, added a second definition to its entry for "spam": "Irrelevant or inappropriate messages sent on the Internet to a large number of newsgroups or users."[15]

There are three popular false etymologies of the word "spam". The first, promulgated by Canter & Siegel themselves, is that "spamming" is what happens when one dumps a can of SPAM luncheon meat into a fan blade. The second is the backronym "shit posing as mail." The third is similar, using "stupid pointless annoying messages."[citation needed] Another false etymology is the Esperanto interpretation: The term spamo (with the o-ending designating nouns) makes sense as "senpete alsendita mesaĝo", which means "a message sent to someone without request".[citation needed]

History of Internet "spam"

The earliest documented spam was a message advertising the availability of a new model of Digital Equipment Corporation computers sent to 393 recipients on ARPANET in 1978, by Gary Thuerk.[16][17][14] The term "spam" for this practice had not yet been applied.

Spamming had been practiced as a prank by participants in multi-user dungeon games, to fill their rivals' accounts with unwanted electronic junk.[17] The first known electronic chain letter, titled Make Money Fast, was released in 1988.

The first major commercial spam incident started on March 5, 1994, when a husband and wife team of lawyers, Laurence Canter and Martha Siegel, began using bulk Usenet posting to advertise immigration law services. The incident was commonly termed the "Green Card spam", after the subject line of the postings. Defiant in the face of widespread condemnation, the attorneys claimed their detractors were hypocrites or "zealouts", claimed they had a free speech right to send unwanted commercial messages, and labeled their opponents "anti-commerce radicals." The couple wrote a controversial book entitled How to Make a Fortune on the Information Superhighway.[17]

Later that year a poster operating under the alias Serdar Argic posted antagonistic messages denying the Armenian Genocide to tens of thousands of Usenet discussions that had been searched for the word Turkey.

Within a few years, the focus of spamming (and antispam efforts) moved chiefly to e-mail, where it remains today.[18] Arguably, the aggressive email spamming by a number of high-profile spammers such as Sanford Wallace of Cyber Promotions in the mid-to-late 1990s contributed to making spam predominantly an email phenomenon in the public mind.

Trademark issues

Hormel Foods Corporation, the maker of SPAM luncheon meat, does not object to the Internet use of the term "spamming". However, they did ask that the capitalized word "SPAM" be reserved to refer to their product and trademark.[19] By and large, this request is obeyed in forums which discuss spam. In Hormel Foods v SpamArrest, Hormel attempted to assert its trademark rights to stop SpamArrest, a software company, from using the mark "spam". In a dilution claim, Hormel argued that Spam Arrest's use of the term "spam" had endangered and damaged "substantial goodwill and good reputation" in connection with its trademarked lunch meat and related products. Hormel also asserted that Spam Arrest's name so closely resembles its luncheon meat that the public might become confused, or might think that Hormel endorses Spam Arrest's products. Hormel did not prevail. Attorney Derek Newman responded on behalf of Spam Arrest: "Spam has become ubiquitous throughout the world to describe unsolicited commercial e-mail. No company can claim trademark rights on a generic term." Hormel stated on its website: "Ultimately, we are trying to avoid the day when the consuming public asks, 'Why would Hormel Foods name its product after junk email?'"[20]

Hormel also made two attempts that were dismissed in 2005 to revoke the mark "SPAMBUSTER".[21]

Hormel's Corporate Attorney Melanie J. Neumann also sent SpamCop's Julian Haight a letter on August 27, 1999 requesting that he delete an objectionable image (a can of Hormel's SPAM luncheon meat product in a trash can), change references to UCE spam to all lower case letters, and confirm his agreement to do so.[22]

Costs of spam

The European Union's Internal Market Commission estimated in 2001 that "junk e-mail" cost Internet users €10 billion per year worldwide.[23]

The California legislature found that spam cost United States organizations alone more than $13 billion in 2007, including lost productivity and the additional equipment, software, and manpower needed to combat the problem.[24]

Spam's direct effects include the consumption of computer and network resources, and the cost in human time and attention of dismissing unwanted messages. In addition, spam has costs stemming from the kinds of spam messages sent, from the ways spammers send them, and from the arms race between spammers and those who try to stop or control spam. There is also the opportunity cost borne by those who forgo the use of spam-afflicted systems. Finally, victims bear both direct and indirect costs, related to the spamming itself and to other crimes that usually accompany it, such as financial theft, identity theft, data and intellectual property theft, virus and other malware infection, child pornography, fraud, and deceptive marketing.

The cost to providers of search engines is not insignificant:

        "The secondary consequence of spamming is that search engine indexes are inundated with useless pages, increasing the cost of each processed query."[1]

The methods of spammers are likewise costly. Because spamming contravenes the vast majority of ISPs' acceptable-use policies, most spammers have for many years gone to some trouble to conceal the origins of their spam. E-mail, Usenet, and instant-message spam are often sent through insecure proxy servers belonging to unwilling third parties. Spammers frequently use false names, addresses, phone numbers, and other contact information to set up "disposable" accounts at various Internet service providers. In some cases, they have used falsified or stolen credit card numbers to pay for these accounts. This allows them to quickly move from one account to the next as each one is discovered and shut down by the host ISPs.

The costs of spam also include the collateral costs of the struggle between spammers and the administrators and users of the media threatened by spamming. [25]

Many users are bothered by spam because it impinges upon the amount of time they spend reading their e-mail. Many also find the content of spam frequently offensive, in that pornography is one of the most frequently advertised products. Spammers send their spam largely indiscriminately, so pornographic ads may show up in a work place e-mail inbox, or a child's, the latter of which is illegal in many jurisdictions. Recently, there has been a noticeable increase in spam advertising websites that contain child pornography.[citation needed]

Some spammers argue that most of these costs could potentially be alleviated by having spammers reimburse ISPs and persons for their material.[citation needed] There are two problems with this logic: first, the rate of reimbursement they could credibly budget is not nearly high enough to pay the direct costs; and second, the human cost (lost mail, lost time, and lost opportunities) is basically unrecoverable.

E-mail spam exemplifies a tragedy of the commons: spammers use resources (both physical and human), without bearing the entire cost of those resources. In fact, spammers commonly do not bear the cost at all. This raises the costs for everyone. In some ways spam is even a potential threat to the entire e-mail system, as operated in the past.

Since e-mail is so cheap to send, a tiny number of spammers can saturate the Internet with junk mail. Although only a tiny percentage of their targets are motivated to purchase their products (or fall victim to their scams), the low cost may provide a sufficient conversion rate to keep the spamming alive. Furthermore, even though spam appears not to be economically viable as a way for a reputable company to do business, professional spammers need only convince a tiny proportion of gullible advertisers that it is viable in order to stay in business. Finally, new spammers go into business every day, and the low costs allow a single spammer to do a lot of harm before finally realizing that the business is not profitable.

Some companies and groups "rank" spammers; spammers who make the news are sometimes referred to by these rankings.[26][27] The secretive nature of spamming operations makes it difficult to determine how proliferated an individual spammer is, thus making the spammer hard to track, block or avoid. Also, spammers may target different networks to different extents, depending on how successful they are at attacking the target. Thus considerable resources are employed to actually measure the amount of spam generated by a single person or group. For example, victims that use common antispam hardware, software or services provide opportunities for such tracking. Nevertheless, such rankings should be taken with a grain of salt.

General costs of spam

In all cases listed above, including both commercial and non-commercial, "spam happens" because of a positive Cost-benefit analysis result.

Cost is the combination of

        * Overhead: The costs and overhead of electronic spamming include bandwidth, developing or acquiring an email/wiki/blog spam tool, taking over or acquiring a host/zombie, etc.
        * Transaction cost: The incremental cost of contacting each additional recipient once a method of spamming is constructed, multiplied by the number of recipients. (see CAPTCHA as a method of increasing transaction costs)
        * Risks: Chance and severity of legal and/or public reactions, including damages and punitive damages
        * Damage: Impact on the community and/or communication channels being spammed (see Newsgroup spam)

Benefit is the total expected profit from spam, which may include any combination of the commercial and non-commercial reasons listed above. It is normally linear, based on the incremental benefit of reaching each additional spam recipient, combined with the conversion rate.

Spam is prevalent on the Internet because the transaction cost of electronic communications is radically less than any alternate form of communication, far outweighing the current potential losses, as seen by the amount of spam currently in existence. Spam continues to spread to new forms of electronic communication as the gain (number of potential recipients) increases to levels where the cost/benefit becomes positive. Spam has most recently evolved to include wikispam and blogspam as the levels of readership increase to levels where the overhead is no longer the dominating factor. According to the above analysis, spam levels will continue to increase until the cost/benefit analysis is balanced[citation needed].
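The cost and benefit terms listed above can be put into a small worked sketch. The Python below is only an illustration under stated assumptions: the function and parameter names are invented here, the four cost components are simply added together, the transaction cost and the benefit are both treated as linear in the number of recipients, and risk and damage are treated as fixed amounts. The text gives no actual figures, so none are assumed.

```python
import math


def net_benefit(recipients, conversion_rate, profit_per_conversion,
                overhead, cost_per_recipient, risk=0.0, damage=0.0):
    """Benefit minus cost for one spam run (all names are illustrative).

    Benefit: linear in recipients, scaled by the conversion rate.
    Cost: overhead + transaction cost per recipient + risk + damage.
    """
    benefit = recipients * conversion_rate * profit_per_conversion
    cost = overhead + recipients * cost_per_recipient + risk + damage
    return benefit - cost


def break_even_recipients(conversion_rate, profit_per_conversion,
                          overhead, cost_per_recipient, risk=0.0, damage=0.0):
    """Smallest recipient count at which the run stops losing money.

    Returns None when each extra recipient costs at least as much as it
    is expected to bring in, so no volume ever breaks even.
    """
    margin = conversion_rate * profit_per_conversion - cost_per_recipient
    if margin <= 0:
        return None
    return math.ceil((overhead + risk + damage) / margin)
```

Under these assumptions, raising the per-recipient transaction cost (the role the list above assigns to CAPTCHAs) shrinks the margin and pushes the break-even volume up; once the margin reaches zero, no number of recipients makes the run pay.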

In crime

Spam can be used to spread computer viruses, trojan horses or other malicious software. The objective may be identity theft, or worse (e.g., advance fee fraud). Some spam attempts to capitalize on human greed, while other spam exploits the victims' inexperience with computer technology to trick them (e.g., phishing).

On May 31, 2007, one of the world's most prolific spammers, Robert Alan Soloway, was arrested by U.S. authorities.[28] Described as one of the top ten spammers in the world, Soloway was charged with 35 criminal counts, including mail fraud, wire fraud, e-mail fraud, aggravated identity theft and money laundering.[29] Prosecutors allege that Soloway used millions of "zombie" computers to distribute spam during 2003.[citation needed] This is the first case in which U.S. prosecutors used identity theft laws to prosecute a spammer for taking over someone else's Internet domain name.[citation needed]

Scammers developed software featuring an attractive blond girl who appears on the screen promising a striptease if the user enters the CAPTCHA code that is often required to tell humans from computers. Even after the user entered the code several times, the woman did not take off all her clothes; instead, the program restarted.[30]

Political issues

Spamming remains a hot discussion topic. In 2004, the seized Porsche of an indicted spammer was advertised on the Internet;[2] this revealed the extent of the financial rewards available to those who are willing to commit duplicitous acts online. However, some of the possible means used to stop spamming may lead to other side effects, such as increased government control over the Internet, loss of privacy, barriers to free expression, and the commercialization of e-mail.[citation needed]

One of the chief values favored by many long-time Internet users and experts, as well as by many members of the public, is the free exchange of ideas. Many have valued the relative anarchy of the Internet, and bridle at the idea of restrictions placed upon it.[citation needed] A common refrain from spam-fighters is that spamming itself abridges the historical freedom of the Internet, by attempting to force users to carry the costs of material which they would not choose.[citation needed]

An ongoing concern expressed by parties such as the Electronic Frontier Foundation and the ACLU has to do with so-called "stealth blocking", a term for ISPs employing aggressive spam blocking without their users' knowledge. These groups' concern is that ISPs or technicians seeking to reduce spam-related costs may select tools which (either through error or design) also block non-spam e-mail from sites seen as "spam-friendly". SPEWS is a common target of these criticisms. Few object to the existence of these tools; it is their use in filtering the mail of users who are not informed of their use which draws fire.[citation needed]

Some see spam-blocking tools as a threat to free expression, and laws against spamming as an untoward precedent for regulation or taxation of e-mail and the Internet at large. Even though it is possible in some jurisdictions to treat some spam as unlawful merely by applying existing laws against trespass and conversion, some laws specifically targeting spam have been proposed. The United States' CAN-SPAM Act of 2003, which took effect in 2004, provided ISPs with tools to combat spam. This act allowed Yahoo! to successfully sue Eric Head, reportedly one of the biggest spammers in the world, who settled the lawsuit for several thousand U.S. dollars in June 2004. But the law is criticized by many for not being effective enough. Indeed, the law was supported by some spammers and organizations which support spamming, and opposed by many in the antispam community. Examples of effective anti-abuse laws that respect free speech rights include those in the U.S. against unsolicited faxes and phone calls, and those in Australia and a few U.S. states against spam.[citation needed]

In November 2004, Lycos Europe released a screensaver called make LOVE not SPAM which made Distributed Denial of Service attacks on the spammers themselves. It met with a large amount of controversy and the initiative ended in December 2004.[citation needed]

Court cases

United States

Sanford Wallace and Cyber Promotions were the target of a string of lawsuits, many of which were settled out of court, up through the famous 1998 Earthlink settlement which put Cyber Promotions out of business.

Attorney Laurence Canter was disbarred by the Tennessee Supreme Court in 1997 for sending prodigious amounts of spam advertising his immigration law practice.

In 2005, Jason Smathers, a former America Online employee, pled guilty to charges of violating the CAN-SPAM Act. In 2003, he sold a list of approximately 93 million AOL subscriber e-mail addresses to Sean Dunaway who, in turn, sold the list to spammers.[31][32]

In 2007, Robert Soloway lost a case in a federal court against the operator of a small Oklahoma-based Internet service provider who accused him of spamming. U.S. Judge Ralph G. Thompson granted a motion by plaintiff Robert Braver for a default judgment and permanent injunction against him. The judgment includes a statutory damages award of $10,075,000 under Oklahoma law.[33]

In June 2007, two men were convicted of eight counts stemming from sending millions of e-mail spam messages that included hardcore pornographic images. Jeffrey A. Kilbride, 41, of Venice, California, was sentenced to six years in prison, and James R. Schaffer, 41, of Paradise Valley, Arizona, was sentenced to 63 months. In addition, the two were fined $100,000, ordered to pay $77,500 in restitution to AOL, and ordered to forfeit more than $1.1 million, the amount of illegal proceeds from their spamming operation.[34] The charges included conspiracy, fraud, money laundering, and transportation of obscene materials. The trial, which began on June 5, was the first to include charges under the CAN-SPAM Act of 2003, according to a release from the Department of Justice. The specific law that prosecutors used under the CAN-SPAM Act was designed to crack down on the transmission of pornography in spam.[35]

In 2005, Scott J. Filary and Donald E. Townsend of Tampa, Florida were sued by Florida Attorney General Charlie Crist for violating the Florida Electronic Mail Communications Act.[36] The two spammers were required to pay $50,000 USD to cover the costs of investigation by the state of Florida, and a $1.1 million penalty if spamming were to continue, the $50,000 was not paid, or the financial statements provided were found to be inaccurate. The spamming operation was successfully shut down.[37]

On June 25, 2008, Edna Fiedler, 44, of Olympia, Washington, pleaded guilty in a Tacoma court and was sentenced to 2 years' imprisonment and 5 years of supervised release or probation for her part in a $1 million Internet "Nigerian check scam." She conspired to commit bank, wire and mail fraud against US citizens, working with an accomplice who shipped counterfeit checks and money orders to her from Lagos, Nigeria, the previous November. Fiedler had shipped out $609,000 in fake checks and money orders by the time she was arrested, and was preparing to send an additional $1.1 million in counterfeit materials. The U.S. Postal Service also recently intercepted counterfeit checks, lottery tickets and eBay overpayment schemes with a face value of $2.1 billion.[38][39]

United Kingdom

In the first successful case of its kind, Nigel Roberts from the Channel Islands won £270 against Media Logistics UK who sent junk e-mails to his personal account.[40]

In January 2007, a Sheriff Court in Scotland awarded Mr. Gordon Dick £750 (the then maximum sum which could be awarded in a Small Claim action) plus expenses of £618.66, a total of £1368.66, against Transcom Internet Services Ltd.[41] for breaching anti-spam laws.[42] Transcom had been legally represented at earlier hearings but was not represented at the proof, so Dick got his decree by default. It is the largest amount awarded in compensation in the United Kingdom since the Nigel Roberts case in 2005 above.

References

      1. ^ a b Gyöngyi, Zoltán & Garcia-Molina, Hector (2005), "Web spam taxonomy", Proceedings of the First International Workshop on Adversarial Information Retrieval on the Web (AIRWeb), 2005 in The 14th International World Wide Web Conference (WWW 2005) May 10, (Tue)-14 (Sat), 2005, Nippon Convention Center (Makuhari Messe), Chiba, Japan., New York, N.Y.: ACM Press, ISBN 1-59593-046-9
      2. ^ []
      3. ^ FileOn List Builder-Extract URL,MetaTags,Email,Phone,Fax from www-Optimized Webcrawler
      4. ^ Email Mailing Lists - Email Address Lists - Buy
      5. ^ The (Evil) Genius of Comment Spammers - Wired Magazine, March 2004
      6. ^ See: Advance fee fraud
      7. ^ SA cops, Interpol probe murder -, 2004-12-31
      8. ^ Most Spam comes from the USA, says SophosLabs
      9. ^ "Getting the message, at last" (2007-12-14).
    10. ^ Origin of the term "spam" to mean net abuse
    11. ^ Origin of the term "spam" to mean net abuse
    12. ^ The Origins of Spam in Star Trek chatrooms
    13. ^ Spamming? ( - Google Groups USENET archive, 1990-09-26
    14. ^ a b At 30, Spam Going Nowhere Soon - Interviews with Gary Thuerk and Joel Furr
    15. ^ "Oxford dictionary adds Net terms" on
    16. ^ Reaction to the DEC Spam of 1978
    17. ^ a b c Tom Abate (May 3, 2008). "A very unhappy birthday to spam, age 30", San Francisco Chronicle.
    18. ^ Origin of the term "spam" to mean net abuse
    19. ^ SPAM and the Internet - Official SPAM Website
    20. ^ Hormel Foods v SpamArrest, Motion for Summary Judgement, Redacted Version (PDF)
    21. ^ Hormel Foods Corpn v Antilles Landscape Investments NV (2005) EWHC 13 (Ch)
    22. ^ Letter from Hormel's Corporate Attorney Melanie J. Neumann to SpamCop's Julian Haight
    23. ^ "Data protection: "Junk" e-mail costs internet users 10 billion a year worldwide - Commission study"
    25. ^ Thank the Spammers - William R. James 2003-03-10
    26. ^ Spamhaus' "TOP 10 spam service ISPs"
    27. ^ The 10 Worst ROKSO Spammers
    28. ^ Alleged 'Seattle Spammer' arrested - CNET
    29. ^ Alleged 'Seattle Spammer' arrested - CNET
    30. ^ Online Striptease Scam Makes Users Break the Codes
    31. ^ U.S. v Jason Smathers and Sean Dunaway, amended complaint, US District Court for the Southern District of New York (2003). Retrieved 7 March 2007, from []
    32. ^ Ex-AOL employee pleads guilty in spam case. (2005, February 4). CNN. Retrieved 7 March 2007, from []
    33. ^ Braver v. Newport Internet Marketing Corporation et al - U.S. District Court - Western District of Oklahoma (Oklahoma City), 2005-02-22
    34. ^ "Two Men Sentenced for Running International Pornographic Spamming Business". United States Department of Justice (October 12, 2007). Retrieved on 2007-10-25.
    35. ^ Gaudin, Sharon, Two Men Convicted Of Spamming Pornography InformationWeek, June 26, 2007
    36. ^ "Crist Announces First Case Under Florida Anti-Spam Law". Office of the Florida Attorney General. Retrieved on 2008-02-23.
    37. ^ "Crist: Judgment Ends Duo's Illegal Spam, Internet Operations". Office of the Florida Attorney General. Retrieved on 2008-02-23.
    38. ^, Woman gets prison for 'Nigerian' scam
    39. ^, Woman Gets Two Years for Aiding Nigerian Internet Check Scam (PC World)
    40. ^ Businessman wins e-mail spam case - BBC News, 2005-12-27
    41. ^ Gordon Dick v Transcom Internet Service Ltd.
    42. ^ Article 13-Unsolicited communications

        * Specter, Michael (2007-08-06). "Damn Spam". The New Yorker. Retrieved on 2007-08-02.

Newsgroups

        * others including*
        * news:alt.spam [alt.spam]

See also

        * Address munging (avoidance technique)
        * Bacon (electronic)
        * E-mail fraud
        * Identity theft
        * Image spam
        * Internet Troll
        * Job scams
        * Junk mail
        * Malware
        * Network Abuse Clearinghouse
        * Advance fee fraud (Nigerian spam)
        * Phishing
        * Scam
        * Social networking spam
        * SORBS
        * Spam
        * SpamCop
        * Spamigation
        * Spam Lit
        * Spoetry
        * Sporgery
        * Virus (computer)
        * Vishing

History

        * Howard Carmack
        * Make money fast
        * Sanford Wallace
        * Spam King
        * UUnet and the Usenet Death Penalty

External links

        * Spamtrackers SpamWiki: a peer-reviewed spam information and analysis resource.
        * Federal Trade Commission page advising people to forward spam e-mail to them
        * Slamming Spamming Resource on Spam
        * Why am I getting all this spam? CDT
        * Cybertelecom:: Federal SPAM law and policy
        * Reaction to the DEC Spam of 1978 Overview and text of the first known internet email spam.


Re: meta-insult (-1, Offtopic)

X0563511 (793323) | more than 6 years ago | (#24105509)

From Wikipedia, the free encyclopedia

Stupidity (also called fatuity) is the property a person, action or belief instantiates by virtue of having or being indicative of low intelligence or poor learning abilities. Stupidity is distinct from irrationality because stupidity denotes an incapability or unwillingness to properly consider the relevant information. It is frequently used as a pejorative, and consequently has a negative connotation. The term has fallen out of favor in medical journals as it is seen as a generic term used to describe a wide variety of conditions.[citation needed]

        * 1 In politics
        * 2 In comedy
        * 3 Group stupidity
        * 4 See also
        * 5 References
        * 6 External links

In politics

Robert J. Sternberg notes that many politicians have acted in ways that were stupid despite indications of general intelligence.[1] He argues that there is an inherent psychological drive causing some acts of stupidity.

In comedy

The fool or buffoon has been a central character in much comedy. Alford and Alford found that humor based on stupidity was prevalent in "more complex" societies as compared to some other forms of humor.[2] Some analysis of Shakespeare's comedy has found that his characters tend to hold mutually contradictory positions; because this implies a lack of careful analysis, it indicates stupidity on their part.[3] Today there is a wide array of television shows that showcase stupidity, such as The Simpsons.[4]

Group stupidity

In psychology, group stupidity is known as deindividuation in crowds, and can lead to behaviors usually not displayed outside the specific social situation. The behaviors are attributed to a variety of causes, including loss of self-identity, incentives to conform to group behavior, and other dynamics.[5]

See also

        * Bounded rationality
        * Darwin Awards
        * Genius
        * Ignorance
        * Irrationality
        * World Stupidity Awards


References

      1. ^ Sternberg, Robert J. Why Smart People Can Be So Stupid. Yale University Press, 2003.
      2. ^ Finnegan Alford; Richard Alford. A Holo-Cultural Study of Humor. Ethos 9(2), pg 149-164.
      3. ^ N Frye. A Natural Perspective: The Development of Shakespearean Comedy and Romance. Columbia University Press, 1995.
      4. ^ R Hobbs. The Simpsons Meet Mark Twain: Analyzing Popular Media Texts in the Classroom. The English Journal, 1998.
      5. ^ Reicher, S.D., R. Spears, and T. Postmes. A Social Identity Model of Deindividuation Phenomena. European Review of Social Psychology 6, 1995.

External links
        * In praise of irrationality
        * "Unskilled and unaware of it: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments" The authors received the 2000 Ig Nobel Prize in psychology.
        * The Power of Stupidity by Giancarlo Livraghi, a series of nine papers on the nature of human stupidity.
        * Understanding Stupidity by James F Welles, Ph.D.


Re:An order of magnitude over XML? (5, Interesting)

dedazo (737510) | more than 6 years ago | (#24105539)

Looks like Google just invented the IIOP [] wire protocol, which is also platform agnostic and an open standard.

I guess the main difference here is that their "compiler" can generate the actual language-domain classes off of the descriptor files, which is a definite advantage over "classic" IDL.

"Google protocol Buffers" is cooler than the OMG terminology, but this kind of thing has been around for 20 years.

Re:An order of magnitude over XML? (4, Funny)

alexgieg (948359) | more than 6 years ago | (#24105683)

An order of magnitude over XML? So is, well, just about anything.

Well, let's also not forget that the meaning of the expression "an order of magnitude" depends strongly on the numeric base you're using.
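
To put numbers on that, here is a minimal sketch (the function name is made up for illustration) that counts how many "orders of magnitude" a given speedup amounts to in a chosen base:

```python
import math

def orders_of_magnitude(speedup, base=10):
    """Number of base-`base` orders of magnitude in a speedup factor."""
    # log_base(speedup), written out via natural logs
    return math.log(speedup) / math.log(base)

# A 10x speedup is one order of magnitude in base 10,
# but a bit more than three in base 2.
decimal_orders = orders_of_magnitude(10)
binary_orders = orders_of_magnitude(10, base=2)
```

So "at least an order of magnitude" in binary would only promise a factor of two.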

Re:An order of magnitude over XML? (1)

jellomizer (103300) | more than 6 years ago | (#24105855)

But the Slashdot ad above the message says XML combined with Java is fast, and that the slow part is the database server. Could I be mistaken?

If I (0, Troll)

Xenobiotic (1230540) | more than 6 years ago | (#24105183)

If I was the first to comment this, I would say "Cool!"

Re:If I (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#24105229)

Google ads garbled the top level page entry for me.

Re:If I (-1, Troll)

Anonymous Coward | more than 6 years ago | (#24105235)

If I was the first to comment this, I would say "Cool!"

Lame dude, lame.

Why another encoding scheme? (1)

gladish (982899) | more than 6 years ago | (#24105211)

Isn't xdr compact enough?

Re:Why another encoding scheme? (0)

Anonymous Coward | more than 6 years ago | (#24105485)

If xdr isn't, perhaps packed ASN.1?

Someone invented another "standard" serialization format. Woohoo.

10x more efficient than XML is nice. Not a big accomplishment, really.

Re:Why another encoding scheme? (2, Informative)

MightyMartian (840721) | more than 6 years ago | (#24105721)

It's not hard because XML has to be the most bloated (and yet still, ironically, nowhere near human-readable) format ever invented. That it has not only not been discarded, but is now being used to store binary blobs by guys like Microsoft, is testimony to the sheer overwhelming stupidity of a lot of developers.

Re:Why another encoding scheme? (4, Insightful)

QuoteMstr (55051) | more than 6 years ago | (#24105749)

This is just yet another way in which Google demonstrates that it is suffering from NIH syndrome [] . Instead of improving existing tools, they have to go off and re-invent all the bad mistakes of past, including non-relational databases [] , clunky [] binary encodings, and a bizarre non-POSIX filesystem [] .

Just imagine how far ahead we would be today if Google had put the same effort into creating tools the rest of the SQL-writing, open(2)-using world could use.

Re:Why another encoding scheme? (1)

QuoteMstr (55051) | more than 6 years ago | (#24106165)

I'm not trolling. I genuinely believe what I've written above.

Likely story! (4, Funny)

TheRealMindChild (743925) | more than 6 years ago | (#24105217)

"Google's blogger claims, "And, yes, it is very fast -- at least an order of magnitude faster than XML."

That is just because they aren't using enough XML!

Re:Likely story! (-1)

DriedClexler (814907) | more than 6 years ago | (#24105265)

To anyone who seriously believes Google's protocol is an order of magnitude faster than XML, I have two words for you:


Re:Likely story! (3, Informative)

caerwyn (38056) | more than 6 years ago | (#24105389)

Are you serious? XML is great for certain applications, but the one thing it *isn't* is fast. It's very believable that something like this could be an order of magnitude faster.

Re:Likely story! (1)

lgw (121541) | more than 6 years ago | (#24106205)

I've written an XML parser that was an order of magnitude faster than XML! Seriously, most XML parsers are horrifically bloated, making XML an order of magnitude slower than it needs to be. Their claims of 40-100x faster are believable, when compared to a typical XML parser.

Re:Likely story! (4, Funny)

jandrese (485) | more than 6 years ago | (#24105409)

Yeah, I mean XML didn't earn its reputation for being lightning fast and byte efficient for nothing...

Re:Likely story! (5, Insightful)

cduffy (652) | more than 6 years ago | (#24105497)

Being 10x faster than XML to work with is entirely believable: If you're serializing directly to binary structures, those structures can be directly manipulated without any parsing at all... and if you need to do some byte-swapping and alignment adjustments to get them into and out of native form for your current processor, those are still operations which can be performed in a matter of a few CPU instructions, rather than through a few hundred KB of libraries.
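
As a rough illustration of that point (not Google's format, just a sketch with a made-up two-field record), compare packing a record into a fixed binary layout with Python's struct module against round-tripping the same record through XML text:

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record layout: unsigned 32-bit id plus 64-bit float score.
# "<" pins the byte order to little-endian, so any byte-swapping for the
# host CPU is handled inside pack/unpack.
RECORD = struct.Struct("<Id")

def to_binary(rec_id, score):
    return RECORD.pack(rec_id, score)

def from_binary(blob):
    # No text parsing: the fields are read straight out of the bytes.
    return RECORD.unpack(blob)

def to_xml(rec_id, score):
    root = ET.Element("record")
    ET.SubElement(root, "id").text = str(rec_id)
    ET.SubElement(root, "score").text = repr(score)
    return ET.tostring(root)

def from_xml(blob):
    # Tokenize markup, build a tree, then convert text back to numbers.
    root = ET.fromstring(blob)
    return int(root.findtext("id")), float(root.findtext("score"))

binary_blob = to_binary(7, 0.5)
xml_blob = to_xml(7, 0.5)
```

Both paths return the same values, but the binary blob is 12 bytes and decoding it is a single unpack call, while the XML version has to be parsed as text first. Actual speed differences depend on the parser and the data, so treat this as a shape-of-the-work comparison rather than a benchmark.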

I drink the XML kool-aid plenty -- but there are things it's good for, and things it's not. Serializing and parsing truly massive amounts of data is part of the latter set.

Re:Likely story! (0)

Anonymous Coward | more than 6 years ago | (#24106087)

As someone pointed out, it depends on what number base you're referring to. If it's binary then that's only twice as fast as XML.
(That's a joke, BTW).

Seriously, however- does anyone really think that Google, of all companies, would be running a different setup if XML provided faster, more reliable options? Hardly. One thing that Google is not, is slow. I would say that whatever they are running is working pretty damn good.

Re:Likely story! (0)

Anonymous Coward | more than 6 years ago | (#24106151)

true, basically XML aims at a higher level while this is supposed to be low-level enough that it wastes no (well, hardly any) resources

Re:Likely story! (1)

Reality Master 101 (179095) | more than 6 years ago | (#24105607)

To anyone who seriously believes Google's protocol is an order of magnitude faster than XML, I have two words for you: No.

You're right -- if it's less than two orders of magnitude faster, I would be very surprised.

Re:Likely story! (2, Insightful)

dedazo (737510) | more than 6 years ago | (#24105631)

The 10x does not refer to the transmission speed (you're not getting that for a 100KB XML string vs. an 80KB binary blob), but the speed at which the [de]serialization occurs.

In fact this approach is even faster than runtime-specific stream serialization like cPickle in Python or the built-in binary formatter in the .NET CLR, because those use reflection.

Good (0, Troll)

Darkness404 (1287218) | more than 6 years ago | (#24105233)

It is good that Google has started making more things open source, but they still have a long way to go. Right now they are comparable to Apple: they like open source and make some open source products, but unlike Red Hat they don't make everything open source. I guess we should be glad they aren't like Microsoft, where nothing is open source.

Re:Good (0)

Anonymous Coward | more than 6 years ago | (#24105609)

So... the only factor in determining how "good" a company is is what percentage of its products are open-source? I hate to shatter your Stallman-derived utopian view of the world, but not every company can make money (which is the purpose of a company, BTW, not to make nifty little code trinkets for you to play with) by having entirely open-source products. If Google open-sourced its advertising-content generators, for example, any two-bit web startup could use them and make just as much money, and Google would be no more. Get real.

Re:Good (1)

drinkypoo (153816) | more than 6 years ago | (#24105657)

Microsoft has open-sourced some things upon abandonment. That's better than some companies, even. Companies can be good in some areas, and evil in others, however.

I bet ... (5, Funny)

Anonymous Coward | more than 6 years ago | (#24105239)

... it requires piping data through google's servers for data mining and ad injection purposes.

Re:I bet ... (2, Funny)

eddy (18759) | more than 6 years ago | (#24105821)

Hey, that's a pretty cool concept.

$ cat spanish.txt | [] | grep "terrorist"

I'm sure I'm years late to the party. <sigh>

What? (1)

Yvan256 (722131) | more than 6 years ago | (#24105243)

Is that like PHP's serialize?

Re:What? (1)

psergiu (67614) | more than 6 years ago | (#24105517)

More like the Oracle SQLLoader ...
Or the VMS Fixed Record Length/Indexed or VFC files ...

I think Google might just receive a visit from the patent fairy ...

Re:What? (1)

Foofoobar (318279) | more than 6 years ago | (#24105893)

No. This is more along the lines of a hashmap or a multidimensional array. With serialize in PHP, you still have to unserialize which takes time to parse. With a multidimensional array, it's already in a usable state; no additional parsing is required. And you can add on or remove variables whenever you want without having to reparse.

Re:What? (2, Informative)

merreborn (853723) | more than 6 years ago | (#24105927)

1) It has a binary format, far more compact (and faster to unserialize) than PHP's text-based serialized format.
2) It handles multiple versions of the same objects (e.g., your server can interact with both PhoneNumber 2.0 and PhoneNumber 3.0 objects relatively trivially)
3) It generates code for converting each format into objects in their 3 supported languages.

So, no, not really.

Faster than XML? (1)

sdsucks (1161899) | more than 6 years ago | (#24105273)

I must say - I'm amazed.

No PERL API ??!!?? (4, Insightful)

Proudrooster (580120) | more than 6 years ago | (#24105275)


what about PERL ? :]

Re:No PERL API ??!!?? (4, Insightful)

Anonymous Coward | more than 6 years ago | (#24105317)

Go out and write one, sonny!

That's the beauty of open source.

Re:No PERL API ??!!?? (2, Funny)

Shut the fuck up! (572058) | more than 6 years ago | (#24105425)

perl -e 'print "Shut the fuck up!\n"'

Re:No PERL API ??!!?? (0)

Anonymous Coward | more than 6 years ago | (#24105473)

That must suck, posting at -1...your post was funny...for me at least...

Re:No PERL API ??!!?? (1, Troll)

Shut the fuck up! (572058) | more than 6 years ago | (#24105549)

I've been posting at minus one since the early seventies.

Re:No PERL API ??!!?? (1)

jandrese (485) | more than 6 years ago | (#24105429)

I'm sure it won't take long for the module to show up on CPAN.

Re:No PERL API ??!!?? (-1, Troll)

helicologic (845077) | more than 6 years ago | (#24105511)

Perl is to code what xml is to data: to be avoided at any cost if you want a scalable, intelligible interchange. XML and perl indeed do belong on the same trash truck to the dump.

Re:No PERL API ??!!?? (0)

Anonymous Coward | more than 6 years ago | (#24105537)

I've had to work with PERL for the past month, and I'm not that impressed with it. It seems so... haphazard.

Re:No PERL API ??!!?? (1)

fbjon (692006) | more than 6 years ago | (#24105863)

You... You have GOT to be new here.

Re:No PERL API ??!!?? (1)

Goaway (82658) | more than 6 years ago | (#24106045)

You're not really going to see the benefits of Perl in one month. It's not a very straightforward language like that.

Re:No PERL API ??!!?? (1)

Rgb465 (325668) | more than 6 years ago | (#24105597)

That's OK, we have Storable [] .

Re:No PERL API ??!!?? (5, Informative)

yknott (463514) | more than 6 years ago | (#24105601)

According to the blog [] of Brad Fitzpatrick (of LiveJournal fame), he's working on Perl support.

Re:No PERL API ??!!?? (1)

dedazo (737510) | more than 6 years ago | (#24105679)

Yeah, and I'd like this for the .NET CLR and Mono as well. I looked at the code and the generators are not that complicated, maybe I'll give it a shot over the weekend. Does Google accept outside contribs for projects like these?

Re:No PERL API ??!!?? (1)

A beautiful mind (821714) | more than 6 years ago | (#24105809)

It's called "Perl".

We love the sight of power (1)

heroine (1220) | more than 6 years ago | (#24105307)

Just think of the kind of power it took to make millions of employees standardize on the same format for their data interchange. Humans just gravitate to power wielding forces. Wonder what format they require for their surprise blog posts.

How about C? (1)

microbee (682094) | more than 6 years ago | (#24105365)

SunRPC is old and awkward. Always want something better.

Re:How about C? (2, Insightful)

AuMatar (183847) | more than 6 years ago | (#24105761)

They gave you C++. If you can't translate C++ to C, please turn in your keyboard and leave.

Re:How about C? (3, Funny)

vigmeister (1112659) | more than 6 years ago | (#24106075)

Well, I can't translate C++ to C until after it is DECLASSIFIED...



do you think it makes you look smart? (0)

Anonymous Coward | more than 6 years ago | (#24105387)

what's up with everyone going on about "an order of magnitude"? do you think it makes you look smart? it doesn't.

Re:do you think it makes you look smart? (1)

neokushan (932374) | more than 6 years ago | (#24106009)

I think it makes me look an order of magnitude smarter, yes.

Now just release Goobuntu... (1)

mdm-adph (1030332) | more than 6 years ago | (#24105395)

...and we'll be happy.

Re:Now just release Goobuntu... (1, Insightful)

Anonymous Coward | more than 6 years ago | (#24105651)

Trust me, you won't.

Typed on a Goobuntu machine.

Re:Now just release Goobuntu... (0)

Anonymous Coward | more than 6 years ago | (#24105685)

Believe me, you're happier with regular ubuntu Hardy. Or whatever's newest. Which is not Goobuntu.

Re:Now just release Goobuntu... (2, Funny)

fph il quozientatore (971015) | more than 6 years ago | (#24105711)

Here, fixed the typo for you:

Now just release Boobuntu...
... and we'll be happy

Back to the 70's night? (1, Insightful)

Madball (1319269) | more than 6 years ago | (#24105397)

But here, in an unusual departure from the norm, the default values for these members are set to digits (for strings or literals) or values (for numerals) that define their place in a sequence -- where they fall within a record.

Wow! They've invented fixed position data files. What will they invent next, a cool new programming language called RPG?

Re:Back to the 70's night? (3, Insightful)

Temporal (96070) | more than 6 years ago | (#24105487)

Wow! They've invented fixed position data files. What will they invent next, a cool new programming language called RPG?

The article is actually completely wrong there. The protocol buffer binary format uses tag/value pairs, not fixed positions. Parsers simply ignore any tag they don't recognize and move on to the next.
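
To illustrate the tag/value idea, here is a minimal Python sketch. It assumes a made-up layout (1-byte tag, 1-byte length, then the payload) and is not the actual Protocol Buffers wire encoding; the function names are hypothetical. It only shows why a reader can skip a tag it doesn't know:

```python
def encode_field(tag: int, payload: bytes) -> bytes:
    """Encode one field as: 1-byte tag, 1-byte length, then the payload."""
    if not 0 <= tag <= 255 or len(payload) > 255:
        raise ValueError("tag and payload length must each fit in one byte")
    return bytes([tag, len(payload)]) + payload


def decode_known(data: bytes, known_tags: set) -> dict:
    """Return {tag: payload} for known tags, skipping everything else."""
    fields = {}
    pos = 0
    while pos < len(data):
        tag, length = data[pos], data[pos + 1]
        payload = data[pos + 2 : pos + 2 + length]
        if tag in known_tags:
            fields[tag] = payload
        # Unknown tags fall through: the length says how far to skip.
        pos += 2 + length
    return fields


# A newer writer adds tag 9; an older reader only knows tags 1 and 2.
message = (
    encode_field(1, b"alice")
    + encode_field(9, b"new-field")
    + encode_field(2, b"\x2a")
)
old_reader = decode_known(message, known_tags={1, 2})
```

The old reader still gets tags 1 and 2 and silently steps over tag 9, which is the property that fixed-position records lack.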

As a former user of CORBA (5, Interesting)

Anonymous Coward | more than 6 years ago | (#24105431)

It looks like Google has taken some of the good elements of CORBA and IIOP into its own interchange format.
While CORBA certainly is bloated in a lot of ways, the IIOP wire protocol it uses is vastly faster and more efficient than any XML out there, and yes, it is just as "open" (publicly documented and freely available for use in any open source application) as any XML schema out there. J2EE uses IIOP as well, and it is technically possible to interoperate (although the problem with CORBA is that different implementations never really interoperated as they were supposed to).
    As a side note, I'd rather write IDL code than an XML schema any day of the week too, but that's another rant.

compare to thrift ( from facebook) (5, Informative)

Anonymous Coward | more than 6 years ago | (#24105439)

Both really come from the same design sheet, but Thrift has been open-sourced for over a year and has many more language bindings. It's been in use in several open source projects (thrudb comes to mind), and has far more extant articles and documentation.

Fast (5, Interesting)

JamesP (688957) | more than 6 years ago | (#24105457)

"And, yes, it is very fast -- at least an order of magnitude faster than XML."

Just wait for the XML zealots to come crashing and not believing that XML is not the fastest, best, solution to all the world's problems (including cancer) and of course people at Google are amateurs and id10ts and WHY DO YOU HATE XML kind of stuff.

Or, as Joel Spolsky once said: []

No, there is nothing wrong with XML per se, except for the fans...

Ok, I'll bite... (5, Interesting)

Dutch Gun (899105) | more than 6 years ago | (#24105961)

Obviously, those at Google felt XML didn't work well for them. They have the resources to invent a protocol and libraries to support it. And, they are big enough to be their own ecosystem, which means as long as everyone at Google is using their formats, interop is no biggie. Good for them, I don't begrudge that decision.

I'm actually a game developer, not a web developer, so I'll speak to XML's use as a file format in general. Here are a few points regarding our use of XML:

* We only use it as a source format for our tools. XML is far too inefficient and verbose to use in the final game - all our XML data is packed into our own proprietary binary data format.
* We also only use it as a meta-data format, not a primary container type. For instance, we store gameplay scripts, audio script, and cinematic meta-data in XML format. We're not foolish enough to store images, sounds, or maps in a highly-verbose, text-based format. XML's value to us is in how well it can glue large pieces of our game together.
* All our latest tools are written in C# and using the .NET platform (Windows is our development platform, of course). It's astoundingly easy to serialize data structures to XML using .NET libraries - just a few lines of code.
* Because it's a text-based format and human readable, if a file breaks in any way, we can just do a diff in source control to see what changed, and why it's breaking.

I'll make a concession that I've heard of some pretty awful uses of XML. But those who dismiss XML as a valuable tool in the toolchest are just as foolish as those who believe it's the end-all and be-all of programming (I'm not saying that's true of you, just pointing out foolishness on both sides). Like any tool, it's most valuable when used in its optimal role, not when shoehorned into projects as a solution to everything.

Smart move (5, Insightful)

ruin20 (1242396) | more than 6 years ago | (#24105491)

Since they're Google, people will clamor over this (as we're doing here), and the result will be that at least a handful of folks will learn and use it. Google's key to success has always been finding fresh talent and removing barriers to their contribution and advancement, so from what I've seen, what they've done is A) help train potential employees on how their tech and thought process work, and B) provide themselves a filter by which to gauge the ability of a potential employee to understand their system.

And as a bonus, they help undermine opponents who use competing technologies by helping train the workforce away from their practices. Overall I think it's a very intelligent and well-executed strategic move.

XML was not created for speed (1)

UseCase (939095) | more than 6 years ago | (#24105545)

Binary encoding, non-hierarchical string lists, and simple file serialization are all faster than XML. XML was created for flexibility, commonality and human readability, not speed. XSL, XQuery, and XPath, along with the DOM or SAX, supply out-of-the-box query, transformation, and manipulation capability.

Those who don't understand ASN1 (0)

Anonymous Coward | more than 6 years ago | (#24105565)

Are doomed to re-invent it.

The killer feature is simplicity (5, Insightful)

jandrese (485) | more than 6 years ago | (#24105571)

The point of this isn't so much that it's faster than XML (so is everything else), it's that Google took everything that a real person needs in an IDL and cut out everything else. Most IDLs have a serious case of second system effect, where features are added that nobody uses but that seriously complicate the API. Even XML suffers from that (have you ever seen the kind of data structure you need to store a DOM, or what that does to library APIs for manipulating XML?).

I'd use it because 95% of the time all I need is something simple like this, and the other 5% of the time I should go back and rethink my design anyway.

That said, there is still a case for XML, especially the self documenting and human readable nature of the document, but there are a lot of cases where it is used today where it only adds unnecessary complexity and actually makes your code more difficult to maintain instead of simpler.

When can we talk this way? (1)

Sybert42 (1309493) | more than 6 years ago | (#24105579)

So...when can we abandon these silly letters and decimal numbers to express ourselves in binary? It's like the elephant in the room. We all want a semantic web, but we all want it in English. At least Lojban has a start on a parsable language, but it still wants to be speakable.

This isn't new... (0)

Anonymous Coward | more than 6 years ago | (#24105637)

This looks very, very similar to ASN.1 which has been around for years.

Re:This isn't new... (1)

natoochtoniket (763630) | more than 6 years ago | (#24106067)

Of course it's not new. It not only looks like ASN.1, it actually is very much like ASN.1. But to me it looks more like an extension of rpcgen, because ASN.1 came with a lot of other baggage. Of course, both rpcgen and asn.1 are just the best known implementations of ideas that were developed far earlier. Shannon's book on information theory explains just this sort of prefix code. These kinds of prefix codes have been in use since the 1960s, and code-generators have been around since the 1970s.

I think the reason that some people at google think it's new is because they are all young. Young people are constantly coming up with "new" ideas that are really two decades or more old. The idea seems new to the young person because he/she has not seen it before. That isn't a jab at google, or at young people. It is just a fact that everything seems new until after you have seen it before.

XML is a crappy format (4, Insightful)

Alex Belits (437) | more than 6 years ago | (#24105649)

I always told people that -- it's optimized for:

1. Easy parsing by parsers written by people who slept through their compiler classes.

2. Verification in situations when it's impossible to devise a meaningful reaction to a failure (other than either "everything failed, turn off the computers and go home" or "assume the data to be valid anyway because ALL of it will have the same formatting error because the same program generates it")

3. Dealing with data that arrives in neatly packaged "documents" and "requests", as opposed to being constantly produced and consumed.

4. Either communicating between programs that have the same knowledge of message semantics, or preparation of pretty human-readable documents.

None of the above even remotely applies to anything practical except UI/display formats -- this is why XHTML and ODF (and because of that, to some extent, XSL) are usable, SOAP is a load of crap, and for the rest of purposes XML is used as a glorified CSL with angle brackets. XML is widespread because a monumentally stupid standard is still better than no standard.

So here is your example of how superior ANY format that is not based on this stupid idea can be.

Re:XML is a crappy format (1)

mattcasters (67972) | more than 6 years ago | (#24106221)

  5. Handles codepages very well
  6. Supports Unicode
  7. Handles codepages very well
  8. Supports Unicode
  9. Handles codepages very well

and did I mention this one?

10. Supports Unicode

WTF am I missing (0, Troll)

youngdev (1238812) | more than 6 years ago | (#24105663)

How do you "Open Source" a Data Interchange format? There is no source to open. You just specify a format and distribute the specs to your trading partners and there you go. I am getting so damn sick and tired of open source this and open source that. The words "Open Source" have a specific meaning. It is not a catch-all term for friendly licensing. So Google published the format it uses to push data around. Big Fuckin Deal. In case anyone is interested, I organize my DVD collection by genre and title alphabetically on a book shelf. Look mom, I open sourced my data storage format. Ridiculous.

Re:WTF am I missing (5, Informative)

jandrese (485) | more than 6 years ago | (#24105701)

They open sourced the compiler (for C++, Java, and Python) that lets you actually use the data interchange format. If you follow the link you can download the code and start using it today. The code is open source.

Re:WTF am I missing (5, Insightful)

Chyeld (713439) | more than 6 years ago | (#24105753)

Seems like you are missing the code they released that allows you to implement this in a number of languages from the 'get-go'.

You've also missed that they've just told the world how the majority of their systems talk, something most people would find interesting given how much Google does and the fact that one of Google's strong points is mangling huge amounts of data relatively quickly.

PS. Your format stinks and is horribly slow and unscalable when it comes to adding to the library. Genres are so unbelievably vaguely defined that you might as well just sort them by the dominant color of the cover. Google would have done better.

Re:WTF am I missing (1, Insightful)

Anonymous Coward | more than 6 years ago | (#24105795)

You are missing that you're an idiot. Cheers.

Re:WTF am I missing (1)

shis-ka-bob (595298) | more than 6 years ago | (#24105835)

You open access to the source code of the C++, Java and Python libraries that you use in your internal work.

Re:WTF am I missing (0)

Anonymous Coward | more than 6 years ago | (#24105915)

They open sourced their implementation. RTFA.

JSON (4, Interesting)

hey (83763) | more than 6 years ago | (#24105729)

Looks kinda like JSON to me.

Re:JSON (1)

SuperKendall (25149) | more than 6 years ago | (#24105777)

I was kind of wondering the same thing, JSON was created to fill the same need. JSON is more like XML in that it's meant to be human parsable though, which counts for a lot in web use I think.

faster than XML ?? (0, Redundant)

arthurpaliden (939626) | more than 6 years ago | (#24105747)

XML cannot have a speed. XML is a format. It has no speed. The speed in which it is processed is entirly dependant on the software and algorithums used to process it.

Re:faster than XML ?? (0)

Anonymous Coward | more than 6 years ago | (#24105891)

The speed in which it is processed is entirly dependant on the software and algorithums used to process it.

I uhgree, it's entirly up to the algorithums.

Have they ever heard of BER/DER? (2, Insightful)

ugen (93902) | more than 6 years ago | (#24105755)

How is this either implementationally or conceptually different from BER/DER encoding (commonly used and available all over the place)?

Looks to me like it is exactly the same thing, reimplemented. I am sure bearing a mark of Google is nice and all, but they are definitely reinventing the wheel here.

Re:Have they ever heard of BER/DER? (1)

Dan Berlin (682091) | more than 6 years ago | (#24105819)

Have you ever met anyone who worked with ASN.1 and didn't run screaming for the hills?

Re:Have they ever heard of BER/DER? (1)

forsetti (158019) | more than 6 years ago | (#24105913)

Yeah - those guys at MIT (Kerberos), UMich (LDAP), and the SSL guys ... not that anyone uses any of those protocols/implementations ...

ASN.1 is the solution ... the problem just hasn't been properly specified yet.

Re:Have they ever heard of BER/DER? (3, Funny)

Dan Berlin (682091) | more than 6 years ago | (#24106053)

Uh, having one of the OpenSSL guys working down the hall, he certainly said he would shoot himself if he had to work with ASN.1 again.

Re:Have they ever heard of BER/DER? (1)

forsetti (158019) | more than 6 years ago | (#24106201)

Heh -- I'm not talking about the *implementers* ... just the *protocol designers*. That's why I left out *OpenLDAP* and *OpenSSL*. ;-)

Seriously though - ASN.1 is pretty good for specification, and some of the serializations aren't bad. Fairly compact, flexible ... But don't code it by hand - use an ASN.1 compiler.

Wow, they've reinvented FAST!!! (1)

Giant Electronic Bra (1229876) | more than 6 years ago | (#24105773)

lol. Not that FAST is IDENTICAL, but it is essentially just a much more sophisticated implementation of the same basic idea...

Re:Wow, they've reinvented FAST!!! (1)

Dan Berlin (682091) | more than 6 years ago | (#24105831)

I don't think you "get" it. Google open sourced this because they thought it would be cool, not because they think it is an amazingly new idea that nobody has ever done before. It's not like Google hasn't been using this internally for 5 years (Which of course, makes all the JSON comments humorous).

XDR? (1)

Rene S. Hollan (1943) | more than 6 years ago | (#24105837)

I guess that XDR wasn't good enough, then, or ASN.1 (which supports multiple abstract encodings to boot).

XML, as an interchange format?

I suppose one could load source code into memory, and compile it every time, too. Even Java compiles to bytecode.

Bloated formats are fine for human interpretation (I rather like one kind of structure for my config files), or occasional parsing (which is why most of the stuff in /etc is human-readable), for small data sets (I do remember when "the internet" was one big /etc/hosts file), but for interchange? Just 'cause you're big-endian and I'm little-endian?

The trick to making non-human readable formats acceptable is the prevalence of widespread encoding and decoding tools.

Yes, XML is self-describing, at least syntactically (and formally with an XSD), and specific encoding semantics can be tagged, but the same can be achieved with means for type encoding. The big thing with XDR and related formats is that types are implicit -- both ends need to know what is being serialized. For RPC, with well-defined interfaces, this is not a problem, but it does make type-checking a remote service a bit of a challenge.

However, types can be encoded as data, and serialized as well: this happens for variant types naturally. Thus, there is no reason to not have a type-encoding and type-exchange protocol to permit dynamic type-checking. The advantage over self-describing data serializations is that it can be done on an as-required basis, instead of with every damn serialization.
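
A rough Python sketch of that idea, under stated assumptions: the schema layout, the `Point` type, and all function names here are invented for illustration and do not come from XDR, ASN.1, or Protocol Buffers. The type description is itself serialized and sent once, on request, while each record carries only packed values:

```python
import json
import struct

# Hypothetical type description: field names paired with struct codes.
SCHEMA = {"name": "Point", "fields": [["x", "i"], ["y", "i"], ["weight", "d"]]}


def schema_message(schema: dict) -> bytes:
    """Serialize the type description itself; sent once, on request."""
    return json.dumps(schema).encode("utf-8")


def struct_format(schema: dict) -> str:
    """Build a little-endian, unpadded struct format from the schema."""
    return "<" + "".join(code for _, code in schema["fields"])


def pack_record(schema: dict, values: dict) -> bytes:
    """Pack values only; no field names or types travel with the record."""
    ordered = [values[name] for name, _ in schema["fields"]]
    return struct.pack(struct_format(schema), *ordered)


def unpack_record(schema_bytes: bytes, payload: bytes) -> dict:
    """Check the payload against the exchanged type, then decode it."""
    schema = json.loads(schema_bytes.decode("utf-8"))
    fmt = struct_format(schema)
    if struct.calcsize(fmt) != len(payload):
        raise ValueError("payload does not match the advertised type")
    values = struct.unpack(fmt, payload)
    return {name: value for (name, _), value in zip(schema["fields"], values)}


wire_schema = schema_message(SCHEMA)  # exchanged once
payload = pack_record(SCHEMA, {"x": 3, "y": -4, "weight": 0.5})
decoded = unpack_record(wire_schema, payload)
```

The size check is only a crude stand-in for real type-checking, but it shows the trade: the description costs bytes once, not on every serialization.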

still stupid (-1, Flamebait)

youngdev (1238812) | more than 6 years ago | (#24105925)

God this is SOOOOOO pretentious. Fine I am gonna build a machine that searches through my DVD collection and retrieves the disc for the user. You will still have to put the dvd in the player yourself and it won't work with any collection but my own but praise me anyway.

By Neruos (0)

Anonymous Coward | more than 6 years ago | (#24105955)

It's all text. So... It's only as fast as its reader and writer.

How is this different.. (1)

Ztream (584474) | more than 6 years ago | (#24105999)

.. from things like YAML and JSON?

Re:How is this different.. (1)

Temporal (96070) | more than 6 years ago | (#24106133)

YAML and JSON are text-based formats intended for human readability. Protocol Buffers are binary, and therefore smaller and faster, but not human-readable.

Also, the protocol buffer compiler provides friendly data access objects. You could actually use these with JSON or YAML, by just writing a new encoder and decoder (which is easy to do).
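
To make the size point concrete, here is a small Python comparison of the same record as JSON text and as a fixed binary layout. The record and the `struct` layout are arbitrary examples chosen for this sketch, not the Protocol Buffers encoding:

```python
import json
import struct

record = {"id": 12345, "score": 98.5, "active": True}

# Text form: field names and punctuation travel with every record.
text_bytes = json.dumps(record).encode("utf-8")

# Binary form: "<Id?" is little-endian with no padding:
# a 4-byte unsigned int, an 8-byte double, and a 1-byte bool.
binary_bytes = struct.pack(
    "<Id?", record["id"], record["score"], record["active"]
)

# Both ends must already agree on the layout to read it back.
decoded = struct.unpack("<Id?", binary_bytes)

print(len(text_bytes), len(binary_bytes))
```

The binary form is smaller because both ends agree on the layout in advance, which is also why you can't read it by eye.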

Re:How is this different.. (1)

zolf13 (941799) | more than 6 years ago | (#24106215)

Two new great features making it useable!
1) binary
2) not "human friendly"

I have an XML alternative format too. (3, Funny)

IGnatius T Foobar (4328) | more than 6 years ago | (#24106169)

I have my own data format that is an alternative to XML as well. It works by normalizing the data into records which all contain the same number of fields, and placing an agreed-upon delimiter between each field. The end of the record is indicated by a newline.

I think this "delimited" format has a lot of potential.

C# (1)

1000101 (584896) | more than 6 years ago | (#24106181)

"then compile them to produce classes to represent those structures in the language of your choice"

That's not entirely true, but I digress. Anyway, can someone shed some light on how this is different than binary serialization I've been using to pass C# objects around for quite some time now? It is just a matter of giving the class a Serializable attribute and then using the BinaryFormatter class to serialize the object to a stream. XML serialization is available if needed to pass to non-M$ entities, but binary serialization has been around a while, no?