Frequent Slashdot contributor Bennett Haselton writes "From about 1996 to 2003, there were regular reports listing examples of sites stupidly blocked by blocking software. The genre has tapered off recently, probably as a result of the Supreme Court ruling in 2003 that the Children's Internet Protection Act (CIPA) was constitutional, requiring blocking software in schools and libraries that receive federal funds, despite all the evidence of over-blocking presented at the trial. The last high-profile story about a site blocked by blocking software was about the blocking of BoingBoing almost a year ago. But the lack of recent reports on blocking software errors doesn't mean that the software has gotten better." The rest of his essay follows.
One product that generated several reports over the years was "Bess, the Internet Retriever" from N2H2, which has since been bought out by Secure Computing, which also makes a blocking program called SmartFilter (the one that blocked BoingBoing) and now sells "SmartFilter, Bess Edition" which uses the same database as Bess. Different organizations and individuals published a series of investigative reports about Bess from 1997 until 2002, listing sites about gay rights, eating disorders, and other subjects that were blocked as "pornography". In Ben Edelman's supplemental report, submitted as testimony in the CIPA trial, he listed examples of erroneously blocked sites that he had reported to N2H2 in his first expert report, and which were still being blocked five months later.
Since Bess represents a set of data points showing how the accuracy of a blocking program can change, or not change, over the years, recently I began testing it again. I didn't know whether to expect it to be better or worse. On the one hand, advances in technology and greater revenue to censorware companies could have caused the software to improve. On the other hand, the number of Web pages, and the rate at which dynamic sites like blogs change content every day, has exploded. The result? I'm still tabulating data, but it looks as if the accuracy rate is roughly the same as it was in 2000, when about 30% of blocked sites were obvious errors. Then and now, I found most of the errors by starting with a large list of URLs culled from search engines and other sources, and simply running them through the software to see what was blocked.
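For readers who want to see the arithmetic, the tabulation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Result` record, the `tabulate` function, and the sample URLs are all made up for the example, and nothing here queries Bess or any real lookup form. It assumes that whether each URL was blocked, and whether a block was an "obvious error", has already been recorded by hand.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One tested URL (hypothetical record layout, not from the original study)."""
    url: str
    blocked: bool        # did the filter block it?
    category: str        # category the filter reported, "" if not blocked
    obvious_error: bool  # human verdict: was the block clearly wrong?


def tabulate(results):
    """Count blocked sites and the share of those blocks judged obvious errors."""
    blocked = [r for r in results if r.blocked]
    errors = [r for r in blocked if r.obvious_error]
    # The error rate is measured against blocked sites, not all tested sites.
    rate = len(errors) / len(blocked) if blocked else 0.0
    return {
        "tested": len(results),
        "blocked": len(blocked),
        "errors": len(errors),
        "error_rate": rate,
    }


# Placeholder data for illustration only; these are not real test results.
sample = [
    Result("http://garden.example.com/", True, "Pornography", True),
    Result("http://adult.example.com/", True, "Pornography", False),
    Result("http://news.example.com/", False, "", False),
    Result("http://boats.example.com/", False, "", False),
]

summary = tabulate(sample)
print(summary)
```

Note that the denominator is the number of blocked sites rather than the number of sites tested, which matches how the 2000 figure is phrased: a share of blocked sites that were obvious errors.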
Here is a partial list of some of the questionable categorizations made by Bess; as of this writing, all of the following sites are listed as "Pornography" when you look them up on Secure Computing's Bess lookup form. (This is not just a fluke of the lookup tool; I tested against a copy of the software and confirmed that all of these sites really were blocked.) The "screen cap" link next to each site links to a snapshot of the results taken from the lookup form (you can check on http://database.n2h2.com/ to see if the page is still returning the same results, although the more obvious errors will probably be fixed after this article is published):
- The Electronic Frontier Foundation, Austin chapter (screen cap)
- Cretans of Houston (screen cap). That's Cretans, as in "people from the island of Crete". Not to be confused with the Cretins of Houston, located here.
- The Rhode Island Coalition Against Domestic Violence (screen cap)
- The website of the public art galleries of British Columbia, Canada (screen cap)
- Rail2000, now the Bay Rail Alliance, a consumer group lobbying for a San Francisco regional rail system (screen cap)
- Rainbow Service Organization, a gay rights advocacy group (screen cap)
- GardenMentors.com, a custom gardening services company in Seattle (screen cap)
- A web site for Catalina 380 series boats (screen cap)
- Open Source ERP, a site promoting open source software for enterprise resource planning and customer relationship management (screen cap)
- The Bryn Mawr Mainliners, a barbershop harmony group (screen cap)
- Timber Trails, an outdoor recreation site (screen cap)
- The MEFTA Institute: "Middle East Free Trade Areas for Business Peace" -- world peace through cheap oil! (screen cap)
- Topple Rummy, a (somewhat out-of-date) site calling for the ouster of Donald Rumsfeld (screen cap)
- The Alabama Network of Children's Advocacy Centers (screen cap)
- PSARA, a non-profit organization for training cruise travel agents (screen cap)
- Park Place Behavioral Health Care, a non-profit mental health care agency (screen cap)
- The Oklahoma chapter of the American Institute of Building Design (screen cap)
- The Boys & Girls Clubs of Metropolitan Phoenix (screen cap)
- CEMTACH -- Computational ElectroMagnetics Theory-Algorithm-Code-Hardware. "Our goal is to develop systems simulations capabilities based on time-domain computational electromagnetics methods." Thanks for clearing that up. (screen cap)
- Fund for Humanity, a San Francisco non-profit supporting environmental organizations and organizations that assist the poor. (screen cap)
A long-standing point of contention while earlier reports about Bess were coming out was whether every site on their blacklist had been reviewed by a human before being blocked. In 1998 the CEO testified before Congress that "All sites that are blocked are reviewed by N2H2 staff before being added to the block lists." However, in their 2002 annual report the company finally admitted that not all sites were reviewed before being blocked: "Through automated categorization or human review, Web sites are identified as fitting into one or more of our categories". At one point an N2H2 employee also told me that when one site is blocked, they will often block all sites hosted on that machine or at that IP -- which of course means that those sites are also not reviewed before being blocked. In any case, it's possible to access some of these sites by IP address, such as the BC Art Galleries site via this link, or the Rhode Island Coalition Against Domestic Violence via this link -- so if they're not sharing their IP with other sites, that wouldn't explain how they got blocked either. SmartFilter spokesperson Tomo Foote-Lennox said that one other blocked URL that I found, http://www.arbiol.org/, was the result of an experiment N2H2 once did with fully automated website ratings.
Foote-Lennox added, "In general, we find that schools are VERY sensitive to under-blocking. They would rather block a whole lot of useful reference sites to avoid exposing one porn site." Probably true, although keep in mind we're talking about liability issues, not actual moral outrage. (If they were really morally outraged, they'd be trying to keep kids away from uncensored Internet access everywhere, not just in school! That is in fact the approach that schools take with things like drugs, which do inspire moral outrage because they really are harmful.) Perhaps what is needed is a law explicitly shielding schools from all liability for what students do or see on the Internet at school, if the faculty had no knowledge of it.
(Obligatory interstitial advertisement for common sense: I still don't see what the big deal is about porn anyway. Ask yourself: Why is it harmful to see a picture of a naked person, or even a picture of people having sex? And try to find an answer to that question that doesn't involve, "Lots of other people think so." That includes all variations like "Our society has determined...", "We as a people have decided...", which are just re-phrasings of "Lots of other people think so." I submit that if you disallow those variations of grownup-peer-pressure as an excuse, most people can't really come up with any reason at all.)
OK, flame-retardant suit off, lab coat back on. Previous reports have listed absurd examples of sites blocked by Bess, and looking at any one of those examples or the ones listed here, I'd say that in terms of public policy discussions -- specifically, whether a blocking software company should be trusted to decide what students can look at -- any one of these blocked sites would be more significant than, say, the blocking of BoingBoing which got so much attention. BoingBoing got blocked because of a non-sexual picture of a bare breast on the cover of one of the books they reviewed -- and in fact they were blocked only in the "nudity" category, which includes only "non-pornographic images of the bare human body". So the block on BoingBoing really only revealed that Secure Computing was a bit heavy-handed. (The real problem is that SmartFilter has the category for non-pornographic nudity blocked by default, even though the CIPA filtering law certainly doesn't require schools to block non-pornographic artistic images!) On the other hand, the fact that EFF Austin and the Rhode Island Coalition Against Domestic Violence are currently blocked as "Pornography" suggests that in many instances the blocking companies have nobody at the controls at all. To focus on stupid-but-not-completely-insane blocks like BoingBoing is letting them off easy.
So why did the laundry lists of blocked sites released over the years never become as widely known as BoingBoing, or the guffaw-inducing examples like "Beaver College", which had to change their name in part because of students reportedly being blocked from accessing their website? I think it's because the news favors a good "punch line" -- a fact that anybody can understand that makes us feel smarter than the computers making these dumb mistakes. "Oh, I get it, it was blocked because it was called Beaver College!" But the "punch line" anecdotes are precisely the ones that let the blocking companies off lightly, because they hand the companies a plausible-sounding excuse for making an error. On the other hand, when the Rhode Island Coalition Against Domestic Violence gets blocked as "Pornography", that could probably force the blocking company to answer some tough questions if it got more press, but there's no good punch line there, so the story just fizzles.
So, while I'm looking through the rest of the data, let me try and come up with some punch lines for reporters to make these blocked sites newsworthy. OK: Why was GardenMentors.com blocked? To keep kids away from all the dirty bitches and hoes! Get it? Ha ha! Why was the Catalina 380 yachting site blocked from kids? Because teens are too vulnerable to pier pressure! Hey, where are you going?