Eisenstadt's Analysis Of 8 Years' Worth Of Email

timothy posted more than 9 years ago | from the bodyparts-all-fully-expanded dept.

Communications 230

Hylton writes "Thought this might be of interest: Marc Eisenstadt's saved every email he's gotten over the past eight years, including spam, and run an analysis of it."

230 comments

My Stats (-1, Redundant)

iclod (831412) | more than 9 years ago | (#11684551)

I haven't had 1 email in 8 years! You insensitive clod!

Here's my stats:

1997 0 0% 0%
1998 0 0% 0%
1999 0 0% 0%
2000 0 0% 0%
2001 0 0% 0%
2002 0 0% 0%
2003 0 0% 0%
2004 0 0% 0%

Margin of error: 0

Executive Summary (0, Redundant)

mishmash (585101) | more than 9 years ago | (#11684707)

Year Emails %In Error %Spam
1997 4320 20% 2%
1998 3996 20% 3%
1999 6821 10% 5%
2000 7580 5% 6%
2001 6125 5% 7%
2002 6497 5% 10%
2003 13092 1% 37.6%
2004 13889 1% 40%
He now does more tasks by e-mail than he used to, so e-mail takes up more of his time - 2.5 hours per day. Quite where the interest is here I don't know.

Re:Executive Summary (1)

Dorothy 86 (677356) | more than 9 years ago | (#11685161)

has Netcraft Confirmed that? I won't believe it until then ;)

Have you Meta Moderated recently? (-1, Troll)

Offtopica (413375) | more than 9 years ago | (#11684747)

What the fuck is up with Slashdot? I meta-moderated this morning. I meta-moderated again at work (logged in as the same account). When I get home: Have you Meta Moderated recently? Regular Meta Moderators are more likely to get mod points.

Why, yes, I have, and no thanks! This account has meta-moderated twice today already! Leave it alone!

Today, you have 10 moderations to meta-moderate.

LIES! Today, I have 30 moderations to meta-moderate!

"The system is broken" is too dramatic. The system just sucks.

xox,
Offtopica

Re:Have you Meta Moderated recently? (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#11684783)

You asslicker - I never meta moderate and get mod points handed out to me like candy. I will keep your userid handy and will mod you down at every opportunity, you whiny bitch. Hopefully I'll ruin your account by getting it so low that you can't even post with it anymore. Go smoke a cock.

Re:Have you Meta Moderated recently? (0, Flamebait)

Offtopica (413375) | more than 9 years ago | (#11684850)

HAW HAW!!! Joke's on you! I already post at -1!

And anonymous paedophiles always seem to reference the handing out of candy, don't they?

O

Apparently the analysis is still running (5, Funny)

Anonymous Coward | more than 9 years ago | (#11684558)

on their webserver.

Executive Summary (-1, Redundant)

mishmash (585101) | more than 9 years ago | (#11684738)

Year Emails %In Error %Spam
1997 4320 20% 2%
1998 3996 20% 3%
1999 6821 10% 5%
2000 7580 5% 6%
2001 6125 5% 7%
2002 6497 5% 10%
2003 13092 1% 37.6%
2004 13889 1% 40%
He now does more tasks by e-mail than he used to, so e-mail takes up more of his time - 2.5 hours per day. Quite where the interest is here I don't know.

Re:Apparently the analysis is still running (-1, Troll)

rhyno46 (654622) | more than 9 years ago | (#11685014)

HAHAHAHAHAHA! Holy Shit, I get it! Slashdot crashed their webserver because everyone's trying to hit the same server at one time. Genious! I just can't wait to get to Slashdot everyday and see the original thoughts, and jokes, that such a wonderful community provides!

Re:Apparently the analysis is still running (1)

mattspammail (828219) | more than 9 years ago | (#11685176)

I know! I know! Let's call it the "Effect of the members of the slashdot community"!

No, wait. Too long.

How 'bout "Really slow web server because of slashdot effect"?

In soviet russia (0)

bryan986 (833912) | more than 9 years ago | (#11684560)

Email analyzes you!

Spam (5, Funny)

thinkliberty (593776) | more than 9 years ago | (#11684566)

I have received more spam in the past week than I have legitimate email in the past 10 years.

Re:Spam (4, Funny)

PopeAlien (164869) | more than 9 years ago | (#11684624)

Ah! but did you save and analyze them?

Re:Spam (5, Funny)

BosstonesOwn (794949) | more than 9 years ago | (#11684763)

I have analyzed it all and apparently the people sending me these spam messages know my plight.

I need a bigger penis

I need teen sluts who suck **** on webcams

And apparently I shouldn't be telling anyone this but this nice man in nigeria , who is the lawyer in charge of my long lost grand father mutambi wikimbo is trying to get me $5 million american dollars but I have to pay a tax of $5 thousand american dollars to nigeria and he will gladly handle it for me.What a swell guy.

Re:Spam (1, Funny)

agraupe (769778) | more than 9 years ago | (#11684824)

You mean your Nigerian contact only has $5,000,000 to offer you??? What a waste of time, mine has at least 60,000,000 (six ZERO millions) DOLLARS Amrican, of which I get 30%. Best of all, there is no risk!

Re:Spam (-1, Troll)

Anonymous Coward | more than 9 years ago | (#11684911)

I like my women like I like my coffee: ground up and boiled.

I bet you think that's funny, don't you, you sick fuck?

Re:Spam (1, Funny)

mattyrobinson69 (751521) | more than 9 years ago | (#11685082)

mutambi wikimbo is a bastard. i sent him my $5000 and i haven't heard back from him.

hmm. maybe we share a grandfather

Re:Spam (-1, Offtopic)

laughingcoyote (762272) | more than 9 years ago | (#11685124)

This is not offtopic...too bad I'm out of mod points, I would've given you a boost for funny anyway.

Mods On Crack (M.O.C.) (3, Insightful)

Lehk228 (705449) | more than 9 years ago | (#11685165)

*Sacrifices karma to protest idiotic mods*

Re:Spam (0)

Anonymous Coward | more than 9 years ago | (#11684836)

That's because you have no friends. (rim shot please)

wow (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#11684568)

truly fascinating stuff here!

Already down! (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#11684569)

No posts and already slashdotted.....

Remembering when.. (5, Funny)

Bite-lover (826567) | more than 9 years ago | (#11684572)

Must be nice to be able to look back on porn-spam and feel old. 'Hot XXX - Newcomer Jenna!'

Long-term (1, Interesting)

Anonymous Coward | more than 9 years ago | (#11684573)

Wonder how this will affect bayesian technology in the future...

Analysis this (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#11684593)

Now he gets to analysis why his server failed.

Re:Analysis this (0)

Anonymous Coward | more than 9 years ago | (#11684639)

Now you get to read a dictionary!

!42 (3, Funny)

Anonymous Coward | more than 9 years ago | (#11684594)

Apparently the computer spent months compiling and cross referencing only to spit out this cryptic message: Host not found

Since I can't read it let me guess (0)

Anonymous Coward | more than 9 years ago | (#11684597)

More spam now than in the past?
About the same amount of legit email over time?

Slashdotted (4, Funny)

Anonymous Coward | more than 9 years ago | (#11684599)

If it's already slashdotted, he's probably saving all of his server logs as well.

Re:Slashdotted (0)

Anonymous Coward | more than 9 years ago | (#11684661)

It's because the analysis is performed in real-time, using PERL.

Re:Slashdotted (1)

skweegee (755245) | more than 9 years ago | (#11685084)

It's Perl not PERL.

hah (4, Funny)

usernotfound (831691) | more than 9 years ago | (#11684602)

my yahoo account i use to collect spam gets 1700 a month, while on my "real" email account i've received 1566 since august of 2003, only 10 of those being spam.

Re:hah (1)

mrjackson2000 (733829) | more than 9 years ago | (#11685074)

i get 5000/mo in my yahoo

Not very much (4, Insightful)

brunes69 (86786) | more than 9 years ago | (#11685077)

I rotate my email folders every 6-9 months to increase performance.

Even so, I have 2 folders with over 9000 Emails in them. My work Inbox alone has 1015. None of these are spam - I filter those out through a combination of SpamAssassin and manual filtering.

Anyways - my point is that the numbers in this article are small potatoes. He talks about 250 Emails in a week - I easily get 300-400 Emails **a day**, probably 40-50 of which are directly work related, the other 350 related to various other side projects of mine, so they are just as important.

I would say I read around 25-50% of my Emails. The rest I only give a cursory scan. His numbers for reply times are way off for a number of reasons:

- Hardly anyone replies to every email they receive. Most of it needs no reply.

- He basically says that the time spent reading the emails and responding is a waste. Well, what do you think managers did to communicate with you before email? You had faxes, daily memos, daily reports to file... it is just more streamlined now. It is not like this stuff is new.

Newsflash - work is difficult. People are distracting to your work. Shit happens. Deal with it, just like everyone else has for the past 150 years.

and on the 8th post... (0, Offtopic)

rd4tech (711615) | more than 9 years ago | (#11684616)

his (mail) server went down. case /.ed

Slashdotted (-1, Offtopic)

IdahoEv (195056) | more than 9 years ago | (#11684618)

I wonder what the statistics on his server show. Slashdotted within three minutes of the story posting, and only three comments so far. Must be a new record.

Indeed (4, Insightful)

mboverload (657893) | more than 9 years ago | (#11684619)

I used to NEVER get spam, I didn't know what people were complaining about. I was on a mail server that no spammer really knew about, so there were no dictionary mailings. However, once I posted that particular email on just a few websites I have been getting ~50 spam a day. There is no way they got my email through me signing up for things because I use a separate address. I'm just glad the kind of practices they use (trawling the internet for emails) are illegal, although that doesn't mean much.

I will never buy anything from spam, and whoever does has got to be a complete moron.

[anti-slash] celebrate the firing of Michael (-1, Troll)

jihadi_celebration (859737) | more than 9 years ago | (#11684644)

For public distribution (13 Feb 2005):

Anti-slash [anti-slash.org] confirms: Slashdot's editors are dying.

In another devastating blow to slashdot's editors, it was learned that Michael was fired as an editor on slashdot. We at anti-slash are proud to have brought about this partial victory by our unrelenting jihad of bringing their injustices to light.

The specifics of this case are documented in this post: http://slashdot.org/comments.pl?sid=138099&cid=11570041 [slashdot.org]
Quote:
I got this from a verified source who's in the know:

Long story short, michael was canned for his abusive and egotistical personality.

Rob's been building a list of complaints by users about michael's abusive patterns but he never acted on it. Well, michael managed to bitchslap one of Rob's old college buddies' accounts along with a couple of paid accounts, word eventually filtered down to Rob, and he had kittens. He convinced michael's OSTG manager to track him down and drag him into a conference call.

Rob laid down the law and started reading off complaints and michael raised his voice, saying that if Rob had a personal problem with him that he didn't need to go over his head and involve his manager in it.

During the shouting match, michael's editor flag was revoked. He was in the admin area at the time and he noticed.

At this point he went totally ballistic and started screaming about how this was why he moved, to get away from "arrogant elitist bullshit". (this is a direct quote.. michael actually did move from New York to Canada to protest George W. Bush's inauguration in 2001. Andover kept him on since it was only an all-remote job anyway.)

michael's manager ducked out of the call to page (read: wake up) Hemos (overseas on business) to three-way him into the call, to try and calm everyone down.

There was some more shouting, and michael's manager told him that things aren't working out well, and that he's going to recommend that his employment be terminated.

michael just hung up, and that was the end of the call as well as michael's employment with OSTG.

This can be confirmed by visiting http://slashdot.org/authors.pl [slashdot.org] . Michael is not listed.

Fact: Slashdot's editors are dying

Re:[anti-slash] celebrate the firing of Michael (-1, Offtopic)

iced_773 (857608) | more than 9 years ago | (#11684811)

Trolls posting irrelevant comments on Slashdot?

In a way, that's spam. It wastes people's time and there is a filter for it.

Re:[anti-slash] celebrate the firing of Michael (0, Offtopic)

grolschie (610666) | more than 9 years ago | (#11684888)

He was here 2 weeks [slashdot.org] ago. Could he just be on holiday?

Re:Indeed (-1, Troll)

Anonymous Coward | more than 9 years ago | (#11684659)

Yeah, I took a big shit today. Palaces have been built with softer bricks and stood for thousands of years. My fecal production for today was truly awesome.

Furthermore, before I dropped my load, I weighed myself. Post-defecation, I was down 5 pounds. That's right - I dropped a 5 lb. baby in the pool. For fuck's sake, I rule.

By the way, you're an asshole. Go fuck yourself and your stupid email address fagmaster.

sincerely,
Terry Bradshaw

Re:Indeed (5, Informative)

iced_773 (857608) | more than 9 years ago | (#11684662)

I should point out that you shouldn't respond to spam under ANY circumstances - it just verifies to the spammer that your address exists.

Re:Indeed (3, Informative)

Anonymous Coward | more than 9 years ago | (#11685118)

And also set your email client not to load images, or anything remote for that matter, off the net. They can just add a image.jpg?id=123456 and know that the email address in their db with the id of 123456 read their spam message.
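The tracking trick described above can be illustrated with a minimal sketch in Python. It scans an HTML message body for images that would be fetched from a remote server when the mail is opened. The class name `RemoteImageFinder`, the helper `find_remote_images`, and the `spammer.example` host are all hypothetical, made up for illustration; a real mail client does considerably more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RemoteImageFinder(HTMLParser):
    """Collects the src of every <img> that would be fetched over the network."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Only http(s) URLs cause a request to someone else's server;
        # cid: and data: images travel inside the message itself.
        if urlparse(src).scheme in ("http", "https"):
            self.remote_sources.append(src)


def find_remote_images(html_body):
    """Return the remote image URLs found in an HTML email body."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    return finder.remote_sources


# Hypothetical message: one tracked remote image, one embedded image.
sample = (
    "<p>Hello</p>"
    '<img src="http://spammer.example/image.jpg?id=123456">'
    '<img src="cid:logo@local">'
)
print(find_remote_images(sample))
```

Any URL this reports is one the sender's server would see a request for, query string included, which is why blocking remote content by default is the safer setting.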

Re:Indeed (1)

mwkaufman (859791) | more than 9 years ago | (#11684671)

Yeah, I went for years getting a spam for about every 9-10 real e-mails, but then I just happened to try to use an ISP (when I got a cable modem) e-mail account and I suddenly had a need for spam filters that I had never used before. Kinda wish I could see the article...

Re:Indeed (2, Insightful)

autopr0n (534291) | more than 9 years ago | (#11684816)

Trawling for email addresses isn't illegal at all, certainly not in the USA which actually legalized spam with the CAN-SPAM act.

The domains I use for email aren't even up right now, and I'm using gmail these days anyway. I had been using 'throwaway' emails for everything, and then a spammer started joe-jobbing me. Meaning that they started using fake addresses @mydomain. So I was getting tons and tons of bounce messages. It was awful.

These spammers are horrible people, but they're not even close to stupid. They're obviously making money off of it, or they would have stopped doing it a long time ago.

Re:Indeed (1)

Sheepdot (211478) | more than 9 years ago | (#11684828)

I'm just glad the kind of practices they use (trawling the internet for emails) are illegal, although that doesn't mean much.

I know of no country that has stated it is illegal to obtain email addresses this way. Sending emails to the addresses after you've collected them this way may be illegal in some countries like the US, but collecting them is certainly not.

Re:Indeed (1)

Tony Hoyle (11698) | more than 9 years ago | (#11685015)

In theory it could be argued in the UK under the computer misuse act. You'd have to have some kind of click-through I expect - a straight website would be treated as public. However if you've made an effort to stop them they're illegally using resources they're not entitled to (ie. your computer, your bandwidth) so can be prosecuted.

Reading that back though you could apply that to spam itself, at a pinch... you'd need a good lawyer though.

Re:Indeed (2, Interesting)

A nonymous Coward (7548) | more than 9 years ago | (#11684886)

Try your own domain name on a dialup connection :-) My own account gets around 200 spams a day. It annoys me, but doesn't take long to delete, since so much of it is 5 or ten copies of the same thing, which sticks out like the proverbial sore thumb when viewed once or twice a day.

But starting last summer, maybe 9 months ago, some spammers realized they had an untapped (fools') gold mine to plunder, and my simple little home domain has been receiving more and more spam to accounts that don't exist, like bill123 and so on. My poor little dialup domain has been receiving around 50-60,000 spams a day to those bogus accounts. It hit 120,000 one day.

It's easy enough to deal with since it is known to be spam by definition of going to bogus accounts. I never see it unless I am curious. I collect stats daily on how many unique account names were used, around 3000. It just amazes me that those bozos would send so much pure crap with no hope of ever getting a response.

I sometimes buy from Spam (0)

Anonymous Coward | more than 9 years ago | (#11684946)

I feel that if I buy some things from them, then they will no longer feel like they are selling an inferior product. They will become less insecure and stop sending me email.

I have been trying this for a while; I think they must feel pretty discouraged all the time that no one replies to them or buys their products. Maybe if we all chipped in, had a "Buy something from Spam letters" day, it would solve the problem once and for all, leaving everyone happy.

Am I right or am I right?

Link seems to be down... (1)

Teja (826685) | more than 9 years ago | (#11684621)

And mirrordot didn't pick it up. Here is a google cache: http://64.233.167.104/search?q=cache:GshwWambHvEJ:www.corante.com/getreal/archives/2005/02/11/eight_years_of_email_stats_pass_1.php+eight+years+of+email+stats&hl=en

Re:Link seems to be down... (5, Informative)

Vario (120611) | more than 9 years ago | (#11684669)

This is the google cache linked with slashcode: http://64.233.183.104/search?q=cache:GshwWambHvEJ:www.corante.com/getreal/archives/2005/02/11/eight_years_of_email_stats_pass_1.php [64.233.183.104]

It still tries to access the original site, so it is rather slow but you can read the article.

Re:Link seems to be down... (0)

Anonymous Coward | more than 9 years ago | (#11684944)

A more direct link to the google cache... This'll actually load: http://64.233.183.104/search?q=cache:GshwWambHvEJ:www.corante.com/getreal/archives/2005/02/11/eight_years_of_email_stats_pass_1.php&hl=en&lr=&strip=1

Faster Link (1)

Anonymous Coward | more than 9 years ago | (#11685127)

Links to the text cache only [66.102.7.104] , so doesn't try to access the original site.

Hotmail (2, Interesting)

Dash'n'SlashDot (841636) | more than 9 years ago | (#11684626)

I have managed to maintain a hotmail account for 10 years, which I consider a feat. It isn't clean of spam or anything, but 10 filters based on keywords in the subject or body make a HUGE dent in the amount that reaches my inbox.

Re:Hotmail (1)

rincebrain (776480) | more than 9 years ago | (#11684822)

I just filter everything to Junk Mail that's not from a recognized sender or with a certain keyphrase in the subject, and suddenly I have a week to scoop it out before it dies.

Saves me time.

Slashdotted, article text: (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#11684628)

"The operation timed out while attempting to contact www.corante.com."

Fascinating.

Article Text (5, Informative)

Anonymous Coward | more than 9 years ago | (#11684630)

February 11, 2005
Eight years of email stats, pass 1
Posted by Marc Eisenstadt

What's the reality behind the 'email overload' talk? Let's look at some numbers... personal numbers.

To kick things off, I've got a huge email archive. I started emailing in the early ArpaNet days, around 1972, and haven't stopped since. My archive has been extremely thorough for at least the past 12 years (and, in case you think I'm nuts for keeping all of these, my actual regret from a scientific/archive perspective is that I don't have the earlier ones too!). Why? Let's just say that one day I planned to do an analysis of it all... types of mails, social networks, the whole works. But things got a little out of hand.... (anyone lookin' for some data, give me a shout... but first read on)...

Most of this 'storage mania' was triggered by a casual comment in around 1992 or 1993 by Ron Baecker, of the University of Toronto, a longtime research colleague and acquaintance and someone whose work I have long admired and respected. Ron asked me, "given ultra-cheap storage and ultra-fast search, both clearly on their way, why would you ever need either to delete or indeed to accurately file/categorize your emails?"

OK, so as a little personal experiment, I decided to keep 'em, and to see what happened. The quick story is that migrating across machines, operating systems, and preferred email clients, plus being a bit cavalier about the whole thing, has meant that although all the emails are 'there' in various archive files, it takes a little work to get 'em all back in a harmonious form, that is with all headers intact and no duplicates (the main formats are Vax mails, Unix mails, Mac Eudora, PC Eudora, Outlook Express, and Outlook).

The longer story, with some data and preliminary analysis, begins like this:

Even though I haven't had the time or motivation thus far to put in the harmonization work required to get all the data in one format and with duplicates eliminated, I nevertheless thought that a little 'first pass' set of totals (with my estimate of their accuracy) would be interesting, and maybe even provide a little coarse empirical support for Stowe's "Just Say No To Email" campaign.

So I quickly eyeballed-and-tallied the most coherent of the archives, spanning eight years of emails, from January 1st 1997 to December 31st 2004. The totals are real enough, but the 'eyeballing' was needed to assess the approximate proportion of spam and duplication involved in the emails. A more detailed analysis later will enable me to do these more accurately. I've indicated my estimate of the margin for error in the third column, and my estimate for the percentage of spam received (and I mean real spam: i.e. either 'greedily-lookin-for-suckers' or 'low-down-mean-and-nasty spam', not conference announcements - you know what I'm talkin' about). For 2003, this number is precise, because I filtered off such spam using SpamAssassin, and counted them! 2004 spam numbers are an extrapolation, but the totals are accurate, as explained below. Here goes:

TABLE 1: Eisenstadt's 1997-2004 email totals

Year  Emails received  Est. Error  Est. Spam
1997  4320             20%         2%
1998  3996             20%         3%
1999  6821             10%         5%
2000  7580             5%          6%
2001  6125             5%          7%
2002  6497             5%          10%
2003  13092            1%          37.6%
2004  13889            1%          40%

2003 is the most accurate, because (unlike earlier years when I was changing clients and machines) I have all emails in one clean format and all spam preserved, auto-filtered by SpamAssassin into a folder that I look at only a few times a year, scanning rapidly for false rejections. Incidentally, that falsely rejected email rate appears to be roughly 1 in 5000: good enough for me! By 2004, although I kept all emails, I got fed up keeping the spam even for analysis purposes, and can't even be bothered to scan it, so stuff auto-filtered by SpamAssassin is now deleted without my looking at it - so the column 4 '40% spam' in the lower right hand corner is a well-educated approximation based on my observation of the ebb and flow of the size of my 'deleted' folder.

It's interesting that before 2003, I found that I didn't really need SpamAssassin - the numbers were annoying, but manageable, as the fourth column estimates show. As we go back in time, I have less patience with the process of harmonizing the data, as I mentioned above, hence the '20% error' estimate... in other words I believe, subjectively of course, that the totals for 1997 and 1998 could be off by roughly 20% either way. That's the price I pay for doing a quick-and-dirty analysis right now. On the other hand, even with such an analysis, I find the totals illuminating.

What does it all mean?

The totals in Table 1 tell me that the subjective 'quantum leap in spam' in 2002/3 that led me to install SpamAssassin as a full-time companion is certainly corroborated by the numbers. There's simply no other way to cope with the large volume of junk. But now (auto)strip away that nasty spam, and we're still looking at some scary numbers. Let's call the emails that are left over, after stripping away the nasty spam, "OK emails" (let's face it, they are never going to be "GOOD emails", right?). What we see then is an increase from 5-6K annual "OK emails" in the late nineties (15-ish daily) to 8-9K annual "OK emails" today (25-ish daily). A bright note in all this is that the numbers for 2004 are surprisingly steady compared with 2003, i.e. there's no exponential growth, even though things are clearly getting 'intense'.
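For anyone who wants to redo the "OK emails" arithmetic, here is a rough sketch in Python. The totals and spam percentages are copied from Table 1; the function name `ok_emails` and the flat 365-day year are assumptions made for illustration, and the spam percentages before 2003 are the article's own eyeball estimates, so the output is only as good as those.

```python
# year -> (emails received, estimated spam percentage), copied from Table 1
TABLE_1 = {
    1997: (4320, 2.0),
    1998: (3996, 3.0),
    1999: (6821, 5.0),
    2000: (7580, 6.0),
    2001: (6125, 7.0),
    2002: (6497, 10.0),
    2003: (13092, 37.6),
    2004: (13889, 40.0),
}


def ok_emails(total, spam_percent):
    """Emails left over after removing the estimated spam share."""
    return total * (1 - spam_percent / 100)


for year, (total, spam_percent) in TABLE_1.items():
    ok = ok_emails(total, spam_percent)
    # Assumes a 365-day year; the article's daily figures are rounded too.
    print(f"{year}: {ok:6.0f} OK emails, roughly {ok / 365:.0f} per day")
```

The 2003 and 2004 rows come out a little over 8,000 each, which is consistent with the "8-9K" range and with the observation that 2004 is steady compared with 2003.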

25 emails daily (and there are many I know who have WAY more than this) is a lot to deal with, especially since the emails don't cluster evenly throughout the week. To get to a 25-per-day average, you're looking at more like 30-40 per working weekday, if you're the kind of person who switches off at the weekend (ha!). If each email requires 3 minutes of thinking/response time (you're lucky if you can average that), then you've got a guaranteed two hours straight down the tubes every day.
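The step from a 25-per-day average to a per-workday load can be sketched as below. This is a back-of-the-envelope model, not the article's own calculation: it assumes a full week of mail is handled on five workdays, and the function name `weekday_load` is made up for illustration.

```python
def weekday_load(daily_average, minutes_per_email, workdays=5):
    """Spread a 7-day email average over the workdays only.

    Returns (emails per workday, hours per workday).
    """
    weekly_total = daily_average * 7
    per_workday = weekly_total / workdays
    hours = per_workday * minutes_per_email / 60
    return per_workday, hours


per_workday, hours = weekday_load(25, 3)
print(f"{per_workday:.0f} emails per workday, {hours:.2f} hours")
```

Under these assumptions the result sits in the middle of the 30-40 range, and the time spent lands a bit under two hours; at 40 emails per workday it would be exactly two.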

But wait a minute, "down the tubes" is incorrect: surely your emails involve key interactions, networking, brainstorming, appropriate drudgery and admin, in short what you get paid to do, right? Well, that's not clear... and requires drilling down a bit deeper into the data.

Digging deeper: a work-week in depth

Table 2 shows a coarse categorization of all 286 emails I received during a Monday-Friday working week in January 2005 (10th-14th to be precise). I break them down into four groups, labelled simply A, B, C, D in the left hand column for ease of reference, along with the specific category label in the middle, and the total number of emails in each category shown in the third column. I also checked every email to see whether it involved some mundane scheduling/timetabling query/response (e.g. "Can you meet with Jones on 13th Feb at 10AM?"), on the hunch that such emails arrived a little too often for my liking. The fourth column shows the number of the emails in column 3 that involved such scheduling interactions (e.g. for row D, KMi Management, of the 68 emails received in that category, fully 32 of them were scheduling-related).

TABLE 2: Main categories for 5 workdays of email in January 2005

Group  Summary                                      Number  Num of those involving 'scheduling'?
A      Projects, papers, info requests              71      6
B      Blog and site comments and maintenance       73      3
C      Announcements, news, social, family          74      2
D      KMi Management, Gigs, Visitors, Invitations  68      32
TOTAL                                               286     43

The four main categories A, B, C, D of Table 2 are further subdivided in Table 3, this time preserving the A, B, C, D labels in the left-hand column for cross-referencing with Table 2, but breaking them down into finer categories as shown in the second column (in reality I did this breakdown first, and only later chunked them together to create Table 2, but thought it was easier to present this way).

TABLE 3: Further subdivisions of Table 2

Group  Category                                                         Number
A      Funding bids, new project work requests                          17
A      Alerts, requests, lab messages                                   21
A      Main project work, paper writing                                 33
B      Blog commentary and queries                                      40
B      Issues related to 'popular KMi tools' (BuddySpace, HitMaps etc)  33
C      Conference and seminar announcements                             16
C      Semi-junk, news, domain renewals, etc                            14
C      Family / social                                                  30
C      From self and meta (system email bounces etc)                    14
D      KMi Management                                                   44
D      Visitors, gig arrangements, etc                                  24
TOTALS                                                                  286

Now what?

So there you have my finer-grained interactions 'laid bare'. Allowing ZERO minutes of response time for some finer-grained categories (e.g. semi-junk, self/meta, which don't require reading at all) and ONE-THREE minutes of response times for most categories, plus, say, TEN minutes of response time for an important research category such as 'main project work, paper writing', it is trivially easy to get to 2.5 hours per workday assuming a fairly ruthless, 'one-touch', knee-jerk email interaction regime. And worse if you deviate from the regime.
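The 2.5-hour figure can be roughly reproduced from Table 3 with a small Python sketch. The category counts are copied from the table; which categories get zero and ten minutes follows the examples given in the text, while applying one single "default" figure to every other category is a simplifying assumption of this sketch, as is the function name `hours_per_workday`.

```python
# (category, emails received during the sampled week), copied from Table 3
TABLE_3 = [
    ("Funding bids, new project work requests", 17),
    ("Alerts, requests, lab messages", 21),
    ("Main project work, paper writing", 33),
    ("Blog commentary and queries", 40),
    ("Issues related to 'popular KMi tools'", 33),
    ("Conference and seminar announcements", 16),
    ("Semi-junk, news, domain renewals, etc", 14),
    ("Family / social", 30),
    ("From self and meta (system email bounces etc)", 14),
    ("KMi Management", 44),
    ("Visitors, gig arrangements, etc", 24),
]

# Per the text: semi-junk and self/meta need no reading at all,
# while main project work gets ten minutes per email.
ZERO_MINUTE = {
    "Semi-junk, news, domain renewals, etc",
    "From self and meta (system email bounces etc)",
}
TEN_MINUTE = {"Main project work, paper writing"}


def hours_per_workday(default_minutes, workdays=5):
    """Total response time per workday, in hours.

    default_minutes is applied to every category that is neither
    zero-minute nor ten-minute (an assumption of this sketch).
    """
    total_minutes = 0
    for category, count in TABLE_3:
        if category in ZERO_MINUTE:
            minutes = 0
        elif category in TEN_MINUTE:
            minutes = 10
        else:
            minutes = default_minutes
        total_minutes += count * minutes
    return total_minutes / workdays / 60


for default in (1, 2, 3):
    print(f"{default} min default: {hours_per_workday(default):.2f} hours per workday")
```

With the default set to the middle of the one-to-three-minute range, the result lands close to the 2.5 hours quoted; the one-minute and three-minute runs bracket it on either side.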

Then there are other sources of workflow: blogs, aggregator summaries, phone calls (rare, but I still allow one or two), cell-phone, text message, instant messaging (my buddy list is very large, and most of them are work-related).

All of this paints a very very bad picture. Sure, if you're "in the business" like we are, then that's the price you pay. But the pace is quickening (I've just tallied what we already knew intuitively), and I have little faith or trust right now in intelligent agents being able to solve my overload problems. Just consider the proportion of emails listed above that are scheduling-related! 43 out of 286, that's 15%! We already have a tool, Meetomatic, that would handle at least half of those, but of course not everyone uses it. And the other half of that subset tend to require awkward interactions and judgement calls that no delegated agent, human or artificial, can actually cope with.

We're entering an era in which something that Stowe has often written about is going to become an essential skill: "continuous partial attention." I thought I was pretty good at it, but I am slowly-but-surely observing everyone around me slipping into a kind of cognitive quicksand, getting increasingly grumpy and stressed out, and I don't like it.

As I was putting together this entry, I noticed that The New York Times has an article this week on email-overload and related attentional problems (free subscription required). The research described there is interesting, but falls into the trap I refer to above, of requiring agents that I personally would not trust to handle my attentional needs. Stanford University's Donald Knuth opted out of email many years ago - what a visionary!

Raymond Chen's Analysis... (5, Interesting)

ticklish2day (575989) | more than 9 years ago | (#11684734)

Microsoftie Chen's analysis [asp.net] , slashdotted [slashdot.org] a while ago, has pictures too!

Re:Raymond Chen's Analysis... (1)

Deag (250823) | more than 9 years ago | (#11685085)

I wonder what virus caused the red dot cluster in late 2003 around 100KB? Anyone remember anything that sticks out?

Re:Article Text (In Summary) (1)

StikyPad (445176) | more than 9 years ago | (#11684840)

In summary, I don't utilize e-mail much to begin with, I didn't maintain the archives I had very well, and all my figures are speculative.

Re: information overload? (0)

Anonymous Coward | more than 9 years ago | (#11684956)

Allowing ZERO minutes of response time for some finer-grained categories (e.g. semi-junk, self/meta, which don't require reading at all) and ONE-THREE minutes of response times for most categories, plus, say, TEN minutes of response time for an important research category such as 'main project work, paper writing', ...

I think his estimates are fairly overzealous. I get probably around the same number of emails per day that he does. Most of them I can digest at a glance and delete. A quick response is easily under 30 seconds (often 10 or less if I'm grouchy). Rarely do I get one that requires significant research or effort.

In fact, I think I just spent more time writing this post than I spent replying to all my emails so far today :)

Einstein? (5, Funny)

kristopher (723047) | more than 9 years ago | (#11684634)

Don't misread like I did. I was like, what the hell was Einstein doing with email..

Re:Einstein? (1)

jacen_sunstrider (797955) | more than 9 years ago | (#11684891)

Filtering it of course! You think a genius wouldn't be running some sort of intelligent spam filter? Jeeze, don't knock the guy! He's dead for Bob's sake!

Re:Einstein? (1)

Stormwatch (703920) | more than 9 years ago | (#11685160)

I also misread... for a second I thought it had something to do with Wolfenstein 3D's "Operation Eisenfaust".

8 Years without a change of address... (0)

Anonymous Coward | more than 9 years ago | (#11684643)

Wow, that's quite a feat.
I can't think of how many times my primary email has changed.

every month on lug radio (4, Insightful)

QuantumG (50515) | more than 9 years ago | (#11684646)

One of the guys reminds us that people who send those "increase your penis size" emails and other spam don't just do it because they think it is fun to piss off the world, they do it because they make lots and lots of money from it.

That's what anti-spam laws should be targeting, the morons who use the services offered by spammers.

Re:every month on lug radio (0)

Anonymous Coward | more than 9 years ago | (#11684901)

That is the stupidest idea ever.

Re:every month on lug radio (2, Funny)

QuantumG (50515) | more than 9 years ago | (#11684918)

obviously I disagree. If you're willing to fund people who piss off millions, just so your penis can be larger, you should spend some time in jail.

Google cached raw text (0)

Anonymous Coward | more than 9 years ago | (#11684651)

The raw text from Google's cache is here [66.102.7.104] . Scroll down a bit to read the stuff that's being talked about.

And have a snooze. You deserve it, you've been working hard.

Whats the point? (1)

[cx] (181186) | more than 9 years ago | (#11684663)

What could this analysis possibly prove other than yes, you get more spam than real mail, and your real mail is dull and uninteresting, even more so after 8 years?

Result of Analysis: Marc Eisenstadt's mail is as worthless now as it was 8 years ago!

[cx]

Daily Dilbert (0, Offtopic)

bird603568 (808629) | more than 9 years ago | (#11684664)

I can't believe I didn't see Dogbert's bill that turned fat into Rolex watches

Analysis... (1)

Infinityis (807294) | more than 9 years ago | (#11684666)

So I've got a question for analysis (although it seems the server could use a liquid nitrogen bath right now)...

If all the spam-based penis growth pill claims were stacked end to end, how many times would it circle the world, and would it be worth the money to have a member that large?

Re:Analysis... (0)

Anonymous Coward | more than 9 years ago | (#11685046)

Yes, it would be worth it, for society's sake. Add Viagra and you'd have an instant space elevator!

Cached URL and stats extract (0, Redundant)

spamfo (803637) | more than 9 years ago | (#11684677)

Site appears to have been slashdotted already
Google cache here [66.102.9.104]

The stats

Year Emails received Est. Error Est. Spam
1997 4320 20% 2%
1998 3996 20% 3%
1999 6821 10% 5%
2000 7580 5% 6%
2001 6125 5% 7%
2002 6497 5% 10%
2003 13092 1% 37.6%
2004 13889 1% 40%

GMail (5, Interesting)

GNUALMAFUERTE (697061) | more than 9 years ago | (#11684699)

This is pretty interesting (sadly I can't access TFA).
Google should have such a program: there should be a preference in your GMail account where you can allow/deny Google to take stats out of your email. A lot of interesting information could be collected, for example the amount of spam vs. legitimate e-mail, % of each kind of spam (viagra, drugs, porn, etc.), spam by country, % of text vs. HTML email, and even other interesting stats not e-mail related, such as language analysis, frequent misspellings, topics of interest by age, etc. I would gladly allow Google to make such stats; it can be done in such a way that no personal/sensitive information would be leaked.

(Thinks about what he has just said, and puts tinfoil hat on)

ALMAFUERTE

Re:GMail (1)

Anonymous Coward | more than 9 years ago | (#11685110)

yeah, I'm sure Google don't do any of that. It'd be insane if they did.... wouldnt it.

Text (-1, Redundant)

Sheepdot (211478) | more than 9 years ago | (#11684718)

Copy/Paste from Google Cache [64.233.161.104] (Scroll to the bottom-third of the page)

Note: Sorry the tables aren't lined up.

February 11, 2005
Eight years of email stats, pass 1
Posted by Marc Eisenstadt
What's the reality behind the 'email overload' talk? Let's look at some numbers... personal numbers.

To kick things off, I've got a huge email archive. I started emailing in the early ArpaNet days, around 1972, and haven't stopped since. My archive has been extremely thorough for at least the past 12 years (and, in case you think I'm nuts for keeping all of these, my actual regret from a scientific/archive perspective is that I don't have the earlier ones too!). Why? Let's just say that one day I planned to do an analysis of it all... types of mails, social networks, the whole works. But things got a little out of hand.... (anyone lookin' for some data, give me a shout... but first read on)...

Most of this 'storage mania' was triggered by a casual comment in around 1992 or 1993 by Ron Baecker, of the University of Toronto, a longtime research colleague and acquaintance and someone whose work I have long admired and respected. Ron asked me, "given ultra-cheap storage and ultra-fast search, both clearly on their way, why would you ever need either to delete or indeed to accurately file/categorize your emails?"

OK, so as a little personal experiment, I decided to keep 'em, and to see what happened. The quick story is that migrating across machines, operating systems, and preferred email clients, plus being a bit cavalier about the whole thing, has meant that although all the emails are 'there' in various archive files, it takes a little work to get 'em all back in a harmonious form, that is with all headers intact and no duplicates (the main formats are Vax mails, Unix mails, Mac Eudora, PC Eudora, Outlook Express, and Outlook).

The longer story, with some data and preliminary analysis, begins like this:

Even though I haven't had the time or motivation thus far to put in the harmonization work required to get all the data in one format and with duplicates eliminated, I nevertheless thought that a little 'first pass' set of totals (with my estimate of their accuracy) would be interesting, and maybe even provide a little coarse empirical support for Stowe's "Just Say No To Email" campaign.

So I quickly eyeballed-and-tallied the most coherent of the archives, spanning eight years of emails, from January 1st 1997 to December 31st 2004. The totals are real enough, but the 'eyeballing' was needed to assess the approximate proportion of spam and duplication involved in the emails. A more detailed analysis later will enable me to do these more accurately. I've indicated my estimate of the margin for error in the third column, and my estimate for the percentage of spam received in the fourth (and I mean real spam: i.e. either 'greedily-lookin-for-suckers' or 'low-down-mean-and-nasty spam', not conference announcements - you know what I'm talkin' about). For 2003, this number is precise, because I filtered off such spam using SpamAssassin, and counted them! 2004 spam numbers are an extrapolation, but the totals are accurate, as explained below. Here goes:

TABLE 1: Eisenstadt's 1997-2004 email totals

Year   Emails received   Est. Error   Est. Spam
1997   4320              20%          2%
1998   3996              20%          3%
1999   6821              10%          5%
2000   7580              5%           6%
2001   6125              5%           7%
2002   6497              5%           10%
2003   13092             1%           37.6%
2004   13889             1%           40%

2003 is the most accurate, because (unlike earlier years when I was changing clients and machines) I have all emails in one clean format and all spam preserved, auto-filtered by SpamAssassin into a folder that I look at only a few times a year, scanning rapidly for false rejections. Incidentally, that falsely rejected email rate appears to be roughly 1 in 5000: good enough for me! By 2004, although I kept all emails, I got fed up keeping the spam even for analysis purposes, and can't even be bothered to scan it, so stuff auto-filtered by SpamAssassin is now deleted without my looking at it - so the column 4 '40% spam' in the lower right hand corner is a well-educated approximation based on my observation of the ebb and flow of the size of my 'deleted' folder.

It's interesting that before 2003, I found that I didn't really need SpamAssassin - the numbers were annoying, but manageable, as the fourth column estimates show. As we go back in time, I have less patience with the process of harmonizing the data, as I mentioned above, hence the '20% error' estimate... in other words I believe, subjectively of course, that the totals for 1997 and 1998 could be off by roughly 20% either way. That's the price I pay for doing a quick-and-dirty analysis right now. On the other hand, even with such an analysis, I find the totals illuminating.

What does it all mean?

The totals in Table 1 tell me that the subjective 'quantum leap in spam' in 2002/3 that led me to install SpamAssassin as a full-time companion is certainly corroborated by the numbers. There's simply no other way to cope with the large volume of junk. But now (auto)strip away that nasty spam, and we're still looking at some scary numbers. Let's call the emails that are left over, after stripping away the nasty spam, "OK emails" (let's face it, they are never going to be "GOOD emails", right?). What we see then is an increase from 5-6K annual "OK emails" in the late nineties (15-ish daily) to 8-9K annual "OK emails" today (25-ish daily). A bright note in all this is that the numbers for 2004 are surprisingly steady compared with 2003, i.e. there's no exponential growth, even though things are clearly getting 'intense'.

25 emails daily (and there are many I know who have WAY more than this) is a lot to deal with, especially since the emails don't cluster evenly throughout the week. To get to a 25-per-day average, you're looking at more like 30-40 per working weekday, if you're the kind of person who switches off at the weekend (ha!). If each email requires 3 minutes of thinking/response time (you're lucky if you can average that), then you've got a guaranteed two hours straight down the tubes every day.
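The arithmetic in the last two paragraphs can be checked with a rough sketch. The totals and spam shares come straight from Table 1; the helper names, the 365-day year and the 5-day working week are assumptions made here for illustration, not part of the original analysis.

```python
# Table 1 figures as posted: year -> (emails received, estimated spam share)
TABLE_1 = {
    1997: (4320, 0.02),
    1998: (3996, 0.03),
    1999: (6821, 0.05),
    2000: (7580, 0.06),
    2001: (6125, 0.07),
    2002: (6497, 0.10),
    2003: (13092, 0.376),
    2004: (13889, 0.40),
}


def ok_emails(total, spam_share):
    """Emails left over after stripping the estimated spam share."""
    return total * (1.0 - spam_share)


def per_weekday(daily_average, workdays=5):
    """Spread a 7-day weekly total over the working days only (assumption)."""
    return daily_average * 7 / workdays


for year, (total, spam_share) in sorted(TABLE_1.items()):
    ok = ok_emails(total, spam_share)
    print(f"{year}: {ok:6.0f} OK emails, about {ok / 365:4.1f} per day")

# The post's round figure of 25 per day, at 3 minutes per email.
weekday_load = per_weekday(25)
hours_per_weekday = weekday_load * 3 / 60
print(f"{weekday_load:.0f} per weekday, {hours_per_weekday:.2f} hours")
```

With these assumptions the 2003 and 2004 rows come out a little above 8K "OK emails" a year, and 25 a day becomes 35 per working weekday, which sits inside the 30-40 range quoted above.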

But wait a minute, "down the tubes" is incorrect: surely your emails involve key interactions, networking, brainstorming, appropriate drudgery and admin, in short what you get paid to do, right? Well, that's not clear... and requires drilling down a bit deeper into the data.

Digging deeper: a work-week in depth

Table 2 shows a coarse categorization of all 286 emails I received during a Monday-Friday working week in January 2005 (10th-14th to be precise). I break them down into four groups, labelled simply A, B, C, D in the left hand column for ease of reference, along with the specific category label in the middle, and the total number of emails in each category shown in the third column. I also checked every email to see whether it involved some mundane scheduling/timetabling query/response (e.g. "Can you meet with Jones on 13th Feb at 10AM?"), on the hunch that such emails arrived a little too often for my liking. The fourth column shows the number of the emails in column 3 that involved such scheduling interactions (e.g. for row D, KMi Management, of the 68 emails received in that category, fully 32 of them were scheduling-related).

TABLE 2: Main categories for 5 workdays of email in January 2005

Group   Summary                                       Number   Of those involving 'scheduling'
A       Projects, papers, info requests               71       6
B       Blog and site comments and maintenance        73       3
C       Announcements, news, social, family           74       2
D       KMi Management, Gigs, Visitors, Invitations   68       32
TOTAL                                                 286      43

The four main categories A, B, C, D of Table 2 are further subdivided in Table 3, this time preserving the A, B, C, D labels in the left-hand column for cross-referencing with Table 2, but breaking them down into finer categories as shown in the second column (in reality I did this breakdown first, and only later chunked them together to create Table 2, but thought it was easier to present this way).

TABLE 3: Further subdivisions of Table 2

Group   Category                                                          Number
A       Funding bids, new project work requests                           17
A       Alerts, requests, lab messages                                    21
A       Main project work, paper writing                                  33
B       Blog commentary and queries                                       40
B       Issues related to 'popular KMi tools' (BuddySpace, HitMaps etc)   33
C       Conference and seminar announcements                              16
C       Semi-junk, news, domain renewals, etc                             14
C       Family / social                                                   30
C       From self and meta (system email bounces etc)                     14
D       KMi Management                                                    44
D       Visitors, gig arrangements, etc                                   24
TOTALS                                                                    286

Now what?

So there you have my finer-grained interactions 'laid bare'. Allowing ZERO minutes of response time for some finer-grained categories (e.g. semi-junk, self/meta, which don't require reading at all) and ONE-THREE minutes of response times for most categories, plus, say, TEN minutes of response time for an important research category such as 'main project work, paper writing', it is trivially easy to get to 2.5 hours per workday assuming a fairly ruthless, 'one-touch', knee-jerk email interaction regime. And worse if you deviate from the regime.
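As a sanity check on the 2.5-hour figure, here is a minimal sketch of that time budget. The category counts are taken from Table 3; the minutes-per-email column is an assumption chosen here (zero for semi-junk and self/meta, ten for main project work, and two as a midpoint of the "one to three" range for everything else), so the result is only one plausible reading of the estimate.

```python
# (category from Table 3, emails that week, ASSUMED minutes per email)
WEEK = [
    ("Funding bids, new project work requests", 17, 2),
    ("Alerts, requests, lab messages", 21, 2),
    ("Main project work, paper writing", 33, 10),
    ("Blog commentary and queries", 40, 2),
    ("Issues related to 'popular KMi tools'", 33, 2),
    ("Conference and seminar announcements", 16, 2),
    ("Semi-junk, news, domain renewals, etc", 14, 0),
    ("Family / social", 30, 2),
    ("From self and meta", 14, 0),
    ("KMi Management", 44, 2),
    ("Visitors, gig arrangements, etc", 24, 2),
]

WORKDAYS = 5  # the sample covers Monday to Friday

total_emails = sum(count for _, count, _ in WEEK)
weekly_minutes = sum(count * minutes for _, count, minutes in WEEK)
hours_per_workday = weekly_minutes / WORKDAYS / 60

print(f"{total_emails} emails, {weekly_minutes} minutes per week")
print(f"about {hours_per_workday:.1f} hours per workday")
```

Sliding the midpoint from one to three minutes moves the result from a bit under two hours to well over three per workday, so the 2.5-hour figure is sensitive to that single assumption.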

Then there are other sources of workflow: blogs, aggregator summaries, phone calls (rare, but I still allow one or two), cell-phone, text message, instant messaging (my buddy list is very large, and most of them are work-related).

All of this paints a very very bad picture. Sure, if you're "in the business" like we are, then that's the price you pay. But the pace is quickening (I've just tallied what we already knew intuitively), and I have little faith or trust right now in intelligent agents being able to solve my overload problems. Just consider the proportion of emails listed above that are scheduling-related! 43 out of 286, that's 15%! We already have a tool, Meetomatic, that would handle at least half of those, but of course not everyone uses it. And the other half of that subset tend to require awkward interactions and judgement calls that no delegated agent, human or artificial, can actually cope with.

We're entering an era in which something that Stowe has often written about is going to become an essential skill: "continuous partial attention." I thought I was pretty good at it, but I am slowly-but-surely observing everyone around me slipping into a kind of cognitive quicksand, getting increasingly grumpy and stressed out, and I don't like it.

As I was putting together this entry, I noticed that The New York Times has an article this week on email-overload and related attentional problems (free subscription required). The research described there is interesting, but falls into the trap I refer to above, of requiring agents that I personally would not trust to handle my attentional needs. Stanford University's Donald Knuth opted out of email many years ago - what a visionary!

Re:Text (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#11684863)

Fucking KARMA WHORE, go back to your masturbation.

So he.... (4, Funny)

Robotron23 (832528) | more than 9 years ago | (#11684720)

Saved it for what exactly? Maybe vintage 1997 pr0n e-mails are now worth something to antique pr0n collectors...

Back in the old days... (4, Funny)

Sheepdot (211478) | more than 9 years ago | (#11684800)

I remember a time when the size of my genitalia wasn't an issue.

I remember when I never had any Korean friends.

I remember a time when I went to the pharmacist for a drug I needed, not the pharmacist asking me which drugs I wanted to buy online.

I remember when consolidating a loan was a big decision instead of "just a click away!".

I remember a time where when I left high school, there was no chance in hell I'd ever have to hear from those nitwits again.

God, I miss those days.

Re:Back in the old days... (1, Redundant)

GNUALMAFUERTE (697061) | more than 9 years ago | (#11684857)

Yeah, all nice and everything, but in those times you didn't have such incredible chances to do business around the world like you have today! For example, right now I'm in negotiations with this nice guy from Nigeria to help him manage his bank account in the USA. BTW, that counts as consulting, right?

Re:Back in the old days... (5, Funny)

Ian Action (836876) | more than 9 years ago | (#11685048)

I'm sorry you're so upset...

So, would you like to buy some ink cartridges?

Re:Back in the old days... (1)

fsh (751959) | more than 9 years ago | (#11685083)

> I remember a time when the size of my genitalia wasn't an issue.

My friend, there has *never* been such a time.

If you think this article is about spam, read end (5, Interesting)

linuxbaby (124641) | more than 9 years ago | (#11684880)

If you think this article is about spam, make sure you read it all the way to the end. It's not.

He's questioning the entire technology of email as an effective way of communicating.

Analyzes not just the spam-count in his email, but the work-time needed to respond to the non-spam emails, too.

This is one of the most thought-provoking articles posted on Slashdot in a long time.

Re:If you think this article is about spam, read e (0)

Anonymous Coward | more than 9 years ago | (#11685011)

I am eagerly awaiting the un-slashdotting of his server. I think that with email, the emperor has no clothes. I've been thinking that for the last two years.

Re:If you think this article is about spam, read e (4, Insightful)

Tony Hoyle (11698) | more than 9 years ago | (#11685097)

It depends on what you use it for.

I work for a company on the other side of the globe.. couldn't do that without email. I also support an opensource project with 10,000 downloads a week... that generates 'a few' support queries :) Heck, without email I don't even think I could do that by phone without hiring a call center.

Re:If you think this article is about spam, read e (3, Interesting)

Bonhamme Richard (856034) | more than 9 years ago | (#11685171)

He raises a good point. I'm a college freshman at one of the most wired campuses in America (Carnegie Mellon). I get NO SPAM since I'm on the campus account, and my packrat mentality has forced me to save all of my emails. I've deleted a (very) few, but have saved 2,200 +/- 50 since September. I am in Navy ROTC, have a long distance girlfriend, and some of my profs really like email. Between these three I've got a lot of "work related" mail. If I spent 3 minutes reading/responding to each of the 2,200 emails, then I've spent nearly 4.5 DAYS on email in the last 5.5 months.
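The estimate is easy to reproduce. This is a small sketch using only the figures quoted above (2,200 +/- 50 emails, 3 minutes each); the function name is made up for illustration.

```python
def days_spent(emails, minutes_each=3):
    """Convert an email count into full 24-hour days of attention."""
    return emails * minutes_each / 60 / 24


for count in (2150, 2200, 2250):
    print(f"{count} emails -> {days_spent(count):.2f} days")
```

At the central figure of 2,200 that is 110 hours, or a little over four and a half days.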

It's interesting to think of where the time goes...

Old Mail Saved... (1)

Kenshin (43036) | more than 9 years ago | (#11684890)

I still have all my e-mail dating back to 1997. (My packrat mentality is alive and well on my computer.)

But running a scan on it wouldn't do much use, since I culled all the spam manually over the years...

spam? Whats spam? (0, Troll)

Legodude522 (847797) | more than 9 years ago | (#11684940)

uhh get gmail, i didnt even know i had spam until i checked the spam folder

hylton & his site 'corante.com' (0)

Anonymous Coward | more than 9 years ago | (#11684970)

... he spams his site just like Roland Piquepaille used to.

I POOP on email (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#11684973)

poop

Shows value of backups (2, Informative)

confusedneutrino (732640) | more than 9 years ago | (#11684982)

I had a very similar setup going on for a while, but I lost it over a year ago. 6 years and 2 gigs of emails lost to a faulty power supply. Scouring turned up nothing usable and I didn't have backups of my emails.

I felt like I lost a part of my past...

Goes to show the value of backing up your data.

Femto's Law of Email (4, Interesting)

femto (459605) | more than 9 years ago | (#11685068)

I did a similar analysis in 1998. I came to the following conclusion:

Given enough time, nearly every email becomes irrelevant.

This 'law' is based on the fact that of many thousands of emails, there were only about 3 or 4 that I judged to be of value (worth keeping) after three years.

A corollary:

You can safely ignore your email and suffer minimal long term consequences.

Here is an example of the application of "Femto's Law". The boss sends you an email asking you to do something. If you ignore the email, the boss will either a) if it is important come and tell you personally or, b) find someone else to do the task. Ultimately I think the law is based on the fact that email is mainly used for trivial stuff and important stuff will eventually be presented to you in a form which is harder to ignore.

I guess the applicability might have changed since 1998, if email has come to be used for non-trivial stuff, but I reckon it's mostly still true.

Side note: the reason I ended up doing the analysis is because the 'delete' button stopped working on my mail client and I had to sort my emails when changing jobs. At the time I posted my conclusions to the rest of the University department, to other people's amusement.

PS. No, I'm not brave enough to ignore my email!

Gelfling's Axiom of Irrelevant eMail (4, Funny)

gelfling (6534) | more than 9 years ago | (#11685107)

Is this:

90% of all eMail is useless the moment it arrives in your inbox.

The First Corollary of eMail age is this:

All remaining eMail is useless no more than one year after the moment it arrives in your inbox.

The Second Corollary of eMail age is this:

eMail accidentally deleted will become instantly irrelevant or it will be resent without your request.

Re:Gelfling's Axiom of Irrelevant eMail (1)

Vegeta99 (219501) | more than 9 years ago | (#11685141)

Email accidentally deleted becomes INSTANTLY very important, not irrelevant.

In Soviet Russia... (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#11685146)

Email analyses you.

I've got that much email, with one difference... (0, Redundant)

DamienMcKenna (181101) | more than 9 years ago | (#11685147)

I didn't keep my spam, besides that I've kept almost everything I've ever received or sent.

Damien