Why Programmers Need To Learn Statistics

Soulskill posted more than 4 years ago | from the because-they-suck-at-poker dept.

Math 572

David Gerard writes "Zed Shaw writes an impassioned plea to programmers: Programmers Need To Learn Statistics Or I Will Kill Them All. Quoting: 'I go insane when I hear programmers talking about statistics like they know s*** when it's clearly obvious they do not. I've been studying it for years and years and still don't think I know anything. ... I have taken a bunch of math classes, studied statistics in grad school, learned the R language, and read tons of books on the subject. Despite all of this I'm not at all confident in my understanding of such a vast topic. What I can do is apply the techniques to common problems I encounter at work. My favorite problem to attack with the statistics wolverine is performance measurement and tuning. All of this leads to a curse since none of my colleagues have any clue about what they don't understand. I'll propose a measurement technique and they'll scoff at it. I try to show them how to properly graph a run chart and they're indignant. I question their metrics and they try to back it up with lame attempts at statistical reasoning. I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation.'"

572 comments

93% of Programmers Think You're Wrong (3, Interesting)

Greyfox (87712) | more than 4 years ago | (#30710568)

Everything I needed to know about statistics I learned playing poker.

Re:93% of Programmers Think You're Wrong (5, Interesting)

Anonymous Coward | more than 4 years ago | (#30710662)

Re:93% of Programmers Think You're Wrong (5, Insightful)

ShakaUVM (157947) | more than 4 years ago | (#30710740)

A manga statistics book, eh?

I just realized I was a nerd. I looked at the table of contents and closed it down, then realized I hadn't even looked at the short skirt-wearing protagonist.

Sigh...

But to answer the article's point, elementary statistics are very easy. Advanced statistics are very hard. It's kind of like how people think "knowing the difference between circles and squares" is geometry and so analytical geometry must be just more of the same, right? It's quite possible the programmers think they know statistics because they know they're vaguely supposed to do a run multiple times, and maybe average the results or something.

It's also possible the author of the article is a know-it-all douchebag who tries to solve problems with overwrought solutions.

From TFA: "Zed: Fuck! Fuck! I have eyes! You do not! See!? No?! Exactly! Because you can't fucking see because you have no fucking eyes! Arrggh!"

Just throwing that theory out there.

Re:93% of Programmers Think You're Wrong (0, Troll)

moderators_are_w*nke (571920) | more than 4 years ago | (#30710752)

Someone mod this up

Re:93% of Programmers Think You're Wrong (0, Insightful)

Anonymous Coward | more than 4 years ago | (#30710796)

Everyone knows that 98.2% of all statistics are made up on the spot.

Percent probability that Zed Shaw is a jerk (5, Funny)

Anonymous Coward | more than 4 years ago | (#30710580)

110%.

correlation != causation (5, Funny)

Hognoxious (631665) | more than 4 years ago | (#30710596)

Correlation != causation. Just repeat that and you don't need to know statistics.

Re:correlation != causation (1, Insightful)

Anonymous Coward | more than 4 years ago | (#30710790)

I think many programmers/managers would be better off with less statistics. I cannot tell you the number of times I have seen major bugs (i.e. crash the damn app, corrupt data, etc.) go into an application because 'statistically no one will ever do that'. It is 100% predictable that someone will do it, and only then are you allowed to fix it.

Misapplied statistics are worse. Like my previous example: sure, out of, say, 10 million transactions, 1 failed. But guess what? That 1 is just as important as ALL the other ones. So what if the odds are 1 in 10 million that it will happen? That's just math masturbation.
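To put a rough number on "someone will do it": if each transaction fails independently with probability p, the chance of at least one failure in n transactions is 1 - (1 - p)^n. The sketch below assumes independence, and the rate and volumes are made-up illustrations, not figures from any real system.

```python
def prob_at_least_one_failure(p_fail: float, n_transactions: int) -> float:
    """Chance of one or more failures across n independent transactions."""
    return 1.0 - (1.0 - p_fail) ** n_transactions


# Hypothetical failure rate and volumes, chosen only for illustration.
P_FAIL = 1e-7  # "1 in 10 million"
for n in (1_000_000, 10_000_000, 100_000_000):
    chance = prob_at_least_one_failure(P_FAIL, n)
    print(f"{n:>11,} transactions: {chance:.3f} chance of at least one failure")
```

At 10 million transactions the "1 in 10 million" bug is more likely than not to have shown up at least once, which is the point being made above.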

Can't happen is always fixed twice (1)

RobertLTux (260313) | more than 4 years ago | (#30710858)

you fix it once to handle when some Anti-Mensa card carrying twit actually makes it happen
then you fix it a second time to prevent it from happening

every time you get data from a user/outside process you should be able to handle values that make you go Eh WOT?? and then chuck those values out (and emit the correct error code)

Re:Can't happen is always fixed twice (1)

LostCluster (625375) | more than 4 years ago | (#30711002)

every time you get data from a user/outside process you should be able to handle values that make you go Eh WOT?? and then chuck those values out (and emit the correct error code)

Computers aren't very good at generating data, just analyzing it. You've got to get your data from somewhere.

So, when something unlikely comes up for a report... the question isn't just whether the number is accurate, but also why did it happen?

I was once working at a catalog outfit where there was a question as to why some days had massive return numbers, others had zero, and most stayed within the acceptable range. I looked into it... there was one guy who specialized in returns. When he took a day off for any reason, nobody stepped up to take his place. So that's where the zero-return days came from. Following any time off, there was a backlog which he quickly processed, creating the big days. The stat was accurate... there were just some irregularities in the data.

Bitch, while you were writing all that jive (-1, Flamebait)

Rogerborg (306625) | more than 4 years ago | (#30710602)

I just authored an "good enough" OBDC client app that paid the bills for this month. How much did you earn, Poindexter?

Re:Bitch, while you were writing all that jive (1, Funny)

Anonymous Coward | more than 4 years ago | (#30710774)

That's ODBC, Junior. Details matter.

(And I'll bet you a thousand dollars that I earned more than you this month.)

Oh so true... (0)

Anonymous Coward | more than 4 years ago | (#30710606)

It is just another way for the majority of programmers to jstify their shortcuts and shortcommings as correct. If they were to really study statistics, they would finally realize they know nothing and a chorus of millions of programmers heads would explode simultaneously.

There is a spell check in the comment box... (0)

Anonymous Coward | more than 4 years ago | (#30710826)

jstify = justify

shortcommings = shortcomings

programmers = programmers'

If the word is underlined in red, you spelled it incorrectly. Just a thought. I can only hope that you are more careful when you write programs.

Re:There is a spell check in the comment box... (1)

Hognoxious (631665) | more than 4 years ago | (#30710878)

Meh, that's what compilers are for.

Your argument is dead, Zed (5, Insightful)

BadAnalogyGuy (945258) | more than 4 years ago | (#30710610)

Maybe the problem is in your presentation. Even here, you tell programmers that you want to kill them for not understanding a topic that even you are unwilling to acknowledge mastery of. Then you tell us how hard the topic is to understand, even though you've spent so much time trying to learn it.

Is it any wonder that no one takes your suggestions seriously? You are practically sabotaging yourself with self-effacement.

These aren't homework problems you're tackling here. They are business problems and you need to sell yourself and your ideas if you want to get any traction. Do you have any evidence that your methods are better than the SOP thus far? Do you have any case studies that show how effective statistical analysis is in *any* of your projects?

Or are you simply taking something that seems like a data point and extrapolating it to cover a vast swath of applications?

Re:Your argument is dead, Zed (4, Funny)

Krishnoid (984597) | more than 4 years ago | (#30710724)

Or are you simply taking something that seems like a data point and extrapolating it to cover a vast swath of applications?

Well yeah, that's what he was saying -- statistics!

Re:Your argument is dead, Zed (1, Offtopic)

ihavnoid (749312) | more than 4 years ago | (#30710838)

Well, I think this would be the article Zed needs to read:

http://www.joelonsoftware.com/articles/fog0000000332.html [joelonsoftware.com]

Basically, many programmers feel that everybody else around them is a stupid asshole. However, if you want to succeed (e.g. have everybody around you learn statistics), you should never, ever, ever make enemies.

Be productive, work hard, listen to others, and try to do the work in the *right way*. Gain respect from your colleagues, and then they will get interested.

Re:Your argument is dead, Zed (4, Insightful)

superdana (1211758) | more than 4 years ago | (#30710850)

Maybe the problem is in your presentation.

Meet Zed Shaw.

Or, how about... (5, Insightful)

halivar (535827) | more than 4 years ago | (#30710618)

Statisticians need to learn programming or I will kill them all.

Re:Or, how about... (1)

Max(10) (1716458) | more than 4 years ago | (#30710912)

"Statisticians need to learn programming or I will kill them all."

No, please don't, leave at least half a dozen so they can do the statistics on your killing the others and then we'll use the Pearson correlation coefficient [wikipedia.org] on their results to find the most incompetent statistician of the bunch whose future work we'll then use to seed our PRNGs.

Re:Or, how about... (1)

SiggyTheViking (890997) | more than 4 years ago | (#30710984)

How about you just kill them all?
Right after all the lawyers.

Re:Or, how about... (1)

ruyon (660897) | more than 4 years ago | (#30711004)

Better yet, how about "Zed needs to learn manners or I will rip his mouth apart."
Please execuse my English.

Mathematicians just need to shutup. (4, Insightful)

HornWumpus (783565) | more than 4 years ago | (#30710620)

We know as much statistics as we need to know.

Some know more, some less. Each has traded off hours vs. knowledge in many fields.

For example: Why would a programmer who's job is to automate bean counting need to know more then basic statistics? (s)he rightfully focuses his efforts on accounting.

One post calculus statistics course gives me enough grounding to know what I don't know and punt to experts when I need to.

Fucking specialists forget all the things they don't know and only look at the world through one lens.

Re:Mathematicians just need to shutup. (2, Interesting)

gardyloo (512791) | more than 4 years ago | (#30710756)

We know as much statistics as we need to know.

Some know more, some less.

That's either the most honest, insightful comment I've ever seen, or the most useless. I'm 92% sure, with an uncertainty of about +/-5%, that it's the latter.

Re:Mathematicians just need to shutup. (0)

HornWumpus (783565) | more than 4 years ago | (#30710784)

You sir need to learn to read and not cherry pick parts of posts in attempts at lame jokes.

Re:Mathematicians just need to shutup. (5, Insightful)

nextekcarl (1402899) | more than 4 years ago | (#30710848)

One post calculus statistics course gives me enough grounding to know what I don't know and punt to experts when I need to.

That's actually his argument (though I'm pretty sure he doesn't realize it, having met him a few years ago at a conference). People need to know their limits, and the strengths (and weaknesses) of others, and defer to them when they know what they're talking about, rather than talking out of their asses. As you point out, you can't know everything, but you'll defer to others who know more when you need to. I'm pretty sure Zed would like working with you based upon that fact alone (I know I value that trait and try to express it myself). Far too many people think they aren't allowed to have any weaknesses (and we all do in some area or another) so they talk a big game, and when push comes to shove, they will actively block people who actually know more than they do about the subject at hand. Working with too many people like that has driven Zed insane (IMHO) and I know I've been close to it at a couple of work places before (and really loved the one that wasn't like that hardly at all).

Re:Mathematicians just need to shutup. (5, Insightful)

Toonol (1057698) | more than 4 years ago | (#30710876)

But statistics is one of those fields that benefits everybody; it's a bit like probability, logic, or (further afield) history. Lack of a fundamental understanding of statistics can lead you astray in a near-infinite number of ways.

I have sat in business meetings hundreds of times where I've seen decisions made on completely meaningless and irrelevant data, because the people involved don't understand statistics. The same holds true in your personal life; decisions with purchasing products, investing money...

Now, I'll bet that most slashdot readers have the minimum amount of knowledge of statistics to avoid the most egregious errors; but more knowledge is certainly helpful. It will help you in a myriad of ways.

Re:Mathematicians just need to shutup. (2, Insightful)

Anonymous Coward | more than 4 years ago | (#30710988)

Being socially adept is also a skill that benefits everybody, but many programmers just aren't. I hardly know anything about statistics, but I'm not afraid to ask questions. I'm sure there's stuff that other programmers know and think equally fundamental to success that Zed doesn't. It's fantastic that he's passionate about statistics. That skill certainly comes in handy, but how much more important is it than helping everyone on the team get their job done, for example?

Re:Mathematicians just need to shutup. (0)

Anonymous Coward | more than 4 years ago | (#30710936)

Fucking specialists forget all the things they don't know and only look at the world through one lens.

Meanwhile, people like you can't even tell the difference between the words "then" and "than."

But don't let that stop you! A poor grasp of basic English shouldn't slow you down any more than a poor grasp of statistics! It's ok ... you can just explain it all away with some hand-waving and a post to Slashdot! You're a genius! Here ... have a lollipop!

93% of everyone else thinks you're full of it. (0)

Anonymous Coward | more than 4 years ago | (#30710622)

Damm geek. Take your fancy math and get off my lawn.

Title fail. (5, Funny)

girlintraining (1395911) | more than 4 years ago | (#30710628)

Programmers Need To Learn Statistics Or I Will Kill Them All

Okay, two things: First, threatening programmers never works. Management's been trying that for years. Second -- don't you mean 'kill -9' them all, or maybe demalloc(), or cast them to void*, or one of a dozen other witty things you could do besides the mundane answer of threatening stabby bits on them because you have a case of intellectual snobbery?

Really? (1, Funny)

Anonymous Coward | more than 4 years ago | (#30710632)

Zed Shaw says: "I've been studying it for years and years and still don't think I know anything"

Don't you think this might be telling you something, like... perhaps statistics are too hard for you? Leave the real work to the people who do know what they are doing and do know something about the field: programmers.

Re:Really? (1, Insightful)

Anonymous Coward | more than 4 years ago | (#30710658)

Statistics is "just" applied measure theory. Which means, among other things, that its language is Turing complete. There is infinitely much to know.

Re:Really? (1)

Sulphur (1548251) | more than 4 years ago | (#30710922)

Statistics is "just" applied measure theory. Which means, among other things, that its language is Turing complete.

Can you give a traveling salesman analogy for that?

empty threats (0)

Anonymous Coward | more than 4 years ago | (#30710644)

statisticians need to stfu or I will kill them all.

shut the fuck up you homo (-1, Troll)

Anonymous Coward | more than 4 years ago | (#30710652)

go suck some dick. Us heteros have shit to do.

Logic and Reason *ARE* superior to evidence and (0)

Anonymous Coward | more than 4 years ago | (#30710656)

observation in some circumstances. In social sciences, where you generally can't classify phenomena by observable evidence, you have to rely on them by assuming others think as you do, so that you have "observations" (ie others' perceptions or classifications as related) to work with.

Re:Logic and Reason *ARE* superior to evidence and (2, Insightful)

AnotherUsername (966110) | more than 4 years ago | (#30710898)

I prefer logic and reason mixed with evidence and observation.

If you just have logic and reason, then you get religion. Logically, it worked out when it was created. There is no evidence to counter it, so it must be true. Religion was created with logical reasoning. Some may say it was incorrect reasoning, but it was reasoning nonetheless.

On the other hand, if you just have observable evidence, with no logical reasoning, you can have all the data in the world, but you will have nothing to use it with. True, you can see it, but you cannot understand why it is the way it is.

Having all of one or the other is useless.

My advice: take a statistics class as an undergrad (1)

j1m+5n0w (749199) | more than 4 years ago | (#30710664)

I never took a statistics class as an undergrad. In retrospect, I think it would have been very useful, probably more so than the calculus I took (which I think is also a very good thing to know, but stats tend to be used more often).

For actual arguments... (0)

Anonymous Coward | more than 4 years ago | (#30710684)

as opposed to strawmen and insults, scroll to the power of ten syndrome heading on the linked page.

The funny thing is he's doing exactly the same (4, Insightful)

Rix (54095) | more than 4 years ago | (#30710686)

He's just as arrogantly claiming that he's right and they're wrong. Now, he may very well in fact be right, but he's taking the same obstinate position the people he criticizes do.

It's important to know when your input is not desired. Even if you think it should be.

The reason people ignore you Zed.. (5, Insightful)

Anonymous Coward | more than 4 years ago | (#30710692)

is not because they don't understand statistics. It is because you are a dick.

Statistics is HARD (4, Informative)

omb (759389) | more than 4 years ago | (#30710694)

Statistics is HARD, for two reasons:

(a) Probability theory, on which all practical Statistics is based, is both (i) counter-intuitive and (ii) difficult

(b) The very Mathematics on which it is based is obscure

And, worst of all, it is uniformly badly taught, even in good universities, and the Statistics for XXX are uniformly awful, blind leading the blind.

Lastly, it is very hard to get a straight answer from a mathematical Statistician.

Re:Statistics is HARD (3, Funny)

codewarren (927270) | more than 4 years ago | (#30710812)

Statistics for XXX are uniformly awful, blind leading the blind.

They have statistics for porn? (!!)

What could be wrong with that? And blind on blind action? Strange, but interesting.

Re:Statistics is HARD (1)

digitalhermit (113459) | more than 4 years ago | (#30710836)

Can't agree with that.

Basic statistics as taught in a beginning stats class is counter-intuitive because they don't teach the calculus behind it. It's actually quite simple to use, however. The tough part is figuring out what statistic to apply to a given problem. It's not difficult. There's a reason that it satisfies the "basic math requirements" for a business major and physical therapy major.

The mathematics behind statistics is Calculus 2 which is hardly obscure. The Statistics with Calculus class in fact only requires a Calc 1 understanding; i.e., knowledge of limits, differentiation and integration. What the statistics course teaches is how to apply those tools and not the reasoning behind how they work.

And yes, statistics is often badly taught, but I can say that about almost every undergrad math course that I ever took.

Re:Statistics is HARD (1)

LostCluster (625375) | more than 4 years ago | (#30710914)

I didn't have much trouble with statistics in college after having studied physics the year before in high school, and firmly believing that formulas are taught because they've been proven true, so you just need to remember the steps to get something done, and the numbers just fill in the variables. More numbers are involved, but there are still formulas.

I had such an easy time with the course, and had trouble hiding that, that I would regularly be visited by students asking for help on Sunday on the homework that I had completed after class on Friday. Doing the homework within minutes of it being taught helped greatly. It led me to be totally free of work over the weekend while others put it off, and some waiting for me to return from my hometown.

Re:Statistics is HARD (4, Insightful)

radtea (464814) | more than 4 years ago | (#30710954)

Statistics is HARD, for two reasons:

I'd argue that probability theory isn't as hard as people make it seem, but statisticians are wankers. Most of what we think of as statistics was developed by people who were intimately engaged with empirical research, but modern statisticians are mathematicians, many of whom have never actually performed an experiment. They think the statistics are real, whereas experimental scientists know the truth: God made the Probability Distribution Functions. All else is the work of man.

Furthermore, modern computing has made a lot of the conceptual apparatus of conventional statistics irrelevant, as it is designed to deal with the problem of reducing problems to something that can be computed by hand and finished off with a single table lookup. Today it's a rare case that we can't get at the PDFs directly, bypassing much of conventional statistics. But due to how badly the stats are taught, and how poorly probability theory is understood, we are still living in a world where p-values are the exception, not the norm, and when they are quoted they are frequently unrealistic because they are based on statistical assumptions that are not warranted given the non-idealities of the data.

So I'd argue that statistics is basically a dead field populated by zombies who are dedicated to infecting as many students as possible. If we taught thermodynamics or mechanics with equally outmoded concepts they would be really hard too.

Re:Statistics is HARD (1)

wfolta (603698) | more than 4 years ago | (#30710992)

You hit the nail on the head. Statistics is counter-intuitive and badly taught. But extremely important.

The worst grade I got in undergraduate studies was in Probability, and in graduate studies I've been exposed to statistics now for about the 4th time and it's finally sinking in... mostly... a lot.

That said, there is need for statistics in any programming endeavor where you are trying to come up with a new algorithm or trying to improve the performance of an existing one. I can think of the kind of pitiful "ran it several times and this one's faster" testing I would have done in the past, and all the logical hand-waving I would have done if questioned, "Can we be SURE it's faster?", and it's embarrassing. If you're just coding, perhaps no need, though a good feel for how real statistics and scientific experimentation is done is very helpful in programming.
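One hedged way to go beyond "ran it several times and this one's faster" is a permutation test on the timings: shuffle the labels many times and see how often chance alone produces a gap as large as the one observed. This is only a sketch of one reasonable technique, not what anyone in this thread used; the function name `permutation_p_value` and the timings are invented for illustration.

```python
import random


def permutation_p_value(a, b, n_resamples=5_000, seed=0):
    """Two-sided permutation test on the difference of mean timings."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # pretend the "old"/"new" labels are arbitrary
        left, right = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never an impossible p = 0.
    return (hits + 1) / (n_resamples + 1)


# Made-up timings in milliseconds, for illustration only.
old_ms = [102.1, 99.8, 101.5, 100.9, 103.2, 100.2, 101.8, 99.5]
new_ms = [98.7, 100.4, 97.9, 99.1, 98.2, 100.8, 97.5, 98.9]
print(f"p = {permutation_p_value(old_ms, new_ms):.4f}")
```

A small p-value says the gap is unlikely to be shuffling luck; it says nothing about whether the benchmark itself measures the right thing (warm caches, noisy neighbours, and so on).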

Re:Statistics is HARD (5, Interesting)

thesandtiger (819476) | more than 4 years ago | (#30711006)

I don't think it's hard - I just think it requires a different way of thinking than most programmers usually take to maths.

As a programmer/developer who went into research (in social sciences, so it's really soft), I can say that in my experience stats is really closer to a programming language than it is to other maths. Here's why:

1) You have a LOT of tools to pick from. What kind of analysis do you want to do? What kind will give you the most useful result? What kind is your data amenable to?

2) You don't always have a clear choice as to which is the best for a given situation. Sometimes you need multiple different types of analysis to really get the full picture.

3) Just because it's math doesn't always mean it's right. There's some crazy ass black-box magic stats stuff we use for one project of ours that, in theory, will let us figure out the demographic composition of an unknown target population. Maybe. Sometimes. If the wind is right. Or not.

4) At the advanced levels, it's fucking insane. People who hack stuff like ultra optimized 3d engines with large quantities of assembler or whatever always wigged me out because my brain just doesn't work that way. With the really complex stats stuff it's the same way - I can plug and chug with the formulas, but I honestly have about as much comprehension of why some of the more advanced stuff works as my dog has of CPU design.

5) If you know the basics, you know just enough to be dangerous and really piss off people who know what they're doing. Being able to run an anova or determine correlation makes some people think they actually know what's going on because, hey, it's math. But a lot of people who just do the basic stuff think their results are more meaningful than they actually are - falling prey to the whole "it's statistically significant therefore it must be IMPORTANT" fallacy (when you can certainly have things that are "statistically significant" but actually have virtually no impact on the outcome).
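The "significant but unimportant" point in 5) can be sketched by computing a test statistic next to an effect size. The example below uses simulated data (two normal populations whose means differ by half a point on a spread of 15, values picked purely for illustration) and reports a z statistic alongside Cohen's d; the helper names are hypothetical.

```python
import math
import random


def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, var


def z_and_cohens_d(a, b):
    """Return (z statistic, Cohen's d) for the difference mean(b) - mean(a)."""
    ma, va = mean_and_var(a)
    mb, vb = mean_and_var(b)
    z = (mb - ma) / math.sqrt(va / len(a) + vb / len(b))
    pooled_sd = math.sqrt(
        ((len(a) - 1) * va + (len(b) - 1) * vb) / (len(a) + len(b) - 2)
    )
    return z, (mb - ma) / pooled_sd


# Simulated data: a tiny true difference, but a very large sample.
rng = random.Random(42)
group_a = [rng.gauss(100.0, 15.0) for _ in range(100_000)]
group_b = [rng.gauss(100.5, 15.0) for _ in range(100_000)]

z, d = z_and_cohens_d(group_a, group_b)
print(f"z = {z:.1f}, Cohen's d = {d:.3f}")
```

With enough samples the z statistic ends up far beyond the usual 1.96 cutoff, while d stays well under the 0.2 that is conventionally called a "small" effect.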

6) Even when people know their shit, they disagree. A fine example of this would be the Space Shuttle failure rate - you had people saying that the shuttle would suffer a critical failure anywhere from 1 in 5 to 1 in 50,000 launches. And depending on what tools they used to do their analysis, they were correct. Same as with programming languages - depending on the problem, equally skilled programmers might pick entirely different languages to use because they think one part or another is more critical.

Honestly, I really enjoy stats - if I had to do it all over again I would probably have spent a LOT more time working with stats than I did as a programmer in my younger years - but I won't pretend that it's totally clear what tools to use when. The author of TFA would do well to realize that even fellow statisticians would probably slap the shit out of him over some of his beliefs about how to properly go about utilizing stats toolsets.

bloggers need to learn to write or ... (1)

Lazy Jones (8403) | more than 4 years ago | (#30710706)

... something inside me wants to flame him for being a rude twat who wasted 1 minute of my lifetime, even though he has some valid points. I'd be surprised if he didn't get some responses along the lines of "cry me a river" etc.

Go ahead and try it (3, Insightful)

thetoadwarrior (1268702) | more than 4 years ago | (#30710708)

I know enough about statistics to know that, statistically, I'm safe from his threats. I suspect if I were a bag of Cheetos the odds would be against me, but that's not the case.

It's not just statistics (2, Insightful)

im_thatoneguy (819432) | more than 4 years ago | (#30710718)

I've found that, more than just about any other degree, Computer Science and to a lesser extent Medical Degrees imbue the recipient with an unnatural ego when it comes to subjects with which they are unfamiliar. I propose we remove the word Science from CS degrees and call it what it is: "Computer Programming and Troubleshooting". There are far too many CS graduates who think they are actually scientists.

Re:It's not just statistics (4, Insightful)

radarsat1 (786772) | more than 4 years ago | (#30710782)

I disagree that CS is just "programming and troubleshooting", but I do agree that Computer Science is a complete misnomer. It's extremely misleading, and difficult to explain to people, "I'm a computer scientist, but no I'm not actually a scientist, instead I understand how to describe formal languages in terms of strict grammar rules and transform abstract syntax trees from one representation to another."

It shouldn't be called Computer Science, it should be called Computational Mathematics, because that's what it is.

(On the other hand, there is a whole branch of CS that extends very deeply into statistics called Machine Learning, but at the core I'd say it is still more mathematics than science. There is also human-machine interaction which often goes under CS, but is actually more like psychology.. so it's not so cut and dried.)

Re:It's not just statistics (2, Insightful)

Dahamma (304068) | more than 4 years ago | (#30710880)

Maybe wherever you went to school they taught "computer troubleshooting" as a degree, but some of us actually got a solid foundation in the various theoretical and practical foundations of computer software engineering.

Though I do agree that "Computer Science" is a stupid name. They already have Mechanical Engineering, Chemical Engineering, Electrical Engineering, etc - why not just call it "Software Engineering"? [I'd say "Computer Engineering", but since that was my major and I also had to do transistor physics and VLSI design, I guess it does need to be separate...]

Re:It's not just statistics (0)

Anonymous Coward | more than 4 years ago | (#30710998)

Because Software Engineering, which is actually both a course and a profession, is strictly distinct from CS in the same way chemistry is distinct from Chemical Engineering - one focuses on theory, while the other covers applications and implementation.

Stats are only as good as the data (1)

spiffmastercow (1001386) | more than 4 years ago | (#30710728)

I was tasked recently with developing stat reports that would be used to give the best workers the most important tasks. I used their desired metric, and modified the numbers to show on a 0-100 scale where 75 is average and each standard deviation is 10 points. The result? The sample sizes were too small, and some groups had widely varying scores when every group member's performance was nearly identical. Then again, maybe I'm doing something wrong.
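For what it's worth, the rescaling described (average maps to 75, one standard deviation maps to 10 points) is a linear transform of z-scores. The sketch below assumes the standardization was done within each small group, which the post does not actually say, and the raw numbers are invented. Under that assumption it reproduces the symptom: near-identical performances get spread far apart.

```python
from statistics import mean, pstdev


def rescale(raw, target_mean=75.0, points_per_sd=10.0):
    """Map raw scores so the group mean is 75 and one SD equals 10 points."""
    mu = mean(raw)
    sigma = pstdev(raw)  # population SD of this group only
    if sigma == 0:
        # Everyone identical: no spread to stretch, so all get the mean.
        return [target_mean for _ in raw]
    return [target_mean + points_per_sd * (x - mu) / sigma for x in raw]


# Hypothetical four-person team whose raw metric differs by at most 0.2.
team = [50.0, 50.1, 49.9, 50.0]
scores = rescale(team)
print([round(s, 1) for s in scores])
print(f"spread: {max(scores) - min(scores):.1f} points")
```

Dividing by a tiny within-group SD inflates trivial differences into a gap of more than two "standard deviations", so one possible fix (again, an assumption about the setup) is to standardize against the whole workforce rather than each group.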

Re:Stats are only as good as the data (1, Funny)

Anonymous Coward | more than 4 years ago | (#30710934)

Crash course in statistics... The result you got is not 'refined', you get the 'vital' variables like who has a mustache, ugly shirt... things that might be more likely to group people together. Ugly people hang out with ugly people and vice versa, you get the point.
Then you go around asking them to lend you some money; if they don't, their stats go down pretty quick. It's also a plus here if their memory is bad or maybe suffering from early-onset Alzheimer's.
When that's well and done you use a differential algorithm with the other stats and this 'noise' you have gathered to get a nice graph.
And finally you put the values into Excel, and make a nice pie chart which you copy-paste into PowerPoint.
Present it to your superiors and tell them how much work you put into it. Also, if there is a glitch in the presentation, like odd values or discrepancies, tell them the IT sector screwed up.

Re:Stats are only as good as the data (0)

Anonymous Coward | more than 4 years ago | (#30710952)

Ummm...

Where to begin?

I used their desired metric, and modified the numbers to show on a 0-100 scale where 75 is average and each standard deviation is 10 points.

OK, so they asked you to compute a statistic and you began by modifying the numbers to adjust the mean to 75 and each standard deviation is 10 points? That's like saying you adjusted the numbers so that ten equals thirty-five and had unicorns grow apple trees in their magical dung.

The result? The sample sizes were too small, and some groups had widely varying scores when every group member's performance was nearly identical.

And that's ass-backwards. You don't sample a portion of population to get the results. You don't care about sample size when presumably you're measuring all workers. That's like asking Tabitha and Gerarldine in the marketing department the color of their ..ummm.. sneakers and then slapping your head and saying, "Man, these results just don't seem to apply to the group!"

Then again, maybe I'm doing something wrong.

You think? You use some of the right terms, but sort of like saying, "I pulled on my CPU across the gigahertz and compiled my keyboard. Then danced in the unicorn dung."

Is Zed insane? (1)

greg_barton (5551) | more than 4 years ago | (#30710730)

Seriously.

Re:Is Zed insane? (1)

perry64 (1324755) | more than 4 years ago | (#30710814)

Zed's dead, baby. Zed's dead. We're riding his chopper.

Re:Is Zed insane? (0)

Anonymous Coward | more than 4 years ago | (#30710982)

He admits as much in the summary.

sounds impossible to please? (3, Insightful)

v1 (525388) | more than 4 years ago | (#30710746)

I've been studying it for years and years and still don't think I know anything.

And yet you're expecting someone whose expertise is in a different field to know more about it than you?

We can't all be experts in everything. If you're the expert in the field of discussion, get used to educating your coworkers on the topic, or find another job where you're surrounded by people with the same education and expertise as you.

The average person is an expert in no more than two or three related areas. That's why people work in teams, to cover each other's blind spots.

Re:sounds impossible to please? (0)

Anonymous Coward | more than 4 years ago | (#30710944)

The average person is an expert

I see what you did there...

StatisticsIsEmoApparently (0)

Anonymous Coward | more than 4 years ago | (#30710748)

I never thought I'd read lame crying on slashdot, but now i have. Man up and cut your own wrists.

Everyone knows smoking is the leading cause of statistics.

Zed Shaw needs some serious meds (1)

optikos (1187213) | more than 4 years ago | (#30710760)

He cannot even write a logical, rational thought supporting why programmers need to know more than a casual level of statistics. He just rants about blue sunsets and writes the f-word a lot.

He makes some good points... (5, Insightful)

SanityInAnarchy (655584) | more than 4 years ago | (#30710794)

...unfortunately, they are mostly lost in the irony of statements like this:

I think women are better programmers because they have less ego and are typically more interested in the gear rather than the pissing contest.

I doubt I've seen anyone more thoroughly entrenched in a pissing contest than Zed Shaw, of the website formerly known as "Zed's So Fucking Awesome".

Zed Shaw is a tosser. (2, Informative)

toby (759) | more than 4 years ago | (#30710766)

Nothing new to see here.

Re:Zed Shaw is a tosser. (0)

Anonymous Coward | more than 4 years ago | (#30710886)

And to sum up the article...

Also consider the standard deviation when measuring performance over multiple iterations.
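A minimal sketch of what that one-line summary might look like in practice, using only the Python standard library. The `workload` function and the choice of 30 iterations are placeholders, not anything taken from the article.

```python
import statistics
import time

def time_iterations(func, iterations=30):
    """Run func repeatedly and return the wall-clock time of each run in seconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples

def summarize(samples):
    """Return (mean, sample standard deviation) of the timings."""
    return statistics.mean(samples), statistics.stdev(samples)

def workload():
    # Placeholder for the code under test.
    sum(i * i for i in range(10_000))

samples = time_iterations(workload)
mean, sd = summarize(samples)
print(f"mean={mean:.6f}s sd={sd:.6f}s over {len(samples)} runs")
```

Reporting the standard deviation next to the mean shows whether a difference between two builds is larger than the run-to-run noise; a mean on its own cannot tell you that.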

Stats? Fuck that. (2, Informative)

delysid-x (18948) | more than 4 years ago | (#30710778)

Statistics is WAY beyond what a programmer cares about. Logic is all that matters. Statistics->logic is the problem of the software engineer, not the programmer.

Show them you're the Boss (0)

Anonymous Coward | more than 4 years ago | (#30710800)

It is easy to convince your colleagues that you are better than them in statistics. Just play some statistical games with them. I recommend the "Three Door Problem" which is sometimes called the Monty Hall problem. Those people who don't know statistics will be doomed.
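For anyone who would rather check the Monty Hall result than argue about it, here is a small simulation sketch in Python. It assumes the standard rules: the host always opens a door that hides no prize and is not the player's pick. The function names are made up for this example.

```python
import random

def play(switch, rng):
    """Play one round; return True if the player ends up with the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning with or without switching."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

Switching should win about two thirds of the time and staying about one third, which is the answer most people refuse to believe until they see it counted.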

Re:Show them you're the Boss (1)

KZigurs (638781) | more than 4 years ago | (#30710882)

three door problem? What about poker! ;)

Re:Show them you're the Boss (0)

Anonymous Coward | more than 4 years ago | (#30710962)

Poker has too many psychological factors. OK, here is another one - the probability matching paradox. You show your friend a biased coin (heads 70% of the time, tails 30%). Ask them to guess and write down the outcomes of the next ten tosses so that they get a maximum number of correct matches. Most people will write down a sequence such as H, T, H, H, H, T, H, T, H, H. The correct answer of course is to have heads in all 10 slots.
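The arithmetic behind that answer can be checked directly. The sketch below computes the expected number of correct guesses for a fixed written-down sequence, assuming independent tosses with a 70% chance of heads; `expected_matches` is a hypothetical helper written for this example.

```python
def expected_matches(guesses, p_heads=0.7):
    """Expected number of correct guesses for a fixed sequence of 'H'/'T' guesses."""
    # Each slot is right with probability p_heads if we wrote H, else 1 - p_heads.
    return sum(p_heads if g == "H" else 1 - p_heads for g in guesses)

matched = list("HTHHHTHTHH")   # the typical "probability matching" guess
all_heads = ["H"] * 10         # always guess the more likely outcome

print(expected_matches(matched))
print(expected_matches(all_heads))
```

The matching sequence (seven heads, three tails) expects about 5.8 correct, since 7 x 0.7 + 3 x 0.3 = 5.8, while guessing heads every time expects about 7. Each tails guess trades a 70% chance of being right for a 30% chance.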

Religious experience... (0)

Anonymous Coward | more than 4 years ago | (#30710810)

When it comes to statistics it becomes like religion - you either believe that they are telling the truth or you don't
(It's all about assuming that you are accounting for all the variables)

Famous last comments (1)

HTH NE1 (675604) | more than 4 years ago | (#30710828)

Zed Shaw writes an impassioned plea to programmers: Programmers Need To Learn Statistics Or I Will Kill Them All.

// This will never happen

Re:Famous last comments (1)

thegrassyknowl (762218) | more than 4 years ago | (#30710910)

// This will never happen
if (Zed.killed(this) == true)
{
    Universe.instance() / 0;
}

... or know when to defer to an expert (1)

jamesh (87723) | more than 4 years ago | (#30710830)

I certainly suffer from a feeling of being an expert in all fields. Deep down I guess I know I'm not, but I'd probably rather just muddle my way through it assuming I know everything there is to know. The trick is knowing when something is sufficiently out of your field that you need to defer to someone who is an expert in that field. Statistics is just one example. Certainly a little bit of knowledge in a lot of fields is a good thing, but when you have to choose between 4 years of study vs consulting someone who's already done 4 years of study, the choice should be obvious... (assuming you aren't going to spend the rest of your programming life doing heavily statistics related programming :)

For me the frustration is taking the word of an expert without understanding why and how they have arrived at that answer. I guess statistics is one field where the answer that 'feels right' is often not the answer that is right. The number of people who buy lottery tickets is a good example of that :)

stfu! (1)

AlgorithMan (937244) | more than 4 years ago | (#30710834)

I don't know how educated your colleagues are, but if they have studied computer science, then you should just shut your dumb mouth, because we learn how to analyze running times WITHOUT actually running it. Even without actually programming it, just by analyzing the problem itself. That is called "complexity theory" and (in that case) you are the one who doesn't have any clue about what you don't understand.

And go away with "tuning". You might improve running times a bit, but no little tuning hack can defeat the improvements you get from better algorithm design by an expert on algorithmics (I mean that e.g. some XOR AX, AX might speed up your program by a factor of 2, but replacing simple backtracking with techniques to keep branching vectors small gets you exponential speed ups!)

ah.... (1)

KZigurs (638781) | more than 4 years ago | (#30710840)

95% confidence in understanding statistics when applied to a business setting is often just as good as 95% confidence in actual measurements. Yes, the last 5% is the trickiest bit, but rest assured that if there is the slightest indication that a proper application is required, I won't be afraid to ask someone who knows more. It's just that it is quite rare.

For example: performance testing systems. You care way more about the degradation mode than about a statistical model of sustainable load.

This is why Science is starting to suck. (0)

Anonymous Coward | more than 4 years ago | (#30710842)

I'm really starting to fucking hate science. I finally, after 6 years, decided I should go back to university and get a stupid piece of paper on something. So I start down the two things I loved best in school: Biology and Computers. But oh it's not that simple, because to learn computers "right" I have to take algebra, calculus, statistics and physics (just in case I'm ever lost in the desert and need to build an iPhone out of sand and snake shit). To learn Biology I have to know chemistry, physics, sociology and psychology (which requires fucking statistics anyways). Stupid!

I shoulda just been like every other lemming and got a business degree so I could earn six figures a year with my thumb up my ass and my brain on a tropical island.

Logic and reason superior? (1)

TranceThrust (1391831) | more than 4 years ago | (#30710844)

Those two things are what statistics is based on in the first place as well. Evidence etcetera comes second. If you can't blow logical counterarguments away you're probably wrong and you're indeed lacking in understanding.

Statistical analysis of the summary (2, Interesting)

mmmmbeer (107215) | more than 4 years ago | (#30710846)

Let's see, we have one guy complaining about how none of his programmer coworkers understand statistics, and we have X coworkers who undoubtedly disagree with him. Since we do not know him or any of his colleagues to any meaningful degree, we have to assign equal weight to each of their opinions. Statistics then tells us there is a 1/(X+1) chance of his being right, and an X/(X+1) chance of their being right. We can assume that X >= 2 based on his ranting, therefore resulting in the odds favoring them by at least 2/3, and probably much more. Therefore it is only rational to assume they are correct.

Who needs stats? (0)

Anonymous Coward | more than 4 years ago | (#30710852)

83% of programmers know that 67% of statistics are made up on the fly anyway.

Who is Zed Shaw? (1)

Coward Anonymous (110649) | more than 4 years ago | (#30710854)

What has Zed Shaw done for humanity?

Reply from a programmer that knows no statistics (-1, Troll)

viking80 (697716) | more than 4 years ago | (#30710864)

And still has the "ignorance is bliss" and unwarranted "know it all" attitude.

Statistics is a phony science and should be thrown on the garbage heap altogether.

I will qualify this. There are a few exceptions, primarily quantum mechanics, where uncertainty is part of nature.

Except for that there are very few cases where one can apply statistics.

You probably still think I am a lunatic, but hear me out.

With human calculators it was necessary to approximate all kinds of calculations because all you could do was 0.1 IPS (instructions per second) at best. Pretty much all science became statistics, from thermodynamics, to economics to geology and meteorology. As computers became faster, more and more could be modeled accurately, and we can actually model each individual human in most population models. The same with thermodynamics. For many systems, the solution can be solved numerically, and there is no uncertainty. Quantum dynamic properties carry over to macro systems sometimes, like Bose-Einstein condensates, superconductivity etc. but most often there is no "built-in" uncertainty, and statistics is just a way to excuse incompetence, laziness or worse.

A real world example is the "medical advice" you will get before performing a procedure such as amniocentesis. The doctor will tell you there is a 0.1% chance it will have catastrophic consequences. This might mean that the hospital has one problem every 50 years, and when you dig in the data you find out that the problem happened when the hospital caught fire during the procedure. That is a manageable risk, not a probability.

Failure modes are also modeled statistically. They should not be. Bridges that fail, fail predictably. It is usually just a question of collecting some data. The same with foreclosures. Some properties in my neighborhood are in foreclosure, and in all the cases I looked into, it is not hard to see why. (Like: "I told my loan officer that I could not pay the mortgage after the low teaser rate ended. He just told me to refinance again, and get a new low teaser rate, so I signed up, and a year later he told me he could not refinance." Duh!)

So programmer, throw statistics away, and stop using that slide rule.

Have fun.

Re:Reply from a programmer that knows no statistic (1)

not-quite-rite (232445) | more than 4 years ago | (#30710924)

Best. Troll. Ever.

You know nothing about statistics, yet want to tell us how it is a phony science?

You couldn't have taken a few minutes on wolfram, or even wikipedia to even TRY to know a little of what you are talking about?

Yes, I do think you are a lunatic.

Re:Reply from a programmer that knows no statistic (1)

digitig (1056110) | more than 4 years ago | (#30710978)

Bridges that fail, fail predictably. It is usually just a question of collecting some data.

Good luck demonstrating that an aircraft instrument landing system is fit for purpose, then. Semiconductors might fail predictably when they're being observed under an electron microscope, but it's a bit harder in a hut by the side of an airfield.

Gift them a MANGA Guide (0)

Anonymous Coward | more than 4 years ago | (#30710870)

This is somewhat tangential to the discussion but I recommend the
MANGA GUIDE TO STATISTICS
http://www.amazon.com/Manga-Guide-Statistics-Shin-Takahashi/dp/1593271891

Maybe you suck? (1)

Gothmolly (148874) | more than 4 years ago | (#30710884)

You know, studying stuff in college for years doesn't make you smart. Maybe these are clever, practical people, and you're just not a good communicator?

Not just programmers (1)

famebait (450028) | more than 4 years ago | (#30710888)

Everyone needs to learn statistics. All of us who understand one iota of it are in a constant state of depression over how everyone keeps on making the most banal mistakes. But just a general gripe is not very helpful. Getting everyone to take advanced degrees in statistics is simply not going to happen. Most engineering courses include some basics, but that only helps a bit. What is needed is to teach it (to the "masses", i.e. the ones who really ought to know better) in terms of the pitfalls first, and then how to understand the workarounds. Those who have no interest in pursuing it further might still gain some insight about where to be careful, and those with potential might more easily see the point in investing in some real knowledge.

Translation (1)

Opportunist (166417) | more than 4 years ago | (#30710890)

I studied it for years, so my e-peen is bigger. It worked in school, so it has to work in reality and thus they are wrong when they tell me it does not, despite them having experience with real applications while I have not.

Ok, snideness aside. Statistics is a wonderful tool (hey, my degree is in statistics actually), but I wouldn't want to impose my metrics on real applications without first looking whether they measure anything sensible. I turned to programming because, well, it's more suitable to me. But when I look at the metrics some of my superiors designed, cringing is all I can do.

Example: A metric that measures how much code you produce. Which is in theory nice. Who creates more code has done more work. Right? From a statistician's point of view, yes. But any programmer will tell you that it's trivial to write lots of lines or few, and they will do the same work. Most programming languages support that just fine. Does the statistician know? Probably not, unless he is a programmer too.

Example: A metric that measures the amount of code you alter. Which is in theory nice. You check out, change and check in code, and who checks out and checks in more (and does alteration in between) does more work than others. Right? No. For reference, see the Wikipedia game.

The reason why programmers scoff at metrics is that we've all seen our share of really, really crappy metrics that led to less instead of more productivity because everyone started gaming the system. Had to do that, because if you actually did sensible work, you fell behind in the metric against those that gamed (i.e. those that didn't produce in the first place).

We love you, Zed (0)

Anonymous Coward | more than 4 years ago | (#30710916)

This is a classic Zed Shaw post:

- Makes one really very good point (programmers doing testing should incorporate basic statistical techniques into their tests)
- Tells everyone how smart he is, albeit emphasizing his own humility ("I've read tons of books on this subject, but I still don't know shit")
- Angrily berates stupid fucking programmers for making fucking stupid mistakes, and for not listening to him when he tries to put them fucking straight
- Claims that bad practices afflict the entire community (except him)
- Betrays secret hurt feelings ("Screw you guys, I'm going to get a burrito")
- Makes creepy and patronizing comments about women
- Informs us how tall he is (6'2")
- Descends into Daily WTF-style enumerations of fucking stupid things his former boss did

Unfortunately it is missing some elements that would make it a truly great Zed Shaw post: personal insults, bewildered complaints that he is not rich, and stories about his random good deeds.

His main point is excellent though: programmers doing testing should understand statistics, and their tests should be statistically valid, just like any other empirical test. A great point and one I have not heard discussed very much in the context of software engineering.

In other news... (1)

MoeDrippins (769977) | more than 4 years ago | (#30710940)

....Zed wants everyone to be just like him.

*Somebody* on the team might need to know stats (1)

digitig (1056110) | more than 4 years ago | (#30710950)

Unless they're actually programming statistical applications, most programmers probably don't need to know statistics. As long as somebody on the testing team does, all the programmer needs to understand is that function X sometimes fails to meet its timing spec (perhaps "often fails..." or "occasionally fails..." might add some value) or whatever. Then they know they need to do some optimisation. There's a natural human tendency to think that everybody should be doing what we're doing. In reality, they don't have to, because we're doing that; they need to be doing something else.

Superior? (1)

VinceVulpes (1717404) | more than 4 years ago | (#30710964)

"I really can't blame them since they were probably told in college that logic and reason are superior to evidence and observation." Both are superior to statistics.

lies, damned lies... (3, Funny)

yalap (1443551) | more than 4 years ago | (#30710972)

Lies, damned lies and statistics. Us programmers are too busy dealing with the first two to ever reach the third..

In my humble opinion (0)

Anonymous Coward | more than 4 years ago | (#30710980)

Only 37% of programmers need to learn statistics. Remaining 95% either know it already or don't need it at all.

Raziel2001au (0)

Anonymous Coward | more than 4 years ago | (#30710990)

I think it's a matter of what you do as a programmer... Not trying to brag here, but I get paid a lot if you compare my salary to that of others, my skills are also well-sought after where I work and project managers are always trying to drag me onto their projects. Yet, I don't know ANY statistics.

Why is this? Because I don't need it for what I do. I think generalizing things and saying all programmers must know statistics is down-right stupid. It comes down to what you do... what I do doesn't require it, so what's my incentive for even knowing about it?

I think knowing basic logic (getting your ands and ors correct the first time) and generally knowing about good design and having solid debugging skills is much more important for the average programmer. Anything more than that will push you beyond average, but knowing statistics is not necessarily the correct answer. It all comes down to what you do... That is my 2 cents anyway, take it how you want.

Burn in flames (1)

Rivalz (1431453) | more than 4 years ago | (#30710996)

"I have taken a bunch of math classes, studied statistics in grad school, learned the R language, and read tons of books on the subject. Despite all of this I'm not at all confident in my understanding of such a vast topic." I'm presented with 1 of 2 scenarios. Either he is smart and I should not bother studying statistics, because it is vast and complicated, and should only do research on an as-needed basis. Or he is stupid, and I should just ignore the guy completely.