System Admin's Unit of Production?

kdawson posted more than 6 years ago | from the counting-lines-of-shell-script dept.

Businesses 556

RailGunSally writes "I am a (strictly technical) member of a large *nix systems admin team at a Fortune 150. Our new IT Management Overlord is a hardcore bean-counter from hell. We in the trenches have been tasked with providing 'metrics' on absolutely everything from system utilization to paper clip recycling. Of course, measuring productivity is right up there at the top of the list. We're stumped as to a definition of the basic unit of productivity for a *nix admin. There is a school of thought in our group that holds that if the PHBs are simple enough to want to operate purely from pie charts and spreadsheets, then we should just graph some output from /dev/random and have done with it. I personally love the idea, but I feel the need for due diligence, so I put the question to the Slashdot community: How does one reasonably quantify admin productivity?"

Number of Cases (5, Funny)

Esion Modnar (632431) | more than 6 years ago | (#20356015)

of Jolt Cola consumed.

Measuring productivity? (4, Interesting)

haluness (219661) | more than 6 years ago | (#20356023)

How many tickets answered per day? Completed per day? /dev/random is probably the most elegant though

Re:Measuring productivity? (5, Insightful)

metlin (258108) | more than 6 years ago | (#20356199)

Trouble tickets are great, but I would recommend that you find ways to quantify all of the following in some way or the other -
  1. Stability calculated using the uptime of your systems. You could include such things as updates, patches etc to this to demonstrate that stability is not set in stone.
  2. Reliability is similar to stability, but how many production/pilot/training and other systems rely on you? How often and how well do you serve them?
  3. Response time is how fast you react to problems and how often do problems come up? (trouble tickets are a good way to quantify the latter)
  4. Network load is a good way to demonstrate how well your network is performing, if you are a *nix sysadmin handling networks.
  5. Security is how much time and effort do you spend on keeping your systems secure, both internally and externally?
  6. Efficiency would be a combination of all of the above and a way of measuring how well you achieved those things and how much time, resources and effort was expended to achieve those things.


I am sure that others could find much better ways of quantifying performance, but this is something that jumped out at me. I was part of a consulting team that was asked to improve performance in a company several years back, and they came up with something similar.

Re:Measuring productivity? (4, Insightful)

Duhavid (677874) | more than 6 years ago | (#20356309)

No. How many tickets were not opened in the first place because things just work.

Yeah, I know.

Re:Measuring productivity? (1)

SCHecklerX (229973) | more than 6 years ago | (#20356335)

Isn't answering tickets the antithesis to productivity? Productivity would be designing your systems such that you don't get tickets. Routing everything to /dev/null doesn't count :)

Re:Measuring productivity? (0)

Anonymous Coward | more than 6 years ago | (#20356397)

O.K: now all you need is a way to quantify the "complexity" of each ticket.

And the Corollary.. (1)

gerf (532474) | more than 6 years ago | (#20356413)

Figure out how much downtime costs, and how much downtime you prevent due to increased numbers of people, or special skill sets. If your webserver goes down, will it potentially hurt your business or not? Are your internal servers more important? Do printers really need 99% uptime? Everything gets weighted.

Overall, it seems like a pain in the ass to figure out, but might be quite useful in the future to justify spending for new equipment, software, or support, or even your own employment.
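The weighting described above can be sketched in a few lines of Python. This is only a rough illustration, not anyone's actual method: the service names, the per-hour cost figures, and the `downtime_cost` helper are all made up for the example, and a real shop would have to estimate its own costs.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    cost_per_hour: float  # assumed business cost of one hour of downtime
    hours_down: float     # downtime observed in the reporting period


def downtime_cost(services):
    """Return (total cost, per-service cost) with each outage weighted by impact."""
    breakdown = {s.name: s.cost_per_hour * s.hours_down for s in services}
    return sum(breakdown.values()), breakdown


# Hypothetical figures, for illustration only.
example = [
    Service("webserver", 5000.0, 0.5),
    Service("internal-db", 1200.0, 2.0),
    Service("printers", 20.0, 10.0),
]
total, per_service = downtime_cost(example)
print(total, per_service)
```

With numbers like these, ten hours of printer downtime costs far less than half an hour of webserver downtime, which is the point of weighting everything.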

The only reason you need... (1, Funny)

Ice Wewe (936718) | more than 6 years ago | (#20356029)

to know [xkcd.com]

Re:The only reason you need... (0)

Anonymous Coward | more than 6 years ago | (#20356129)

CPU Cycles (for your entire company) per Admin Hour... actually might be useful.

Easy (1, Funny)

Bloater (12932) | more than 6 years ago | (#20356031)

Write a kernel module to log the number of keystrokes.

Re:Easy (1)

sw17ch (934319) | more than 6 years ago | (#20356251)

Or... you could just grep the keyboard interrupt in /proc/interrupts ...
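For the curious, a rough sketch of what that would look like in Python. It assumes the usual Linux `/proc/interrupts` layout (a header row of CPU names, then one row per IRQ with per-CPU counts) and assumes the keyboard sits on IRQ 1, which is typical for a PS/2 keyboard on x86 but not for USB keyboards. The `interrupt_count` helper and the sample text are made up for illustration, and interrupts are not keystrokes (press and release each raise one).

```python
def interrupt_count(proc_interrupts_text, irq):
    """Sum the per-CPU counters for the row labelled `irq` (e.g. "1")."""
    lines = proc_interrupts_text.splitlines()
    if not lines:
        return 0
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == f"{irq}:":
            return sum(int(c) for c in fields[1:1 + n_cpus] if c.isdigit())
    return 0


# Made-up sample in the usual layout; on a real box, read /proc/interrupts.
SAMPLE = """\
           CPU0       CPU1
  0:         36          0   IO-APIC   2-edge      timer
  1:       1200         34   IO-APIC   1-edge      i8042
 12:        500          0   IO-APIC  12-edge      i8042
"""
print(interrupt_count(SAMPLE, "1"))
```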

Re:Easy (0)

Anonymous Coward | more than 6 years ago | (#20356453)

Troll?
Mods on crack, yet again.
Can somebody give them an intervention?

Run (0)

Anonymous Coward | more than 6 years ago | (#20356035)

Run away, as fast as you can...

Unit of production (4, Insightful)

phoenix.bam! (642635) | more than 6 years ago | (#20356037)

The best sys admins are the ones you never notice. If the productive workers in a company never see or need to talk to a sys admin it's been a productive day for the admins.

Re:Unit of production (1)

mikek2 (562884) | more than 6 years ago | (#20356201)

This has been my mantra for years. It drives my (very PH) boss crazy, but it's truer for a sysadmin than for any other job I can think of. By the nature of the beast, sysadmin metrics are awfully hard to pin down.

Re:Unit of production (5, Funny)

thomas.galvin (551471) | more than 6 years ago | (#20356217)

The best sys admins are the ones you never notice.

If the productive workers in a company never see or need to talk to a sys admin it's been a productive day for the admins.
Bingo. So here's what you do:

Leave for a week.

When you get back, after you've managed to suppress the fires and riots, fight your way to Mr. Bean's office, talk him down off of his desk, get him to put away the spear, and tell him "that's why you keep us around." If he wants it quantified, write it up as "Number of Cannibal Insurrections Suppressed Per Week (Estimated)."

Re:Unit of production (2, Insightful)

entgod (998805) | more than 6 years ago | (#20356261)

Thus, productivity can effectively be measured as 1/N, where N is the number of times someone needs the sysadmin during a month. If the sysadmin is never needed this would be 1 or 100%, and someone with an effectiveness of 100% deserves a raise.

Re:Unit of production (2, Insightful)

drmerope (771119) | more than 6 years ago | (#20356307)

Bingo. I remember very fondly joining companies and discovering that the IT stuff just works. This is the measure of a good sysadmin team.

The common state of affairs is not that everything works; it's impressive when it actually does.

Easy (1, Offtopic)

metalhed77 (250273) | more than 6 years ago | (#20356039)

Why worry? If you've got enough time to post stories to Slashdot you're clearly very efficient.

Re:Easy (2, Funny)

TubeSteak (669689) | more than 6 years ago | (#20356401)

Why worry? If you've got enough time to post stories to Slashdot you're clearly very efficient.
So what you're saying is that he should use /. posts per day as a measure of efficiency?

I wonder how much he'd have to post to get a bigger Christmas bonus.

Best non-/dev/random method: (1)

mdenham (747985) | more than 6 years ago | (#20356041)

Percentage of tickets completed. Remember, 0/0 = 100% - just reverse the math to check that one.

Re:Best non-/dev/random method: (1)

metlin (258108) | more than 6 years ago | (#20356099)

Percentage of tickets completed. Remember, 0/0 = 100% - just reverse the math to check that one.
Son, who's your math teacher?

Re:Best non-/dev/random method: (0)

Anonymous Coward | more than 6 years ago | (#20356143)

I dunno, but one thing's for sure: He works for Verizon.

Re:Best non-/dev/random method: (1)

dvice_null (981029) | more than 6 years ago | (#20356177)

Why not just use "the amount of unsolved tickets"?

Well... (2, Insightful)

djupedal (584558) | more than 6 years ago | (#20356045)

"How does one reasonably quantify admin productivity?"

If no one in the building but HR and your line report need to know your name, you're doing your job...

Other than that, it would be like a trash collector counting how many cans he emptied during the day or a wildfire firefighter how many burning bushes he chopped. If there weren't any fires or trash these people wouldn't be needed, would they?

You can't quantify SA productivity.

they dont chop burning bushes (1, Insightful)

Anonymous Coward | more than 6 years ago | (#20356187)

firefighters spend a lot of their time these days preventing fires, doing stuff like controlled burns.

maybe you can't measure 'productivity', but at some point you have to make a budget of how many people you need to hire for the season, and to do that, you have to know how many people it takes to do certain activities in a given amount of time.

which means you need to measure those things.

---

Re:they dont chop burning bushes (3, Insightful)

irc.goatse.cx troll (593289) | more than 6 years ago | (#20356299)

The problem is the numbers don't look good. To quantify what you're looking for you'd want "number of hours spent idle", i.e. if a sysadmin did his job well and has everything running smoothly, how many hours does he have with nothing needing to be done?

Once any manager or other authority type sees that number, though, rather than seeing you did a good job at keeping things reliable, they'll see you as lazy and assign work you shouldn't be doing (other people's jobs).

Really just about anything other than data entry is hard to quantify in the computer field. Someone suggested trouble tickets, but there's a huge difference between a ticket that requires you to restart apache and one that requires you to strace half your system to debug, and raw ticket numbers don't tell you that.

On the same note, lines of code mean nothing to actual programming, nor do "functions per day" or anything similar as again, you can't quantify the effort required in an easy line vs hard line. Is it a simple debug print or core logic you had to scratch out on a whiteboard to keep sane?

Re:Well... (0)

Anonymous Coward | more than 6 years ago | (#20356253)

My first thought was along these lines...

It's always hard to explain why problem A took 4 hours to solve simply because the error reported was non-helpful, but similar problem B took 5 minutes because you've seen it before.

They should do an experiment. Send all the IT staff on a simultaneous one or two week vacation. When they all come back and clean up the mess, management will have a better idea of their value, and the range of things they do and impact. And if the users were ever forced to at least attempt to look at or look up the issue themselves, and see how difficult it can sometimes be, they as well would have more appreciation.

haha, coincidentally, my CAPTCHA is "informed".

Re:Well... (3, Interesting)

SplatMan_DK (1035528) | more than 6 years ago | (#20356379)

You can't quantify SA productivity.
I respectfully disagree.

You can evaluate how many users the SA's systems serve, how many systems the SA maintains, and how much data throughput all these users/systems generate.

A confused Microsoft-SA running in circles around an Exchange server all day in order to serve 200 users is not "efficient" compared to a Linux-SA running an MTA which services 25,000 users (with better response times).

On the other hand, a non-skilled Linux-SA who is fiddling with a SAMBA server in order to maintain 200 users with Windows clients is not very "efficient" compared to a skilled Microsoft-SA with a well configured AD.

Of course you can measure SA efficiency. And there is nothing bad about it. In most cases it is even a benefit for the *nix admins.

:-)

- Jesper

Re:Well... (3, Insightful)

Eponymous Bastard (1143615) | more than 6 years ago | (#20356457)

If no one in the building but HR and your line report need to know your name, you're doing your job...
And that's easy to quantify. Number of outages per week, average downtime, etc. Then report this by service (the email server was up 100% of the time, but the server with lousy intranet app X crashed twice, they need to rewrite that).

Of course, it's not productivity, but it's a measurement of the quality of service. Combine that with other indicators like users served, requests serviced, emails delivered, etc. and you can actually chart improvements in "productivity".

Even something like average time to solve a ticket or bring up a server is a useful indicator. Granted, it'll vary depending on what the failure is, but over time it should average out to a useful picture.
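A per-service report along these lines is easy to sketch in Python. Everything here is an assumption for illustration: the `service_report` helper, the service names, and the outage log are invented, and the input format (a list of service/hours pairs) is just one convenient choice.

```python
def service_report(services, outages, period_hours):
    """Summarise outages per service over a reporting period.

    services: names of all services, so ones with no outages still appear.
    outages: list of (service_name, hours_down) tuples.
    """
    down_by_service = {name: [] for name in services}
    for name, hours in outages:
        down_by_service.setdefault(name, []).append(hours)

    report = {}
    for name, durations in down_by_service.items():
        total_down = sum(durations)
        report[name] = {
            "outages": len(durations),
            "avg_downtime_hours": total_down / len(durations) if durations else 0.0,
            "availability_pct": 100.0 * (period_hours - total_down) / period_hours,
        }
    return report


# Hypothetical week (168 hours): email never went down, app X crashed twice.
week = service_report(
    services=["email", "intranet-app-x"],
    outages=[("intranet-app-x", 1.5), ("intranet-app-x", 2.5)],
    period_hours=168,
)
for name, row in week.items():
    print(name, row)
```

Reporting by service keeps the blame where it belongs: the mail server shows 100% while the flaky app carries its own outages.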

impossible? (2, Insightful)

ragahast (879945) | more than 6 years ago | (#20356049)

It's easy to quantify /my/ productivity as a support tech (at the U of CA) in number of tickets resolved per shift. But sysadmins have a number of duties which they are performing /continuously/, so how can you quantify that?

Re:impossible? (0)

DarkIye (875062) | more than 6 years ago | (#20356221)

Articlesearch 0.1:

Returned 0 results for 'my'
Returned 0 results for 'continuously'

Time to find another job (2, Insightful)

ZWithaPGGB (608529) | more than 6 years ago | (#20356055)

Since the real proof of actual productivity for network admins is negative: nothing goes wrong (no trouble tickets). Also, the PHB will get their wish: No one to pay is infinite productivity (measured as output per $ spent).

Right on, Productivity measurement -- bad idea (1)

Uksi (68751) | more than 6 years ago | (#20356443)

Productivity for a sysadmin is almost impossible to measure. Whatever metrics you come up with are likely to only capture very specific types of workflows or problems: when the set of issues changes (as it most certainly will), the metrics become useless. It's easy to measure the productivity of a knowledge worker, a little harder that of a sales person, but still doable. But for a knowledge worker like a sysadmin, it is impossible.

You should convince your boss that you will not measure productivity; if that fails, start looking for a new job.

Here's what's gonna happen if you do come up with some productivity metrics. Your boss will probably give incentives (bonus) or disincentives (lose job) based on the productivity. Things will be dandy for a bit until the problems you deal with start changing in nature, and your productivity metrics will start going down. At this point, you will have a choice between actually doing what's good for the company or trying to maintain your productivity metric at the expense of the company. With a boss like yours, the former spells bonus loss or job loss. Do you want to be in this situation? If not, don't measure productivity.

Unit of productivity (5, Informative)

orionpi (318587) | more than 6 years ago | (#20356059)

Unit of Productivity = 1 / (hours of down time)

They are paying you to keep bad things from happening.

Re:Unit of productivity - Mod parent up (1)

witte (681163) | more than 6 years ago | (#20356155)

(Out of mod points.)

You are absolutely correct imho.
Now, to explain this to a PHB... give them some downtime first ?

Putting beancounters in IT mgmt can be a prelude to axing of IT jobs. Tread lightly!

Bad metric (1)

Tony (765) | more than 6 years ago | (#20356169)

Unit of Productivity = 1 / (hours of down time)

Bad choice. In a well run shop, you'd get a "division by zero" error.
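A quick Python sketch of the problem, and of one possible way around it. The `productivity` function is the formula from the parent comment with the zero case made explicit; the `availability` function is a swapped-in alternative (fraction of the period spent up), not something the parent proposed. Both names are invented for the example.

```python
import math


def productivity(hours_down):
    """1 / (hours of downtime); a perfect period is treated as infinite."""
    if hours_down < 0:
        raise ValueError("downtime cannot be negative")
    if hours_down == 0:
        return math.inf  # the "division by zero" case in a well-run shop
    return 1.0 / hours_down


def availability(hours_down, period_hours):
    """Bounded alternative: fraction of the period the system was up."""
    return 1.0 - hours_down / period_hours


print(productivity(4))       # four hours down in the period
print(productivity(0))       # no downtime at all
print(availability(0, 720))  # same perfect period, bounded metric
```

The bounded version tops out at 1.0 instead of blowing up, which is easier to put on a pie chart.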

hmmm (2, Insightful)

nomadic (141991) | more than 6 years ago | (#20356063)

Do uptime. Unless your team has serious problems, those numbers should always look good. If you do any sort of work in response to in-house or outside tech support requests, you can measure how long it takes to resolve issues.

Re:hmmm (0)

Anonymous Coward | more than 6 years ago | (#20356135)

Great suggestions. I would also add a metric for cost of resolution. A well-operating department works to minimize the downtime, resolution time and resolution costs, and balance these with politics.

Re:hmmm (0)

Anonymous Coward | more than 6 years ago | (#20356139)

Problem with doing uptime is that that same anal tightwadded PHB is going to start cutting corners on software, so even if your uptime seems good now, it might not when you've got some piece of software not interoperating properly and interfering with your ability to do your job. Of course going with that over whatever you'd recommend, even if more expensive, should be the PHB's problem, but we all know that doesn't happen in the real world.

Some guy who's never had a job of any kind as a sysadmin, even though lots of dumbasses think he's knowledgeable.

And just to be amusing, my captcha was 'interned', anyone think they can help out with that? :P

Re:hmmm (1)

Bender0x7D1 (536254) | more than 6 years ago | (#20356245)

Make sure you put the correct spin on the graphs. You would hate to be called on your low number of support requests completed just because you managed to keep 100% uptime on everything. Also, try and keep it all relative - for example 94% of all support requests were completed within 1 hour. It doesn't matter if you got 16 or 160 requests, it still looks like a good number. Maybe have a "fall back" category such as %age of support requests taking more than 6 hours to resolve - works great even if you don't have any requests. Of course, if you have a lot of requests like that either use a longer time period, drop the category, or create a new category like "long-term roadmap support request" so you don't include the number in your regular data.

Also, you need to make sure that your management lives in the real world. For example, if you do a great job, and everything is running smoothly, they might assume you don't need all of the people in your department. When I was in development, teams would be let go based on how low their bug list was. If it was low, they didn't have as much work to do, so they got the boot instead of the people who had done a bad job of developing their module or feature.
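The relative figures described above (share resolved within an hour, share taking more than six hours) could be computed roughly like this. It is only a sketch: the function names, the thresholds, and the sample resolution times are assumptions for illustration, and the choice to report 0% over the limit when there are no requests simply mirrors the "works great even if you don't have any requests" remark.

```python
def pct_within(resolution_hours, threshold_hours):
    """Percentage of requests resolved within the threshold, or None if no requests."""
    if not resolution_hours:
        return None  # no requests: nothing meaningful to report
    hits = sum(1 for h in resolution_hours if h <= threshold_hours)
    return 100.0 * hits / len(resolution_hours)


def pct_over(resolution_hours, threshold_hours):
    """Percentage of requests that took longer than the threshold (0.0 if none)."""
    if not resolution_hours:
        return 0.0  # the "fall back" category still looks fine with zero requests
    late = sum(1 for h in resolution_hours if h > threshold_hours)
    return 100.0 * late / len(resolution_hours)


# Hypothetical resolution times, in hours.
times = [0.5, 0.2, 1.0, 3.0]
print(pct_within(times, 1.0), pct_over(times, 6.0))
```

Because both numbers are percentages, they read the same whether the team handled 16 requests or 160.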

By doing quantifiable stuff (1)

DingerX (847589) | more than 6 years ago | (#20356067)

A) Nagging emails
B) Logging every OS update
C) Supervising Patch Thursdays
D) recording the percentage increase in spam email intercepted (This is your business metrics friend, since that number will never go down)
E) Number of meetings with employees about improper email use.
F) number of Company-wide software-license-compliance surveys, and number of improper installations detected.
G) total number of top executive emails logged, with copies sent to several geographically distinct locations.

If they want metrics, give 'em metrics. And let them know that metrics will only encourage you to be more of an a$$hole.

Hey, dumba$$ (1)

DingerX (847589) | more than 6 years ago | (#20356101)

Those are patch TUESDAYS.

(Now, will I get flamebait for insulting myself?)

Re:Hey, dumba$$ (2, Funny)

markdavis (642305) | more than 6 years ago | (#20356235)

He is a *nix sysadmin... there are no regular patch THURSDAYS *OR* TUESDAYS!

Re:By doing quantifiable stuff (1)

tempest69 (572798) | more than 6 years ago | (#20356227)

Nice list.. it's nice to see a good counter for the quantifiers..

I've found that the quantifiables are number of Servers, number of users, number of update requests, and number of unexpected downtimes..

If the company is providing the right support the number of unexpected downtimes gets close to zero..

The real crux is.. WHAT is uptime worth... it varies from company to company.. for google and microsoft the number is amazing.. for billsmitherspersonalwebpage downtime might be less noticed.

Still, having metrics that seem evil might just shut up the bean counter.

Storm

Re:By doing quantifiable stuff (1)

Nexx (75873) | more than 6 years ago | (#20356389)

The real crux is.. WHAT is uptime worth... it varies from company to company.. for google and microsoft the number is amazing.. for billsmitherspersonalwebpage downtime might be less noticed.

It all depends on what these systems are doing, too. My clients have trading apps whose downtime is a measurable quantity. Simply put, if their clients cannot trade, everyone loses money.

However, how much does their corporate website being down cost? email servers? Access control? Those are less quantifiable.

Depends on your bean counter's objectives.... (1)

3seas (184403) | more than 6 years ago | (#20356073)

Does he want to reduce the department's employee count? Or does he want to sustain its employee count?

Maybe he just needs to hire more people and task them with the metrics goals.

One word (1)

anom (809433) | more than 6 years ago | (#20356079)

Uptime.

Easy -- send out resumes (1)

localroger (258128) | more than 6 years ago | (#20356089)

The sooner you find a job that doesn't suck so much, the more productive you probably were.

I suspect the closest model... (1)

cmowire (254489) | more than 6 years ago | (#20356095)

I suspect the closest model with math behind it is a futures contract.

What a good sysadmin is supposed to do is make sure that all of the things that would threaten the relevant simple metrics -- capacity, uptime, etc -- are taken care of ahead of time.

Clearly, it is more desirable to add servers a month or two before they are needed instead of after the server farm becomes unusable.

So what you want, as your metric, is to track the future value of capacity and the future value of uptime.

Got a time machine? The best we've managed is Black-Scholes and that's not going to work for this situation. :P

Parent makes no sense. (1)

pafein (2979) | more than 6 years ago | (#20356407)

As a former futures & options trader and current programmer & occasional admin:

That's the most nonsensical thing I've read this week & I've spent a bunch of time reading comments on reddit.

It's not even wrong. Black-Scholes is used for option valuation, not futures. WTF either has to do with server metrics is beyond me.

uptime (1)

immerrath (607098) | more than 6 years ago | (#20356105)

uptime, for various properties of the system.

Slack (1)

Saxerman (253676) | more than 6 years ago | (#20356111)

The basic unit of measure for any good admin is, of course, slack. You never notice when an admin is doing a good job. You only notice when they're not.

Units (5, Funny)

ettlz (639203) | more than 6 years ago | (#20356117)

How does one reasonably quantify admin productivity?
In admons.

Uptime. (1)

B5_geek (638928) | more than 6 years ago | (#20356127)

Base your (metric)work on how long things are NOT broken.

i.e. Well I was 92% effective yesterday because I had to replace a switch.

That is arse backwards (5, Insightful)

JosefAssad (1138611) | more than 6 years ago | (#20356133)

You aren't building automobiles or painting teapots. You are a support function and not a line function.

You should have business plan objectives. These things are usually annual; there can be longer strategic objectives. If the person who set these things did it right, they should be measurable.

What I'm trying to say is, if you're banging your head against the wall trying to figure out how your performance should be measured, your higher up didn't set your objectives correctly.

This doesn't apply anywhere and everywhere. When the organization is in the business of IT itself, you might be measured differently since you'd then be contributing directly to the organization's core business. But from the description provided, it sounds like you're not.

Re:That is arse backwards (4, Insightful)

adrianmonk (890071) | more than 6 years ago | (#20356455)

You aren't building automobiles or painting teapots. You are a support function and not a line function.

That is the best answer I've seen so far in this discussion. It most clearly illustrates that the question is framed wrong.

There is nothing wrong with wanting to monitor and even quantify the value that an employee brings to the organization, but contrasting support function vs. line function perfectly illustrates the key point here: production is not the only kind of value that an employee can add to an organization.

I wonder if a way of communicating this might be to make an analogy to something a financial person can relate to. You can use money to make several different types of purchases: you can buy durable goods, you can buy consumables, and you can buy more abstract things like insurance or legal advice. Don't take the analogy too literally, but system administration is like insurance or legal advice in that the value you provide is stuff like protection, security, planning, design, and order.

I think if this were me, I would start by providing an outline of the responsibilities of the system administrator and the value that a system admin provides to the organization. This does include certain deliverables (like physical installation of hardware in machine rooms, installation of software, working and configured systems, documentation, answers to technical questions, training presentations, and code for scripts written to automate tasks), but it also includes a lot of work that doesn't have a deliverable (like diagnosing a problem and tracking down a patch from a vendor, or even convincing a vendor to supply a patch). It might be helpful to break the job down into types and subtypes of work being done and very rough estimates of the proportion of time being spent at each.

So maybe the best plan is to educate the higher-ups about what the job really entails. It's quite possible they don't understand much about it, and some increased visibility into what is really going on could help with their understanding and thus their comfort level with paying the salaries of the people who do it.

Also, there are deliverables that can be quantified. Creating user accounts, for example, has to be done repeatedly, and it takes about the same amount of time every time it happens. Auto mechanics deal with a similar situation and the industry has developed a list of tasks (such as replacing a fuel pump or brake pads) and standard times required to accomplish them. The computer world changes so quickly it might be hard to accomplish that, especially without industry support, but it seems possible to quantify some of what a system administrator does, because some of it is standard stuff.
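As a minimal sketch of the flat-rate idea, here is how a table of standard times might be applied to a month's task counts in Python. The task names and minute values are invented for illustration (no such industry table exists for sysadmin work, as noted above), and `standard_hours` is a hypothetical helper. Tasks without a standard time are reported separately rather than guessed at.

```python
# Hypothetical standard times in minutes; a real table would have to be
# measured locally.
STANDARD_MINUTES = {
    "create_user_account": 10,
    "reset_password": 5,
    "install_package": 20,
}


def standard_hours(task_counts, standard_minutes=STANDARD_MINUTES):
    """Return (hours of standard work, list of tasks with no standard time)."""
    unknown = [task for task in task_counts if task not in standard_minutes]
    minutes = sum(
        standard_minutes[task] * count
        for task, count in task_counts.items()
        if task in standard_minutes
    )
    return minutes / 60.0, unknown


# Made-up month: routine tasks plus one piece of work with no deliverable.
hours, unquantified = standard_hours(
    {"create_user_account": 12, "reset_password": 24, "debug_vendor_patch": 1}
)
print(hours, unquantified)
```

The second return value is the useful part for the conversation with management: it lists the work that a standard-times table cannot capture.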

This is easy (1)

iamdrscience (541136) | more than 6 years ago | (#20356141)

The standard unit for quantifying productivity in IT is generally bottles/cans of Mountain Dew. This varies from "cups of coffee" the standard metric used in many other fields, but you can easily convert between the two, so it shouldn't be any trouble for your manager to share your productivity figures to his bosses in a manner they are more familiar with. To convert cans to cups of coffee, multiply by two fifths. To convert bottles to cups of coffee, multiply by two thirds.

System performance is easy enough... (1)

Glowing Fish (155236) | more than 6 years ago | (#20356145)

To do system performance, just take pages/data/e-Mail/whatever served from a server, and divide it by the operational cost of the server.

But how do you measure the value of the administrator?
I guess I would take the value of what the system would do by itself, if it was just "left running", and divide it by that cost. (You would end up with less cost, but also less productivity, of course).
And then, take it with say, a bare bones set up, with only a few, poorly trained administrators, and show what the cost per page/e-Mail delivery would be.
And then show your cost as it is now, with you and your sterling team of administrators.
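The comparison above boils down to simple division, sketched here in Python. The helper names are invented, and the figure in the usage line is made up; the unattended and bare-bones scenarios would each be run through the same two functions with their own estimated volumes and costs.

```python
def units_per_dollar(units_served, operating_cost):
    """Pages, emails, or whatever was served, per dollar of operating cost."""
    if operating_cost <= 0:
        raise ValueError("operating cost must be positive")
    return units_served / operating_cost


def cost_per_unit(units_served, operating_cost):
    """Operating cost per page or email delivered."""
    if units_served <= 0:
        raise ValueError("nothing was served, so cost per unit is undefined")
    return operating_cost / units_served


# Hypothetical month: one million emails delivered for $2,000 of operating cost.
print(units_per_dollar(1_000_000, 2000.0))
print(cost_per_unit(1_000_000, 2000.0))
```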

Whatever you do... (0)

Anonymous Coward | more than 6 years ago | (#20356147)

... try not to think of your Overlord's salary being ten times yours. Plus bonuses.

Indexes for users and servers maintained (1)

SplatMan_DK (1035528) | more than 6 years ago | (#20356151)

I suggest you take the task very seriously and try to find indexes on the internet from Gartner, Accenture, Cap Gemini, KPMG, or other major analyst/business consulting companies. Just using their names and logos in your reports/suggestions will get you a long way!

After you have collected some reports, try to think of ways in which you can demonstrate your effectiveness. If you are in fact not very efficient, the boss is right in wanting that information - right? If you are in fact very efficient, you should have no problem with him discovering that - right?

You could make performance indexes for the number of users you serve (for each type of service like mail, security, infrastructure, etc.) and for the number of servers you maintain. With any luck you should be able to prove that you serve a high number of users, a large amount of data, and maintain a high number of servers... compared to whatever general indexes you dug out of the reports we started out with.

:-)

- Jesper

KW productivity (1)

cb_abq (894167) | more than 6 years ago | (#20356157)

As a technical member of a highly specialized profession, you alone may be able to quantify your output, particularly if you work independently on projects. Measuring the productivity of knowledge workers for the purposes of assessing cost and improving efficiency has been a management problem since Peter Drucker began writing about knowledge workers and the knowledge economy in the 1950's. Your counter of beans should know that if he or she has had any formal management training or has picked up a book. This is also one of the problems that has plagued the information technology field, due to the difficulty in determining return-on-investment and setting budgets accordingly. Your manager has a challenge ahead if he or she is determined to attach a prod-o-meter to your seat. One of the methods commonly used is trouble-call tracking systems, measuring statistics such as calls closed and average time to closure. It is unreasonable at best and destructive at worst to assess performance of project-oriented workers using these metrics. YMMV.

Maintain activity logs (0)

Anonymous Coward | more than 6 years ago | (#20356159)

It's a bit of a nuisance, but maintaining activity logs is probably the best defence. If you're really good, there won't be much trouble ticket activity which could lead a PHB to think you're not doing anything and he can score points by getting rid of staff.

There are lots of automated tools for tracking activity, but the more important part will be explaining the need for time to think about how to solve nascent problems before they manifest themselves as user calls.

Good luck!

rhb

Wipro... (1)

cyberbob2010 (312049) | more than 6 years ago | (#20356161)

You can't. So prepare to teach Wipro how to do your job.

I work for a Fortune 15 pharmaceuticals company. You don't keep application management and support around for the good times.
You keep them around for the bad times, and that's a growing pain all IT depts are facing as we are all pushed under the magnifying glass. Gone are the days of "what? IT? throw money at it!!!" out of fear. Here are the days of "global IT services". If they can find a way to do it cheaper, they will, and they will do it at the cost of quality.

Still, the company I work for is slowly beginning to realize the importance of having someone who speaks passable English available during a crisis situation and with any luck your company will realize that before shipping your job off to Bangalore.

Unit of Productivity (0)

Anonymous Coward | more than 6 years ago | (#20356173)

# of furious users outside the manager's door after you leave.

Push the question back (1)

Flying pig (925874) | more than 6 years ago | (#20356181)

Blind the bastards with science. Submit a load of white papers on things like function point analysis with a covering note explaining that this may be appropriate for measuring productivity in shell scripts. Ask for a really good job tracking and logging system that will cost a fortune to deploy, then estimate the resources required to implement it and feed back an estimate of the lost productivity while it is going in, plus the time to administer it. Find out what is the most expensive and complex server management system you can imagine, and propose it as a long term cost reduction.

Always be helpful and make co-operative proposals. Suggest that users who lose passwords should be docked pay according to the time to fix, thus ensuring that user incompetence is not perceived as an IT cost (I know this isn't *nix admin, but it's part of the general pattern).

Produce impressive graphs showing the tradeoff between increasing the numbers of servers per admin, and the expected downtime, and suggest that increased downtime is a small price to pay for increased server/admin ratio.

I'm sure you can see the picture developing. Every proposal you make to measure productivity needs to look superficially good but have a huge cost and a huge risk. And the best thing is, you don't need to lie. Because that is the nature of reality.

True story. Many years ago I was asked to investigate a shop floor wide downtime monitoring system. As an experiment, we tried paper logging of downtime events to see what the expected traffic would be, what events needed to be handled, probable traffic etc. The paper logging showed that an entire floor of machinery had three quarters of its downtime attributed to a single design fault in the machines. The maintenance people knew this, but were keeping quiet because of all the overtime they were earning. Getting the manufacturer to fix the fault was far cheaper and quicker than putting in the monitoring system. Moral: in parallel find a few really good cost reduction proposals. You are bound to have some if you are any good. Then wave them under the nose of the bean counter in parallel with the other stuff.

Servers/Admin (1)

tphb (181551) | more than 6 years ago | (#20356183)

The standard unit of measure is # servers supported per administrator. Now that doesn't help identify whether a specific admin is any good, but it does provide a comparison cost.

Guessing what management wants to know is harder (2, Insightful)

russg (64596) | more than 6 years ago | (#20356191)

Systems Administration falls into several categories.
Projects, Service requests, Patching, and user satisfaction are a few.

Once you have an idea of what you do, define some SLAs with your customers and the metrics are easy from there.

Now compare your defined SLAs to the following.

Metrics:
Time to ticket close?
Were the requesters satisfied?
Projects completed in the expected time?
Resource allocation is at what percentage?
Don't forget to measure your ongoing education and professional development. How much should you get, are you getting it?
Patch schedule being met?
Availability metrics.

Resource loads on the systems are easy and provide management nice graphs, plus they can be automated.

My systems roll all this information up and e-mail it for me.
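For anyone wondering what such a roll-up might look like, here is a minimal sketch in Python. The ticket records, the 8-hour SLA target and the `rollup` function are all made up for illustration; a real version would pull from whatever ticket system you run.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical ticket records: (opened, closed, requester_satisfied)
tickets = [
    (datetime(2007, 8, 20, 9, 0), datetime(2007, 8, 20, 11, 0), True),
    (datetime(2007, 8, 20, 13, 0), datetime(2007, 8, 21, 13, 0), False),
    (datetime(2007, 8, 21, 10, 0), datetime(2007, 8, 21, 14, 0), True),
]

SLA = timedelta(hours=8)  # assumed time-to-close target, not from the post


def rollup(tickets, sla):
    """Summarize time to close, SLA compliance and requester satisfaction."""
    hours = [(closed - opened).total_seconds() / 3600
             for opened, closed, _ in tickets]
    sla_hours = sla.total_seconds() / 3600
    return {
        "mean_hours_to_close": mean(hours),
        "pct_within_sla": 100 * sum(h <= sla_hours for h in hours) / len(hours),
        "pct_satisfied": 100 * sum(ok for _, _, ok in tickets) / len(tickets),
    }


for name, value in rollup(tickets, SLA).items():
    print(f"{name}: {value:.1f}")
```

Run it from cron and mail the output, and the weekly report more or less writes itself.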

While none of this is really important to us, the management teams operate almost entirely on this data. Take this as an opportunity. In some shops I've worked, management defines the metrics and they mostly are irrelevant. In your case it seems you have the rope to hang yourself so take care to present the data that is important and will help you meet your goals. As always, a good admin will automate the task but not tell anyone. :)

--russ

Get the big numbers and show importance (1)

pol-pot (101141) | more than 6 years ago | (#20356193)

What is important with CxOs is that they need to justify their spending. So if they have a unit of sysadmins who do something, but they do not know what, they will see it as an opportunity to cut your budget and save money.

So you have to show him some figures about the size of your systems. Are you running important infrastructure for the company? If so you should give him estimates of:

  • How much "money" is the data in these systems worth?
  • What is the cost per hour or day if the system is down? $1,000 or $1,000,000?
  • What if data is lost? How much can be lost? $1,000 or $1,000,000 or $1,000,000,000?
  • How reliable are your systems? Do they run with 99.99% uptime?
  • How much support do they generate? Is it possible to reduce the support need by improving the system?
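To put numbers on the uptime and cost-per-hour questions, a rough sketch in Python follows. The function names are invented for this example, and the $1,000 per hour figure is just the low end of the range mentioned above, not a real estimate.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted by an availability target over a period."""
    return period_hours * (1 - availability_pct / 100)


def downtime_cost(hours_down, cost_per_hour):
    """Estimated loss for an outage, given an assumed hourly cost."""
    return hours_down * cost_per_hour


budget = allowed_downtime_hours(99.99)
print(f"99.99% uptime allows {budget * 60:.2f} minutes of downtime per year")
print(f"At $1,000/hour that budget is worth ${downtime_cost(budget, 1000):,.0f}")
```

Swap in your own availability target and hourly cost; the point is that the bean counter gets a dollar figure instead of a percentage.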

Remember that if you give him sensible info about these things, you might get a better budget and more arm-space.

Must be an equivalent of Windows SysAdmins. (0)

Anonymous Coward | more than 6 years ago | (#20356197)

Just check the statistics on Spider Solitaire. Let's see you work your Six Sigma black belt on that, bitch.

good productivity == no work (1)

p0 (740290) | more than 6 years ago | (#20356203)

the best admin can (theoretically) make the servers run by themselves forever. such an admin will have no daily work to measure productivity with.

Wasted Time (1, Funny)

Anonymous Coward | more than 6 years ago | (#20356231)

Just make a Pie Chart and have 66% of the pie say "Time Wasted figuring out how productive we are."

The number of critical problems... (1)

s1234d (542588) | more than 6 years ago | (#20356247)

...that you didn't have to fix this week.

Fun with labor models (1)

MonkeyBot (545313) | more than 6 years ago | (#20356249)

You should view this as an opportunity to be able to rationalize extra help in the future! Let me give you an example: I'm an industrial engineer for a large 3PL. I put together labor models for warehousing ops on a day to day basis, and I do it by breaking down individual activities that the team will perform (staging pallets, filling out purchase orders, etc.). I then assign hourly productivity rates to them (how many units I handle per hour), and assume that my employees will really only work about 6.75 hours per day, after breaks and interaction with others, using the restroom, etcetera.

Now, I used to be a *nix admin for a much smaller company back in the day, and most of my requests would come through email. I imagine you guys probably get your requests to perform different tasks through some sort of fancy IT system if you work for a larger company. So, ask yourself what major categories these tasks break down into.

For example, if there are two main categories of work you do, say, 1.) fixing stupid errors and 2.) setting up things for people to break, you estimate how long it takes to handle a request from each of these major categories, and translate that into how many you could handle a week (or other time period), keeping in mind you only have about 6.75 productive hours a day to work. Say a single employee could handle 10 stupid errors or 5 set ups per week, assuming they were doing nothing else (be conservative when setting these up so you look good when you surpass the productivity figures). Now, when you are getting in 40 requests for stupid errors per week, and you only have 3 team members, you have something to bitch about!

Your boss is basically telling you that he doesn't know what you really do, and so you can pretty much define how your group should be operating - HAVE FUN WITH IT!!
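The arithmetic behind that example can be sketched in a few lines of Python. The per-week capacities (10 stupid errors, 5 set ups) and the 6.75 productive hours come from the example above; the function names and the 3.375 hours-per-task figure are assumptions added purely for illustration.

```python
import math

PRODUCTIVE_HOURS_PER_WEEK = 6.75 * 5  # 6.75 productive hours/day, 5 days

# Requests one person can handle per week if doing nothing else
CAPACITY_PER_WEEK = {"stupid_error": 10, "setup": 5}


def weekly_capacity(hours_per_task):
    """Tasks one person can finish per week at a given time per task."""
    return int(PRODUCTIVE_HOURS_PER_WEEK // hours_per_task)


def person_weeks_of_work(weekly_requests, capacity=CAPACITY_PER_WEEK):
    """Total workload, in full-time people, implied by the request volume."""
    return sum(count / capacity[kind] for kind, count in weekly_requests.items())


def shortfall(weekly_requests, team_size, capacity=CAPACITY_PER_WEEK):
    """How many additional people the workload calls for (never negative)."""
    needed = math.ceil(person_weeks_of_work(weekly_requests, capacity))
    return max(0, needed - team_size)


demand = {"stupid_error": 40, "setup": 0}
print("People needed:", person_weeks_of_work(demand))
print("Short by:", shortfall(demand, team_size=3))
```

With 40 stupid errors a week and a team of 3, the model says you are one head short before a single set up request comes in.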

Here's how. (1)

bmo (77928) | more than 6 years ago | (#20356259)

"How does one reasonably quantify admin productivity?"

Go see "Doctor Summeroff"

High blood pressure from dealing with stupid people should do it. A couple of months might be enough.

--
BMO

Time tracking and resource utilization (1)

www.2cups.com (642654) | more than 6 years ago | (#20356263)

I sense in your post that you are taken aback by this request to account for your time. Your response is quite common, though I would guess a little reactive. Many IT departments are implementing project management programs as prescribed by ITIL standards. The foundation of these project management programs is having a clear understanding of how much time your resources are spending on tasks and projects.

The purpose of collecting this information is to create a baseline utilization rate for the teams you manage. Once you have collected this baseline, you can observe, for example, that sysadmin A spends 60% of his time on tickets, 20% of his time on strategic projects, and 20% of his time on general office administration tasks. Management can also observe (if you code your hours) when sysadmin B is spending 60% of his time on tickets, 60% on projects and 20% on general administration. If you notice, this equals 140% utilization (a 56-hour work week, assuming a 40-hour baseline).

This information is normally used (by project managers) to then do a resource leveling. This is when you take, say, the 1000 man hours necessary to complete a new project (say migrating from Solaris to Redhat) and distribute it to engineers who have available cycles, or, if there are no available engineers, tell upper management that either A - the project needs to be slid out into the future when people have more time, or B - management needs to hire more engineers to complete the project.

If you read between the lines in this, there is a gotcha. The sysadmin that improperly codes his time (fills out 40 hours of tickets at the end of the week, instead of accurately tracking his time) can short change himself. Even worse, if you refuse to code your time, management has no metrics to justify hiring new engineers, and you get stuck working nights and weekends because management lacks proper information about your time.
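As a rough illustration of the utilization arithmetic, here is a small Python sketch. The 40-hour standard week and the function names are assumptions; the percentages are the sysadmin A and B figures from above.

```python
STANDARD_WEEK_HOURS = 40  # assumed baseline work week


def utilization(allocations, standard_week_hours=STANDARD_WEEK_HOURS):
    """Return (total percent utilization, implied hours worked per week)."""
    total_pct = sum(allocations.values())
    return total_pct, standard_week_hours * total_pct / 100


def spare_hours(allocations, standard_week_hours=STANDARD_WEEK_HOURS):
    """Weekly hours still available for new project work (never negative)."""
    _, hours = utilization(allocations, standard_week_hours)
    return max(0.0, standard_week_hours - hours)


sysadmin_a = {"tickets": 60, "projects": 20, "admin": 20}
sysadmin_b = {"tickets": 60, "projects": 60, "admin": 20}

for name, alloc in (("A", sysadmin_a), ("B", sysadmin_b)):
    pct, hours = utilization(alloc)
    print(f"sysadmin {name}: {pct}% utilized, {hours:.0f} h/week, "
          f"{spare_hours(alloc):.0f} h spare")
```

Neither A nor B has spare cycles here, which is exactly the kind of evidence a project manager needs before asking for more engineers or a later deadline.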

I suggest that you hop on the wagon of tracking your time. If you read between the lines you will find that this is your chance to prove to management how hard you actually are working. You can also use this combined with other data to argue for pay increases (point out that your ratio of tickets closed to time spent is much better than your associates', etc).

Colin McNamara
CCIE #18233

Remember to make suggestions for optimization (1)

SplatMan_DK (1035528) | more than 6 years ago | (#20356265)

Don't forget to make (honest) suggestions on how to improve efficiency. Even though the new boss might not go for your suggestions, it will prove to him that you are a valuable worker who is always seeking the most cost-efficient and optimum solution.

Could you perhaps benefit from server virtualization, which could potentially consolidate a number of servers and thereby reduce power and cooling costs? Or would it be possible to eliminate some niche systems which have high maintenance costs and which are really not used? Could key systems benefit greatly from upgrades even though the upgrade is somewhat costly? (New products are presumably better/smarter.) What were (in your opinion) the five biggest oversights of his predecessor?

Think of this guy as a career opportunity - instead of a problem you have to work around.

:-)

- Jesper

Simple question... simple answer (4, Insightful)

RazorJ_2000 (164431) | more than 6 years ago | (#20356281)

What you need to do is contact some other F150 companies and ask their senior IT admins/CTOs how they measure productivity. I work for a major investment firm and we have metrics for everything we do (even though we're private) because of two primary reasons:
1. it's how you improve, and
2. it's what our competitors do too.

It's that simple.

Down time & turn around time (1)

Just Jeff (5760) | more than 6 years ago | (#20356283)

A bean counter wants to know if the admin staff is "worth it." You can quantify some of the value of the admin staff. First, downtime. How much does that admin staff save the company by fixing problems? Plot downtime & lost revenue or wasted time vs. admin staff hours spent preventing and repairing such events. The idea here is that an ounce of prevention is worth a pound of cure. Backups, patches, log files... everything. These activities prevent or reduce company losses due to downtime or similar failures.

Second, calculate lost productivity in the company while waiting for admin staff to carry out a requested task vs. the number of admin hours it takes to accomplish the task. The idea here is that reducing the admin staff head count will increase the time others have to wait for admin tasks to be completed.

Number of Pirated Episodes of DragonBallZ/Lost (0)

Anonymous Coward | more than 6 years ago | (#20356285)

The answer is easy enough: The number of pirated DragonBallZ/Prison Break/Lost/Hero episodes that a sys admin has downloaded and watched during working hours. This should be measured with a sliding window of three weeks. Major migrations and upgrades should be taken as bonus points.

Productivity is not the right metric. (4, Insightful)

fishtop records (910593) | more than 6 years ago | (#20356291)

Assume for a second you had a perfect server farm. It's always up, backups are made, users are added and removed, etc. While we are at it, assume you have a staff of say two admins per shift, 24x7. That's at least 8 admins, probably more to cover holidays, vacation, etc. In this case, their productivity is zero; they have nothing to do. In reality, they are working their tails off, and deserve a nice bonus. So tell the PHB that productivity is not what's important; it's problems. It's uptime, transactions delivered, average delay on transactions, etc. Get the users to define what the 'requirements' are, and have the sysadmins deliver it. That is the measure of what is important.

Metrics (3, Informative)

Codifex Maximus (639) | more than 6 years ago | (#20356295)

RailGunSally wrote:
>We in the trenches have been tasked with providing 'metrics' on absolutely everything from system utilization to paper clip recycling.

This pretty much says it all; your manager wants you to do HIS job. Shouldn't he develop his own metrics? He can ask you for ideas but he should do the work himself. As for metrics, I'd suggest downtime percentages for each machine. If the services are up and running and the machines are online providing service then that should be metrics enough.

Tickets (0)

Anonymous Coward | more than 6 years ago | (#20356329)

You are support. You measure work in tickets. Everything you do should be in a ticket. If you do something on your own, open a ticket and assign it to yourself. The objective measurement that tickets provide is man-hours per task, but otherwise they're a relative measure, as in "how many tickets opened this week vs last". You don't have to educate them as to what it means, you just give them a report on tickets, and let them figure it out from there.

Seriously, this should be a no-brainer to anyone in IT. Anyone giving you "creative" suggestions on slashdot has probably never really even done the job, or is just showing off their geeky cleverness.

If you find yourself doing non-ticketable items because they're larger scope projects, then work your change management system into the metrics too. You do have one, don't you? If you don't, then they'll be pleased as punch with one you roll together yourself.

You're only as good as... (2, Informative)

iminplaya (723125) | more than 6 years ago | (#20356331)

They consider you only as good as your last mistake. The bosses don't want to know what goes on "under the hood". It just has to work. Anything less than 100% uptime is considered a failure in their eyes.

Uptime (2, Interesting)

GuyverDH (232921) | more than 6 years ago | (#20356337)

Keep track of uptime. Are the systems only down for scheduled maintenance? If they are down outside of scheduled maintenance windows, what is the percentage? Was it hardware or software or a mix (old firmware with updated driver requiring newer firmware), etc...
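A quick way to answer those questions from an outage log, sketched in Python. The log entries, the 30-day reporting period and the `unscheduled_summary` function are hypothetical; substitute whatever your monitoring actually records.

```python
from collections import Counter

PERIOD_HOURS = 30 * 24  # assumed 30-day reporting period

# Hypothetical outage log: (hours_down, scheduled, cause)
outages = [
    (4.0, True, "maintenance"),
    (1.5, False, "hardware"),
    (0.5, False, "software"),
    (1.0, False, "hardware"),
]


def unscheduled_summary(outages, period_hours=PERIOD_HOURS):
    """Percent of the period lost to unscheduled outages, plus hours by cause."""
    by_cause = Counter()
    for hours, scheduled, cause in outages:
        if not scheduled:
            by_cause[cause] += hours
    total = sum(by_cause.values())
    return 100 * total / period_hours, dict(by_cause)


pct, by_cause = unscheduled_summary(outages)
print(f"Unscheduled downtime: {pct:.2f}% of the period")
for cause, hours in sorted(by_cause.items()):
    print(f"  {cause}: {hours} h")
```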

Was the outage extended due to vendor timing? If so, maybe a stock of typical spare components should be maintained to shorten the window.

Typical maintenance like adding/deleting/unlocking user accounts, resetting passwords, printer maintenance, and disk admin should be a small part of an admin's day. The rest should be keeping an eye out on the real world looking for potential problems like security vulnerabilities and patches, and planning the next updates / upgrades.

Tell the bean counters that their demands to quantify everything will only reduce uptime and complicate matters to where you spend more time doing paperwork than you do managing systems. If they can't understand that, it's time to go elsewhere. Be sure to tell the bean counter that they'll be lucky to find anyone talented to work under their regime.

I've seen systems that went from 20% loaded to always overloaded because of the number of *accounting* applications, programs and monitoring solutions that were *demanded* by the bean counters. After a user and business unit rebellion, the *fluff* was removed, as was the bean counter. This left the systems running in a state where the end users could do their work, and the business units had satisfied customers.

Measuring productivity? (1)

Kennon (683628) | more than 6 years ago | (#20356349)

In my office we are judged by system and application uptimes and average trouble ticket resolution times.

You Can't (1)

unity100 (970058) | more than 6 years ago | (#20356351)

and that's it.

a sysadmin may rear a big belly and sit all day long for years, yet s/he can avert compromise of the system just one day and save even a fortune 50 company from total chop busting - like 100,000 entries of customer records being stolen.

therefore forward your phbs here, and make them read these and learn that one can't direct an i.t. unit without being an i.t. person. it's a totally different, magical world.

UPTIME (0)

Anonymous Coward | more than 6 years ago | (#20356357)

Your first core metric should be uptime. Follow that with turn around time for fixing failures. Add in turn around time for setting up new servers/services and you should have enough to keep the PHB happy.

PS: If that's not enough - and if you think it's to your advantage - add in 'customer satisfaction'. Send out user surveys to evaluate their opinions of the services you are responsible for. Naturally, this is a good idea only if you think you'll get back decent results.

Sysadmin Metrics (1)

dghcasp (459766) | more than 6 years ago | (#20356359)

  1. Number of lusers not dead at end of shift, no matter how much they deserve it.

There are no other metrics that matter.

Pig of a PHB (0)

Anonymous Coward | more than 6 years ago | (#20356363)

Metrics are nothing without insight. If he's totally in the dark about what might be good metrics to collect in your particular setup, then where will he get the insight into how to interpret them?

Hell, he probably just wants to take your list of metrics with him on the occasion of his next job hop, and claim to be God's guru of IT metrics and that his salary be paid in platinum and slave girls. Don't give him the satisfaction.

simple math (1)

wcb4 (75520) | more than 6 years ago | (#20356365)

It's a simple equation.

productive_fraction_per_period = ((number_of_hours_per_period * number_of_employees) - (number_of_hours_computers_are_down_or_unavailable * employees_affected)) / (number_of_hours_per_period * number_of_employees)

10 employees, a full week, only one person loses computer access for 4 hours

productive fraction = ((40*10) - (4*1)) / (40*10) = (400-4)/400 = 396/400 = 99% productive

perfectly reasonable
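The same calculation as a tiny Python function, dividing by total staff-hours as the worked example does. The function and parameter names are just illustrative.

```python
def productive_fraction(hours_per_period, employees, hours_down, employees_affected):
    """Share of total staff-hours not lost to computer downtime."""
    total_hours = hours_per_period * employees
    lost_hours = hours_down * employees_affected
    return (total_hours - lost_hours) / total_hours


# 10 employees, a 40-hour week, one person without access for 4 hours
fraction = productive_fraction(40, 10, 4, 1)
print(f"{fraction:.0%} productive")
```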

A "Fortune 150" company (0)

Anonymous Coward | more than 6 years ago | (#20356377)

So you work for Countrywide?

Different measures (1)

Eponymous Bastard (1143615) | more than 6 years ago | (#20356383)

There are three aspects to system administration: Making sure everything keeps working, expansion and support.

For maintenance you'd want to look at simple downtime measurements. Then you can talk about scheduled/unscheduled, etc. If you already have monitoring systems in place then this is relatively simple.

For expansion, I suggest you include any new expansion as a small "project", with a due date. Then you report %projects completed on time. The danger is having your boss then require a 10 page proposal, approval from 5 different people and a budget estimate, etc. But on most expansions that require nontrivial hardware you probably do at least the budget anyway.

For support (creating new users, fixing things, help desk stuff) you'd want to look at tickets/day and to use satisfaction surveys (automatically emailed?)

In fact, you might want to consider surveys as a regular tool. If you can show that most people consider the servers to be stable, then that's something. Of course, given the self-selected nature of these surveys, most people who fill them out will have something bad to say about at least one aspect, so you have to keep that in mind.

Most of these things are a good idea to record anyway, even if your boss isn't a bean-counter. Looking at trends and setting reasonable goals for your team is a reasonable way to detect problems and encourage improvement.

Graph employee turnover tasks (1)

sprior (249994) | more than 6 years ago | (#20356395)

Here are some things you can track:

- obsolete accounts archived/deleted
- phone passwords changed
- exit interviews logged

Learn to Love the Bomb. (1)

BlueBoxSW.com (745855) | more than 6 years ago | (#20356399)

You and your fellow IT folks are in an uncomfortable situation, and you should admit that to yourself.

And the typical reaction of someone placed in a situation like this is to be as passive-aggressive as possible.

However, I would recommend a different approach.

There are two reasons why this beancounter wants you to quantify everything with numbers.

1) They don't know jack about how IT works,

2) They want to be able to divide those numbers into dollars, to get ratios and to figure out where adding dollars improves performance.

Arm yourself wisely. Read a book on TQM (Total Quality Management) that deals with software development or technical management.

You are going to have to understand a little bit of where they are coming from.

Second, tell the manager that you would be HAPPY to provide statistics (even if this isn't exactly true), but this doesn't come for free. If you're going to do this right, you need to be able to dedicate time and $$$ for your team to meet and tackle this problem, come up with a plan, and do the number crunching needed. This is important for them to understand, so you'll probably have to say it several times in several meetings.

Then get yourself to work on the stats. There's no single bullet, but understand you'll have both quantitative stats (uptime, number of tickets, number of tickets resolved/unresolved, gigs of network storage, bandwidth used, access time) and qualitative stats that you may need to assess with a survey of general or helpdesk users (happiness, responsiveness, satisfaction with IT tasks, network and application functions, etc.)

Provide the requested stats on a regular basis, and understand yourself how and why they fluctuate. And, this is the IMPORTANT part, use the stats yourself when debating with your boss over where to allocate resources and dollars. If they are going to make you collect them and are going to judge you on them, they need to involve you in decisions that will have a direct relation to the stats.

Good Luck.

SysAdmin Unit of Measure (1)

zenofjazz (614733) | more than 6 years ago | (#20356415)

According to the BOFH, wouldn't that be LARTs per Subnet?

Ticket System (1)

hackus (159037) | more than 6 years ago | (#20356423)

Simple.

Use a ticket system. I help customers set up RT.
(http://bestpractical.com/rt/)

How does that help?

Well, it allows coordination within a group, or shifted workgroups to identify and track changes in systems, and the status on certain things.

For example, did you change the iptables entries on the router for the primary hub in the network?

You then write a RT ticket entry on why, and what you changed.

Then everyone in your group gets a copy.

That way, if something stopped working or you made a typo, people know "Ok, things stopped working, what has changed in the past 8 hours?"

You can use RT to track stuff like that.

If you're a "Bean Counter from Hell", you can use it to track labor, and what was done over a given period of time and why.

At the end of the year, you can do some pretty interesting reviews of what types of jobs were most in demand and what stuff likes to break all the time. (i.e. MMmmm...lots of windows problems on the desktops/servers. I think we should probably put more Linux boxes in this year to reduce our workload.) :-)

Give it a try!

-Hack

I say screw with their minds .... (1)

taniwha (70410) | more than 6 years ago | (#20356431)

produce a metric that proves that when you're doing your best job everything's running smoothly and you're sitting on your hands doing nothing ... perhaps an inverse correlation to 'hours spent on /. or playing warcraft'

Leave (1)

theunixman (538211) | more than 6 years ago | (#20356447)

If you leave, find another job (that likely pays more anyway), and they have to replace you, they'll figure out pretty quickly what metrics they should have used to measure your productivity. And you'll be in a job that doesn't try and manage knowledge workers the same way as unskilled labor.

Approximate Formulas (1)

Killer Eye (3711) | more than 6 years ago | (#20356465)

Here are some of the more basic formulas:

Hours spent solving user problems per week:
% lookupd -q user | egrep '^name:' | awk '{print $2}' | wc -l | perl -e 'print 1.5 * <>;'

Hours spent putting up with useless management requests:
% find ~ -name '*.doc' -o -name '*.ppt' -o -name '*.xls' | wc -l

Hours spent reading news while at work:
% grep -i "href=" ~/.mozilla/firefox/default.isu/bookmarks.html | wc -l