Tools For Understanding Code?

kdawson posted more than 6 years ago | from the getting-it dept.

Software 383

ewhac writes "Having just recently taken a new job, I find myself confronted with an enormous pile of existing, unfamiliar code written for a (somewhat) unfamiliar platform — and an implicit expectation that I'll grok it all Real Soon Now. Simply firing up an editor and reading through it has proven unequal to the task. I'm familiar with cscope, but it doesn't really seem to analyze program structure; it's just a very fancy 'grep' package with a rudimentary understanding of C syntax. A new-ish tool called ncc looks promising, as it appears to be based on an actual C/C++ parser, but the UI is clunky, and there doesn't appear to be any facility for integrating/communicating with an editor. What sorts of tools do you use for effectively analyzing and understanding a large code base?"

Wait for cenqua's solution (4, Funny)

ccguy (1116865) | more than 6 years ago | (#22094856)

I hear that the commentator guys are finishing a new product that, instead of commenting your code, is able to comment other people's.

speaking of opc... (1)

airdrummer (547536) | more than 6 years ago | (#22095072)

other peoples' code...b sure 2 post the good stuff on [] ;-)

Stepping Through (5, Insightful)

blaster151 (874280) | more than 6 years ago | (#22094862)

I've always found that stepping through the debugger at runtime is a decent way to start making sense of a large code base. Easier, anyway, than trying to read static code printouts. Just set a breakpoint at a point of interest, fire up the application, and use it as a starting point. You get a sense for program flow and it's a great way to generate questions--lots of them. (What does class SuchAndSuch do? It looks like the application is handling remoting in such-and-such a fashion; is that right?) You can also choose one aspect of the architecture and selectively ignore or step over other aspects, building up your understanding one aspect at a time. In my case, with Visual Studio as a development environment, I can hover the mouse cursor over variable names to see their current values. In the case of variables of a certain type, like datasets or XML structures, I can use realtime visualizers to browse the contents and get a much better feel for what's going on.

If there's no one at your company that can help answer your questions and bring you up to speed, I feel for you - your employers ought to know enough to give you some extra margin. It can be very hard to take over a large code base without some human-to-human handover time.

Also, is it an object-oriented system? I assume that it's not, based on your post, but you don't say either way. If it is, the important aspects of program flow often live in the interactions between classes and objects and the business logic is decentralized. OO is great, but it can be harder to reverse-engineer business logic because it's distributed among various classes. A debugger that lets you step through running code is almost essential in this case.

Re:Stepping Through (4, Insightful)

daVinci1980 (73174) | more than 6 years ago | (#22095068)

This post is dead on.

Place a breakpoint somewhere you think will get hit (e.g. main), and then start stepping over and into functions. I usually attack this problem as follows:

Place breakpoint. Use step-in functionality to drop down a ways into the program, looking at things as I go. What are they doing, how do they work, etc.

Once I feel like I understand how a section of code works, I step over that code on subsequent visits. If I feel like this isn't taking me fast enough, I let the program run for a bit, then randomly break the program and see where I am.

Lather, rinse, repeat.

Also, this should go without saying, but you should ask someone who works with you for a high-level overview of what the code is doing. The two of these combined should get you up to speed as quickly as possible.

Re:Stepping Through (2, Informative)

The_reformant (777653) | more than 6 years ago | (#22095570)

Absolutely. Since joining the real world I have found the Visual Studio debugger my most prized tool. Somehow I managed all through my degree to never come into contact with one (probably because all the free ones are rubbish and most schools won't shell out for Visual Studio). I now extol the virtues of debugging to all and sundry!

The best tool (-1, Troll)

teknopurge (199509) | more than 6 years ago | (#22094882)

A college degree in something CS/related will help you.


Re:The best tool (0)

Anonymous Coward | more than 6 years ago | (#22095310)

A college degree in something CS/related will help you.

How about creating a new tag:

        "Troll needing ego boost"

Re:The best tool (2, Insightful)

Anonymous Coward | more than 6 years ago | (#22095406)

The best programmers I've ever worked with didn't have degrees. But some of the worst ones did.

Re:The best tool (0)

Anonymous Coward | more than 6 years ago | (#22095408)

It was a serious question, and your reply is not only not helpful, it stinks - and probably so do you.

How / why did you get the job... (-1, Flamebait)

BarnabyWilde (948425) | more than 6 years ago | (#22094884)

...if you don't understand the language?

Seems basic to me.

Re:How / why did you get the job... (2, Insightful)

wampus (1932) | more than 6 years ago | (#22094972)

Sometimes it's hard to follow execution, especially in a large codebase. It's made even more difficult when a smug jackass wrote it to be as terse as possible.

Re:How / why did you get the job... (4, Insightful)

Jeremi (14640) | more than 6 years ago | (#22094992)

One might as well ask, why are you posting smarmy retorts when you clearly didn't understand the question? The question was about understanding the program, not the underlying language.

Re:How / why did you get the job... (1)

geekoid (135745) | more than 6 years ago | (#22095022)

I have seen some pretty awfully used languages.
I started at one company, and they had functions that were 1600 lines long, with gotos.

Not easy to understand, and very complex.

Good open source tool for understanding any code.. (0, Redundant)

mazanoid (1114617) | more than 6 years ago | (#22095322)

There's this great opensource package called OpenEyes, and to my knowledge it only requires nominal installation effort by the user. Basically you just have to configure the face.cfg to provide the correct balance of tension and flexion to the ocular.modules.

Hope it helps.

Doxygen (4, Informative)

Raedwald (567500) | more than 6 years ago | (#22094886)

For C++ code, Doxygen can be useful, as it shows the class inheritance. As requested, it uses a (rudimentary) parser. It works with several other languages too, although I can't vouch for its utility for them.

Re:Doxygen (1)

PetriBORG (518266) | more than 6 years ago | (#22095096)

I thought Doxygen did Javadoc-like parsing for C++? I was thinking he should look for something able to build a UML diagram based on the code... I hate UML, but if there isn't any documentation telling you the structures of the code it might be a place to look.

I would google for that, but I'm under deadline myself... (but yet still reading /. - I think it's an addiction).

Re:Doxygen (3, Informative)

zeekec (795504) | more than 6 years ago | (#22095324)

Doxygen can produce UML diagrams for undocumented code. (UML_LOOK and EXTRACT_ALL)

Re:Doxygen, and Extracting Software Architectures (5, Informative)

Mr.Bananas (851193) | more than 6 years ago | (#22095130)

I use Doxygen for C code, and it is really helpful. One of its most useful features is that it generates caller and callee graphs for all functions. You can also browse the code itself in the generated HTML pages, and the function calls are turned into links to the implementation. Data structures and file includes are also pictorially graphed for easy browsing.

If the system you need to understand has a really big undocumented architecture, then this presentation might be useful to you (there is a research paper, but it's not free yet). In it, the authors present a systematic method of extracting the underlying architecture of the Linux kernel.

Re:Doxygen (1)

JamesP (688957) | more than 6 years ago | (#22095300)

I second this. The important thing is to enable all the 'graph' options, as well as call graphs and other stuff. That will be most useful.

As such, it does reverse engineering of code, showing inheritances, call graphs, include graphs, etc.

Only problem is, it is a pain to configure. Also, Windows versions don't look very stable.

When I was your age... (2, Interesting)

russotto (537200) | more than 6 years ago | (#22094890)

I use the Mark I eyeball, grep, emacs, and of course, the little gray cells.


Re:When I was your age... (2, Funny)

Mr. Underbridge (666784) | more than 6 years ago | (#22095624)

When I was your age...I use the Mark I eyeball, grep, emacs, and of course, the little gray cells.


They have lawns at the old folks' homes these days?

Re:When I was your age... (1)

jdschulteis (689834) | more than 6 years ago | (#22095644)

Last time I faced this problem (about 5 years ago), emacs, ctags, and grep got the job done.

I don't understand why those young whippersnappers modded you "Funny".

Re:When I was your age... (0)

Anonymous Coward | more than 6 years ago | (#22095676)

Having just recently taken a new job, I find myself confronted with an enormous pile of existing, unfamiliar code written for a (somewhat) unfamiliar platform -- and an implicit expectation that I'll 'grok' it all 'Real Soon Now'. Simply firing up an editor and reading through it like I did in uni has proven unequal to the task. I'm familiar with Java programming too, but I was never taught to analyze program structure; I've got a very fancy suit but only a rudimentary understanding of procedural syntax. A new-ish tool called an 'IDE' looks promising, as it appears to be based on an actual language parser, but the UI is clunky (I have to use the keyboard), and there doesn't appear to be any facility for integrating/communicating with a developer. What sorts of tools do you use for effectively analyzing and understanding the basic skills of your job?

Paper (2, Insightful)

raddan (519638) | more than 6 years ago | (#22094918)

You should really be sitting down and attempting to understand the code, ASAP. Asking Slashdot for fancy tools isn't really going to help you. The real barrier here is your own brain.

Re:Paper (0)

Anonymous Coward | more than 6 years ago | (#22095152)

I agree, stop wasting time searching the web for useless tools and get to reading the code.

Re:Paper (1)

plopez (54068) | more than 6 years ago | (#22095532)

Damn. You beat me to it. I would also suggest developing domain knowledge. Reading code is useless unless you understand the what and why of the problems being solved.

Re:Paper (2, Interesting)

bunratty (545641) | more than 6 years ago | (#22095664)

I don't think I've ever been able to understand a large body of code by simply looking at it. I've always found that attempting to make modifications (fixing bugs, adding features) to the code gets me to understand it fairly quickly. Often, I'll find myself adding comments or cleaning the code up as I go. There have been times when I've just thrown all the code away and reimplemented the same functionality from scratch. That may not be an option here, but perhaps writing an implementation of part of the code from scratch will help to gain an understanding of how that particular feature is implemented.

Re:Paper (1)

jklappenbach (824031) | more than 6 years ago | (#22095844)

Tools help. Several people have mentioned Doxygen. I've used it in the past on commercial projects. New developers coming in found its output to be of great help in understanding the general structure of the code, the hierarchies (they were C++ projects), and as a reference to quickly identify candidate classes for modifications or the likely source of bugs.

Mod parent down for being arrogant and patronizing.


doxygen (3, Informative)

greywar (640908) | more than 6 years ago | (#22094922)

If it's in a language that doxygen can understand, that's the tool I would HIGHLY recommend.

Ctags (3, Insightful)

pahoran (893196) | more than 6 years ago | (#22094948)

google exuberant ctags and learn how to use the resulting tags file(s) with vim or your editor of choice
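For anyone who hasn't looked inside a tags file: it is plain text, one tag per line, with tab-separated fields (tag name, file, then an ex command or line number that locates the definition), plus header lines that start with `!_TAG_`. A minimal Python sketch of a lookup, assuming that layout. The `find_tag` name and the sample lines are made up for illustration, and a real editor does all of this for you:

```python
def find_tag(tags_text, name):
    """Return (file, locator) pairs for every tag called `name`."""
    matches = []
    for line in tags_text.splitlines():
        # Header/metadata lines start with "!_TAG_"; skip them and blanks.
        if not line or line.startswith("!_TAG_"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a well-formed tag line
        tag, path, locator = fields[0], fields[1], fields[2]
        if tag == name:
            # Exuberant ctags appends ;" before its extension fields; trim it.
            if locator.endswith(';"'):
                locator = locator[:-2]
            matches.append((path, locator))
    return matches


# Hypothetical tags content, only to show the shape of the data.
SAMPLE_TAGS = (
    "!_TAG_FILE_SORTED\t1\t/0=unsorted, 1=sorted, 2=foldcase/\n"
    "main\tsrc/main.c\t/^int main(int argc, char **argv)$/;\"\tf\n"
    "parse_config\tsrc/config.c\t42;\"\tf\n"
)

if __name__ == "__main__":
    print(find_tag(SAMPLE_TAGS, "parse_config"))
```

The point is only that the file is simple enough to script against when your editor's tag support runs out.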

Old School (4, Funny)

geekoid (135745) | more than 6 years ago | (#22094958)

Printouts and colored markers.

Re:Old School (1)

alienpeach (930248) | more than 6 years ago | (#22095704)

While using colored markers seems to be funny, as our moderators have noted, this is actually extremely helpful. Taking a "book" of code, a few highlighters and mug of coffee is the best scenario for understanding complex code.

Understand C++ (5, Informative)

SparkleMotion88 (1013083) | more than 6 years ago | (#22094978)

Sorry I don't have an open source tool for you, but I've used Understand for C++ in the past and it was pretty helpful. To me, the most useful piece of information for understanding a large codebase is a browseable call graph. I'm sure there are simpler tools out there that generate a call graph, but this is the only one I've used with C++.

Re:Understand C++ (1)

TacoBellGrande (701502) | more than 6 years ago | (#22095632)

I've used both Understand C++ and doxygen (with graphviz) to understand the code of those who are no longer available for me to bug. They both have their strengths and weaknesses, so I tend to use both.

RR & EA (3, Informative)

Anonymous Coward | more than 6 years ago | (#22094988)

Sometimes tools like Rational Rose or Enterprise Architect are successful at reading in the code and building a UML model that you can then attempt to parse through. I'm not familiar with the use of either, but I know it can be done, with mixed results depending on the size and complexity of the code being analyzed. Both tools are fairly expensive though, I believe.

Re:RR & EA (1)

wfeick (591200) | more than 6 years ago | (#22095510)

I've used Borland's Together in the past and found it really helpful for C++/Java code. It can be really helpful for coming up to speed on a code base's class hierarchy. Unfortunately, when I tried it on a large C++ code base where I'm currently working, after loading the code base in it seemed to go into some sort of an analysis phase and then eventually crashed.

I'm not sure what the problem was. A sales droid called to check in on my download, passed the crash info on to a techie, and said I'd get a call back. About a month later another sales droid called, said essentially the same thing, and I never heard back. :-(

Slightly off topic, but at a previous company we ran Together inside a VNC session as a virtual whiteboard during design sessions with a distributed team. That made it really easy for everyone to visualize the designs we were discussing, and produced code for us as well.

It's a great tool, but apparently has its limitations.

Eclipse works extremely well for Java... (1)

mario_grgic (515333) | more than 6 years ago | (#22095006)

if your code is not Java, then I go back to Vim and ctags as a fast start that can be set up in a few minutes (and it works for everything from assembler to Java). It will help you navigate code fast, follow function calls etc, but it won't help you visualize class hierarchies or help you to figure out all the places a function is called from like Eclipse does for Java.

Your best bet is to look for a good IDE specific to the language the code is written in. But as far as I know, nothing for other languages comes close to the power of Eclipse's exploration tools for Java; not even Eclipse works as well for, say, C/C++ as it does for Java.

Re:Eclipse works extremely well for Java... (1)

AmaDaden (794446) | more than 6 years ago | (#22095368)

I just started working at a company with a horrifying code base and was using Eclipse. Eclipse did a fantastic job of helping me jump around the code (oh how I love you CTRL + left click) but the code itself was still hard to read. I figured out that in Eclipse you can do a LOT more color coding than is used by default. This seems trivial but now with a glance I can get a good deal more information on the scope and type of a variable or function than before. I highly recommend looking into it. I have to note that I am doing Java so I'm not sure how well it'll work for C/C++.

Reverse Engineer? (1)

dotpavan (829804) | more than 6 years ago | (#22095024)

For Java, would reverse engineering the code to UML diagrams help? Any good open source tools one could recommend to understand a large code base?

Re:Reverse Engineer? (1)

samkass (174571) | more than 6 years ago | (#22095504)

For Java, he probably wouldn't be having this problem as acutely in the first place. The reduced syntax compared to C++ makes many of the hacker types hate Java, which makes Java twice as good in my book. It also makes everything a lot clearer. In addition, the dynamic nature of the language combined with the compact syntax means even the free tools like Eclipse have excellent analysis capability, and tools like IntelliJ offer phenomenal ability to introspect the code.

But yes, C is a few percent faster and lets the hackers go to town, so some companies still choose it.

lxr (1, Informative)

Anonymous Coward | more than 6 years ago | (#22095026)

I often use LXR for understanding the kernel, but have used it for other large code bases. If you pair it with some sort of sticky note Firefox add-on it becomes particularly useful.

You must have inherited my old project (5, Funny)

theophilosophilus (606876) | more than 6 years ago | (#22095062)

Sorry about that.

Delete it! (1)

Besna (1175279) | more than 6 years ago | (#22095074)

Make all interfaces use explicit typing (no plain "int"s around, everything clearly signed or unsigned--better yet, use uint32_t and the like from stdint.h). Use one width if possible--whatever your CPU prefers (usually a uint32 or uint64). Learn it by refactoring it. Delete code whenever possible. Kill "#if 0"'s lying around.

When I am particularly frustrated (1)

antifoidulus (807088) | more than 6 years ago | (#22095076)

I find that a hammer works well. Not so much for understanding the code, but it CAN help relieve computer-created stress!

What I do (5, Informative)

laughing_badger (628416) | more than 6 years ago | (#22095078)

SourceNavigator : A good visualisation package

ETrace : Run-time tracing

This book is worth a read

Draw some static graphs of functions of interest using CodeViz

Write lots of notes, preferably on paper with a pen rather than electronically.

Re:What I do (1, Informative)

Anonymous Coward | more than 6 years ago | (#22095536)

Use Source Insight for C/C++/Java/C#/Perl/ksh/etc programming languages. It is a very light, yet powerful IDE and can be used to browse through code.

I have used it for code bases of more than 3000 C/C++ files and yet the IDE behaved well -- just like Eclipse for the Java platform, and it consumes very little memory

Doxygen (0)

Anonymous Coward | more than 6 years ago | (#22095082)

How about Doxygen (see their site)? Gives you the whole OO inheritance structure, lists of function caller/callees (if desired), graphical representations, etc, etc. And it lets you browse through the code with a web browser...

Non-sequitur time (1)

14erCleaner (745600) | more than 6 years ago | (#22095110)

I'm not exactly answering your question, but in my experience nothing helps you learn about somebody else's code like having to find and fix bugs in it. Just diving in with a specific goal in mind. The next best thing is having somebody who's familiar with it draw you a diagram of the overall structure. Comments in the program, or external documentation, are usually too much to hope for.

Re:Non-sequitur time (1)

morn (136835) | more than 6 years ago | (#22095556)

I absolutely agree. You don't need to understand all the code, you just need to be able to follow the part you're dealing with to fix whatever bug or interface with whatever part you're working on. Don't get me wrong, getting to know the overall architecture is something you should do (hopefully there are some old employees who can draw some block diagrams on a whiteboard for you or something - if not, that's probably something you should try yourself with a bit of archeology of the code), but knowing the ins-and-outs of the whole codebase is not something you should even attempt - you don't need to know all the code in that level of detail.

In my experience, even after two years in my current job, management are still perfectly willing to accept an answer of "I don't know that part of the code very well, give me some time to look into it and get back to you" when they ask me about a bug or a prospective new feature.

Re:Non-sequitur time (1)

TigerNut (718742) | more than 6 years ago | (#22095696)

The next best thing is having somebody who's familiar with it draw you a diagram of the overall structure.

I find the best thing is to do the drawing myself. It might take a couple of attempts, but in the process you have to dig into the details and discover the structure. The extra interaction with the code gives a more indepth understanding. If I can draw it, then I can create it, fix it, and explain it to someone else. If I can't draw a particular thing to a desired level of detail (whether it's a piece of hardware, mechanical construction, or software) then I don't really understand it.

Answer (4, Funny)

hey! (33014) | more than 6 years ago | (#22095116)

Yes. Understanding code is one of the things you hire tools for.

Wait, were you talking about software?

Re:Answer (1)

WK2 (1072560) | more than 6 years ago | (#22095678)

Yes. Understanding code is one of the things you hire tools for.

Yes, but what happens when the tool asks Slashdot how to understand the code?

Re:Answer (1)

hey! (33014) | more than 6 years ago | (#22095774)

He gets gratuitously mocked for a cheap laugh by people with a pathetic need to be perceived as clever.

doxygen - with full source option (2, Interesting)

mhackarbie (593426) | more than 6 years ago | (#22095122)

I agree with the previous recommendations for Doxygen. A while back I wanted to become familiar with the source code for a game engine and tried various tools to help with the 'grok' factor. I found the doxygen docs, with full source code generation in html, to be the fastest and most convenient way to walk around the code. After a while, it just clicked.

Creating small demo apps that use the code can also help.


Re:doxygen - with full source option (1)

sheltond (252356) | more than 6 years ago | (#22095742)

Yes - I have used doxygen for both C and C++ code. When using the full-source option it can be quite slow, but in conjunction with the "dot" tool it produces quite nice call graphs.

See [] for info. [] (to pick a random example found on google) has an example of a doxygen-produced page giving both an include-file graph (at the top) and a call graph (at the bottom).

As you can see it gives a quite nice at-a-glance overview of the program's structure. It will happily produce individual pages for each function in your program showing a graph of functions that call into it and all of the functions it calls.

Note that the boxes in the diagrams are hyperlinked to the corresponding page for that function/header-file.

GNU Global (3, Informative)

Masa (74401) | more than 6 years ago | (#22095134)

GNU Global is able to generate a set of HTML pages from C/C++ source code. This tool has helped me several times. All member variables, functions, classes and class instances are hyperlinks. It provides an easy way to examine source code. It also provides tags for several text editors (for Vim and Emacs especially).

Re:GNU Global vs HyperAddin for Visual Studio (1)

pg--az (650777) | more than 6 years ago | (#22095432)

HyperAddin is actually merely on my "list of things to try", I have never actually installed it even. It's at [] , part of "Microsoft's open source project hosting web site". On a new project, theoretically it would be great to link things up as you believe you understand them. On the other hand I have met folks who would actually delete all comments from something they are trying to understand, but that philosophy goes too far, I think a grain-of-salt is what you want.

Imagix 4D (1)

Imagix (695350) | more than 6 years ago | (#22095142)

Imagix 4D was a rather interesting tool the last time I looked at it.

Umm.. documentation? (5, Insightful)

Anonymous Coward | more than 6 years ago | (#22095144)

Seriously folks, having spent large chunks of my working life having to decipher the mess of those who came before me I cannot stress enough the importance of clear comments, variable/function names, and consistent and readable syntax. AND WRITE F@#$%ing HUMAN READABLE DOCUMENTS DESCRIBING FUNCTIONAL REQUIREMENTS, ALGORITHMS USED, LESSONS LEARNED, ETC.
Calling all your variables "pook" or the like may be very cute, but does not help me figure out what the heck the function is supposed to do or why I would ever want to call it. Yes it's a pain. Yes we're all under time deadlines and want to get it working first and go back and document it later. And yes, it WILL bite you in the ass (ever heard of karma? your own memory can go and then you have to decipher your OWN code!).

That said, if you have inherited a code base from someone who ignored the above, go through and generate the documentation yourself. Write flow charts and software diagrams showing what gets called where and why. Derive the equations and algorithms used in each piece and figure out why the constant values are what they are. Finally, start at the main function or reset vector (I do a lot of microcontroller development) and trace the execution path.

Re:Umm.. documentation? (3, Funny)

Skewray (896393) | more than 6 years ago | (#22095850)

Why? I can write crap and you can clean it up. This is Division of Labor, which is the basis of our civilization.

visual studio (0)

Anonymous Coward | more than 6 years ago | (#22095146)

I just use Visual Studio even though the code is not MFC or Windows. As long as it is C/C++, it works fine. VS is a great development tool and has all the features (and more) you are asking for built in.

Documentation Documentation Documentation (1)

DrLang21 (900992) | more than 6 years ago | (#22095148)

If any documentation describing the code or at least functions in plain language exists (and for the love of God it always should) start there. If it doesn't, advise that your company start making documentation for any new code (not that you should expect them to listen).

Osmosis (2, Insightful)

Greyfox (87712) | more than 6 years ago | (#22095150)

If the original developer made useful comments that will help immensely. If there's a design document showing how the program fits together that helps a lot. If there's a process document explaining the business logic the application implements, that helps a lot. On average you'll start with a marginal code base with no comments, no design documents and no explanation of what the application is attempting to accomplish.

Get the guys who use it to explain what they're trying to do, read the code for a couple of days and then have them show you how they use the application. Then plan on six months to a year to get to the point where you can look at buggy output and know immediately where the failure is occurring. In the mean time just work in it as much as you can and don't try to redesign major parts of it until you know what it's doing.

Last time I had to do something similar... (1)

ByOhTek (1181381) | more than 6 years ago | (#22095156)

I had to do something similar a while ago with a poorly documented piece of software. I pulled out Visio (it's what we have here; I'm sure there are better tools for the job, but it worked well enough) and made a diagram of what-called-what. Even without the why/conditionals, that helped me a lot (the names made more sense). On parts where I had trouble, I'd go to the lower levels, figure out what they did, and document those functions in the Visio diagram.

That is what I would do in your situation, except:
(A) If you can, find something better than Visio. It beats a paint program, true, but it is still irritating for the task (any recommendations here?).
(B) If you use Visio, you probably don't want to make my mistake of doing the drawing on an 8.5x11 sheet. 85x110 might be better... Assuming you won't print it out...

Re:Last time I had to do something similar... (0)

Anonymous Coward | more than 6 years ago | (#22095370)

Use death trees to draw on?

Constant switching between source and 'visual program' is hell.

Re:Last time I had to do something similar... (1)

ByOhTek (1181381) | more than 6 years ago | (#22095550)

To each his/her own. I had the source editor on one desktop, and Visio on the other. A key command switched desktops, so I could read something, edit the Visio diagram, and go back fairly quickly. Code more resembles that kind of diagram in my head anyway, so I didn't have trouble.

I don't see dead trees as being any easier for me personally. Too much erasing makes them hard to read, and there was a lot of moving/erasing.

Re:Last time I had to do something similar... (0)

Anonymous Coward | more than 6 years ago | (#22095608)

Dia is free, and runs on Linux/Windows. It also has a wide variety of templates available.

perl and graphviz (1)

Speare (84249) | more than 6 years ago | (#22095172)

I had to do this sort of "unfamiliar code analysis" with an ancient FORTRAN application written by non-software guys in the 1980s. It was some of the worst spaghetti I'd seen in some time.

To make any sense of it, I asked the compiler for a call tree report, and then I fed this through Perl to make a GraphViz "dot" file of it. After a few shuffles, I could start to determine some architecturally related areas and refactor slightly to decouple them into a more clear arrangement of modules. It was still crap, but it was at least something that I could understand to the point of making unit tests and coverage tests.
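The conversion step is short in any scripting language. A rough sketch in Python rather than Perl, assuming the call tree report has already been boiled down to one `caller callee` pair per line. That input format, and the names `calls_to_dot` and `SAMPLE_REPORT`, are assumptions for illustration; real compiler reports differ and will need their own parsing:

```python
def calls_to_dot(report_text, graph_name="calls"):
    """Turn 'caller callee' lines into GraphViz DOT source."""
    edges = []
    seen = set()
    for line in report_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unrecognised lines
        edge = (parts[0], parts[1])
        if edge not in seen:  # keep each call edge once, in first-seen order
            seen.add(edge)
            edges.append(edge)
    lines = ["digraph %s {" % graph_name]
    for caller, callee in edges:
        # Quote names so identifiers with odd characters stay valid DOT.
        lines.append('    "%s" -> "%s";' % (caller, callee))
    lines.append("}")
    return "\n".join(lines)


# Made-up routine names, only to show the shape of the input.
SAMPLE_REPORT = """\
MAIN RDINPT
MAIN SOLVE
SOLVE MATINV
MAIN SOLVE
"""

if __name__ == "__main__":
    print(calls_to_dot(SAMPLE_REPORT))
```

Save the output as `calls.dot` and `dot -Tpng calls.dot -o calls.png` renders it. Duplicate edges are dropped so that a routine called from many places in one caller doesn't clutter the picture.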

Re:perl and graphviz (0)

Anonymous Coward | more than 6 years ago | (#22095480)

I had to do this sort of "unfamiliar code analysis" with an ancient FORTRAN application written by non-software guys in the 1980s.

Was it called AWSIM?

Don't attempt the impossible... (4, Insightful)

namgge (777284) | more than 6 years ago | (#22095174)

and an implicit expectation that I'll grok it all Real Soon Now

It is unlikely that your job is really to 'grok it all'. Most likely there are specific issues that need to be solved - stop panicking and pick the simplest one on the list and start working on it.

In a similar position to you, I followed Brooks's advice to study the data structures and found it good. Also just running the application under a debugger, inserting breaks in important looking code and then having a look at the call stack when that code was used also proved enlightening. A good debugger also lets you explore the data structures.

When smart-asses tell you "Bill would have fixed that in ten minutes." I recommend replying "I never met Bill, why do you think he left?"


Re:Don't attempt the impossible... (1)

Lumpy (12016) | more than 6 years ago | (#22095306)

When smart-asses tell you "Bill would have fixed that in ten minutes." I recommend replying "I never met Bill, why do you think he left?"

That one works great. Problem is, most of the time it's not smart-asses doing that but incredibly stupid managers.

"manager of Marketing, XXX did it in 15 minutes... why are you taking so long?"

The parent's response is perfect for these situations... it shuts them up instantly.

Etags (2, Interesting)

__david__ (45671) | more than 6 years ago | (#22095176)

Emacs and etags are your friend. Meta-. zips to the function under the cursor. C-s for incremental search. Meta-x grep-find for any other search.

Also, run the program with a debugger and step through it. Or put some print statements in key places and see what it produces.

I find that's all I ever need.

well.. (0)

Anonymous Coward | more than 6 years ago | (#22095178)

How about doxygen?

Some tips for you... (0)

Anonymous Coward | more than 6 years ago | (#22095180)

You stepped into a bees' nest. Getting into a place where you now maintain some other guy's code can be a nightmare, especially if your management is clueless.

1- Communicate. The only way they know is if you tell them. If you run up against a pile of spaghetti code that is nothing more than an ugly half-arsed hack, tell them. Tell them that it is going to take more time because of the last guy's mess. Being honest is better than being a yes-man and acting like you can do anything they ask on their timeline.

2- Don't be afraid to ask. At the last job I had like that, the previous guy did everything on his personal copy of the tools and took them with him. If you need to purchase anything, tell them you need to buy XYZ at $$$$ cost and why. Justification goes a long way.

3- Don't be afraid to let their deadlines slip. It's not like you can control this. You can't know the stuff like the last guy overnight; some code I have here I have worked with for 2 years and I still don't fully understand it... (we are replacing it with something that is not a nightmare). I let deadlines they impose slip all the time if I am not in control, and I let them know this in the meetings when they set the deadline: "That one will be missed unless we budget way more for it." If you attach dollars to their deadline, they usually move their deadline.

4- Talk to them about getting things replaced with proper solutions. Maintaining that MS Access nightmare that some guy in Marketing created 5 years ago is not a real solution; it needs to be replaced with a real solution. Let them know.

Re:Some tips for you... (1)

Schraegstrichpunkt (931443) | more than 6 years ago | (#22095482)

4- Talk to them about getting things replaced with proper solutions. Maintaining that MS Access nightmare that some guy in Marketing created 5 years ago is not a real solution; it needs to be replaced with a real solution. Let them know.

Here is a useful bit of vocabulary for explaining why this is so: technical debt.

Muhahaha (1)

roman_mir (125474) | more than 6 years ago | (#22095182)

I hate people who refuse to write requirements / design documentation, claiming that good code must be self-explanatory.

Now you can hate them too!

Start with the application type (0)

Anonymous Coward | more than 6 years ago | (#22095200)

and reverse-engineer the analysis diagrams and approaches common to that type of problem rather than getting down into the weeds of specific classes and calls.

For instance, a business app is data centric, so start by understanding the persistent data structures and relationships. If your code is real-time or event driven, try to back out state transition diagrams. If this is a web server app, try to extract use cases from the client's point of view. And so on.

It's MUCH easier to learn the details of specific parts of the code when you know the broader what/why.

Read The Fine Manual (1)

frith01 (1118539) | more than 6 years ago | (#22095204)

1. Understand any documentation or diagrams that explain the high-level purpose of the processing.
2. Document the inputs / outputs of the system.
3. Determine if it is object-oriented, procedural, multi-language, etc.
4. Search for IDEs suited to step #3.
5. Identify the code repository used to manage the code. (If none exists, please submit resume to another corporation quickly.)
6. Identify the first layer of processing, and sort out the important sections of code. (Is input translation the most important, is it the user interface / middle processing, or are the outputs the most important?)
7. Look at the most recent changes / bug fixes based on issue reporting / QA tracking.
8. Dive into the most important sections first.

Reading code is no good to start with. (1)

pt99par (588458) | more than 6 years ago | (#22095274)

Some people here say that you should read and understand the code, but it is just stupid to start by reading the code. The best thing to do is to analyze the code with tools so that you can look at the system at different abstraction levels. People who say that you should start at code level have probably never had a real job or have never seen a system with 50k+ LOC. When I analyze a system I use grep, sed and graphviz to draw various diagrams at different levels. That way I can understand the system much quicker and I don't have to understand all the details right away. After that I can zoom in on the details, starting at the right parts of the system. So try to find good tools, and if you can't find any, try graphviz in combination with your text processing tools of choice.
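One way to get diagrams at more than one abstraction level is sketched below, in Python standing in for the grep/sed step. It assumes a C-style code base where local headers are pulled in with `#include "dir/file.h"` and include paths are written relative to the project root. The file names and the in-memory `sources` dict are hypothetical stand-ins for a real source tree.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Matches local includes such as:  #include "core/model.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def include_edges(sources):
    """File level: map each file to the set of local headers it includes."""
    return {path: set(INCLUDE_RE.findall(text)) for path, text in sources.items()}


def collapse_to_dirs(edges):
    """Zoom out: merge file-level edges into directory-level edges."""
    dir_edges = defaultdict(set)
    for src, targets in edges.items():
        src_dir = str(PurePosixPath(src).parent)
        for target in targets:
            dst_dir = str(PurePosixPath(target).parent)
            if dst_dir != src_dir:  # ignore includes within one directory
                dir_edges[src_dir].add(dst_dir)
    return dict(dir_edges)


def to_dot(edges, name="deps"):
    """Emit a GraphViz dot digraph for any {source: {targets}} mapping."""
    lines = [f"digraph {name} {{"]
    for src in sorted(edges):
        for dst in sorted(edges[src]):
            lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical source tree, held in memory to keep the sketch self-contained.
sources = {
    "ui/window.c": '#include "ui/window.h"\n#include "core/model.h"\n',
    "core/model.c": '#include "core/model.h"\n#include "util/log.h"\n',
    "util/log.c": '#include "util/log.h"\n',
}

file_level = include_edges(sources)
dir_level = collapse_to_dirs(file_level)
print(to_dot(dir_level))
```

The directory-level graph is the overview; the file-level graph from `to_dot(file_level)` is the zoomed-in view of the same data.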

Codesurfer (1)

ximor_iksivich (666068) | more than 6 years ago | (#22095308)

You could have a look at CodeSurfer, which is a program slicing tool for C/C++. I found it extremely useful for analyzing programs. To make full use of it, I would recommend reading the manual in its entirety :)

I have an idea (1, Funny)

Anonymous Coward | more than 6 years ago | (#22095312)

You could try posting the code here and maybe some kind people at slashdot can help.

Source Insight or SlickEdit (1)

dgoldman (244241) | more than 6 years ago | (#22095314)

Source Insight and SlickEdit are not open source, but trial versions are available for both.

In my opinion, having a good editor that allows quickly jumping to definitions or references is the best tool out there. Understand works but in the end wasn't as helpful as I hoped. Take a look at these and pick your poison. I like them both but prefer Source Insight for the windows machines and SlickEdit for Linux.

Try everything you can. Find what works for you. Yea, I know, not much help.

hmm. (1)

apodyopsis (1048476) | more than 6 years ago | (#22095326)

That's like a carpenter asking for a nail gun because the hammer is too complicated to use. As with all trades, get to grips with the basics first. If you really cannot make a dent in your code mountain, then are you sure you should be doing the job? No disrespect intended.

I find, when in similar situations, start in main() and stroll down the call tree. I also make a beeline for interrupt handlers and pointers - but then I specialize in embedded software so bear in mind that my advice might be as useful as rice paper underpants. I suspect that the same idea holds true for most SW. For OO work I try to get a mental image of all the classes before I picture how they stand together.

Of course, as my profession is full of considerate, professional engineers all the code is clearly labeled and structured. riight.

Re:hmm. (0)

Anonymous Coward | more than 6 years ago | (#22095778)

[...] my advice might be as useful as rice paper underpants. [...]
I wear rice paper underpants, you insensitive clod!

Just Do It (1)

BAH Humbug (242702) | more than 6 years ago | (#22095350)

Without someone else to lead you through the code set, the best option is to go make a small change that is desired by someone. That person becomes your customer and has a vested interest in confirming that your change works. Don't try to understand the whole code set -- just study the section(s) you think need to change to fulfill the request. Repeat. You'll build an understanding over time.

Add unit tests as you make changes to demonstrate how a section of code is used and to capture existing behavior. When you feel comfortable, begin refactoring sections which you found obtuse. If someone complains that you have broken something, add a test to make sure it doesn't happen again.
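A test that captures existing behavior might look like the sketch below. `legacy_shipping_cost` is a made-up stand-in for some undocumented routine, not anything from a real code base; the point is that the assertions record what the code does today, quirks included.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical stand-in for an undocumented legacy routine."""
    cost = 5 + 2 * weight_kg
    if express:
        cost = cost * 3 // 2  # integer division: a quirk someone may rely on
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down what the code does now, not what it 'should' do."""

    def test_standard(self):
        self.assertEqual(legacy_shipping_cost(10, express=False), 25)

    def test_express_rounds_down(self):
        # 5 + 2*1 = 7, then 7 * 3 // 2 = 10 (10.5 rounded down)
        self.assertEqual(legacy_shipping_cost(1, express=True), 10)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShippingCostCharacterization)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a later refactoring changes the rounding, the second test fails and forces a conscious decision about whether the old behavior mattered.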

Understand the design first, then the code (4, Informative)

Anonymous Brave Guy (457657) | more than 6 years ago | (#22095364)

I'm afraid you've set yourself an almost impossible task. IME, there are no shortcuts here, and it's going to take anywhere from a few months to a couple of years for a new developer to really get their head around a large, unfamiliar code base.

That said, I recommend against just diving in to some random bit of code. You'll probably never need most of it. Heck, I've never read the majority of the code of the project I work on, and that's after several years, with approx 1M lines to consider.

You need to get the big picture instead. Identify the entry point(s), and look for the major functions they call, and so on down until you start to get a feel for how the work is broken down. Look for the major data structures and code operating on them as well, because if you can establish the important data flows in the program you'll be well on your way. Hopefully the design is fairly modular, and if you're in OO world or you're working in a language with packages, looking at how the modules fit together can help a lot too. Any good IDE will have some basic tools to plot things like call graphs and inheritance/containment diagrams; if not, there are tools like Doxygen that can do some of it independently.

If you're working on a large code base without a decent overall design that you can grok within a few days, then I'm afraid you're doomed and no amount of tools or documentation or reading files full of code will help you. Projects in that state invariably die, usually slowly and painfully, IME.

useful tools for grokking large code bases (0)

Anonymous Coward | more than 6 years ago | (#22095388)

you seem to be looking for commandline tools, but they're never really going to offer a great way to visualize a new complex program, although they can be quite useful in development.

IDEs with class browsers, like eclipse (w/ cdt for non java) or openkomodo are pretty good aids.

for source search, cross linking, and highlighting, the best tool i've come across is opengrok.

if you're more apt to build your own tool, there are a couple of nice libraries out there. scintilla has cross platform language parsers. silvercity builds a python api for examining language constructs. for ruby the recently released ohloh contains parsing capabilities.

also the venerable exuberant ctags is a must for non-IDE development environments. it's a great tool in conjunction with flexible environments like emacs or textmate.

of course, nothing beats a good debugger, and stepping through the runtime execution of the code paths.

Tools For Understanding Code? (1)

mattboston (537016) | more than 6 years ago | (#22095440)

I thought those were called programmers?

kcachegrind (1)

Akatosh (80189) | more than 6 years ago | (#22095456)

kcachegrind is very nice for a lot of languages. It makes an easy to read function call map, among other things.

Look at doxygen/umbrello (3, Informative)

Yiliar (603536) | more than 6 years ago | (#22095458)

See Doxygen and Umbrello.

These tools allow you to 'visualize' a codebase in several very helpful ways.
One important way is to generate connection graphs of all functions.
These images can look like a mess, or a huge rail yard with hundreds of connections.
The modules, libraries, or source files that are a real jumble of cross-connected lines are a clear indication of where to start cleanup activities. :)

Good luck!

Vim and etags (0)

Anonymous Coward | more than 6 years ago | (#22095492)

I use [g]vim with etags. This works really well, even for exploring complex code like the Linux kernel.

Wait 'till you get to reading the specs... (2, Interesting)

crovira (10242) | more than 6 years ago | (#22095546)

That should be good for a laugh or three.

They'll be out of date, full of inconsistencies and incomplete.

Then you'll be reading the code only to discover that people's idiosyncrasies and personalities definitely affect their coding styles. (There's even some gender bias where women tend to set a lot of flags [sometimes quite needlessly] and decide what to do later in the execution while men code as if they knew where they were going all the time, just that when they get there, they're missing some piece of information or other.)

If you read code developed by a whole team of people, you'll get to know them, intimately.

Good luck. You'll be at the bar in no time... I kept the stool warm for you.

The Classics (1)

Dunx (23729) | more than 6 years ago | (#22095552)

In your situation, the thing you need to use most is your voice: talk to people who already understand the code.

The last time I had to do this (with no documentation, meaningful code comments, or engineering support - no voice option!) it was in a mixed-language code base too.

My tools of choice were:

* etags - like ctags, but supporting pretty much any block-structured language. So navigating from Delphi code into C# code actually worked.

* vim - reads etags files, and of course it is my editor of choice.

* grep - etags doesn't work so well on finding references, nor on qualified names in Delphi (and why should it? I was delighted it understood Delphi at all)

Other tools that were used in the team included Eclipse, Visual Studio and Delphi for the parts that they could each understand but jumping across languages was hard in those IDEs.

Then we wrote lots of wiki pages and I drew UML diagrams to capture program structure. We got there in the end, but it was a hard road.

But it was a nasty mess and I sympathise with your predicament.

massive printfs (1)

cathryn (133574) | more than 6 years ago | (#22095610)

Try to change something. Maybe try to fix a bug, something repeatable, but non-cosmetic. Guess names and grep for objects that look like they have about the right name, put in a lot of 'log print' statements and run the thing, adding more log printing as needed. Repeat this every day of your life for about a year.
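The 'log print' approach can be made a little less tedious with a wrapper that logs every call and return. The sketch below uses a Python decorator for that; `parse_order` and `total_quantity` are hypothetical functions invented for the example, and in a C code base the equivalent would be hand-placed printf calls or a logging macro.

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(message)s")
log = logging.getLogger("trace")


def traced(func):
    """Log arguments on entry and the result on exit."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.debug("<- %s returned %r", func.__name__, result)
        return result
    return wrapper


# Hypothetical functions standing in for unfamiliar code under study.
@traced
def parse_order(line):
    return line.strip().split(",")


@traced
def total_quantity(lines):
    return sum(int(parse_order(line)[1]) for line in lines)


total = total_quantity(["widget,2\n", "gadget,5\n"])
```

Running it prints the nested enter/exit lines in call order, which gives a rough picture of who calls whom and with what data.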

Read the Comments (0)

Anonymous Coward | more than 6 years ago | (#22095620)

I find that the best way to catch up is to read the helpful comments left behind by the original developers.

They often contain such helpful gems as "Once again the SCU team folks decided to Ass/U/Me that the replication would occur in the ORD node so we have to come in and clean up their mess again. Thanks a lot Dave!"

What's even more fun is when the variable names are encoded: DO WHILE R1 LT D4; R1++ { ...

R1?!? Not so bad when there are only two variables. Mental Sudoku when there are 25.

mod dokwn (-1)

Anonymous Coward | more than 6 years ago | (#22095640)

what we've known t2he project of OpenBSD versus here, please do

The Slashdot attitude (2, Insightful)

gaspyy (514539) | more than 6 years ago | (#22095734)

I'm appalled by some of the comments that imply that the poster may not be fit for the job.

A few years back I had to maintain a large module written in C#. I had about 200K lines of code, 50 classes, zero documentation, zero comments, zero error logging support, and I was expected to find and fix bugs and add functionality the day after the module was handed over.

So if you were never in this position, just STFU. Yeah, the code is there, but what is this flag for? Is this part really used, or is it obsolete? What are the side-effects of using that method? And so on...

Eventually, I learned it, especially after some intensive debugging sessions, but it was frustrating to say the least. I would have loved to have some aiding tools.

sourcenav-NG (0)

Anonymous Coward | more than 6 years ago | (#22095806)

At least one poster mentioned Source Navigator. I second
this as a good choice for digging into the structure
of several programming languages. I've used it off and
on for several years (even bought a copy back when it was
a Cygnus product). I think the original project
(SourceForge page) is unmaintained (last news posting
was in 2003), so it is a challenge to build on
a modern Linux distribution (there is a Windows
binary as well).

There is a fork working to update the package,
SourceNavigator NG. I was able to build their
release with no problems.

I've used it for C, C++, Java, and some Python.

I highly suggest giving it a look.

Robert Wood

Editplus (0)

Anonymous Coward | more than 6 years ago | (#22095832)

I've used Editplus 2 for years and years - it parses code and color-codes the different elements (functions, variables, strings...).

Where be dragons? (2, Informative)

mm4 (1089615) | more than 6 years ago | (#22095862)

Apart from Understand for C++, I'd also suggest SourceMonitor. It will at least quickly point you to potentially problematic parts (long functions, deep nesting, etc.).
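To give a feel for the kind of metrics meant here, the sketch below measures function length and nesting depth for Python source using the standard `ast` module. It is only an illustration of the idea, not how SourceMonitor works, and the `sample` code and the choice of which statements count as nesting are assumptions made for the example.

```python
import ast

# Statements counted as one nesting level each (an assumption of this sketch;
# note that an "elif" parses as a nested If and so counts as an extra level).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def max_depth(node, depth=0):
    """Deepest nesting of control-flow blocks found under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_depth(child, child_depth))
    return deepest


def function_metrics(source):
    """Return {function name: (line count, max nesting depth)}."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1  # needs Python 3.8+
            metrics[node.name] = (length, max_depth(node))
    return metrics


# Made-up sample code to measure.
sample = '''
def tidy(x):
    return x + 1

def tangled(items):
    for item in items:
        if item:
            while item > 0:
                item -= 1
    return items
'''

metrics = function_metrics(sample)
for name, (length, depth) in sorted(metrics.items()):
    print(f"{name}: {length} lines, nesting depth {depth}")
```

Sorting the results by depth or length gives a quick list of the functions most worth a closer look.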