Learning and Maintaining a Large Inherited Codebase?

Posted by timothy on Friday February 12, 2010 @08:09PM from the bequeathed-and-devised dept.

An anonymous reader writes "A couple of times in my career, I've inherited a fairly large (30-40 thousand lines) collection of code. The original authors knew it because they wrote it; I didn't, and I don't. I spend a huge amount of time finding the right place to make a change, far more than I do changing anything. How would you learn such a big hunk of code? And how discouraged should I be that I can't seem to 'get' this code as well as the original developers?"

Time (Score:5, Interesting)

by wmbetts ( 1306001 ) writes: on Friday February 12, 2010 @08:12PM (#31122206)

If you don't have access to the original developers and they didn't document it, you're just going to have to spend a lot of time reading the code. =\

- Re: (Score:2)
  
  by dintech ( 998802 ) writes:
  
  You might want to come up with a few good reasons (other than just the ones you stated above) for doing a clean-room re-write of the damn thing. This might give you a chance to give the users something better than they already have or that interfaces better with other systems in your enterprise. It's a long shot but doing the requirements gathering and developing it yourself might be more fun than just learning it through reading. Good luck!
  - Re:Time (Score:5, Insightful)
    
    by Anonymous Coward writes: on Friday February 12, 2010 @09:32PM (#31123122)
    
    Everyone, including me, always wants to go for the clean rewrite. But in my experience it almost never turns out for the best. There's a reason for all that messy code. Much of it was bug fixes that real-world users needed. Other complexities were needed in the first place to make the user experience simple (natural, giving it that "hey, it just works like I expected" feeling).
    The reason you don't understand the code is that you weren't part of the original design discussions, in which weeks or months were spent learning, debating, arguing, etc., about many different design decisions at many different levels of abstraction. You don't know why the trade-offs were made. You just see the finished product.
    Rewriting the code won't give you insight into any of this. Learning the code the hard way, fixing bugs, rewriting *small* pieces and seeing what breaks the regression tests, etc. will eventually help you to understand it.
    There is no point in rewriting it before you fully understand it. Attempting that can kill a product. Conversely, by the time you fully understand it, there won't be any need to rewrite it, because you'll own the code.
    
    - Re: (Score:3, Insightful)
      
      by ztransform ( 929641 ) writes:
      
      There is no point in rewriting it before you fully understand it.
      I fully support this statement.
      I recently worked with a guy new to contracting. He came onboard to a project that had a lot of problems. He argued for re-writing it thinking he could do it quickly and simply; I didn't dispute that the system could use significant changes, and I asked him to read through and understand the existing code.
      He never did.
      Consequently I suggested to senior managers that he should be let go. Reading other people's code, particularly undocumented code, is painful - even for ex
- I'm afraid the time may already have passed (Score:2, Interesting)
  
  by Chris Newton ( 1711450 ) writes:
  
  If both the original developers and the knowledge they had have been lost, then it is probably already too late to perform any major maintenance on this code base. The project has already entered its “servicing” stage.
  At that point, you basically have two possible approaches that actually work: you can restrict maintenance to small-scale changes, which may be sufficient if the goal is just to keep the project ticking over for a while, or you can accept The Big Rewrite (which isn’t so big i
A good starting point (Score:4, Interesting)

by RCL ( 891376 ) writes: on Friday February 12, 2010 @08:13PM (#31122210) Homepage

Try single-stepping it in a debugger from the beginning up to the main loop.

- Re: (Score:3, Insightful)
  
  by robot256 ( 1635039 ) writes:
  
  I didn't get this one until I switched to my alter ego, the assembler programmer.
don't feel bad at all (Score:5, Insightful)

by iggymanz ( 596061 ) writes: on Friday February 12, 2010 @08:14PM (#31122234)

So you have been handed the steamin' pile o' code. It is great that you are very cautious and deliberate when modifying it. Make a set of regression tests: that is, a set of test data, procedures, and expected results that confirm the original functionality you still want is still working and that no other errors have been introduced. It is hard, much more tedious than just creating new code with few constraints.
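
A minimal sketch of that regression-test idea, sometimes called a "golden master" check: record what the existing code returns for a set of inputs before touching it, then compare after every change. Everything named here (`legacy_price`, the sample inputs) is hypothetical and only stands in for whatever routine you inherited.

```python
import json

def legacy_price(quantity, unit_cents):
    """Hypothetical stand-in for an inherited routine we do not fully understand."""
    total = quantity * unit_cents
    if quantity >= 10:  # odd-looking rule, possibly an old bug fix or business rule
        total -= total // 10
    return total

# Hypothetical test data: inputs chosen to cover the cases users care about.
CASES = [(1, 250), (9, 100), (10, 100), (25, 399)]

def record_golden(func, cases):
    """Run the current code and capture its results as the expected output."""
    return {json.dumps(case): func(*case) for case in cases}

def check_against_golden(func, golden):
    """Return the cases whose result no longer matches the recorded output."""
    failures = []
    for key, expected in golden.items():
        actual = func(*json.loads(key))
        if actual != expected:
            failures.append((key, expected, actual))
    return failures

golden = record_golden(legacy_price, CASES)             # once, before changing anything
failures = check_against_golden(legacy_price, golden)   # rerun after each change
print("regressions:", failures)
```

In practice the recorded results would probably be saved to a file and kept under version control, so the comparison survives between sessions.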

- Re:don't feel bad at all (Score:5, Insightful)
  
  by kaiser423 ( 828989 ) writes: on Friday February 12, 2010 @09:45PM (#31123232)
  
  Definitely what parent said. Also:
  
  I have inherited huge code bases. I actually kind of like it. Lots of people whom I thought were idiots, and whose code I cursed, I later found out were quite smart. Others, I found, just thought about problems vastly differently than I did, and learning how they tackled problems gave me many more tools in my personal arsenal.
  
  That said, find a big wall or something. Use a debugger or code analysis tool to find the main execution paths (what calls what and when, etc). Diagram that up on the wall really large. Then use the tools to determine when and why certain auxiliary functions get called. Diagram that up, and you'll start getting a spider on your wall. Go from there, using your new understanding to re-arrange the program flow not in terms that make sense to you, but in terms of how the code seems to have been programmed (functional, objective, some pattern). Rinse and repeat until you know pretty much what the code is trying to accomplish in 90+% of situations, and its general plan of attack.
  
  With that diagram, dive in! There's tons of little details in every function that look useless but are usually bug fixes. Use a scalpel, not a hatchet.
  
  I was deployed remotely with no way for the main programmer to get at me. We had prepared 9 months to collect 4 minutes of data, and the test wouldn't wait for us. I found an odd bug hidden somewhere in ~22k lines of code. I did this over a weekend, and found about 4-5 nasty bugs that were combining to produce what I was seeing, and fixed them. I did this with zero input or help, over a weekend in code I had never seen spread around about 60 files. I spent the first half day just diving in and trying things, and nearly shot myself. That's when I went high-level and dug in from there.
  
  When that was done, I then took over code maintenance and updates on that project. The other guy had written it 100% himself, but after that exercise I knew the code better than he did. Sometimes being new is good; you don't have all that cruft of implementations that didn't work, etc, but still linger in the original programmer's head.
  
Use Doxygen (Score:5, Insightful)

by gbrandt ( 113294 ) writes: on Friday February 12, 2010 @08:14PM (#31122238)

Doxygen is your friend. Run it over the source code and keep the HTML handy for searches and cross references.

- Re: (Score:2, Informative)
  
  by eggy78 ( 1227698 ) writes:
  
  I have found that equally useful to Doxygen's standard documentation are the caller/callee graphs (and the source browser as well!). These features are invaluable but they don't get used when you generate documentation with a more-or-less default config.
It depends on the language (Score:5, Funny)

by $RANDOMLUSER ( 804576 ) writes: on Friday February 12, 2010 @08:15PM (#31122256)

If it's Perl or VB, you might want to consider self-immolation as a first step.

- Re: (Score:2)
  
  by rocker_wannabe ( 673157 ) writes:
  
  Simply running out of the room screaming "No!!!!!!" should suffice. There IS life after programming, believe it or not.
  - Re:It depends on the language (Score:5, Informative)
    
    by martin-boundary ( 547041 ) writes: on Friday February 12, 2010 @08:37PM (#31122538)
    
    No, he meant that as an actual offering to the Perl God, Quetzal$@[&shift]L. It's a bloodthirsty god, who never sends the Divine Debugger without at least two pints of the red stuff. I would have immolated a coworker, but the parent poster seems to have been alone in the room :-/
    
    - Re:It depends on the language (Score:5, Funny)
      
      by chill ( 34294 ) writes: on Friday February 12, 2010 @09:58PM (#31123338) Journal
      
      No, he meant that as an actual offering to the Perl God, Quetzal$@[&shift]L. It's a bloodthirsty god, who never sends the Divine Debugger without at least two pints of the red stuff. I would have immolated a coworker, but the parent poster seems to have been alone in the room :-/
      The fact the above comment is +5 Informative and not +5 Funny makes me very glad I stopped programming in Perl when I did.
      
- Re: (Score:2)
  
  by budgenator ( 254554 ) writes:
  
  I was on fire once you insensitive clod.
- Re: (Score:2)
  
  by LostCluster ( 625375 ) * writes:
  
  VB6's actually very easy to understand when you have the code...
  1. You can control-break at any point in program and be shown exactly the line you're executing and step through with F8 or resume at full speed with F5.
  2. You've got a rather nice project-wide search tool to find functions and subs that the old programmer wrote.
  3. You've got an immediate pane for simulating "What if X was set to..." situations.
  4. The previous programmer likely left behind date-stamps in the OS so if a user can tell you when th
- Re: (Score:3, Informative)
  
  by BerntB ( 584621 ) writes:
  
  Funny you should say that...
  I quite like this reference from the Perl world about understanding large systems: http://www.perlmonks.org/?node_id=788328 [perlmonks.org]
Not lots of code (Score:5, Insightful)

by www.sorehands.com ( 142825 ) writes: on Friday February 12, 2010 @08:16PM (#31122260) Homepage

First of all, 30-40,000 lines of code is not lots of code. Try 250,000 lines of code.
To start, use a good programming editor/environment (Xcode, Vslick, Visual Studio, etc.) that gives you the ability to easily go to definition or references to variables, functions, structs and such. Run some sort of profiler or flowchart type program on it to get a high level view of the code and how it fits together. If you can, get the person(s) who worked on it before you to give you an idea of how it fits together.
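
To illustrate the kind of high level view such a tool produces, here is a rough sketch that builds a caller-to-callee map. It assumes the code is Python and uses the standard `ast` module; it only sees plain-name calls inside top-level functions (no methods, no dynamic dispatch), and the `SAMPLE` source is a made-up miniature codebase. A real analysis tool for your language will do far more.

```python
import ast
from collections import defaultdict

def call_graph(source):
    """Map each top-level function name to the set of plain names it calls."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # make sure functions that call nothing still appear
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return graph

# Hypothetical miniature "codebase" used only to demonstrate the idea.
SAMPLE = '''
def main():
    config = load_config()
    run(config)

def load_config():
    return {}

def run(config):
    step(config)
    step(config)

def step(config):
    pass
'''

graph = call_graph(SAMPLE)
for caller in sorted(graph):
    print(caller, "->", sorted(graph[caller]))
```

Even a crude map like this shows where control starts (`main` here) and which routines are leaves, which is usually enough to decide where to start reading.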

- Re:Not lots of code (Score:5, Insightful)
  
  by Coryoth ( 254751 ) writes: on Friday February 12, 2010 @08:49PM (#31122696) Homepage Journal
  
  First of all, 30-40,000 lines of code is not lots of code. Try 250,000 lines of code.
  To start, use a good programming editor/environment (Xcode, Vslick, Visual Studio, etc.) that gives you the ability to easily go to definition or references to variables, functions, structs and such.
  30-40,000 lines can be lots of code, it really depends on how maintainably it is written. I've had to pick up codebases that were somewhat smaller but were still diabolical ... good programming environments don't buy you much when the code consists of functions that are many thousands of lines long making little or no use of typedefs or structs (arrays and lots of variables should be enough right?) and convenient variable names like 'e', 'ee', and 'eee'. Even small codebases can become practically incomprehensible if written with little thought given to long term maintenance.
  
  - Re: (Score:2)
    
    by dgatwood ( 11270 ) writes:
    
    Fair enough. On the other hand, badly written code is self-limiting in size. It almost never gets particularly large because if it is that hard to maintain, it will also be extremely hard to expand in any useful way. Usually by the time it gets past about 10-15,000 LOC, it has to be at least somewhat sensible.
    I tend to agree that 30,000 LOC is not at all large. My trivial little web photo gallery is 8k lines of code. At work, I maintain and periodically enhance a relatively small tool that's about 37k
  - Re: (Score:3, Funny)
    
    by snowgirl ( 978879 ) writes:
    
    so like... perl?
    More precisely 30-40,000 lines of code is 29,999-39,999 times more lines than one needs to write shitty code...
- Re: (Score:3, Insightful)
  
  by greg1104 ( 461138 ) writes:
  
  Sure, if you only have a trivial 250K lines of code, I guess you can use crappy tools like Xcode and Visual Studio to maintain your project. The rest of us have to use grown-up tools that look like this:
  src$ find . -type f -print0 | xargs -0 cat | wc
  1950894 7085675 56777966
  There's only one way to learn your way around a new codebase, and the worst thing you can do is use a tool that aims to help with the job. Want to know how stuff flows through the program? Find where the program starts and draw the
Hunt down the original developer (Score:5, Funny)

by Anonymous Coward writes: on Friday February 12, 2010 @08:16PM (#31122276)

(And then shoot him.)

- Re: (Score:2)
  
  by istartedi ( 132515 ) writes:
  
  (And then shoot him.)
  With Lisp?
  - Re: (Score:2)
    
    by $RANDOMLUSER ( 804576 ) writes:
    
    shoot(huntdown(original developer))
    - Re:Hunt down the original developer (Score:5, Funny)
      
      by ottothecow ( 600101 ) writes: on Friday February 12, 2010 @09:13PM (#31122936) Homepage
      
      shouldn't that be more like shoot(huntdown(first(developers)))?
      
      - Re: (Score:2)
        
        by Matheus ( 586080 ) writes:
        
        No... that would be shoot(huntdown(car(developers)))
Not at all. (Score:5, Insightful)

by hemorex ( 1013427 ) writes: on Friday February 12, 2010 @08:17PM (#31122286)

I find that if the other programmer wrote it in such a way that it's too complex for me to follow, I'm not the one who's a moron.

- Re:Not at all. (Score:5, Insightful)
  
  by tsm_sf ( 545316 ) writes: on Friday February 12, 2010 @08:43PM (#31122620) Journal
  
  Man, always when I run out of mod points.
  Nothing like being handed a steaming plate of spaghetti and hearing about how much of a "genius" its creator was.
  
  - Re:Not at all. (Score:4, Insightful)
    
    by CorporateSuit ( 1319461 ) writes: on Friday February 12, 2010 @09:22PM (#31123028)
    
    Yes, but there's also when you hire the new guy, fresh from college, and he sits down at his work station. After a few days of getting absolutely no work done, he comes to you and tells you he wants to rewrite the core 50K lines of tested, trusted company code because he thinks it's not written "by the book". To which, the only sane reply is "You touch that code, and I will set you on fire."
    
  - Re: (Score:3, Insightful)
    
    by ajlisows ( 768780 ) writes:
    
    Then again, the creator MAY have been a genius. Perhaps he was told "Put this enormous program together in one month or the company is screwed." In cases like that, poorly thought out algorithms, bloated classes, using variables with names like "x", "y", "z" with no comments, nothing really works except for the absolute bare minimum required and other coding no-nos probably do not seem that important. Given appropriate time and resources, perhaps he could have written the greatest code EVAR! Given a ver
- Re: (Score:2)
  
  by Jane Q. Public ( 1010737 ) writes:
  
  To add to that:
  
  What language is it in? That could make a big difference in our answers. But in general, if it is very old code it should at least contain comments. If it was written in the last few years, the code should be in discrete sections that are organized in a logical manner. If not, then they were either seriously old-school programmers, or hacks.
Visualisation (Score:5, Informative)

by gilleain ( 1310105 ) writes: on Friday February 12, 2010 @08:17PM (#31122290)

Anything ranging from just sketching out some informal package diagrams on some paper (I quite like using an A3 sketchpad) to something more like Code City [inf.usi.ch] which can work with code in Smalltalk, Java, and C++. There are UML diagram makers, of course, but automated diagrams like that probably need to be edited.
In fact, it is not the finished diagram that helps so much as the drawing of it, which is why paper and pencil is so good. Or a vector graphics package.

Use it (Score:2)

by mosb1000 ( 710161 ) writes:

The only way to learn the code is to work with it. Simply reading through it won't help, you have to go try to change things and see what works and what doesn't.
The main thing that bothers me when working with other people's code is the sheer number of variables they use. I tend not to declare a new variable unless it is absolutely necessary (and in object-oriented programming variables other than pointers are almost never necessary). It seems like code written this way is easier to read and understand
- Re: (Score:2, Interesting)
  
  by EvanED ( 569694 ) writes:
  
  Am I off base here? What do you think about intermediate variables that are not strictly necessary?
  I can't say you're off base per se (I don't have nearly enough production dev experience to make statements like that, and even if I did, I couldn't speak for everyone), but my personal style is not quite the complete opposite of yours.
  I pretty heavily use intermediate variables. Why? A couple big reasons. One, if you give the temporary variables decent names, they serve as additional documentation. Two, if yo
  - Re: (Score:2)
    
    by mosb1000 ( 710161 ) writes:
    
    From a documentation standpoint, I have never found descriptive variable names to be good enough. The problem is that while the programmer may have a good idea what he means when he chooses a name, and indeed that name may make a lot of sense if you already have a good understanding of how the code works, someone new who is unfamiliar with the program will not understand it because they do not know how the code works. In the meantime, it's a lot of work to track back through intermediate variables (espec
- Re:Use it (Score:4, Insightful)
  
  by phantomfive ( 622387 ) writes: on Friday February 12, 2010 @09:45PM (#31123230) Journal
  
  What do you think about intermediate variables that are not strictly necessary?
  Use them if they make things clearer for someone reading the code, otherwise don't. For example, you can write:
  
  screen.displayName = user.firstName + user.lastName;
  
  or you can write
  
  String fullName = user.firstName + user.lastName;
  screen.displayName = fullName;
  
  Thus making it more clear to someone reading that you are trying to use the full name. That is probably not the best example because anyone would probably understand that user.firstName + user.lastName is the full name, but I think you can see the main point, that sometimes it can be easier to read a few meaningfully named intermediate variables than a long equation. If it isn't easier to read, don't do it. But really when I read code, or even write it, I am willing to conform to either way of doing it if someone else feels strongly about it, because that is far less important than things like flexibility of major structures in the code.
  
- Re:Use it (Score:5, Informative)
  
  by ciggieposeur ( 715798 ) writes: on Friday February 12, 2010 @10:37PM (#31123616)
  
  What do you think about intermediate variables that are not strictly necessary?
  My general rules of thumb:
  1) I don't care how many variables are declared, so long as each makes sense on its own. Like another poster's example, 'fullName' is perfectly acceptable (especially for i18n/l10n aware code that may have different rules for generating a name).
  2) I ABSOLUTELY HATE clever arithmetic / pointer arithmetic / expressions all crunched into one line that can be split out. Example: in C-like languages that support pre- and post-increment, I expect the code to use only one or the other consistently, and never mix it with another expression. So this is fine:
  i++;
  j = i + 4;
  ...but this I can't stand:
  j = ++i + 4;
  #2 I picked up from a very experienced developer who pointed out that making the code harder to read is never worth it, since the compiler produces the same code as for the easy-to-read version. He also pointed out that writing code that looks 'too easy to be clever' is quite a bit harder than writing code that looks 'too clever to always work'.
  
  - Re: (Score:2)
    
    by mosb1000 ( 710161 ) writes:
    
    I said pointers are variables. . .
    variables other than pointers are almost never necessary
    That's what "other than" means.
  - Re:Use it (Score:4, Informative)
    
    by mosb1000 ( 710161 ) writes: <mosb1000@mac.com> on Friday February 12, 2010 @09:12PM (#31122924)
    
    Not without variables, but without unnecessary ones. For example, someone might write:
    int a;
    int b;
    int c;
    int d;
    int e;
    int f;
    int g;
    a = dropBox1.Value;
    b = dropBox2.Value;
    c = dropBox3.Value;
    d = dropBox4.Value;
    e = a + b;
    f = c + d;
    g = e * f;
    result.Value = g;
    While I would write:
    result.Value = ( dropBox1.Value + dropBox2.Value ) * ( dropBox3.Value + dropBox4.Value );
    
30-40kloc is not large (Score:2)

by aachrisg ( 899192 ) writes:

I wouldn't try too hard with a codebase as small as 30-40k lines, but for an actually large codebase, there are a bunch of different things that can help: - examine a class or function hierarchy and call graph. If you have tools to do so and the codebase is set up for it, go ahead. If not, set up the tools and codebase to be processed for this - you'll learn stuff about the code just by hooking these tools up. - pick medium-level routines in the code base that you are interested in and run the application
Trace sessions and time (Score:5, Insightful)

by oldhack ( 1037484 ) writes: on Friday February 12, 2010 @08:22PM (#31122354)

I'll echo some earlier comments.
Set up an execution environment with a debugger, then run several typical scenarios and trace them. Get the feel of the big-picture execution scenarios/paths.
It will take time for your brains to get comfortable with it, though. And the details, when you look into them, will throw odd stuff at you. But that's the nature of our work.
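
If the code happens to be Python, one rough way to get that big-picture trace without stepping by hand is the standard `sys.settrace` hook. The sketch below records which functions a scenario enters, in order. The `scenario`, `parse`, and `total` functions are hypothetical stand-ins for a real use case; other languages need their own debugger or tracing tool.

```python
import sys

def trace_calls(func, *args):
    """Run func(*args) and return the function names entered, in call order."""
    seen = []

    def tracer(frame, event, arg):
        if event == "call":
            seen.append(frame.f_code.co_name)
        return None  # no line-by-line tracing inside the call

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return seen

# Hypothetical "typical scenario" made of three small routines.
def parse(text):
    return text.split(",")

def total(fields):
    result = 0
    for field in fields:
        result += int(field)
    return result

def scenario(text):
    return total(parse(text))

print(trace_calls(scenario, "1,2,3"))
```

Only Python-level functions show up; built-ins such as `int` and `str.split` are implemented in C and do not generate "call" events for this hook.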

Tried and True (Score:3, Insightful)

by cosm ( 1072588 ) writes: <thecosm3NO@SPAMgmail.com> on Friday February 12, 2010 @08:22PM (#31122356)

For culinary folks...
The time and money you spend tracing and inserting noodles in the spaghetti will end up being larger than the time it takes to cook a new batch (no pun intended).

For auto folks...
The time and money you spend bondo-ing, welding, rewiring, duct-taping, and C'n'Cing parts for the car will end up being larger than the time it takes to design and build a new car. (Although restoring an old/vintage car for the sake of nostalgia is a much more pleasing experience than buying a new one).

Gain an understanding of the purpose of each pivotal region. Know what your desired result should be, then begin the rewriting endeavor.

- Re:Tried and True (Score:4, Insightful)
  
  by Piquan ( 49943 ) writes: on Friday February 12, 2010 @09:34PM (#31123138)
  
  These projects invariably have lots of tiny gotchas that you're going to steamroll in your effort to rewrite it. See Joel on Software on this [joelonsoftware.com].
  
Some things I do to figure out code... (Score:2, Interesting)

by CFBMoo1 ( 157453 ) writes:

PL/SQL or COBOL or whatever they throw at me I poke, prod, and play with it in a test environment. Someone up above mentioned pencil and paper to draw out how everything relates and that is a very good practice I've found to just get to know things. It's not instant but it helps more than you initially think. Also I use Open Office Draw to map out things as well. :P
2000 lines can be enough (Score:2)

by sugarmotor ( 621907 ) writes:

2000 lines can be enough to throw you off!
I think it is just like learning anything. Keep at it.
The most important thing is whether you have an efficient way to
look at what effect any changes have that you may make. Any effort you put into
that is probably not going to be wasted. (Might be unit tests? Sounds like they did not come with the code)
Stephan
Read the source! (Score:2)

by Deflatamouse! ( 132424 ) writes:

Seriously... if there is a lack of documentation, then you just have to start reading the source code, starting at main(). Then look at each object and read its constructors.
And start documenting it. Add comments in the code, create inheritance diagrams and sequence diagrams.
It will be tedious but you will come out of it a better programmer.
You don't. You find out what the software did (Score:5, Funny)

by Colin Smith ( 2679 ) writes: on Friday February 12, 2010 @08:36PM (#31122520)

And then you re-implement it in the latest language.

Share
twitter facebook
- Re: (Score:2)
  
  by mikelieman ( 35628 ) writes:
  
  Good luck with that. There are business rules implemented by people who aren't there anymore for people who aren't there anymore. And it's all tied to whether $variable_1 is an "A" or "B" and $variable_2 being 999.
Hope your management understands (Score:4, Insightful)

by syntap ( 242090 ) writes: on Friday February 12, 2010 @08:37PM (#31122536)

I have inherited projects and do my best to convince management that a pause is needed to document the code. Personally I try to flowchart the functionality and cover a couple of office walls with Visio printouts. Later on I can use such work to add detail and further documentation.
I inherited some code where the developer used names of girlfriends in variable names, it was just dumb and completely unprofessional. I didn't worry so much about keeping track of those, I was more worried about a change in one spot having unintended (and perhaps unknown until too late) consequences. Rather than spend time fixing problems, I thought it best to do some up-front documenting to at least provide a path to successful maintenance.
When I left the project, the manager had a binder of documentation and almost cried.

- Re: (Score:2)
  
  by Jane Q. Public ( 1010737 ) writes:
  
  I inherited a Web site that was not only done in a goofy manner, nothing was documented at all. The customer didn't know who the host was, what the passwords were, and so on and so on. Nothing.
  
  My philosophy was that since the customer is footing the bill, nothing should be secret. I spent a bit of time hunting down host, account info, domain name info, contact info, etc, etc... writing it all down in an organized format, and gave it to the customer, rather proud of myself for being professional when the
- Re: (Score:3, Funny)
  
  by greg1104 ( 461138 ) writes:
  
  I inherited some code where the developer used names of girlfriends in variable names, it was just dumb and completely unprofessional.
  I once inherited a coding project where the naming conventions involved anti-depressant, anti-anxiety, and sleeping drugs. Let me tell you, that's a fun preview of how one's future working on the project might turn out.
Try to learn the structure (Score:5, Insightful)

by phantomfive ( 622387 ) writes: on Friday February 12, 2010 @08:39PM (#31122558) Journal

I had an English professor who always said, "Structure is the key to understanding." He was talking about literature, but I think the same is true for programs as well.

Try to understand the structure of the program. What is the basic flow? It should have an initialization routine, a main loop, and a shutdown routine. Find out roughly where they are, then focus on the main loop. Usually there will be one piece of code that is central, and it will occasionally pass control into other large pieces of the program. Sometimes there will be more than one main loop, and control switches back and forth between the various main loops. If the program is event driven, this will make a difference in the structure.

If you are just trying to make a small change, try to find the sequence of events that will lead up to where that change needs to be made. Follow the sequence of execution until you get to the line you need to change. If you are changing a single variable, sometimes it's helpful to do a search and find all the places that variable is used, to make sure your change won't have any side effects. This may seem time consuming, but it can save 10 times more in debugging.

Learn to follow code execution with your eyes, without running a debugger. One thing that separates good coders from not so good coders is the ability to follow code that isn't being executed.
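
The "search for every place a variable is used" step mentioned above can be sketched as a whole-word text search, which is roughly what `grep -w` does. The file names and contents below are invented and held in memory so the example is self-contained; a textual match knows nothing about scope, so two unrelated variables with the same name will both show up.

```python
import re

def find_uses(name, files):
    """Return (filename, line_number, line) for each whole-word use of name."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for filename, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((filename, number, line.strip()))
    return hits

# Hypothetical file contents standing in for the real source tree.
FILES = {
    "billing.c": "int rate = 5;\nint total = rate * qty;\n",
    "report.c": "print(total);\nint rated = 1;\n",
}

for hit in find_uses("rate", FILES):
    print(hit)
```

Note that `rated` in `report.c` is not reported, because the pattern requires a word boundary on both sides of the name.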

- Re: (Score:3, Interesting)
  
  by Trepidity ( 597 ) writes:
  
  Depending on the language and domain, one way to speed up learning the structure can be to see if you can match it to some set of programming idioms, and then read up on those idioms if it's not a style of programming you're familiar with. For example, if it's C++, can you figure out by looking at the code's layout whether it was written by someone big into C++ design patterns? If so, it might be easier to reverse-engineer what it's doing if you read a C++ design-patterns book, and then match large segments
use grep (Score:2)

by AeiwiMaster ( 20560 ) writes:

There is a tool called grep which is very useful.
Done that.. (Score:3, Funny)

by spasm ( 79260 ) writes: on Friday February 12, 2010 @08:53PM (#31122746) Homepage

As someone who recently passed off a pile of code of about that size in poorly written and poorly documented php to someone.. All I can say is I'm very very sorry, and I had *no idea* my personal side project would work better than the original commercial offering and be declared 'mission critical' three months before I left for greener pastures..

Quit (Score:2)

by codepunk ( 167897 ) writes:

I just took the easy way out and quit. I had inherited about 30K lines of php code
that was written by my boss. Shortly after inheriting this spaghetti mess I ran a grep
across the source; the word "function" did not occur a single time in the entire source
tree. To top it all off I was not to rework any of it only maintain it as it was going
away. I did end up installing it on about 5 new machines so going away anytime soon
was not going to happen. On top of all that I would run into about 20 blocks of if
statemen
Divide and Conquer (Score:4, Informative)

by Whomp-Ass ( 135351 ) writes: on Friday February 12, 2010 @08:57PM (#31122788)

Identify each major portion of functionality. If you are working with a sales/billing system you would probably end up with : Orders, Invoices, Payments, Admin.
Go through each of those portions and identify the major portions. Orders: Order headers, Order details, business logic, ui logic, reports, datalayer, etc. Repeat until reduced into easily consumable units.
Pick and stick to an SDLC. Use whatever fits the situation and the resources. For a small project (under 100k lines of code) you should be good by yourself. Anything more and you'll have to involve at least 1 other person for testing. For medium (100k-500k lines) you'll need an additional dev...For large projects (500K-5M lines) you'll need a project manager, lead dev, 2 devs, 1 test, and a UAT team...For larger projects you'll have something unique and frightening to the specifics of the software project and corporation/agency making it...anyway, I digress...
Go through each subdivision line-by-line and re-write it yourself (even if you aren't going to put your re-written version into production); the only way you're going to truly understand what is going on is if you do it yourself. Use whatever language you are most comfortable with or is most appropriate to the task (or languages), it does not need to be the same as the original.
Verify that for a given input, your version produces exactly the same output as the original.
Take a deep breath. It's not a race. It's a one-to-one functional mapping of your software (your mindspace) and the original software (the other developer(s) mindspace(s)). The code probably will not be straightforward. It has also been battle-scarred and will be warty. Changes of initial requirements through time and feature enhancements (feature creep) will have taken their toll on what may have originally been something simple or even elegant. It's something of a niche mindset and if it is not for you, there exist many other exciting things to be programming.
Ultimately, if you do as outlined above, you'll solve many problems, be able to make whatever changes you like, and in so doing have a way to present your design as a replacement if you want...Or not, if you don't; for 30-40k lines parallel development makes sense, in a way, for one person.
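
The "same input, same output" check can be sketched roughly as below, in Python. Everything named here is hypothetical: legacy_total stands in for a piece of the inherited code, rewritten_total for your own version of it, and compare_outputs is just a helper invented for this example. In practice the legacy side would more likely be a call into the real program than a Python function.

```python
def legacy_total(order_lines):
    # Hypothetical stand-in for the inherited code: a hand-rolled loop.
    total = 0
    for quantity, price_cents in order_lines:
        total = total + quantity * price_cents
    return total


def rewritten_total(order_lines):
    # Hypothetical rewrite of the same logic.
    return sum(quantity * price_cents for quantity, price_cents in order_lines)


def compare_outputs(legacy, rewrite, inputs):
    """Return (input, legacy_result, rewrite_result) for every disagreement."""
    mismatches = []
    for value in inputs:
        expected = legacy(value)
        actual = rewrite(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


# Sample inputs; real ones would come from production data or old test runs.
SAMPLE_ORDERS = [
    [],
    [(1, 500)],
    [(2, 250), (3, 199)],
]
```

An empty list from compare_outputs(legacy_total, rewritten_total, SAMPLE_ORDERS) means the two versions agree on those inputs. It says nothing about inputs you did not try, so the sample set should grow every time a bug report turns up a new case.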

That's small (Score:3, Interesting)

by ameline ( 771895 ) writes: <ian...ameline@@@gmail...com> on Friday February 12, 2010 @09:02PM (#31122846) Homepage Journal

Medium size is 250 to 750 thousand lines of code (one person can still understand how it all works). Big is 1 to 10 million lines of code. Really big is >10 million.
I have worked on code bases of all of those sizes, and I like the medium size the best -- it's big enough to be interesting, and small enough that you can understand it all.
One that I've worked on (over 25 million lines) is just too big for my tastes -- over 3 hours to do a clean recompile is excessive.

Don't be discouraged, just keep at it (Score:2)

by rxan ( 1424721 ) writes:

Don't be discouraged. It's not like English where everyone writes in a familiar way. Everyone writes code a little differently and it is hard to go through it. Even with good commenting it can be difficult. Just persist and hope that you can contact one of the original authors.
40,000?!? ARE YOU KIDDING ME? (Score:2)

by raftpeople ( 844215 ) writes:

When I was programming we did every project in 5 lines of code, or less, period. Anything more than that was just fancy stuff!
- Re: (Score:2)
  
  by Surt ( 22457 ) writes:
  
  Sure, but the medical policy must have been ridiculous to cover all the RSIs from the scrolling.
Fix small bugs (Score:2)

by Midnight Thunder ( 17205 ) writes:

I have been given projects of this nature and the best approach is to document what is obvious and then use bug fixing as a way into the code. While it won't give you a complete picture, it should help you understand what is immediately important, and serve as guide posts for knowing more in the future. Generally I try not to spend too much time trying to understand everything, since it's a waste of time, unless that knowledge is guaranteed to serve you - sometimes the client just wants it to be tweaked once i
Design patterns are your friend (Score:2, Interesting)

by PerlPunk ( 548551 ) writes:

"A couple of times in my career, I've inherited a fairly large (30-40 thousand lines) collection of code. The original authors knew it because they wrote it; I didn't, and I don't."
A couple of times in your career? You must be lucky. Most jobs you can get coding will always involve taking over someone else's code.
In my experience, design patterns are your best friend, bearing in mind that most of the code base will always remain a black box to you.
For example, when I was doing some health insurance work,
As a maintenance programmer (Score:5, Informative)

by npsimons ( 32752 ) * writes: on Friday February 12, 2010 @09:42PM (#31123202) Homepage Journal
As someone who has done probably 90% of his work in maintenance programming, let me give you my tips:
- Snapshot what you get - don't change it, don't even look at it. As soon as you get it, check it in, binaries and all, to a change tracking system (eg, CVS, SVN, etc).
- Now that you know what they gave you, and you can get back to it at any time, your options are seemingly limitless, but for the quickest way to get up to speed, I would recommend writing unit tests for the software. This will be long and tedious, but by writing unit tests you will a) learn what to expect out of the software, b) be able to tell when you break something and c) truly learn the software.
- Automate, automate, automate! It's a close call as to whether you should start right away on your first unit test, or get the build system automated, but let me just say that it will save you a ton of time to have a "one button push" way to build, run and test the software. From there, you should be having your machine build and run the unit tests automatically, preferably nightly, from a clean checkout of the repository, just in case you forget to run a test after you change something or you forget to check something in.
- Run the software (including unit tests) through the gauntlet - valgrind's memcheck, electric fence, fuzz, bfbtester, rats, gcc's -fstack-protector-all flag, libc's MALLOC_CHECK_=3, gcc's _FORTIFY_SOURCE=2 define, gcc's -fmudflap flag, gcc's -Wall -Wextra and -pedantic flags; any way you can think to flush out bugs, do it, and start fixing them; you will learn much, not just about the code, but about the thought process of the original coder(s) this way. Change tools as appropriate for your programming language and environment (including compiler/interpreter, libs, OS, etc). As you can tell, I do a lot of C and C++ programming.
BTW, the fact that you have a hard time understanding this code may be more a reflection on the original authors' coding skills than on your abilities; any idiot can write code that "just works"; it takes a lot of thought, time and effort to write code that is maintainable, and more often than not, the original coders were short on at least one of those (if not all three). Here's hoping you have the time to follow my above tips; they take a lot of time, but can be worth it if you really need to maintain the code. It's funny to note that apart from the first one, most of those tips apply equally well to developing software from scratch. If the code already has a change tracking system, unit tests, a build/run/test system, *and* automated testing, consider yourself lucky and just start picking apart the unit tests.
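
To make the unit-test tip concrete, here is a minimal sketch in Python (not the C/C++ the tips were written around) of tests that pin down what inherited code currently does, plus a single entry point a nightly job could call. The function legacy_discount and its threshold are invented for the example; the unittest calls are from the Python standard library.

```python
import unittest


def legacy_discount(order_total_cents):
    """Hypothetical stand-in for an inherited function nobody documented."""
    if order_total_cents > 10000:
        return order_total_cents // 10
    return 0


class LegacyDiscountCharacterization(unittest.TestCase):
    """Records what the code does today, right or wrong, so changes show up."""

    def test_no_discount_at_threshold(self):
        self.assertEqual(legacy_discount(10000), 0)

    def test_ten_percent_above_threshold(self):
        self.assertEqual(legacy_discount(10001), 1000)


def run_all():
    """One-call entry point: returns True only if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        LegacyDiscountCharacterization
    )
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A nightly script could check out a clean copy of the repository, build it, call run_all(), and mail out the result. The expected values above were found by running the code, not by reading a spec, which is the whole point: the tests describe the behavior you inherited.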
- Re: (Score:3, Informative)
  
  by bill_mcgonigle ( 4333 ) * writes:
  
  truly learn the software.
  And then if your unit tests work you'll know enough to comment the code correctly for the next time you or your successor comes back to it.
My Dick is Bigger than Your 250,000 lines of code (Score:5, Interesting)

by BlueBoxSW.com ( 745855 ) writes: on Friday February 12, 2010 @10:16PM (#31123466) Homepage

Really. A guy asks a question for help and all of these people keep telling him 30-40,000 lines of code isn't much.
That's a lot of code to get your arms around if you didn't write it. It's not the end of the world, but it is a sizeable task, and is the type of topic that few professional journals or books will ever be written about.
Having been in similar situations, my advice would be:
1) Try to get an understanding of the history of the code. Who wrote it? Why? How many developers? How long has it been around? Do people love it or hate it? Is there a version control system in place you can use for information?
2) Look at it from a technical viewpoint. Is it complete? Does it compile and run? How many languages are used? Are there interfaces with other systems you need to know about? What dependencies are there? How easy is it to set up a test server? What parts are well coded? What parts stink up the joint?
3) Dig for functional documentation. What does it do? For whom does it do it? What business needs does it support? How mission critical is it?
4) Meet with the business owners. Seriously. This helps you do two things: #1-- Define the real business need (which may be different than what was understood by the previous developers), and #2-- Set appropriate expectations about maintenance. You'll work hard to maintain and keep it working, but you are working from a disadvantaged position. It is important they know this and support you in your efforts, rather than complain loudly when something doesn't work.
5) Plan to remove the dead weight. There's always a lot of dead weight in these near-abandoned projects. Get an idea how to simplify things and plan your work in phases.
6) Set up real test and development servers. Yeah, you know that wasn't already done.
7) Use version control. But you know this. It's 2010, and no developer worth his/her salt would code a paying project without version control. Right?
8) All fixes will take much longer than if you wrote the code, so be careful with estimating time.

- Large? (Score:3, Insightful)
  
  by VirginMary ( 123020 ) writes:
  
  Ha, ha! Just 4 months ago I joined a project with a code base of about 500k lines. I would call that (the 500k lines one) intermediate in size. There are code bases with many millions of lines. I now feel pretty comfortable finding things in it. And I mostly use find and grep.
  - Re:Large? (Score:5, Insightful)
    
    by snowgirl ( 978879 ) writes: on Friday February 12, 2010 @08:47PM (#31122662) Journal
    
    Ha, ha! Just 4 months ago I joined a project with a code base of about 500k lines. I would call that (the 500k lines one) intermediate in size. There are code bases with many millions of lines. I now feel pretty comfortable finding things in it. And I mostly use find and grep.
    At my job at Microsoft, we were in the support end of the core os group. That meant that core os wrote WinXP, Server 2003, Vista, etc, and then it got completely moved over to us to maintain.
    Unfortunately, Windows doesn't really have find and grep, but it does have "dir /s /b [pattern]" and "findstr /sipc:"[pattern]"" Once I learned those, that's a lot of what I used to find the code that I needed to fix.
    All I can say is that it takes time, and effort to become familiar... and you're just stuck with it.
    
    - Re: (Score:2)
      
      by Tawnos ( 1030370 ) writes:
      
      If you're here, then you should know that \\shindex\search has a fully indexed codebase for all branches.
      As for getting acquainted with the code - find places that need improvement, learn them, learn how they interact with their immediate dependencies and neighbors, continue up and out. 30-40k lines is tiny in the grand scheme of code.
      - Re: (Score:2)
        
        by snowgirl ( 978879 ) writes:
        
        If you're here, then you should know that \\shindex\search has a fully indexed codebase for all branches.
        Oh, I knew about shindex... there was also an internal webpage that one could use to search all the codebases as well.
        I however didn't have to deal with all the codebases, I had to deal with one and only one at a time in general, and typically the code was checked in last night, because if it were checked in the night before, it would have broken the build that previous night.
        Actually, Product Studio provided tons of information (better than any code indexing service that was available) about what just chan
      - Re: (Score:2)
        
        by snowgirl ( 978879 ) writes:
        
        Why are you divulging Microsoft's proprietary secrets? What is your employee ID?
        \\shindex\search isn't really a Microsoft proprietary secret... it's more just corporate culture... like talking about Blue Badges vs. Orange Badges. Outside of Microsofties, you're likely to get a bunch of "Huh?" But it doesn't divulge anything about Microsoft business practices.
        As for me divulging information about MS, I haven't worked there in about two years, and I don't think that there is any duty of care to ensure that I don't share any trade secrets... any of them that I have are old, and most like
    - Re: (Score:2)
      
      by timmarhy ( 659436 ) writes:
      
      are you for real. google win grep, or are you going to tell me windows doesn't really have google either?
      - Re: (Score:2)
        
        by snowgirl ( 978879 ) writes:
        
        are you for real. google win grep, or are you going to tell me windows doesn't really have google either?
        Well, they actually had an internal webpage that would do Google and MSN search (at the time) at the same time and allow you to rate how well MSN search did compared to Google.
        But why install grep, when findstr has all the same functionality? Just because I'm familiar with it?
    - - Re:Large? (Score:5, Interesting)
        
        by snowgirl ( 978879 ) writes: on Friday February 12, 2010 @09:15PM (#31122954) Journal
        
        Are you Microsofties really so stupid and ignorant that you're not aware of the ports of GNU utilities to Windows [sourceforge.net] or Cygwin [cygwin.com] or even your own company's Interix [wikipedia.org] and Services for UNIX [wikipedia.org] products?
        No, but to explain this, I need to give you some background.
        When I joined Microsoft, I hadn't used any version of Windows at all for any reason other than playing games. After joining Microsoft, I never used Windows at home for any purpose other than logging into the VPN to work from home... and since I did not even have an x86 machine, this required using Virtual PC on my Mac OSX box.
        Now, I know of all of these tools, and I even could install GVim on the machine as well. However, I was working in a Build Group. This required me to occasionally log into 100 different machines at once in order to start the build process for WinXP/Server 2003. Most of these machines require no more input than logging in and starting up a single app... thus no reason to install special software on them.
        Then, something would break, and I would have to read logs, and/or code on the actual box that had the exact problem. Spending an hour installing apps to do my job would be an unacceptable use of my time, and delay the build unnecessarily.
        I learned to use the tools that were available with the environment that I was in. Thus, I did almost all of my programming at Microsoft in notepad.exe, and I'm not kidding you.
        Were I in a different group? The results could have been different... but having 100 different machines, most of which I didn't have admin rights to, meant that even just installing Notepad++ or something like that would have been a waste of time.
        
        
        Re: (Score:3, Informative)
        
        by StuartHankins ( 1020819 ) writes:
        
        Sysinternals has a great tool you can use to automate installs / run software on multiple machines at once, called psexec. Depends on whether you need to run them interactively, in which case you'd have to also script a login. In the future maybe that's a workable solution for you, especially if you have to use large numbers of computers running Windows. Without grep, head, tail, less, etc I'd feel a bit frustrated. Of course if you're discouraged from installing something that's another issue as well. If n
        
        Re:Large? (Score:5, Interesting)
        
        by snowgirl ( 978879 ) writes: on Friday February 12, 2010 @09:29PM (#31123090) Journal
        
        What the hell? Are you serious?
        So Microsoft themselves hired you to work on Windows, although you were a Mac user and had absolutely no real experience with Windows?
        Not only that, but you had to manually log in to hundreds of systems just to run a script? They didn't push for this to be automated, and you tossed back on the street where you belong? What the hell?
        Don't get me wrong, I don't doubt that your story is true. It's the sort of shit that we should expect from any large company, especially Microsoft. Please tell me you're an H1B, though. At least then it'd make some sense why they'd hire you. H1Bs typically aren't worth more than a batch file.
        Yeah, it took me about a month before I understood that my entire group would be replaced by a few scripts in the Open Source world.
        The primary problem was that because the source code was not a "product", the build code was so full of holes and edge-cases and hacks, that it broke almost constantly, and required someone to babysit it for the whole 14-some hours that it takes to compile.
        Actually, in my orientation class, we went over patents, copyright, and trademark, and I knew it all, and the teacher asked me how I knew so much, and I told her that I owned a registered copyright on some GPL code, and she was like, "and your managers hired you knowing that?" And I was like, know about it? It's the only reason I got hired by Microsoft... be damn sure I didn't submit a resumé.
        
        
        Re: (Score:3, Informative)
        
        by benjamindees ( 441808 ) writes:
        
        At my job at Microsoft, we were in the support end of the core os group.
        Windows doesn't really have find and grep, but it does have "dir /s /b [pattern]" and "findstr /sipc:"[pattern]""
        When I joined Microsoft, I hadn't used any version of Windows at all for any reason other than playing games.
        I did almost all of my programming at Microsoft in notepad.exe
        it took me about a month before I understood that my entire group would be replaced by a few scripts in the Open Source world.
        Dear lord, this is the most hilarious thing ever posted to /.
        
        Re:Large? (Score:4, Informative)
        
        by snowgirl ( 978879 ) writes: on Friday February 12, 2010 @09:44PM (#31123224) Journal
        
        Most of these machines require no more input than logging in and starting up a single app... thus no reason to install special software on them.
        Then, something would break, and I would have to read logs, and/or code on the actual box that had the exact problem. Spending an hour installing apps to do my job would be an unacceptable use of my time, and delay the build unnecessarily.
        "Then something would break" contradicts the earlier statement "no more input than logging in"
        The fact that something is likely to break, and you will need to troubleshoot it, should be reason enough in itself to install some (small) convenient, unobtrusive troubleshooting tools, as standard practice, and as part of the standard initial installs for those servers, to make troubleshooting faster and not require software installations or elaborate practices when things do break.
        You missed a part before the quote that you pulled out. "Most of the machines required no more input".
        My statements remain consistent and not contradictory when only 2 machines typically need direct interfacing.
        And small convenient, unobtrusive troubleshooting tools WERE installed as standard practice on the machines... I already said that there was dir /s /b, and findstr... do I have to have "find" and "grep" when I had tools with the same functionality?
        When I started off, there was a big learning curve because of the new tools, but by the time I left, it was as second nature to me as was find and grep when I joined.
        
        
        Re: (Score:3, Interesting)
        
        by snowgirl ( 978879 ) writes:
        
        I used to work in a similar environment in a university. Tons of windows machines, that I didn't have admin access to. I just carried a usb with me with all sorts of tools that didn't require any more access than a user would have. Seriously borland made a grep for dos that was 7 k back in the 90's. It doesn't sound like you were very creative, but your story does illustrate why the lack of decent command line tools *by default* sucks.
        I didn't even have physical access to the machines. We just RDPed into them, and I had to be logged into every machine at the same time.
        While I had a DFS share that had some of my own tools in it, the problem with running GVim or such off of that is just one of convenience... there were already decent command-line tools available... findstr really does cover everything that I've ever tried to do with grep...
        So, the effort of going out of my way to jury rig all this stuff together wasn't any better than jus
      - It's just not the same. (Score:3, Informative)
        
        by tjstork ( 137384 ) writes:
        
        the ports of GNU utilities to Windows [sourceforge.net] or Cygwin [cygwin.com] or even your own company's Interix [wikipedia.org] and Services for UNIX [wikipedia.org] products?
        I had Win7 and Vista Ent with Services for Unix I downloaded, and it just did not feel right or work right. The command line utilities work, in part, because the whole OS in Unix is basically a tree of text files. Windows isn't, and so the utilities tend to be less effective. Plus, some gotchas like how Windows handles open files w
    - - Re: (Score:2)
        
        by snowgirl ( 978879 ) writes:
        
        > Windows doesn't really have find and grep
        Um... cygwin?
        Ok, again, this time with special emphasis for the retarded... WINDOWS ITSELF does not have find and grep.
        Any GNU OS will, GNU/Linux and GNU/Hurd included, as does any BSD OS.
- Re:30 to 40 thousand lines isn't large by any meas (Score:2)
  
  by ravenspear ( 756059 ) writes:
  
  unless they used a God class for everything.
  - Re:30 to 40 thousand lines isn't large by any meas (Score:5, Insightful)
    
    by hobo sapiens ( 893427 ) writes: on Friday February 12, 2010 @10:51PM (#31123702) Journal
    
    I am currently working with a mission-critical codebase, which is written in PHP and has absolutely no cohesive design to it. Well, unless you consider making everything static and unnecessarily inheriting other classes and overwriting static variables willy-nilly a cohesive design. There are business rules just everywhere and API requests everywhere and all kinds of calls that overwrite static variables. If you don't methodically trace logic it's really easy to get lost. What makes it worse is that there are many many variables that are named very similarly and you don't really know which one is right and which one is just going to get overwritten in some method call you are not looking at right now. And if this software fails, the worst case scenario is that my company makes no money. It really has made my life over the last few weeks pretty horrid. Fortunately I enjoy the job and the co-workers and am well respected there. Otherwise, it wouldn't be worth the aggravation.
    My advice: communicate your difficulties to everyone who will listen (refrain from complaining or bellyaching, just communicate). If you inherit something like this, and it is mission critical, then you need to take as long as it takes to get it right. That's right, AS LONG as it takes. Take the time to document everything. Bother the crap out of anyone who can help you. You are responsible for doing your job, and part of doing your job is figuring out how to maintain this beast. And in order to do that, you need to use every resource at your disposal. If anyone wants to rush you along, you need to communicate the difficulty and the importance of the task. If you have been working at a place for a while and have done a good job to date, then they should trust you. If you're brand new, then you'd better hope someone there values your opinion and doesn't merely think you are incompetent. If you are asked to make enhancements, don't refactor until you understand the code. So make enhancements, leaving the potentially crappy code in place, even copying it if necessary. Steadfastly resist the temptation to refactor until you understand the entire piece that you are trying to refactor. Don't remove seemingly unnecessary variables, and don't reduce seemingly redundant database calls. That comes later when you actually know what you are doing in there. IOW, if you have to navigate a lion's den by touch, don't stop to groom the sleeping lion (unless of course, that is your given task.)
    The word inherit seems to imply that either the original maintainer no longer works there or has moved on to a different position. This means that it's you on the hook to figure it out. You've gotta dig in, buckle down, and get to it.
    
- Re:30 to 40 thousand lines isn't large by any meas (Score:5, Funny)
  
  by istartedi ( 132515 ) writes: on Friday February 12, 2010 @08:35PM (#31122508) Journal
  
  Very well, sir. Here's your 40,000 lines of Perl from the late 90s. It's mostly regex to parse revisions 30 through 451 of our in-house provisioning system. Oh, and BTW don't screw up like the last guy who had this job. He provisioned 32767 customers with tier-1 service, and it was the director's job to explain why we either had to let them have it for the remainder of the year, or else deal with the CR issues.
  
  - Re: (Score:2)
    
    by abigor ( 540274 ) writes:
    
    That is indeed a heinous scenario, but don't conflate "obfuscated" with "large".
  - Re: (Score:3, Insightful)
    
    by GryMor ( 88799 ) writes:
    
    I currently maintain several million lines of perl. It's not hard, it mostly just works, and when it doesn't, it's not that hard to figure out where it's broken IFF there is a consistent repro case for the problem.
    If you have a proper development/production divide, there shouldn't be any weird production issues unless you or your predecessor missed some test cases. If you don't have test cases, that's a problem, if you don't have a properly firewalled and complete development environment, that's a problem,
  - Re:30 to 40 thousand lines isn't large by any meas (Score:5, Funny)
    
    by benjamindees ( 441808 ) writes: on Friday February 12, 2010 @09:40PM (#31123174) Homepage
    
    Perl is like the matrix. At a certain point, after you've stared at it long enough, it all just makes sense.
    
- - Re: (Score:2)
    
    by pclminion ( 145572 ) writes:
    
    One million lines is starting to feel big.
    - Re: (Score:3, Insightful)
      
      by hobo sapiens ( 893427 ) writes:
      
      well that depends on how many developers we are talking about. The original question seems to indicate that the author has inherited the codebase. The need for this question wouldn't exist if the person were on some large team.
      For one or two or five people, 40K lines is a sizable codebase, especially if it has been poorly maintained / designed.
      - Re: (Score:3, Insightful)
        
        by Z00L00K ( 682162 ) writes:
        
        It somewhat depends on the language used - some languages are easier to penetrate than others. And some languages do more in 10 lines than other languages do in 100.
        But anyway - to learn the code you may have to find a starting point (there is usually at least one logical point to start) and then make a flowchart in PowerPoint or something for the general structure. There's no point trying to get into the finer details, just a general sense of flow. You will get things wrong in the beginning, but don't worry
  - Re: (Score:3, Interesting)
    
    by etymxris ( 121288 ) writes:
    
    I inherited a code base of 1.5 million lines of code at the last job I was at. Thankfully I wasn't the only one responsible for it. My advice to the original poster is to add lots of logging information. Log statements should document what the code is doing at any point in time and tell you where it is doing it. If it's java you can get the stack trace from anywhere--this is very handy for logging.
    - Re:30 to 40 thousand lines isn't large by any meas (Score:4, Informative)
      
      by Garridan ( 597129 ) writes: on Friday February 12, 2010 @08:45PM (#31122642)
      
      Oh yeah, well I just inherited a code base of 2.8 trillion lines of assembly code, and I have to read it over a 12.734 baud VAX connection! Why, back in my day...
      Anyway... I've taken on a few large-scale software projects before, and my approach has always been "read twice, hack once". I agree with the parent, and I'll add a note: for the love of everything sacred and unholy, use revision control, and don't trust it -- that is, back up incessantly. Document the hell out of your process. Once you've really learned the system, you might want to back out some of the newbie mistakes that you're making right now.
      And yes. Learning a big system takes a lot of time -- you should be reading much more than writing until you've learned it. I find it helpful to diagram dependencies / draw up finite state machines.
      
      - Re: (Score:3, Funny)
        
        by Enleth ( 947766 ) writes:
        
        Seeing software problems in terms of Flying Spaghetti Monsters? Ah, so that's where the "spaghetti code [wikipedia.org]" term comes from!
  - Re: (Score:3, Interesting)
    
    by QRDeNameland ( 873957 ) writes:
    
    Just out of curiosity, what is your opinion of a "Large" codebase then?
    My first programming job was on an enterprise system that was over 7 million lines of just C++ code by the time I left, not including SQL stored procedures, web server code for the reporting system, and surely other code stuff that I can't recall. The entire development team for the system was something like 45 programmers. So to many of us, 30-40 klocs does not seem like a large codebase at all.
    That said, I've also inherited code in the 10-50 kloc area of magnitude that was far more of a challenge/nigh
    - Re: (Score:2)
      
      by lgw ( 121541 ) writes:
      
      My first programming job was also about 7 million lines of code - all assembly code. There were 5 of us maintaining it, and some of the object we were maintaining we didn't have matching source for (which isn't hopeless in assembly programming, fortunately, just time consuming and annoying).
      You can just read through 30 klocs in a few months, not a big deal, really. But for a larger codebase you have to learn how to do bugfixes without understanding the entire system. You can often find the source of an e
    - Re: (Score:2)
      
      by RogerWilco ( 99615 ) writes:
      
      It's not just architecture and coding standards. What I find, is that up-to-date documentation is very important. Not so much details about lines of code, but the general design, control flow and design decisions.
- - Re: (Score:2, Funny)
    
    by binarylarry ( 1338699 ) writes:
    
    yeah, the clown always creeped me out as well.
    - Re: (Score:2)
      
      by Deflatamouse! ( 132424 ) writes:
      
      It floats... they all float down here...

