Memory Leaks

michael posted about 13 years ago | from the bloatware dept.

Programming 34

G3ck0G33k writes: "Is there any free software version/clone of Rational's programs PureCoverage and/or Purify? I have worked with both of them on fairly large projects (>150,000 lines of code) and they were great to work with. When the first runs of Purify found nearly fifty instances of minor memory leaks, I was deeply frustrated/impressed. A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget. Of course, the more kinds of leaks it may detect, the better. GeckoGeek" We had a similar question last year but there's no harm in seeing what the current answers are.


Bounded pointers, etc. (5, Informative)

Lumpish Scholar (17107) | about 13 years ago | (#2114924)

For the other kinds of stuff Purify does (aside from memory leaks), look at Greg McGary's bounded pointer [gnu.org] work.

Bad news: You'll have to build your own gcc (Greg's changes haven't yet been accepted into the gcc trunk), and all your libraries (just as Purify re-writes all your libraries).

Good news: The resulting code is much faster than Purify'ed code, and finds some problems Purify doesn't. I know of a major software development effort (hundreds of developers, millions of lines of code; sorry, can't give details) that uses bounded pointers to great advantage.

Other tools: GNU Checker, dbmalloc, Bruce Perens' Electric Fence, MemProf, mpatrol, and Mprof; Google searches will turn them all up.

Re:Bounded pointers, etc. (1, Interesting)

Anonymous Coward | about 13 years ago | (#2153487)

Interesting - I had the same idea for a hardware implementation, where each index register would store the bounds. The operation "restrict to tighter bounds" is allowed, while the operation "expand bounds" is not. So the kernel begins with a single pointer to all RAM, and then gives pointers to the user-mode code which point to subsections. It's not even much extra silicon, although a pointer becomes 96 bits!
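
The "restrict but never expand" rule can be tried out in software. The following is only a minimal sketch under assumptions: the type bounded_ptr and the functions bp_make, bp_restrict and bp_load are hypothetical names made up for illustration, not part of Greg McGary's gcc work or of any real hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "fat" pointer: an address plus the bounds it may touch.
   With 32-bit addresses this is three words, the 96 bits mentioned above. */
typedef struct {
    uint8_t *ptr;    /* current position */
    uint8_t *base;   /* lowest address that may be accessed */
    uint8_t *limit;  /* one past the highest address that may be accessed */
} bounded_ptr;

/* Create the initial pointer covering a whole region (the "kernel" pointer). */
static bounded_ptr bp_make(void *region, size_t size)
{
    bounded_ptr bp;
    bp.base = (uint8_t *)region;
    bp.limit = bp.base + size;
    bp.ptr = bp.base;
    return bp;
}

/* Restrict to tighter bounds. Returns 0 on success, or -1 if the request
   would reach outside the parent's bounds, which is never allowed. */
static int bp_restrict(bounded_ptr parent, size_t offset, size_t size,
                       bounded_ptr *out)
{
    size_t span = (size_t)(parent.limit - parent.base);

    if (offset > span || size > span - offset)
        return -1;
    out->base = parent.base + offset;
    out->limit = out->base + size;
    out->ptr = out->base;
    return 0;
}

/* Checked load: stores the byte and returns 0, or returns -1 if the
   index falls outside the bounds carried by the pointer. */
static int bp_load(bounded_ptr bp, size_t index, uint8_t *value)
{
    if (index >= (size_t)(bp.limit - bp.ptr))
        return -1;
    *value = bp.ptr[index];
    return 0;
}
```

Handing a subsection to user code is then bp_restrict(all_ram, offset, size, &user), and every access through the result is checked against the narrower bounds.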

Re:Bounded pointers, etc. (1)

wik (10258) | about 13 years ago | (#2174329)

You might be interested in reading about the Unisys A-Series Mainframe architecture. The hardware does automatic bounds checking on arrays (and, with the support of the operating system, kills your program off if it tries to touch something it's not supposed to). The only recent public document that I know of is the architecture support reference manual at:

http://public.support.unisys.com/aseries/docs/HMPN X05_SSR461_SSP4/PDF/70126610.PDF

Unfortunately, that document is quite dense (and you're going to have to remove the lameness filter modifications). The A-Series actually uses a structure called an ASD (actual segment descriptor) to store information about the base address, length and type of data in the array, among other things. Of course, the processor can take a look at that data in parallel with accessing data in the array (and throw an exception before committing any changed data), so it has almost no performance cost (aside from reading the ASD, which is probably on par with the cost of loading the array length into a register at the beginning of a loop).

More food for thought: the architecture also has additional "tag" bits on every data word. These give some primitive type information (e.g. code, single-precision real, array element, ASD, etc.). The processor will not allow a program to arbitrarily change data in a code segment, or things such as return addresses on your stack. I don't know if there are any other machines around today that still have this attribute (if anyone knows of some, please post!). For example, it makes a lot of the recent buffer overflow attacks that we see a moot point, since a string transfer operator would not be allowed to touch the stack frame!

Memory leak detection (5, Informative)

pthisis (27352) | about 13 years ago | (#2118340)

The Boehm-Weiser garbage-collecting malloc() can be built in a leak-detection mode. Every time an object is leaked, it prints out the address of the memory in question. Do that. Then it's 15 lines of python to correlate that back with the malloc() calls; I wrapped malloc/realloc to print out the line number and filename, e.g.

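#include <stdio.h>
#include <gc.h>

/* Allocate through the collector and log where each block came from. */
void *our_malloc(size_t howbig, int line, char *file)
{
    void *p;

    p = GC_malloc(howbig);
    /* One log line per allocation: source line, file, and returned address. */
    fprintf(stderr, "Line %d of %s/(): %p\n", line, file, p);
    return p;
}
#define malloc(x) our_malloc(x, __LINE__, __FILE__)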

with similar for realloc (and make free do GC_free).

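Then run the proggy, redirecting stderr through a simple Python script:

import sys

# Map each logged address to the allocation line that produced it.
allocations = {}

for line in sys.stdin:
    line = line.strip()
    start = line.find("0x")
    if start == -1:
        continue  # no address on this line
    # The address runs from "0x" to the next space (or the end of the line).
    num = line[start:].split(" ")[0]

    if line.startswith("Line"):
        # Our wrapper's log line: remember where this address was allocated.
        allocations[num] = line
    else:
        # Any other line carrying an address is a leak report from the collector.
        print("Leaked object: " + allocations.get(num, "unknown allocation " + num))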

When I run my program this way I get the following output:

Leaked object: Line 43 of leak_stuff.c/(): 0x806efe0
Leaked object: Line 43 of leak_stuff.c/(): 0x806eff0
Leaked object: Line 55 of leak_stuff.c/(): 0x806dfd8

That tells me which lines to check for the initial allocations of the leaked objects.

The garbage-collecting malloc is really cool; it's at:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

for now, but rumor has it that gcc will become the official source for it at some point (it's needed for the Java compiler).

Sumner

Re:Memory leak detection (1)

Hangman Jim 99 (85153) | about 13 years ago | (#2140560)

This is all fine and dandy, but what about closed source, stripped 3rd party libraries?

I'm using closed source libraries in a multi-million line project, and I think they have a memory leak.

I can't wrap malloc in their code, 'cos I don't have it. I call functions like FMLAdd() and it all happens magically.

Re:Memory leak detection (1)

pthisis (27352) | about 13 years ago | (#2120643)

You can use LD_PRELOAD to wrap malloc, assuming they're dynamically linked against libc (almost definitely). If they use GNU libc and don't dynamically link, they're required by the LGPL to distribute object files so you can relink against your own libc.

If you don't have the source, fixing a leak is tough, but you can rebuild the garbage-collecting malloc in a redirect mode so their app uses it instead of libc's malloc, then LD_PRELOAD it. I used to do this with netscape-communicator back when it leaked like mad; it worked great, though as I mentioned there is a chance that gcc's optimizations could confuse the gc. In practice it seemed to work okay. For any app where a very rare crash isn't the end of the world (netscape crashed all the time anyway) and where the app is already leaking, it's worth a try.

Sumner

Re:Memory leak detection (2)

PD (9577) | about 13 years ago | (#2169878)

I was going to write the same thing about the Boehm GC, but yours was the first message I saw. Seriously, this collector is an excellent thing. A person would have to be crazy to do manual memory management unless they had a solid technical reason to.

Re:Memory leak detection (3, Informative)

pthisis (27352) | about 13 years ago | (#2122452)

Be careful using Boehm in production code; the web page has this caveat:

C compilers may not hide pointers in the generated object code. In our experience, standard commercial compilers obey this restriction in unoptimized code. Most aggressive optimizing compilers do not obey this restriction for all optimized code. For details and examples see papers/pldi96.ps.gz. However, it is difficult to construct examples for which they violate it, especially for single-threaded code. In our experience, the only examples we have found of a failure with the current collector, even in multi-threaded code, were contrived.

However, the gcc developers claim that gcc does in fact violate this constraint. So using Boehm gc with gcc may not be safe in production code. The gcc mailing list has had a couple of threads on how to make gcc garbage-collector friendly in the future (once again, Java is one impetus for this). Until then, I'd stick to manual mm and use the gc only to help find leaks.

Sumner

Re:Memory leak detection (1)

PD (9577) | about 13 years ago | (#2121105)

Thanks, good to know. But personally I won't have to worry about this. All of my gcc code is permanently under development. I once read about something called "optimization". I hope to use it when I get a program completed, someday.

(I finish things at work to my manager's satisfaction. At home, I finish things to my satisfaction, and I'm never satisfied.)

Re:Memory leak detection (4, Informative)

d^2b (34992) | about 13 years ago | (#2169884)

dmalloc (www.dmalloc.com) seems to work pretty well for finding memory leaks. It is distributed under a BSDish license.

Compiles and runs out of the box on an Alpha running Linux.

GUI? uh no. It has a nifty command line utility to control logging etc...

ccmalloc (1, Informative)

Anonymous Coward | about 13 years ago | (#2120027)

I went through this phase of trying to fix up the memory handling of all the code I'd ever written. I found ccmalloc [inf.ethz.ch] to be the best. It's the easiest: instead of gcc -o prog prog.o, you just prefix the link command with ccmalloc, e.g. ccmalloc gcc -o prog prog.o. It provides a nicely formatted output log file, with configurable filtering, showing the stack trace of each unfreed block, and also catches over/underflows and lots of other stuff. Hint: if you are using the C++ standard library, get g++-3 (with libstdc++-3) and #define __USE_MALLOC to disable malloc pooling. RPMs here [rpmfind.net]

Repeat (0, Redundant)

AX.25 (310140) | about 13 years ago | (#2121932)

Re:Repeat (-1, Flamebait)

AX.25 (310140) | about 13 years ago | (#2121936)

Stupid moderators. Not only has this article been answered many times in the last year, it is answered much more clearly in my link from December 2000 than in the one Timothy posted from September 2000.

Re:Repeat (-1, Troll)

Anonymous Coward | about 13 years ago | (#2123933)

If you wanna be a k-whore, you have to re-post those links and other info from those stories, bitch. I'm sure someone will demonstrate. To whore karma you can't just jump in and shout "hey you guys are stupid, mod me as insightful for noticing !" you have to be a true sycophant.

Re:Repeat (0)

Anonymous Coward | about 13 years ago | (#2151468)

I don't think asking that a relevant post not be mod'd down is being a karma whore.

Re:Repeat (-1, Troll)

Anonymous Coward | about 13 years ago | (#2129626)

Mr. Idiot moderator. Care to explain how the previous post is redundant? Seems to me this whole article is redundant and Timothy is stupid for posting it again.

Here's a quick and easy solution (2, Informative)

spacewhale (253229) | about 13 years ago | (#2125915)

Write a malloc wrapper and #define it in place of the real thing. With #define you can easily log the location in the code, amount of RAM, and location in memory to a file, then write a script in the language of your choice to see which locations in RAM weren't dealloced, and match them with the appropriate malloc call, which also contains the location in code. It took me about an hour to implement this in a multi-thousand line program and it works very well. The only thing it doesn't catch is when a library call mallocs something and expects you to dealloc it, but I solved this by including a fake malloc call that just logs but doesn't actually malloc, so you'd call it right after the library call that actually does the malloc.
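
A rough sketch of that approach follows, with the matching step also written in C so it fits in one file. Everything named here is an assumption made for illustration: mem_log, log_event, log_malloc, log_free, NOTE_ALLOC (standing in for the "fake malloc") and report_leaks are hypothetical names, and the log format is invented. It is not thread-safe and tracks at most MAX_LIVE outstanding blocks.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical log stream; point it at a file before the first allocation. */
static FILE *mem_log;

/* One event per line: kind ('A' or 'F'), address, size, source location. */
static void log_event(char kind, void *p, size_t size, const char *file, int line)
{
    if (mem_log && p)
        fprintf(mem_log, "%c %p %lu %s:%d\n", kind, p, (unsigned long)size, file, line);
}

static void *log_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    log_event('A', p, size, file, line);
    return p;
}

static void log_free(void *p, const char *file, int line)
{
    log_event('F', p, 0, file, line);  /* log before the pointer becomes invalid */
    free(p);
}

/* From here on, plain malloc()/free() calls go through the logging wrappers. */
#define malloc(n) log_malloc((n), __FILE__, __LINE__)
#define free(p)   log_free((p), __FILE__, __LINE__)
/* The "fake malloc": only logs a block that a library allocated for us. */
#define NOTE_ALLOC(p, n) log_event('A', (p), (n), __FILE__, __LINE__)

#define MAX_LIVE 1024
typedef struct { char addr[32]; char where[256]; unsigned long size; } live_block;

/* Replay the log, print every allocation with no matching free, and
   return the number of outstanding blocks. */
static int report_leaks(FILE *log, FILE *out)
{
    static live_block live[MAX_LIVE];
    char kind, addr[32], where[256];
    unsigned long size;
    int count = 0, i;

    rewind(log);
    while (fscanf(log, " %c %31s %lu %255[^\n]", &kind, addr, &size, where) == 4) {
        if (kind == 'A' && count < MAX_LIVE) {
            strcpy(live[count].addr, addr);
            strcpy(live[count].where, where);
            live[count].size = size;
            count++;
        } else if (kind == 'F') {
            for (i = count - 1; i >= 0; i--) {
                if (strcmp(live[i].addr, addr) == 0) {
                    live[i] = live[--count];  /* matched: drop it from the table */
                    break;
                }
            }
        }
    }
    for (i = 0; i < count; i++)
        fprintf(out, "Leaked %lu bytes at %s, allocated at %s\n",
                live[i].size, live[i].addr, live[i].where);
    fseek(log, 0, SEEK_END);  /* leave the stream ready for more logging */
    return count;
}
```

After a library call that hands back memory you are expected to free, NOTE_ALLOC(ptr, size) records it, so the later free(ptr) has something to match against.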

dmalloc for memory debugging (3, Informative)

epperly (188343) | about 13 years ago | (#2129544)

I like dmalloc [dmalloc.com] for memory debugging. It even found a memory bug for a program that Purify choked on. It doesn't have a GUI.

Use block tagging or an allocation lookaside (0)

Anonymous Coward | about 13 years ago | (#2134868)

Looks like libc doesn't have heap-walking APIs anymore... You can solve this pretty easily by putting a "tag" in front of each allocation and passing back a pointer just past the tag space; free just moves the pointer it is passed "backwards" to the tag, then uses normal free, and voila. A heap walk is then done at the end to find all remaining blocks, and you just print the tags that are left.

However, in lieu of that, you can do your own pseudo-tagging by keeping lists of allocations, more or less what a garbage collector would do. You can then look at your list during process shutdown to see who allocated what where. As an added bonus, you can capture some stack backtraces so you know the context of the allocation. Far too often I find myself writing an object factory, and I know my heap tagging procedures will be useless without stack context - having 2000 allocations all at "factory.cpp:3042" isn't my idea of fun. The code below can be fixed up to use stack backtracing if it's available on the platform. (I know on Win32, you can capture stack traces by faking an SEH exception, capturing the stack frames, and then using imagehlp.dll to map the eip's back to their associated functions... I've been too long away from *nix C to know if something similar is available.)

A far better implementation would use a hashtable on various groups of bits in the allocated pointer over a small number of preallocated pages, but that's got other overhead associated with expanding buckets, etc. that this brute-force implementation doesn't have. At least this grows "hot spots" of pages that are getting hit, on the basis that a pattern where an alloc/free of a block happens in a "nested" form more often than not - so you get "alloc a, alloc b, free b, alloc c, free c, free a" more than you get "alloc a, alloc b, alloc c, free a, free b, free c."

(Of course, I haven't tested the below code, nor do I care to - my own tracking library is heavily Win32 based and uses a private heap to boot.)

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE (4096)        /* Intel */
/* #define PAGE_SIZE (8192) */  /* Alpha? */

typedef struct _malloc_block_tag {
    int line;
    char* file;
    void* pblockpointer;
    int size;
} malloc_block_tag;

typedef struct _malloc_block_list {
    struct _malloc_block_list* next;
    int opencount, nextopen;
    /* As many tags as fit in the rest of the page after the header fields */
    malloc_block_tag tags[(PAGE_SIZE - (sizeof(int) * 2 + sizeof(struct _malloc_block_list*)))
                          / sizeof(malloc_block_tag)];
} malloc_block_list;

malloc_block_list *malloc_tag_list_root, *malloc_tag_list_last;
/* lock_t, enter_lock() and leave_lock() are placeholders for your platform's reentrant lock */
lock_t malloc_tag_list_lock;

#define NUMBER_OF(x) (sizeof(x)/sizeof(*x))

void tagging_malloc_add_block()
{
    malloc_block_list *pblock = malloc(sizeof(malloc_block_list));
    if ( pblock == NULL )
        return;
    memset( pblock, 0, sizeof(malloc_block_list));
    pblock->opencount = (int)NUMBER_OF(pblock->tags);

    enter_lock(&malloc_tag_list_lock); // assume reentrancy!
    if ( malloc_tag_list_root == NULL )
        malloc_tag_list_root = malloc_tag_list_last = pblock;
    else {
        // Head insertion, speed up future lookups
        pblock->next = malloc_tag_list_root;
        malloc_tag_list_root = pblock;
    }
    leave_lock(&malloc_tag_list_lock);
}

void cleanup_tagging_malloc()
{
    enter_lock(&malloc_tag_list_lock);
    while ( malloc_tag_list_root ) {
        malloc_block_list *phere = malloc_tag_list_root;
        malloc_tag_list_root = phere->next;
        free( phere );
    }
    leave_lock(&malloc_tag_list_lock);
}

void find_leaked_tags()
{
    malloc_block_list *phere = NULL;
    size_t idx;

    enter_lock(&malloc_tag_list_lock);
    for ( phere = malloc_tag_list_root; phere; phere = phere->next ) {
        // Skip pages with nothing outstanding
        if ( phere->opencount == (int)NUMBER_OF(phere->tags) )
            continue;
        for ( idx = 0; idx < NUMBER_OF(phere->tags); idx++ ) {
            if ( phere->tags[idx].pblockpointer != NULL ) {
                fprintf(stderr, "Leaked %d bytes from %s:%d : @%p\n",
                        phere->tags[idx].size, phere->tags[idx].file,
                        phere->tags[idx].line, phere->tags[idx].pblockpointer);
            }
        }
    }
    leave_lock(&malloc_tag_list_lock);
}

void tagging_free( void* pv )
{
    malloc_block_list *phere;
    size_t idx;
    int found = 0;

    assert(pv != NULL);
    enter_lock(&malloc_tag_list_lock);
    for ( phere = malloc_tag_list_root; phere && !found; phere = phere->next ) {
        for ( idx = 0; idx < NUMBER_OF(phere->tags); idx++ ) {
            if ( phere->tags[idx].pblockpointer == pv ) {
                memset(phere->tags + idx, 0, sizeof(malloc_block_tag));
                phere->opencount++;
                phere->nextopen = (int)idx;
                found = 1;
                break;
            }
        }
    }
    leave_lock(&malloc_tag_list_lock);
    if ( !found )
        fprintf( stderr, "Error: Block @%p wasn't tagged.\n", pv );
    free( pv );
}

void* tagging_malloc( size_t cb, int line, char* file )
{
    void* pv = malloc(cb);
    malloc_block_list* here;
    malloc_block_tag* tag;

    if ( !pv )
        return NULL;

    enter_lock(&malloc_tag_list_lock);
    // Find the first page that still has an open slot
    for ( here = malloc_tag_list_root; here; here = here->next ) {
        if ( here->opencount )
            break;
    }
    if ( here == NULL ) {
        tagging_malloc_add_block();
        here = malloc_tag_list_root; // new pages are inserted at the head
    }
    if ( here == NULL || here->opencount == 0 ) {
        // Couldn't get a tag page; hand back the block untracked
        leave_lock(&malloc_tag_list_lock);
        return pv;
    }

    here->opencount--;
    tag = here->tags + here->nextopen++;
    tag->line = line;
    tag->file = file;
    tag->size = (int)cb;
    tag->pblockpointer = pv;

    // Fallen off the open slots, or candidate next open isn't?
    if ( (size_t)here->nextopen >= NUMBER_OF(here->tags)
         || here->tags[here->nextopen].pblockpointer != NULL ) {
        // Cycle through, looking for an open slot
        for ( here->nextopen = 0;
              (size_t)here->nextopen < NUMBER_OF(here->tags)
                  && here->tags[here->nextopen].pblockpointer != NULL;
              here->nextopen++ )
            ;
    }
    leave_lock(&malloc_tag_list_lock);
    return pv;
}
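
For comparison, the first idea in this comment (a tag stored in front of each block, with free stepping the pointer back) might look roughly like the sketch below. The names block_tag, tag_malloc, tag_free and tag_walk are hypothetical, the linked list stands in for a real heap walk, and the long double member is an assumption about what alignment the platform needs. It is not thread-safe.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tag stored immediately in front of every user block. */
typedef union block_tag {
    struct {
        union block_tag *prev, *next;  /* links used for the "heap walk" */
        const char *file;
        int line;
        size_t size;
    } info;
    long double align;  /* assumption: pads the tag so the user block stays aligned */
} block_tag;

static block_tag *live_blocks;  /* head of the list of outstanding blocks */

static void *tag_malloc(size_t size, const char *file, int line)
{
    block_tag *tag = malloc(sizeof *tag + size);

    if (tag == NULL)
        return NULL;
    tag->info.file = file;
    tag->info.line = line;
    tag->info.size = size;
    tag->info.prev = NULL;
    tag->info.next = live_blocks;
    if (live_blocks)
        live_blocks->info.prev = tag;
    live_blocks = tag;
    return tag + 1;  /* the caller gets the address just past the tag */
}

static void tag_free(void *p)
{
    block_tag *tag;

    if (p == NULL)
        return;
    tag = (block_tag *)p - 1;  /* move "backwards" to the tag */
    if (tag->info.prev)
        tag->info.prev->info.next = tag->info.next;
    else
        live_blocks = tag->info.next;
    if (tag->info.next)
        tag->info.next->info.prev = tag->info.prev;
    free(tag);  /* normal free on the real start of the block */
}

/* Print the tag of every block still outstanding; returns how many there are. */
static int tag_walk(FILE *out)
{
    block_tag *tag;
    int count = 0;

    for (tag = live_blocks; tag; tag = tag->info.next, count++)
        fprintf(out, "Leaked %lu bytes from %s:%d\n",
                (unsigned long)tag->info.size, tag->info.file, tag->info.line);
    return count;
}
```

The trade-off against the lookaside list is that every pointer handed out here must go back through tag_free; passing one to plain free would hand the allocator an address it never returned.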

Re:Use block tagging or an allocation lookaside (0)

Anonymous Coward | about 13 years ago | (#2139470)

often I find myself writing an object factory, and I know my heap tagging procedures will be useless without stack context - having 2000 allocations all at "factory.cpp:3042" isn't my idea of fun. The code can be fixed up to use stack backtracing if it's available on the platform. (I know on Win32, you can capture stack traces by faking an SEH exception, capturing the stack frames, and then using imagehlp.dll to map the eip's back to their associated functions.)

Using Workshop with Purify can be a great help; the problem is that with Purify, actually finding the start of the memory leak can be a nightmare.

use C++ (2, Interesting)

mj6798 (514047) | about 13 years ago | (#2135794)

In C, this is a never ending battle. Even with Purify, you are going to spend lots of time introducing bugs, then tracking them down. If you must stick with C, consider using one of the C interpreters (EiC, cint, etc.). Machines have gotten fast enough that you can use them for debugging your code. Or stop worrying about it and just use the Boehm garbage collector as a garbage collector.

I switched from C to C++ basically because I couldn't get Purify for Linux. C++ has allowed me to adopt clear, well-defined memory management strategies and automate various pointer checks. I hardly ever get memory leaks or pointer errors in my C++ code anymore.

But no matter what you do in your own code, if you are using C or C++, you will always be exposed to numerous pointer bugs and leaks in library code. Most real-world C++ code commits the same memory allocation sins and has the same pointer bugs as real-world C code--people aren't taking sufficient advantage of C++'s smart pointer facilities (even STL is flawed in that way). Therefore, for multiprogrammer projects, I wouldn't use anything but Java or another safe language anymore.

Re:use C++ (0)

Anonymous Coward | about 13 years ago | (#2127629)

For anyone interested, an excellent book that covers resource management in C++ is C++ In Action: Industrial-strength Programming Techniques by Bartosz Milewski. I'm relatively new to C++, and this is the book that really sold me on the language...he presents a methodology that practically guarantees you won't have leaks. He's also practical enough to tell you how to retrofit the technique to existing projects. Milewski is a former physicist, and a very clear writer and thinker. And to top it off, the full text [relisoft.com] is available on the web.

Probably a bit late for existing projects but... (2, Interesting)

MadAndy (122592) | about 13 years ago | (#2144454)

One of the places I'm involved with doesn't use the standard malloc calls. Instead we use something more like:
get_mem(ptr, size, "widget hash table")
When debugging, get_mem keeps track of all allocs. At the end, just before the program shuts down the heap dump routine is called which lists all outstanding memory blocks along with the debug string so you can see where they were allocated.

It's also often practical to call the dump routine at various points within the program and give the output a quick look-over or diff - it's amusing how often you can nip these problems in the bud this way.

Also, if you get really desperate, change the get_mem routine to increment a global counter and tag that to the end of each allocation info block. If you keep a program debug log and log each allocation it makes it easy to see where a loose block was allocated - grab the unique ID from the dump and search the log file for it.

A handy feature about this trick is that you use #define to define get_mem, so when you go to production you simply define it to malloc and throw the debug string away - no speed or size cost in the running program. In addition, it basically costs nothing except an hour or so to set it up in the first place. The catch is you have to use it religiously from the start of your project.

A really simple trick, but it has saved me so much work!
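
A guess at what such a scheme could look like is sketched below. Only the name get_mem and its three arguments come from the description above; MEM_DEBUG_OFF, free_mem, mem_dump and the bookkeeping structure are hypothetical names chosen for this sketch, and it is not thread-safe.

```c
#include <stdio.h>
#include <stdlib.h>

#ifndef MEM_DEBUG_OFF  /* hypothetical switch: define it for production builds */

typedef struct mem_info {
    struct mem_info *next;
    void *block;
    size_t size;
    const char *what;  /* the debug string passed to get_mem */
    unsigned long id;  /* unique allocation number */
} mem_info;

static mem_info *mem_list;         /* outstanding allocations, newest first */
static unsigned long mem_counter;  /* global counter used for the IDs */

static void *get_mem_debug(size_t size, const char *what)
{
    void *block = malloc(size);
    mem_info *info = block ? malloc(sizeof *info) : NULL;

    if (info) {  /* if the bookkeeping allocation fails, the block is just untracked */
        info->block = block;
        info->size = size;
        info->what = what;
        info->id = ++mem_counter;
        info->next = mem_list;
        mem_list = info;
        /* Debug log entry, so an ID from the dump can be searched for later. */
        fprintf(stderr, "alloc #%lu: %lu bytes for %s\n",
                info->id, (unsigned long)size, what);
    }
    return block;
}

static void free_mem_debug(void *block)
{
    mem_info **link;

    for (link = &mem_list; *link; link = &(*link)->next) {
        if ((*link)->block == block) {
            mem_info *info = *link;
            *link = info->next;
            free(info);
            break;
        }
    }
    free(block);
}

/* Heap dump: list every outstanding block and return how many there are. */
static int mem_dump(FILE *out)
{
    mem_info *info;
    int count = 0;

    for (info = mem_list; info; info = info->next, count++)
        fprintf(out, "#%lu: %lu bytes, %s\n",
                info->id, (unsigned long)info->size, info->what);
    return count;
}

#define get_mem(ptr, size, what) ((ptr) = get_mem_debug((size), (what)))
#define free_mem(ptr)            free_mem_debug(ptr)

#else  /* production: plain malloc/free, the debug string is thrown away */

#define get_mem(ptr, size, what) ((ptr) = malloc(size))
#define free_mem(ptr)            free(ptr)
#define mem_dump(out)            0

#endif
```

Calling mem_dump(stderr) just before exit, or at interesting points during the run, gives the list of outstanding blocks to look over or diff.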

Roll your own (2, Interesting)

Ratbert42 (452340) | about 13 years ago | (#2147055)

In college, I rolled my own wrapper for malloc(), free(), and array/pointer dereferences. A couple hours of coding that wrapper caught most of my memory leaks and seg faults. If I could do it when I was half-drunk and didn't know what I was doing, you've probably got a developer on staff who can handle it.

Re:Roll your own (0)

Anonymous Coward | about 13 years ago | (#2152994)

Boy, there is no better indicator of somebody's intelligence than how often they refer to their legendary states of drunkenness.

Thus if this moron can do anything at all the only reasonable conclusion is that it must be completely trivial.

Re:Roll your own (2, Funny)

tyoud1 (460688) | about 13 years ago | (#2123716)

Nod, he's a drunken master. :)

Your kung fu is no good, Anonymous Coward.

Free Beer? (2, Interesting)

Ratbert42 (452340) | about 13 years ago | (#2151254)

A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget.

Maybe you should put a developer or two on that project and see how long it takes them to build something similar. I think Purify runs about $1,500 now (could be wrong). That's what, two Aeron chairs? That shouldn't kill any real company's budget. NuMega's BoundsChecker is a viable cheaper alternative though. Or just rip off the free trial versions.

When I've seen Purify bought, a developer downloaded the trial and built a list of all the problems he found and fixed using it. When he showed his manager how much pain and suffering the product could save it was an easy sell. (The hardest part was countering the "so everything's fixed already?" mentality.)

MEMPROF (3, Informative)

kijiki (16916) | about 13 years ago | (#2151318)

It's by one of the RHAD labs kids. It's basically just a GUI around Boehm's garbage collector in leak-detector mode.

It's not Purify (it really aims for leak detection, not all the other errors Purify finds), but the efence + memprof combination gets you about 85% of Purify's functionality.

It seems to handle threaded apps reasonably well, and C++ doesn't faze it. The only down side is that it's hard to get running on non-x86 platforms.

mpatrol works great.... (1)

chriscmp (5983) | about 13 years ago | (#2151613)

It finds a collection of different memory usage problems, and is reasonably easy to use even on large projects.

I asked REdhat once.. (1, Redundant)

josepha48 (13953) | about 13 years ago | (#2152644)

they told me they use electric fence. While it is definitely not the same (I have used Purify as well), it is basically a library that you link against; then when you run your program it checks your mallocs and things like that to make sure you have allocated the correct amount of space.

But to answer the question are there any out there? NO, not with pretty GUIs and all.

Re:I asked REdhat once.. (2, Informative)

szomb (318129) | about 13 years ago | (#2141376)

ElectricFence detects overruns of malloc()d buffers (hence its name). Unless this changed recently I am fairly sure it has nothing to do with leak detection?

Checker (2, Informative)

anaymouse (513946) | about 13 years ago | (#2158606)

Try Checker [gnu.org]. I think AX.25 pointed to some relevant information, but was modded as redundant for some odd reason.

Re:Checker (0)

Anonymous Coward | about 13 years ago | (#2146316)

I looked at this recently. It doesn't work with the latest gcc, and it appears to have been abandoned by the author. It looks promising.

mpatrol (2, Informative)

brianmed (131838) | about 13 years ago | (#2174296)

mpatrol is another tool to help with this.

It can:
- log your memory usage
- report on improper memory usage
- profile your memory usage
- work with your applications *without* re-linking (assuming your OS allows this)

The web page is at:

http://www.cbmamiga.demon.co.uk/mpatrol/

In addition, the author has excellent documentation. The pdf manual actually has a section that lists competing products and what they do.

http://www.cbmamiga.demon.co.uk/mpatrol/files/mpatrol.pdf