John Regehr sent us an excellent review of Mastering Algorithims with Perl, another O'Reilly & Associates effort. Written by Jon Orwant, Jarkko Hietaniemi, and John Macdonald, this is a book designed to take your Perl to a new level of wizardery.In The New Hacker's Dictionary under "superprogrammer," we read that "productivity can vary from one programmer to another by three orders of magnitude." I would argue that at least one of these factors of ten comes from the ability to quickly recognize what algorithms should be used to solve different parts of a problem and to find or write implementations of those algorithms that will result in an efficient program, given the available time and the characteristics of the problem. This ability is developed through experience and by understanding the highlights of the large body of algorithms and analysis of algorithms that has been developed to solve problems that occur over and over again in computer programs.
Mastering Algorithms with Perl is designed to provide the necessary background. It's structured like a traditional algorithms textbook: after describing some basic and advanced data structures (linked lists, trees, heaps, etc.), it has chapters about searching, sorting, sets, matrices, graphs, strings, and some related topics. After the introduction and discussion of data structures, the chapters are relatively independent and could be read in any order. The authors provide plenty of cross-references as well as pointers to books that describe individual subjects in more detail.
The intended audience is programmers who don't have a background in computer science, who know at least some Perl. However, experienced programmers who don't know Perl should have no trouble picking up the basics of the language with this book and a copy of Programming Perl. Also, computer scientists can often use a review of algorithms, and the CPAN pointers are very useful. So, I would go so far as to say that this book would enrich any programmer's bookshelf. A stringent test of the merit of a new technical book is to ask if it adds some value, given the best existing books in its area? I think that Mastering Algorithms with Perl definitely does. It is a well-written introduction to algorithms that is more accessible, practical, and entertaining than standard algorithm books. It leverages off of the strengths of a powerful language and a large base of reusable code.
The rest of this review will evaluate the strengths and weaknesses of Mastering Algorithms with Perl in more depth. The central issue that I will consider is why the reader might or might not prefer an algorithms book that concentrates on a single language, as opposed to a general algorithms book. I will try to be up-front about my biases: as a computer scientist, I consider this book to be a compromise between an algorithms book and a how-to manual. This compromise makes it much more useful to Perl programmers, but it sometimes causes the algorithms content to be too watered down.
It is traditional in algorithms books to describe algorithms in pseudocode, which often superficially resembles Pascal. The difference between pseudocode and real code is that pseudocode is not compilable - it ignores implementation details that are not helpful to understanding a particular example. This is considered to be an advantage: without the clutter, the core of the algorithm is easier to see and understand. At the beginning of the book the authors make the point that the Perl code for a binary search is actually shorter than the corresponding pseudocode. And it's true! The advantage of the Perl program is that we have a readable description of the algorithm, and it's executable too. (Unfortunately, it's often nontrivial to convert pseudocode into real source code - the devil is in the details.) The binary search example is slightly misleading, however, because in this case a native Perl data structure (the array) matches the semantics of the problem extremely well, leading to a clear and concise implementation. Later in the book, particularly in the chapter on graphs, we see examples where Perl's built-in data structures are less well suited to the problems. The executable Perl code for graph operations are much longer than the corresponding pseudocode, and are often so syntactically cluttered that they are difficult to read. Is this a flaw in the book or in Perl? No - it's a consequence of giving examples in runnable code instead of pseudocode. Is the tradeoff worth it? Probably, but it depends on what you're trying to get out of the book.
Another consequence of basing an algorithms book on a real language is that the authors can point readers to existing implementations of the algorithms, in CPAN. It's hard to overstate how big of a win this is. Perl is a powerful language to begin with, but it becomes far more powerful when programmers are able to take advantage of the large body of existing code modules. An unfortunate side effect of the fact that the book talks about specific versions of Perl and about specific CPAN packages is that this information will become outdated much more quickly than the algorithms will. Unless the Perl language and CPAN are exceptionally stable in the future, I would not expect most of this information to be valid for more than a few years - hopefully a new version of the book will be available before this one becomes too out of date.
Because the book provides executable code for the algorithms, it's possible to evaluate the performance of the example code (which is available at the O'Reilly site). The authors benchmark a number of the algorithms that they present, and compare the results. This is a nice change from the discussion of asymptotic running times found in traditional algorithm books, which generally ignore the constant factors that often make the difference between an algorithm being useful in practice or not.
The design and analysis of algorithms is a highly mathematical discipline. A sophisticated set of tools has been developed to evaluate the tradeoffs between various algorithms: How efficiently do they use memory and processor cycles? What is the best, average, and worst case running time of various operations? How does the algorithm scale as the size of the input grows? As it turns out, programmers need to understand a few of these formalisms, particularly the "big O" notation for describing asymptotic running time. I think that Mastering Algorithms with Perl uses theory in just the right way: as an aid to programmers' intuition about algorithms, rather than beating us over the head with formulae and proofs. That said, I think there is one area of theory that this book should have spent more time on: NP completeness. NP-complete problems are solvable, but are believed to be inherently hard: no efficient algorithm has been discovered to solve them. There are a wide variety of NP-complete problems, and they do come up in practice. For programmers, the important thing is first to recognize that an NP-complete problem has been encountered, and that it cannot be solved exactly except in small instances. Then, a heuristic that comes up with a good enough approximation of the solution needs to be found and implemented. This is a practical and well-studied part of algorithm design, and in a 650-page book I would expect more than a page or two to be devoted to it.
Several chapters of Mastering Algorithms with Perl are too shallow to be considered good introductions to the associated areas of algorithms. For example, the chapter on matrices only shows code for some of the more trivial matrix operations; for complex tasks, it tells the reader how to use PDL - the Perl Data Language. Although PDL looks like a useful and powerful package, readers should not confuse knowing how to use it with understanding matrix algorithms. In other words, the matrix chapter is too much of a how-to manual. Other chapters such as the ones on searching and sorting are excellent and avoid falling into this trap. Algorithms is a huge area, and it can't all be covered well in 650 pages. The later chapters are a lot of fun to read, but some of them should probably have been scrapped in favor of more depth in core areas.
In conclusion, this is a well-written, useful book. Viewed as a Perl book it's superb; it complements the strengths of Programming Perl and The Perl Cookbook, and I think most or all Perl programmers would benefit from having a copy. Viewed as a computer science book, it has made a number of compromises in order to focus on a specific language; this is not necessarily a problem but it is something that readers should be aware of.
Acknowledgments: Thanks to Tom Christiansen, Dave Coppit, Bill Pearson, and Jamie Raymond for helpful comments on previous drafts of this review.
Purchase this book at fatbrain.