Book Review: R Graphs Cookbook

samzenpus posted more than 3 years ago | from the read-all-about-it dept.

Book Reviews 64

RickJWagner writes "Once upon a time, I thought communication was one of my strong suits. Alas, a few years into my programming career I realized I'm more of the head-down codeslinging type, not one of the schmoozing managerial types. So when I have a point to make, I really like to have my data ready to do the talking for me. In that capacity, this book is a very good weapon to have in my arsenal." Read on for the rest of Rick's review.

Right away, you should realize this is not a book that teaches R. R (an excellent open source statistical language) is a great tool for any technician. I've used it to analyze logs, find performance bottlenecks, and make sense of mountains of nearly unrecognizable data. But this book doesn't teach R, it teaches R graphing.

It turns out R has excellent graphing capabilities. You can draw scatter plots, line plots, pie graphs, bar charts, histograms, box and whisker plots, heat maps, contour maps and 'regular' maps. These are all good for demonstrating data in different ways, and the book lightly explains which graph will help you illustrate which point.

If you're getting a little interested, you'll also want to know that all this graphing can be scripted and scheduled. So you can get data-driven reports on a schedule, easily accomplished once you know how to write the graphing scripts (which are then scheduled using cron or a similar facility). One small caveat: To prepare your data for presentation, I think it's usually necessary to partner R with another language that's better for text extraction and manipulation. I prefer Python for this task; you might like another language.
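As a rough illustration of that partnership, here is a minimal Python sketch that pulls two fields out of log lines and writes them as CSV, a format R can read directly. The log format, the field names (timestamp, millis), and the function log_to_csv are assumptions invented for this example; none of them come from the book.

```python
import csv
import io
import re

# Hypothetical log format, assumed for illustration only:
#   2011-04-18 12:00:01 GET /index.html 200 123ms
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+ \S+) \S+ \S+ \d{3} (?P<millis>\d+)ms$"
)


def log_to_csv(lines, out):
    """Write a timestamp,millis row for each matching line; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp", "millis"])
    count = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        writer.writerow([match.group("timestamp"), int(match.group("millis"))])
        count += 1
    return count


sample = [
    "2011-04-18 12:00:01 GET /index.html 200 123ms",
    "garbage line",
    "2011-04-18 12:00:02 GET /about.html 200 87ms",
]
buffer = io.StringIO()
rows = log_to_csv(sample, buffer)
print(buffer.getvalue())
```

Written to a file instead of an in-memory buffer, output like this could be loaded in R with read.csv and handed to whichever graphing recipe fits.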

The book is exceptionally easy to read and work with. This doesn't mean it's simplistic, though. Anyone who's tangled with R's graphing without a good example will testify that figuring out the various functions and arguments necessary to wrangle a descriptive graph can be really difficult. This book gives you the kind of graphs you need, with the bells and whistles you're going to want, in a series of snippets you can run immediately.

The book is written in Packt's "Recipe" format. In a nutshell, this means that it's a series of how-to sections worded in a templated form. There are headings for sections that inform you what you're going to accomplish, how it's done, and why it worked. You quickly realize it's a repetitive format, but it serves to make the book an excellent resource for quick reference.

Another really nice feature of the book is the downloadable source code and matching data. Knowing the data is half the battle, really. The specific formulas given are certainly useful, but without knowing how the underlying data is formatted you really wouldn't get nearly as much practical value. For that reason, I urge anyone using this book to be sure they examine the underlying data for at least the first few formulas. After that, it'll be automatic: you'll know you want to look at that data when you're trying to master some graph type. Then when you go to make your own data ready for graphing, you reach for that secondary language like Python, extract the fields you want in a way similar to your example data set, and presto-- you've got the graph you want.

The book starts out with a first chapter that introduces the kinds of graphs you'll be able to produce and situations where each type is most useful. The next chapters, up until the final one, are in-depth sections on each of the graph types. Maps are treated to a different chapter than pie graphs, for instance. The final chapter covers putting final touches on your graphs, including saving them in different formats (PDF, PNG, JPEG, etc.) and niceties like adding scientific notations, mathematical symbols, etc.

The book states that the target audience is experienced R programmers. I really don't think that's necessary, though. There is an obligatory R installation section, and I think that a reasonably competent programmer with Google at his disposal could get off the ground (for graphing purposes) with this book and a little bumbling. If you already know R, then you needn't worry at all; there is nothing here that will look foreign to you.

If I could change one thing about the book, I'd want a comprehensive index of all the functions and arguments that augment the basic core functions that produce the example graphs. These functions and arguments tweak the basic functions in ways that make the graphs much more appealing than what the basic functions alone can provide. But the book isn't able to show each and every combination with each graphing function, so it's up to the reader to figure out how to pick some of the options from one recipe and apply them to another. It's not difficult to do, but having an index to help you find the options you want would make this process easier.

You can purchase R Graphs Cookbook from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

64 comments

Damn. So... (1)

Anonymous Coward | more than 3 years ago | (#35858472)

How much are they paying you guys to keep putting these Packt reviews up?

Amazon Affiliation (0)

mholve (1101) | more than 3 years ago | (#35858876)

...Don't forget the Amazon-affiliated link for the book. So /. gets a cut of the book sales as well.

Re:Damn. So... (0)

Anonymous Coward | more than 3 years ago | (#35858878)

Obvious question! And no response means the answer is embarrassing.

Re:Damn. So... (1)

vlm (69642) | more than 3 years ago | (#35859166)

How much are they paying you guys to keep putting these Packt reviews up?

I donno, he advocates for python when everyone else would use perl, and I'm sure the python guys are not paying for that...

I'm sold! (0)

Anonymous Coward | more than 3 years ago | (#35861092)

I've even downloaded it from gigapedia already!

A++ Would pirate again!

This afternoon's Slashdot content... (3, Insightful)

PCM2 (4486) | more than 3 years ago | (#35858474)

...is brought to you once again by the letter Packt and the number RickJWagner.

Re:This afternoon's Slashdot content... (2)

Compaqt (1758360) | more than 3 years ago | (#35858730)

Are there no more O'Reilly books being published anymore? How about some reviews of new instant classics (like the Camel book [oreilly.com] )?

It's been all Packt, all the time for how long now?

As a side note, PacktLib (all you can eat package) is more expensive at $220/yr than O'Reilly's Safari Online Books. Safari is $110/yr for the base package--5 books at a time. That's strange, since O'Reilly books have usually been considered the best tech books. Also they have books from a whole lot of other publishers, while Packt is mostly just barely edited PDFs.

Re:This afternoon's Slashdot content... (2)

i.r.id10t (595143) | more than 3 years ago | (#35858844)

I pay $0 for an all-I-can-eat Safari ... of course, it is bought by my county library system, and I pay taxes and donate to the library system specifically, so I guess I do pay something...

Re:This afternoon's Slashdot content... (1)

PCM2 (4486) | more than 3 years ago | (#35859156)

My library offers Safari access also (San Francisco Public Library). From the related page:

You are signed in to Safari Books Online, paid for and licensed by your academic or public library. You are accessing a Custom Safari Books Online Library that contains a specially-tailored subset of 3,827 titles from Safari Books Online's overall content.

Mind you, they point out that this is a subset of the full 12,000 books available on paid Safari. But 3,827 books is nothing to sneeze at.

Re:This afternoon's Slashdot content... (1)

RDW (41497) | more than 3 years ago | (#35859400)

'Safari is $110/yr for the base package--5 books at a time.'

For that money, you could also buy around 20 O'Reilly iPhone apps on iTunes. Each contains the unencrypted text of the book, which is easy to extract and re-package as a conventional ePub for use on any device:

http://oreilly.com/ebooks/oreilly_iphone_tips.csp [oreilly.com]
http://zef.me/3246/convert-cheap-oreilly-iphone-app-books-to-epub [zef.me]

Re:This afternoon's Slashdot content... (1)

timeOday (582209) | more than 3 years ago | (#35861900)

What's the problem? I thought it was a decent review of the type of book that should be of interest to the /. readership - much preferred over the "paranoid spin on political events" type stories that are taking over.

Shocking (0)

Anonymous Coward | more than 3 years ago | (#35858632)

I'm more of the head-down codeslinging type, not one of the schmoozing managerial types

Oh really, I wouldn't have known it from your binary description of the world. Because an IT person being cordial with people is "schmoozing."

Keep fighting those geek stereotypes, guys.

EPub and PDF versions available at packetpub.com (0)

Anonymous Coward | more than 3 years ago | (#35858634)

This book is listed as available in epub and pdf format at www.packetpub.com.

Re:EPub and PDF versions available at packetpub.co (1)

turkeyfeathers (843622) | more than 3 years ago | (#35858716)

Recursive Slashdot spamming... nice one!

Or you can use Excel (3, Informative)

AdamInParadise (257888) | more than 3 years ago | (#35858644)

Or any other spreadsheet program.

Now of course I admit that Excel is probably not as flexible as R. However, unless your job is to produce stunning, tailor-made graphs, a spreadsheet application will deliver results a lot faster.

Re:Or you can use Excel (1, Insightful)

pclminion (145572) | more than 3 years ago | (#35858718)

If your data set is so small that a spreadsheet can open it, then your data set is a toy data set.

Re:Or you can use Excel (2, Insightful)

Anonymous Coward | more than 3 years ago | (#35858780)

If your data set is so small that a spreadsheet can open it, then your data set is a toy data set.

Where's the +1,Smugly Superior mod option when you need it...

Seriously, any data set that you encounter "in the wild" is by definition not a toy data set. There are many instances where using a spreadsheet to quickly visualize some figures is fine, just like there are many instances where using a word processor to write a letter instead of firing up TeX is fine.

(Granted, you can easily write letters with TeX, too, and better-looking ones than what LibreOffice etc. will come up with, but that's because TeX has packages for just that sort of thing; you don't actually have to wrestle with raw TeX. R has a much steeper learning curve and thus may well be overkill. The right tool for the right job, people!)

Re:Or you can use Excel (0)

Anonymous Coward | more than 3 years ago | (#35858944)

Now, for bonus homework point due this Friday: Try to visualize and summarize in Microsoft Excel all the gene expression of all 22K human genes in 5000 patients. If you can do the same to all 1.4M exon fragments in all 5000 patients, you can pass final exam and get an A.

Re:Or you can use Excel (0)

Anonymous Coward | more than 3 years ago | (#35859496)

How much data did Mendel use to revolutionize biology?

Oh, that was just a toy dataset. A few hundred or thousand data points. No real scientist does anything with less than a million data points.

God, you're the worst type of smug: Smug while blatantly wrong.

Now We Know One Data Point (0)

Anonymous Coward | more than 3 years ago | (#35859674)

"God, you're the worst type of smug: Smug while blatantly wrong."
Anonymous Coward a Republican?

Re:Or you can use Excel (1)

tehcyder (746570) | more than 3 years ago | (#35865556)

Now, for bonus homework point due this Friday: Try to visualize and summarize in Microsoft Excel all the gene expression of all 22K human genes in 5000 patients. If you can do the same to all 1.4M exon fragments in all 5000 patients, you can pass final exam and get an A.

Yes, because obviously the GP meant that you should always use a spreadsheet for every single task you come across, and not that in many cases it would be quicker and easier to use Excel.

Re:Or you can use Excel (1)

pclminion (145572) | more than 3 years ago | (#35859558)

Where's the +1,Smugly Superior mod option when you need it...

I was trying to be funny, not an asshole. Apparently I failed. I apologize.

Re:Or you can use Excel (1)

tehcyder (746570) | more than 3 years ago | (#35865574)

Where's the +1,Smugly Superior mod option when you need it...

I was trying to be funny, not an asshole. Apparently I failed. I apologize.

Next time try adding a joke as a hint towards your intentions.

Re:Or you can use Excel (1)

pclminion (145572) | more than 3 years ago | (#35868540)

Next time try adding a joke as a hint towards your intentions.

I was trying to poke fun at Excel's well known limitation on the maximum number of rows. Sometimes a joke's not funny if you need to spell it out.

Re:Or you can use Excel (1)

sseaman (931799) | more than 3 years ago | (#35859014)

If you're talking about the ridiculous row limit, that went away in Excel 2007. [wikipedia.org]

However, like many researchers I have used several versions of Excel to produce publishable graphs from summary data--means, SEMs, etc. I love R, but it was only recently that I decided to spend enough time learning the ins and outs of its graphing capabilities that I felt comfortable producing even a bar chart in R for publication. Since I had been producing my tables in Excel anyway--and I'm still not entirely in love with using Sweave or other LaTeX packages in R, so I still find myself going to Excel for producing summary tables--it's trivial to then tell Excel to plot away.

That said, this book would seem very cool had the review actually talked about what sort of graphing capabilities are described in the text. I'm personally curious about its lattice graphing [wikipedia.org] packages, which R has good support for but for which I haven't seen any great instructional resources. Those are the sorts of graphs I imagine you are referring to, which are exploratory or diagnostic or just too sophisticated for Excel, and work over entire datasets using models you specify.

Re:Or you can use Excel (2, Informative)

Anonymous Coward | more than 3 years ago | (#35859194)

Wrong row limit. Sure you can _have_ 1M+ rows but you can still only graph 32K of them at a time.

Re:Or you can use Excel (2)

jrcoyle (1989436) | more than 3 years ago | (#35859300)

For lattice graphics, get Lattice: Multivariate Data Visualization with R, by the author of the lattice package in R. However, I would recommend instead the ggplot2 package, and the book ggplot2: Elegant Graphics for Data Analysis by its author. ggplot has all the functionality that lattice does, it produces prettier plots by default, and it's easier to specify graphs and edit them with a minimal change in code.

ggplot2 (1)

epine (68316) | more than 3 years ago | (#35884962)

Hadley Wickham, author of ggplot2, is a prolific contributor of R modules. His documentation is fairly good, yet of the somewhat harried variety. You can get yourself quite lost by the amount of argument inheritance, which in R is entirely unlike tea. The book needs about 50% more material added, by someone who understands generic programming, stating precisely what operators are required for each argument passed into the ggplot hairball.

Hadley also indulged in some proscriptive urges. One was not to provide any form of pie chart, not even in 3D. This one I heartily endorse. The other was to make it difficult to put distinct vertical scales on both sides of the graph. He believes this can be used to create misleading graphics.

Unfortunately, he hasn't done many engineering graphics where you might want to put mA on one side and mW on the other (assuming a fixed supply voltage). There are many, many cases in engineering where you would like your graph to sport two distinct (yet fundamentally equivalent) sets of vertical scales. The root of all evil is premature proscriptivity. The one true simplicity in information technology is compositionality, and Hadley himself is one of the foremost practitioners in the way he architected the ggplot layering system.

I love ggplot, but I had to scribble madly in the margins of the first half of the book before I leveraged the power and ... uh ... convenience. Convenience paid for in blood, but well worth the price.

I also have the Lattice book, but have done less with it. It's a far more traditional approach. ggplot also does lattice graphics, in its own peculiar way.

With ggplot, when you get into faceting, you'll find yourself reading section 9.2 "Converting data from wide to long". This is nearly as fundamental to the ggplot architecture as the equivalence between pointers and arrays to the C language, yet it's buried in chapter nine a la harried documentation.
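For anyone who has not met the wide-to-long idea, here is a minimal sketch of the reshaping itself, written in Python because that is the data-preparation language the review suggests. The column names and the helper wide_to_long are hypothetical, made up for illustration; in R the same job would normally be done with a reshaping function rather than by hand.

```python
def wide_to_long(rows, id_col, var_name="variable", value_name="value"):
    """Turn one row per id with many measurement columns
    into one row per (id, measurement) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_col:
                continue  # the id is repeated on every output row, not melted
            long_rows.append(
                {id_col: row[id_col], var_name: column, value_name: value}
            )
    return long_rows


# Wide form: one column per measured quantity (hypothetical readings).
wide = [
    {"day": "Mon", "mA": 12.0, "mW": 60.0},
    {"day": "Tue", "mA": 15.0, "mW": 75.0},
]
long_form = wide_to_long(wide, "day")
for record in long_form:
    print(record)
```

In the long form every measurement sits in its own row, labelled by the column it came from, which is the shape that lets a single "variable" column drive the faceting.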

If you're not going to learn the equivalence in C between arrays and pointers, why bother? I'd say the same is true with ggplot. You had better get a grip on splorping your data frames with plyr, or why bother.

If you love data, the great thing about R (compared to Excel) is that you get to play in Myhrvold's kitchen without having to invest $10m. Even with the heavy machinery at hand, eventually in R, simple things are simple again, if you stick with it.

I should also add that I sometimes exploit the new-fangled ability to write inline C++ code in R. Where I used to do stand-alone applications in C++, these days I almost always use R as my graphing front end.

Another thing: I found on Hadley's website an Amazon wish list which included "The Flavour Bible". I bought this book after finding it there, and I love it. I then recommended it to some other foodies, and some of them report back that it has become their most used cookbook. Other people complain that it's just a list of lists, but not the kind of people in my circles.

Incidentally, ggplot supports both spellings "color" and "colour" for all colour arguments. I have to give Hadley a pass in the greater scheme of things for thwarting my mA/mW dual axis ambitions. But I sure hope he doesn't do it again.

Re:Or you can use Excel (1)

lwsimon (724555) | more than 3 years ago | (#35860672)

Or you've offloaded all the heavy processing work to the database server - where it belongs - and are doing mere presentational work on your desktop...

Re:Or you can use Excel (1)

tehcyder (746570) | more than 3 years ago | (#35865520)

If your data set is so small that a spreadsheet can open it, then your data set is a toy data set.

Says someone who has obviously never worked in, say, the financial services industry.

Re:Or you can use Excel (1)

retchdog (1319261) | more than 3 years ago | (#35858790)

what pclminion said. also: "this graphing can be scripted and scheduled."

Re:Or you can use Excel (1)

vlm (69642) | more than 3 years ago | (#35859232)

what pclminion said. also: "this graphing can be scripted and scheduled."

The best part is not just tail ending "a graph" at the end of a script, but automating thousands of graphs at various resolutions, high, medium, and thumbnail, and then creating the index page that clicks thru. And send emails of "noteworthy" graphs to certain personnel. Add and remove graphs as they appear in the data set, all automatically. I would imagine my little couple minute script would take months to do manually in Excel, one graph at a time.. But my script runs daily...
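The index-page step is easy to sketch. The Python fragment below assumes, hypothetically, that the graphing script has already written name_thumb.png and name_high.png for each graph; that naming scheme and the function build_index are invented for illustration and are not the script described above.

```python
from html import escape


def build_index(graph_names, title="Daily graphs"):
    """Return an HTML page in which each thumbnail links to its
    high-resolution image (assumed <name>_thumb.png / <name>_high.png)."""
    items = []
    for name in sorted(graph_names):
        safe = escape(name, quote=True)
        items.append(
            f'<a href="{safe}_high.png">'
            f'<img src="{safe}_thumb.png" alt="{safe}"></a>'
        )
    body = "\n".join(items)
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )


page = build_index(["latency", "errors"])
print(page)
```

Because the page is rebuilt from whatever graph names exist on each run, graphs that appear in or vanish from the data set are added or dropped from the index without manual work.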

Excel is, unfortunately, our corporate standard database management system, not a usable graphing solution.

Re:Or you can use Excel (4, Informative)

Beryllium Sphere(tm) (193358) | more than 3 years ago | (#35858986)

People who know more about statistics than I do severely criticize Excel, e.g. http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf [uiowa.edu]

Re:Or you can use Excel (0)

Anonymous Coward | more than 3 years ago | (#35859762)

All of the information in the jcryer paper is over twelve years old. Do you have any reason to believe these problems have not been addressed in that time?

Re:Or you can use Excel (1)

AkkarAnadyr (164341) | more than 3 years ago | (#35861476)

Erm ... he is talking about Excel, right? With a Microsoft product, that'd be a pretty good a priori assumption ...

Re:Or you can use Excel (1)

tehcyder (746570) | more than 3 years ago | (#35865654)

That paper appears to have been written by a bright but semi-literate 13 year old anti-MS geek, so I'm guessing it's someone on slashdot.

Re:Or you can use Excel (3, Informative)

plopez (54068) | more than 3 years ago | (#35859250)

I like R because:
1) It can handle the large (million or more) data sets I need to crunch and compare

2) Seriously, the latest versions of Excel seem to choke on larger datasets. The "Oh no! Excel is bogging down and getting ready to crash!" sensation is far too frequent. R is much more stable

3) You can do nice graphics in R you can't do in Excel. See http://addictedtor.free.fr/graphiques/ [addictedtor.free.fr]

4) There is a huge number of pre-rolled *serious* statistical libraries already written, and open sourced (including GPL'd) for it. FFT, geospatial stats, multivariate linear and non-linear statistical modeling, time series analysis, linear algebra, and more. Including OOP. I am just exploring how R does OOP now.

5) The scripting language is in the Lisp family. It works the way I think.

6) You can compile and link in your own packages in Fortran (pick your flavor 77, 90, 95, '03, or '08), C, C++, etc. If it links, you can link it.

Sweet. Also more stable than Matlab (and cheaper), and more user friendly than SAS.

Re:Or you can use Excel (2)

garcia (6573) | more than 3 years ago | (#35863206)

And R is free and SAS and/or Excel are not. For most here that would be the big deal breaker.

While I use SAS myself, it's because it's available to me. However, I would not use Excel to build charts simply because if you have to change something it's very likely you will have to recreate the chart too. Personally I like running a block of code and having the output get e-mailed to the report's recipient each day/week/month/quarter/foo w/o me having to do anything manually.

Excel = manual and that scares the shit out of me.

YMMV.

Re:Or you can use Excel (0)

Anonymous Coward | more than 3 years ago | (#35860484)

How about just producing graphs that don't suck? Excel produces ugly graphs and is not an appropriate tool for anything serious.
How do you document what an Excel spreadsheet does? All of the formulas are scattered around and hidden in cells.

If you're using Excel to do science, you're doing bad science.

Re:Or you can use Excel (0)

Anonymous Coward | more than 3 years ago | (#35860694)

In that case, close to everything done in labs in bio, chemistry, medicine, finance is bad science. And then, the bad science is applied to the real world. I suppose we should blame the financial crisis on Excel, just the way all of NASA's problems can be blamed on Powerpoint.

Re:Or you can use Excel (1)

SuseLover (996311) | more than 3 years ago | (#35860504)

And just how do you write a UNIX script that automatically aggregates the desired data and runs it through Excel the way you can with R (without having to ship the data off your UNIX system via Samba or some other roundabout way)?

I'll bet most of the users of R are working on some sort of UNIX/Linux system as is common in the scientific community.

Re:Or you can use Excel (2)

cellocgw (617879) | more than 3 years ago | (#35861668)

I'm sorry, but if you think Excel's graphs are good for much of anything, or you think they are easy to edit and reformat, you are grossly mistaken. I'm no novice: I've written spreadsheets with named variables so I can change the content of Excel graphs by changing names or data in cells.

Before you get snarky about R, at least take the time to find one of the web sites dedicated to displaying charts, maps, and graphs generated with R. Most of them are far beyond anything Excel can do.
If all you want are Enterpris-ey pie charts and bar charts (both worthless pieces of crap that only make PHBs happy), then use Excel. But if you've learned enough to know the difference between a line chart and a scatterplot, it's time to move up to a real tool such as R, Origin, Mathematica, NumPy, etc.

Re:Or you can use Excel (1)

cellocgw (617879) | more than 3 years ago | (#35861708)

I should add: I've even written a set of macros in Excel that let Excel play Pong against itself. I bring it out whenever someone says "but I can do that in Excel..." to which I say, "I can do this.... but just because you can does not mean you should." Sic semper Excel graphics.

Re:Or you can use Excel (0)

Anonymous Coward | more than 3 years ago | (#35861688)

It is not a good idea to do statistics with Excel:

* Burns, P. (2005): Spreadsheet Addiction. [burns-stat.com]

* Cryer, J. (2001): Problems with using Microsoft Excel for Statistics. (PDF) [uiowa.edu]

* Pottel, H. (n.d.): Statistical flaws in Excel. (PDF) [coventry.ac.uk]

* Practical Stats (n.d.): Is Microsoft Excel an Adequate Statistics Package? [practicalstats.com]

* Heiser, D. (2008): Errors, faults and fixes for Excel statistical functions and routines [daheiser.info]

For a more comprehensive and technical discussion, see the papers by Yu (2008); Yalta (2008); and McCullough & Heiser in Computational Statistics and Data Analysis 52(10)

Re:Or you can use Excel (2)

dtdmrr (1136777) | more than 3 years ago | (#35861824)

a spreadsheet application will deliver results a lot faster.

Not really, particularly if you have the data already entered. Running:
R
data=read.csv("data.csv")
hist(data[[1]])  # first column; hist() needs a numeric vector, not a whole data frame

takes far less time than selecting your columns, dragging the mouse over to the graph button, selecting the region for your plot, and then trudging through a multi-stage wizard. Even if you actually want to type in some data in a spreadsheet, it's frequently faster to save the table and load it up in R or gnuplot to graph it. And if you do want something like a histogram or a boxplot, Excel doesn't stand a chance (Gnumeric at least supports boxplots).

I might accept that creating a slightly prettied up graph might be a little quicker in a gui spreadsheet. But for quick and dirty and higher quality graphs they are slower, if they work at all. Once you start encoding your style preferences in little scripts that you load before graphing, you'll find even higher quality graphs take less time than mediocre graphs from a spreadsheet. And really there's something satisfying about tweaking one line in a single file and that automatically updating the style of 20 graphs in an article.

I generally find that when plotting, if doing it once it's a coin toss whether to write a script or manipulate the data and plot manually; twice and scripting definitely breaks even; and of course more than that and scripting just gets more and more valuable. R (and many other environments) saves your history, so that if you do decide a day later you should have just written a script, it's already there; you just need to copy the commands out of the history file. In Excel, well, at least you learned from experience what to do that next day.

As I see it there are two reasons to graph in a spreadsheet. First, if you're actually working in a spreadsheet and just want a quick look at some data (not debating the merits of that, separate discussion). Second, when you're not sure what you want and are unfamiliar with the tools available, a gui gives you something to poke at blindly with a mouse. In that second case, I think one should accept the pitfalls of ignorance with an intent to learn more and improve. Stubbornly grasping your spreadsheets, knowing there's a better world out there, will only hurt you in the long run.

Re:Or you can use Excel (2)

subreality (157447) | more than 3 years ago | (#35862100)

Now of course I admit that Excel is probably not as flexible as R. However, unless your job is to produce stunning, tailor-made graphs, a spreadsheet application will deliver results a lot faster.

R is not a graphing language. It's a statistics language. If you just want to plot your sales growth by quarter, sure, a spreadsheet is much more convenient. But professional-quality graphs aren't the only (or even the primary) reason for R.

R has an enormous library of very well refined statistics functions. Spreadsheets are not designed to handle hundreds of thousands of data points, cross-correlations, advanced data transforms, and all kinds of analysis that spreadsheets don't (and shouldn't) have.

Re:Or you can use Excel (1)

Paltin (983254) | more than 3 years ago | (#35863046)

No.

Graphing things in R is much faster.



plot(foo$bar,foo$blarg)

Done.

As opposed to highlighting columns, switching to insert chart, inserting.... making sure everything is in the right place...

R makes great graphs, but... (2)

proxima (165692) | more than 3 years ago | (#35858956)

R makes great graphs functionally speaking, but without mucking about with the options and some post-processing they are not the most attractive. Open up your favorite financial/data intensive news source and look at the visuals and you'll find that generating that style with just code is fairly difficult.

Until about Office 2007, the defaults in Excel charts were also atrocious. OpenOffice.org is still pretty bad, and Matlab is not much better than R. The good news is that you can generate PDFs from each of these and easily open them in Inkscape/Illustrator, where making vector-based edits is easy.

Anyone who regularly visualizes data needs to pick up resources on how to clearly organize and display your data, like "The Visual Display of Quantitative Information" by Edward Tufte (though some of his examples are a little dated). Books like that are full of examples that would be very tricky to replicate without any post processing, because it usually involves eliminating excessive lines and cluttering detail.

Re:R makes great graphs, but... (2)

dondelelcaro (81997) | more than 3 years ago | (#35859026)

R makes great graphs functionally speaking, but without mucking about with the options and some post-processing they are not the most attractive.

Base graphics aren't that nice looking, but that's why ggplot and lattice exist. You can fairly easily produce publication quality graphs with them without spending much time dealing with additional options. There are also packages which produce many of the plots which Tufte promulgates.

Re:ggplot and lattice (0)

Anonymous Coward | more than 3 years ago | (#35859200)

Awesome. Thanks.

Re:R makes great graphs, but... (1)

definate (876684) | more than 3 years ago | (#35861960)

MATLAB is BETTER than R? Holy shit, R must look fucking terrible, because even in MATLAB 2010b, after a bit of editing, the result is still fucking hideous.

Re:R makes great graphs, but... (1)

Lorens (597774) | more than 3 years ago | (#35865262)

Anyone who regularly visualizes data needs to pick up resources on how to clearly organize and display your data, like "The Visual Display of Quantitative Information" by Edward Tufte (though some of his examples are a little dated).

For a modern example please see Hans Rosling:

http://singularityhub.com/2010/12/09/hans-rosling-shows-you-200-years-of-global-growth-in-4-minutes-video/ [singularityhub.com]

Really. I've shown it to my parents, wife, and my two kids (sub-teen); they were all totally enthralled.

Far, far too basic. (3, Informative)

dondelelcaro (81997) | more than 3 years ago | (#35858982)

Just from examining the few preview pages on amazon.com, this book appears to be far too basic for anyone who has actually done any serious work with R. I personally would forgo this entire book, and spend the time wandering through the R Graph Gallery [addictedtor.free.fr] which has far more examples with source code and underlying data. It's also rather odd that this book doesn't cover ggplot, grid graphics, lattice, or any of the more commonly used tools in advanced R graphics.

Perhaps this book could be useful as your first foray into graphing with R... but I'm unconvinced it even covers that well.

Re:Far, far too basic. (0)

Anonymous Coward | more than 3 years ago | (#35859756)

It's not clear to me how a book that only covers R's base graphics is useful. On the one hand, lattice and ggplot are newer, more flexible, and more attractive out of the box. On the other hand, I doubt that the book has the time to cover the dozens of specialized graphing functions in various packages that are based on base graphics.

In terms of people mentioning Excel, it's okay if your data's already in Excel and you want to use one of Microsoft's choices. The actual number of chart types is quite limited -- not counting the various gimmicky, data-hiding charts like 3D bar charts and donut charts -- and it only half implements many features, or implements them in strange ways that cause strange interactions if you want to change part of the graph. If you actually intend to do something interesting, like say a scatterplot with a loess smooth through it, you'll actually end up doing about the same thing as you'd do in R with base graphics -- something which is trivially simple in ggplot2.
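For anyone wondering what a "loess smooth" actually computes, here is a minimal sketch in plain Python. It is not ggplot2's or R's implementation: it assumes a simplified variant (a tricube-weighted straight-line fit around each point, with no robustness iterations), and the names `tricube`, `loess`, and `span` as well as the sample numbers are made up for illustration.

```python
def tricube(u):
    """Tricube kernel: 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess(xs, ys, span=0.5):
    """Smooth ys at each x with a tricube-weighted local straight-line fit.

    `span` is the fraction of points used per local fit. The name and the
    default of 0.5 are choices made for this sketch, not any library's API.
    """
    n = len(xs)
    k = min(n, max(2, round(span * n)))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = max(sorted(abs(x - x0) for x in xs)[k - 1], 1e-12)
        w = [tricube((x - x0) / h) for x in xs]
        # Weighted least-squares line through the neighbourhood of x0.
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


if __name__ == "__main__":
    # Illustrative data only: a roughly linear trend with some wobble.
    xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    ys = [1.2, 2.9, 5.1, 7.2, 8.8, 11.1, 13.0, 14.9, 17.2, 18.9]
    for x, y in zip(xs, loess(xs, ys)):
        print(x, round(y, 2))
```

The smoothed values are what you would draw as the line through the scatterplot. A larger `span` pulls in more neighbours per fit and gives a flatter curve.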

The only actually nice thing about Excel graphics is the support for sparklines, which is very nice in a tabular layout. (Talking about out-of-the-box Excel. Perhaps various third-party addons bring it up to snuff.)

Re:Far, far too basic. (1)

the eric conspiracy (20178) | more than 3 years ago | (#35859968)

I think that is generally true of books from Packt Publishing. They present only an introduction to the topic, and just when you reach the point where some real-world depth is needed to solve your problem, they give out. As such, I avoid books from them.

hardly a review that helps R (0)

Anonymous Coward | more than 3 years ago | (#35859236)

Quote: "Anyone who's tangled with R's graphing without a good example will testify that figuring out the various functions and arguments necessary to wrangle a descriptive graph can be really difficult."

he really is not management material; the 1st thing a manager would ask is, since I'm paying for your time, is there a software tool that is easy to use....

when you spend a lot of time mastering a difficult language or tool, it doesn't mean you are smart and should impart your knowledge to others: it means you are dumb and should have looked for a simpler tool

mmm, R (1)

hubertf (124995) | more than 3 years ago | (#35860814)

nuff said! - try it!

Derailing the data-driven (1)

Glubbdrubb (1450653) | more than 3 years ago | (#35861028)

Let's just be careful we are not overly reliant on pure data in the first place, or we become susceptible to these techniques (http://pastebin.com/p2HfGx1L). P.S. Sorry for the pastebin link, but it looks like Venkat took down his online email archives...

ggplot2 (0)

Anonymous Coward | more than 3 years ago | (#35862280)

Graphs in R are fiddlier and uglier than they need to be. ggplot2 makes it a lot easier (both to create graphs and to manipulate the data behind them). It's based on Tufte's ideas, and lets you put a huge amount of data in plots, cleanly.

Re:ggplot2 (1)

dcl (680528) | more than 3 years ago | (#35863956)

ggplot2 is amazing.

If I need to make a presentation/publication worthy graphic, I'll always see if I can produce it using ggplot2.

better than R. better than Matlab. : Yorick (0)

Anonymous Coward | more than 3 years ago | (#35863810)

Yorick is seriously better than anything else, and no need to "... reach for that secondary language like Python, extract the fields you want in a way similar to your example data set..."

Yorick is a C-syntax interpreted language that can manage huge datasets, parse and convert arbitrary text or binary formats, and produce beautiful graphs. And it's *fast*!!

free and open source.

yorick.sourceforge.net/
