The Power of the R Programming Language

samzenpus posted more than 5 years ago

Programming 382

BartlebyScrivener writes "The New York Times has an article on the R programming language. The Times describes it as: 'a popular programming language used by a growing number of data analysts inside corporations and academia. It is becoming their lingua franca partly because data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America, the InterContinental Hotels Group and Shell use it.'"

Only for certain kind of analyst... (2, Insightful)

Anonymous Coward | more than 5 years ago | (#26366243)

... most others keep thinking that M$ Excel is the silver bullet.

Sad, but f****** true.

Re:Only for certain kind of analyst... (5, Insightful)

Samschnooks (1415697) | more than 5 years ago | (#26366339)

... most others keep thinking that M$ Excel is the silver bullet.

The folks I know who use Excel for analysis use it because it's the package that everyone gets in their organization, there's a shit load of material on the web that uses excel, there's plenty of add-ons for it (no need to reinvent the wheel), and when sharing data and analysis, everyone is familiar with it. An engineer I know who uses excel chose it because it was the fastest way to connect to his testing equipment. R is relatively new and as more folks come into the workforce who know it, we'll see it replace Excel for functions that it is better suited for.

Re:Only for certain kind of analyst... (4, Interesting)

Anonymous Coward | more than 5 years ago | (#26366395)

I guess I was thinking of analysts using Excel to develop "complicated" statistical analyses. Sure, Excel is unbeatable at handling small, tabular datasets and doing basic or even considerable arithmetic with them.

When it comes to doing more elaborate analysis, using Excel IS reinventing the wheel. Plus, it is IMPOSSIBLE to understand later.

Re:Only for certain kind of analyst... (5, Informative)

jaxtherat (1165473) | more than 5 years ago | (#26366655)

Sorry, but R is not relatively new, it's been around for at least 10 years, I was taught how to use R at University back in 2001, and S and later S+ (which R is a FOSS version of) has been around for even longer, since the mid 70's.

Re:Only for certain kind of analyst... (-1, Troll) (1108067) | more than 5 years ago | (#26367309)

"Sorry, but R is not relatively new, it's been around for at least 10 years"

FTFA: "whether being used to set ad prices, find new drugs more quickly or fine-tune financial models."

So we can blame the financial crisis on idiots who don't understand that GIGO applies in EVERY computer language?

Re:Only for certain kind of analyst... (3, Interesting)

colinrichardday (768814) | more than 5 years ago | (#26366845)

Has Microsoft corrected its percentile function? Or does it still put the largest datum in the 100th percentile, as well as assign fractional percentiles?

Re:Only for certain kind of analyst... (4, Informative)

zippthorne (748122) | more than 5 years ago | (#26367379)

Pfft. Matlab is the fastest way to connect to his testing equipment.

Well.. Labview, actually, but no one in their right mind would want to actually use it. Anyway, simulink gets you a lot of the graphical programming features if you need that.

Re:Only for certain kind of analyst... (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#26366643)

If your username was erris or twitter you'd be getting 1000 replies about your use of "m$".

Re:Only for certain kind of analyst... (2, Insightful)

Hatta (162192) | more than 5 years ago | (#26367209)

Do analysts who use R get better returns than those who use Excel?

What's a pirate's favorite programming language? (5, Funny)

Sure about that? (4, Funny)

Bill, Shooter of Bul (629286) | more than 5 years ago | (#26366623)

I'd have guessed perl. Think about it
  1. Sounds like treasure ( pearls)
  2. Reinforces their need to hide their booty by making indecipherable maps to the treasure buried within
  3. Incomprehensible mangling of common symbols, like their English dialect.
  4. It often requires the programmer to consume large amounts of rum as a coping mechanism
  5. New virtual machine Parrot named after favorite pet
  6. Can use actual pirate language to program Acme::Lingua::Pirate::Perl

geekoid (135745) | more than 5 years ago | (#26366283)

Growing in use? sure.

Based on S (1)

BadAnalogyGuy (945258) | more than 5 years ago | (#26366313)

Why they chose to go with R rather than T, I'll never know.

Re:Based on S (1)

drolli (522659) | more than 5 years ago | (#26366407)

The point is S is popular, but expensive. R gets some popularity from that. One could hope also S is made free at some point.

And i also dont know why it is called R.... maybe because sounds starting with a vowel sound better.....

Re:Based on S (3, Informative)

Anonymous Coward | more than 5 years ago | (#26366537)

And i also dont know why it is called R

The guys who originally wrote it both had first names that started with R and, being the jokers that they were, they thought it would be funny to give it a name very similar to S.

Re:Based on S (5, Interesting)

Anonymous Coward | more than 5 years ago | (#26367289)

I wish it had a more googleable name. It's hard to search for help. The signal to noise ratio is low.

Re:Based on S (4, Funny)

irtza (893217) | more than 5 years ago | (#26366417)

Trying to find middle ground with C?

Two links? (1)

CannonballHead (842625) | more than 5 years ago | (#26366329)

There appear to be duplicate links in the summary :)

Show me some example code (5, Insightful)

bogaboga (793279) | more than 5 years ago | (#26366369)

My request is to those that are in the know to show me some example code, that does something useful. Then later, compare that code to code from other languages to accomplish the same task.

Include reasons to support the notion that the R language is [necessarily] better at what it does.

Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26366385)

as opposed to something like SAS (leaving aside licensing costs)

Re:Show me some example code (5, Insightful)

transonic_shock (1024205) | more than 5 years ago | (#26366501)

"I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.""

Seriously, does this person know what she is talking about?

1. Yes, CFD and Structural Analysis software is increasingly written using open source tools and run on open source OS (Linux running on clusters)

2. SAS is not used to design any part of the aircraft.

I have noticed SAS uses the same kind of FUD to counter R as M$ uses to counter Linux.

Re:Show me some example code (5, Insightful)

visible.frylock (965768) | more than 5 years ago | (#26366555)

Seriously, does this person know what she is talking about?

Let's see, Director of technology product marketing. I'm gonna go with a big NO.

Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26367345)

Come on, she is in marketing and I guess she got this position by cleverly convincing similarly unqualified people to buy SAS software.
She doesn't have to know how aircraft are built because she is for sure flying first class.

Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26367375)

Sadly SAS software IS being used to design aircraft. There are a lot of similarities between drug testing and aircraft design space modeling.

Freak your colleagues out with "no loop" code... (4, Interesting)

refactored (260886) | more than 5 years ago | (#26366533)

I remember once years ago freaking my colleagues out with a largish app written in R... with nary a loop anywhere.

Actually that wasn't why I used R, just a fun addendum. The reason to use R is the huge body of statistics, data mining and graphics facilities. Superb.

Of course, the problem with any statistical library is you have to turn your brain on first. Nothing produces "Garbage in Garbage out" quite like statistical analysis.

With R you tend to need to spend far more time thinking about why you are doing something, and what the answer means than in say vanilla C/Ruby programming.

Which is actually not a Bad Thing at all.

The worst thing about R programming is its name. Googling for "R" turns up way too much noise and way too little signal.

Re:Freak your colleagues out with "no loop" code.. (5, Informative)

Anonymous Coward | more than 5 years ago | (#26366815)

"The worse thing about R programming is its name. Googling for "R" turns up way to much noise and way too little signal"

Try searching from [] instead of directly from Google.

Wow! The Google Star of R has risen... (1)

refactored (260886) | more than 5 years ago | (#26366981)

Since I last Googled for R, the page rankings of all things R language related have risen gloriously! Wow!

I retract my sole criticism of R.

Re:Freak your colleagues out with "no loop" code.. (0)

Anonymous Coward | more than 5 years ago | (#26366827)

The statistical libraries are what I use it for, but what pisses me off is the lack of more general data structures, like dictionaries, or vectors of nontrivial types*. It's not nearly as self contained and consistent as, say, Python.

If Python's scientific libraries get better, I'll probably switch to that.

*e.g. if you want a list of functions, it has to be a linked list, which then has O(n) access time.

Re:Freak your colleagues out with "no loop" code.. (2, Informative)

dookiesan (600840) | more than 5 years ago | (#26367155)

The following R code might be implemented with a linked list internally, but I assume they could change this behavior without breaking many programs :

x = vector(mode="list")
x[["joe"]] = y
x[["bob"]] = z #z can be a function!

x = list(joe=y)
x$bob = z
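A runnable variant of the snippet above, with made-up values standing in for `y` and `z` (neither is defined in the post). It only illustrates that a list can be indexed by name and can hold functions; it says nothing about how R stores lists internally.

```r
# y and z are placeholders invented for this sketch
y <- 42
z <- function(n) n^2   # list elements can be functions

x <- vector(mode = "list")
x[["joe"]] <- y
x[["bob"]] <- z

# equivalent construction with list() and $
x2 <- list(joe = y)
x2$bob <- z

x[["bob"]](3)   # call the stored function
names(x)        # the "keys"
```

Lookup by name goes through `[[ ]]` or `$`; whether that is fast enough for a large dictionary is a separate question that this sketch does not measure.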

Re:Freak your colleagues out with "no loop" code.. (4, Interesting)

fm6 (162816) | more than 5 years ago | (#26366977)

I remember once years ago freaking my colleagues out with a largish app written in R... with nary a loop anywhere.

That's a feature of functional languages, a class that also includes Scheme and XSLT. The basic idea is that programs should not have state, because state makes them harder to debug. A for or while loop, by definition, has state, so you have to do your iteration some other way, namely Tail Recursion.

I suppose that makes sense, but I've never been able to teach myself to think that way. It's the main reason I never managed to get through The Wizard Book.
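In R, loop-free code usually leans on vectorized operations and base higher-order functions rather than hand-written recursion. A small sketch with invented data:

```r
values <- c(3, 8, 1, 9, 4)

# vectorized arithmetic: no explicit loop over the elements
squares <- values^2

# higher-order helpers from base R
evens    <- Filter(function(v) v %% 2 == 0, values)
total    <- Reduce(`+`, values)
name_len <- vapply(c("a", "bb", "ccc"), nchar, integer(1))

squares
evens
total
name_len
```

`vapply` is used instead of `sapply` so the result type is declared up front; that choice is a habit of this sketch, not something the thread prescribes.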

Re:Freak your colleagues out with "no loop" code.. (1, Informative)

Anonymous Coward | more than 5 years ago | (#26367311)

The worst thing about R programming is its name. Googling for "R" turns up way too much noise and way too little signal.

Problem solved.

Re:Freak your colleagues out with "no loop" code.. (1)

Daniel Dvorkin (106857) | more than 5 years ago | (#26367319)

The worst thing about R programming is its name. Googling for "R" turns up way too much noise and way too little signal.

(3, Informative)

Kludge (13653) | more than 5 years ago | (#26366541)

(5, Funny)

DahGhostfacedFiddlah (470393) | more than 5 years ago | (#26367031)

the libraries available for doing such analysis are unparalleled.

With multi-core processors becoming more and more prevalent, R's developers should remedy this as soon as possible.

Re:Show me some example code (4, Informative)

Anonymous Coward | more than 5 years ago | (#26366579)

It may not be "better" in the sense of "calculating stuff with higher efficiency" (i reckon you can do the same stuff in C, given the right libraries :P), but for statistical and data mining/visualization purposes it is a quite simple object-oriented functional language with many useful built-in procedures and lots of freely available packages/libraries that is simple enough for "non-programmers" and, so far, it does what i want it to do fast enough and.. it's free.

So.. probably not the best all-purpose programming language, but fits nicely in the "statistical software environment/language" niche and, unlike SPSS et al., it's free (as in "libre", as in "everyone can independently verify your results without having to shell out cash", which is useful in academia).

Example code:

results <- prcomp(datamatrix)

This does a PCA (Principal Component Analysis) on the data contained in "datamatrix" and dumps the results into the "results" variable.

I have no idea how i would start to code that in C, python, etc. in a way that's remotely efficient ;)
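For anyone who wants to run the one-liner above, here is a minimal sketch that uses the `USArrests` data set bundled with R as a stand-in for "datamatrix". The choice of data and the `scale. = TRUE` option are assumptions of this example, not part of the original post.

```r
# USArrests ships with R (datasets package); any numeric matrix would do
datamatrix <- as.matrix(USArrests)

# scale. = TRUE standardizes each column before the decomposition
results <- prcomp(datamatrix, scale. = TRUE)

summary(results)       # variance explained per component
head(results$x, 3)     # scores of the first three observations
results$rotation       # loadings of each variable on each component
```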

Re:Show me some example code (1)

Jurily (900488) | more than 5 years ago | (#26366975)

I have no idea how i would start to code that in C, python, etc. in a way that's remotely efficient ;)

I'd go with

#include "prcomp.h"

Once someone did the algorithm for you, any programming language is easy. I think the point of the language would be, if said algorithm was orders of magnitude easier to code, represent, argue about, etc. in R, than it would be in "C, Python, etc."

Re:Show me some example code (1)

Watson Ladd (955755) | more than 5 years ago | (#26367053)

R has vector operations. Every operator works componentwise on a vector, and does the right thing with scalars. This makes vector heavy code easier and clearer to write.
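A short illustration of the componentwise behaviour described above, using two invented vectors:

```r
v <- c(1, 2, 3, 4)
w <- c(10, 20, 30, 40)

v + w          # elementwise sum
v * 2          # the scalar is recycled across the vector
sqrt(w) / v    # functions and operators both work componentwise
sum(v * w)     # dot product without writing a loop
```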

Re:Show me some example code (2, Insightful)

zippthorne (748122) | more than 5 years ago | (#26367315)

But we already have a language that does vectors correctly. It's called Matlab and it's based on Fortran, which I guess technically also does vectors correctly, if you want to bother to learn it.

Re:Show me some example code (4, Insightful)

Daniel Dvorkin (106857) | more than 5 years ago | (#26367359)

One big advantage R has over Matlab (er, besides the fact that R is OSS, but of course there's Octave for those who want an OSS Matlab alternative) is that R handles non-matrix data structures much, much better than Matlab does. Trying to work with anything that isn't a vector or a matrix in Matlab is an exercise in pain.

Re:Show me some example code (2, Interesting)

stephentyrone (664894) | more than 5 years ago | (#26367173)

I have no idea how i would start to code that in C, python, etc. in a way that's remotely efficient ;)

How about:

#include <clapack.h>
dgesdd( argument list );

This sort of thing is a feature of libraries, not an inherent advantage of one language.

Re:Show me some example code (1)

Neil Blender (555885) | more than 5 years ago | (#26366593)

Include reasons to support the notion that the R language is [necessarily] better at what it does.


Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26366685)

inb4bioconductor :3

oh... shucks -_-'

brb, back2affymetrix ^-^

Re:Show me some example code (4, Informative)

Keyper7 (1160079) | more than 5 years ago | (#26366637)

It's been a while since I worked with it and I don't have code examples with me at the moment, but think of it as the Matlab/Octave of statistics, including the preference for "function over each row/column" instead of loops.

Compared to other languages, R makes it easy to do statistical analysis tasks like Matlab/Octave makes it easy to do linear algebra tasks.

Plus, as other posts stated above, there's excellent documentation and tons of useful libraries (take a peek at the libraries available at the Debian repositories), Bioconductor being the finest example.

Oh, and nice emacs integration. :)
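The "function over each row/column" style mentioned above can be sketched with `apply`. The matrix here is toy data invented for the example:

```r
# toy data: 3 subjects (rows) x 4 measurements (columns)
m <- matrix(c(5.1, 4.9, 6.2,
              3.5, 3.0, 2.9,
              1.4, 1.4, 4.3,
              0.2, 0.2, 1.3), nrow = 3,
            dimnames = list(c("s1", "s2", "s3"), c("a", "b", "c", "d")))

row_means <- apply(m, 1, mean)   # one value per row
col_sds   <- apply(m, 2, sd)     # one value per column
col_range <- apply(m, 2, function(v) max(v) - min(v))

# built-in shortcuts exist for the common cases
all.equal(row_means, rowMeans(m))
```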

Re:Show me some example code (1, Informative)

Anonymous Coward | more than 5 years ago | (#26366695)

R is not a programming language, it is an environment which uses the S programming language. The S programming language was developed in the 70s at Bell Labs.

If you use Linux, you can install R with

yum install R*

which contains many examples.

People switching to R usually started with Splus, which a few years ago worked to close source code contributed by academics. They have chosen to move to R.

Re:Show me some example code (-1, Redundant)

turbidostato (878842) | more than 5 years ago | (#26367323)

"If you use Linux, you can install R with
yum install R*"

I do use Linux, let's see:
mybox:~# yum install R*
bash: yum: command not found

Let's rewrite your sentence:
"On *some* variants of Linux (notably the SUSE family) you can install R with `yum install R*`"

Re:Show me some example code (5, Informative)

lt. slock (1123781) | more than 5 years ago | (#26366715)

I use R a great deal. Think of it as an alternative to MATLAB, or Excel, rather than C or perl or lisp or whatever you like to use as a general purpose language. So, compared to MATLAB, functions are first class objects (rather like lisp), so, you can write functions that take functions as arguments, and return them as well, just as though they were simple variables. It handles vectors rather easily, and has decent plotting tools.

#quick example

# function, which, given numerical arguments a and b, and a function g, returns a function of x
f - function(a,b, g){
    function(x){ a * x + g(b * x)}
}

f1 - f(1,2.5,sin)
x - seq(-pi,pi,l=100)

Re:Show me some example code (2, Informative)

lt. slock (1123781) | more than 5 years ago | (#26366769)

note that the minus signs (as in f - function...) should be (left angle bracket minus sign), that is, the R assignment operator. I guess this is the lameness filter
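With the assignment operator restored as the note above describes and the braces balanced, the example might read as follows. The last two lines (evaluating `f1` and the commented-out plot call) are additions for this sketch, and `l=100` is spelled out as `length.out = 100`.

```r
# function which, given numbers a and b and a function g,
# returns a function of x
f <- function(a, b, g) {
    function(x) { a * x + g(b * x) }
}

f1 <- f(1, 2.5, sin)
x  <- seq(-pi, pi, length.out = 100)
y  <- f1(x)                  # evaluates x + sin(2.5 * x) elementwise
# plot(x, y, type = "l")     # uncomment to draw the curve
```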

Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26366841)

it's because slashdot interprets your comment as html code.. try using "<" instead of the "left angle bracket" ;)

Re:Show me some example code (1)

lt. slock (1123781) | more than 5 years ago | (#26366869)

cheers... don't do a lot of posting...

Re:Show me some example code (1)

tsalmark (1265778) | more than 5 years ago | (#26367371)

I think you mean &lt;

Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26367307)

I support this comparison, but only to a certain extent.

I was taught Matlab for numerical analysis, then years later had SPSS beaten into me for statistics. SPSS has its own pseudo-scripting interface which it calls "syntax" which allows functionality like Excel, but with a slightly more unified way of going between gui and text command. Conversely, R and Matlab are both by and large command line high-level C (from what I can understand, anyway) but R has a great deal more functionality since you can create your own packages. So, to review:

R:Matlab :: SPSS:Excel

Re:Show me some example code (4, Informative)

Anonymous Coward | more than 5 years ago | (#26366717)

i'm a PhD student in biostatistics at a fairly prestigious american university. we use R almost exclusively, because it is better than other statistical software options. reasons for it's superiority are i) it's free ii) it's open source and iii) its considerably more powerful than STATA, SPSS, SAS, etc.

it is true that other languages can be quicker for many tasks. proficiency in C is desirable, but C is not geared toward statistics, where many built-in libraries and user-contributed packages for R implement complex methodologies.

i'm not as versed in C as i am in R, so i can't provide a direct comparison of the languages, but i have included a sample below. it's a function that fits a simple linear model, taking the outcome data and input data (as a matrix) and a couple of other parameters as inputs. it returns a variety of values, including the model coefficients and fitted values. there is an R function that does this exact thing, but we have to do something for homework.

        #use range around 0, for roundoff error
        if(-1e-5<=det(t(x)%*%x) & det(t(x)%*%x)<=1e-5){stop("x'x not invertible",call.=F)}

        beta=solve(t(x) %*% x) %*% t(x) %*% y
        sigma = as.numeric(sqrt(var(y-(x%*%beta))))
        varbeta=sigma * (solve(t(x)%*% x))
        fitted=x %*% beta


        hat=x%*% solve(t(x) %*% x) %*% t(x)
        names(output)=c("beta", "sigma", "varbeta", "fitted", "residuals", "hat matrix")



i'd also say that i'm glad to see some press for R. it's popular in some circles, but not as accepted by companies and some academics because it is open source. the idea is that software you have to pay a licensing fee for must be more reliable because, well, you paid for it (thinking i'm sure you're familiar with).
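The linear-model fragment above has lost its function header and a few assignments. A self-contained sketch of the same idea follows; `fit_lm` and its `tol` argument are hypothetical names, and it uses the usual RSS/(n - p) variance estimate rather than the fragment's `sqrt(var(...))`.

```r
# fit_lm is a hypothetical name; x is the design matrix, y the outcome vector
fit_lm <- function(y, x, tol = 1e-5) {
    xtx <- t(x) %*% x
    # treat a near-zero determinant as singular, to allow for roundoff
    if (abs(det(xtx)) <= tol) stop("x'x not invertible", call. = FALSE)

    xtx_inv   <- solve(xtx)
    beta      <- xtx_inv %*% t(x) %*% y
    fitted    <- x %*% beta
    residuals <- y - fitted
    # unbiased residual variance: RSS / (n - p)
    sigma2    <- sum(residuals^2) / (nrow(x) - ncol(x))
    varbeta   <- sigma2 * xtx_inv
    hat       <- x %*% xtx_inv %*% t(x)

    list(beta = beta, sigma = sqrt(sigma2), varbeta = varbeta,
         fitted = fitted, residuals = residuals, hat = hat)
}

# compare against lm() on a data set bundled with R
x   <- cbind(1, mtcars$wt)
fit <- fit_lm(mtcars$mpg, x)
ref <- lm(mpg ~ wt, data = mtcars)
all.equal(as.numeric(fit$beta), unname(coef(ref)))
```

Solving the normal equations directly is fine for a homework-sized problem; `lm()` itself uses a QR decomposition, which behaves better when the columns of `x` are nearly collinear.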

Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26366927)

"it's superiority"


Well, at least you're not an English major.

Re:Show me some example code (1, Informative)

Anonymous Coward | more than 5 years ago | (#26366853)

Want example code and comparison with other stats software? Here's 80 pages from an entire BOOK devoted to your request:


From the text:
Since its release in 1996, R has dramatically changed the landscape of research software. There
are very few things that SAS or SPSS will do that R cannot, while R can do a wide range of things
that the others cannot. Given that R is free and the others quite expensive, R is definitely worth
It takes most statistics packages at least five years to add a major new analytic method.
Statisticians who develop new methods often work in R, so R users often get to use them
immediately. There are now over 800 add-on packages available for R.
R also has full matrix capabilities that are quite similar to MATLAB, and it even offers a MATLAB
emulation package.

If you'd like to see some examples with accompanying graphics, check out the newsletters or manuals at

I use R because it's free, there's lots of free add-on code, every other statistician I know uses R, it's quick and easy to test stuff out in R, and if you want you can speed up things by writing the most computationally intensive parts of your program in C, C++, or FORTRAN. Also, you can get great graphics out of R if you put in a little effort to learn how.

Re:Show me some example code (2, Interesting)

garcia (6573) | more than 5 years ago | (#26366905)

My request is to those that are in the know to show me some example code, that does something useful. Then later, compare that code to code from other languages to accomplish the same task.

Would you ask someone who utilizes SAS or SPSS to do the same thing? Because that's more or less what R is -- a free version of SAS or SPSS. I work in SAS all day long and I have been planning on using R to automate some of my personal website statistics/graphing that I run regularly because I don't really like doing the queries in MySQL on the console, copying the data to Excel, and graphing the results.

As anyone knows, you should utilize the best tool for any particular job you're doing. There's no sense in recreating the wheel in C or Perl or Foo when R, SAS, SPSS, or whatever does stats, mining, and graphing well.

Re:Show me some example code (1)

Daniel Dvorkin (106857) | more than 5 years ago | (#26367383)

Because that's more or less what R is -- a free version of SAS or SPSS.

More specifically, R is a free implementation of the S language; it would be more accurate to call it "a free version of S+" -- although at this point I suspect that, thanks to CRAN, its capabilities exceed those of the proprietary alternative.

Re:Show me some example code (1, Informative)

Anonymous Coward | more than 5 years ago | (#26366953)

# Draws labelled diagrams with critical
# regions for the normal and t distributions
# Excuse my lack of code reuse, etc. this
# was meant to make diagrams just for a quick
# homework assignment
# Please show me how to do this in SAS!!!
# Tell me you'd even think of trying this in SAS
# to draw pictures for your short homework
# assignment

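crit_norm_diag = function(alpha,lowertail=T) {

        end = -4;
        crit_value = qnorm(alpha);
        if(!lowertail) {
                crit_value = -crit_value;
                end = -end;
        }

        pts = c(end,crit_value);
        pts = sort(pts);

        x = seq(pts[1],pts[2],by=0.01);
        y = dnorm(x);

        x = append(x,c(pts[2],pts[1]));
        y = append(y,c(0,0));

        # the plot and shading calls were lost from the post; reconstructed to mirror crit_t_diag
        plot(function(x) { dnorm(x) },-4,4,xlab='z',ylab='p.d.f.');
        polygon(x,y,col='gray');
        # label placement is a guess: the original branch bodies are missing
        if(crit_value<0) {
                text(crit_value,dnorm(crit_value),round(crit_value,3),pos=2);
        } else {
                text(crit_value,dnorm(crit_value),round(crit_value,3),pos=4);
        }
}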



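crit_t_diag = function(alpha,df,lowertail=T) {

        end = -4;
        crit_value = qt(alpha,df);
        if(!lowertail) {
                crit_value = -crit_value;
                end = -end;
        }

        pts = c(end,crit_value);
        pts = sort(pts);

        x = seq(pts[1],pts[2],by=0.01);
        y = dt(x,df);

        x = append(x,c(pts[2],pts[1]));
        y = append(y,c(0,0));

        plot(function(x) { dt(x,df) },-4,4,xlab=paste('t, d.f.=',df,sep=''),ylab='p.d.f.');
        # shading and labels were lost from the post; reconstructed as a guess
        polygon(x,y,col='gray');
        if(crit_value<0) {
                text(crit_value,dt(crit_value,df),round(crit_value,3),pos=2);
        } else {
                text(crit_value,dt(crit_value,df),round(crit_value,3),pos=4);
        }
}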



Re:Show me some example code (0)

Anonymous Coward | more than 5 years ago | (#26367017)

Why would anyone bother to do any of that for you? Are you handicapped? Too busy doing something altruistic?

And what do you plan to do with the information? Skim it and then dismiss it, maybe after posting a clever fisking?

Satisfy your own curiosity and do your own research.

Re:Show me some example code (2, Informative)

dookiesan (600840) | more than 5 years ago | (#26367097)

In R you can easily extract elements of an array :

x = 1:10 #integers from 1 to 10

#set all even elts of x that are less than 7 to -1

x[(x < 7)&(x %% 2 == 0)] = -1

#y is some big array with several dimensions

#I and J are vectors of integers

z = y[I,,J,,, drop = F]

#'z' is now a sub array

z = y[I,2,J,1,]

#now z is a subarray with fewer dimensions
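To make the indexing above runnable, the sketch below builds a small five-dimensional array for `y` and picks arbitrary index vectors for `I` and `J`; the sizes and values are invented for illustration.

```r
x <- 1:10
x[(x < 7) & (x %% 2 == 0)] <- -1
x   # the even entries below 7 (positions 2, 4 and 6) are now -1

# y stands in for "some big array": five dimensions, sizes chosen arbitrarily
y <- array(seq_len(2 * 3 * 4 * 2 * 3), dim = c(2, 3, 4, 2, 3))
I <- c(1, 2)
J <- c(1, 3)

z_keep <- y[I, , J, , , drop = FALSE]
dim(z_keep)   # all five dimensions are kept

z_drop <- y[I, 2, J, 1, ]
dim(z_drop)   # the two dimensions indexed by a single value are dropped
```

Passing `drop = FALSE` matters in code that later assumes a fixed number of dimensions, since the default silently collapses any dimension of length one.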

Free as in beer (3, Insightful)

visible.frylock (965768) | more than 5 years ago | (#26366521)

"R is a real demonstration of the power of collaboration, and I don't think you could construct something like this any other way," Mr. Ihaka said. "We could have chosen to be commercial, and we would have sold five copies of the software."

Very true. This is what I try to explain to people when they can't understand why some software is given away gratis. Because if they charged for it, given the current attitudes of the market, they wouldn't stand a chance and wouldn't ever get any market share to begin with.

SAS strikes out ^H^H^H er, "back" (5, Informative)

enilnomi (797821) | more than 5 years ago | (#26366525)


She [Anne H. Milley, director of technology product marketing at SAS] adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

Good thing Boeing's not using free software for aircraft simulation tools, space station labs, sub hunters, or moon rockets ;-)

Re:SAS strikes out ^H^H^H er, "back" (5, Informative)

jd (1658) | more than 5 years ago | (#26367151)

Good thing NASA likewise never uses Open Source to design engines and aircraft alongside companies like Boeing. (*This product may contain nuts^H^H^H^Hsarcasm.)

Not a language, really (0, Flamebait)

daknapp (156051) | more than 5 years ago | (#26366635)

Calling R a programming language is like calling Mathematica or Matlab a language. R is a system for statistical tasks that has a language and syntax, but it is not capable of producing stand-alone executables that do not require the entire R environment.

Re:Not a language, really (1, Insightful)

Anonymous Coward | more than 5 years ago | (#26366759)

Calling R a programming language is like calling Mathematica or Matlab a language. R is a system for statistical tasks that has a language and syntax, but it is not capable of producing stand-alone executables that do not require the entire R environment.

So, you're saying java, js, python, perl, and ruby aren't programming languages?

Re:Not a language, really (0)

Anonymous Coward | more than 5 years ago | (#26366797)

That definition makes every scripting language not a language.

However, I can see what you are trying to say.

Re:Not a language, really (1)

VicarofCletus (1144201) | more than 5 years ago | (#26366983)

It's amazing how often I hear people refer to Matlab as a language (mostly engineering professors).

Re:Not a language, really (5, Insightful)

Hobbes_2100 (171980) | more than 5 years ago | (#26367059)

Are you kidding me? Are you really *(*$@#ing, Grade A kidding me?

Python/Perl/Ruby require interpreters. Scheme and Lisp are frequently run within interpreters. "stand-alone executable" require HARDWARE. Any programming system requires *something* underneath it unless you are programming in a purely physical system like an automated abacus with mechanical gears that buzz and whirr.

Programming languages are defined by their Turing completeness: can they do things repeatedly, can they assign values to memory locations and perform some basic set of operations (nand works nicely), can they make decisions. Everything else is fluff.

Perl has "fluff" that handles regular expressions very well.

Python (and others) have "fluff" that make networking and database ops easy.

R has "fluff" that makes it terribly convenient to work with data.

Matlab has "fluff" that makes it very easy to do numerical methods programming.

Mathematica has "fluff" that makes it very easy to do symbolic computation.

Each and every one of these, and most well-known languages, with all their warts and beauty marks are Turing complete and are deserving of the term "programming language".


Re:Not a language, really (1)

Otter (3800) | more than 5 years ago | (#26367245)

It's also (hence the name) an open-source implementation of the much older S platform. The article distorts its history to the point of dishonesty.

Re:Not a language, really (3, Insightful)

slashdotmsiriv (922939) | more than 5 years ago | (#26367347)

Your comment is absolutely wrong.

R is a Turing complete programming language. The fact that it requires an interpreter is completely irrelevant.

Re:Not a language, really (1)

Improv (2467) | more than 5 years ago | (#26367357)

What's libc again? Oh, that's right, it's something C programs generally need to run. So you're only programming in C if you don't use libc or statically linking? How awesome it is to have an "I am actually programming" flag in your compiler and linker!

Re:Not a language, really (1)

Palinchron (924876) | more than 5 years ago | (#26367365)

Note that the truth of your statement does not change when you replace "R" by "python" (and remove the word "statistical"). Nevertheless, I would still call python a programming language.

Re:Not a language, really (3, Informative)

tcsh(1) (683224) | more than 5 years ago | (#26367369)

Actually, R is a real (Turing-complete) programming language like Perl, Python, Ruby, etc. It just happens to have lots of statistical libraries and matrix-oriented functions.
You put #!/usr/bin/Rscript in your first line and it can work just like any other scripting language, with command-line arguments, etc. I use it all the time as a replacement for other scripting languages (think PDL+Perl or Numpy+Python).

R is an excellent language for any scientist. The syntax and semantics of the language are very well thought-out.
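A minimal sketch of the scripting use described above. The file name `summarize.R` and the fallback values are invented for the example; `commandArgs(trailingOnly = TRUE)` is the base R call that returns the arguments given after the script name.

```r
#!/usr/bin/Rscript
# summarize.R is a hypothetical script name; run as: ./summarize.R 3 1 4 1 5
args <- commandArgs(trailingOnly = TRUE)

# fall back to sample values when no arguments are given
if (length(args) == 0) args <- c("3", "1", "4", "1", "5")

nums <- as.numeric(args)
cat("n =", length(nums), " mean =", mean(nums), " sd =", sd(nums), "\n")
```

The shebang path depends on where Rscript is installed; `#!/usr/bin/env Rscript` is a common alternative when the location varies between machines.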

FUD from SAS (3, Insightful)

idiot900 (166952) | more than 5 years ago | (#26366647)

"I think it addresses a niche market for high-end data analysts that want free, readily available code," said Anne H. Milley, director of technology product marketing at SAS. She adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

Talk about FUD. Does SAS indemnify against plane crashes?

Re:FUD from SAS (1)

gishzida (591028) | more than 5 years ago | (#26366847)

You can bet someone else's S they don't...

R sucks as a language (2, Interesting)

idiot900 (166952) | more than 5 years ago | (#26366665)

Actually it may not suck. But having used it on and off over the past few years while not being a statistics pro, I find the R language bletcherous and annoying. - as an assignment operator?

Re:R sucks as a language (1)

idiot900 (166952) | more than 5 years ago | (#26366691)

Well, crap, hit Submit instead of Preview. Meant to say, <- as an assignment operator (I know = works now, but still...)? Bizarre data frame and object semantics? R is quite useful but I really dislike writing anything nontrivial in it.

Re:R sucks as a language (0)

Anonymous Coward | more than 5 years ago | (#26366849)

since you use R "on and off" over several years, i'm not really surprised that you find it annoying. it's not particularly easy to learn (and lacks really good resources for diving into), and takes months or years of dedicated work to gain anything like proficiency. also, to really get anything out of R (at least enough to make it worth the effort of learning) one would have to be using it on relatively advanced statistical projects. otherwise, one would be better off using something more off the shelf.

however, for statistical pros R is IMO the best tool out there, and can be used in very nontrivial settings.

Re:R sucks as a language (2, Informative)

tmoertel (38456) | more than 5 years ago | (#26367041)

The R language is optimized for writing statistical code. It's going to seem a little weird, especially if you have a traditional programming background. Once you spend some serious time writing R code, however, you will probably begin to appreciate many of the things that initially seemed odd.

For example, consider the way R handles function calls:

  • It allows you to pass function arguments by name and abbreviate the names, which is handy during live sessions when you want to call statistical routines that have lots of arguments (which is common).
  • During a function call, arguments are bound lazily, which lets you pick apart the expressions behind them and write functions that serve as control-flow constructs. This lets you do things such as pass model expressions as arguments.
  • Also, function arguments can have default values, which are again evaluated lazily but can also see values within the scope of the function body. This lets you use computed values as defaults and have those values depend on other arguments, which in most programming languages requires extra work on your part.

All of these "oddities" serve to reduce the amount of boilerplate code you need to write when coding up statistics routines. (Click the link above if you want to see examples and take a more in-depth tour of R's fascinating and time-saving function call behavior.)
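A rough sketch of the three behaviors listed above. All function names here (power, show_expr, unless, center, scaled) are made up for illustration and are not part of R itself:

```r
# 1. Arguments can be passed by name, and names can be abbreviated.
power <- function(base, exponent = 2) base^exponent
cube <- power(3, exp = 3)          # "exp" partially matches "exponent"

# 2. Arguments are evaluated lazily: only when (and if) the function uses them.
show_expr <- function(arg) deparse(substitute(arg))
captured <- show_expr(a + b * 2)   # a and b need not exist; nothing is evaluated

# A home-made control-flow construct: `action` runs only if condition is FALSE.
unless <- function(condition, action) if (!condition) action
skipped <- unless(TRUE, stop("never evaluated"))

# 3. Defaults are evaluated lazily inside the function, so they can depend
#    on other arguments or on values computed in the function body.
center <- function(x, around = mean(x)) x - around
scaled <- function(x, by = spread) {
  spread <- max(x) - min(x)        # defined before `by` is first used
  x / by
}

centered <- center(c(1, 2, 3))
rescaled <- scaled(c(0, 5, 10))
```

The same machinery is what lets modelling functions accept an expression such as a formula without evaluating it first.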

Embedded FUD (1)

bstadil (7110) | more than 5 years ago | (#26366669)

I have high regard for Ashlee Vance and miss the podcasts he did while he was at The Register. It puzzles me that he included this old FUD chestnut. Seems like a throwback from the 90's.

Anne H. Milley, director of technology product marketing at SAS ... adds, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet."

High Level Lingua Franca (1)

Baldrson (78598) | more than 5 years ago | (#26366933)

From TFA:

It is becoming their lingua franca partly because data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models.

The "smart set" needs such a high-level lingua franca to express infinite precision financial models of no accuracy whatsoever!

Actually I Prefer Q (0)

Anonymous Coward | more than 5 years ago | (#26367055)

Q is awesome.

Not a programming language! (1)

MrCrassic (994046) | more than 5 years ago | (#26367215)

Or at least in the context it's made out to be in this article. Isn't it a language suited mostly to statistics? For that use, I hear that it's one of the best.

Anonymous Coward (0)

Anonymous Coward | more than 5 years ago | (#26367225)

How does R compare to Python + numpy + plotting libs

The R language and its uses (5, Informative)

golodh (893453) | more than 5 years ago | (#26367257)

I'll pitch in because R deserves better than the usual Slashdot cocktail of random ignorance and immature jokes.

The R language (yes, it's a language; an interpreted language is a language too) has become the main computer language of choice for statisticians (both academics and sundry statistical researchers) around the world. It is used in those cases where researchers feel the need for customized computations rather than the use of a package like SAS or SPSS.

R has become popular through a snowball effect and history. It started as a FOSS re-implementation-from-scratch of the "S" language designed for statistical work at Bell Labs. Some academics and researchers of repute used it (the S language) because at that time (1975) it was very innovative and far better than most alternatives, and others followed. The S language gained a measure of acceptance among statisticians. Then when R became available the cycle intensified because of the much improved availability of the interpreter and its libraries. This cycle continued to the point that by now probably most professional statisticians use it.

As far as I can see, the R language isn't especially sophisticated or elegant, and may strike people used to more modern languages as a bit repugnant. It does however excel in three respects:

(a) it allows for easy access of Fortran and C library routines

(b) it allows you to pass large blobs of data by name

(c) it makes it easy to pass data to and from your own compiled C and Fortran routines

The first reason is particularly important because it allows one to use e.g. pre-compiled linear algebra package like LAPACK, or Fourier Transforms, or special function evaluations and thereby gain execution speeds comparable to C despite being an interpreted language (just like Matlab, Octave, Scilab, Gauss, Ox and suchlike): the hard work is carried out by a compiled library routine which is made easily accessible through the interpreted language. Any algorithm needed in statistics that's available as C or Fortran code can be linked in and called without too much effort.
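To make that concrete, here is a small sketch. The matrix and vector values are arbitrary; solve() and fft() are standard R functions whose heavy lifting is done in compiled code (LAPACK in the case of solve()):

```r
# Small symmetric system A x = b (values chosen arbitrarily for illustration).
A <- matrix(c(4, 1, 0,
              1, 3, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)
b <- c(1, 2, 3)

# solve() hands the actual work to compiled LAPACK code.
x <- solve(A, b)

# Check the answer from the interpreted side: A %*% x should reproduce b.
residual <- max(abs(A %*% x - b))

# fft() is likewise implemented in compiled code.
spectrum <- fft(c(1, 0, 0, 0))   # transform of a unit impulse
```

The interpreted part is only a handful of calls; the numerical work happens inside the library routines.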

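The second reason is important because R does not copy a large object merely to pass it to a function (a copy is made only if the function modifies it), so it slows down execution much less than a naive pass-by-value interpreted language would. Changes made inside the function stay local unless you return the result.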

The third reason is particularly important because it helps researchers be more productive. Reading in your data, examining it, graphing it, tracing outliers and cleaning them up is best done in an interactive environment in an interpreted language. Coding such things in C or Fortran is an awful waste of time, and besides, researchers aren't code-monkeys and don't enjoy coding inane for-loops to read, clean, and display data. Vector and matrix primitives are far more powerful, and usually preferable unless they are so inefficient that you have to wait for the result. However, there are times when you just need to carry out standard algorithms (linear algebra, calculation of mathematical or statistical functions) or simply time-consuming repetitive algorithms that run so much faster in a genuine compiled language. You could start out by coding the algorithm in an interpreted language to check if it's working, and then isolate the computationally expensive part and code it up in C or Fortran. Using R (or Matlab or Scilab) you can *call* the compiled subroutine, pass it your (cleaned) data, and get the result back in an environment where you can easily analyze it.
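As a small illustration of the point about for-loops versus vector primitives, here is the same sample variance computed both ways. The data values are made up; var() is R's built-in, shown only as a cross-check:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up sample

# Explicit loop, the way one would write it in C or Fortran.
m <- mean(x)
total <- 0
for (v in x) {
  total <- total + (v - m)^2
}
loop_var <- total / (length(x) - 1)

# Vectorized: one expression, with the looping done in compiled code.
vec_var <- sum((x - mean(x))^2) / (length(x) - 1)

# Built-in routine for comparison.
builtin_var <- var(x)
```

All three give the same number; the vectorized form is shorter and, on large inputs, usually much faster than the interpreted loop.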

That's why languages like R, Matlab, Scilab, Octave, Gauss, and Ox are so productive: you get the best of both worlds. Both the convenience, interactiveness, and terseness of a high-level interpreted language and the speed of compiled languages.

So why R, and why not Gauss or Matlab or whatever?

Well, part of that is cultural. If you're an econometrician you'll have been weaned on Gauss and/or Ox. When you start to write something to solve a problem, you're likely to reach back for the environment you were taught. Same with engineers and Matlab: that's what they were taught, so that's what they'll use. Statisticians have historically had affinity with S.

Another part is that "R" (or "S") allow you to do some "computations on the language". You can give a function (say a plot function) an unevaluated *expression* as argument, that is, a piece of "R" or "S" code which calculates the function to be plotted. This allows for some easy ways to produce convenient and very general packages (e.g. the R function for linear models allows you to specify a linear model as a formula in terms of existing variables in the workspace, which it can then interpret, evaluate, and estimate).
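A sketch of what this looks like in practice. The data frame d and its columns are invented for the example; lm(), all.vars(), quote() and eval() are standard R. Strictly speaking, what gets passed around is an unevaluated expression or formula object rather than a character string:

```r
# Made-up data: y is roughly 3 + 2 * x with a little deterministic noise.
d <- data.frame(x = 1:6)
d$y <- 3 + 2 * d$x + c(0.1, -0.1, 0.1, -0.1, 0.1, -0.1)

# The formula y ~ x is passed unevaluated; lm() looks up y and x in `d`.
fit <- lm(y ~ x, data = d)
slope <- coef(fit)[["x"]]

# A formula is an ordinary object that code can inspect.
f <- y ~ x
vars <- all.vars(f)

# Expressions can be captured and evaluated later with chosen values.
e <- quote(sin(x) / x)
value <- eval(e, list(x = pi / 2))
```

The model description never has to be parsed by hand; the language hands it to the function as data.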

If you tried to code that in C, you'd end up coding a complete interpreter, which is a big waste of time for a statistical researcher. Trying to get one of the many obscure, untested, undocumented, and low-quality interpreters floating around on the net to work is likewise a big waste of time. Hence "S" and "R" rather than Matlab, Scilab, Octave, and Gauss.

Over the past 10 years "R" has become even more popular in statistical computations because (due to its FOSS character) it's freely available for download and because lots of people have contributed both low-end features (R has a package that implements database connectivity, R can easily produce publication-grade graphs in just about any sane format, there are useful syntax-colouring editors and fairly good GUIs) and high-end features (R packages written by actual researchers and academics that implement a particular statistical algorithm, document it, and make it accessible).

R is serving as a focus point of statistical computation and seems only to increase in quality and applicability over the years.

I tried to learn R... (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#26367337)

...but there are just too many R-rors in it. Ok, to R is human. But this crap made me scream until I ran out of R! I think, the whole language is an R-ror, and needs to be R-raised from the face of the planet Rth.

Damn. It took me 'til here, to realize, that this works only with the German way of speaking the R! :(
What an R-ror.

Fine-tuning financial models (1)

macraig (621737) | more than 5 years ago | (#26367351)

I think we all know how well that's turned out, eh? So is that the fault of the language or programmer error?
