Spark Advances From Apache Incubator To Top-Level Project

timothy posted about 8 months ago | from the distribution-solution dept.

Open Source 24

rjmarvin writes "The Apache Software Foundation announced that Spark, the open-source cluster-computing framework for Big Data analysis, has graduated from the Apache Incubator to a top-level project. A project management committee will guide the project's day-to-day operations, and Databricks cofounder Matei Zaharia will be appointed VP of Apache Spark. Spark runs programs 100x faster than Apache Hadoop MapReduce in memory, and it provides APIs that enable developers to rapidly develop applications in Java, Python or Scala, according to the ASF."

24 comments

Not good (1)

Anonymous Coward | about 8 months ago | (#46375209)

Generally when Spark advances you get engine knock.

Re:Not good (1)

Hognoxious (631665) | about 8 months ago | (#46375765)

I not only get your joke, but I've adjusted the spring-and-weights thingy that controls it. I can't remember its name, mind.

Shit, I'm getting old.

Re:Not good (1)

TheRealHocusLocus (2319802) | about 8 months ago | (#46382449)

spring-and-weights thingy

I think you mean a governor, guv'ner. The origin of the Motor-Operated Pushover is aptly described here, "To think, all I had to do was put the balls on the other side! Aren't they beautiful?" [youtube.com]

I like the Future, I'm in it.

Re:Not good (1)

NelsChristian (66295) | about 8 months ago | (#46382773)

Governor? How about adjusting the contact points which you'd find in the distributor?

I'm gonna tinker with it (1)

Mister Liberty (769145) | about 8 months ago | (#46375271)

Only thing -- where do I get my big data?

Re:I'm gonna tinker with it (0)

Anonymous Coward | about 8 months ago | (#46375333)

Google it sometime ya stupid fuck

Re:I'm gonna tinker with it (0)

Anonymous Coward | about 8 months ago | (#46382923)

Google it sometime ya stupid fuck

That's your job, or will be soon.

Too complicated (-1)

Anonymous Coward | about 8 months ago | (#46375273)

What is this nonsense? Dice, why are you trying to appeal to nerds? I want more news on how to reduce my cable company bill or a bunch of misguided articles on some new social construct theory pooped out by biased research. I also want more whitespace, bigger fonts, and more exposure to the wonderful image library of istockphoto. I can't stand the information density of this design. I like to scroll a lot. If a document could fit on 1 page, I'd rather it be spread out so I can scroll 10 pages instead.

Is there anything you can do for me about all this? Stuff that matters.

Re:Too complicated (0)

Anonymous Coward | about 8 months ago | (#46379315)

Shh, or they'll add horizontal scrolling too!

I hope this is far better than Apache Solr (0)

IgnorantMotherFucker (3394481) | about 8 months ago | (#46375275)

Solr claims to be, yet strictly fails to be, a drop-in search engine for your website.

A former employer of mine, who didn't have a clue about Linux, Java or Open Source, bet the farm on Solr speeding up the report generation for his online service.

I don't want to tell you who this employer is because they provide a valuable service to the business community. But the owner of the company is a raging alcoholic, who devoted at least an hour at the end of each day for not having gotten Solr up and running yet, despite his not having lifted a finger to evaluate it before committing to the project.

If there is the slightest error in Solr's configuration, and you have logging enabled, it spews Java exception stack traces, but does not give you the first clue as to what you did wrong.

Stack traces are for developers, not end-users, M'Kay? How about a diagnostic message?

I repeatedly asked for help on Stackoverflow but no one ever answered my questions. All I ever found were questions from other desperate Solr users, for the most part unanswered.

Before you commit to a technology, or your new bride, or your vote for a political candidate, put it or their name into Google along with "sucks" just for grins.

For example, at the time "Solr sucks" got 600,000 hits.

It is appalling that software like that would be released to end users with the claim it is production quality software.

From time to time I see Solr coding gigs on the job boards. I never apply, I just say "You are doomed" to myself. Perhaps I should do the right thing, by sending a polite email to the hiring manager, suggesting he select some other solution to his problem.

Re:I hope this is far better than Apache Solr (1)

iggymanz (596061) | about 8 months ago | (#46375761)

the target market for Solr is the "enterprise". big corporations who have developers and operations people on staff with heavy duty skills.

don't cry because you can't handle it

Re:I hope this is far better than Apache Solr (1)

jockm (233372) | about 8 months ago | (#46375779)

So do you judge every Apache project this way? Are Apache, Tomcat, Commons, Batik, CouchDB, etc etc etc all crap until proven otherwise because of Solr? Apache is a collection of projects, maintained by different people.

And not to trash your friend's company, but he picked a technology without trying it out first? Then that company had bigger problems than Solr. Nor would I judge Solr by that story (I have never used Solr, nor am I involved with it in any way).

huh? (0)

Anonymous Coward | about 8 months ago | (#46376985)

Solr speeding up "report generation"? That's completely stupid. It's meant to search text and then link back to the "source" documents, it's not meant for report generation.

I am a HUGE solr fan and I think it's one of the most impressive open source projects I've ever seen. Yeah there are a ton of knobs, switches, and things for you to fiddle with but that's what makes it so great, you can do so much with it. At my job, we needed to do "on demand" indexing because it was basically impossible to index all of the stuff in the database all at once, so we wrote a custom "data import handler" to index people's information when a user picks them and it works unbelievably well. Solr handles over 100,000 "build this person's index" requests a day while also handling the search load without blinking. I'm a programmer and I think it's absolutely great.

yeah I knew that but he didn't (0)

Anonymous Coward | about 8 months ago | (#46379715)

his real problem was that his Microsoft SQL Server database had three hundred tables, many of which had three hundred columns.

It, as they say, "Grew Organically".

I never got a look at any of the C# .net code for their web application, but I was told that it had grown organically too.

The long-term plan was to scrap all the C#, then rewrite the entire thing in Java.

However, the company owner never seemed to clue in that there was something wrong with the database.

I resigned in protest when I concluded he was a raging alcoholic. I've seen this many times before; he was never drunk during the work day but there were vast quantities of empty beer bottles, as well as ten lovingly preserved, quite large and empty hard liquor bottles.

The only room in the office that didn't look like a bomb went off in it was a pristine, spotlessly clean room with a real nice pool table, as well as the kinds of liquor advertisement mirrors that one commonly sees in bars.

To top it off, the office did not have sufficient cooling. While I was right next to a door, my cubicle wall was right against the door. I'm pretty sure that's a fire code violation.

Re:I hope this is far better than Apache Solr (0)

Anonymous Coward | about 8 months ago | (#46383213)

Here's some more "research":

Apple sucks: 73,300,000
Windows sucks: 41,800,000
Microsoft sucks: 41,300,000
Linux sucks: 8,980,000
OSX sucks: 769,000
Canonical sucks: 189,000
Electrolux sucks: 147,000

rapid development? (0)

Anonymous Coward | about 8 months ago | (#46375407)

enable developers to rapidly develop applications in Java

That's a contradiction.

Re:rapid development? (0)

Anonymous Coward | about 8 months ago | (#46379337)

Ever heard of MDE, punk?

And Tachyon boosts Spark another 2-8x (2)

michaelmalak (91262) | about 8 months ago | (#46375545)

Spark runs programs 100x faster than Apache Hadoop MapReduce in memory

And Tachyon, another component of Matei's Berkeley Data Analytics Stack, boosts [datascienceassn.org] Spark another factor of 2-8x by sidestepping JVM garbage collection issues.

Degrees before TDC? (0)

Anonymous Coward | about 8 months ago | (#46376103)

Or after? Which is advanced? Anybody got a timing light? Anybody know what a timing light is? Does anybody really care? Know what time?

Spark rarely performs as well as advertised (1)

Anonymous Coward | about 8 months ago | (#46376593)

On one carefully selected benchmark, discounting a lot of things that matter (like data movement), Spark performs better than Hadoop. Tech reports generated by the authors suggest that this is a corner case and that Spark's performance varies wildly. Don't believe the hype.

Re:Spark rarely performs as well as advertised (1)

techhead79 (1517299) | about 8 months ago | (#46377225)

I think the major advantage to using Spark isn't just in the performance but in using libraries such as MLbase/MLlib. Is this not correct? While I realize R is mostly adopted in the industry, MLlib seems to be catching up very fast.

Swiss army knife of data processing (1)

anti-gens (871670) | about 8 months ago | (#46378937)

Spark is a seriously awesome project. I have used it quite a lot in the last 3 months and I have to say, I love it. I currently use the Spark shell as I would use awk. I have input files and I want to process stuff in them and output stuff out. That is probably the best thing about Spark. Almost everyone I talked to is doing something different with it. Machine learning with MLlib, graph processing with GraphX.

my 2 cents on the matter... (0)

Anonymous Coward | about 8 months ago | (#46380333)

I am the 19th comment. *tips hat*

H2O (0)

Anonymous Coward | about 8 months ago | (#46380695)

How does Spark compare with H2O http://0xdata.com/ from Cliff Click?
