
Oracle Claims Dramatic MySQL Performance Improvements

timothy posted more than 2 years ago | from the breathing-in-those-sulfurous-fumes dept.

Databases

New submitter simula67 writes "Oracle wins back some karma from the open source community by releasing MySQL cluster 7.2 with ambitious claims of 70x performance gains. The new release is GPL and claims to have processed over 1 billion queries per minute. Readers may remember the story about Oracle adding commercial extensions to MySQL."


PostgreSQL (0, Insightful)

Anonymous Coward | more than 2 years ago | (#39063913)

Shove it Oracle, I'll stick with PostgreSQL.

Re:PostgreSQL (4, Funny)

Hognoxious (631665) | more than 2 years ago | (#39064373)

Is PostgreSQL webscale? MongoDB is.

Re:PostgreSQL (5, Funny)

roman_mir (125474) | more than 2 years ago | (#39064523)

Any questions? [youtube.com]

Re:PostgreSQL (2)

Sarten-X (1102295) | more than 2 years ago | (#39067883)

The first 27 seconds made a good point about comparing various products and evaluating them on their merits. The remaining 5 minutes was a mix of strawmen and fallacies. Is there supposed to be a point to that?

Re:PostgreSQL (1)

dkleinsc (563838) | more than 2 years ago | (#39064793)

Yes: its strengths are reliability and price point (free), but it's also pretty fast, has clustering capabilities, and has been used for large-scale web applications.

Seriously, it's hard to go wrong with PostgreSQL when you need a relational database. MongoDB, of course, is a very different animal intended for very different tasks.

Re:PostgreSQL (2)

Admiral Llama (2826) | more than 2 years ago | (#39064983)

Versus all the other factors you can throw in there for anything involving heavy lifting for an enterprise app, raw price point of the DB engine is pretty close to the bottom of the list.

Re:PostgreSQL (1)

jedidiah (1196) | more than 2 years ago | (#39067163)

Are you kidding? In an Oracle discussion?

We're talking very large numbers here, and potentially more money than you will make in your entire lifetime.

Someone definitely cares. They may bite their tongue and still buy Oracle, but they do care. The numbers are not trivial.

Re:PostgreSQL (1)

apotheon (577944) | more than 2 years ago | (#39070579)

It's worth noting that there have been some pretty high-profile cases where Oracle was failing to deliver for scaling under heavy load, and companies like EnterpriseDB (a value-added PostgreSQL vendor) provided a PostgreSQL-based solution that saved the day. One such example that comes to mind is the FTD migration a few years back. While the EnterpriseDB deployment of PostgreSQL for FTD wasn't free, it sure as hell was cheaper than maintaining Oracle licenses, and it worked a lot better too -- a pretty big deal when considering the size of the FTD network and the fact this was a Valentine's Day season issue when everybody and his brother is getting some flowers delivered to Mom, Wife, and Mistress at the same time.

Re:PostgreSQL (0)

Anonymous Coward | more than 2 years ago | (#39067827)

Versus all the other factors you can throw in there for anything involving heavy lifting for an enterprise app, raw price point of the DB engine is pretty close to the bottom of the list.

Only when you work for extremely large enterprises. For most companies in America and around the world, it's far, far from the bottom of the list but, as you attempt to point out, not necessarily at or even near the top.

MariaDB? (0)

Anonymous Coward | more than 2 years ago | (#39063977)

How many of these performance improvements were copied back from MariaDB 5.3? For me, it's already too late for Oracle's MySQL. I've switched to Maria and I'm not planning on coming back.

Re:MariaDB? (0)

Anonymous Coward | more than 2 years ago | (#39064271)

Considering MariaDB wasn't really faster than MySQL in any appreciable way, I'm going to guess not many of the improvements came from there. Since both are open source, feel free to compare for yourself though.

Re:MariaDB? (1)

Larryish (1215510) | more than 2 years ago | (#39068839)

Is there a Virtualmin module for MariaDB yet?

If so, I'll switch tonight.

Considering sub queries in IN statements. (5, Insightful)

micheas (231635) | more than 2 years ago | (#39064007)

If you create a query in MySQL with an IN clause in the WHERE clause and put a subquery inside the IN, current versions will run that subquery once for each row of the primary table you are querying. Caching the result alone would probably account for the 70x speedup. I suspect there are other performance stupidities in MySQL that are worked around by people doing a simple query and then using PHP/Perl/Python/Java/etc. to parse the result and generate the second query.
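To make the shape of the problem concrete, here is a minimal sketch with hypothetical orders/customers tables -- the IN-subquery form that MySQL of this era re-executes per row, and the join rewrite discussed in the replies below:

-- Older MySQL re-evaluates the subquery for each row of orders
SELECT id, total FROM orders
WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU');

-- The equivalent join, which the same optimizer handles far better
SELECT o.id, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.region = 'EU';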

Re:Considering sub queries in IN statements. (5, Informative)

Anonymous Coward | more than 2 years ago | (#39064183)

If you create a query in MySQL with an IN clause in the WHERE clause and put a subquery inside the IN, current versions will run that subquery once for each row of the primary table you are querying. Caching the result alone would probably account for the 70x speedup. I suspect there are other performance stupidities in MySQL that are worked around by people doing a simple query and then using PHP/Perl/Python/Java/etc. to parse the result and generate the second query.

Yes, but many database engines have problems with inner select statements. Refactor your query to use a join and you'll be fine.

Re:Considering sub queries in IN statements. (4, Insightful)

K. S. Kyosuke (729550) | more than 2 years ago | (#39064299)

Yes, but many database engines have problems with inner select statements. Refactor your query to use a join and you'll be fine.

And what's wrong with using an RDBMS with a non-braindead optimizer?

Re:Considering sub queries in IN statements. (3, Insightful)

cheater512 (783349) | more than 2 years ago | (#39065575)

The poor RDBMS has to put up with crap like that over and over again.
It will probably need therapy after you are through with it.

The programmer is supposed to know how to use the tool effectively to get the best performance.
In that example there is a stupidly obvious solution - use a join.

Re:Considering sub queries in IN statements. (4, Insightful)

K. S. Kyosuke (729550) | more than 2 years ago | (#39067561)

The poor RDBMS has to put up with crap like that over and over again. It will probably need therapy after you are through with it.

The programmer is supposed to know how to use the tool effectively to get the best performance. In that example there is a stupidly obvious solution - use a join.

To me that sounds like fixing a bug in the wrong place. RDBMS is *not* supposed to be dumb. If it were supposed to be dumb, we would not be using SQL in the first place and we'd be passing our own execution plans in the DB engine's internal intermediate language instead. (Well, you can sort of do it with Firebird (BLR) and SQLite (VDBE), but it's not how it's supposed to be used.)

Re:Considering sub queries in IN statements. (2)

theshowmecanuck (703852) | more than 2 years ago | (#39068327)

This is going to happen less and less as programmers decide they don't need to understand databases because they use ORMs like Hibernate. I know many programmers who don't want to know anything about SQL beyond what they need to make their Hibernate mappings work.

Re:Considering sub queries in IN statements. (2)

apotheon (577944) | more than 2 years ago | (#39070611)

It's tough to blame those programmers who don't want to use SQL. It's a miserable language. Even one of its principal designers hates it.

Re:Considering sub queries in IN statements. (1)

znrt (2424692) | more than 2 years ago | (#39066347)

And what's wrong with using an RDBMS with a non-braindead optimizer?

from a general database design point of view, everything. considering it is already a quite gratuitous limitation to tie yourself to any particular rdbms, relying on a particular optimizer is just an extreme form of self-mutilation.

it's true, however, that sometimes specific requirements of a problem will force you to rely on particular rdbms features, including but not limited to optimization. this could be what ... 0.1% of the time?

and what's wrong with actually learning sql and understanding how rdbs' actually work?

Re:Considering sub queries in IN statements. (1)

Anonymous Coward | more than 2 years ago | (#39067719)

and what's wrong with actually learning sql and understanding how rdbs' actually work?

This.

I haven't been a true database developer since Oracle 7 way back in the mists - but really. Just learn SQL and tune your SQL properly. It's not fricking rocket science.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39064313)

PostgreSQL and Oracle have terrible IN sub select performance. If you're having trouble with MySQL, you'd be shocked what the "big boys" are like.

You provide no data: no volume of records, no table defs and indexes, no SQL code. Provide something concrete and you might get some help.

Re:Considering sub queries in IN statements. (3, Interesting)

Anonymous Coward | more than 2 years ago | (#39064353)

The latest (past year's worth) of PostgreSQL releases automatically rewrite subqueries as joins for you.

However, MSSQL 2008 performance dropped significantly from prior versions due to the exact opposite --- subqueries are now looped through, where previously they were handled with hash tables internally.

Re:Considering sub queries in IN statements. (3, Informative)

micheas (231635) | more than 2 years ago | (#39064437)

SELECT id, title FROM page WHERE id IN (SELECT pageid FROM hotnews WHERE user = 8)

The answer to this is to execute the subquery and pass the result back as an array. If you have 10,000 records in table page, you will see about a 2000x speed improvement if you are using PHP.
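A rough sketch of the workaround being described, reusing the page/hotnews tables from the example above; the literal id list (3, 17, 42) is made up for illustration:

-- Step 1: run the inner query on its own; the application collects the ids
SELECT pageid FROM hotnews WHERE user = 8;

-- Step 2: the application (PHP, in this case) interpolates the ids as literals
SELECT id, title FROM page WHERE id IN (3, 17, 42);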

Re:Considering sub queries in IN statements. (2)

K. S. Kyosuke (729550) | more than 2 years ago | (#39064491)

That's utter crap. Why involve the client? Why the extra roundtrips? If you need to do the thing you've just described, you must be doing something horribly wrong.

Re:Considering sub queries in IN statements. (1)

errandum (2014454) | more than 2 years ago | (#39069019)

Where do you see the client and the roundtrips? There would be only one (a single DB access), and this method would be 100% transparent to the client.

Re:Considering sub queries in IN statements. (2, Informative)

Anonymous Coward | more than 2 years ago | (#39064581)

That's just a crappy query... no wonder you've got bad performance.

SELECT page.id, page.title FROM page JOIN hotnews ON page.id = hotnews.pageid AND hotnews.user = 8

There. Enjoy your performance boost.

Re:Considering sub queries in IN statements. (1)

K. S. Kyosuke (729550) | more than 2 years ago | (#39064869)

The inner query was *perfectly fine*. It wasn't correlated with the outer query in any way. Nothing prevents the RDBMS from optimizing the sequence of these two queries by first fetching the results of the inner query, hashing them, and using them to filter the outer query. Optionally, the optimizer is free to rewrite it as a join on its own if the expected result set from the inner query is large.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39064783)

Yes... that's the right answer... *facepalm*

Re:Considering sub queries in IN statements. (2)

viperidaenz (2515578) | more than 2 years ago | (#39064883)

I swear some people don't even know there is a JOIN keyword in the SQL language.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39065313)

That was obviously a simplistic example... there are times when an IN statement is required, or makes an otherwise unintuitive query much easier to follow.

Either way if you have to write your SQL around poor areas of the optimizer it's not a good optimizer.

Re:Considering sub queries in IN statements. (2, Insightful)

Anonymous Coward | more than 2 years ago | (#39065801)

"some" people are perfectly aware of JOIN and have suffered one too many hairball, incomprehensible SQL statements with 10+ tables mashed together under FROM using JOIN. Sub-queries provide opportunities to factor out some of these into discrete parts that can be individually examined and tested.

Please, stop discouraging the use and application of sub-queries. If you have encountered people that insist on using them despite MySQL's tragically bad optimizer, it is likely they have been exposed to powerful databases that haven't been copping out on basic RDBMS functionality for over a decade. Instead, harangue the MySQL developers; they are the people at fault.

Re:Considering sub queries in IN statements. (1)

sg_oneill (159032) | more than 2 years ago | (#39069929)

Which would be fine if it wasn't for the fact that the version with the JOIN isn't just faster, but simpler to understand.

Seriously there is no excuse for bad SQL. Subqueries have their places, but using them like that is just bad programming.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39067373)

I use JOIN if I want to JOIN. I use filters (IN, EXISTS) if I want to filter. Doing a JOIN when you want EXISTS is a hack.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39065489)

God I can't believe MySQL is still making people suffer this. The answer is to demand that this database catch up with the 90's and handle simple, basic stuff like sub-queries efficiently. There is absolutely NO REASON for this query not to operate efficiently.

Is that romper room database still silently truncating numbers and strings by default?

Re:Considering sub queries in IN statements. (1)

micheas (231635) | more than 2 years ago | (#39067155)

The version of MySQL in Debian no longer silently truncates strings by default.

I found out because I was abusing the "feature" to avoid worrying about sanitizing the length of input data; my import script suddenly stopped working last year and I had to do things a little less wrong.

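For reference, the truncation behavior is governed by the server's SQL mode; a minimal sketch with a hypothetical table t:

-- With strict mode on, an over-long value is an error instead of a truncation
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
CREATE TABLE t (name VARCHAR(4));
INSERT INTO t VALUES ('toolong'); -- ERROR 1406: Data too long for column 'name'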

Re:Considering sub queries in IN statements. (1)

StuartHankins (1020819) | more than 2 years ago | (#39066295)

Before using an "in" clause here I'd probably use this (depends on the number of anticipated rows in both tables):
select id, title
from page as a
where exists (
    select 1
    from hotnews b
    where b.user = 8
      and b.pageid = a.id
)
I would rearrange the fields in the "where" clause to match the index, highest selectivity first. If there were no suitable index I'd consider creating one, possibly a covering index.
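A sketch of the covering index mentioned above, assuming the hotnews (user, pageid) schema from earlier in the thread; with user leading, both predicates of the EXISTS probe can be answered from the index alone:

CREATE INDEX ix_hotnews_user_pageid ON hotnews (user, pageid);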

Re:Considering sub queries in IN statements. (1)

Admiral Llama (2826) | more than 2 years ago | (#39064453)

MySQL choked on an "...and (condition or condition)", doing a seq scan instead of filtering even though there was an index on the table those conditions were on. I changed it to a "where in (...)" and got the same result. I had to move the OR conditional into a derived table that was joined in.

For the same type of query Informix does a dynamic hash join with no SQL thuggery necessary.

Re:Considering sub queries in IN statements. (1)

Anonymous Coward | more than 2 years ago | (#39067275)

I've never had any problems with Oracle IN sub select performance: execution plans are pretty reasonable. Can you elaborate?

Re:Considering sub queries in IN statements. (5, Informative)

wstrucke (876891) | more than 2 years ago | (#39064785)

You're probably right, but that's not what this release is referring to. The NDBCLUSTER engine [mysql.com] separates "API" nodes from Data nodes. A server running MySQL with NDBCLUSTER enabled is considered an API node, but you can also have a C++ or Java or whatever API node that isn't MySQL. Data nodes are provisioned in one or more "node groups" with one or more data nodes in each group, though it would be dumb to have a single node group or a single node in a node group. Each node group splits the data somehow. You can force tables to exist on just one group, but by default if you have three node groups you would have approximately a third of your data in each group.

Anyway -- prior to NDBCLUSTER 7.2, if you performed any join whatsoever, the API node had to pull the complete tables from all data node groups before doing the join itself and returning the result. This made join performance, simply put -- terrible. I've tested the same query on a standalone, out-of-the-box MySQL server and on a 7.1 cluster: a query that came back in 8 seconds standalone took several minutes on the cluster due to the join performance.

NDBCLUSTER 7.2 adds what was called "push down joins" [blogspot.com] in development -- basically, for certain joins the data nodes now do the join within their own subsets of the data, resulting in a dramatic improvement in performance, since the API nodes now just get the result over the network instead of the entire dataset.

It really is an amazing improvement and is a result of the dynamic capabilities of MySQL. NDBCLUSTER was never designed for the type of data people are throwing at it, and with the recent improvements it might actually be viable for use on production web sites for more common usage scenarios.

What I do not see addressed yet with 7.2 is the reload time -- if you have a cluster loaded with several GB of data it can take upwards of 20 minutes to reload the data and indices across the cluster. While the cluster is designed to be up 24x7 (even through upgrades), a single bug or memory error that takes it down can result in a prolonged outage. There are several of these [mysql.com] open in the bug tracker.
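For context, a minimal sketch of what this looks like from a MySQL API node; the table definition is hypothetical, and ndb_join_pushdown is the 7.2 server variable that (as I understand it) toggles the push-down join behavior described above:

-- The table lives on the data nodes, not in a local storage engine
CREATE TABLE page (
  id INT NOT NULL PRIMARY KEY,
  title VARCHAR(255)
) ENGINE=NDBCLUSTER;

-- Push-down (AQL) joins can be toggled per session in 7.2
SET SESSION ndb_join_pushdown = ON;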

Re:Considering sub queries in IN statements. (4, Informative)

Anonymous Coward | more than 2 years ago | (#39064979)

If you create a query in MySQL with an IN clause in the WHERE clause and put a subquery inside the IN, current versions will run that subquery once for each row of the primary table you are querying. Caching the result alone would probably account for the 70x speedup.

We have exactly that in MariaDB 5.3 (a fork of MySQL): http://kb.askmonty.org/en/subquery-cache. It won't always give the 70x speedup; sometimes you need to start execution from the subquery to get decent performance: http://kb.askmonty.org/en/semi-join-subquery-optimizations.

I am sorry for shameless self-promotion, couldn't resist :-)
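Both features the poster links to surface as optimizer_switch flags in MariaDB 5.3; a sketch, with flag names taken from the linked documentation:

-- Subquery cache (on by default in 5.3)
SET SESSION optimizer_switch = 'subquery_cache=on';

-- Semi-join subquery optimizations
SET SESSION optimizer_switch = 'semijoin=on,materialization=on';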

Re:Considering sub queries in IN statements. (1)

Anonymous Coward | more than 2 years ago | (#39065135)

Each SQL engine has its own set of limitations and performance bugs. You have to deal with them and optimize your code accordingly. A lot of people tend to forget the golden rule: do as much data sorting and filtering as possible in the database, and leave the logic in the code. You shouldn't pass an array in an SQL request; that data should be stored in a temporary table instead.

Re:Considering sub queries in IN statements. (3, Informative)

jon3k (691256) | more than 2 years ago | (#39065359)

Just FYI, it's called a correlated subquery.

Re:Considering sub queries in IN statements. (-1)

Anonymous Coward | more than 2 years ago | (#39065761)

Just fyi, you're called a worthless little pompous shite.

Re:Considering sub queries in IN statements. (1)

K. S. Kyosuke (729550) | more than 2 years ago | (#39067583)

Just fyi, It's called a correlated subquery.

If the thing he's talking about had been a correlated subquery, he wouldn't have been able to cache its results. Ergo, he must be talking about a lousy execution of a non-correlated subquery.

Re:Considering sub queries in IN statements. (2)

StuartHankins (1020819) | more than 2 years ago | (#39066197)

If your living depends on you knowing how to do something well, you will learn to use the tools most effectively.

Professional DBAs know this, and know how to code correctly to avoid the problem in the first place. Having it "work" vs. having it work reliably, quickly, and in a manner which scales is what separates any computer user from a professional. Training, ambition, curiosity, and experience are what transform you into that professional.

So while it may be nice to have the machine try to outsmart you while you work in a 4th generation language, learning how and why it works can be illuminating in other projects and other methods. You are able to do more because you know more.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39069427)

For those that didn't RTFA, that is exactly how it was achieved. They put a memcached component in so that (obviously) the select queries would be cached, knowing full well that those results may be hit over and over during the JOIN statement.

Re:Considering sub queries in IN statements. (0)

Anonymous Coward | more than 2 years ago | (#39069835)

For those that didn't RTFA, that is exactly how it was achieved. They put a memcached component in so that (obviously) the select queries would be cached, knowing full well that those results may be hit over and over during the JOIN statement.

Please, you have *no* idea what you're talking about.

The Memcache interface is just that--an interface to make it easy to talk to Memcache from the NDB storage engine. It has NOTHING to do with increased execution speed for many joins on NDB tables.

And yes, the 70x speedup figure is about what I got the first time I compared an NDB 7.1 release with a 7.2 that I built from the development tree just after the SPJ patches went in.

Finally! (2)

K. S. Kyosuke (729550) | more than 2 years ago | (#39064053)

I knew they would implement the Oracle storage backend into MySQL one day...

Re:Finally! (3, Interesting)

Admiral Llama (2826) | more than 2 years ago | (#39064539)

No, rather it's amazing what happens when you architect the DB such that the entire thing is forcibly held in memory.

Re:Finally! (1)

K. S. Kyosuke (729550) | more than 2 years ago | (#39067599)

That was supposed to be a joke. :)

MySQL Cluster (1)

CastrTroy (595695) | more than 2 years ago | (#39064063)

MySQL Cluster seems like it could be interesting, but I can't get over how it requires the whole thing to be hosted in memory. I'm much more interested in Percona Cluster [percona.com], which is also based on MySQL.

Re:MySQL Cluster (0)

Anonymous Coward | more than 2 years ago | (#39064171)

Swap isn't really "memory" in the sense that RAM is. Unless you're running WAMP or something.

Re:MySQL Cluster (1)

ducomputergeek (595742) | more than 2 years ago | (#39064181)

Using memory for DB cluster nodes isn't new. Teradata has been doing it for a long time now in their data warehousing. It has its advantages on large databases, when you have the money to afford the nodes you need for reliability.

MySQL Cluster != MySQL (5, Informative)

Anonymous Coward | more than 2 years ago | (#39064083)

Not the same thing.

Re:MySQL Cluster != MySQL (1)

Animats (122034) | more than 2 years ago | (#39064383)

MySQL Cluster used to be a pay-only product. Is it free now?

(The setup with one MySQL master and N read-only slaves has been free for years. That powers Wikipedia, for example. It's not MySQL Cluster. A MySQL Cluster has multiple read-write machines.)
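A minimal sketch of that master/read-slave topology using classic MySQL 5.x replication syntax; the host, credentials, and binlog coordinates are placeholders:

-- On each read-only slave, point at the master and start replicating
CHANGE MASTER TO
  MASTER_HOST = 'db-master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'secret',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS = 4;
START SLAVE;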

Re:MySQL Cluster != MySQL (0)

Anonymous Coward | more than 2 years ago | (#39065035)

MySQL Cluster has never been a pay-only product.

But of course it reads from RAM (4, Interesting)

photonyx (2507666) | more than 2 years ago | (#39064137)

The case where the data set is bigger than the available RAM has not been investigated (link here [blogspot.com]; see the comments). Hard drive I/O speed would slow it dramatically, unless it's an expensive array of SSDs.

Re:But of course it reads from RAM (2)

fuzzyfuzzyfungus (1223518) | more than 2 years ago | (#39064265)

Nothing can really save your performance numbers once you have to hit the platters, so the question is not "does it suck more when it can't work purely in RAM?" but "how good are the mechanisms available for minimizing (ideally automatically, but at least without too much black magic on the application programmers' part) how often your setup needs something that isn't in RAM and gets bottlenecked fetching it from disk?"

Re:But of course it reads from RAM (2)

Bill, Shooter of Bul (629286) | more than 2 years ago | (#39065617)

Who the hell uses actual disks when performance is an issue?
Also, who the hell uses SSDs when performance is an issue?
The fast kids use PCI Express cards loaded with flash:
Texas Memory Systems or Fusion-io. Expensive, yes, but this is MySQL CLUSTER we are talking about. You don't use that unless you need the speed.

Re:But of course it reads from RAM (1)

sitkill (893183) | more than 2 years ago | (#39067027)

That's weird... I believe that when the dataset is bigger than RAM, you just spread it across different servers (or just get more RAM). This isn't anything new; it's actually old, OLD technology, and you can find cases of it everywhere (Google uses it, Microsoft uses it for Hotmail, etc.). It's called the brick architecture or something like that.

Actually, a ton of research was done on these types of architectures by Jim Gray (who tragically was lost at sea a few years ago) at Microsoft. Not sure how many of these links are still active but: http://research.microsoft.com/en-us/um/people/gray/ [microsoft.com]

From my own experience, at a job a few years ago, we were dealing with a little over 1 petabyte of data, and the system was engineered to NEVER hit platter, and to always have data in ram.

Re:But of course it reads from RAM (1)

Trogre (513942) | more than 2 years ago | (#39067405)

That would make it a RAED array.

That's great... (1)

Daetrin (576516) | more than 2 years ago | (#39064295)

Now can they please work on some dramatic usability improvements so i don't have to cringe every time an Oracle support question comes up at work?

Re:That's great... (1)

jedidiah (1196) | more than 2 years ago | (#39067225)

If you can't be bothered, you can ignore a lot of the finer details in Oracle just as much as you would be prone to with any other RDBMS.

Re:That's great... (1)

Daetrin (576516) | more than 2 years ago | (#39067369)

Uh, if it was something i could ignore then i wouldn't be cringing, i would be just shrugging and ignoring the problem. As long as some of our (big) customers insist on using Oracle we have to work with Oracle to support them. And they're not going to accept "we didn't implement/debug that feature in Oracle because working with Oracle is a pain" very well.

Yes, I RTFA (sue me) (3, Informative)

billcopc (196330) | more than 2 years ago | (#39064301)

If I read the sales pitch correctly, they just integrated Memcached as a backend storage module, so that it plays nicely wrt ACID compliance. Yeah, memory is 70x faster than disk I/O... big whoop!

Anyone running a sizeable MySQL installation already has heaps of RAM allocated to the InnoDB buffers/caches anyway. It sounds like Oracle compared a stock, distro-default MySQL to their memory-hungry tweaks. Yeah, DUH. I can get a 70x speedup too if I increase MySQL's memory usage from the default 64 MB to 48 GB.
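A sketch of the tuning being alluded to: sizing the InnoDB buffer pool in my.cnf so the working set fits in memory (the 48G figure mirrors the poster's example, not a recommendation):

[mysqld]
# default is far smaller; size this to your working set and available RAM
innodb_buffer_pool_size = 48G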

Re:Yes, I RTFA (sue me) (2)

medv4380 (1604309) | more than 2 years ago | (#39064817)

If you're going to read an article about a benchmark you might actually read the benchmark too.

Previously each data node exploited about 6 CPUs with 8 threads, in this benchmark we used about 14 CPUs with 24 threads. We've also removed a number of bottlenecks and the configuration of the threads is done in such a manner as to handle many different loads well.

The little thing you read about memcached was about adding a native API, which probably helped, but the multithreading upgrade probably had more to do with it. They were also comparing two benchmarks that both used just memory, not a case where one used disk and the other used RAM.
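For reference, the data-node thread count quoted above is set in the cluster's config.ini; a minimal sketch, assuming the MaxNoOfExecutionThreads parameter behaves as documented for 7.2:

[ndbd default]
# let each multi-threaded data node (ndbmtd) run more worker threads
MaxNoOfExecutionThreads = 24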

Re:Yes, I RTFA (sue me) (1)

bdgregg (744616) | more than 2 years ago | (#39065023)

I thought he was talking about the 70x result, not the 1 billion result. Quoting from the 70x benchmark:

For this test, 3 machines were used:

Intel Core 2 Quad Core @2.83 GHz; 8 Gbytes RAM; single, multi-threaded data node (ndbmtd)
Intel Core 2 Quad Core @2.83 GHz; 8 Gbytes RAM; single, multi-threaded data node (ndbmtd)
4 Core Fedora VM running on VirtualBox on Windows 7, single MySQL Server

It looks like the AQL feature has pushed work from that MySQL-on-VirtualBox-on-Windows-7 system to the pair of data nodes with 8 GB RAM, and found it runs 70x faster. It doesn't say how much RAM the MySQL server had, what the data nodes were running, or why it was running in that virtualized environment in the first place. I can't say I'm surprised that offloading work from such a place has provided a big win. It may be more interesting to see the result from a more typical setup, such as MySQL running on a baremetal OS.

Re:Yes, I RTFA (sue me) (1)

dreemernj (859414) | more than 2 years ago | (#39065495)

It sounds like the RAM isn't actually the real boost. This is MySQL Cluster. The last time I used it, it kept everything in RAM already. I think since then they let you store non-indexed tables onto disk, but that's it. The speed boost sounds like it is for NDB, not InnoDB. NDB doesn't support foreign keys and removing that constraint probably helps its performance too.

In this case, the actual boost is from AQL, which allows for a query to be split up by JOINs and sent out to different servers to run against their local tables in parallel.

Re:Yes, I RTFA (sue me) (0)

Anonymous Coward | more than 2 years ago | (#39065765)

Actually, the older versions could ONLY keep the data in RAM on the data nodes. Newer versions (7.x) can hold it on disk too.

Re:Yes, I RTFA (sue me) (1)

dreemernj (859414) | more than 2 years ago | (#39065991)

Yep, that's exactly what I said. But thank you for reiterating.

Re:Yes, I RTFA (sue me) (0)

Anonymous Coward | more than 2 years ago | (#39066351)

And towards the bottom of the discussion we have people that read and understand the article. Yay!

Re:Yes, I RTFA (sue me) (1)

lewiscr (3314) | more than 2 years ago | (#39067037)

DUH. I can get a 70x speedup too if I increase MySQL's memory usage from the default 64 MB to 48 GB.

Just make sure you actually test it. Some of the performance recommendations in the manual don't scale from 64 MB to 48 GB. MyISAM Key Buffer and Query Cache, I'm looking at you.
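A sketch of the two settings being called out, with deliberately modest values; both are known to degrade rather than improve at extreme sizes (the query cache in particular serializes access through a global lock):

[mysqld]
# MyISAM key buffer: gains flatten out well before it approaches RAM size
key_buffer_size = 2G
# query cache: keep it small (or off); every write to a table invalidates
# all cached results that reference it
query_cache_size = 64M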

the old release (2)

nimbius (983462) | more than 2 years ago | (#39064375)

was GPL as well.
don't forget that Oracle committed that the MySQL server will continue to use the dual-licensing strategy long used by MySQL AB, with commercial and GPL versions available until at least 2015. if other big players are any indication, this is oracle's attempt to avoid another apache/java landmine and preserve the trust and respect of developers until such time as the product is sufficiently 'in-housed' to divorce from the community without fear of retaliation. by 2015 the forked code, should there be yet another fork, can probably enjoy a tenth of the performance of oracle's version and will have to compete with ingrained lock-ins and contract conditions developed by oracle to further ostracize open source competitors.

but i'm not sure if it's really relevant at all. databases like hypercube and couch are really giving oracle a run for their copious amounts of money. by 2015 the "paradigm" may have "shifted", as the PHB says.

MySQL in-memory speed (5, Interesting)

Amadodd (620353) | more than 2 years ago | (#39064377)

Slightly off-topic, but I recently had the opportunity to test the speed of a MySQL in-memory database. I have some frequently queried read-only data that simply would not handle the load in MS SQL, and I was looking for an in-memory solution. MySQL provided the simplest implementation: simply tell the table to use memory storage and configure the server to allow the amount of data you want to host (~250MB in this case). You also have to remember to reload the data from the normal InnoDB tables every time you restart the server. I used the same table structures, keys, indexes and (almost the same) stored procedures to query the data, and linked it through MS SQL so that my applications never knew the difference. On exactly the same hardware the speed increase was at least 50x over MS SQL.
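A minimal sketch of that setup with a hypothetical hot_lookup table; the MEMORY copy starts empty after every restart, and the source table must avoid types the MEMORY engine can't store (e.g. TEXT/BLOB):

-- Raise the per-table cap for MEMORY tables (affects new sessions)
SET GLOBAL max_heap_table_size = 268435456; -- 256 MB

CREATE TABLE hot_lookup_mem LIKE hot_lookup;
ALTER TABLE hot_lookup_mem ENGINE = MEMORY;

-- Reload from the InnoDB copy after each server restart
INSERT INTO hot_lookup_mem SELECT * FROM hot_lookup;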

Re:MySQL in-memory speed (2)

bbbaldie (935205) | more than 2 years ago | (#39064687)

I've programmed PHP against both SQL Server and MySQL; the MySQL statements typically run in milliseconds, while it usually takes SQL Server two or three seconds to respond.

Re:MySQL in-memory speed (1)

rgbrenner (317308) | more than 2 years ago | (#39064955)

it usually takes SQL Server two or three seconds to respond

then you're incompetent. Stack Overflow uses SQL Server 2008, and serves 12-14 million pages a day. [brentozar.com]

Re:MySQL in-memory speed (1, Funny)

bbbaldie (935205) | more than 2 years ago | (#39065045)

Thanks for judging me. And for being so humble about it. Oh, and thanks for helping to dispel the notion that arrogant pricks hang around /. :-)

Re:MySQL in-memory speed (1)

SuiteSisterMary (123932) | more than 2 years ago | (#39065439)

Well, what did you expect?

Re:MySQL in-memory speed (1)

bbbaldie (935205) | more than 2 years ago | (#39065539)

;-) This sentence added because the winking emoticon above was rejected as a valid comment...

Re:MySQL in-memory speed (1)

gbjbaanb (229885) | more than 2 years ago | (#39066961)

that's 162 *pages* per second, not SQL queries. Thing is, how many of those are cached and simply re-served up again? How big are the servers that run these? How many servers do they use? How much Lucene do they use instead of SQL Server? (you didn't watch the video, did you? - 23 minutes in)

Anyway, your argument is like saying Facebook takes 2 seconds to return my wall of data whereas MyPersonalWebsite with a PHP script I wrote returns instantly, therefore Facebook are useless losers.

There might well be a 2-second delay if the server is repeating many different SQL queries -- I used to see an Oracle DB that spent most of its time waiting on the query cache -- or if it's hitting the DB with queries that lock a lot of data pages -- I've seen an app that happily took a lock on nearly all the data in various tables, escalating to a full table lock. So yes, they're incompetent, but sometimes queries that work well on one DB don't work well on others; in particular, queries that happily ran on Oracle killed SQL Server, simply because of the locking designs these two DBs implemented (OK, so much so that SQL Server now has row-level locking, but there could be others still in there causing problems).

Re:MySQL in-memory speed (0)

Anonymous Coward | more than 2 years ago | (#39067035)

therefore Facebook are useless losers.

Isn't this true anyway?

And if you want to use some script you wrote up as proof, then I'm with you 100%.

Re:MySQL in-memory speed (1)

xaxa (988988) | more than 2 years ago | (#39066021)

I've programmed PHP against both SQL Server and MySQL; the MySQL statements typically run in milliseconds, while it usually takes SQL Server two or three seconds to respond.

Can you give an example?

We're evaluating Postgres vs MS SQL at work, and at the moment haven't found any problem big enough to rule either out.

Re:MySQL in-memory speed (1)

bbbaldie (935205) | more than 2 years ago | (#39066143)

Sadly no, this was at a previous employer. I had a script that looked at the ARP tables of a core router and compared them to a database of known MAC addresses (in MySQL). Typically there would be around 1,000 MACs in the ARP table, compared against roughly 4,000 known MACs. The web page would load in about 1-2 seconds. OTOH, getting SQL Server's attention would take a couple of seconds or more, followed by the initial query (joining two or three tables, returning perhaps twenty records) adding 2-5 more seconds. Granted, it would run much quicker after that. Part of the problem was that the community had a much faster SQL Server module for PHP before Microsoft took it over; the Windows DLL was twice as big as it used to be, and performance took a hit. FYI, WAMP on the server end.
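A sketch of the kind of comparison described, with hypothetical arp_entries and known_macs tables; the anti-join lists MACs seen on the router but missing from the known list:

SELECT a.mac
FROM arp_entries a
LEFT JOIN known_macs k ON k.mac = a.mac
WHERE k.mac IS NULL;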

Re:MySQL in-memory speed (1)

codepunk (167897) | more than 2 years ago | (#39066981)

Problems other than cost, you mean.

The 70x is for (distributed) joins (5, Informative)

jorela (84188) | more than 2 years ago | (#39064693)

which previously were quite poorly handled.
See http://www.clusterdb.com/mysql-cluster/70x-faster-joins-with-aql-in-mysql-cluster-7-2/?utm_source=rss&utm_medium=rss&utm_campaign=70x-faster-joins-with-aql-in-mysql-cluster-7-2

33,500 rows? (1)

PhuFighter (1172899) | more than 2 years ago | (#39064765)

k. This is slightly laughable. 33,500 rows? In 87 seconds? That seems glacial. And 1.23 seconds being the new speed? That seems about as expected. Aside from comparing the speed of a Lada to a common garden slug, how does this compare against other databases?

Re:33,500 rows? (1)

hawguy (1600213) | more than 2 years ago | (#39064913)

k. This is slightly laughable. 33,500 rows? In 87 seconds? That seems glacial. And 1.23 seconds being the new speed? That seems about as expected.

Aside from comparing the speed of a Lada to a common garden slug, how does this compare against other databases?

It was actually only 2000 rows returned -- the source tables had 35K rows. But in fairness, this was for a join across 11 tables, and it's easy for a database to perform slowly at this task if it doesn't have a good query optimizer. A badly optimized query could end up doing thousands (or even millions) of full table scans of multi-thousand row tables.

Re:33,500 rows? (1)

PhuFighter (1172899) | more than 2 years ago | (#39065051)

25 columns at 4 bytes per column, that's 100 bytes. I'm not sure how much overhead, but hopefully not much. 100 bytes at 2,000 rows = 200,000 bytes. Across 32KB pages, that's 7 pages, allowing for 10% overhead. I assume it's 40 Gb/sec IB; how many microseconds would that take to transfer? As for the join, I guess if there were more details about the schema (I think there's a link to it buried somewhere), simple joins should be pretty quick. I agree that a poorly implemented optimizer producing a bad plan could result in a nested-loop join wasting a lot of cycles. What this article tells me is that this database is pretty much on par with a quality RDBMS from years ago, and nowhere near as fast as the NoSQL databases of today.

70x (5, Funny)

slasho81 (455509) | more than 2 years ago | (#39064835)

Developers: We've got some really good ideas for increasing performance of complex queries by...
Marketing: How much in the best conceivable case?
Developers: Oh, I dunno, maybe 70x.
Marketing: 70x? Is that good?
Developers: Yeah, I suppose, but the cool stuff is...
Marketing: Wow! 70x! That's a really big number!
Developers: Actually, please don't quote me on that. They'll make fun of me on Slashdot if you do. Promise me.
Marketing: We promise.
Developers: Thanks. Now, let me show you where the good stuff is...
Marketing (on phone): Larry? It's me. How big can you print me up a poster that says "70x"?

Re:70x (0)

Anonymous Coward | more than 2 years ago | (#39065749)

So Larry Ellison now works in the poster-printing department at Oracle?

Re:70x (1)

slasho81 (455509) | more than 2 years ago | (#39065945)

Larry is just a random sleazeball from Marketing. I didn't think of Ellison.

Re:70x (0)

Anonymous Coward | more than 2 years ago | (#39066103)

Larry is just a random sleazeball.

ftfy

Re:70x (1)

TrueSpeed (576528) | more than 2 years ago | (#39066831)

Larry is a sleazeball.

FTFY

Re:70x (1)

socceroos (1374367) | more than 2 years ago | (#39069565)

[quote]Larry is sleazeball[/quote]
ftfy

Re:70x (0)

Anonymous Coward | more than 2 years ago | (#39066845)

So Larry Ellison now works in the poster-printing department at Oracle?

Larry does indeed design a lot of the ads you see like the ones trashing HP. The really funny ones never make it outside Oracle HQ so be on the lookout if you ever get to visit.

In the fine print... (1)

TrueSpeed (576528) | more than 2 years ago | (#39066819)

*Performance will vary and best results are only possible with an Oracle service contract.

Oracle Linux Unbreakable (0)

Anonymous Coward | more than 2 years ago | (#39066935)

Where did I hear this before "Oracle Linux Unbreakable"? Hmm... Probably coming from the same mouth.

TPC-H (1)

Skinkie (815924) | more than 2 years ago | (#39067629)

So if Oracle is so happy with MySQL, give us a TPC-H [tpc.org] benchmark. Scale factor: 100.