Cassandra and Voldemort Benchmarked - Slashdot

Cassandra and Voldemort Benchmarked

Posted by timothy on Saturday May 08, 2010 @02:02PM from the rifling-the-file-cabinet dept.

kreide33 writes "Key/value storage systems are gaining in popularity, largely because of features such as easy scalability and automatic replication. However, there are several to choose from, and performance is an important deciding factor. This article compares the performance of two of the most well-known projects, Cassandra and Voldemort, using several different mixes of access types and measuring both throughput and latency."

This discussion has been archived. No new comments can be posted.

No Winner (Score:5, Informative)

by WrongSizeGlass ( 838941 ) writes: on Saturday May 08, 2010 @02:11PM (#32140228)

Their conclusion was that there was "no clear winner". Not surprising. Both of these products are in their early stages of development (Voldemort v0.80.1, Cassandra 0.6.0-beta3), and their developers will presumably turn to optimization and performance once the products are stable.

I'd like to have seen them run MySQL, PostgreSQL or SQLite through the same tests so we could see how these NoSQL solutions compared.

- - Re: (Score:2)
    
    by ThePhilips ( 752041 ) writes:
    
    > Digg is going to no-sql, for example. They released some of their mysql schema/code and it was poorly designed (bad indexing, manual joins, braindead queries). They chose to go with no-sql because they're clueless retards.
    
    Too lazy to rephrase, so here it is straight from RFC 1925:
    "Some things in life can never be fully appreciated nor understood unless experienced firsthand. Some things in networking can never be fully understood by someone who neither builds commercial networking equipment nor runs an operational network."
    In general I'm very cautious when criticizing production code. After all it works.
    - Re: (Score:2, Informative)
      
      by Anonymous Coward writes:
      
      I wouldn't have mentioned it if it wasn't pure shit. 1.5 seconds for a query that should be 3-4 disk blocks at most?
      
      digg's blog with their schema [digg.com]
      criticism [blogspot.com]
      more criticism [yafla.com]
      and some more criticism [yafla.com]
      - Re: (Score:2)
        
        by mini me ( 132455 ) writes:
        
        The guys at digg fully admit that they could spend their days tuning MySQL to achieve the performance they need. What is important to realize is that it costs real money and time to perform that tuning. Time and money that could be better spent improving the user experience of the website.
        Cassandra, on the other hand, performs optimally no matter what the developers throw at it, without the need to tune every last detail to squeeze every last bit of performance out of it. As the site grows, if the cluster i
    - Re: (Score:3, Informative)
      
      by Hognoxious ( 631665 ) writes:
      
      Production code works ... until it doesn't.
      I've seen a situation where half of the bug reports in our system were down to one badly conceived and shittily implemented module. But when I suggested binning it and doing it again properly, the answer was "but it works!".
    - Re: (Score:2)
      
      by Arancaytar ( 966377 ) writes:
      
      > After all it works.
      Since they're abandoning MySQL, apparently their schema didn't work so great...
  - Re: (Score:2)
    
    by DragonWriter ( 970822 ) writes:
    
    I don't know that comparing an RDBMS vs key-value storage is meaningful.
    
    Since they are alternative approaches to implementing a backend store for an information system, and the decision between key/value and relational technology is in many cases a bigger decision with greater risk involved in making the wrong choice than the decision between particular key/value or particular relational options (since the conversion between different systems using the same basic information model is cheaper than the conv
- Re: (Score:2)
  
  by mrmeval ( 662166 ) writes:
  
  I read the brief descriptions of each system, and if there is any text as cotton-mouthed, fuzzy, and unclear outside of legalese, I've not seen it.
- Re: (Score:2)
  
  by greg1104 ( 461138 ) writes:
  
  > I'd like to have seen them run MySQL, PostgreSQL or SQLite through the same tests so we could see how these NoSQL solutions compared.
  That wouldn't have made any sense given the replication scheme used: "N=3 (replicas for each entry), R=2 (nodes to wait for on each read), W=2 (nodes to block for on each write)". It's hard to translate that into the sort of replication features available in the other databases you mentioned.
  Also, these tests focused on individual put/get operations, where a standard database is going to get creamed no matter what. You'd need to include something that had a higher-level query component to it than that to
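
For readers unfamiliar with the N/R/W settings quoted above, here is a minimal sketch of what they imply. It is not Cassandra's or Voldemort's code; the function name `quorum_overlaps` is hypothetical, and the check is done by brute force over replica subsets. It only illustrates that with N=3, R=2, W=2 every read set shares at least one replica with every write set, which is the R + W > N condition.

```python
from itertools import combinations


def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """Return True if every set of r replicas intersects every set of w replicas."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


# The configuration quoted from the benchmark: N=3, R=2, W=2 (R + W > N).
print(quorum_overlaps(3, 2, 2))

# A weaker setting, N=3, R=1, W=1 (R + W <= N): a read can miss the written replica.
print(quorum_overlaps(3, 1, 1))
```

A single-node SQL database has no direct equivalent of these three knobs, which seems to be the point being made about the comparison.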
- Re: (Score:3, Informative)
  
  by inKubus ( 199753 ) writes:
  
  And what about memcached [memcached.org]? It's a simple key/value object database. What about an "associative array", isn't that basically a key/value database? I don't see what the hype is about.
Drat (Score:2)

by sys.stdout.write ( 1551563 ) writes:

Did anyone else read this as comparing Cassandra from King's Quest and Voldemort from Harry Potter?
- Re: (Score:1, Funny)
  
  by Anonymous Coward writes:
  
  You mean "He-Who-Must-Not-Be-Named"
  - - Re: (Score:2)
      
      by blair1q ( 305137 ) writes:
      
      http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=433x266988 [democratic...ground.com]
- Re:Drat (Score:5, Funny)
  
  by WWWWolf ( 2428 ) writes: <wwwwolf@iki.fi> on Saturday May 08, 2010 @02:27PM (#32140358) Homepage
  
  > Did anyone else read this as comparing Cassandra from King's Quest and Voldemort from Harry Potter?
  I was expecting something about Cassandra producing a bunch of warnings in log files that no one ever bothers to read, and Voldemort having various problems managing the child processes in the cluster (mostly being unable to kill or reap them).
  
  - Re: (Score:2)
    
    by Arancaytar ( 966377 ) writes:
    
    Wouldn't the problem be rather that Voldemort would keep killing child processes randomly?
  - Re:Drat (Score:4, Funny)
    
    by deniable ( 76198 ) writes: on Saturday May 08, 2010 @07:37PM (#32142624)
    
    I hear Voldemort has a really good replication strategy.
    
    - Re: (Score:1)
      
      by mysidia ( 191772 ) writes:
      
      Maybe so, but Cassandra is sexier, and Voldemort is just plain evil.
- Re: (Score:2)
  
  by Rennt ( 582550 ) writes:
  
  I always figured Cassandra was a reference to Red Dwarf.
- Re: (Score:1)
  
  by OeLeWaPpErKe ( 412765 ) writes:
  
  I kinda wonder why it's not possible to use these projects as backends for mysql and postgres. Seems to me that shouldn't be that hard an exercise.
  Or even having these as mountable volumes.
  - Re: (Score:2)
    
    by DragonWriter ( 970822 ) writes:
    
    > I kinda wonder why it's not possible to use these projects as backends for mysql and postgres.
    You could, but as soon as you try to implement the features of SQL that they lack on top of them, you'll end up making them perform far worse than existing backends that are designed from the ground up to provide these features, so what would be the point?
Key compression... (Score:2)

by shic ( 309152 ) writes:

Are there ANY open source key/value stores that support prefix compression?
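
To make the term concrete: prefix compression (also called front coding) stores each key in a sorted run as the number of leading characters it shares with the previous key plus the remaining suffix. The sketch below is a toy illustration only; it does not show how any particular store implements it, and the function names and sample keys are made up.

```python
import os


def compress_keys(sorted_keys):
    """Front-code sorted keys as (shared_prefix_length, suffix) pairs."""
    encoded = []
    previous = ""
    for key in sorted_keys:
        # os.path.commonprefix compares strings character by character.
        shared = len(os.path.commonprefix([previous, key]))
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def decompress_keys(encoded):
    """Rebuild the original keys from (shared_prefix_length, suffix) pairs."""
    keys = []
    previous = ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


# Hypothetical keys with a long common prefix.
keys = ["user:1000", "user:1001", "user:1002:name"]
encoded = compress_keys(keys)
print(encoded)
print(decompress_keys(encoded) == keys)
```

The saving depends on keys being stored in sorted order, so the technique fits ordered stores better than purely hash-based ones.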
Silly question (Score:4, Interesting)

by Hognoxious ( 631665 ) writes: on Saturday May 08, 2010 @03:28PM (#32140822) Homepage Journal

Is a key/value system a database with just one table that has one key field and one non-key field?
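
Taken literally, the question describes something like the sketch below: one table, one key column, one value column. This uses Python's built-in `sqlite3` module purely as an illustration; the table name `kv` and the helpers `put` and `get` are hypothetical, and it leaves out what the systems in the article add on top (partitioning across nodes and replication).

```python
import sqlite3

# An in-memory database with a single two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)")


def put(key, value):
    """Insert the value, or overwrite it if the key already exists."""
    conn.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))


def get(key):
    """Return the stored value, or None if the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


put("name", "Crash")
print(get("name"))
print(get("missing"))
```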

- Re: (Score:2)
  
  by CrashNBrn ( 1143981 ) writes:
  
  AFAIK it's akin to a Mapping/Hash (array), i.e.:
  mArray[name] := ({ "Crash" }) or
  mArray[stats] := ({ ({ "STR", 10 }), ({ "DEX", 12 }) })
  
  They could also be multi-tiered mappings:
  mPlayer[data][name], mPlayer[data][stats]
  
  DGD and LPmuds have done mapping/arrays for ~20 years. The underlying DGD core is C++ and the interpreted language is C-like. The underlying core of most other LPMuds is C, with a C-like interpreted language.
  
  Mappings and Compiled (Data) Objects were extremely useful in DGD. Named arrays with dec
- Re: (Score:2)
  
  by legoburner ( 702695 ) writes:
  
  At the simplest level yes, but Cassandra (for example) is more like a multi-dimensional hashmap, e.g. Key-Value where Value points to another Key-Value and so on, so you can reference values such as: SomeApp.Users[UserID][username]=bob. The advantage of this is being able to sort by time, alpha, etc., and therefore handle sorted pagination from the key/value listings. The main advantage though is that you can literally just plug in more systems and have it scale horizontally without any extra work, unlike d
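
The "multi-dimensional hashmap" picture can be sketched with nested Python dictionaries. This is an analogy only, not Cassandra's client API or storage format: the row key "42", the `email` column, and the helper `page_columns` are assumptions added for illustration.

```python
from collections import defaultdict

# Nesting: application -> collection -> row key -> column name -> value
store = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))
store["SomeApp"]["Users"]["42"]["username"] = "bob"
store["SomeApp"]["Users"]["42"]["email"] = "bob@example.com"


def page_columns(row, start="", limit=10):
    """Return up to `limit` (name, value) pairs with name >= start, sorted by name."""
    names = sorted(name for name in row if name >= start)
    return [(name, row[name]) for name in names[:limit]]


row = store["SomeApp"]["Users"]["42"]
print(page_columns(row, limit=1))    # first page, one column
print(page_columns(row, start="f"))  # resume from names sorting at or after "f"
```

Sorting on every call is wasteful; a real store would keep column names ordered on disk, which is what makes the sorted pagination mentioned above cheap.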
  - Re: (Score:2)
    
    by inKubus ( 199753 ) writes:
    
    That sounds like a tree. Like LDAP for instance, which has been doing this with extremely high performance, with replication, etc. for decades ;) These are all solved problems; new copies of the same thing come out every 10 years in a cycle, and all the new kids don't realize that it already existed, came to full maturity and was bought by IBM long ago. IBM has a product that will solve everyone's problems if you'd just call them. But the kids like to go it alone, as if the problem of indexing a few million w
    - Re: (Score:2)
      
      by Hurricane78 ( 562437 ) writes:
      
      It’s not a tree. It’s a graph. A tree is a graph’s retarded incomplete brother.
      I wish people would stop using trees and use full ontologies instead. Trees only create problems: in file systems, in OO class hierarchies, in categories and tags, etc.
- Re: (Score:2)
  
  by adamchou ( 993073 ) writes:
  
  When you say "database", I imagine you're referring to the traditional relational database. I've never used Cassandra or Voldemort but I have used memcachedb and tokyodb and the one major difference is that you can't select on ranges in a key/value system. You can't select all keys > 100 or keys 100 - 500, etc
  - Re:Silly question - Couchdb (Score:1)
    
    by MrTrick ( 673182 ) writes:
    
    Try couchdb if you want to select ranges.
    Its keys are stored in a B-tree, so selecting ranges of values is a core use case.
    The view system also uses the same mechanism, so by having a cached view you can emit any key you like per record, and grab individual or ranges of values.
    Nifty. :-)
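
The difference between a hash-style store and one that can select ranges comes down to whether keys are kept in order. Below is a toy sketch using Python's `bisect` module over a sorted list; it is not how CouchDB is implemented, and the class name `SortedStore` and the sample keys are hypothetical.

```python
import bisect


class SortedStore:
    """Toy key/value store that keeps keys ordered so range scans are possible."""

    def __init__(self):
        self._keys = []    # keys in sorted order
        self._values = {}  # key -> value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while preserving order
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def range(self, low, high):
        """Return (key, value) pairs with low <= key <= high, in key order."""
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return [(k, self._values[k]) for k in self._keys[start:end]]


store = SortedStore()
for key in (900, 100, 500, 50, 250):
    store.put(key, f"record-{key}")

print(store.range(100, 500))  # the "keys 100 - 500" query from the parent comment
```

With a plain hash table the same query would have to examine every key, since hashing discards ordering.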
Well, Cassandra does better. (Score:3, Funny)

by Chas ( 5144 ) writes: on Saturday May 08, 2010 @05:14PM (#32141560) Homepage Journal

Until Voldy pulls that whole Avada Kedavra thing...

- Re: (Score:2)
  
  by blair1q ( 305137 ) writes:
  
  fawkes.hogwarts.edu # su - voldemort
  Password:
  2010-05-08 16:08:45 have you hugged your death eater today? alias avada_kedavra
  kill -9
  2010-05-08 16:08:45 have you hugged your death eater today?
Oracle/MySQL - Voldemort (Score:2)

by caluml ( 551744 ) writes:

A very large company I know is moving from Oracle/MySQL to Voldemort for certain parts of their system. The two they evaluated were Cassandra and Voldemort.
File system? (Score:2)

by Bromskloss ( 750445 ) writes:

Key-value storage? That sounds like the ordinary file system to me.
- Re: (Score:2)
  
  by Xeriar ( 456730 ) writes:
  
  Not a particularly useful use of the inode table. The filesystem is great for a few hundred or even a few thousand records, but when you're dealing with billions of records, that adds up to a lot of wasted space.
Comparison Against Established Systems? (Score:2)

by RAMMS+EIN ( 578166 ) writes:

I am in a bit of a rush, so I can't netgrep for it myself right now, but I am curious how these new contenders stack up against more established key-value stores such as Berkeley DB and GDBM. Has anyone run the benchmarks?
