Are Relational Databases Obsolete?

kdawson posted about 7 years ago | from the long-in-the-tooth dept.

Databases 417

jpkunst sends us to Computerworld for a look at Michael Stonebraker's opinion that RDBMSs "should be considered legacy technology." Computerworld adds some background and analysis to Stonebraker's comments, which appear in a new blog, The Database Column. Stonebraker co-created the Ingres and Postgres technology while a researcher at UC Berkeley in the early 1970s. He predicts that "column stores will take over the [data] warehouse market over time, completely displacing row stores."

They're not mutually exclusive. (5, Insightful)

KingSkippus (799657) | about 7 years ago | (#20495819)

Okay, at the risk of sounding stupid...

Since when is a column store database and a relational database mutually exclusive concepts? I thought that both column store and row store (i.e. traditional) databases were just different means of storing data, and had nothing to do with whether a database was relational or not. I think the article misinterpreted what he said.

Also, I don't think it's news that Michael Stonebraker (a great name, by the way), co-founder and CEO of a company that (surprise!) happens to develop column store database software, thinks that column store databases are going to be the Next Big Thing. Right or wrong, his opinion can't exactly be considered unbiased...

Re:They're not mutually exclusive. (5, Interesting)

XenoPhage (242134) | about 7 years ago | (#20495911)

Since when is a column store database and a relational database mutually exclusive concepts? I thought that both column store and row store (i.e. traditional) databases were just different means of storing data, and had nothing to do with whether a database was relational or not. I think the article misinterpreted what he said.

Agreed. It definitely looks like a storage preference. Though column-based storage has definite benefits over row-based when it comes to store once, read many operations. Kinda like what you'd find in a data warehouse situation...

Also, I don't think it's news that Michael Stonebraker (a great name, by the way), co-founder and CEO of a company that (surprise!) happens to develop column store database software, thinks that column store databases are going to be the Next Big Thing. Right or wrong, his opinion can't exactly be considered unbiased...

Hrm.. You must be new here....

Yea, it's all the same. (5, Insightful)

SatanicPuppy (611928) | about 7 years ago | (#20496021)

Column stores are great (better than a row store) if you're just reading tons of data, but they're much more costly than a row store if you're writing tons of data.

Therefore, pick your method depending on your needs. Are you storing massive amounts of data? Column stores are probably not for you...Your application will run better on a row store, because writing to a row store is a simple matter of adding one more record to the file, whereas writing to a column store is often a matter of writing a record to many files...Obviously more costly.

On the other hand, are you dealing with a relatively static dataset, where you have far more reads than writes? Then a row store isn't the best bet, and you should try a column store. A query on a row store has to query entire rows, which means you'll often end up hitting fields you don't give a damn about while looking for the specific fields you want to return. With column stores, you can ignore any columns that aren't referenced in your query...Additionally, your data is homogeneous in a column store, so you lose overhead attached to having to deal with different datatypes and can choose the best data compression by field rather than by data block.
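For what it's worth, the trade-off can be sketched with a toy in-memory model in Python. This is only an illustration under loud assumptions: real engines work on disk pages and files, and the names `to_columns`, `insert_row_store`, `insert_column_store`, `sum_row_store` and `sum_column_store` are made up for this example, not taken from any product.

```python
# Toy model only: lists and dicts stand in for files on disk.
rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 25.5},
    {"id": 3, "region": "east", "amount": 4.5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)


def insert_row_store(store, record):
    """One append: the whole record lands in a single place."""
    store.append(record)
    return 1  # structures touched


def insert_column_store(store, record):
    """One append per field: every column has to be touched."""
    for field, value in record.items():
        store.setdefault(field, []).append(value)
    return len(record)  # structures touched


def sum_row_store(store, field):
    """Walks every record, dragging the unused fields along."""
    return sum(record[field] for record in store)


def sum_column_store(store, field):
    """Reads only the one column the query references."""
    return sum(store[field])
```

Inserting a three-field record touches one structure in the row layout and three in the column layout, while summing `amount` only ever looks at one list in the column layout.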

Why do people insist that one size really does fit all?

Re:Yea, it's all the same. (5, Interesting)

theGreater (596196) | about 7 years ago | (#20496125)

So it seems to me the -real- money is in integrating an RDBMS which, for usage purposes, is row-oriented; but which, for archival purposes, is column-oriented. This could either be a backup-type thing, or an aging-type thing. Quick, to the Pat(ent)mobile!

-theGreater

Re:Yea, it's all the same. (4, Interesting)

stoolpigeon (454276) | about 7 years ago | (#20496189)

Maybe, but I doubt it. The money is in the data warehouse market and the ETL tools that move the data from the OLTP environment to the warehouse environment. I think what the author points out is not that people are trying to use the same database to do both, but rather that they are trying to use the same product to do both. He says it would make more sense to use Oracle (for example) for OLTP - and something else for the warehouse, rather than trying to get Oracle to do both well.

Re:Yea, it's all the same. (5, Insightful)

KingSkippus (799657) | about 7 years ago | (#20496341)

Why do people insist that one size really does fit all?

I went back and read the original article. To Michael Stonebraker's credit, the ComputerWorld article (and the submitter) grossly misrepresents what he said.

He did not say that RDBMSes are "long in the tooth." He said that the technology underlying them hasn't changed since the 1970s, and that column stores are a better way to represent data in certain situations. In fact, the very name of his original column was "One Size Fits All - A Concept Whose Time Has Come and Gone"

Re:Yea, it's all the same. (1)

jhantin (252660) | about 7 years ago | (#20496557)

A query on a row store has to query entire rows, which means you'll often end up hitting fields you don't give a damn about while looking for the specific fields you want to return.

So build a covering index [google.com]. What they seem to be driving at is aggregation performance, though, not just raw read performance; for that you need to build a materialized view, and not all RDBMSs support those conveniently, so you end up building a horrid mess of triggers or just rebuilding your proto-aggregate table out of a cron job. Column stores have an easier time of aggregation (just scan the column and fold) unless you are doing something that really is row-oriented, like SUM(A.x + B.y * B.n) over some huge wretched join with conditions that have absolutely nothing to do with what you're aggregating.
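A quick way to see a covering index at work, assuming SQLite via Python's standard `sqlite3` module (the `orders` table, its columns and the index name are invented for this sketch, and the plan wording is SQLite's own and may differ between versions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL, notes TEXT);
    CREATE INDEX orders_region_amount ON orders (region, amount);
""")


def plan(sql):
    """Return the detail strings SQLite reports for EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def uses_covering_index(details):
    """True if any plan step is answered from the index alone."""
    return any("COVERING INDEX" in detail for detail in details)


# Both referenced columns live in the index, so the table rows are never read.
covered = plan("SELECT amount FROM orders WHERE region = 'east'")

# notes is not in the index, so each match still costs a trip to the full row.
not_covered = plan("SELECT notes FROM orders WHERE region = 'east'")
```

The first query can be served entirely from the index, which is roughly the row-store answer to "don't touch fields you don't need"; the second still has to visit the rows.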

Probably a better policy is to have a DBMS that supports a variety of table store formats. MySQL wins that point, and would probably run away with it if it supported a column store format. Second best is to have a variety of index store formats, like Postgres. Or you can always have just one trick but optimize it to the hilt.

Perl Objects have both column and row DB advantage (4, Interesting)

goombah99 (560566) | about 7 years ago | (#20496617)

Traditionally perl-objects are hashes with one blessed hash per instance. The hash contains all the instance variable values using their names as keys.

Instead, one can use blessed scalars holding a single integer value for instances and let the class variable contain all the instance data in arrays indexed by the instance's scalar value.

This technique was originally promoted as an indirection to protect object data from direct manipulation that bypassed get/set methods. But it also allows the object to be either row- or column-oriented internally. That is, the class could store all the instance hashes in an array indexed by the scalar, or it could store each instance variable in a separate array that is indexed by the scalar value.

Thus the perl class can, on-the-fly, switch itself from column-oriented to row-oriented as needed while maintaining the same external interface.

Of course this is not a perl-exclusive feature and it can be implemented in other languages. It just happens to be particularly easy and natural to do in perl.
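As a rough illustration of the same idea outside perl, here is a minimal Python sketch. Everything in it (the `Point` class, `use_columns`, `use_rows`, the `get`/`set` accessors) is hypothetical and only meant to show the layout switch behind a fixed interface; it is not a translation of any particular perl module.

```python
class Point:
    """Instances hold only an integer id; the class holds all the data."""

    FIELDS = ("x", "y")
    layout = "row"
    _rows = []                   # row layout: one dict per instance
    _cols = {"x": [], "y": []}   # column layout: one list per field

    def __init__(self, x, y):
        values = {"x": x, "y": y}
        if Point.layout == "row":
            self._id = len(Point._rows)
            Point._rows.append(values)
        else:
            self._id = len(Point._cols["x"])
            for name in Point.FIELDS:
                Point._cols[name].append(values[name])

    def get(self, name):
        if Point.layout == "row":
            return Point._rows[self._id][name]
        return Point._cols[name][self._id]

    def set(self, name, value):
        if Point.layout == "row":
            Point._rows[self._id][name] = value
        else:
            Point._cols[name][self._id] = value

    @classmethod
    def use_columns(cls):
        """Re-pack every instance into one list per field."""
        if cls.layout == "column":
            return
        cls._cols = {name: [row[name] for row in cls._rows] for name in cls.FIELDS}
        cls._rows = []
        cls.layout = "column"

    @classmethod
    def use_rows(cls):
        """Re-pack every instance into one dict per instance."""
        if cls.layout == "row":
            return
        count = len(cls._cols[cls.FIELDS[0]])
        cls._rows = [
            {name: cls._cols[name][i] for name in cls.FIELDS} for i in range(count)
        ]
        cls._cols = {name: [] for name in cls.FIELDS}
        cls.layout = "row"
```

Callers only ever use `get` and `set`, so existing instances keep working across a call to `Point.use_columns()` or `Point.use_rows()`.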

Re:They're not mutually exclusive. (1)

theGreater (596196) | about 7 years ago | (#20495925)

Exactly what I was thinking. There is nothing (to my knowledge) in the Relational Model which specifies row vs. column store....

-theGreater.

Re:They're not mutually exclusive. (4, Interesting)

stoolpigeon (454276) | about 7 years ago | (#20495947)

You are exactly right and this is backed up by the home page for c-store [mit.edu] . It says: "C-Store is a read-optimized relational DBMS " - c-store is the open source project that apparently is the basis for Vertica - Stonebraker's commercial offering.

Re:They're not mutually exclusive. (1)

bobcat7677 (561727) | about 7 years ago | (#20496505)

Yeah, I looked into Vertica. Was pretty excited at first as my company does a lot of data warehouse and data mart stuff. The potential performance gains were significant enough to start looking at converting a lot of our reporting, etc. to it. So I gave them a call and started asking some questions related to the key usage one would expect to make of a "warehouse" or datamart type of database.

1st Q: "Can you run MDX queries against the Vertica DB?" A: "No, we don't support MDX queries."

The conversation went downhill from there. At the time you could only get the DB as a limited beta and looking at their website it doesn't appear that has changed. Basically the DB is nowhere near ready for prime time. Even when it is ready, it's not going to make traditional RDBMS obsolete, just create a niche for itself in the warehouse/datamart corner. And I wouldn't be surprised if Microsoft came out with a new version of SQL Server in short order (as well as others) that allowed you to create column oriented DB objects which would make maintaining two different database server technologies for the different purposes somewhat silly.

Re:They're not mutually exclusive. (1)

Denis Troller (1002792) | about 7 years ago | (#20495955)

Completely true.

The article itself even points to http://en.wikipedia.org/wiki/Column-oriented_DBMS [wikipedia.org] that states that Row storage and Column Storage are two technologies for RDBMS.

Re:They're not mutually exclusive. (5, Funny)

Anonymous Coward | about 7 years ago | (#20495971)

Well I just turned my server on its side and now all my tables are storing in columns. I love new technology.

Re:They're not mutually exclusive. (0)

reddburn (1109121) | about 7 years ago | (#20495999)

Right or wrong, his opinion can't exactly be considered unbiased...
Congratulations! You figured out what an opinion is.

Re:They're not mutually exclusive. (5, Insightful)

OECD (639690) | about 7 years ago | (#20496185)

Congratulations! You figured out what an opinion is.

An opinion is subjective, but it's not necessarily biased. A disinterested observer could have an unbiased opinion.

Re:They're not mutually exclusive. (-1, Offtopic)

Anonymous Coward | about 7 years ago | (#20496063)

True, but you should keep in mind that bratac always tries to sacrifice himself even when there is no need...he thinks hes a hero, he wants to kill himself even when the battle is over. its pathetic...and he thinks hes some great leader, he knows nothing, hes almost as pathetic as tealc. If he wants to sacrifice himself, he should throw himself in front of his true God, HAMMOND.

Re:They're not mutually exclusive. (4, Interesting)

homb (82455) | about 7 years ago | (#20496265)

I wish we could put this thing to rest once and for all. And I wish so-called "experts" in the field actually were.

Rule of thumb:
- you use row dbs for OLTP. They're great for writing.
- you use column dbs for data mining. They're amazing for reading aggregates (average, max, complex queries...)

The major problem with column dbs is the writing part. If you have to write one row at a time, you're screwed because it needs to take each column, read, insert into it and store. If you can write in batch, the whole process isn't much more expensive. So writing a single row could take 500ms, but writing 1000 rows will take 600ms.
Once the data's in, column dbs are the way to go.
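To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the cost is simply a fixed overhead plus a per-row cost, fitted to the two illustrative figures above (500ms for 1 row, 600ms for 1000 rows); the linear model and the function names are assumptions for illustration, not measurements of any real column db.

```python
def fit_linear_cost(n1, t1, n2, t2):
    """Solve t = fixed + per_row * n through two (rows, milliseconds) points."""
    per_row = (t2 - t1) / (n2 - n1)
    fixed = t1 - per_row * n1
    return fixed, per_row


# Fitted to the illustrative figures in the comment, not to real measurements.
fixed_ms, per_row_ms = fit_linear_cost(1, 500.0, 1000, 600.0)


def batch_cost_ms(n_rows):
    """Cost of writing n_rows in one batch under the linear model."""
    return fixed_ms + per_row_ms * n_rows


def one_at_a_time_cost_ms(n_rows):
    """Cost of writing n_rows as n_rows separate single-row writes."""
    return n_rows * batch_cost_ms(1)
```

Under that model nearly all of the 500ms is fixed overhead, so 1000 single-row writes cost hundreds of times more than one 1000-row batch.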

Yeah, but look at all the publicity (0, Flamebait)

Colin Smith (2679) | about 7 years ago | (#20496283)

Vertica's getting a load of it free.

Hey, you never know, some good may come of it. Maybe some of the "Nobody ever got fired for buying Oracle/SQL Server/DB2" people will have to explain why they're using inappropriate technology.

 

Re:They're not mutually exclusive. (1)

Lars T. (470328) | about 7 years ago | (#20496525)

Okay, at the risk of sounding stupid...

Since when is a column store database and a relational database mutually exclusive concepts?

They aren't. The original blog is about Row-oriented DBMS vs. Column-oriented DBMS, and the author of the article (or his know-it-all-better editor) confused himself enough to believe somebody abbreviated that as RDBMS, which of course means Relational DBMS. The submitter probably not reading the Wikipedia article he linked to didn't help either.

C'mon, the guy is biased! (4, Funny)

winkydink (650484) | about 7 years ago | (#20495821)

The name of his blog is The Database Column after all.

Re:C'mon, the guy is biased! (4, Funny)

everphilski (877346) | about 7 years ago | (#20495901)

database row would sound too much like prison :P

Mod Article -1 (Author doesn't get it) (5, Informative)

DrinkDr.Pepper (620053) | about 7 years ago | (#20495879)

Relational databases aren't being obsoleted. Some schema design heuristics are.

Re:Mod Article -1 (Author doesn't get it) (1)

dami99 (1014687) | about 7 years ago | (#20495917)

Yep.

Stupid article; skip straight to the blog to see what they are really talking about.

It's not about relational databases being obsolete at all.

Re:Mod Article -1 (Author doesn't get it) (1)

ben there... (946946) | about 7 years ago | (#20496413)

And the important part left out with all the sensationalism: the API wouldn't change. It would still be SQL. It would still be an RDBMS. It would essentially just have a different storage engine. Products like MySQL seem to get along just fine supporting multiple options for storage engine. I don't see why an additional option to improve performance in certain cases would obsolete anything.

dual-mode db? (4, Interesting)

192939495969798999 (58312) | about 7 years ago | (#20495889)

Is there a dual-mode db, that lets you create a row-based or column-based "table"? I imagine cross-mode queries would kill performance, but at least you could have a system front-loaded with row tables, where data comes in, and then archive this data over time into the column-based tables, so that reads were fast.

Re:dual-mode db? (2, Informative)

XenoPhage (242134) | about 7 years ago | (#20495951)

I believe you can build a storage engine in MySQL that deals with column-based storage. I'm not sure if it's been done yet, but I don't see why it couldn't be.

Re:dual-mode db? (1)

stoolpigeon (454276) | about 7 years ago | (#20496141)

Nice - in other words - no, but you could write one on (open source rdbms of your choice)

Re:dual-mode db? (2, Informative)

tinkertim (918832) | about 7 years ago | (#20496315)

I believe you can build a storage engine in MySQL that deals with column-based storage. I'm not sure if it's been done yet, but I don't see why it couldn't be.

The FA threw me for a loop a couple of times, I honestly _did_ try to read it :) Correct me if I'm incorrect, but wouldn't having a service for column stores be (usually) not needed on most Unix-like platforms? Since this is mostly reading, I would think such efforts might be better spent on sqlite (or similar)?

If you're in a situation where you're mostly reading with (likely) only one infrequent writer, wouldn't eliminating the overhead of a database service entirely be desirable?

I can't think of a situation where you would want many frequent writers to a column store schema, again, correct me if I'm off.

Re:dual-mode db? (1)

jellomizer (103300) | about 7 years ago | (#20495969)

It can be done with MS SQL and others: you build a cursor that creates a SQL call, and you execute the SQL call, mixing CASE statements and aggregate functions, and you are all set. Is it easy? No, not really. Is it supported? No. But it can be done, and I have done it.

Re:dual-mode db? (2, Funny)

stoolpigeon (454276) | about 7 years ago | (#20496047)

You can get MS SQL Server to store tables differently than the default? It will write columns to disk as opposed to rows? You can store columns in their own files? It's been a few years since I worked with SQL Server - but I really don't remember those features. Is it a SQL Server 2005 thing?

Re:dual-mode db? (1)

jellomizer (103300) | about 7 years ago | (#20496245)

Yes, in Windows 2000, if you don't want to look behind the curtains.
No, but you can make views and stored procedures that do the trick; that makes it look like it, aids in programming, and can save time.

Re:dual-mode db? (1)

Vancorps (746090) | about 7 years ago | (#20496319)

Actually yes, it is a SQL 2005 thing, although there was a way to do it in SQL 2000 called Data cubes, from my understanding. You end up having multiple data files just like you would in an Oracle situation. It's easier to explain in Oracle terms as you'd just create a tablespace for column based tables and a tablespace for row-based tables. Then away you go, both storing files as you see fit.

I guess in SQL 2005 terms you'd be creating another database on the same server and just use server linking to get your data from one place to another. Either through an SSIS package or a SQL Job.

Re:dual-mode db? (0)

Anonymous Coward | about 7 years ago | (#20495989)

Do you mean like using replication... row-based as the master and column-based on the slaves?

well (5, Informative)

stoolpigeon (454276) | about 7 years ago | (#20495891)

every article linked makes it clear that this is about warehousing as opposed to oltp. so is the technology dead? no - can it do everything? no

Re:well (1)

lucabrasi999 (585141) | about 7 years ago | (#20496009)

Wait a minute!!! Are you suggesting that the submitter actually reads the article before submitting it? Blasphemer!

Re:well (1)

jimstapleton (999106) | about 7 years ago | (#20496105)

You mean... My row based MySQL tables can't wash my dishes for me?

Re:well (1)

stoolpigeon (454276) | about 7 years ago | (#20496225)

From what little exposure I've had to MySQL - it can't do much of anything for you or anyone else. I think if you move to PostgreSQL you will find that it will do your dishes for you and make you a better person.

Re:well (1)

jimstapleton (999106) | about 7 years ago | (#20496345)

down religious db fanatic! down!

both have their pros and cons, now stop trying to take a stupid joke and make it into your personal database soapbox

Re:well (1)

TechyImmigrant (175943) | about 7 years ago | (#20496387)

>From what little exposure I've had to MySQL - it can't do much of anything for you or anyone else. I think if you move to PostgreSQL you will find that it will do your dishes for you and make you a better person.

Well it stores my data and meets my performance requirements. Is there something else I need it to do, given that I already own a dish washing machine.

Re:well (1)

Scaba (183684) | about 7 years ago | (#20496471)

Is there something else I need it to do, given that I already own a dish washing machine.

Yes. You need to take your dish washing machine out to dinner occasionally, and buy her flowers. And tell her she looks beautiful.

Re:well (0)

Anonymous Coward | about 7 years ago | (#20496513)

Obviously you need it to take the more difficult steps of putting the dishes into the washer, and putting them away when clean.

Rotate (5, Funny)

Kozar_The_Malignant (738483) | about 7 years ago | (#20495909)

>"column stores will take over the [data] warehouse market over time, completely displacing row stores."

Hmmmm. So if I rotate my Paradox or Excel table by 90 degrees, I have achieved database coolness? Who knew it was so easy.

Re:Rotate (0)

MightyMartian (840721) | about 7 years ago | (#20495973)

Just wait until next week, when they'll be stored diagonally, or the week after, when the individual bytes are chopped, reordered, and then sent to /dev/ram.

Re:Rotate (1)

jellomizer (103300) | about 7 years ago | (#20496055)

Excel only handles 255 Columns. It is annoying. It is stupid and antiquated. But no one is asking "Is Excel obsolete?"... I guess it is because we already know the answer.

Re:Rotate (3, Informative)

Hijacked Public (999535) | about 7 years ago | (#20496159)

The most recent release (2007) will handle 2^14 columns.

Re:Rotate (5, Insightful)

ben there... (946946) | about 7 years ago | (#20496473)

Excel only handles 255 Columns.
It should be noted that if you've designed a database (rather than an Excel abomination) with more than 255 columns, chances are, you're doing it wrong.

Re:Rotate (2, Funny)

kpainter (901021) | about 7 years ago | (#20496477)

Hmmmm. So if I rotate my Paradox or Excel table by 90 degrees, I have achieved database coolness? Who knew it was so easy.
I tried that but my neck gets tired after a few minutes.

IDG *owns* Slashdot These Days... (-1, Offtopic)

Frosty Piss (770223) | about 7 years ago | (#20495937)

Looks like IDG (ComputerWorld) is really hitting Slashdot HARD, either that or they have a deal with Slashdot. Here's a partial list of the shills that regularly show up and have almost 100% article acceptance rates: inkslinger77, narramissic, jcatcw. Looks like they spread out the work over a few shill user accounts, which is to be expected. If it's all OK and everything with the corporate ownership of Slashdot to be played by IDG, I suppose that's their business, but one would hope that they are actually getting PAID for being part of IDG's advertising program. And of course there should be disclosure so that visitors to Slashdot realize they are reading advertisements and not an article submitted by a "real" user...

Re:IDG *owns* Slashdot These Days... (0)

kevin_conaway (585204) | about 7 years ago | (#20495995)

And of course there should be disclosure so that visitors to Slashdot realize they are reading advertisements and not an article submitted by a "real" user...

Meh, content is content. As long as the linked "article" is informative and sparks discussion, I'm happy.

slightly OT (-1, Offtopic)

Deadplant (212273) | about 7 years ago | (#20495961)

Can I use this article as an excuse to rant about how much I hate working with SQL?
I'm working in python mostly and when I have to touch SQL it is like a... a... something really nasty.
Can someone please declare SQL legacy technology? pretty please?

Re:slightly OT (1)

MemoryDragon (544441) | about 7 years ago | (#20496041)

The main problem is SQL is just a description language for set data, and a relational database is exactly that: set data.
The other problem is that so far nobody has really brought out something more readable to deal with sets in a mathematical sense; you could use mathematical operators, but then things would become even less readable than SQL is.

All approaches on the programming side I have seen (criteria objects etc...) make things only easier in some domains; after that you revert to plain SQL and its derivatives.
Eliminating SQL would mean you probably have to eliminate the data storage model of sets as well.

Re:slightly OT (1)

plague3106 (71849) | about 7 years ago | (#20496087)

Would you like it better with IronPython and Linq?

Re:slightly OT (1)

lucabrasi999 (585141) | about 7 years ago | (#20496107)

Can someone please declare SQL legacy technology? pretty please?

TRUNCATE TABLE SQL_LANGUAGE;

COMMIT;

There, feel better?

Re:slightly OT (0)

Anonymous Coward | about 7 years ago | (#20496339)

Staying OT with you, I do not agree with your opinion on SQL in general, although using it for basic CRUD in an object-oriented application is tedious and bug-prone. For that, I warmly recommend the up-and-coming Python-based ORM called SQLAlchemy. See: http://www.sqlalchemy.org/ [sqlalchemy.org] .

Re:slightly OT (1)

Just Some Guy (3352) | about 7 years ago | (#20496529)

For that, I warmly recommend the up-and-coming Python-based ORM called SQLAlchemy.

Exactly. If you're using Python, you're not allowed to complain about SQL because there are good alternatives. Besides Alchemy, Django has a very nice object mapper [djangoproject.com] of its own. Both of those have progressed to the point that writing raw SQL is simply unnecessary for almost any application development.

The guy... (5, Interesting)

AKAImBatman (238306) | about 7 years ago | (#20495963)

...is duping [slashdot.org] himself [slashdot.org] and thus Slashdot is duping the stories by extension.

Stonebraker has been pushing the concept of column-oriented databases for quite some time now, trying to get someone, ANYONE, to listen that it's superior. While I think he has a point, I'm not sure if he really goes far enough. Our relational databases of today are heavily based on the ISAM files of yesteryear. Far too many products threw foreign keys on top of a collection of ISAMs and called it a day. Which is why we STILL have key integrity issues to this day.

It would be nice if we could take a step back and re-engineer our databases with more modern technology in mind. e.g. Instead of passing around abstract id numbers, it would be nice if we had reference objects that abstracted programmers away from the temptation of manually managing identifiers. Data storage is another area that can be improved, with Object Databases (really just fancy relational databases with their own access methods) showing how it's possible to store something more complex than integers and varchars.

The demands on our DBMSes are only going to grow. So there's something to be said for going back and reengineering things. If column-oriented databases are the answer, my opinion is that they're only PART of the answer. Take the redesign to its logical conclusion. Let's see databases that truly store any data, and enforce the integrity of their sets.

Re:The guy... (0)

Anonymous Coward | about 7 years ago | (#20496389)

If you mean far too many products use a crappy MySQL backend, I'll agree.

Personally I use real relational databases, made by folks who never said stupid things like "you don't need transactions", and I don't have any key integrity issues.

Re:The guy... (1)

Otter (3800) | about 7 years ago | (#20496423)

...duping himself and thus Slashdot is duping the stories by extension.

I read the blurb and thought "Haven't we had the same 'debate' over the same guy a bunch of times before?" The name stuck in my head as I always envision the former Notre Dame linebacker [cstv.com] and his famously low GPA turning to a career in database architecture.

Re:The guy... (1)

oh_my_080980980 (773867) | about 7 years ago | (#20496425)


No, there's something to be said for EDUCATION. The fact that you think a Relational Database can only store integers and varchars screams how ignorant you are. I would suggest reading, "An Introduction to Database Systems" by C.J. Date. Then you will be informed enough to make a comment about the state of affairs in Relational Databases.

Re:The guy... (1)

AKAImBatman (238306) | about 7 years ago | (#20496609)

The fact that you think a Relational Database can only store integers and varchars screams how ignorant you are.

Glad you're paying attention. Not. :-/

The relational model can store just about anything you want. It's just math based on sets. But that's not what I'm talking about. What I'm talking about is that today's *batch* of DBMSes are terrible at storing data that is not an integer or varchar.

In addition, today's DBMSes have no protection against bad data that isn't explicitly engineered into the data model. This provides an opportunity for human error to creep in. Thus you see issues with keys that aren't referenced/restricted, incorrect keys getting inserted because the ID exists in the referenced table, nulls that shouldn't be allowed, operations that aren't enclosed in transactions, multiple rows for the same secondary key when it should be unique, data accesses that shouldn't be allowed, etc. There are solutions for nearly all of these in your average DBMS product, but they all require that the Database Administrator play dictator to enforce.

A better solution would be to take the same tack that computer languages and compilers have been taking: Make a mistake impossible as early in the process as possible. Just as modern compilers can catch a variety of errors before the code is even executed, it would be better if our databases offered more up-front protection and solutions for these problems. As a bonus, the database would "know" more about the data and thus be able to plot better data storage and retrieval schemes automatically.
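To show what "explicitly engineered into the data model" looks like in practice, here is a small sketch using SQLite through Python's standard `sqlite3` module. The `customer` and `invoice` tables and the `rejected` helper are invented for the example. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which rather illustrates the opt-in nature of these protections.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE invoice (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id) ON DELETE RESTRICT,
        amount      REAL NOT NULL CHECK (amount >= 0)
    );
""")
conn.execute("INSERT INTO customer (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO invoice (customer_id, amount) VALUES (1, 9.5)")


def rejected(sql):
    """Return True if the database itself refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

With the constraints declared, `rejected("INSERT INTO invoice (customer_id, amount) VALUES (999, 1.0)")` comes back True because customer 999 doesn't exist; leave the declarations out and the same bad row goes in without complaint.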

Re:The guy... (0)

Anonymous Coward | about 7 years ago | (#20496595)

It is an interesting idea however it has basically a 180 degree problem on what is wrong with it vs 'traditional' dbs.

When writing it is easy to 'add new' for traditional. But for column wise you need at least as many disk seeks as there are columns, instead of just blasting the whole row to 1 sector. You need N writes vs 1.

For reading with traditional it is the opposite. To read one column you need a seek for every row in the DB, even though you only want one value out of each.

For data that changes a lot (either on the data element or added) column wise would be crap performance wise. For data that is fairly 'static' and unchanging such as legacy data it would smoke.

Or in other words column wise read would be fast but write slow. I could see this being an option within DBs in the future. As it is just a data layout problem not a language problem.

He may have a point now that I sit and think about it.

Re:The guy... (1)

nuzak (959558) | about 7 years ago | (#20496619)

Stonebraker has been pushing the concept of column-oriented databases for quite some time now, trying to get someone, ANYONE, to listen that it's superior.

Oh, more than just Stonebraker: column-oriented databases have been getting pushed going on at least a decade, probably two. It comes up every few years in breathless statements like "row storage is obsolete legacy technology". Never mind that most OLTP demands are a bit more of a hybrid thing, and that vertical partitioning usually does the trick pretty well.

As far as ISAM goes, I think the only thing still using it is MySQL, and I'm pretty sure it's variable-length, so it's closer to VSAM. I guess you could count Paradox and all the other xBase stuff too as still using it. But legacy is legacy, it's hardly an indictment of today's technology.

Concerning "abstract ID numbers", that's probably surrogate keys you're talking about -- which are technically a design flaw (or perhaps a "design smell" to crib an XP term), but they're kind of unavoidable for efficiency (I sure as hell don't want to sling around whole arrays of street/city/state/post in my CRM app). Any proper ORM or even good SQL should keep you from ever having to reference id columns except perhaps in a join (in case you're too cowardly to use NATURAL JOIN, which is actually pretty prudent given the surprises it can hit you with). PostgreSQL tried to get rid of these "rowid keys" with the OID column, but this didn't turn out well, and OIDs are heavily deprecated in Postgres now. OODBs and "document DBs" like CouchDB usually do a better job at hiding ID columns, but until those address their orthogonality problems, RDBMSs are here to stay, warts and all.
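One of the NATURAL JOIN surprises, sketched with SQLite via Python's standard `sqlite3` module (the `city` and `person` tables are made up for this example): NATURAL JOIN matches on every column name the two tables share, so an innocent second column called `name` quietly becomes part of the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (city_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, city_id INTEGER);
    INSERT INTO city VALUES (1, 'Boston'), (2, 'Austin');
    INSERT INTO person VALUES (10, 'Alice', 1), (11, 'Austin', 2);
""")

# Joins on BOTH shared columns (city_id and name), which is rarely the intent.
natural = conn.execute(
    "SELECT person_id FROM person NATURAL JOIN city ORDER BY person_id"
).fetchall()

# Spelling out the join column gives the result most people expect.
explicit = conn.execute(
    "SELECT person_id FROM person JOIN city USING (city_id) ORDER BY person_id"
).fetchall()
```

Here `explicit` holds both people, while `natural` keeps only the person whose own name happens to equal the city's name.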

Different Solutions (1)

jythie (914043) | about 7 years ago | (#20495975)

*headdesk* heaven forbid that solutions might be different depending on the situation and the data....

Nope! Apparently there is a new method and thus it must be the real One Twue Way.

IMS--Hierarchical DB Still Exists (5, Insightful)

curmudgeon99 (1040054) | about 7 years ago | (#20495979)

You've all heard of the IBM product called DB2, right? So what was DB1? Answer: IMS, which is a hierarchical database. They were a pain in the ass to use--PSBs and all--but they were/are faster than hell and I doubt any company is going to throw them out for any reason. Same goes for relational databases. They're going nowhere. Sure, we have room for more but nobody is going to displace the RDBMS anytime soon.

Re : Are Relational Databases Obsolete? (5, Funny)

littlefoo (704485) | about 7 years ago | (#20495981)

No. There, that was easy !

It's like the packet of crisps that says "Is there a 20 pound note in here !!?" - the answer should always be 'No'.

Except maybe for one person.

sed -e 's/crisps/potato chips/' -e 's/pound/dollar/'

Re:Re : Are Relational Databases Obsolete? (2, Funny)

ahem (174666) | about 7 years ago | (#20496275)

Your sed is missing two expressions for full i18n:

sed -e 's/crisps/potato chips/' -e 's/pound/dollar/' -e 's/note/bill/' -e 's/packet/bag/'

Re:Re : Are Relational Databases Obsolete? (0)

Anonymous Coward | about 7 years ago | (#20496281)

sed -e 's/pound/pound note/' -e 's/dollar/dollar bill/'
To be applied to your sedscript. That's right, I sed'd your sed.

sed -e 's/packet/bag/' (0)

Anonymous Coward | about 7 years ago | (#20496323)

There, finished that off for ya. :)

that doesn't mean they're going to become obsolete (5, Insightful)

Arathon (1002016) | about 7 years ago | (#20495991)

Obviously, he's biased. But more importantly, he just said that column-store databases are going to take over the WAREHOUSE market. That doesn't mean that row-store databases are going to become obsolete, because there will always be applications out there that do a substantial amount of writing as well as reading.

In fact, the new wave of user-generated-content websites and webapps seems to me to indicate the exact opposite - if anything, row-store databases, with their usefulness in write-heavy applications, should be becoming more and more necessary/useful on the web.

So...chalk this one up to some grandstanding on the part of a guy who wants to put more money in his pockets...

Marketing hype by FUD.. typical (2, Informative)

cwford (848987) | about 7 years ago | (#20495997)

From TFA:

"Column-oriented databases -- such as the one built by Stonebraker's latest start-up, Andover, Mass.-based Vertica Systems Inc. -- store data vertically in table columns rather than in successive rows. "

Marketing hype for his startup.

What a sleazeball.

Re:Marketing hype by FUD.. typical (1)

thatskinnyguy (1129515) | about 7 years ago | (#20496267)

Would you expect anything better out of Computerworld?

Re:Marketing hype by FUD.. typical (0)

Anonymous Coward | about 7 years ago | (#20496537)

Yeah, but this is the cost of enjoying Slashdot.

Consider that it is a dynamic site that handles traffic that routinely knocks over many static ones. It isn't cheap to maintain that kind of capacity, so I doubt advertising revenue pays all the bills. Subscribers are few and far between.

Without Slashvertisements I don't think Slashdot would still be here in its current form.

No. (1)

Just Some Guy (3352) | about 7 years ago | (#20496017)

Relational databases will be around as long as humans generate relational data. Take the classic example of an invoice that may have many entries, each entry referencing an inventory item. This sort of thing is likely to exist forever, and RDBMSes model that pretty well.
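
A minimal sketch of that invoice example, using Python's built-in sqlite3 module. The schema, table names and sample values are assumptions made up for illustration; the point is only that entries reference inventory items and a join puts the invoice back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        unit_price REAL NOT NULL
    );
    CREATE TABLE invoice (
        id       INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    -- Each entry belongs to one invoice and references one inventory item.
    CREATE TABLE invoice_entry (
        invoice_id INTEGER NOT NULL REFERENCES invoice(id),
        item_id    INTEGER NOT NULL REFERENCES item(id),
        quantity   INTEGER NOT NULL
    );
    INSERT INTO item VALUES (1, 'widget', 2.50), (2, 'gadget', 10.00);
    INSERT INTO invoice VALUES (100, 'Example Co');
    INSERT INTO invoice_entry VALUES (100, 1, 4), (100, 2, 1);
""")

def invoice_total(conn, invoice_id):
    """Sum quantity * unit_price over all entries of one invoice."""
    row = conn.execute(
        "SELECT SUM(e.quantity * i.unit_price) "
        "FROM invoice_entry e JOIN item i ON i.id = e.item_id "
        "WHERE e.invoice_id = ?",
        (invoice_id,),
    ).fetchone()
    return row[0]
```

Item prices live in exactly one place, so changing the price of an item never means touching every invoice entry that mentions it.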

As far as whether the backend is row- or column-oriented - who cares? As long as I can use the one most appropriate to my access pattern, the implementation details just don't interest me enough to get worked into a furor. Don't get me wrong - I think that there are some neat developments in the works - it's just that I don't have a strong general preference on how my information is physically laid down on the platters.

A more interesting question for me is whether SQL is obsolete. For the most part, I'd say that it is in the sense that most people need never use it directly. We use SQLAlchemy [sqlalchemy.org] instead of writing raw SQL, and the Java folks seem to be fond of Hibernate. I still look at the generated queries sometimes to convince myself that it's sane or for debugging or optimization purposes, but if they inserted a new middle layer between Alchemy and PostgreSQL that used something completely different, our code wouldn't notice the difference.

Re:No. (1)

dada21 (163177) | about 7 years ago | (#20496215)

Your invoice example is one that historically seems to only work with a purely relational database, but I beg to differ because I've seen systems in place (custom coded) where an invoice front end was way more interactive in how you entered data. Searches were faster, and for large companies, the shoehorning to get data into place can be almost violent because they have such a variety of what they want to enter, or search for, or assemble by, etc.

We have one customer, a large contractor, who is trying a new PO system that isn't relational, and uses a ton of non-persistent data. Each supplier, each job, each inventory item might be completely different on a particular job, and the relational database system fails because it is so complex to try to make everything work with one another properly.

I'm NOT saying that relational isn't the best solution for a simple invoice or inventory system, but it isn't the perfect solution for many companies that have a need to provide a different "mapping" for items that might seem similar. I recall one example where the company orders items from 3 suppliers, with those same invoices being distributed to different jobs that had different invoicing structures and different billing situations. It was a mess if we just said "10,000 items from supplier A, 10,000 items from supplier B, etc" because of the sheer number of fields involved. The system we provided for them, while proprietary to my business, is a much more fluid and adapting system that made the entry clerks and the project managers happy because the system WAS so fluid and capable of dealing with really sticky situations of having 15 suppliers for 35 jobs and 20 different processes to invoice, bill, collect, process and track. What we did was "revolutionary" according to the client, and I'm hoping to integrate it for future clients as well. We tried for years to work these problems out relationally, but it never worked, even hiring some top database consultants who said "just do it the way it has always been done."

I really think the relational database WILL eventually die as we find more ways to balance the idea of persistent data versus non-persistent data versus how we look things up, store them, and regurgitate them in a different way for different tasks. I am not a database programmer, but I always try to attack ongoing problems using newer tools and unique ways to look at things.

Re:No. (1)

plague3106 (71849) | about 7 years ago | (#20496607)

You are pretty vague about what data you couldn't handle relationally, but it sounds more to me like a problem with your 'top consultants' not knowing how to do proper entity relation design. If the solution they provided didn't work for your problem domain, they didn't do their job properly.

I'd like to hear more; I'm not saying relational is perfect, but I find it odd that you had to abandon RDBMSs altogether.

Aha! (4, Funny)

Stanistani (808333) | about 7 years ago | (#20496029)

The next big thing in DBMS:
turning your head sideways.

Re:Aha! (1)

lucabrasi999 (585141) | about 7 years ago | (#20496169)

Damn you! You owe me a cup of coffee.

Should be, but isn't, and won't. (4, Interesting)

dada21 (163177) | about 7 years ago | (#20496039)

In my IT business, a vast majority of our top tier clients (grossing over US$100 million annually) are still using antiquated software that is still using a relational database backend. While these companies are generally VERY efficient in terms of providing services or products to their market, their accounting, purchase order and project management software is decades outdated. Many of the companies that maintain these packages have merely made the interface more current (but still 5+ years old) and are still using terribly outdated software. I can't begin to tell you how often the words "FoxPro" and "MS SQL" come up and it ends up being a relational database "solution" or even worse.

It is very frustrating because we do have programmers on staff that create third party plug-ins to these databases to try to make solutions that the OEM code doesn't. When you meet younger programmers, many of them are frustrated themselves to work on ancient solutions that have no hope of being upgraded, because these industries we work in are not in a rush to try anything new and shiny, but instead are happy with the status quo.

I just bid a job a few months back that would cost $150,000 to upgrade their database infrastructure, and likely save the company $300,000+ annually in added efficiency, less downtime, and a more robust report system. Guess what they said? "We all think it is fine the way it is." That's money thrown out the window, employees who are frustrated (without knowing why), and forcing the company to lose efficiency by not being able to compete with newer companies that are utilizing newer technology to better their bottom line.

Ugh.

Re:Should be, but isn't, and won't. (0)

Anonymous Coward | about 7 years ago | (#20496553)

You want them to go without relational databases? There is a reason they said no.

There are places (data warehousing, mining, etc) where having the data in a non-relational format is fine, but for most database uses a company has, a relational database IS needed.

They might... (0)

Anonymous Coward | about 7 years ago | (#20496049)

Even Slashdot has become obsolete.

Simple solution. (3, Funny)

fahrbot-bot (874524) | about 7 years ago | (#20496057)

I tried turning my Oracle server on its side to get column-store access. Strangely, I didn't see any increase in performance. Perhaps I'll try the other side...

Re:Simple solution. (1)

QuantumRiff (120817) | about 7 years ago | (#20496495)

Don't be so gentle when you turn it, and make sure the DB server is running.. If you happen to do it just right, you can make enough hard drives in your raid array hiccup, and lose the whole array. Then it can read the entire DB in a millisecond!

Well, he WOULD say that (1)

Random BedHead Ed (602081) | about 7 years ago | (#20496065)

Let me get this straight by paraphrasing: Column databases are the wave of the future, says a column database distributor on his new column database blog. And Red Hat would recommend you run your new column database on Red Hat Enterprise Linux, perhaps? I wonder what brand of kit Dell would recommend I run RHEL on ...

I don't think this makes sense. Or does it? (1)

w4rl5ck (531459) | about 7 years ago | (#20496099)

From the perspective of an application developer, this is pure nonsense. I practically don't care whether my DB stores data in columns or rows or whatever.

What I need is a good, consistent layer that can handle object-based tree structures - nothing more, nothing less. I want to dump my Java/Objective-C/C#/C++/PHP/Python objects in some storage layer, and I want to be able to get them back, search for them, etc.

Yes, relational databases are (or should be) dead for most modern application designs. But not because RDBMSs are going to be replaced by column-oriented DBMSs (which, from the application's perspective, makes no difference - IMHO), but because OODBs solve most application problems better (not that good solutions exist yet... *sigh*)

On the other hand, I never got any master's degree. Shame on me. Just won a best paper award for a paper I did not even write. Maybe I just don't have the wits for this stuff ;)

Object Databases (3, Interesting)

jjohnson (62583) | about 7 years ago | (#20496101)

Are they now officially an also-ran? Has the whole concept failed to be usefully implemented commercially, or will it be another Lisp--elegant, beautiful, and largely unused because it's kind of weird?

Re:Object Databases (2, Insightful)

SashaMan (263632) | about 7 years ago | (#20496325)

In a word, yes. I think there are a couple reasons for this:

1. OR mappers like Hibernate have gotten to the point that they are quite good, so they make the value add prop of object databases less compelling.
2. Object databases are never going to get the speed of relational databases. This is the real dealbreaker. Suppose an object database can handle 95% of my queries with adequate performance. All well and good, but I'm totally screwed on those other 5%. On the other hand, if I was using a relational database with hibernate, hibernate might handle 95% of the queries with adequate performance, but for those other 5% I can workaround by writing custom SQL. With that setup I get the best of both worlds.

I don't know of any attempts to use object databases on large enterprise projects that haven't been complete failures, with the failure always due to performance issues.

Elegant? (0)

Anonymous Coward | about 7 years ago | (#20496635)

I would hardly use the word elegant to describe object databases. They seem elegant, right up until you try to version a schema or point a reporting tool at them. Interesting, yes. Elegant, hell no.

He may have a point (2, Interesting)

duffbeer703 (177751) | about 7 years ago | (#20496143)

For data warehousing, a higher or different level of abstraction may be useful and make database design easier, particularly as parallelism becomes more and more common. Storing rich markup language or media in a database might be problematic as well.

But there's no way that RDBMS's are going away -- relational algebra simply solves too many data storage problems.

Are relations obsolete? (4, Informative)

roman_mir (125474) | about 7 years ago | (#20496199)

Once someone shows that there is no longer a use for any relationship between data entries, then we'll be able to say that RDBMSs are obsolete. Actually both headlines (/. and the linked article) are mistaken about what Michael Stonebraker is saying. He is talking about read intensive applications mostly and he is talking about optimization of data for reading purposes. This does not mean that RDBMSs are obsolete for all uses, just that he sees a faster way to retrieve data for certain uses.

SenSage is earlier example of column-oriented DB (1)

GringoGoiano (176551) | about 7 years ago | (#20496205)

SenSage built a column-oriented DB in 2001 and has had much success with the approach for their fast-input, fast-query, high-density, multi-TB databases. Stonebraker was on their technical advisory board. Interesting that he now centers his own startup on the same principles. See http://en.wikipedia.org/wiki/Column-oriented_DBMS [wikipedia.org] .

'Tis True (-1, Troll)

MicrosoftIsGod (1152887) | about 7 years ago | (#20496239)

Tis true that Microsoft is greater than all of you! MS-SQL is better than anything. Windows is truly wonderful. As to the topic at hand, yes, I like green marshmallows, and so should you.

This will be the year (1)

Tribbin (565963) | about 7 years ago | (#20496331)

This will be the year of the column based database.

paradigm shift! (4, Funny)

sohp (22984) | about 7 years ago | (#20496421)

Along with Procedural Programming [slashdot.org] , this could REVOLUTIONIZE the software industry!!

You have to have a successor to be obsolete (0)

Anonymous Coward | about 7 years ago | (#20496435)

Yep, I'd say they're obsolete given the wild success of their replacement(s)....oh...wait....

Careful (1)

HangingChad (677530) | about 7 years ago | (#20496461)

The Dvorak keyboard is more efficient by a factor of 10 and you don't see it taking over the keyboard layout landscape.

Just because something is "better", even in technology, doesn't mean it's going to take over.

I've also lived through the decline of mainframes...still around. The internet was going to replace faxes...I still have a fax machine.

Linux is better than Windows, columns are better than rows but I wouldn't get all a-twitter over either of them just yet. Particularly from someone selling column based data stores.

rtfa before posting (3, Informative)

jilles (20976) | about 7 years ago | (#20496479)

I can understand people not reading every link on a Slashdot article they comment on. But if you post the bloody link, is it too much to ask to actually RTFA?! It's an article about a column. The actual column is quite interesting.

To add some content, this is about optimal storage for SQL databases in a data warehouse context where there are some interesting products that use something more optimal than the one size fits all solutions currently available from the big RDBMS vendors. The API on top is the same (i.e. SQL and other familiar data warehouse APIs), which makes it quite easy to integrate.

Regarding the obsolescence question, one size fits all will be good enough for most for some time to come. Increasingly people are more than happy with lightweight options that are even less efficient, on which they slap persistence layers that reduce performance even more, just because it allows them to autogenerate all the code that deals with stuffing boring data in some storage. Not having to deal with that makes it irrelevant how the database works and allows you to focus on how you work with the data rather than worrying about tables, rows and ACID properties. Autogenerating code that interacts with the database allows you to do all sorts of interesting things in the generated code and the layers underneath. For example, the Hibernate (a popular persistence layer for Java) people have been integrating Apache Lucene, a popular search index product, so that you can index and search your data objects using Lucene search queries rather than SQL. It's quite a neat solution that adds real value (e.g. fully text searchable product catalogs are dead easy with this).

Column based storage is just an optimization and not really that critical to the applications on top. If you need it, there are some specialized products currently. The author of the column is probably right about such solutions finding their way into mainstream products really soon. At the application level, you'll still be talking good old SQL to the damn thing though.

Wrong approach? (3, Interesting)

Aladrin (926209) | about 7 years ago | (#20496569)

Maybe his approach is all wrong. The database my company uses has MANY tables that are rarely written to, but a few that are written to ALL the time. Instead of trying to cram his 'one size fits all' database scheme down our throats and replace the current 'one size fits all' database scheme, maybe he should be trying to create a database engine that can do both.

I think you would have to determine the main use of the table beforehand (write-seldom or write-often), but the DB engine could use a different scheme for each table that way. I know some will claim that it can't be more efficient to split things this way, but remember that this guy is claiming 50x the speed for write-seldom operations.
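
A rough sketch of what "pick the scheme per table" could look like, assuming a toy in-memory engine. The Table class and its layout argument are hypothetical and exist only for this example. Callers use the same insert and scan calls either way; only the physical storage behind them differs.

```python
class Table:
    """Toy table whose physical layout is chosen once, at creation time."""

    def __init__(self, columns, layout):
        if layout not in ("row", "column"):
            raise ValueError("layout must be 'row' or 'column'")
        self.columns = tuple(columns)
        self.layout = layout
        self._rows = []                               # used by the row layout
        self._cols = {name: [] for name in self.columns}  # used by the column layout

    def insert(self, record):
        if len(record) != len(self.columns):
            raise ValueError("record does not match the table's columns")
        if self.layout == "row":
            # Write-often tables: one append per record.
            self._rows.append(tuple(record))
        else:
            # Write-seldom tables: one append per column.
            for name, value in zip(self.columns, record):
                self._cols[name].append(value)

    def scan(self, column):
        """Return every value of one column, whatever the layout is."""
        if self.layout == "row":
            index = self.columns.index(column)
            return [row[index] for row in self._rows]
        return list(self._cols[column])


# Hypothetical usage: a busy log table and a mostly static catalog table.
page_hits = Table(("url", "hits"), layout="row")
catalog = Table(("sku", "price"), layout="column")
```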

As for Relational Databases... How is this exclusive to that? This is simply how the data is stored and accessed. If he is claiming 50x speed-up because he doesn't deal with the relational stuff, that's bunk. You could write a row-store database with much greater speed as well, given those parameters.

Specialized versus generalized? (3, Interesting)

dpbsmith (263124) | about 7 years ago | (#20496589)

'"In every major application area I can think of, it is possible to build a SQL DBMS engine with vertical market-specific internals that outperforms the 'one size fits all' engines by a factor of 50 or so," he wrote.'

I know very little about DBMS systems, but I thought it has always been true that you can achieve monumental performance increases by building somewhat specialized database systems in which the internals of the system make assumptions about, and are tied to, the structure of the data being modelled. In fact, when RDBMS systems came in, one of the knocks on them was that they were far more resource-intensive than the hierarchical databases they displaced. However, the carved-in-stone assumptions of those models made them difficult and expensive to change or repurpose.

I'm sure I remember innumerable articles insisting that "relational databases don't need to be really all that much terribly slower if you know how to optimize this that and the other thing..."

In other words, as an outsider viewing from a distance, I've assumed that the increasing use of RDBMS was an indication that in the real world it turned out that it was better to be slow, flexible, and general, than fast, rigid, and specialized.

So, what is a "column store?" It sounds like it is an agile, rapid development methodology for generating fast, rigid, specialized databases?

Mildly Confused (1)

Duffy13 (1135411) | about 7 years ago | (#20496593)

While I didn't particularly pay much attention in my database class, or go to it that often, from my current work with databases and a quick skim of some definitions for RDBMS, it strikes me that a good portion of people in this thread and the articles are using the term RDBMS incorrectly. (Though some of the posts appear to be in agreement with me) As far as I can determine RDBMS is solely (and simplified) the concept of relating data between different tables to decrease the repetition of said data. It's a method, a widely applied method, but just a method, not an actual type of database storage.

Sooo, wtf does RDBMS have to do with storing data with either columns or rows in a file?

Soon... (3, Funny)

lpangelrob (714473) | about 7 years ago | (#20496641)

The near future. Mr. Stonebraker walks into a store.

Mr. Stonebraker: How much are these plums?
Checkout girl: Plums? They're $0.99, $1.39, $12.49, $15.99, $26.38, $13.37...
