
With XML, is the Time Right for Hierarchical DBs?

Cliff posted more than 12 years ago | from the digital-evolution-of-data-storage dept.

News 276

DullTrev asks: "The hierarchical database model existed before the far more familiar relational model. Hierarchical databases were blown away by relational versions because it was difficult to model a many-to-many relationship - the very basis of the hierarchical model is that each child element has only one parent element. However, we now live in a web world that demands quick access to a variety of data on a variety of platforms. XML is being used to facilitate this, and XML has, of course, a hierarchical structure." Do you think a hierarchical database would really be a better answer for storing XML data over the existing relational counterparts?

"There have been some pushes to create pure XML databases (info on XML in connection to databases is here and info on XML database products is here) with claims that as they support XML natively, they can offer many advantages over relational databases.

Some of these claims include speed, better handling of audio, graphic and other digital files, easier administration, and handling of unexpected elements. Software AG, a German firm, produce and sell a suite of XML products, including Tamino, a native XML database. They have lots of information on why they think their database is great, not surprisingly, but no benchmarks. So, do the Slashdot community think that with XML the time has come for hierarchical databases? Or is it better simply to use a relational database that can output in XML, or script your way to achieve the same goal?"


Not-so-first post! (-1)

I.T.R.A.R.K. (533627) | more than 12 years ago | (#2581578)

Fucking christ. This pops up JUST as I finish making another post.
Soooo, no doubt this won't be a first post. But at least I'll be able to waste someone's bandwidth. You fuckers on dialup can suck my ass.!
And while you're at it, you can suck on this tasty morsel [rotten.com] !

Hey, Slashdot actually got a techie topic! (-1, Offtopic)

Anonymous Coward | more than 12 years ago | (#2581804)

What the hell's wrong with youse guys? Go back to giving us bullshit uninteresting topics to post anonymously on and do it NOW!

I DO IT WRONG (-1)

Big_Ass_Spork (446856) | more than 12 years ago | (#2581876)

I do it wrong

Laying here in the shadows of my room, I squint up at my love. My Ms. Portman. I am sore and tired after fucking her for eight solid hours. My chapped and aching dick is soaking in grits to relieve the pain. She gets on her knees and starts lapping the grits up out of the bowl. She places her beautiful hands on my penis and starts to lick the grits off my achy piece.

Massaging my nutsack she....

WAIT, I DO IT WRONG!!!!

Yanking my dick out of her mouth I throw her to the ground and shove it in to her gaping freshly fisted ass. [goatse.cx]

"OH BIG ASS SPORK!! Fuck my ass, fuck my ass good. DEEPER, my stallion, deeper!! Make a Beowulf cluster of sperm on my back!!"

"Imagine a Beowulf cluster of this baby!"

I DO IT WRONG!!!!

I continue to hump her alabaster form. Glistening with beads of sweat, she bites her lip in delight as I tear her ass open with my engorged dick.

"Queen Amidala!!" I shreik as I near climax.

She looks up at me and screams, "You are so alive in me, unlike *BSD or VA Software!!! Fill me with seed!! Yes, Yes, Yess!!!!"

"For me you are calling, hhhmmm?"

"YODA?!? What the fuck, can't you see I am using the force here?"

He savagely kicks my Natalie aside, he pulls out his large green penis and impales me...

I DO IT WRONG!!

All your sporkz are belong to the dead homiez!!

Frist P0s7! (-1, Troll)

Anonymous Coward | more than 12 years ago | (#2581579)

I rule. Finally! MWA HA HA! Props to the man.

Re:Frist P0s7! (-1)

I.T.R.A.R.K. (533627) | more than 12 years ago | (#2581583)

Oh yeah. Suck it!
C'moooon. Ungh! Ungh!
Don't act like it's your first time!
*slap*slap*
You got owned, bitch!

- I throw rocks at retarded kids

I don't think so. (2, Insightful)

webprogrammer (518832) | more than 12 years ago | (#2581584)

Hierarchical databases won't take over because their relational counterparts are already so well developed. A relational database can do everything a hierarchical one can, with few exceptions. Even if there is a slight gain from using a hierarchical system, there are far fewer solutions available, and consequently the ones that do exist aren't as well developed, so implementing one is more difficult.

Re:I don't think so. (4, Insightful)

Netmonger (3253) | more than 12 years ago | (#2581605)

I don't agree - look at LDAP. The benefits of LDAP'fying services are clear. With a hierarchical database, specific queries can target a subset of the entire database, without the overhead of having separate tables and/or databases for varying information. For keeping track of 'real world' objects (people, printers, IPs, etc.), the advantage is that the system used to organize them is similar to the actual grouping going on. Managers have employees 'underneath' them. It's basically taking the organizational concepts used for filesystems and applying them to database design. I haven't done any performance testing of LDAP vs. SQL for a similar schema setup, but from what I understand one of the other benefits is fast lookups. Sounds like a good project: implement databases in both LDAP and SQL and measure the performance of similar queries! :)

Re:I don't think so. (1)

webprogrammer (518832) | more than 12 years ago | (#2581637)

I agree with what you say, but what is it that doesn't allow you to do that in a relational database? There may be instances where a hierarchical database fits (as in the example you give), but how is this a great advantage over a relational database? Yes it may be quicker in some instances, but a properly designed relational based system could force you to connect only to the database that contains the necessary data. With your method, you'd have to connect to the whole database first, then target a subset. Of course, under some circumstances hierarchical may have the advantage. Probably a project like you say could be done fairly easily with something like PHP. I'm not sure about this, but you might even be able to use pre-supplied wrappers so you don't have to bother making sure your LDAP and SQL query mean the same thing. You could just use the same query for both, with wrappers around each. Maybe I'll try that sometime.

Re:I don't think so. (2)

tzanger (1575) | more than 12 years ago | (#2581757)

I agree with what you say, but what is it that doesn't allow you to do that in a relational database?

Multiple values for attributes immediately come to mind.

For instance: Bob Smith has 3 phone numbers. With a hierarchical database such as LDAP, you simply list them as

  • telephoneNumber: (519) 555-1212
  • telephoneNumber: (604) 555-1212
  • telephoneNumber: (905) 555-1212

In a relational database you must either leave room for the most you think you will run into, use a "joiner" table (the real term escapes me at this moment) or similarly kludge together a solution. Hierarchical databases are a pain in the ass for many things, but storing multivalued data is not one of an RDBMS' strong points.

Re:I don't think so. (1)

Craig Davison (37723) | more than 12 years ago | (#2581816)

The "joiner" table is not a kludge.

Besides, many LDAP implementations - ActiveDirectory for example - have a relational DB at their core.

Disclaimer: I always hated LDAP.

Re:I don't think so. (1)

kingmundi (54911) | more than 12 years ago | (#2581830)

Whats wrong with this?

|id | custnum | custname | telephone |
|1 | 45890 | Bob Smith | (519) 555-1212 |
|2 | 45890 | Bob Smith | (604) 555-1212 |
|3 | 45890 | Bob Smith | (905) 555-1212 |

Re: I don't think so. (1)

Inthewire (521207) | more than 12 years ago | (#2581851)

Whats wrong with this?

|id | custnum | custname | telephone |
|1 | 45890 | Bob Smith | (519) 555-1212 |
|2 | 45890 | Bob Smith | (604) 555-1212 |
|3 | 45890 | Bob Smith | (905) 555-1212 |

Well...no need to store the custname if you are already storing the custnum...duplication of data, and all that.
You'd just have another table, keyed by custnum, that would hold 'Bob Smith'.

Re:I don't think so. (1)

DavidJA (323792) | more than 12 years ago | (#2581861)

Because it breaks the rules of normalization(sp?). How much space is being wasted by repeating the CustName, CustNum & ID 3 times? And what if this customer has orders? Which instance of Bob Smith does the order get associated with?

A better solution is to create two tables: one for customers (id/custnum/custname) and one for contact numbers (id/custid/phonenumber), with custid pointing at the matching customer row in the first table.

For more info try here [wdvl.com]
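
A minimal sketch of the two-table layout described above, using Python's built-in sqlite3 module with an in-memory database. The table names `customer` and `contact_number` and the helper `phones_for` are made up for illustration; only the column names come from the post, and the foreign key from `custid` to `customer.id` is one reading of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id       INTEGER PRIMARY KEY,
        custnum  INTEGER UNIQUE NOT NULL,
        custname TEXT NOT NULL
    );
    CREATE TABLE contact_number (
        id          INTEGER PRIMARY KEY,
        custid      INTEGER NOT NULL REFERENCES customer(id),
        phonenumber TEXT NOT NULL
    );
""")

# Bob Smith is stored exactly once; each phone number is its own row.
conn.execute(
    "INSERT INTO customer (id, custnum, custname) VALUES (1, 45890, 'Bob Smith')"
)
conn.executemany(
    "INSERT INTO contact_number (custid, phonenumber) VALUES (?, ?)",
    [(1, "(519) 555-1212"), (1, "(604) 555-1212"), (1, "(905) 555-1212")],
)

def phones_for(custnum):
    """Return all phone numbers for a customer number, in insertion order."""
    rows = conn.execute(
        """SELECT n.phonenumber
           FROM customer c
           JOIN contact_number n ON n.custid = c.id
           WHERE c.custnum = ?
           ORDER BY n.id""",
        (custnum,),
    )
    return [r[0] for r in rows]

print(phones_for(45890))
```

An order table could then reference `customer.id` directly, which sidesteps the "which Bob Smith" question raised above.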

Re:I don't think so. (2)

DJerman (12424) | more than 12 years ago | (#2582011)

Both are right and wrong. The denormalized tables can be made blindingly fast with appropriate index(es), but the joined tables are more space-efficient and flexible. You must know the "typical query" and know the number and kind of records and know the physical layer before you can do the math and settle the question.

Third normal form is wrong if you *always* join two tables in the same way. You may waste storage with the un-normalized table but if you always join the tables you're wasting time and swap space (temp segments, whatever) reconstructing the d**n thing over and over again. Build it once and be done. OTOH, if you usually just pick other information and get phone numbers once in a blue moon, normalize away. In Oracle, choose a cluster and almost get both benefits (sacrificing both space and time, but less).

Re:I don't think so. (2, Interesting)

jd142 (129673) | more than 12 years ago | (#2581837)

Nah, this example is pretty simple. You don't even have to use a join or bridge table, which is what I've heard them called. Those are only needed when you have two objects that have a many to many relationship. For example, if you were doing a database of computer repairs, you might have a table of customers and a table of techs. Since there would be a many to many relationship here, you'd have a work order table or something, to show that tech1 worked with cust1, cust2, cust1, cust3, cust3, and that cust1 had service calls by tech1, tech2, tech2, tech1, etc.
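
For what it's worth, the work-order bridge table can be sketched like this with Python's built-in sqlite3 module. Table names, column names and the two helper functions are assumptions for illustration; the sample rows are invented, not taken from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tech     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The bridge table: one row per service call, linking a tech to a customer.
    CREATE TABLE work_order (
        id      INTEGER PRIMARY KEY,
        tech_id INTEGER NOT NULL REFERENCES tech(id),
        cust_id INTEGER NOT NULL REFERENCES customer(id)
    );
    INSERT INTO customer VALUES (1, 'cust1'), (2, 'cust2'), (3, 'cust3');
    INSERT INTO tech     VALUES (1, 'tech1'), (2, 'tech2');
    INSERT INTO work_order (tech_id, cust_id) VALUES
        (1, 1), (1, 2), (1, 1), (1, 3), (2, 1), (2, 3);
""")

def customers_of(tech_name):
    """Distinct customers a given tech has worked with."""
    rows = conn.execute(
        """SELECT DISTINCT c.name
           FROM work_order w
           JOIN tech t     ON t.id = w.tech_id
           JOIN customer c ON c.id = w.cust_id
           WHERE t.name = ?
           ORDER BY c.name""",
        (tech_name,),
    )
    return [r[0] for r in rows]

def techs_of(cust_name):
    """Distinct techs who have served a given customer."""
    rows = conn.execute(
        """SELECT DISTINCT t.name
           FROM work_order w
           JOIN tech t     ON t.id = w.tech_id
           JOIN customer c ON c.id = w.cust_id
           WHERE c.name = ?
           ORDER BY t.name""",
        (cust_name,),
    )
    return [r[0] for r in rows]

print(customers_of("tech1"), techs_of("cust1"))
```

The same table answers the question from both directions, which is the part a strict one-parent hierarchy cannot do without duplicating records.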

In this case, unless you had a table of phone number data that contained information about the number (like who paid for it, the day it was installed, the type of service available, the type of line, etc) you could get by with just one employee/number table, like this:

bobid phone1
bobid phone2
bobid phone3

which is pretty simple, with a combination key of employid/phonenumb. You could still have a separate table with the phone number info, with the phone number as the primary key if you wanted to track the other data.

Most people overthink relational databases and don't really break things down like they should and make well-formed tables. Of course, you can change the table structure based on how the database is going to be used. Sometimes it is better to denormalize the table for search efficiency.

What I think is most interesting are the OODBMS, but it seems to me that they would have an increased overhead on their searches.

bob

Re:I don't think so. (0)

Anonymous Coward | more than 12 years ago | (#2581839)

Excuse me, but joining tables (which is what you would probably do in this example) isn't a kludge in relational databases, it's the bread and butter of what it is actually good for.

Re:I don't think so. (3, Informative)

brunson (91995) | more than 12 years ago | (#2581841)

This is a terrible example. You are trying to describe a scenario that requires a many to many relationship. The intermediary "joiner" or cross-reference table is only necessary if you have a need to keep both joined tables normalized, i.e. you want each distinct telephone number, as well as each person object, to be stored in the database only once.

You've already given up the possibility of normalizing your phone numbers in the hierarchical model (my roommate's home phone is the same as mine and it shows up in LDAP twice, once for me and once for him), so a simple many-to-one join to the telephone number table will allow you to list a home phone twice, once for each of us.

Now, if the data you are modeling truly requires a many-to-many relationship (your model needs to handle the real world, you can't change the world to fit the limitations of your tools), you have no way of representing that information in a normalized fashion in a hierarchical model. The so-called "kludge" of an x-ref table from the relational world is not even an option.

The hierarchical model is so limited and simplistic that it can be implemented in a single, self-referential table in a relational database, and can even be queried in a recursive manner (Oracle has had 'connect by prior' for dealing with these models since I started with the product 10 years ago).
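
A rough sketch of the single self-referential table, using Python's built-in sqlite3 module. Note the swap: this uses a standard `WITH RECURSIVE` common table expression, which SQLite supports, rather than Oracle's `connect by prior`. The table `node`, its sample rows and the helper `subtree` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL marks the root
        name      TEXT NOT NULL
    );
    INSERT INTO node VALUES
        (1, NULL, 'company'),
        (2, 1,    'engineering'),
        (3, 1,    'sales'),
        (4, 2,    'alice'),
        (5, 2,    'bob'),
        (6, 3,    'carol');
""")

def subtree(root_id):
    """Return (name, depth) for root_id and everything beneath it."""
    rows = conn.execute(
        """WITH RECURSIVE walk(id, name, depth) AS (
               SELECT id, name, 0 FROM node WHERE id = ?
               UNION ALL
               SELECT n.id, n.name, w.depth + 1
               FROM node n JOIN walk w ON n.parent_id = w.id
           )
           SELECT name, depth FROM walk ORDER BY depth, id""",
        (root_id,),
    )
    return rows.fetchall()

print(subtree(2))
```

Each row has exactly one `parent_id`, so the table carries the same one-parent restriction as the hierarchical model it imitates.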

From my view as a mathematician, and not a computer programmer, the relational model is so much more robust and powerful than a hierarchical model that it hardly warrants discussion.

'Joiner tables' are not Kludges (1)

The Raven (30575) | more than 12 years ago | (#2581973)

Taking out attributes with multiple values and putting them into a linked table is core to the functionality of relational databases.

Customer
--------
CustomerID
FirstName
LastName

PhoneNumber
-----------
PhoneNumberID
PhoneNumberTypeID
CustomerID
Text

This kind of relation is basic functionality in relational databases. This ain't no kludge.

Hierarchical databases have so many limitations. Even simple things, like employee lists, suffer under the restrictions of a hierarchical database: employees who work in multiple departments or have multiple supervisors, employees with multiple spouses (think international), projects with three leads, employees working on multiple projects.

Relational databases were created for a reason. Abandoning all those improvements just to fit more cleanly into the XML hierarchical model is ludicrous to me.

Raven

Re:I don't think so. (1)

petis (139263) | more than 12 years ago | (#2581673)

I agree with you. For example, Oracle implements a hierarchical search with the start with .. connect by prior .. statement. It works like a charm. I don't think this is possible in MySQL and friends, but it is probably just a matter of time.

Re:I don't think so. (3, Insightful)

drodver (410899) | more than 12 years ago | (#2581891)

Why do you assume relational databases are more developed than hierarchical?? The company I work for has been using our own hierarchical database for 25 years. They had the potential to become what Oracle is today but decided to stay focused on the medical industry. The serious problem with relational databases is they have traditionally not handled sparse data well at all. In the case of a patient, every time they come for a visit there are tens of thousands of possible data points that can be entered, but most usually are empty. For tasks such as these relational databases have been completely impractical. With the use of indexing a hierarchical database can do everything a relational database can do.

Reversed Question (5, Insightful)

devnullkac (223246) | more than 12 years ago | (#2581585)

From a purist perspective, I suspect the question is actually reversed: we shouldn't be talking about "XML data" as if it were somehow the core representation. Its usual intent is as a transmission format and, as such, it needn't correspond directly to the organization of the source data.

Rather than discard the advantages of relational and object databases, should we instead ask how XML can be used to represent those kinds of relationships?

Re:Reversed Question (2)

sporty (27564) | more than 12 years ago | (#2581720)

Having worked with representing just the table layouts in XML, it's really not that hard to represent, say, an NxM set of data. That's always been easy; it's a 2d table. It winds up being a very shallow XML document, no more than, say, 3 levels: root node, node representing the data record, node representing an entry in the record.
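
The three-level shape can be sketched with Python's standard xml.etree.ElementTree module. The function names `rows_to_xml` and `depth`, the element name `record` and the sample rows are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(table_name, columns, rows):
    """Turn an NxM table into a shallow XML tree."""
    root = ET.Element(table_name)                # level 1: root node
    for row in rows:
        record = ET.SubElement(root, "record")   # level 2: one data record
        for col, value in zip(columns, row):
            field = ET.SubElement(record, col)   # level 3: one entry in the record
            field.text = str(value)
    return root

def depth(elem):
    """Number of element levels at and below elem."""
    return 1 + max((depth(child) for child in elem), default=0)

columns = ["custnum", "custname", "telephone"]
rows = [
    (45890, "Bob Smith", "(519) 555-1212"),
    (45890, "Bob Smith", "(604) 555-1212"),
]
doc = rows_to_xml("customers", columns, rows)
print(ET.tostring(doc, encoding="unicode"))
```

This assumes the column names are valid XML element names; a real converter would have to escape or rename the ones that are not.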


You are 100% right, in that we should discard relational db's. Objects are a little more natural for a representation in XML. If an object contains objects, even if they are of the same type, a la trees, it's a more natural representation than a 2d table.

Re:Reversed Question (1)

The Raven (30575) | more than 12 years ago | (#2581825)

devnullkac did NOT recommend that we discard relational databases! Exactly the opposite: he claimed that we should be looking for efficient ways to represent relational information in XML, not ways to abandon relational databases in favor of hierarchical ones.

Remember that XML is not the 'end use' for data. XML is a way to get data from place to place... from one database to another, one OS to another, one country to another. People don't 'use' XML any more than they 'use' TCP... it's a way to get data from one place to another.

So the question should not be 'How do we efficiently represent XML in our databases?' The question should be 'How do I efficiently represent my database in XML?'

Raven

Re:Reversed Question (2)

sporty (27564) | more than 12 years ago | (#2581845)

It's really not that hard, especially with 2d data, per my post. And even if you do table joins, you can represent the joined data with XML that records which table each piece of data came from.

Re:Reversed Question (4, Insightful)

Florian Weimer (88405) | more than 12 years ago | (#2581723)

You have a point. In addition, we should ask ourselves: "Do we really need XML if it doesn't fit in our established technology framework?"

Often, the answer is a plain "No", from a technical standpoint. However, you have to market your product somehow, and this means that you need Java, Linux, LDAP, XML, and SOAP. (As time passes, some entries will drop off the beginning of this list, and others will show up at the end.)

Re:Reversed Question (0)

Anonymous Coward | more than 12 years ago | (#2581850)

Java, Linux, LDAP, XML, and SOAP. (As time passes, some entries will drop off the beginning of this list, and others will show up at the end.)

That's a bit presumptuous to say the least. Leaving Linux aside for the moment, try to remember that Java, LDAP and XML were created to solve particular problems - at which they have succeeded quite well. SOAP and .NET were created purely to try and grab market share away from the previous technologies, without conferring any substantial benefits and relying entirely upon marketing muscle. It's not at all clear that a sufficient majority will be fooled.

Re:Reversed Question (2)

haruharaharu (443975) | more than 12 years ago | (#2581946)

Java, LDAP and XML were created to solve particular problems - at which they have succeeded quite well. SOAP and .NET were created purely to try and grab market share away from the previous technologies

And they are all being used in various places where they don't belong, just because they are the fads of the day. How long before 'Who moved my cheese?' finds its way in to this list?

You didn't mention the best native XML DB (1, Interesting)

Anonymous Coward | more than 12 years ago | (#2581589)

excelon has a very full featured XML database.
We use it exclusively and it kicks ass.

Well, the current version does. Pre 3.0 sucked ass.

http://www.exceloncorp.com

hi (-1)

FigBug (69370) | more than 12 years ago | (#2581597)

i like databases

Hierarchical == Object-Oriented Databases? (3, Insightful)

disarray (108) | more than 12 years ago | (#2581603)

Wouldn't object-oriented databases qualify as hierarchical (or some of them, at least)? A rather lengthy story [slashdot.org] ran a while back covering various reasons why object-oriented databases are useful, followed by various comments on cases where they aren't and why they aren't as common as relational ones today. The bottom line seems to be that they are in use today. One notable example comes to mind: LDAP. The aforementioned story has more. Despite the rather preachy tone, it's an interesting read.

1337ness for sale. [ebay.com]

Re:Hierarchical != Object-Oriented Databases (1)

nusuth (520833) | more than 12 years ago | (#2581663)

Organisation of containers is not the same as organisation of data. Think of it this way: you can store all objects derived from FooObject if you have a node structure FooObject *Item, yet the nodes themselves could be stored in a linked list (compare with a simple db table), or in a graph (compare with a relational database with many-to-many relationships) or in a tree (compare with a hierarchical database). The difference between relational and object-oriented databases is basically what type of things they can store. OO ones can store whole objects, while relational ones store fields in tables. Those fields are usually simple data types. I can't think of a reason that an OO database cannot also be a hierarchical database, yet that does not have to be the case either.

Re:Hierarchical == Object-Oriented Databases? (1)

ez76 (322080) | more than 12 years ago | (#2581710)

Wouldn't object-oriented databases qualify as hierarchical (or some of them, at least)?

Be careful not to confuse class "hierarchies" with relationships among objects, which are generally graphs, not rooted trees.

Re:Hierarchical == Object-Oriented Databases? (1)

JordanH (75307) | more than 12 years ago | (#2581924)

Shhhh... You aren't supposed to notice that.

Why MUST we be forced into one-size-fits-all RDBMS solutions?

Someone had to come up with the buzzword-compliant "Object-Oriented" Database to break the hypnotism the Relational Database vendors and theorists have over the industry.

It seems to me that a lot of data is hierarchical in nature. It's represented that way in programs and sometimes, you just want it to be persistent. A hierarchical database is sometimes just the thing you need, but we're forced into taking our nice hierarchies and deconstructing them into tables to make them fit in the one tool for persistent data storage that's blessed.

Many to many is hard? FALSE! (3, Insightful)

hodeleri (89647) | more than 12 years ago | (#2581604)

XML has, of course, a hierarchical structure

Just because XML is a hierarchical markup language does not mean that it can only be used for hierarchical things. Perhaps you should look at RDF [w3.org] which can use many to many mappings through resources and groupings (sequences, bags, and alternates). (A resource in one grouping can refer to another grouping i.e. many to many.)
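
To illustrate the general idea (this is not RDF syntax, just plain XML with made-up `id` and `ref` attributes), a hierarchical document can still carry a many-to-many relationship if elements point at each other by identifier instead of nesting. The document and the dictionary names below are all assumptions; parsing uses Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: projects refer to employees by id rather than containing them.
DOC = """
<directory>
  <project id="p1" name="compiler"><member ref="e1"/><member ref="e2"/></project>
  <project id="p2" name="debugger"><member ref="e2"/></project>
  <employee id="e1" name="alice"/>
  <employee id="e2" name="bob"/>
</directory>
""".strip()

root = ET.fromstring(DOC)
names = {e.get("id"): e.get("name") for e in root.findall("employee")}

members = {}   # project name  -> employee names
projects = {}  # employee name -> project names
for project in root.findall("project"):
    for member in project.findall("member"):
        employee = names[member.get("ref")]
        members.setdefault(project.get("name"), []).append(employee)
        projects.setdefault(employee, []).append(project.get("name"))

print(members)
print(projects)
```

The tree itself stays strictly one-parent; the many-to-many part lives in the references, which the reader has to resolve.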

Re:Many to many is hard? FALSE! (1)

m00nshyn3 (314525) | more than 12 years ago | (#2581793)

Nobody ever said many-to-many relationships in XML were hard. The original comment said that many-to-many relationships were hard in hierarchical databases, not hierarchical structures. A straightforward explanation of why many-to-many is hard in a hierarchical db can be found here [extropia.com].

Discussions (3, Informative)

Lozzer (141543) | more than 12 years ago | (#2581608)

There is lots said on this over at Database Debunkings [firstsql.com]

XML vs. ERwin (3, Insightful)

imrdkl (302224) | more than 12 years ago | (#2581609)

IANADG, but the folks that do our models still use good, old, ERwin. Something about the relationship-specification capabilities, I guess. I was not aware that XML limited number of parents specifically. You sure that ain't just a limitation of your programming language? :)

An afterthought, databases are about storage and speed of insertion/extraction. I honestly don't believe that fitting the database to the data structure is worth the cost or the trouble, just yet.

No Chance... (3, Insightful)

augustz (18082) | more than 12 years ago | (#2581610)

I think these discussions come up part of the time because people want something new and sexy. In this case OO DB's, which 'XML DB's' are a variant of, may have benefits in specific and limited cases. But I have not been impressed.

Take your classic orders table: Part NO, Customer NO, etc. The number of apps with only one parent is tiny, the flexibility limited, and the whole metadata scanning business awkward.

For anyone doing any serious larger-scale database work, some of this stuff is a joke. The idea these vendors have is that we'll be storing XML data in these DBs, ignoring that even for a simple phone directory, the XML data probably takes up a significantly greater amount of space than a simple relational DB would require.

And this ignores the significant amount of time and energy invested in toolsets and models for the existing setup. Sure, someone might come out with a chip that runs 2x as fast as an Intel at the same price, but unless it is Intel compatible how many people would buy it or care?

Re:No Chance... (0)

Anonymous Coward | more than 12 years ago | (#2581707)

This is a very uninformed answer. XML DBs are *not* variants of Object DBs... Some, like Excelon and X-Hive, are built on top of object databases, while others like Tamino and dbXML have implemented native filing systems geared specifically toward storing and indexing XML document structures. Even those DBs that are built on top of object databases don't expose the actual object database layer.

Re:No Chance... (1)

Multispin (49784) | more than 12 years ago | (#2581724)

I wouldn't consider XML DBs a special case of OO DBs. OO DBs imply something more about object relationships. They often imply the notion of methods and operators on these objects.

Believe me, there are LOTS of very cool things that you can do when you break the relational model. Is the relational model going away? Hell no. Is XML somehow competing with the relational model? Nope.

I honestly don't see anything XML related being faster than relational for transactional data. However, for knowledge representation or data interchange, hierarchies rock!

Source and migration (a digression) (1, Offtopic)

Improv (2467) | more than 12 years ago | (#2581768)

I would suspect that companies/people who run Unix would like that faster chip, as Unix is quite portable. I have 2 Alphas, 3 PCs, a NeXT, and my laptop is an iBook, all of them running Unix. At work, I manage various flavors of Unix, many on non-x86 hardware. But I digress..

Re:No Chance... (1)

Kingpin (40003) | more than 12 years ago | (#2581795)

This is not because people think "XML is sexy. Let's do everything with it!". It's because the DB research world has recently been approaching semistructured data, which allows for queries not immediately available in the relational model.

Furthermore the "document society" has been bringing consistency back into the web through various derivatives of the "webservices" concept, i.e. there's an XML representation of data, which is machine readable, rather than one in quirky man-made HTML.

This convergence between DB research and data representation is what is interesting to lots of people in this area. Will it suddenly make sense to use the hierarchical structure as a logical view on the database? If so, can we perform operations like JOIN and UNION (or others) on websites, thus causing data enhancement or aggregation?

The ideas are really interesting; don't knock them quite yet. I can strongly recommend a Google search on "semistructured data"; further, the book "Data on the Web" by Abiteboul et al. is extremely insightful on this topic.

Both Worlds (1)

Khazunga (176423) | more than 12 years ago | (#2581615)

What I really would like to see is the ability to have Relational Databases, with hierarchical types for fields. I would be able to query these fields much like I query/transform an XML document, possibly using the XML technologies (XPath, XSL, ...)

The relational model is very good for most situations, and has been thoroughly studied and optimized. No one would transition back to pure hierarchical DBs.

Re:Both Worlds (2)

friedo (112163) | more than 12 years ago | (#2581877)

A hierarchical model can easily be done in a relational system by simply using a self-referential 2d table. Each element has a unique ID and the ID of its parent.

Re:Both Worlds (2, Informative)

rp (29053) | more than 12 years ago | (#2582009)

You can represent the structure, but you can't manipulate it using standard relational logic.

For example, take a table representing a parent-child relationship. Now try to sort the persons in the table by their number of descendants. SQL has only recently been extended to allow this query to be posed. Perhaps your relational database can handle this kind of query, where you have arbitrary-depth path walking, but you can't expect it to handle them efficiently.
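
As an illustration of what such a query can look like, here is a sketch using a recursive common table expression in SQLite via Python's built-in sqlite3 module. That this is the extension meant above is an assumption; the `person` table, its rows and the helper `by_descendant_count` are invented, and nothing here speaks to how efficiently a given engine runs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES person(id)
    );
    INSERT INTO person VALUES
        (1, 'ann', NULL),
        (2, 'ben', 1),
        (3, 'cat', 1),
        (4, 'dan', 2),
        (5, 'eve', 4);
""")

def by_descendant_count():
    """Return (name, number_of_descendants), largest first."""
    rows = conn.execute(
        """WITH RECURSIVE lineage(ancestor, descendant) AS (
               -- direct children
               SELECT parent_id, id FROM person WHERE parent_id IS NOT NULL
               UNION ALL
               -- children of anyone already known to be a descendant
               SELECT l.ancestor, p.id
               FROM lineage l JOIN person p ON p.parent_id = l.descendant
           )
           SELECT p.name, COUNT(l.descendant) AS n
           FROM person p LEFT JOIN lineage l ON l.ancestor = p.id
           GROUP BY p.id
           ORDER BY n DESC, p.name"""
    )
    return rows.fetchall()

print(by_descendant_count())
```

The `lineage` relation is the full ancestor/descendant closure, so its size can grow much faster than the table itself on deep trees, which is where the efficiency worry comes in.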

XML is sometimes useful (1, Informative)

Anonymous Coward | more than 12 years ago | (#2581618)

Yes, I think XML databases can be useful sometimes, as even though relational is faster and better developed in some cases native XML products have the capability to store any data, without prior setup. I know I'm using dbXML (http://www.dbxml.org) in a product of mine which allows 3rd parties to store arbitrary data associated with a user.

Also, you get the full advantage of the XML technologies developed by the W3C and others - the ability to do a simple query, transform that data and then send to a web browser with very little coding involved is a great bonus.

(i've forgotten my login, time to go create a new one i think)

Indexing? (4, Insightful)

aralin (107264) | more than 12 years ago | (#2581625)

Can anyone explain to me what is suddenly so wrong with a relational database with hierarchical indexing?

Maybe it's just me, but the goal today is integration, and having one special database for XML and another for this and that, just because it's faster for one particular problem, creates a level of complexity that prevents accomplishing even the most trivial tasks.

Still, XML is only a way to describe data, which may often be relational in structure. Why not store the data in its native form and create XML documents out of the database on the fly with filters?

This question of hierarchical databases is just plain trolling in my eyes.

Re:Indexing? (2)

BenHmm (90784) | more than 12 years ago | (#2581654)

Still, XML is only a way to describe data, which may often be relational in structure. Why not store the data in its native form and create XML documents out of the database on the fly with filters?

Quite. Not only would the XML markup probably take more space than the data itself, but storing it as XML seems to be not only pointless, but also a little shortsighted. What if your XML spec changes? What if you want the data in another form?

Just storing the data and then dynamically creating the XML doc on the fly is sooo much easier.
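
A small sketch of that approach: the rows live in a plain table, and each output format is just a function over a query result. It uses Python's standard sqlite3, csv and xml.etree.ElementTree modules; the `phone` table, the element and attribute names, and the three helpers are assumptions for illustration.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phone (custname TEXT NOT NULL, number TEXT NOT NULL);
    INSERT INTO phone VALUES
        ('Bob Smith', '(519) 555-1212'),
        ('Bob Smith', '(604) 555-1212');
""")

def fetch():
    """The stored data, independent of any output format."""
    return conn.execute(
        "SELECT custname, number FROM phone ORDER BY rowid"
    ).fetchall()

def as_xml(rows):
    """Render rows as XML; changing the XML spec only means changing this function."""
    root = ET.Element("phones")
    for custname, number in rows:
        ET.SubElement(root, "phone", owner=custname).text = number
    return ET.tostring(root, encoding="unicode")

def as_csv(rows):
    """Render the same rows in another form."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()

print(as_xml(fetch()))
print(as_csv(fetch()))
```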

Re:Indexing? (2, Insightful)

captredballs (71364) | more than 12 years ago | (#2581922)


The problems that you mention, both concerning storage space and flexibility of the data model are what XML databases are attempting to solve.

Listing the problems in opposition to the solutions does not make for a good argument.

All about databases (1, Informative)

jayant_techguy (441933) | more than 12 years ago | (#2581630)

Extropia [extropia.com] has a detailed tutorial on databases of all types. XML:DB [xmldb.org] discusses the differences between object-oriented databases, hierarchical databases, and relational databases in detail. You may be interested in DBX [f2s.com], a DBMS that is written completely in PHP and works using XML-style text files as its native format.

XML and RDBMS inconsistencies (2, Interesting)

russcoon (34224) | more than 12 years ago | (#2581632)

In my experience with XML and RDBMS systems, mapping one onto another is always a dicey task. The primary reason (IMHO) is that XML's ability to represent order as well as structure as data doesn't fit into an RDBMS without some work. I've seen people try to map both XML and regular DBs onto each other, and my opinion is that the results don't "feel right" on one side or the other unless great pains are taken to preserve the structure of the XML doc in the DB schema.

That said, I'm not sure a hierarchical DB will necessarily be any better than something like an OODBMS with well-modeled objects.

XML an alternative to db for me... (1)

kellyboy (536872) | more than 12 years ago | (#2581634)

It's my choice of DB for my website, since my website hosting co doesn't provide MySQL or any other DB. It has all the XML modules you need to use XML as a database... it's a convenience. It's very portable. And it's easy to read.

Hierarchical vs relational dbs (2, Insightful)

ShmakDown (536071) | more than 12 years ago | (#2581635)

I don't think that hierarchical DBs have any real chance of taking over or replacing relational DBs in the future. There may start to be more of a place for them, but many application service providers that use XML still have a fair amount of relational data that needs to be maintained. XML is mainly being used for communication protocols and not so much for internal data structure storage. I think the more likely DB trend in the future will be for many users to maintain both relational and hierarchical databases.

Re:Hierarchical vs relational dbs (1)

wadetemp (217315) | more than 12 years ago | (#2581644)

If you've ever looked at any of the OLAP technologies, it's very simple to take data in an existing relational structure and map it to a hierarchical structure for easier user analysis. This isn't necessarily maintaining both types of databases, but rather building one type off another type for analysis purposes. *whisper (Microsoft has some great software to let you do this on SQL Server...) :)

Its time (-1, Troll)

Anonymous Coward | more than 12 years ago | (#2581638)


The universe belongs to us.

Northern Troll Alliance.

LDAP, the hierarchical database that works (1)

dmelomed (148666) | more than 12 years ago | (#2581643)

With all this hype about XML and the ubiquity of SQL, LDAP directories do not get the attention they deserve. How many of you have installed SQL-based authentication at your site, just to find out how limited a solution it is (maintain more than one database for all kinds of authentications, do you?). Not only does LDAP allow for a flexible hierarchical directory, it's also a standardized Internet protocol whereas SQL isn't. With LDAP, many applications work out of the box because it's a standard. Oh yeah, there's also the OSS server available at openldap.org.

Re:LDAP, the hierarchical database that works (0)

Anonymous Coward | more than 12 years ago | (#2581676)

One place I worked was using SQL for such things - when I asked about LDAP, the answer was "but we have people who know SQL" - if you hit a square peg hard enough, it'll probably go through the round hole....sigh...

Take two (0)

Anonymous Coward | more than 12 years ago | (#2581647)

IMHO, XML documents should be handled in both hierarchical and relational ways: the first for efficient long-term storage and the other for transactions.

Re:Take two (1)

ShmakDown (536071) | more than 12 years ago | (#2581661)

I agree; the problem there lies in finding the best way to convert, or in using a different approach like redundancy. I like the idea of being able to store in both ways so that the lookup still happens quickly.

The priorities are wrong.... (2, Interesting)

el_mex (175423) | more than 12 years ago | (#2581657)

A data format will NEVER dictate a system's design. XML is nothing other than a data format.


The relational model has no major shortcomings. The only thing XML offers that is not already very well done is easier data interchange. As a database administrator, I can tell you there is NO chance XML will dictate a change of how we store data. There are much higher priorities in database management than easier data interchange.

Re:The priorities are wrong.... (1)

ShmakDown (536071) | more than 12 years ago | (#2581674)

Wake up. Data formats dictate system design all the time! Systems work better when they are fine tuned to work with their data sets well!

Re:The priorities are wrong.... (1)

el_mex (175423) | more than 12 years ago | (#2581688)

Data formats dictate system design all the time!


Really? So I guess if a system processes a data file with a flat structure it would make good design if the database is designed flat to fit the data format?


Systems work better when they are fine tuned to work with their data sets well!


You're blurring the line. I referred to data formats as they pertain to a data file, not as they do to a database design (If I had meant "database design" I would have said "database design").

Re:The priorities are wrong.... (1)

captredballs (71364) | more than 12 years ago | (#2581703)

"No" major shortcomings? Traversing complex data sets can often be incredibly difficult in relational databases. Additionally, relational data models are generally very difficult to modify.

In particular, I'm thinking about complex scientific data sets where you may wish to "select" based on criteria that may not be keyed.

Re:The priorities are wrong.... (1)

el_mex (175423) | more than 12 years ago | (#2581725)

I do not get your comment...

I'm thinking about complex scientific data sets where you may wish to "select" based on criteria that may not be keyed

By "keyed" do you mean indexed? Why is that hard? Time-consuming, yes (with large data sets), but "incredibly difficult"? Why is it incredibly difficult? Doesn't it involve just a single SQL statement, keyed or not?

Again, would XML address this?

Why relational databases dominate (5, Insightful)

coyote-san (38515) | more than 12 years ago | (#2581660)

Relational databases didn't come to dominate the database market because they pushed aside equally valid alternatives, they dominate the market because relational databases implement relational calculus. Indeed, that's the very touchstone that distinguishes relational databases from something like DBM and its many descendants.

And *that* is important because it assures the designer and user that every possible operation is well-defined and (hopefully) correctly implemented. The exact syntax for a "join" may differ, and a specific implementation may be flawed, but everyone agrees to a common baseline.

For hierarchical databases to really take off, they need to have an equally strong mathematical underpinning. For now, AFAIK, there is none other than what you get when you map a hierarchical database into relational tables and use exactly those relational properties. That's a good start, but if you're only using the properties in relational databases, why not stick with them?

As for XML, that's completely irrelevant. It's a good format for transferring data, but that's about it. You can store hierarchical data in an XML file, but you can also use it to store purely relational data or completely unstructured data (in some CDATA block).

I'm currently working on a paper about this... (3, Interesting)

Carnage4Life (106069) | more than 12 years ago | (#2581671)

Hi,
I wrote a paper on native XML databases and SQL databases that support XML [25hoursaday.com] that appeared on Slashdot [slashdot.org] a little while ago. While doing research for that paper I asked myself the same question, whether instead of coming up with hybrid methods to store relational and hierarchical data we should store XML in already existing hierarchical databases. Unfortunately things are not so clear cut.

First of all, a lot of data out there is relational and people aren't ready or willing to transition all that data to XML-based storage, so mixing of relational and XML data will probably be with us for a while. The biggest problem with object oriented databases is that they didn't understand this fundamental issue, but it seems that with XML databases the vendors understand that hybrid data will be with us for quite a while, which is why Tamino supports importing data from relational sources and even ships with a SQL engine.

Secondly, XML documents have a lot of metadata beyond the hierarchical parent-child relationships, such as processing instructions, comments and entities, which require more intelligent support from the database than just storing parent-child relationships.

Finally, all the major [commercial] relational database vendors have included some sort of native support for XML, including XML types, and there is an ANSI standard in the works [sqlx.org] for combining XML and SQL. From what I've seen, none of the hierarchical databases plan to support XML as much as the relational databases have or plan to.

Now if you were simply asking whether a native XML database can be built on top of a hierarchical database then I believe the answer is yes. Then again native XML databases can and have been built on object oriented databases and relational databases, so it makes sense that they can be implemented in a database system that is more suited to handling hierarchical data.

XML Data Bloat (2, Insightful)

trp0 (155951) | more than 12 years ago | (#2581685)

It certainly seems like the same thing is happening with XML that happens with any new toy: "my friend told me XML was cool for stuff, so I'm going to convert everything to XML so I can be cool too."

I was pretty sure that XML was useful in that it was a human-readable data-encoding mechanism that "average" users could get a grip on and utilize in sharing information between heterogeneous systems, but it seems like people are completely missing the point these days in how to use XML effectively.

A lot of the benefit of using XML is quickly becoming negated by everyone coming up with their own DTDs and the lack of standard formats for encoding data that is to be shared. As an example, here at the university I attend, there is a project for sharing information about biological species' population data amongst sister organizations. The goal is to make the information possessed by all these organizations available to all the others. The trouble is that they have all come up with their own format for storing the data they collect and cannot agree on what standard should be used, so each organization is encoding all its information with a different XML labeling scheme. My first question was: "Why in the heck are you using XML to encode the data anyway?" It seems easier and saner to just store it in your relational database and make the database accessible to sister organizations, who can then encode the information however they want for their end-users through their client applications, rather than having the organization holding the information impose order on people wanting access to it.

To make a long story short, XML encoding doesn't help you store the information more efficiently at all, and with the state of the "formatting standards" today it doesn't even really provide an efficient way of sharing information between organizations or an efficient way of encoding the information for transmittal to other organizations. It seems as if people are missing the forest for the trees in how XML can be useful in its relation to data encoding, and we should stick with our trusty ole relational and object-oriented database models as they have shown their usefulness and efficiency.

Re:XML Data Bloat (1)

el_mex (175423) | more than 12 years ago | (#2581709)

Agree with you 100%. XML is not a database. XML is just a formatted, human-readable export file. If you run a database on top of the XML file, you will come up with a non-optimal system. The fact that it is human readable takes away from the computer's ability to read the data easily.

I dismissed XML altogether when people started to claim it was going to save the world. The situation is still ridiculous, some people just do not get it. In my database I want uptime, redundancy, speed, and recoverability. Does XML address any of these issues?

Re:XML Data Bloat (3, Insightful)

Skapare (16644) | more than 12 years ago | (#2581864)

XML is just a formatted, human-readable export file.

Human readable?

I suppose you don't mind it when someone sends you mail, and you see a bunch of tags all over the place because it's in HTML. XML is just the same kind of thing ... all cluttered with tags. The computer can read XML easier and more quickly than humans. Sure it could read it even faster if it didn't have to parse all those tags. But I wouldn't call this a design intended for humans to read.

Re:XML Data Bloat (3, Informative)

JordanH (75307) | more than 12 years ago | (#2581977)

  • Human readable?

    I suppose you don't mind it when someone sends you mail, and you see a bunch of tags all over the place because it's in HTML. XML is just the same kind of thing ... all cluttered with tags. The computer can read XML easier and more quickly than humans. Sure it could read it even faster if it didn't have to parse all those tags. But I wouldn't call this a design intended for humans to read.


The XML isn't human readable, but browsers and other applications can make pretty good guesses at a nice human readable representation.

Further, you can define style sheets to produce different views, with data that would be unimportant to a particular human (or application) elided.

It may be oversold, but the point is that the data definition is well defined such that writers and readers (often human readers, also applications) can interact more easily. It's about portability of data, of which readability is a subset.

XML not meant as a replacement for RDBMSs (4, Interesting)

bwt (68845) | more than 12 years ago | (#2581693)

Or is it better simply to use a relational database that can output in XML, or script your way to achieve the same goal?"

I believe that RDBMSs should add functionality to read/write XML, especially as the XML Schema recommendation is basically done.

The idea that XML should be the permanent storage format is a bad one. There is a lot of power in a normalized data model: it enforces data integrity while eliminating data fragmentation automatically, and it minimizes transaction resources.

Consider XML representations for different entities that all share some kind of child entity. For example: people, businesses, and schools all share addresses. In XML, you want the addresses to appear in the description of the individual object. Does that mean you want to store the addresses separately that way? Absolutely not, because then when you enforce constraints or ask questions about addresses, your data is fragmented in three places. For that matter, how do you know all the entities that might use addresses? In an RDBMS, you can inspect all the foreign keys to the address entity. What's the XML analog?
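
To make the foreign-key point concrete, here is a small sketch using SQLite through Python's sqlite3 module. The four tables and the `tables_referencing` helper are hypothetical; `PRAGMA foreign_key_list` is SQLite-specific, and other RDBMSs expose the same information through their own catalog views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: three entity types share one address table.
conn.executescript("""
CREATE TABLE address  (id INTEGER PRIMARY KEY, street TEXT, city TEXT);
CREATE TABLE person   (id INTEGER PRIMARY KEY, name TEXT,
                       address_id INTEGER REFERENCES address(id));
CREATE TABLE business (id INTEGER PRIMARY KEY, name TEXT,
                       address_id INTEGER REFERENCES address(id));
CREATE TABLE school   (id INTEGER PRIMARY KEY, name TEXT,
                       address_id INTEGER REFERENCES address(id));
""")

def tables_referencing(conn, target):
    """Return (table, column) pairs whose foreign key points at `target`."""
    found = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # Row layout: id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            if fk[2] == target:
                found.append((table, fk[3]))
    return found

referrers = tables_referencing(conn, "address")
print(referrers)
```

Every entity that uses addresses shows up in that list, and the addresses themselves live in exactly one place.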

Pros and Cons (2, Informative)

Multispin (49784) | more than 12 years ago | (#2581702)

I work for a company that has been doing hierarchical DBMSs for years. The company is Applied Technical Systems [apptechsys.com] . We make a database engine called CCM.

XML is a great way for exchanging data, but the term XML databases is very misleading. If the database engine actually stores data in native XML, it's going to be *very* slow. I think the point behind XML is that nobody should really have to care what your backend is as long as you can export reasonable XML. Note that I say reasonable XML. An XML export that simply encodes the rows and fields in a table to XML with <row> and <col> tags is NOT reasonable. It conveys no actual knowledge of the real structure of the data.

Storing XML data in a relational DB can either be a very hard problem or a very easy one. Let me explain. You could look at some XML and define a DB schema for it, not too hard to do. Problem? It's not generic; a human has to redo it each time the XML structure changes. The alternative is to store it all in one big table and index the hell out of it. Problem? It's slow. At that point you aren't using any structure of the XML or the power of relational DBs.
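
For the "one big table" alternative, a rough sketch of what that usually looks like: every element becomes a row with a parent pointer and a position among its siblings. The `node` table, its columns and the `shred` function are assumptions for illustration, and attributes and tail text are ignored to keep it short.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred(conn, xml_text):
    """Store every element of a document as one row in a generic node table."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS node (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER,            -- NULL for the document root
            position  INTEGER NOT NULL,   -- order among siblings
            tag       TEXT NOT NULL,
            text      TEXT
        )""")
    # "Index the hell out of it": by parent/order and by tag name.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS node_parent ON node (parent_id, position)")
    conn.execute("CREATE INDEX IF NOT EXISTS node_tag ON node (tag)")

    def insert(elem, parent_id, position):
        cur = conn.execute(
            "INSERT INTO node (parent_id, position, tag, text) "
            "VALUES (?, ?, ?, ?)",
            (parent_id, position, elem.tag, (elem.text or "").strip()))
        node_id = cur.lastrowid
        for i, child in enumerate(elem):
            insert(child, node_id, i)
        return node_id

    return insert(ET.fromstring(xml_text), None, 0)

conn = sqlite3.connect(":memory:")
doc = "<order><item>widget</item><item>gadget</item><note>rush</note></order>"
root_id = shred(conn, doc)

# Children of the root, in document order.
children = conn.execute(
    "SELECT tag, text FROM node WHERE parent_id = ? ORDER BY position",
    (root_id,)).fetchall()
print(children)
```

It is generic, since no schema change is needed when the XML changes, but anything deeper than one level means a self-join per level, which is where the slowness comes from.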

I'm a firm believer that efficient XML storage, querying and retrieval will require a hierarchical database. The problem is that there are several features (bugs IMHO) in XML (and XPath) that, in a way, are throwbacks to relational DBs. IDREFs and the notion of document order particularly bug me. I ran into these this summer when I was on a team trying to build an XPath and XQuery front end for CCM.

We're gradually seeing the XML world change. Early XML documents were similar to the type mentioned above. They were flat. When you start adding depth the information inherent in the structure of the data becomes apparent. Another thing I'm glad to see the industry move away from is the notion that XML resides in files. Many (if not all) of the early XML parsers made this assumption. It was a pain in the ass to parse from some other source, like a buffer in memory.

Strictly speaking (1)

bikiniAtoll (442292) | more than 12 years ago | (#2581714)

Strictly speaking the relational model doesn't specify how data is stored, only how it is retrieved. XML is a storage format; there is no reason why a relational database couldn't use XML for storage.

Repeat after me ... (5, Informative)

Serpent Mage (95312) | more than 12 years ago | (#2581717)

XML is not a magic bullet. Relational databases won out over the hierarchical model for a lot of reasons. For instance, there are a number of integrity constraints with the hierarchical model, such as

1) No record occurrences except root records can exist without being related to a parent record occurrence. This means that
a) a child record cannot be inserted unless it is linked to a parent record.
b) a child record may be deleted independently of its parent; however, deletion of the parent record automatically results in the deletion of all its child and descendant records.
c) the above rules do not apply to virtual child records and virtual parent records.

2) If a child record has 2 or more parent records from the SAME record type, the child record must be duplicated once under each parent record.

3) A child record having 2 or more parent records of DIFFERENT record types can do so only by having at most 1 real parent, with all the others represented as virtual parents. IMS limits the number of virtual parents to 1.
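
Rules 1a and 1b can be imitated in a relational engine, which gives a feel for how they behave. The sketch below uses SQLite from Python with a made-up `parent`/`child` schema; it is only an emulation of the behavior described above, not how IMS or any hierarchical DBMS is implemented, and it does not cover virtual parents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE child (
    id        INTEGER PRIMARY KEY,
    name      TEXT,
    -- Rule 1a: every child must be linked to an existing parent.
    -- Rule 1b: deleting the parent deletes its children.
    parent_id INTEGER NOT NULL REFERENCES parent(id) ON DELETE CASCADE
);
""")

conn.execute("INSERT INTO parent VALUES (1, 'dept')")
conn.execute("INSERT INTO child VALUES (1, 'alice', 1)")
conn.execute("INSERT INTO child VALUES (2, 'bob', 1)")

# Rule 1a: a child with no real parent is rejected.
try:
    conn.execute("INSERT INTO child VALUES (3, 'orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Rule 1b: a child can go on its own, but removing the parent takes the rest.
conn.execute("DELETE FROM child WHERE id = 2")
conn.execute("DELETE FROM parent WHERE id = 1")
remaining_children = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]

print(orphan_rejected, remaining_children)
```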

In addition to these flaws, relational databases have had over a decade to become mature, optimized, and enterprise scalable. Hard drive partitioning for databases such as Oracle works out perfectly with the cylinders, sectors, and tracks of a hard drive to allow for the fastest possible read/write times.

Too often people see that XML "can" do so many things and decide that it should be the way things are done, but XML is NOT a magic bullet, and just because it has the potential to do something does not make it the best methodology for doing so.

A website worser than goatse.cx! (-1, Troll)

Troll XP (535651) | more than 12 years ago | (#2581737)

Here it is! [redherring.hm]

Re:A website worser than goatse.cx! (-1, Troll)

Troll XP (535651) | more than 12 years ago | (#2581748)

But only windoze users will get it!

With SDF, is the time right for relational dbs :) (1)

teambpsi (307527) | more than 12 years ago | (#2581761)

I really wish people would stop focusing on the INTERCHANGE format and focus on the abstract implementation details.

Just about any hierarchical store CAN be implemented in a relational database -- they are called "intersection entities".

Trivial and fast (when indexed) to manage one-to-many and many-to-many relationships.

Complete with constraint checks if you so desire.
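
For anyone who hasn't met the term, a minimal sketch of an intersection entity, using SQLite from Python. The `author`, `book` and `authorship` tables are invented for the example; `authorship` is the intersection table, and its composite primary key plus foreign keys are the constraint checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE book   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- The intersection entity: one row per (author, book) link.
CREATE TABLE authorship (
    author_id INTEGER NOT NULL REFERENCES author(id),
    book_id   INTEGER NOT NULL REFERENCES book(id),
    PRIMARY KEY (author_id, book_id)
);
-- The primary key already indexes lookups by author; add the other direction.
CREATE INDEX authorship_by_book ON authorship (book_id);
""")

conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO book VALUES (?, ?)", [(1, "Trees"), (2, "Tables")])
conn.executemany("INSERT INTO authorship VALUES (?, ?)", [(1, 1), (2, 1), (2, 2)])

# Many-to-many in both directions, each a single join.
books_by_bob = [r[0] for r in conn.execute(
    "SELECT book.title FROM book "
    "JOIN authorship ON authorship.book_id = book.id "
    "WHERE authorship.author_id = 2 ORDER BY book.title")]
authors_of_trees = [r[0] for r in conn.execute(
    "SELECT author.name FROM author "
    "JOIN authorship ON authorship.author_id = author.id "
    "WHERE authorship.book_id = 1 ORDER BY author.name")]

# Constraint check: a link to a book that does not exist is refused.
try:
    conn.execute("INSERT INTO authorship VALUES (1, 42)")
    dangling_link_rejected = False
except sqlite3.IntegrityError:
    dangling_link_rejected = True

print(books_by_bob, authors_of_trees, dangling_link_rejected)
```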

The greatly exaggerated demise of ODBMS should point out the problem of adoption: What problem does this solve that I cannot solve using what I already know?

or to parody Dr. Ian Malcolm in Jurassic Park

"you were so busy using BLOBS in relational databases, you didn't stop to consider whether you SHOULD" :P

XML is not only hierarchical (0)

Anonymous Coward | more than 12 years ago | (#2581776)


XML has several standard hyperlinking paradigms (ID/IDREF, XLink, HyTime, TEI) which allow for the creation of non-hierarchical relationships.
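
A quick sketch of the ID/IDREF idea, using Python's ElementTree. The document, the `id` and `authors` attribute names, and the `authors_of` helper are made up; ElementTree does not read a DTD, so treating these attributes as ID and IDREFS is a convention assumed here rather than something the parser enforces.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: books point at authors by ID instead of nesting them.
doc = ET.fromstring("""
<library>
  <author id="a1"><name>Ann</name></author>
  <author id="a2"><name>Bob</name></author>
  <book title="Shared Work" authors="a1 a2"/>
  <book title="Solo Work" authors="a2"/>
</library>
""")

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in doc.iter() if el.get("id") is not None}

def authors_of(book):
    """Resolve the space-separated references in the authors attribute."""
    return [by_id[ref].findtext("name") for ref in book.get("authors").split()]

credits = {b.get("title"): authors_of(b) for b in doc.findall("book")}
print(credits)
```

Bob appears under two books without being a child of either, which is exactly the many-to-many case a strict tree cannot express.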

Also, I don't like hearing so many people talk about using DB technology to hold XML data. Unless you are talking about a document management system, you really ought to be thinking of XML as an interchange format only.

In terms of the ability to represent information, XML tries to solve much of the same problem as a database: providing a framework for arbitrary structure. A database does it in a way that is highly optimized for query and modification speed, XML tries to do it in a way that is optimized for interchange and platform-independent processing.

Storing XML fragments in database fields is an odd thing to do, but I see more and more people doing it. I guess in an ideal world, your database schema would go down to the exact level of granularity you might be using the XML to capture. It seems half-assed to me to use a DB for high-level structure, then inside your records you have some other completely different type of structure. I guess people like this, though, since the DB vendors have all added technology to enable this.

To me it just means that people spend less time thinking about and designing the information models, and it is yet another case of software features shaping requirements (when of course we all know it should be vice versa).

impedance mismatch (2)

tim_maroney (239442) | more than 12 years ago | (#2581778)

I was surprised to see so many questions of the form "what's wrong with relational databases"? Relational databases have a well-known problem called "impedance mismatch" when mapping multi-linked object structures. Many links on the impedance mismatch issue can be found at this Google search [google.com] .

Anyone who has tried to take a natural set of application-side objects and map them onto a relational database is already quite familiar with the problems created by the proliferation of tables needed to map simple application data structures, as well as the large amount of development effort needed to deal with simple relationships that would be trivial to specify in an object model such as Java's or XML's.

There is clearly a need to move on to object databases, but installed base and skill set inertia have blocked this transition, with the result that database-oriented applications have remained hamstrung in their friendliness and feature set.

Tim

OO? (1)

Nevrar (65761) | more than 12 years ago | (#2581784)

I'm rather ignorant on this subject, but surely XML data could be viewed as being object oriented? If this is the case, then surely an OODBMS or more practically an Object-Relational DBMS could be used.

In case you were wondering if there are any out there, check out PostgreSQL [postgresql.org] which is way cooler than MySQL (it's open-source for a start)

Having worked in the industry... (1)

SerpentMage (13390) | more than 12 years ago | (#2581786)

Having just quit a pure XML company (can't say the name of the company, but let's just say I can now spy on the company instead of working there), I have to say that pure XML databases will most likely not pick up.

The reason why they will not pick up is not because they are not good in their own domain, but simply because the legacy of SQL is simply too large. To make XML do what SQL does today is about eight years away.

In those eight years the amount of SQL data will become huge, and the problem will be converting it. For example, if you have a multi-terabyte database, how can you ensure there are no errors in transferring it? Hard disks have an error rate that works out to one in a billion. Now put that on a multi-terabyte database and that means a megabyte of data may be faulty at best. This means that somebody will have a screwed up account.

This means the best solution is status-quo since the status quo works and does the job correctly.

I even predict that in ten years the "programmer" will almost cease to exist. In ten years we will become data mining extractors. Sure there will still be a task of extracting data using programs, but the main concern will be managing the data and figuring out interesting things from it.

What all this means is that we will live for a long time to come with SQL. Ok there may be XML adaptors, but it will still be SQL...

Oh god, please no. (1)

rhinoX (7448) | more than 12 years ago | (#2581821)


I took a database course a year ago in which we had to deal with a hierarchical database. It is absolutely awful.

Some thoughts... (5, Interesting)

Coventry (3779) | more than 12 years ago | (#2581822)

I have been struggling with these issues for awhile now, for various reasons. Why? Because I like Zope [zope.org] , but am, like most developers, more comfortable with relational data structures.

Zope uses an object database known as the ZODB. Some forms of many-to-many relationships and such can be handled via the use of selection and multi-selection properties, which are designed to distinguish between a selected element and the list of available elements. The list of elements can be derived from a property on the current object, a property on a parent object, or be created via a method call - allowing for non-traditional (for OODBMS) cross-linking of objects. Of course, since this sort of thing is a workaround, no true relational links are created... 'Soft Relations' may be ok for MySQL [mysql.org] , but in big application development, relationships must be enforced! Thus, the big-boys in RDBMS all enforce foreign keys (mysql does not)...

Of course, I've found that by careful creation of object hierarchies, very complex applications can be created on top of an OODBMS that are in fact more robust, in some ways, than their relational counterparts. The biggest hurdle (short-term) I see to OODBMS (including ones based upon XML [the ZODB can export objects as XML but they are stored differently internally]) is the lack of a true query and data manipulation language like SQL. Sure, OQL exists, and is even technically a standard, but it A) sucks and B) is geared towards large Java applications with huge amounts of active objects, not general-purpose OODB queries. Thus, without such a language, OODBMS are all dissimilar in how one queries and creates/updates data, and in many cases the only interface is a truly procedural one! Thus OODBMS are forced to use proprietary tools, and are locked into one system, not to mention that speed of development (something normally associated with OO development and OODBMS in general) is hindered by the excessive number of procedural calls one needs simply to query their data...

Recently, an add-on to Zope addressed some of these issues. Called 'ZOQL' - it uses a SQL-like syntax and allows for very discrete querying of the ZODB (something one had to do programmatically using the 'ZCatalog' before) with all of the familiar aggregate and comparison operators SQL users love... Of course, this _still_ doesn't address the issue of soft-relationships:

I think the biggest hurdle to OODBMS in the long term (tools like ZOQL are interfaces to existing systems, and thus can be implemented easily) is the lack of relationship handling. It seems that most RDBMS force a developer to think in relational terms about the data, and most OODBMS force you to think in terms of objects... Most problems can be mapped to either of these domains, but you are forcing the data-model-type onto the problem. What is needed is a hybrid system, an 'Object-Relational' DBMS. That is to say, OODBMS makers need to desist from the traditional OO idea that relations come only in the following types:
  • Object A is a Object B
  • Object A Has a/many Object B(s)
What RDBMS systems excelled in (and thus fell into popular use for) was ease of management and allowing common data to be moved and grouped. A 'Look-up Table', for instance, which simply holds a list of common data (an enumerated list) and can be centrally maintained, is a boon in the RDBMS world. For example, you have a lookup table of car manufacturers, and one of them changes its name... Instead of updating all N cars that are made by the manufacturer, you simply update the single record in the lookup table. Since each car would have something akin to a 'Manufacturer_ID' column linking it to the lookup table, the cars belonging to the manufacturer are all taken care of.
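
A small sketch of the lookup-table rename, using SQLite from Python. The table and column names and the sample rows are purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: cars point at a manufacturer lookup table by ID.
conn.executescript("""
CREATE TABLE manufacturer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE car (
    id              INTEGER PRIMARY KEY,
    model           TEXT NOT NULL,
    manufacturer_id INTEGER NOT NULL REFERENCES manufacturer(id)
);
""")
conn.execute("INSERT INTO manufacturer VALUES (1, 'Datsun')")
conn.executemany(
    "INSERT INTO car (model, manufacturer_id) VALUES (?, 1)",
    [("Sunny",), ("Bluebird",), ("Cherry",)])

# The manufacturer changes its name: one row is updated, not N cars.
conn.execute("UPDATE manufacturer SET name = 'Nissan' WHERE id = 1")

rows = conn.execute(
    "SELECT car.model, manufacturer.name FROM car "
    "JOIN manufacturer ON manufacturer.id = car.manufacturer_id "
    "ORDER BY car.model").fetchall()
print(rows)
```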

How does one do this in a hierarchical system? Well, the easy answer would be that each manufacturer object contains all the cars that manufacturer makes. Simple, right? WRONG. Why?

Because each car also has a body-type (compact, sedan, SUV, truck, van, etc...) - which in a relational database would simply be another lookup table, but in an OODBMS poses data management issues. Do we put body-type higher than manufacturer? If so, then we have to maintain the list of manufacturers for each body type, causing headaches. Or do we put body-type below manufacturer, causing us to need to maintain a separate list of body types for each manufacturer - these lists of course need to match exactly if we ever plan on being able to search or do reports based upon all cars of a specific body type.
Sadly enough, this sort of separate-enumeration-relationship isn't implemented (well) in any OODBMS I've found.
Take the ZODB for example: its selection and multi-selection lists try to handle this situation, but fail because relational integrity is not maintained! That is to say, behind the scenes it's not a true reference to a value in the enumerated list, but just a text entry representing a value in the list. If the value in the list changes, the selection-property does not update, leaving you with the equivalent of MySQL's bastard-children, the orphaned records.
This sort of soft-relationship handling is ugly and BAD for maintainability, but OODBMS users are faced with two ugly choices each time they map such a relationship: Do I store this as a plain-text property and just update N records each time this changes, or do I map it into the hierarchy and deal with the headaches incurred by doing so...?
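
To show the difference between a true reference and a "soft" text copy, here is a toy sketch in plain Python. The `BodyType` and `Car` classes are invented for illustration and are not how the ZODB actually stores selection properties; they only mimic the behavior described above.

```python
class BodyType:
    """One entry in the enumerated list of body types."""
    def __init__(self, label):
        self.label = label

class Car:
    def __init__(self, model, body_type):
        self.model = model
        # True reference: follows the enumerated entry wherever it goes.
        self.body_type = body_type
        # Soft relationship: just a copy of the text at selection time.
        self.body_type_text = body_type.label

suv = BodyType("SUV")
car = Car("Explorer", suv)

# The entry in the enumerated list is renamed.
suv.label = "Sport utility"

print(car.body_type.label)   # sees the new label through the reference
print(car.body_type_text)    # still holds the stale text
```

With N cars holding the text copy, the rename means finding and updating all N of them by hand.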

I don't think I've answered the question, but hopefully I've at least shed some light on the subject for members of both the OODBMS camps and RDBMS camps... Now if only a useful ORDBMS were to come along...

(Note that PostgreSQL and some other RDBMS actually can be used in a semi-OO manner, but this is usually reserved for inheritable structures of data to be used for specific extensions to the data model - thus the SUV table inherits from the Cars table and adds some columns - but all other relationships SUV has will still be relational)

A Hierarchy of Myth (2, Interesting)

droleary (47999) | more than 12 years ago | (#2581831)

While a hierarchy is often used by humans to organize and structure things, that should in no way impact how the data/information/objects are treated as individuals. Look at the common file system hierarchy and it's easy to see that burying files under a hierarchy of directories actually makes access to that information harder. It wasn't so noticeable when we were all just managing a few MB of files, but now people are beginning to store large picture, movie, and sound libraries. File managers have mistakenly stuck with the hierarchy instead of using information associated with the file itself (ID3 tags, etc.) to organize it all. What is really needed is a better approach to representing metadata so that information can be accessed directly based on those metadata attributes and not have it hidden in the hierarchy. I have a short essay on this from the work I've been doing on a Meta Object Manager (MOM), but it needs to be cleaned up before it could be published.

The desire to impose a hierarchy on the data itself instead of considering a hierarchy as simply one view on the data is a step backwards. Nobody who manages large amounts of data is looking to jam it into a static hierarchy, and so XML is not an answer, nor is any hierarchical representation.
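
One way to picture "hierarchy as a view" is a flat attribute index from which a tree is built on demand. This is only a sketch under assumptions: the `MetaIndex` class and its method names are made up for illustration and are not the Meta Object Manager mentioned above.

```python
from collections import defaultdict


class MetaIndex:
    """Toy attribute index (hypothetical; not the poster's MOM)."""

    def __init__(self):
        self._by_attr = defaultdict(set)  # (key, value) -> set of object ids
        self._meta = {}                   # object id -> its attributes

    def add(self, obj_id, **attrs):
        self._meta[obj_id] = attrs
        for item in attrs.items():
            self._by_attr[item].add(obj_id)

    def find(self, **attrs):
        """Return the ids whose metadata matches every given attribute."""
        sets = [self._by_attr.get(item, set()) for item in attrs.items()]
        return set.intersection(*sets) if sets else set(self._meta)

    def view(self, *keys):
        """Build a hierarchy on demand from the chosen attribute order."""
        tree = {}
        for obj_id, attrs in self._meta.items():
            node = tree
            for key in keys:
                node = node.setdefault(attrs.get(key, "unknown"), {})
            node.setdefault("_items", []).append(obj_id)
        return tree


idx = MetaIndex()
idx.add("a.mp3", artist="Miles Davis", genre="jazz")
idx.add("b.mp3", artist="John Coltrane", genre="jazz")
idx.add("c.mp3", artist="Miles Davis", genre="fusion")

direct = idx.find(artist="Miles Davis", genre="jazz")
by_genre = idx.view("genre", "artist")
by_artist = idx.view("artist", "genre")
```

The same three objects yield a genre-first tree or an artist-first tree without being moved, and `find` reaches them without walking either tree.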

Best of Both Worlds? (1)

modulo (172960) | more than 12 years ago | (#2581832)

I use Intersystems' [intersystems.com] Cache at work - under the hood it's hierarchical, so it would seem to be a good fit for XML (I hear more XML stuff is coming for version 4.2), but it also projects everything as relational tables through ODBC, and simultaneously as objects, through ActiveX and Java. (They're dropping CORBA, not enough interest apparently.) I find it a little tough to program natively in, but it's gotten a lot better with version 4.x. And it runs quite well under Linux. That's the platform I use :-)

persistence layer (2, Informative)

budGibson (18631) | more than 12 years ago | (#2581855)

In design, the logical construction of the program and its data structures should be relatively independent of the physical implementation of said.

Basically, as I read your question, you are using a logical design that is hierarchical (an object structure expressed in XML) and wondering if it would not make more sense to store it in a hierarchical database. Maybe.

However, relational databases form the current state of the art and have been highly optimized such that any theoretical performance gains from better matching of logical structure to physical lay-out in the database are likely outweighed. More generally, by insisting on a match between logical and physical lay-out, you would potentially be limiting yourself to a specific physical implementation, one that may not provide good performance relative to others.

A better solution to your problem might be something referred to as a persistence layer. This adds another layer of abstraction to your application, in the form of a mapping, between your logical design and your actual physical mode of storage. There now exist publicly available free (as in beer, and in some cases open-source) tools that will automate this mapping. Generally, any performance hit from the abstraction should be made up in the speed of the superior physical implementation, and the freedom to switch later is also important.
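
A rough sketch of the idea, in Python rather than Java. The `Store` interface and both backends are hypothetical, and for brevity they serialize whole objects as JSON instead of mapping fields to columns the way a real mapping tool would.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class Store(ABC):
    """The only storage interface the application is allowed to see."""

    @abstractmethod
    def save(self, key, obj): ...

    @abstractmethod
    def load(self, key): ...


class MemoryStore(Store):
    """Physical implementation 1: a plain dictionary."""

    def __init__(self):
        self._data = {}

    def save(self, key, obj):
        self._data[key] = json.dumps(obj)

    def load(self, key):
        return json.loads(self._data[key])


class SqliteStore(Store):
    """Physical implementation 2: a relational table."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS objects (key TEXT PRIMARY KEY, body TEXT)")

    def save(self, key, obj):
        with self._conn:  # commit on success, roll back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO objects VALUES (?, ?)",
                (key, json.dumps(obj)))

    def load(self, key):
        row = self._conn.execute(
            "SELECT body FROM objects WHERE key = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])


def register_order(store, order_id, order):
    """Application code: written once, unaware of the physical storage."""
    store.save(order_id, order)
    return store.load(order_id)
```

Swapping `MemoryStore()` for `SqliteStore()` changes nothing in `register_order`, which is the freedom to switch later that the paragraph above refers to.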

Two that exist for Java are Castor, available from exolab [exolab.org], and a pilot implementation for Sun's emerging Java Data Objects standard (see http://java.sun.com [sun.com] for that tool).

Bad Question (2)

smack.addict (116174) | more than 12 years ago | (#2581869)

First, hierarchical databases' weaknesses are not limited to many-to-many relationship modeling. The simple fact is that some data is better represented in a hierarchical fashion (a directory service IS a hierarchical database) and others in a relational fashion. XML is a tool for exposing that data to external sources regardless of its internal representation(s).

Mapping XML onto a relational model (2, Interesting)

ccf (116263) | more than 12 years ago | (#2581884)

The enXyme [sourceforge.net] project attempts to map XML data onto a relational database schema. The goal is to allow complex, specific queries of XML data. It's not easy to capture ALL of XML, with all its possibilities, but you can do parts of it. The project already has a basic XML schema parser, a script that takes an XML schema and generates a series of sql CREATE statements that reflect the hierarchy described by the schema.
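
To give a feel for what such a mapping involves, here is a small sketch that derives CREATE statements from a sample document rather than from an XML schema. The function name and the mapping rule are assumptions made for illustration; this is not enXyme's algorithm, and it ignores repeated leaf elements, mixed content, and name clashes.

```python
import xml.etree.ElementTree as ET


def tables_from_sample(xml_text):
    """Derive CREATE TABLE statements from a sample document.

    Rule used here (an assumption, not enXyme's algorithm): an element
    with child elements or attributes becomes a table; its attributes and
    plain leaf children become TEXT columns; nesting becomes a parent key.
    """
    root = ET.fromstring(xml_text)
    tables = {}  # table name -> ordered list of column definitions

    def add_column(cols, definition):
        if definition not in cols:
            cols.append(definition)

    def visit(elem, parent):
        cols = tables.setdefault(elem.tag, ["id INTEGER PRIMARY KEY"])
        if parent is not None:
            add_column(cols, f"{parent}_id INTEGER REFERENCES {parent}(id)")
        for name in elem.attrib:
            add_column(cols, f"{name} TEXT")
        for child in elem:
            if len(child) or child.attrib:
                visit(child, elem.tag)      # structured child: its own table
            else:
                add_column(cols, f"{child.tag} TEXT")  # leaf: a column

    visit(root, None)
    return [f"CREATE TABLE {name} ({', '.join(cols)});"
            for name, cols in tables.items()]


sample = """
<library>
  <book isbn="1">
    <title>T</title>
    <author><name>A</name></author>
  </book>
</library>
"""
statements = tables_from_sample(sample)
```

Even this toy version shows where the difficulty lies: every XML feature the rule does not cover needs another mapping decision.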

I guess a pure XML database like the ones mentioned in the article would be better at this, but the advantage is that relational dbs are already in wide use.

Let's get this out of the way (1)

ppetru (24677) | more than 12 years ago | (#2581897)

XML is a format. Using it just because you can (which unfortunately represents most of its current usage) is stupid. Examples:

  • XML databases. Excuse me? What sane person would store a massive amount of data with the on-disk format being XML? What's the point?
  • Configuration files (especially the Java stuff is plagued by this). Why on earth would you want to replace "theparameter = thevalue" by "<theparameter value="thevalue">" ? I've seen software embedding an XML parser for the sole purpose of parsing the config file... I don't know about you, but that seems just plain wrong to me.
  • Messaging protocols designed to be lightweight and/or very low overhead. Now don't get me wrong, I think that XML-RPC is very nice and everything, but to use it (for example) for in-cluster message passing of a distributed database would add so much overhead (with exactly zero benefits) that only a marketing guy could take such a decision.

The list could go on and on and on. Just remember: use each tool when it's needed. Don't just put XML in there because it's a nice buzzword, mkay?

PS: on the subject of the article... Again, I fail to see where XML comes in. Hierarchical DBs have been around for a long while, and even if they've been overshadowed by the RDBMSes it doesn't mean they're dead (I'm working on a project which uses a hierarchical database simulated on top of BerkeleyDB. Works wonderfully, and there's no need for SQL. Or XML, for that matter).

Re:Let's get this out of the way (1)

tbien (28401) | more than 12 years ago | (#2582000)

I totally agree...

I once talked to a guy on my current project and he said that every kind of data wants to be stored in its naturally fitting format - what he meant was XML... Sorry, but I don't think so, Sir!

XML is nice when I have to do inter-application communication with unknown communication partners - that's it!!!

The power of XML was its simplicity, but with all those senseless 300+ page specifications around it, it's worth nothing!

Re:Let's get this out of the way (1)

shatteredpottery (320695) | more than 12 years ago | (#2582024)

"...would add so much overhead (with exactly zero benefits) that
only a marketing guy could take such a decision."


AHA! I think you've hit on the real problem here... God knows the marketroids at my company are constantly nagging everyone to use XML for everything, and often come back from a call saying something like "I told Client Z that we use XML for this, this, and that. We do, right?"

Got it backwards (1)

lesinator (459276) | more than 12 years ago | (#2581909)

I think what this really points to is that XML is not the end-all, be-all of formats for exchanging data. All that XML does is bring structured data formats to where databases were in the late 70s (when the Relational model began to take off). Just as databases took the path from flat files to indexed files, to hierarchical databases, to relational databases, XML is just the next step in the path from fixed-width columns, to delimited values, to key/value pairs, to XML. We need at least one more revolutionary change before the transmission structures (XML) catch up to the live datastores (Relational Databases).

Nested relational databases are a better fit (1)

lawley (31214) | more than 12 years ago | (#2581935)

If you have a look at the literature, for example the proceedings of VLDB, you'll see that nested relational databases are, in general, the preferred theoretical model. There are a few quirks to iron out where the semantics don't quite align, but on the whole they are the better approach.


Why? Because the relational component still allows set-at-a-time interaction for efficient querying, but path expressions can also be used to navigate through the nested structure.

Hierarchical DBs run the world right now (1)

philgross (23409) | more than 12 years ago | (#2581937)

IMS, IBM's hierarchical database system, was originally developed for the Apollo space program. It is far from dead as you can read here [ibm.com] . IMS systems are currently storing around 15000 petabytes of data, and executing 50 billion transactions/day. All stored in a hierarchical database system.

The client benefits from XML, not the server. (1)

Twillerror (536681) | more than 12 years ago | (#2581938)

The client is the one that needs the information in an easier-to-read form, not the server.

Relational databases are fast, dependable, and are well proven. The data storage that they provide is very adequate for most situations, if not all.

What really needs to happen is SQL needs to evolve to include XML, and better support for some other situations that constantly arise when writing complex queries.

I wish that the top clause could be used on a join statement.

left join top 1 some_table st ( ot.pk = st.fk ) with order by st.some_column.

Or that there was some syntax for querying out trees of data. Or there was a way to embed code in the joins that could analyze what had been joined at each loop. The data is stored fine in a relational database, sometimes SQL doesn't have the power to extract in a simple manner. I'm sure you can think of many other things that SQL could do a lot better.
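
For comparison, both wishes can be approximated today, sketched here with SQLite from Python. The table and column names are placeholders echoing the pseudo-syntax above: a correlated subquery stands in for "join top 1", and a recursive common table expression (part of standard SQL since SQL:1999, though support varies by engine) walks a tree. The `node` table is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE other_table (pk INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE some_table (id INTEGER PRIMARY KEY, fk INTEGER,
                             some_column INTEGER);
    INSERT INTO other_table VALUES (1, 'a'), (2, 'b');
    INSERT INTO some_table VALUES (10, 1, 5), (11, 1, 3), (12, 1, 9);

    CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO node VALUES (1, NULL, 'root'), (2, 1, 'child'),
                            (3, 2, 'grandchild'), (4, 1, 'sibling');
""")

# "left join top 1 ... order by st.some_column", emulated by joining only
# the single row that a correlated subquery picks for each outer row.
top1 = conn.execute("""
    SELECT ot.pk, st.id, st.some_column
    FROM other_table ot
    LEFT JOIN some_table st
      ON st.id = (SELECT s.id FROM some_table s
                  WHERE s.fk = ot.pk
                  ORDER BY s.some_column LIMIT 1)
    ORDER BY ot.pk
""").fetchall()

# Querying out a tree: start at the root and repeatedly join children.
tree = conn.execute("""
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM node WHERE parent_id IS NULL
        UNION ALL
        SELECT n.id, n.name, t.depth + 1
        FROM node n JOIN subtree t ON n.parent_id = t.id
    )
    SELECT name, depth FROM subtree ORDER BY depth, id
""").fetchall()
```

Neither form is as compact as the wished-for syntax, which is rather the point being made above.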

Looking at one to many or many to many relationships in the standard two dimensional data set is a pain. Especially when you have several joins that can duplicate data. ColdFusion ( I know it sucks ) has a feature that allows you to query a result set, which makes extracting the data you need a little easier, but XML would make this a moot point.
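
A small sketch of that duplication and of the nested alternative, assuming a hypothetical flat join result of orders and their items. The row layout and element names are invented for illustration and do not correspond to any vendor's XML output.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Hypothetical flat join result: (order_id, customer, item, qty).
# The parent columns repeat once per child row.
rows = [
    (1, "acme", "widget", 2),
    (1, "acme", "gadget", 1),
    (2, "globex", "widget", 5),
]


def rows_to_xml(rows):
    """Emit each parent once and nest its child rows underneath it."""
    root = ET.Element("orders")
    grouped = groupby(sorted(rows), key=itemgetter(0, 1))
    for (order_id, customer), group in grouped:
        order = ET.SubElement(root, "order",
                              id=str(order_id), customer=customer)
        for _, _, item, qty in group:
            ET.SubElement(order, "item", name=item, qty=str(qty))
    return ET.tostring(root, encoding="unicode")


xml_text = rows_to_xml(rows)
```

In the flat result "acme" appears on every item row; in the nested document it appears once, on the order element that owns the items.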

Microsoft is developing an XML extension that will return the data in XML for SQL server. The bigger issue is that ODBC needs to do it before they (MS) do so that we don't get stuck using Microsoft XML over OLE DB, which means no Linux support. You can read about MS solution at http://www.microsfot.com/sql.

Looking at MS's solution, it is still too complicated, because the SQL language does not allow you to direct the XML output. Also, if foreign keys are set up it shouldn't need them, and the join statements should be more than adequate to describe how the XML should be formatted.

Another aspect is update/insert/bulk inserts. This could make doing updates to linked tables easier. As well as inserting complex structures of data. Also, it could be used to check data before it got inserted into the db and failed.

Data transfers would be much nicer as well. Taking what is one to many relationships in your database, and trying to make them horizontal instead of vertical is a real pain in the ass. It is also much slower usually because of the extra joining needed.

OLAP applications could stand to benefit as well. OLAP produces data with multiple dimensions. Usually this is accomplished by having binary data formats that are specific to vendors; now multi-dimensional data can be treated the same as any other result set.

I think we are a good year or two away from there being good XML support for databases, but it will increase application performance as front end guys and gals can issue fewer queries to the database to get the data they need, in a more logical fashion as well.

Experience with XML over ER engines (3, Informative)

nsample (261457) | more than 12 years ago | (#2581965)

Anyone can explain to me what is suddenly so wrong about relational database with hierarchical indexing?

Maybe it's just me, but the goal today is integration, and having a special database for XML and a special database for this and that, just because it's faster for this particular problem, creates such a level of complexity that it prevents accomplishing even the most trivial tasks.


Forgive me for tooting my own horn on this one, but I believe that (for once on /.) there is a correct answer.

I summarize the answer in a paper written for VLDB 2001 (www.vldb.org [vldb.org] ). The paper presents joint work between Stanford, Berkeley, and RightOrder, Inc. It can be found online here [vldb.org] (in PDF).

What we found is that relational systems, with appropriate indexes for XML data, give the advantages of both worlds. XML is a hierarchical representation in only the loosest sense. It's written linearly in a flat text document, just as a child learns to write things down on a piece of paper. However, you wouldn't convince anyone but that same child that something written on paper can only represent two-dimensional objects just because the paper itself is flat. XML in many variants is plainly richer in concept than its simple hierarchical representation and thus quite suited to ER. I believe a previous poster mentioned RDF... a perfect example.

Punchline: XML is neat, XML is tasty, but XML is not inherently more or less expressive than ER; it just requires a little critical thinking (and index tweaking) to tune ER engines to deal with it. (Once tuned, the ER engines dominate all others in performance.)
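
As a generic illustration only (this is a common textbook-style "shredding" approach, not the index structure from the paper, which this thread does not describe), XML nodes can be stored as rows with an indexed root-to-node path, so that path lookups become ordinary indexed queries and related values are reached by joins. The `node` table and the `shred` function are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, xml_text):
    """Store every element as a row keyed by its root-to-node path."""
    conn.executescript("""
        CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER,
                           path TEXT, value TEXT);
        CREATE INDEX node_path ON node(path);
    """)

    def visit(elem, parent_id, prefix):
        path = f"{prefix}/{elem.tag}"
        cur = conn.execute(
            "INSERT INTO node (parent_id, path, value) VALUES (?, ?, ?)",
            (parent_id, path, (elem.text or "").strip()))
        for child in elem:
            visit(child, cur.lastrowid, path)

    visit(ET.fromstring(xml_text), None, "")


conn = sqlite3.connect(":memory:")
shred(conn, """
<library>
  <book><title>Dune</title><year>1965</year></book>
  <book><title>Emma</title><year>1815</year></book>
</library>
""")

# A path expression becomes an equality test on the indexed column.
titles = [row[0] for row in conn.execute(
    "SELECT value FROM node WHERE path = ? ORDER BY id",
    ("/library/book/title",))]

# Sibling lookup is a self-join on the shared parent.
year_of_dune = conn.execute("""
    SELECT y.value
    FROM node t JOIN node y ON y.parent_id = t.parent_id
    WHERE t.path = '/library/book/title' AND t.value = 'Dune'
      AND y.path = '/library/book/year'
""").fetchone()[0]
```

How well this performs depends on the indexing and tuning the comment above alludes to; the sketch only shows that the hierarchical queries can be expressed relationally.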

Henry Baker's opinion of relational databases... (1)

Barry Wilkes (234818) | more than 12 years ago | (#2581996)

Some time ago I came across the following letter [rice.edu] by Henry Baker regarding relational databases. It made interesting reading.

Perhaps there is an element of 'when your only tool is a hammer, everything is a nail.' to relational databases. They are certainly so pervasive now that any idea of using something different would be seen as taking a HUGE risk.

Maybe in the future.. but not just yet (0)

Anonymous Coward | more than 12 years ago | (#2582007)

I have pulled a great deal of hair out at work because a client insisted that we use an XML based object database (eXcelon).

While the concept is incredibly powerful, the technology is still too young. The moment you try and do anything important, like an aggregation for example, life becomes very difficult.

I've noticed that there's a lot of cool new stuff being proposed for XSLT2. Maybe that'll help the market along a bit. As far as modeling the data is concerned, it really does kick butt over relational for many (but not all) applications - but it's simply not ready to hit the masses yet.

Whats wrong with MS Access? (-1, Troll)

wrt (530301) | more than 12 years ago | (#2582028)

Flat file rocks my crotch!