
The Future of XML

Zonk posted more than 6 years ago | from the xml-wearing-little-space-suits dept.


An anonymous reader writes "How will you use XML in years to come? The wheels of progress turn slowly, but turn they do. The outline of XML's future is becoming clear. The exact timeline is a tad uncertain, but where XML is going isn't. XML's future lies with the Web, and more specifically with Web publishing. 'Word processors, spreadsheets, games, diagramming tools, and more are all migrating into the browser. This trend will only accelerate in the coming year as local storage in Web browsers makes it increasingly possible to work offline. But XML is still firmly grounded in Web 1.0 publishing, and that's still very important.'"


273 comments


"How will you use XML in years to come?" (5, Insightful)

Ant P. (974313) | more than 6 years ago | (#22342468)

Sparingly. JSON is just plain better, and doesn't inflict an enterprisey mindset on anyone that tries to use it.

Yes, but.... (-1, Troll)

Anonymous Coward | more than 6 years ago | (#22342550)

That's a surprisingly small mindset you have there. There are WAY better alternatives to either XML or JSON, such as the NIMP project [nimp.org]. It's like not choosing a more gas-friendly car just because you are used to your old SUV....

Re:Yes, but.... (-1, Troll)

Anonymous Coward | more than 6 years ago | (#22342672)

A few months back, our shop went NIMP in the middle of a large project. We started out using JSON, but the specs changed and trying to use JSON for data exchange wasn't possible. NIMP delivered. Once you try it, I'm sure you'll love it.

WARNING: GNAA (3, Informative)

Anonymous Coward | more than 6 years ago | (#22342676)

Don't click the "NIMP project" link. GNAA. Bad news. You know the rest.

Parent is a liar. There is no GNAA in URL (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#22342814)

Therefore, he has been sacked by his mom. [rude.com]

don't CLICK the LINK (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#22342828)

Very amusing but potentially embarrassing! That parrot jerking the dude off is pretty funny :D

Re:"How will you use XML in years to come?" (4, Interesting)

DragonWriter (970822) | more than 6 years ago | (#22342650)

Sparingly. JSON is just plain better, and doesn't inflict an enterprisey mindset on anyone that tries to use it.


JSON/YAML is/are better (not considering, of course, the variety and maturity of available tools; but then, perhaps, you don't always need most of what is out there in XML tools, either) for lots of things (mostly, the kinds of things TFA notes XML wasn't designed for and often isn't the best choice for), i.e. things that aren't marked-up text. Where you actually want an extensible language for text-centric markup, rather than a structured format for interchange of something that isn't marked-up text, XML seems to be a pretty good choice. Of course, for some reason, that seems to be a minority of the uses of XML.

Re:"How will you use XML in years to come?" (2, Interesting)

larry bagina (561269) | more than 6 years ago | (#22343286)

OpenStep property lists kick JSON's ass seven ways to Sunday.

Re:"How will you use XML in years to come?" (3, Insightful)

MagikSlinger (259969) | more than 6 years ago | (#22343000)

Sparingly. JSON is just plain better, and doesn't inflict an enterprisey mindset on anyone that tries to use it.

JSON is inflicting Javascript on everyone. There are other programming languages out there. Also, XML can painlessly create meta-documents made up of other people's XML documents.

Re:"How will you use XML in years to come?" (4, Informative)

DragonWriter (970822) | more than 6 years ago | (#22343292)

JSON is inflicting Javascript on everyone.


No, it really doesn't, but if "JavaScript" in the name bothers you, you might feel better with YAML.

There are other programming languages out there.


And there are JSON and/or YAML libraries for quite a lot of them. So what?

Re:"How will you use XML in years to come?" (5, Insightful)

MagikSlinger (259969) | more than 6 years ago | (#22343584)

JSON is inflicting Javascript on everyone.

No, it really doesn't, but if "JavaScript" in the name bothers you, you might feel better with YAML.

No, it wouldn't, because JSON is bare-bones data. It's simply nested hash tables, arrays and strings. XML does much more than that. XML can represent a lot of information in a simple, easy-to-understand format; JSON strips that out for speed and efficiency. Which sort of gets into the point I did want to make but was too impatient to explain: JSON is good where JSON is best, and XML is good where XML is best. I dislike the one-uber-alles arguments because they ignore other situations and their needs.

There are other programming languages out there.

And there are JSON and/or YAML libraries for quite a lot of them. So what?

Would you like to live in a world of S-expressions [wikipedia.org]? The LISP people would point out there are libraries to read/write S-expressions, so why use JSON? The answer, of course, is that we want more than simply nesting lists of strings. We want our markup languages to fit our requirements, not the other way around. And saying "JSON for everything", as the original poster did, was... silly.

My problems with JSON are:

  • No schema: XML Schema not only makes it easier to unit test, but it can be fed into tools that can do useful things like automatic creation [sun.com] of Java classes and code to read/write. Does JSON have anything like that? Of course not, because it would defeat JSON's purpose: easy Javascript data transmission.
  • Expressiveness: With XML, I can create a model that fits my logical model of the data, where I use attributes to augment the data in the child elements. Doing that in JSON is a kludge: a hash table representing an element, which can't be easily converted into a graph for easy understanding.
  • Diversity: I use GML [wikipedia.org] in my day job. A lot. I can easily set up an object conversion rule with Jakarta Digester [apache.org] that I can painlessly drop into future projects without modification. That's the power of namespaces. I can build an XML document using tags from a dozen different schema, and then feed it to another application that only looks for the tags it cares about.
  • XPath [slashdot.org]. 'Nuff said. Ok, one thing: this should have replaced SAX/DOM years ago.
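The attribute and XPath points in the list above can be sketched with Python's standard-library ElementTree, which supports a limited XPath subset; the document and element names here are invented for the example:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: attributes augment the data in the child elements.
doc = """
<addressBook>
  <address type="home"><street>442 Fake St.</street></address>
  <address type="work"><street>1 Main St.</street></address>
</addressBook>
"""

root = ET.fromstring(doc)

# ElementTree understands a small XPath subset: path steps plus attribute
# predicates, enough to query without walking the tree by hand.
home = root.findall("./address[@type='home']/street")
print([e.text for e in home])  # ['442 Fake St.']
```

A full XPath engine (as in lxml or a DOM Level 3 implementation) goes much further, but even this subset shows the declarative style the parent prefers over SAX/DOM traversal.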

JSON is great for AJAX, where XML is clunky and a little bit slower (my own speed tests haven't shown a huge hit, but it is noticeable). XML is great for document-type data like formatted documents, or electronic data interchange between heavy-weight processes. My point was that the original poster's "JSON is everything" stance was narrow-minded, and that XML answers a very specific need. There are tonnes of mark-up languages out there, and I think XML is a great machine-based language. I hate it when humans have to write XML to configure something, though. That really ticks me off. But that's the point: there should not be one mark-up language to rule them all. A mark-up language for every purpose.

Re:"How will you use XML in years to come?" (4, Insightful)

aoteoroa (596031) | more than 6 years ago | (#22343928)

Hear! Hear!

One file (format) will not rule them all.

XML is good if you want to design a communication protocol between your software, and some other unknown program.

JSON is much lighter. Far fewer kilobits are needed to transfer the same information, so when performance is important and you control everything, use JSON.
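A quick way to see the size difference is to serialize the same record both ways; this is only a sketch in Python with made-up field names, not a benchmark:

```python
import json
import xml.etree.ElementTree as ET

# The same record in both formats (field names are invented for illustration).
record = {"name": "Ada", "email": "ada@example.com", "active": "yes"}

as_json = json.dumps(record)

user = ET.Element("user")
for key, value in record.items():
    ET.SubElement(user, key).text = value
as_xml = ET.tostring(user, encoding="unicode")

# XML repeats every field name as a closing tag, so the envelope is heavier.
print(len(as_json), "vs", len(as_xml))
```

The gap grows with nesting depth, which is why the comparison matters most for chatty client/server traffic.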

When it comes to humans editing config files I find traditional ini files, or .properties easier to read and perfectly suitable in most cases.

Writing more complex, relational data to disk? Sqlite often solves the problem quickly.

Re:"How will you use XML in years to come?" (4, Funny)

frank_adrian314159 (469671) | more than 6 years ago | (#22344162)

Would you like to live in a world of S-expressions?

If you're giving me a choice... why yes, please! Where can I get one of these worlds you're talking about?

Re:"How will you use XML in years to come?" (2, Insightful)

filbranden (1168407) | more than 6 years ago | (#22343900)

JSON is inflicting Javascript on everyone. There are other programming languages out there.

On the browser? If you want to use AJAX-like technology, JavaScript is still the only viable and portable option as the programming language for the client side.

Re:"How will you use XML in years to come?" (0)

Anonymous Coward | more than 6 years ago | (#22343664)

Why not just use s-expressions?

Re:"How will you use XML in years to come?" (3, Insightful)

slawo (1210850) | more than 6 years ago | (#22344006)

JSON is fit mostly for communication and transfer of simple data between JS and server-side scripts through object serialization, but it remains limited. You can compare JSON to XML only if your knowledge of XML stops at the X in AJAX.
Beyond that scope, comparing these two unrelated "things" is irrelevant.

The tools and libraries available for XML go well beyond JSON's scope. DOM [w3.org] , RSS & ATOM [intertwingly.net] , OASIS, Xpath, XSLT, eXist DB [sourceforge.net] are just few examples of tools and libraries surrounding XML.
XML is designed to let you create your own protocols and formats while using only one simple base format (XML) and one simple descriptor language (Schema or DTD), and you can convert between your formats and protocols using one transformation technology (XSLT).
There are languages to query XML files and collections, XML databases, and many more examples, including scientific, vector and 3D imaging formats.
I believe the word Interoperability was created with XML in mind... Or maybe the other way round... whatever.

You know the saying (5, Funny)

Anonymous Coward | more than 6 years ago | (#22342474)

XML is like violence. If it doesn't solve your problem, you're not using enough of it.

I've never heard that before... (0)

Junta (36770) | more than 6 years ago | (#22342526)

Nope... never ever have heard that one before....

Re:I've never heard that before... (0)

Anonymous Coward | more than 6 years ago | (#22343270)

It needs to be repeated, you creationist Java-programming terror-apologist Nazi-sympathizer.

I believe that this was what you were looking for. (1)

khasim (1285) | more than 6 years ago | (#22342562)

Rule #6 - If violence wasn't your last resort, you failed to resort to enough of it.

http://www.schlockmercenary.com/d/20050313.html [schlockmercenary.com]

Re:You know the saying (1, Funny)

palegray.net (1195047) | more than 6 years ago | (#22344114)

Taking your advice, I set up my server to output 10 MB of randomized XML for every pageview on all my websites, but all I got was this lousy bandwidth bill... thanks a lot, buddy.

I don't understand... (5, Insightful)

ilovegeorgebush (923173) | more than 6 years ago | (#22342476)

I don't get it. We can argue the merits of data exchange formats 'till we're blue in the face; yet I cannot see why XML is so popular. For the majority of applications that use it, it's overkill. Yes, it's easier on the eye, but ultimately how often do you have to play with the XML your CAD software uses?

I'm a programmer, just like the rest of you here, so I'm quite used to having to write a parser here or there, or fixing an issue or two in an ant script. The thing that puzzles me is why it's used so much on the web. XML is bulky, and when designed badly it can be far too complex; this all adds to bandwidth and processing on the client (think AJAX), so I'm not seeing why anyone would want to use it. Formats like JSON are just as usable, not to mention more lightweight. Where's the gain?

Re:I don't understand... (1)

daeg (828071) | more than 6 years ago | (#22342622)

I can see using it for some program data formats, but for one reason only: upgrading old file formats to a new format via XSL. In practice, I'm unaware of many software packages that do this, though.

Re:I don't understand... (0)

Anonymous Coward | more than 6 years ago | (#22342798)

> how often do you have to play with the XML your CAD software uses?

Quite frequently I'll dump XML from applications and manipulate it using a script. Inkscape for example - I'm more skilled at programming than using the app. Having the option to dump to XML is always welcome -- it's not difficult to generate.

> I'm quite used to having to write a parser here or there

So a parser for an undocumented binary format takes you how long on average to write? It takes me 30 seconds to write a script that loads an XML document (and hopefully the format isn't OOXML style blob in XML).
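That kind of quick script might look like the following in Python; the element names are invented, since the point is only how little code loading and rewriting an XML dump takes:

```python
import xml.etree.ElementTree as ET

# A made-up dump in the spirit of an Inkscape/CAD export.
dump = '<drawing><shape kind="circle" r="5"/><shape kind="rect" w="3"/></drawing>'
root = ET.fromstring(dump)

# Tweak the data in place: double every circle's radius.
for shape in root.iter("shape"):
    if shape.get("kind") == "circle":
        shape.set("r", str(int(shape.get("r")) * 2))

print(ET.tostring(root, encoding="unicode"))
```

Compare that with reverse-engineering an undocumented binary format before you can even read the first record.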

I'm no fan of verbosity or bloat but XML is fine for webpages (XHTML/deflate), business documents and config files that require nested data structures. It's not always the answer though and is surely misused for things like SVG. I'd like to see a standardized JSON serialization for SVG, not one that requires script to parse -- just a more compact format for stuff that nobody in their right mind will ever edit by hand.

Re:I don't understand... (5, Insightful)

SpaceHamster (253491) | more than 6 years ago | (#22342970)

My best stab at the popularity:

1. Looks a lot like HTML. "Oh, it has angle brackets, I know this!"
2. Inertia.
3. Has features that make it a good choice for business: schemas and validation, transforms, namespaces, a type system.
4. Inertia.

There just isn't that much need to switch. Modern parsers/hardware make the slowness argument moot, and everyone knows how to work with it.

As an interchange format with JavaScript (and other dynamically typed languages) it is sub-optimal for a number of reasons, and so an alternative, JSON, has developed which fills that particular niche. But when I sit down to write yet another line-of-business app, my default format is going to be XML, and will be for the foreseeable future.

Re:I don't understand... (1)

Sta7ic (819090) | more than 6 years ago | (#22343450)

Much agreed with 2, 3, and 4. For better or worse, one of the project managers is big on XML and web services, to the point that web services are the first thing I think of when I hear their name.

You know what they say about round holes and square pegs? I've come to learn that XML is like play-doh. It's not round and it's not square, but it doesn't feel strong enough to last at the end of the day.

Re:I don't understand... (3, Funny)

El Cubano (631386) | more than 6 years ago | (#22343010)

For the majority of applications that use it, it's overboard.

You mean like this? [thedailywtf.com]

Re:I don't understand... (1)

maxwell demon (590494) | more than 6 years ago | (#22343200)

Of course the right level of bloat would be like this:

<node type="tagged">
  <tag>
    rootNode
  </tag>
  <properties />
  <content>
    <node type="tagged">
      <tag>
        numberOfAddresses
      </tag>
      <properties />
      <content>
        <node type="text">
          <text>
            110
          </text>
        </node>
      </content>
    </node>
    <node type="tagged">
      <tag>
        address_1
      </tag>
      <properties />
      <content>
        <node type="text">
          <text>
            442 Fake St.
          </text>
        </node>
      </content>
    </node>
...
  </content>
</node>

Re:I don't understand... (5, Insightful)

GodfatherofSoul (174979) | more than 6 years ago | (#22343020)

XML gives you a parsable standard on two levels: generic XML syntax, and your specific protocol via schemas. It's verbose enough to allow manual editing by hand, while the syntax will catch most of the errors you'll likely make, save semantic ones. It's also a little more versatile as far as the syntax goes. Yes, there are less verbose parsing syntaxes out there, but you always seem to lose something when it comes to manual viewing or editing.

Plus, as far as writing parsers goes, why burn the time when there are so many tools for XML out there? It's a design choice, I suppose, like every other one; i.e. what are you losing/gaining by DIYing? Personally, I love XML and regret that it hasn't taken off more, especially in the area of network protocols. People have been trying to shove everything into an HTTP pipe, when XML over the much-underrated BEEP is far more versatile. There are costs, though, as you've already mentioned.

Re:I don't understand... (5, Interesting)

machineghost (622031) | more than 6 years ago | (#22343046)

The "bulkiness" of XML is also its strength: XML can be used to mark up almost any data imaginable. Now it's true that for most simple two-party exchanges, a simpler format (like comma-separated values or YAML or something) would require fewer characters, and would thus save disk space, transmit faster, etc.

However, the modern programming age is all about sacrificing performance for convenience (this is why virtually no one is using C or C++ to make web apps, and almost everyone is using a significantly poorer performing language like Python or Ruby). We've got powerful computers with tons of RAM and hard drive space, and high-speed internet connections that can transmit vast amounts of data in mere seconds; why waste (valuable programmer) time and energy over-optimizing everything?

Instead, developers choose the option that will make their lives easier. XML is widely known, easily understood, and is human readable. I can send an XML document, without any schema or documentation, to another developer and they'll be able to "grok it". There's also a ton of tools out there for working with XML; if someone sends me a random XML document, I can see it syntax colored in Eclipse or my browser. If someone sends me an XML schema, I can use JAXB to generate Java classes to interact with it. If I need to reformat/convert ANY XML document, I can just whip up an XSLT for it and I'm done.

So yes, other formats offer some benefits. But XML's universality (which does require a bit of bulkiness) makes it a great choice for most types of data one would like to markup and/or transmit.

P.S. JSON is just as usable? Try writing a schema to validate it... ok, I admit, that wasn't so hard, just some JavaScript, right? But now you have to write a new batch of code to validate the next type of JSON you use. And another for the next, and so on. With XML, you have a choice of not one but four different schema formats; once you learn to use one of them, you can describe a validation schema far more quickly than you ever could in JavaScript.

Same deal with transformations: if you want to alter your JSON data in a consistent way, you have to again write custom code every time. Sure XSLT has a learning curve, but once you master it you can accomplish in a few lines of code what any other language would need tens or even hundreds of lines to do.
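The "new batch of code for every JSON shape" point can be sketched in Python (standing in for the JavaScript the parent mentions); `validate_user` and its record shape are made up for the example:

```python
import json

def validate_user(obj):
    """Hand-rolled check for one specific shape; a new shape means new code."""
    return (
        isinstance(obj, dict)
        and isinstance(obj.get("name"), str)
        and isinstance(obj.get("age"), int)
    )

good = json.loads('{"name": "Ada", "age": 36}')
bad = json.loads('{"name": "Ada", "age": "36"}')
print(validate_user(good), validate_user(bad))  # True False
```

An XML Schema expresses the same constraints declaratively, and a generic validator enforces them for any document, which is the asymmetry being argued here.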

Re:I don't understand... (5, Insightful)

Anonymous Coward | more than 6 years ago | (#22343130)

I don't get it. We can argue the merits of data exchange formats 'till we're blue in the face; yet I cannot see why XML is so popular.

Because it's a standard that everyone (even reluctantly) can agree on.

Because there are well-debugged libraries for reading, writing and manipulating it.

Because (as a last resort) text is easy to manipulate with scripting languages like perl and python.

Because if verbosity is a problem, text compresses very well.
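The compression point is easy to check with Python's zlib; the payload below is invented, but representative of repetitive machine-generated XML:

```python
import zlib

# Build a deliberately repetitive payload, as machine-generated XML tends to be.
body = "".join(
    "<item id='%d'><name>widget</name><qty>1</qty></item>" % i for i in range(200)
)
xml_doc = "<items>" + body + "</items>"

compressed = zlib.compress(xml_doc.encode())

# Deflate eats the repeated tag names; most of the verbosity disappears.
print(len(xml_doc), "->", len(compressed))
```

This is why gzip content-coding on the wire largely neutralizes the "XML is bloated" complaint for transport, if not for parsing.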

Re:I don't understand... (0)

Anonymous Coward | more than 6 years ago | (#22343588)

Because if verbosity is a problem, text compresses very well.
This gave me an idea for a new image format:

<image>
    <row>
        <pixel color="#FFFFFF"/>
        <pixel color="#000000"/>
        <pixel color="#FFFFFF"/>
    </row>
    <row>
        <pixel color="#000000"/>
        <pixel color="#FFFFFF"/>
        <pixel color="#000000"/>
    </row>
    <row>
        <pixel color="#FFFFFF"/>
        <pixel color="#000000"/>
        <pixel color="#FFFFFF"/>
    </row>
</image>

It may look a little too verbose for a 3x3 image, but it's text, so it compresses very well!

Patent pending

Re:I don't understand... (0)

Anonymous Coward | more than 6 years ago | (#22343888)

It may look a little too verbose for a 3x3 image, but it's text, so it compresses very well!

You're trying to be sarcastic, but base32 [wikipedia.org] and base64 [wikipedia.org] are fairly efficient methods for encoding binary data as text, and therefore as valid XML data. And it still compresses very well.

And if all you're sending are images, you're not sending them as XML to begin with, are you?
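A minimal sketch of the base64 route, using Python's standard library; the `<blob>` element name is made up:

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes(range(16))  # arbitrary binary data

# Encode the bytes as text so they can sit safely inside an XML element.
blob = ET.Element("blob")
blob.text = base64.b64encode(payload).decode("ascii")
xml_doc = ET.tostring(blob, encoding="unicode")

# The receiver reverses the process to recover the original bytes.
roundtrip = base64.b64decode(ET.fromstring(xml_doc).text)
print(roundtrip == payload)  # True
```

Base64 costs about 33% in size before compression, which is the trade-off being debated above.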

Re:I don't understand... (2, Informative)

bluFox (612877) | more than 6 years ago | (#22344262)

> And if all you're sending are images, you're not sending them as XML to begin with are you.

Hmmm, maybe [w3.org]

Re:I don't understand... (4, Interesting)

Xtifr (1323) | more than 6 years ago | (#22343254)

Like a lot of things, XML is popular because it's popular. Parsing is done with libraries, so programmers don't have to see or care how much overhead is involved, and it's well-known and well-understood, so it's easy to find people who are familiar with it. Every programmer and his dog knows the basics. It's easy to cobble up some in a text editor for testing purposes. You can hand it off to some guy in a completely separate division without worrying that he's going to find it particularly confusing. And you can work with it in pretty much any modern programming language without having to worry about the messy details. It's the path of least resistance. It may not be good, but it's frequently good enough, and that's usually the bottom line.

I mean, yeah, when I was a kid, we all worked in hand-optimized C and assembler, and tried to pack useful information into each bit of storage, but systems were a lot smaller and a lot more expensive back then. These days, I write perl or python scripts that spit out forty bytes of XML to encode a single boolean flag, and it doesn't even faze me. Welcome to the 21st century. :)

Re:I don't understand... (2, Insightful)

solafide (845228) | more than 6 years ago | (#22343472)

so programmers don't have to see or care how much overhead is involved

Which is how we got to the point described by Dr. Dewar and Dr. Schonberg [af.mil]:

...students who know how to put a simple program together, but do not know how to program. A further pitfall of the early use of libraries and frameworks is that it is impossible for the student to develop a sense of the run-time cost of what is written because it is extremely hard to know what any method call will eventually execute.

And you're saying overhead doesn't matter?

Re:I don't understand... (3, Insightful)

Xtifr (1323) | more than 6 years ago | (#22343852)

From an academic viewpoint, it probably matters. From a point of view of trying to get the job done...not so much. I studied the performance and efficiency of a wide variety of sort algorithms when I was in school, but nowadays, I generally just call some library to do my sorting for me. It may not be quite as efficient for the machine to use some random, generic sort, but for me, it's the difference between a few seconds to type "sort" vs. a few hours to code and debug a sort routine that is probably, at best, only a few percent faster.

XML is, in many cases (including mine), the path of least resistance. It's not particularly fast or efficient, but it's simple and quick and I don't have to spend hours documenting my formats for the dozens of other people in the company who have to use my data. Many of whom are probably not programmers by Dewar and Schonberg's definition, but who still do valuable work for the company.

Re:I don't understand... (1)

martin-boundary (547041) | more than 6 years ago | (#22344094)

Two things:

How do you know if what you've done actually gets the job done? Any monkey can type away randomly and get something done, but it's usually not the job that actually needs doing. For that, you need the skills academic work teaches.

You missed the point of studying sorting algorithms. They are taught not so that you can reimplement a quicksort later in life, they are taught because they are a great no-frills case study of the basic concepts you need to get a job done while knowing that you got the job done. Sorting is how you were taught to think logically and methodically, which are the skills you use every day in your programming career.

Re:I don't understand... (5, Insightful)

batkiwi (137781) | more than 6 years ago | (#22343272)

XML IS:
-Easily validated
-Easily parsed
-Easily compressed (in transit or stored)
-Human readable in case of emergency
-Easily extendable

Re:I don't understand... (2, Insightful)

maxwell demon (590494) | more than 6 years ago | (#22343328)

-Easily compressed (in transit or stored)

Which just means that it has lots of redundancy. Or, as one might call it, bloat.

Re:I don't understand... (4, Insightful)

Otto (17870) | more than 6 years ago | (#22344142)

-Easily compressed (in transit or stored)

Which just means that it has lots of redundancy. Or, as one might call it, bloat.
Test question: Which is quicker?
1. Spending a few hours coding your formats in some binary format making maximum use of all the bits.
2. Spending a few minutes writing code to send your internal data structure to a library that will serialize it into XML and then running the XML through a generic compression routine (if space/speed actually makes any difference to your particular application).

Consider the question in both the short and the long term. Also consider that you're paying that programmer a few hundred an hour.

Discuss.

Re:I don't understand... (2, Insightful)

batkiwi (137781) | more than 6 years ago | (#22344238)

Why is it bloat? How does it affect anything?

-it doesn't affect transit time when compressed
-it takes minimally more CPU to gunzip a stream, but the same could be said of decoding ANY binary format (unless you're sending direct memory dumps, which is dangerous)
-it's never really in memory as the entire point is to serialize/deserialize

Re:I don't understand... (3, Insightful)

Anonymous Coward | more than 6 years ago | (#22343290)

For one, it has off-the-shelf parsers and validation tools. Parsing XML is, at its hardest, a matter of writing a SAX parser. XML binding APIs make things even easier. The standardized validation tools also make it great for ensuring that the people in charge of generating the data are using the form expected by those receiving it.

Our biggest usage is in our customer data feeds. These feeds are often 1GB+ when compressed. Since switching to an XML format from a tab-delimited format, we've been able to give our customers an XSD to validate the file before sending it to us. The result has been far fewer round trips between us and our customers before receiving an acceptable feed. As you can probably imagine, fewer round trips is a very good thing when each round trip involves sending 1GB+ of data over the internet.

Also, gzip and other compression methods do very well with XML files...to the point where the difference between a compressed XML file and a compressed JSON or other formatting of the same data is pretty minimal. So your point about AJAX applications isn't particularly relevant once you've enabled gzip content coding.

The human-readable benefit is more of a nice side effect rather than a must have, though it does come in quite handy when debugging.

So you might want to ask yourself: why not use it? What are the drawbacks of your CAD program using XML to store its files, especially if it does something like OpenOffice and compresses them? Why not take advantage of tools that make it simple to parse information? If it's just the "XML is bulky" complaint, that's basically been addressed.

Re:I don't understand... (2, Insightful)

kwerle (39371) | more than 6 years ago | (#22343298)

I don't get it. We can argue the merits of data exchange formats 'till we're blue in the face; yet I cannot see why XML is so popular. For the majority of applications that use it, it's overboard. Yes, it's easier on the eye, but ultimately how often do you have to play with the XML your CAD software uses?

Let's say you need to store data, and a database is not an option. What format shall you store it in?
  1. Proprietary binary
  2. Proprietary text
  3. JSON
  4. XML


1 & 2 are untried and untested, and it is not possible to find third-party tools for working with/validating/generating them.
3 is just insane.
With XML you get a very well tested format. You get a million libraries and tools (a few of which don't suck) in any language you care to mention. Your users get all the benefits of an open spec and all the same tools. The question becomes: why would you not use XML?

Re:I don't understand... (0)

Anonymous Coward | more than 6 years ago | (#22343862)

Pure memory dump. Fast, painless, just works. Those other options are only worth bothering with if you have to interoperate with some other programs. That's usually handled better with a database.

Re:I don't understand... (0, Flamebait)

mi (197448) | more than 6 years ago | (#22343428)

Yes, it's easier on the eye

Except it is not even that! Even when nicely formatted, it is not... It sucks. It can't be processed with the likes of awk and sed. You can't print it. It takes A LOT more space (try compressing it) and it is hard to author. In addition, it takes noticeable CPU power to parse and application memory to store.

I'm forced to use it by the software vendor (who are stupid enough to use it even between their own components), and am fed up to the gills... These idiots use it for log files, which they consequently don't write out until a task is done, defeating most of the purpose of a log file.

I wish it was never invented...

Re:I don't understand... (1)

filbranden (1168407) | more than 6 years ago | (#22344214)

It can't be processed with the likes of awk and sed.

Just because you can't use tools made for processing text in Unix line-based format doesn't mean there aren't tools for this purpose. You can even find tools inspired by awk for XML processing, like xmlgawk [sourceforge.net] (also here [vrweb.de]).

However...

I agree with you that XML is not the answer for everything. For instance, I just hate XML configuration files, exactly because you can't reliably grep, sed, awk, or ex them. Editing XML with vi is not the nicest task either. For config files I usually like INI-style files, for which there are modules in Python [python.org] and Perl [cpan.org], and you can easily get by with simple shell tools when you just need a dirty hack. A config file usually has a simple enough structure to allow you to specify anything you need within the constraints of INI files (if it doesn't, you should probably rethink your config; it's probably bloated!).

For other tasks such as tabular data, CSV or just plain text delimited by tabs or ":" (think /etc/passwd) are more suited than XML, exactly for the ability to use simple universal tools (grep, sed, awk) on the data. It's even easier to visually inspect a table if it's in CSV or tab-delimited than it is to inspect an XML file and try to see through the tag soup.

Your comment about log files is spot on! Log files are just made to be grepped. Anything that doesn't write all relevant information about an action on one line, at the time the action happened, really defeats the purpose of logging.

It's all about the DOM (1, Interesting)

Anonymous Coward | more than 6 years ago | (#22343610)

The thing that puzzles me, is why it's used so much on the web. XML is bulky, and when designed badly it can be far too complex; this all adds to bandwidth and processing on the client (think AJAX), so I'm not seeing why anyone would want to use it. Formats like JSON are just as usable, and not to mention more lightweight. Where's the gain?

If I use XML, I can embed documents in other documents of different types, and they share a DOM. I can serve an XHTML document with MathML and SVG inside it, and use one CSS file to style everything, and my Javascript file can play with all of the above.

JSON is neat, and it's great for some things, but I haven't seen anything from the JSON people that even approaches what I can do with XML in a browser.

Why is XML so popular (2, Insightful)

Hal_Porter (817932) | more than 6 years ago | (#22342500)

It seems to me to be a slight improvement on ini files, csv and the like. But parsing it is hideously inefficient compared to a binary format. It's bloated too, so it takes more time to send it over the net or save it to disk. I've seen some XML schema that are aggressively hard to read too. And yet it's become something that every new technology, protocol or applications needs to namecheck.

Re:Why is XML so popular (2, Informative)

Alaria Phrozen (975601) | more than 6 years ago | (#22342638)

XML is not necessarily for human eyes. With the strict rules on non-overlapping closing of tags, its parsing and expansion is very easily stored and visualized as a tree. So parsing in general is actually quite easy. Also when you consider people like this http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=247&page=4 [acmqueue.org] (ACM! So take it seriously!) who want to convert all Turing complete programming languages into XML abstractions, and call it the future, well... I'm honestly not sure why as you're right, we could have certainly done all this before. It's just sort of a generalization that everybody can agree on. Then again, maybe it's a response to, "Hey! _Anything_ is better than LISP!"
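The tree point is easy to see with Python's standard xml.etree module (the document here is invented):

```python
import xml.etree.ElementTree as ET

# Strict non-overlapping nesting means the parse result is naturally a tree.
doc = "<library><book><title>Example</title></book></library>"
root = ET.fromstring(doc)

# Walk the tree: every element knows its children.
titles = [t.text for t in root.iter("title")]
print(root.tag, titles)
```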

Re:Why is XML so popular (0)

Anonymous Coward | more than 6 years ago | (#22342866)

I refuse to take the linked article seriously. I doubt it is peer-reviewed for a start. Secondly, the article doesn't make sense (maybe it's written in "Inglish" XML).

Re:Why is XML so popular (3, Funny)

cp.tar (871488) | more than 6 years ago | (#22342954)

Then again, maybe it's a response to, "Hey! _Anything_ is better than LISP!"

Funny, that. I've heard LISPers say "XML looks quite like LISP, only uglier."

Re:Why is XML so popular (3, Informative)

Hal_Porter (817932) | more than 6 years ago | (#22343320)

its parsing and expansion is very easily stored and visualized as a tree

Why not store it as a tree in a format computers can parse efficiently? Invent a binary format with parent and child offsets and binary tags for the names and values. It's smaller in memory and faster. Better basically. You don't need to parse them if machines are going to read them. And decent human programmers can read them with a debugger or from a hexdump in a file, or write a tool to dump them as human friendly ASCII during development.
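A toy sketch of that idea in Python, using the struct module; the layout (a 16-bit tag id plus a 16-bit child count per node, depth-first) is purely illustrative, not any real format:

```python
import struct

# Toy binary tree encoding: each node is (tag_id: uint16, n_children: uint16),
# followed immediately by its children. No string parsing needed to read it back.
def encode(node):
    tag_id, children = node
    out = struct.pack("<HH", tag_id, len(children))
    for child in children:
        out += encode(child)
    return out

def decode(buf, offset=0):
    tag_id, n = struct.unpack_from("<HH", buf, offset)
    offset += 4
    children = []
    for _ in range(n):
        child, offset = decode(buf, offset)
        children.append(child)
    return (tag_id, children), offset

tree = (1, [(2, []), (3, [(4, [])])])
blob = encode(tree)
decoded, _ = decode(blob)
print(len(blob), decoded == tree)
```

Four nodes fit in 16 bytes here; the equivalent XML is several times that before you even start tokenizing it.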

So parsing in general is actually quite easy.

You end up doing a bunch of string operations. Those aren't quick. Most likely you drag in some library written by a Computer Science damaged 'engineer' who doesn't understand assembler or how to read a hexdump and so it will be a lot less efficient than that.

Re:Why is XML so popular (1)

DaleGlass (1068434) | more than 6 years ago | (#22344246)

Here you go: http://en.wikipedia.org/wiki/Efficient_XML_Interchange [wikipedia.org]

I like this way much more than coming up with something new because it means I'd be able to keep my XML generating shell scripts, and just filter the output through a text to binary converter.

Re:Why is XML so popular (2, Insightful)

InlawBiker (1124825) | more than 6 years ago | (#22342662)

Because everyone said "XML is the future." And because it has an "X" it was perceived as shiny and cool. So therefore all managers and inexperienced developers jumped all over it. Now I have to encapsulate a few bytes into a huge XML message and reverse it on incoming messages, when I could have just said "name=value" and been done with it. I can see a use for XML in some applications, but it's been dreadfully overused.

Much Ado About Nothing... (3, Interesting)

milsoRgen (1016505) | more than 6 years ago | (#22342572)

FTA:

Netscape's dream of replacing the operating system with a browser is also coming true this year.

They've been saying that for years, and frankly it won't happen. A vast amount of users relish the control that having software stored and run locally provides. Of course there will always be exceptions as web based e-mail has shown us.

As far as the future of XML... I can't seem to find anything in this article that states anything more than the obvious, it's on the same path it's been on for quite some time.

FTA:

Success or failure, XML was intended for publishing: books, manuals, and--most important--Web pages.

Is that news to anyone? My understanding of XML is that it's intended use is to provide information, about the information.

Re:Much Ado About Nothing... (0)

Anonymous Coward | more than 6 years ago | (#22343078)

A vast amount of users relish the control that having software stored and run locally provides.
Web apps can be installed and run locally. Google also has some deals where you can get a box that runs their software all ready to go.

Neither is an ideal solution, but they do address your single argument against all apps moving to browser delivery.

Re:Much Ado About Nothing... (0)

Anonymous Coward | more than 6 years ago | (#22343902)

Keep the Web in the browser, please. [blogsavy.com]
http://pinderkent.blogsavy.com/archives/81 [blogsavy.com]

I was reading today about Pyro Desktop. As the Pyro homepage states: Pyro is a new kind of desktop environment for Linux built on Mozilla Firefox. Its goal is to enable true integration between the Web and modern desktop computing. By merging the Web with the desktop, Pyro offers the first big step toward a new future for the Web and the applications built for it.

This sort of desktop integration makes me feel uneasy. The first problem I see with it is that it's unnecessary. Current web browsers work just fine as they are, for the most part. Some of them could be slimmed down somewhat, but ones like Opera and Konqueror function quite well. Konqueror, for instance, integrates well into the entire KDE desktop environment without being obtrusive.

The second problem I see is that it promotes bad habits. In fact, this second problem may be the most significant one. Directly under the "Imagine..." line of the page, we see the following: Single programming environment for the whole desktop. Now that makes me feel very uneasy.

Time and time again the browser has been shown to be an inadequate application development platform. That hasn't stopped various people, groups and companies from putting together rather complex Web-based software products. However, one common trend we find with such applications is that they pale in comparison to native desktop applications written in a language such as C++. Developing a reliable, quality Web-based application is often more time consuming than developing a similar application using C++, Visual Basic, or Java.

I haven't been very impressed with most of the Web-based applications I've used so far. The native equivalents have essentially always been far more reliable, performant, and enjoyable to use. So the last thing I'd like to see is more Web-based apps, and fewer native apps. I'd much rather see it go the other way, with more apps written using languages like Python and Ruby, and making use of native GUI toolkits like GTK+ and Qt.

I don't see myself using this sort of software. It seems more like a step backwards than a step forwards. The Web is best suited to a browser. The desktop should remain a place for native applications.

The thing with XML (2, Interesting)

KevMar (471257) | more than 6 years ago | (#22342606)

The thing with XML is that it so easily supports whatever design the developer can think of. Even the really bad ones. Now that it is such a buzzword, the problem gets worse.

I had someone call me up to design them a simple web app. But he wanted it coded in XML because he thought that was the technology he wanted. His Access database was not web friendly enough.

I did correct him a little to put him in check and at least gave him the right buzzwords to use on the next guy.

I think XML is dead simple to use if used correctly. I do like it much better than ini files. That is about all I use it for now: easy to use config files that others have to use.

Re:The thing with XML (0)

Anonymous Coward | more than 6 years ago | (#22343334)

I'm a freelance software engineer and have been making money from xml, xml schema, xpath, xsl, xwhatever, wsdl, soap for years. But for some reason, I feel cynical, almost a swindler sometimes (but if this is what they want...)
Is it just me? Are there xml engineers who actually believe in those technologies with regard to their costs? Are there xml fans?

Too many 'this stuff sucks' moments (5, Interesting)

Bryan Ischo (893) | more than 6 years ago | (#22342850)

I have had far too many 'this stuff sucks' moments with XML to ever consider using it in any capacity where it is not forced upon me (which unfortunately, it is, with great frequency).

I first heard about XML years ago when it was new, and just the concept sucked to me. A markup language based on the ugly and unwieldy syntax of SGML (from which HTML derives)? We don't need more SGML-alikes, we need fewer, was my thought. This stuff sucks.

Then a while later I actually had to use XML. I read up on its design and features and thought, OK well at least the cool bit is that it has DTDs to describe the specifics of a domain of XML. But then I found out that DTDs are incomplete to the extreme, unable to properly specify large parts of what one should be able to specify with it. And on top of that, DTDs don't even use XML syntax - what the hell? This stuff sucks.

I then found that there were several competing specifications for XML-based replacements for the DTD syntax, and none were well-accepted enough to be considered the standard. So I realized that there was going to be no way to avoid fragmentation and incompatibility in XML schemas. This stuff sucks.

I spent some time reading through supposedly 'human readable' XML documents, and writing some. Both reading and writing XML is incredibly nonsuccinct, error-prone, and time consuming. This stuff sucks.

Finally I had to write some code to read in XML documents and operate on them. I searched around for freely available software libraries that would take care of parsing the XML documents for me. I had to read up on the 'SAX' and 'DOM' models of XML parsing. Both are ridiculously primitive and difficult to work with. This stuff sucks.

Of course I found the most widely distributed, and probably widely used, free XML parser (using the SAX style), expat. It is not re-entrant, because XML syntax is so ridiculously and overly complex that people don't even bother to write re-entrant parsers for it. So you have to dedicate a thread to keeping the stack state for the parser, or read the whole document in one big buffer and pass it to the parser. XML is so unwieldy and stupid that even the best freely available implementations of parsers are lame. This stuff sucks.

Then I got bitten by numerous bugs that occurred because XML has such weak syntax; you can't easily limit the size of elements in a document, for example, either in the DTD (or XML schema replacement) or expat. You just gotta accept that the parser could blow up your program if someone feeds it bad data, because the parser writers couldn't be bothered to put any kind of controls in on this, probably because they were 'thinking XML style', which basically means, not thinking much at all. This stuff sucks.

Finally, my application had poor performance because XML is so slow and bloated to read in as a wire protocol. This stuff sucks.

XML sucks in so many different ways, it's amazing. In fact I cannot think of a single thing that XML does well, or a single aspect of it that couldn't have been better planned from the beginning. I blame the creators of XML, who obviously didn't really have much of a clue.

In summary - XML sucks, and I refuse to use it, and actively fight against it every opportunity I get.

Re:Too many 'this stuff sucks' moments (0)

Anonymous Coward | more than 6 years ago | (#22342994)

You, sir, are now my hero.

Re:Too many 'this stuff sucks' moments (2, Insightful)

Belial6 (794905) | more than 6 years ago | (#22343004)

I'm not saying that XML is the end all be all, but if your application blows up because someone fed it bad data in XML, your program is broken, and no data format is going to fix it. As the developer, it is your responsibility to vet the data before trying to use it.

Re:Too many 'this stuff sucks' moments (1)

tjstork (137384) | more than 6 years ago | (#22343460)

I'm not saying that XML is the end all be all, but if your application blows up because someone fed it bad data in XML, your program is broken, and no data format is going to fix it. As the developer, it is your responsibility to vet the data before trying to use it.

As long as you guys want to fit the bill for supporting that shoddy format, go right ahead!

interoperability is overrated.

Re:Too many 'this stuff sucks' moments (1)

Bryan Ischo (893) | more than 6 years ago | (#22343732)

I agree with you 100%. And so the only way to parse XML in a nonbroken way is to write your own XML parser, adding "nonstandard" constraints that prevent your program from blowing up. So you don't get to re-use existing parsing technologies like expat, you have to write everything yourself. This is a direct consequence of the suckfulness of XML, which is so lame that nobody even bothers to write good free parsers for it.

An example of nonstandard constraints you have to put on your parser - DTD doesn't allow you to specify the maximum length that the text of an element can have. So someone could pass you an XML document with a fragment like this:

{100 megabytes of the letter 'a'}

expat does have facilities that let you reject this document - you can put your own checks in place to say 'if I have read more than 100K bytes of data in the address field, stop parsing and issue an error'. But you have to wait for expat to give you 100K of data before you can finally detect that the field is too long and you have to reject the document. This is wasteful. It would be much better if you could tell expat ahead of time, 'the address field has a limit of 100K' and it would do this for you. It would better still if XML format allowed expat to detect that the field was too long before even reading it. But XML doesn't support this.
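In Python's stdlib expat binding, the workaround described above looks roughly like this; the 100K limit, the field, and the accumulate-then-abort strategy are this example's assumptions:

```python
import xml.parsers.expat

LIMIT = 100 * 1024  # bytes of character data we will tolerate

class TooLong(Exception):
    pass

seen = {"bytes": 0}

def char_data(data):
    # expat only hands us the text *after* reading it, possibly in chunks;
    # we can abort further parsing, but the bytes were already consumed.
    seen["bytes"] += len(data)
    if seen["bytes"] > LIMIT:
        raise TooLong("element text exceeds limit")

parser = xml.parsers.expat.ParserCreate()
parser.CharacterDataHandler = char_data

try:
    parser.Parse("<address>" + "a" * (LIMIT + 1) + "</address>", True)
    rejected = False
except TooLong:
    rejected = True
print(rejected)
```

Note that the check fires only after the limit's worth of data has been fed through the parser, which is exactly the inefficiency the parent describes.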

What about the maximum length of element names? I think expat may have an inbuilt limit; if not, it will choke and kill your program when it encounters something like this:

hello

It will try to read 100 MB to get the element name. Bogus. Like I said, expat may have some inbuilt protection from this, but it's technically correct XML since XML doesn't limit element names at all, and expat is technically broken because it rejects valid XML. It's only because XML is so lame that a reasonable limitation to avoid your program blowing up results in a nonconforming parser.

Re:Too many 'this stuff sucks' moments (1)

Bryan Ischo (893) | more than 6 years ago | (#22344180)

Whoops, I forgot to properly escape my example text.  The first example was supposed to be:

<foo>{100 megabytes of the letter 'a'}</foo>

And the second was supposed to be:

<{100 MB of 'a'}>hello</foo>

Re:Too many 'this stuff sucks' moments (3, Insightful)

QRDeNameland (873957) | more than 6 years ago | (#22343016)

Too bad I used up all my mod points earlier...this post deserves a +1 Insightful.

I was just a neophyte developer when XML first surfaced in buzzword bingo, but it was the beginning of my realization of how to recognize a "Kool-aid" technology: if the people who espouse a technology can not give you a simple explanation of what it is and why it's good, they are probably "drinking the "Kool-aid".

Unfortunately, I also have since discovered the unsettling corollary: you will have it forced down your throat anyway.

Re:Too many 'this stuff sucks' moments (1, Funny)

Anonymous Coward | more than 6 years ago | (#22343018)

So, tell us, what do you really think about XML? Does it suck? :p

XML doesn't suck! (0)

Anonymous Coward | more than 6 years ago | (#22343178)

The problem with XML is it was promoted as the answer to everything when the reality is that it fills a niche for marking up textual documents and data.

Management hear this web2.0 thing is really cool and want to convert our database to one big nested JSON structure. The SQL dump is 30GB, can you have the new system in production by next week?


Would you blame JSON for that or is it simply a case of selecting the right format for the job?

Re:XML doesn't suck! (1)

Bryan Ischo (893) | more than 6 years ago | (#22343782)

I agree with part of what you are saying; definitely there is no single technology that is applicable in all situations, as your JSON example shows.

However, I believe that XML isn't even good for marking up textual documents and data. It would be faster, smaller, and less error prone for computers if it were an intelligently defined binary format. It would be easier for humans to read and write as a non-SGML-based format. I think the correct thing that XML should have been is a format which has both a binary and human-readable form, where each form is tailored specifically to the domain in which it is defined (binary form being fast, easy, and safe for computers to parse and emit, and human-readable form being easy for humans to read and write), and each form can be converted to the other simply and easily.

Re:Too many 'this stuff sucks' moments (1)

cpeterso (19082) | more than 6 years ago | (#22343190)

But what is the alternative?

Re:Too many 'this stuff sucks' moments (1)

frank_adrian314159 (469671) | more than 6 years ago | (#22344188)

But what is the alternative?

Use Lisp and s-expressions.

Re:Too many 'this stuff sucks' moments (1)

pla (258480) | more than 6 years ago | (#22343240)

I searched around for freely available software libraries that would take care of parsing the XML documents for me.

Not "free", but believe it or not, .NET actually has pretty decent XML support... Except as you point out:



Then I got bitten by numerous bugs that occurred because XML has such weak syntax

Based on the exhibited behavior, I suspect virtually all programs that parse XML use SelectSingleNode() (or comparable). And there we have a problem, in that XML itself doesn't require node uniqueness, but most programs will break otherwise.

Re:Too many 'this stuff sucks' moments (2, Interesting)

jma05 (897351) | more than 6 years ago | (#22343530)

I just am adding finishing touches for a several year long project where I was bitten by XML (My problems were with schema namespace support in libraries at the time). I had to resort to non-standard hacks.

While I share your disdain (and I agree with everyone of your points), the question is this - What other *standard* way do we have to describe a format that has *moderate to high* level of complexity. JSON is great when I don't need to apply any constraints on the data. I would gladly choose it (along with YAML and other choices) for all my simpler needs. But we need some format that is more general purpose.

Binary formats are efficient on the wire. I will gladly take CORBA over SOAP. But if I was to define a binary protocol for my own purposes, do you think I can? The effort required to model, inspect, migrate, provide language support for, and describe for others to understand is huge without the kind of tools that we have with XML today. We need some/any standard means to do this, and whether you and I like it or not, XML is the only choice now. Hopefully, we will see better ways of representing information. But till then, XML, warts and all, is here to stay.

Re:Too many 'this stuff sucks' moments (1)

Bryan Ischo (893) | more than 6 years ago | (#22343830)

Actually I have been working on this problem off and on for a couple of years. I wrote a description of a binary format which could encode any hierarchical data structure, and had all of the features of XML (that I know of) while being fast and safe for computers to parse and emit. I also had a re-entrant parser that could read a document byte by byte if necessary and properly managed its state (allowing the programmer to drive the parser using for example a select loop with sockets). It worked really well.

But then I decided that I could break the problem up into a couple of subproblems that were more elegant to solve individually than together as part of a larger system. I'm still working on that, but maybe I'll have it done soon.

I framed my initial implementation as a way to convert C++ objects to binary form and serialize/deserialize them, without any programmer intervention other than running a build-time tool to generate some extra C++ syntax to be compiled with one's program and thus enable the program to serialize and deserialize any C++ object that it defined, fast, safe, and easy. I have split that up now into two parts: one to generate full-featured reflection information for C++ data types, and the second to take that information and drive a serialization/deserialization engine using it. I completed the first part, but ran out of steam on the second part. I think I'll be getting back to that soon and maybe in a couple of months I'll have it done.

The first part, if you are curious, is at:

http://www.ischo.com/xrtti [ischo.com]

Re:Too many 'this stuff sucks' moments (1)

cjonslashdot (904508) | more than 6 years ago | (#22343650)

How refreshing to see so many people arguing intelligently that "the Emperor has no clothes". Indeed, XML is a terrible solution to almost all of the uses it is put to. It does solve the parsing problem, in that the syntax is pre-defined, but it is too bad that we don't have a better "standard syntax" or easier to use parser toolkit standard. I used to write compilers many, many years ago and the tools were designed for gurus. XML came along because it made the job of parsing easy, and XML messaging caught on because it was ascii and could be used to trick firewalls by pushing messages through HTTP, which was intended for browsers.

But I agree: XML actually sucks. And it is a very, very poor choice for a messaging specification syntax, for the many reasons stated by others. Incredibly, Web services and SOA are built around XML: so I conclude that their days are numbered, since the very underpinning is so defective by design. But then again, people said that about Windows...

Someone here asked what the alternative is. There are many levels to the answer: at one level, someone should create something better, something that makes a clean break from XML. But for those who simply want an existing tool to use for an application, I would suggest that if you are creating a persistent file, use a persistence API - don't write XML; and if you are doing messaging or remote procedure calls, use a native protocol - don't define your message interface in WSDL (because it is the WORST interface approach there is); but if you must define a document syntax of some kind, consider XML, because it is actually effective (although cumbersome) for that, and there is nothing better currently.

And above all, be willing to say that the Emperor has no clothes: expose the many fragmented and overlapping XML standards for what they are: a disorganized, jumbled junk heap of overly, overly, overly complex make-work. What we really need is a small number of specs designed by a small number of smart individuals, all designed to work together with the minimum of complexity instead of the maximum. And I for one am tired of the ever increasing complexity of HTML and the consequent browsers that must have a 100Mb footprint. Indeed, this stuff sucks.

Re:Too many 'this stuff sucks' moments (2, Insightful)

Anonymous Coward | more than 6 years ago | (#22343692)

Wow...that's a lot of FUD to fit into one single post.

To pick just a few of your actual points...

So you have to dedicate a thread to keeping the stack state for the parser, or read the whole document in one big buffer and pass it to the parser.
Why on earth would you use a separate thread? SAX callbacks allow you ample opportunity to maintain whatever state you need, and DOM parsers cache the entire thing into a hierarchy that you can navigate to avoid having to maintain any state of your own. Granted, the uses for DOM parsers are few and far between, but SAX parsers are quite simple to write, don't require an extra thread, and can be done on-the-fly to avoid having to load the entire document at any time. I'm not sure how you get "primitive and difficult to work with" since it rarely takes me more than a few minutes to write a SAX parser, and XML binding technologies (e.g. JAXB, Castor, etc.) make it even easier.
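For illustration, a minimal SAX handler using Python's stdlib takes only a few lines (the document and element names are invented):

```python
import xml.sax

# Collect the text of every <title> element; state lives in the handler object,
# no extra thread required.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def startElement(self, name, attrs):
        if name == "title":
            self.in_title = True
            self.titles.append("")

    def characters(self, content):
        if self.in_title:
            self.titles[-1] += content  # text may arrive in chunks

    def endElement(self, name):
        if name == "title":
            self.in_title = False

handler = TitleHandler()
xml.sax.parseString(b"<books><title>One</title><title>Two</title></books>", handler)
print(handler.titles)
```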

...you can't easily limit the size of elements in a document, for example...
That's entirely incorrect. See the following snippet:

    <element name="myelement">
        <simpleType>
            <restriction base="token">
                <maxLength value="20"/>
                <minLength value="1"/>
            </restriction>
        </simpleType>
    </element>


Somewhat verbose, yes. But it's not particularly difficult to learn the syntax. And then a value that's not within the expected bounds will now result in a validation error.

You just gotta accept that the parser could blow up your program if someone feeds it bad data.
If you apply an XSD validation, and your XSD is somewhat complete, you can be pretty sure that anything that makes it past the parser will be of acceptable length or even has an acceptable value (in the case of enumerations and unique constraints). This is all pretty simple stuff and it's not difficult to do. Compare this with writing your own parser where you have to do all validation of the format yourself. How is that easier again?

Finally, my application had poor performance because XML is so slow and bloated to read in as a wire protocol.
When was the last time you tried it, 1995? Nowadays, compression algorithms require so little processing power that XML adds only a minimal amount of overhead when transfered over the wire.
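That claim is easy to spot-check in Python with the stdlib zlib module (the sample document is invented; real-world ratios vary with the data):

```python
import zlib

# Repetitive tag structure compresses extremely well.
xml_doc = "<items>" + "<item><name>widget</name></item>" * 1000 + "</items>"
raw = xml_doc.encode()
packed = zlib.compress(raw)

print(len(raw), len(packed))
```

On highly repetitive markup like this, the compressed form is a small fraction of the original, which is why gzip on the wire blunts much of the size argument.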

In summary - XML sucks, and I refuse to use it, and actively fight against it every opportunity I get.
Quite frankly, your position makes about as much sense to me as the people who advocate using XML for everything (XSL-T creators, I'm looking in your direction). XML is a tool, and a useful one at that. In some situations it's very helpful and in others it's not. It's not hard to tell the difference. If you need hierarchically-structured data that's easy to parse, XML is great. It's simple to write parsers and the available toolset handles a lot of what you'd otherwise have to do manually.

I'm actually curious what an XML-luddite like yourself would advocate instead. Give me an example of another technology that allows me to represent hierarchical data, has a standardized validation format so I can allow others to create data in a format I specify, and uses an existing parser that takes me almost no time to integrate with. What would you recommend I use when presented with this all-too-common scenario?

Re:Too many 'this stuff sucks' moments (1)

Bryan Ischo (893) | more than 6 years ago | (#22344152)

It's been a couple of years (like, 2 or 3) since I wrote the code to which I am referring, or had the experiences to which I am alluding. So my memory of details is fuzzy, and I may have missed the mark on some of it because I may be misremembering. However, I do very clearly recall that, when I was in the thick of my XML efforts, and had a clear idea of what the problems were, that I had many 'this stuff sucks' moments like the ones I described. Maybe the details are a little off, but the point remains, that I ran into problem after problem that seemed to me to be inherent in XML, or to be so cumbersome as to cause me no surprise that there was no really good freely available parser for XML at that time (and maybe still isn't, I don't know).

As far as the separate thread thing, I guess I am definitely misremembering. You are right, you just 'feed' data into the SAX parser (in my case, expat) and it makes some number of callbacks to you as it parses that bit. I just checked a little bit of code I wrote a while ago and this is just what it does when it uses the expat parser, reads chunks out of a file and passes it to expat, and handles the callbacks as expat makes them. So I am wrong in my previous assertion, for sure. I think I might be confusing this with some DOM implementation I had seen where you *did* have to feed the document all at once.

Your example for how to limit the size of elements in a document is not using DTD, which at the time that I was working with XML was the standard way to describe a specific XML syntax. I guess you're using XSD since you mention that later on? Is XSD a standard now, superseding DTD? If not, then what you are really saying is that *XSD-aware parsers* let you limit the number of bytes in an element, not that the XML format itself does. Which may seem like a minor difference, but it is a difference that makes my point: XML is so lame that basic functionality like limiting field sizes had to be done in an update to the spec (XSD) or, if XSD isn't really part of the XML standard, in a non-standardized extension.

Also, how does the XSD-compatible parser know that a field it is parsing exceeds the length restriction you defined? If the limit is N, doesn't it have to read in N bytes (and parse it), and detect that there are still more bytes coming, before it can reject that document? It's a minor point, but it is inefficient.

Also your "somewhat verbose, yes" comment is just too funny. I don't think it's possible to write any XML at all without having that be true ...

You asked when the last time I tried XML was, and saw poor performance? Like I said, a couple of years ago. It was not unacceptably slow performance, but it was clearly worse than necessary and also used more memory than necessary (this part being significantly more important on the memory limited platform I was working on). Also, I have used the yum tool on Linux, and as I understand it, part of the reason that it's insanely slow is that it stores too much in XML and the parsing thereof takes a long time when it starts up. That's just anecdotal evidence though, and probably simplifies the reason for yum's slowness.

In terms of what I would recommend ... I would recommend re-evaluating whether or not XML really gives you any of the things that you say that you want. I can tell you with fair certainty that it does not give you the 'takes almost no time to integrate with' part. Technologies that allow representation of hierarchical data are a dime a dozen. And standardized validation formats are only really necessary with overly complex beasts like XML.

Re:Too many 'this stuff sucks' moments (1)

ad0gg (594412) | more than 6 years ago | (#22343800)

I spent some time reading through supposedly 'human readable' XML documents, and writing some. Both reading and writing XML is incredibly nonsuccinct, error-prone, and time consuming. This stuff sucks.

There are so many more readable formats, like JSON. Or just using byte offsets. Hell, we could be using pipe-delimited data.
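To make the verbosity and readability comparison concrete, here is a small sketch (the record is hypothetical) pushing the same data through Python's stdlib `json` and `ElementTree` modules:

```python
import json
import xml.etree.ElementTree as ET

record = {"user": "alice", "id": 42}

# JSON: one call, compact, and typed (42 stays a number)
as_json = json.dumps(record)

# XML: build elements by hand; every value becomes text
root = ET.Element("record")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)  # {"user": "alice", "id": 42}
print(as_xml)   # <record><user>alice</user><id>42</id></record>

# Round-tripping through XML loses the type information
assert json.loads(as_json)["id"] == 42              # still an int
assert ET.fromstring(as_xml).find("id").text == "42"  # now a string
```

The type-loss point is a real difference, not just a style one: an XML consumer needs a schema (or out-of-band knowledge) to turn "42" back into a number.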

A Buzzword's Life (4, Funny)

kbob88 (951258) | more than 6 years ago | (#22342962)

The future of XML?

Probably a long, healthy life in a big house on the top of buzzword hill, funded by many glowing articles in magazines like InformationWeek and CIO, and 'research papers' by Gartner. Sitting on the porch yelling, "Get off my lawn!" to upstarts like SOA, AJAX, and VOIP. Hanging out watching tube with cousin HTML and poor cousin SGML. Trying to keep JSON and YAML from breaking in and swiping his stuff. Then fading into that same retirement community that housed such oldsters as EDI, VMS, SNA, CICS, RISC, etc.

We're stuck with XML for a loooong time (2, Interesting)

MasterC (70492) | more than 6 years ago | (#22342980)

XML is easy to understand because of the prevalence of HTML knowledge. XML is easy because it's text. XML is easy because, like perl, you can store the same thing in 15 ways. XML is easy because there is only one data type: text. XML is flexible because you can nest to your heart's content.

All these things are why people use it.

All these things are why people abuse it.

All these things are why we won't be able to get rid of it soon.

TFA has nothing to say about the future of XML but the tools to use XML. XQuery and XML databases. Whoopity do. The threshold for getting posted on /. steps down yet another notch. IMHO: if you loathe/hate XML then you should think about a change in career because it's not going away any time soon...
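The "store the same thing in 15 ways" remark above can be shown in two lines: the same record encoded as attributes or as child elements, both legal, both common, and each needing different consumer code (a toy sketch):

```python
import xml.etree.ElementTree as ET

# Same record, two equally legal encodings
a = ET.fromstring('<user name="alice" id="42"/>')
b = ET.fromstring('<user><name>alice</name><id>42</id></user>')

# Consumers must know which shape they were handed
assert a.get("name") == "alice"
assert b.find("name").text == "alice"
```

Multiply that choice across every element in a vocabulary and interoperability depends entirely on the schema, not on XML itself.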

YAML (2, Informative)

CopaceticOpus (965603) | more than 6 years ago | (#22343028)

JSON is lightweight, and yet it remains human readable and editable. XML lets you forget some of the security concerns of JSON, and has the advantage of not being tied to a specific programming language.
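The JSON security concern alluded to here is, presumably, the early practice of parsing JSON in JavaScript with eval(), which executes a hostile document instead of merely reading it. A real parser only builds data; a Python sketch of the distinction:

```python
import json

payload = '{"name": "widget", "price": 9.99}'

# A proper JSON parser only constructs data structures
data = json.loads(payload)
assert data["price"] == 9.99

# Hostile input fails to parse rather than executing anything,
# which is the guarantee eval()-style parsing could not give
try:
    json.loads('alert("owned")')
    parsed = True
except json.JSONDecodeError:
    parsed = False
assert parsed is False
```

With eval() off the table, the "security concern" largely evaporates, which is worth keeping in mind when weighing it against XML.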

If only there was a standardized format that combined these advantages, without all that XML bloat. There is! Try YAML [yaml.org] .

XML's big win is supposed to be its semantics: it tells you not only what data you have, but what sort of data it is. This allows you to create all sorts of dreamy scenarios about computers being able to understand each other and act super intelligently. In reality, it leads to massively bloated XML specifications and protracted fights over what's the best way to describe one's data, but not to any of the magic.

As my all time favorite Slashdot sig said: "XML is like violence: if it doesn't solve your problem, you aren't using enough of it."

Re:YAML (0)

Anonymous Coward | more than 6 years ago | (#22344280)

YAML is just XML without the closing tags. Seriously, it sucks just as bad as XML. Reading the YAML specification sucks donkey balls just like the XML specification (maybe worse).

XML is what people use, it's here to stay. All the other choices suck too, pick the one most people use.

Emperor's New Clothes (0)

Anonymous Coward | more than 6 years ago | (#22343096)

XML is something that programs can use to do stuff for themselves, like writing and reading their config files, or a game or an MP3 player or, god forbid, a spreadsheet encoding its UI layout in XML so that any end user can come along and play with it.

What XML is not, and never will be, is a standard way for different programs written by different people to pass data around for an extended amount of time. It's a nice theory, but the key ingredient that would make it work in the long run will always be missing: People would have to settle on something and then stop screwing with it. That's never going to happen. Programmers and companies are constantly reinventing data structures and programs because they either a) get bored, or b) want more money. When either a) or b) happens, you get a new programming language or a new version of an operating system or a new whatever, and it's not compatible with the stuff that came before, no matter how much incompatibility tolerance you attempt to design into something.

Before XML fanboys start squeaking that XML will solve all of that, I'll just say no, XML won't solve any of that, no more than Object Oriented Programming guaranteed that we'd never have to write another string class again. I've heard it all before. What each new generation will keep doing is throwing away investments and frittering away their lives relearning and reinventing ways to execute instructions on processors.

Based on the fact that it's text... (1)

Aphrika (756248) | more than 6 years ago | (#22343110)

Doesn't that mean I can use it until um... er... text runs out?

It's not rocket science - MS were using it in MediaPlayer long before EkksEmmEll came along... it was called "sticking your crap in angle brackets and parsing it" - HTML is an application of SGML and I'm pretty sure that it (in its XHTML form) will be around for a while yet.

How does that die out? Just because you give it a name and rant about standards in some poxy white paper/media blag doesn't mean it's going to die and go away...

XML tables (1)

mugnyte (203225) | more than 6 years ago | (#22343192)

ok true story.

  We once had to port live data from Texas to Oregon from giant tables repeatedly, not too well built. So we looked to send XML, enforcing a DTD/schema on the sender teams. We ended up writing the encoders because we used an early and crude compression scheme:

  We took the source table and counted the number of duplicate sets per column, then returned sorted data in order of highest duplicates to lowest.
  Then, we encoded in XML using a column, then row order. Scanning downward, if rows duplicated > 2x, we merely attributed the R tag with repeats="xx"
  Additionally, if the sender detected non-duplicate columnar data, but could find repeating sets, there were more attributes for that.
  This could be expanded into sequences, dictionary schemes, etc. This was 1998 and all the world was XML. I'm glad this concept died with web services, bandwidth, etc.

  Essentially, the transfer of all this data as XML was interesting and fast(er) but didn't sacrifice the universality of being able to generate ascii-based text files from multiple (and divergent) sources. I came to see how XML could in fact be an OK choice for some communication formats.

  Overall, it was a pleasant experience - but I've had too many others where folks wanted to shove XML into serialized objects for some strange reason (with no need for communication or human-readability).
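A toy reconstruction of the run-length scheme described above; the `repeats` attribute comes from the comment, and everything else (tag names, data) is assumed for illustration:

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Rows already sorted so duplicates are adjacent, as in the original scheme
rows = [("TX", "crude"), ("TX", "crude"), ("TX", "crude"), ("OR", "timber")]

table = ET.Element("table")
for row, run in groupby(rows):
    count = len(list(run))
    if count > 2:
        # Collapse the run into one <R> tag carrying a repeats attribute
        r = ET.SubElement(table, "R", repeats=str(count))
        for value in row:
            ET.SubElement(r, "C").text = value
    else:
        # Short runs are cheaper to emit verbatim
        for _ in range(count):
            r = ET.SubElement(table, "R")
            for value in row:
                ET.SubElement(r, "C").text = value

print(ET.tostring(table, encoding="unicode"))
```

The decoder simply expands any `repeats="n"` row n times, so the wire format stays plain ASCII text that any XML parser can read.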

XML needs to be easier to read (1)

revealingheart (1213834) | more than 6 years ago | (#22343326)

While helping maintain work on an old game's source code, the thorny problem of which data storage format to use cropped up, to replace the inflexible fixed-reference one in use. There are two main developers: one wants XML for flexibility, the other a binary format for speed and size.

XML was the main choice, but maintaining the files is trickier than it appears. It's one of the least "human readable" formats I've seen, far less so than HTML, where you know what each tag intrinsically means. This irks the second developer, who would otherwise have accepted a compromise, and there's a stalemate on the issue.

We have skipped around the issue and instead started work on adding other features and bugfixes. The only alternative I noticed that could work was SweetXML [innig.net] , though setting it up wouldn't be quick.

XML, and its derivatives such as XAML, need to be easier to read and edit before they can become fully fledged on the web.

Graveyard (1)

CaroKann (795685) | more than 6 years ago | (#22343362)

In the long run, we are all dead. XML's future is in the graveyard. Alas, that is probably too much to ask. :(

The only reason why XML caught on is because it... (1)

v(*_*)vvvv (233078) | more than 6 years ago | (#22343432)

The only reason why XML caught on is that it looks like HTML, so all the phony IT execs who have "HTML" on their resumes were able to say they understood it.

Why not S-expressions? (3, Interesting)

corsec67 (627446) | more than 6 years ago | (#22343434)

S-expressions [wikipedia.org] (think the lisp format) are much nicer, more compact, and easier to use than XML, while sharing almost all of the same properties otherwise.

For example:
<tag1>
    <tag2>
      <tag3/>
    </tag2>
</tag1>

becomes:
(tag1
    (tag2
        (tag3)
    )
)
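To make the comparison concrete, a throwaway sketch that renders an ElementTree as an S-expression (structure only; attributes and text are ignored for brevity):

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem):
    """Render element structure as (tag child child ...)."""
    if len(elem) == 0:
        return f"({elem.tag})"
    return f"({elem.tag} " + " ".join(to_sexpr(c) for c in elem) + ")"

doc = ET.fromstring("<tag1><tag2><tag3/></tag2></tag1>")
print(to_sexpr(doc))  # (tag1 (tag2 (tag3)))
```

Going the other direction is just as mechanical, which is part of the commenter's point: the two notations carry the same tree, one with roughly half the delimiter overhead.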

Re:Why not S-expressions? (1)

deadzaphod (699097) | more than 6 years ago | (#22343762)

Good question. I would love to see everyone standardize on S-expressions instead. The only real advantage that I know of that XML has is namespaces, and that could be fixed with a single short RFC describing a standard structure on top of S-expressions instead.

Now that I think about it, I think maybe I'll set up a website after work tonight to promote this idea...

Don't get blindsided by big stuff you can't see (3, Informative)

leighklotz (192300) | more than 6 years ago | (#22343442)

XML has tremendous, huge, giant levels of adoption that dwarf its use as XHTML and in XMLHTTPRequest (AJAX) stuff.
WHATWG's HTML 5 and JSON will have no effect on these other uses. It's just that nobody in hangouts like this sees it.

For example, the entire international banking industry runs on XML Schemas. Here's one such standard: IFX. Look at a few links: http://www.csc.com/industries/banking/news/11490.shtml [csc.com] , http://www.ifxforum.org/home [ifxforum.org]
But there are other XML standards in use in banking.

The petroleum industry is a heavy user of XML. Example: Well Information Transfer Standard Markup Language WITSML (http://www.knowsys.com/ and others).

The list goes on and on, literally, in major, world-wide industry after industry. XML has become like SQL -- it was new, it still has plenty of stuff going on and smart people are working on it, but a new generation of programmers has graduated from high school, and reacts against it. But it's pure folly to think it's going to go away in favor of JSON or tag soup markup.

So yes, success in Facebook applications can make a few grad students drop out of school to market their "stuff," and Google can throw spitballs at Microsoft with a free spreadsheet written in Javascript, but when you get right down to it, do you really think the banking industry, the petroleum industry, and countless others are going to roll over tomorrow and start hacking JSON?

Errrm, folks, what's the big fat hairy deal? (5, Informative)

Qbertino (265505) | more than 6 years ago | (#22343452)

Ok. I've once again seen the full range of XML comments here. From 'cool super technology modern java' to 'OMFG it sucks' right over to 'XML has bad security' - I mean ... WTF? XML is a Data Format Standard. It has about as much to do with IT security as the color of your keyboard.

And for those of you out there who haven't yet noticed: XML sucks because data structure serialisation sucks. It always will. You can't cut open, unravel and string out an n-dimensional net of relations into a 1-dimensional string of bits and bytes without it sucking in one way or the other. It's a, if not THE, classic hard problem in IT. Get over it. It's with XML that we've finally agreed on which way it's supposed to suck. Halle-flippin'-luja! XML is the unified successor to the late-sixties way of badly delimited literals, indifference between variables and values, and flatfile constructs of obscure standards nobody wants. And which are so arcane by today's standards that they are beyond useless (check out AICC if you don't know what I mean). Crappy PLs and config schemas from the dawn of computing.

That's all there is to XML: a universal n-to-1 serialisation standard. Nothing more and nothing less. Calm down.

And as for the headline: Of-f*cking-course it's here to stay. What do you want to change about it (much less 'enhance')? Do you want to start color-coding your data? Talking about the future of XML is almost like talking about the future of the wheel ("Scientists ask: Will it ever get any rounder?"). Give me a break. I'm glad we got it, and I'm actually - for once - grateful to the academic IT community for doing something useful and pushing it. It's universal, can be handled by any class and style of data processing, and when things get rough it's even human readable. What more do you want?

Now if only someone could come up with a replacement for SQL and enforce universal UTF-8 everywhere, we could finally leave the 1960s behind us and shed the last pieces of vintage computing we have to deal with on a daily basis. That's what discussions like these should actually be about.
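The "n-dimensional net into a 1-dimensional string" pain in miniature, using JSON for brevity (the same constraint applies to XML, where the usual workaround is ID/IDREF):

```python
import json

# Two records that reference each other: a cycle, not a tree
a = {"name": "a"}
b = {"name": "b", "peer": a}
a["peer"] = b

# A tree-shaped serialisation cannot represent the cycle directly
try:
    json.dumps(a)
    cycle_ok = True
except ValueError:
    cycle_ok = False
assert cycle_ok is False

# The standard workaround: flatten to ids and reconstruct on read,
# which is exactly the kind of "sucking" the comment describes
flat = {"a": {"name": "a", "peer": "b"}, "b": {"name": "b", "peer": "a"}}
assert json.loads(json.dumps(flat))["a"]["peer"] == "b"
```

Every serialisation format ends up inventing some such reference convention; XML just standardised one set of trade-offs.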

Re:Errrm, folks, what's the big fat hairy deal? (0)

Anonymous Coward | more than 6 years ago | (#22343512)

Sweet jesus, this comment nails this subject so completely. If anyone has mod points, here would be a good place to use them.

Re:Errrm, folks, what's the big fat hairy deal? (1)

Shados (741919) | more than 6 years ago | (#22343562)

Completely agree with you, from the security comment all the way to the "UTF-8 everywhere and be gone with SQL" thing.

Just out of curiosity, have you ever had to work with EDI? Because you sound like someone who probably got burnt by something like that in the past :)

MOD PARENT UP! Re:Errrm, folks, what's the big (2, Insightful)

DIGITAiLor (622965) | more than 6 years ago | (#22344086)


Cheers, Qbertino. This is the best explanation of XML's raison d'etre I have ever heard.

I think what people might hate most is DTDs. That makes sense. Even their creator says they suck. There are many ways around them... Lisp can be one big full-service XML processor. Easily. With a happy ending and no need for the DOM or SAX.

The bottom line is, XML is nothing (literally) until you spec YourML. And most people don't have a need for that! So it seems useless to them. If you are writing markup languages for application spaces that don't have them it's a godsend. And it is leading to improvements and much-needed standardization.

I've never understood why seemingly rational people whine about XML. It's like whining about mathematics. They're like that for a reason; their intrinsic structure provides their utility. It's not some arbitrary syntax decision that you can whine about. Don't like how modulo works or the concept of recursion? It's AXIOMATIC, baby. Don't like close tags? They're there for a reason.

As for a SQL replacement... have you checked out Berkeley DB XML? Have you found anything promising that you like?

Near-death experience with Vista -- ZDNet (0, Offtopic)

Daengbo (523424) | more than 6 years ago | (#22343648)

http://talkback.zdnet.com/5206-10533-0.html?forumID=1&threadID=44087 [zdnet.com]
He killed his laptop and couldn't use his desktop for a couple of hours.

Re:Near-death experience with Vista -- ZDNet (0, Offtopic)

Daengbo (523424) | more than 6 years ago | (#22343678)

Oops! Wrong story. [slashdot.org]

As always (1)

roman_mir (125474) | more than 6 years ago | (#22344104)

Use it if mandated, try to avoid using it for application configuration if possible, try to avoid transformations as much as possible, accept that web services / ajax do make sense in certain situations.

Basically like any tool use where it makes most sense, avoid using it in other cases.

XML is a fad, STEP is the future (5, Interesting)

chip2004 (913568) | more than 6 years ago | (#22344160)

XML tries to make everything fit into a single hierarchy. Most real-world information is made up of graphs of data. ISO STEP provides better readability than XML, a more strongly typed schema mechanism, and a more compact size. Best of all, programs can process and present STEP results incrementally instead of waiting for closing tags, so you can hold gigabytes of information in the same file and seek randomly.

Example:
#10=ORGANIZATION('O0001','LKSoft','company');
#11=PRODUCT_DEFINITION_CONTEXT('part definition',#12,'manufacturing');
#12=APPLICATION_CONTEXT('mechanical design');
#13=APPLICATION_PROTOCOL_DEFINITION('','automotive_design',2003,#12);
#14=PRODUCT_DEFINITION('0',$,#15,#11);
#15=PRODUCT_DEFINITION_FORMATION('1',$,#16);
#16=PRODUCT('A0001','Test Part 1','',(#18));
#17=PRODUCT_RELATED_PRODUCT_CATEGORY('part',$,(#16));
#18=PRODUCT_CONTEXT('',#12,'');
#19=APPLIED_ORGANIZATION_ASSIGNMENT(#10,#20,(#16));
#20=ORGANIZATION_ROLE('id owner');
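A toy reader for the instance lines above, illustrating the incremental-processing claim: each #n=NAME(...) record is self-delimited by its trailing semicolon, so nothing has to wait for a closing tag. (The regex is a simplification; real STEP / ISO 10303-21 parsing is more involved.)

```python
import re

step = """#10=ORGANIZATION('O0001','LKSoft','company');
#12=APPLICATION_CONTEXT('mechanical design');"""

# One record per line; a reader can stop, resume, or seek to any '#'
pattern = re.compile(r"#(\d+)=(\w+)\((.*)\);")
entities = {}
for line in step.splitlines():
    m = pattern.match(line)
    if m:
        entities[int(m.group(1))] = m.group(2)

print(entities)  # {10: 'ORGANIZATION', 12: 'APPLICATION_CONTEXT'}
```

The numeric ids double as the cross-reference mechanism (#16 pointing at #18 and so on), which is how STEP encodes graphs rather than forcing everything into one hierarchy.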