Vendor Neutral File Formats?

Cliff posted more than 9 years ago | from the no-more-lock-in dept.

Data Storage 83

timmyv asks: "I have recently been tasked with developing a corporate wide policy that will standardize all employee created documents on vendor neutral file formats. OASIS is good in theory, but I haven't been able to locate enough concrete examples of policies or implementation schemes that work at a corporate level. Does anyone work at a company where documents can only be saved as RTF, HTML, etc. or have any experience with this type of problem?"

83 comments

RTF (2, Informative)

Uber Banker (655221) | more than 9 years ago | (#11227983)

Isn't vendor neutral.

Re:RTF (0)

Anonymous Coward | more than 9 years ago | (#11229235)

NeXT used it as their standard file format -- it's why TextEdit.app on MacOS X can do rtf. And NeXT wasn't exactly buddy-buddy with Microsoft. Do you know what the hell you're talking about?

Re:RTF (0)

Anonymous Coward | more than 9 years ago | (#11230323)

The fact that a file format is used by a few vendors doesn't make it vendor-neutral.

I work in a nice company (3, Insightful)

ilithiiri (836229) | more than 9 years ago | (#11227989)

and we, unfortunately, use _all_ the formats known to the world.

I've already tried to encourage the adoption of hassle-free formats (rtf, html, TXT, whatever).. they don't pass.

It seems that people simply can't get it.
Unfortunately.

OpenOffice (2, Informative)

saden1 (581102) | more than 9 years ago | (#11227995)

OpenOffice file format is a good start. The format is an open standard. As governments around the world embrace it, companies will ultimately flock to the format.

Re:OpenOffice (2, Interesting)

spud603 (832173) | more than 9 years ago | (#11228037)

Although Microsoft may have successfully killed OOo's format-acceptance in the US by "opening" their office file formats. With the new xml-based word doc's, microsoft may have defined the new standard for text formats in the US. At least it's better than that gobbledy-binary mess they had before..

what are you talking about? (1, Informative)

Anonymous Coward | more than 9 years ago | (#11231287)

MS-office2003 is XML format but that does not mean it is open.

It is restricted by patents, see..
http://news.com.com/Microsoft+seeks+XML-related+patents/2100-1013_3-5146581.html

Re:OpenOffice (3, Insightful)

fm6 (162816) | more than 9 years ago | (#11228380)

Well, that's not exactly "vendor neutral", since only one vendor supports it. Of course, that one vendor is an open-source project, and the format is well-documented XML. So if you want to break out of the Microsoft orbit, it's the obvious first choice.

Re:OpenOffice (1, Informative)

Anonymous Coward | more than 9 years ago | (#11232935)

Well, actually. The OASIS OpenDocument format will be supported by OpenOffice.org, KOffice, and apparently IBM. It is an OASIS standard and by next year it will be an ISO standard. And the EU is thinking of making it the standard format for pan-European government data. Finally, the specification is not controlled by OOo, but by OASIS, a non-profit standards group, and under the blessing of ISO.

So... the format is pretty much vendor neutral.

Cheers,
Daniel Carrera.
OpenOffice.org volunteer.

OASIS Open Document vendor independent (2, Interesting)

SgtChaireBourne (457691) | more than 9 years ago | (#11232904)

Open Document will be interesting to follow.

Like HTML, which surprised people in the 1990's, the OASIS OpenOffice.org file format is indeed vendor independent, though it is now called Open Document [oasis-open.org] . Anyone can use it or develop tools for it without restriction. Even Microsoft is part of the team at OASIS, at least on paper [zdnet.com.au] . And, even if MS doesn't get out of the way, interesting things will happen with Open Document.

So far OASIS Open Document is being used by at least the following:

  • StarOffice
  • OpenOffice.org
  • AbiWord
  • kWord
Unlike MS-WordML, which is encumbered by patents, trade secrets, and difficult licensing issues, OpenDocument is free to use. It also meets the requirements specified in European Interoperability Framework for Pan-European eGovernment Services [eu.int] . It's getting increasing attention:
... the adoption of an OASIS Open Office Standard should be welcomed, and industry actors not currently involved with the OASIS Open Document Format should consider participating in the standardisation process in order to encourage a wider consensus around the format.

--EU Telematics between Administrations Committee, May 24, 2004

Note that the only industry actor not currently involved in the OASIS Open Document Format has been and still is MS. MS is still trying to shoehorn old MS-Office 97 customers into DRM'd MS-Office 2003, which functions in effect like a roach motel for your data. So far the worst insult that Ballmer and Gates can cough up is that OpenOffice.org (OOo) is like MS-Office 97. However, I think even those two can see that OOo meets this group's functional requirements quite well, and is free and multiplatform. OOo is also available in more languages than MS-Office, handles long documents better, and does better with styles and stylesheets.

Currently, there are many governments moving up to StarOffice or OpenOffice.org for the sake of these formats. Singapore comes to mind first, but there are many, many others that don't necessarily make the mainstream press like Sarpsborg. Likewise, there are many small, medium and large businesses moving along. Some with an axe to grind [com.com] (with good reason ) speak up. However, most are silent until the move is being implemented to keep the goon squad from Redmond from getting in the way.

The current choice:

  • OASIS Open Document --
    1. be able to access your own data indefinitely as XML
    2. and change productivity tools, operating systems and hardware only if and when it suits you
  • MS-WordML --
    1. pay that Redmond tithe indefinitely
    2. and buy new productivity tools, operating systems and hardware when Chairman Bill tells you to
Easy choice. You don't need to be a wizard to see which direction things are going to head.

Derrida’s Ghost (0, Funny)

Anonymous Coward | more than 9 years ago | (#11227999)

Any postmodernist worth his or her salt would tell you there's no such thing as a vendor-neutral file format.

Re:Derrida’s Ghost (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#11228040)

Derrida is very much alive.

There is no ghost [nothingness.rog] .

Wrong question for the task. (2, Interesting)

Rahga (13479) | more than 9 years ago | (#11228049)

"I have recently been tasked with developing a corporate wide policy that will standardize all employee created documents on vendor neutral file formats."

Sorry, but looking at that statement, it seems to me that you are asking the wrong questions. Rather than getting concerned about formats and standards organizations, you should realize that to replace certain formats you will need to improve on open source projects without funding for the development of them. If they say "no" to this, then congratulations, you don't actually have to do this research. Nothing's quite as useless as an unfunded mandate.

Sadly, I'm not sure if this post is meant to be funny.

Nonsense. (1)

jotaeleemeese (303437) | more than 9 years ago | (#11234860)

Where is he mentioning that the applications have to be Open Source ones?

For all applications there are formats that are industry standards and unencumbered by patents (as far as it is possible to ensure this in certain litigious countries).

The knee jerk reaction "boooh! Open Source software is not ready" should be only used when actually Open Source is a necessary part of a requested solution.

Not sure what the question is limited to (4, Insightful)

GoofyBoy (44399) | more than 9 years ago | (#11228079)


There could be a huge number of different files you need. CAD files, images, Powerpoint presentations, complex spreadsheets will all mess up any format you can come up with (eg HTML). How would you even edit some of these things?

Even OpenOffice formats are not vendor neutral, you have only one product out there that really uses it.

Re:Not sure what the question is limited to (2, Informative)

TheRealJFM (671978) | more than 9 years ago | (#11228151)

well KOffice may be adopting this format (if it hasn't already), and StarOffice also uses it (I would consider SO a separate project now, especially at version 2 of OO.o).

also don't forget that it may be made an ISO standard [slashdot.org] .

Re:Not sure what the question is limited to (1)

Mr.Ned (79679) | more than 9 years ago | (#11230260)

Abiword does as well.

Re:OOo file format is open though (1)

pbhj (607776) | more than 9 years ago | (#11228932)

OpenOffice.org format may not be vendor neutral particularly (though like others said, KOffice at least uses it) but it is an open and prevalent format. MS .doc is prevalent but as it's not open then it's not necessarily going to have filters available for it in the future. I think OOo is safer in this respect. Also OOo format is (compressed) xml so can probably be parsed by xml readers (? - I haven't got a clue, really!!).

Re:OOo file format is open though (2, Insightful)

michaela (31955) | more than 9 years ago | (#11231834)

Yep. Just use unzip and you'll get several XML files, among them: content.xml is the document itself, meta.xml is the property sheet info, styles.xml is the stylesheet(s) in use when the document was saved.

After that, you can use your favorite XML widget, such as the XML::Parser [cpan.org] Perl module, to turn it into HTML or other things of your choosing.

Or create an XSLT file and use something like Xalan [apache.org] to format it on the fly.
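
For anyone who'd rather see the unzip-then-parse idea than take it on faith, here is a minimal sketch in Python (standing in for the Perl module mentioned above). It builds a tiny stand-in archive in memory so it runs anywhere; the element names `document` and `p` are made up for illustration, and real OOo files use namespaced tags that you would need to look up in the format documentation.

```python
import html
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_archive():
    """Build a tiny in-memory stand-in for a zipped-XML document.

    The element names are hypothetical; real files use namespaced tags.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(
            "content.xml",
            "<document><p>Hello</p><p>Tom &amp; Jerry</p></document>",
        )
        zf.writestr("meta.xml", "<meta><title>Sample</title></meta>")
        zf.writestr("styles.xml", "<styles/>")
    return buf.getvalue()


def extract_paragraphs(archive_bytes):
    """Open the archive, parse content.xml, return the text of each <p>."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter("p")]


def to_html(paragraphs):
    """Render the extracted paragraphs as escaped HTML <p> elements."""
    return "\n".join("<p>%s</p>" % html.escape(text) for text in paragraphs)


if __name__ == "__main__":
    print(to_html(extract_paragraphs(build_sample_archive())))
```

To try it on a real file, you would replace `build_sample_archive()` with the bytes of the document and adjust the tag names to match what `content.xml` actually contains.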

Gotta love OOo and those open formats!

What's the true question? (1)

crath (80215) | more than 9 years ago | (#11230631)

There could be a huge number of different files you need. CAD files, images, ...

Before starting, try to determine what the true question is. Were you asked to choose something that is truly vendor neutral, or were you asked to choose corporate standards that will interoperate with your customers and suppliers? The first question is *very* difficult to answer; the second one is easily solved (albeit in a non-Slashdot friendly manner).

I will assume the latter question is the true question, and continue my posting based upon that assumption.

For each major document type, determine who needs to be able to read and edit those documents. This question must be answered for your employees as well as your customers and suppliers. Then, choose a file type that is widely used in that community; which may mean standardising on an older version of a particular application.

For example, in the case of word processed documents, MS Word 97 is a very safe, very widely readable (by other applications) format. Newer versions of MS Word can be configured to only create Word97 files, and many other non-MS applications are able to open and edit Word97 files. So, although Word97 format isn't vendor neutral, it is widely interoperable and makes a good corporate standard.

PDF (3, Interesting)

AkaXakA (695610) | more than 9 years ago | (#11228107)

It might sound like Adobe lock-in,
but with PDF Printers (files are printed to pdf's) for Linux [sourceforge.net] and Windows [sourceforge.net] (I assume Mac has it built in), it's a good option for creating documents that'll be displayed everywhere in the same manner.

Re:PDF (2, Informative)

topham (32406) | more than 9 years ago | (#11228605)

Any standard application can print to PDF on a Mac. (running OS X). PDF is inherent to printing. (Very cool, means every program can use the built in viewer for print-preview and what the print-preview shows is what actually prints... unlike certain Microsoft applications under windows)

The only issue with PDF is the tendency to be one-way. But there are programs out there designed to convert PDF documents to other formats.

Re:PDF (2, Informative)

PhlegmMaster (596165) | more than 9 years ago | (#11231371)

The only issue with PDF is the tendency to be one-way. But there are programs out there designed to convert PDF documents to other formats.

There's also -
pdf2txt@adobe.com
pdf2html@adobe.com

Re:PDF (2, Informative)

Zzootnik (179922) | more than 9 years ago | (#11228659)

Yep-For a large part, it Is a lock-in.
My company is standardized (at least for production work) on PDF format, which everything can make. The problem is getting things back out or editing such documents...
It seems that the only truly accurate interpreter is Adobe's Acrobat Software, but it 'just works' for the final output. Converting it to anything else usable doesn't seem to work very well or reliably.
Editing these things is a bit of a pain, but it can be done, and we do for a chunk of the production----> but this is definitely beyond the capabilities of any of the PHB's or the multitudes of Customer reps/etc, so the 90% paperwork and other miscellaneous office/corporate documentation never sees PDF format. A lot of that gets done in MS Word/Excel. Oh well... I find myself idly wondering/dreading what the next version of Acrobat is introducing...

Re:PDF (1)

samael (12612) | more than 9 years ago | (#11228970)

And how do you edit them? PDF editing is a complete nightmare...

Hmmm. (1)

Pig Hogger (10379) | more than 9 years ago | (#11228112)

XML maybe????

Re:Hmmm. (2, Insightful)

fm6 (162816) | more than 9 years ago | (#11228354)

XML isn't a format. It's a language for creating formats. Saying "we'll use XML" is like saying "we'll use an SQL database". It's a step, but only a small one. The big decisions remain.

Re:Hmmm. (1)

TykeClone (668449) | more than 9 years ago | (#11229119)

"we'll use an SQL database"

Make it mauve - I hear it's faster.

Re:Hmmm. (1)

fm6 (162816) | more than 9 years ago | (#11230417)

Mauve is so 90s!

Re:Hmmm. (1)

A.Chwunbee (838021) | more than 8 years ago | (#11237916)

No, if I am rememebering well the jolly fine Dilbert, it is having the most ram.

Re:Hmmm. (3, Insightful)

pauljlucas (529435) | more than 9 years ago | (#11228396)

XML maybe?
XML without a schema (and applications that can understand it) is useless. One needs something like DocBook [oasis-open.org] .

Re:Hmmm. (2)

Phixxation (845373) | more than 9 years ago | (#11245846)

XML without a schema (and applications that can understand it) is useless. One needs something like DocBook

I work at a company that regularly consumes vendor data. We're plagued by a certain unnamed corporate entity's lack of technical knowledge and insistence upon using XML. I don't understand what it is about that format that draws additional users, but it drives me fsckin nuts.

Re:Hmmm. (1)

pauljlucas (529435) | more than 9 years ago | (#11245923)

I don't understand what it is about that format that draws additional users, but it drives me fsckin nuts.
The advantages of XML are that (1) it's plain text and therefore easy to read, and (2) it's easy to parse because all XML is the same, hence you need only one parser. Even if you don't have a schema, you can simply look at the data and pretty much figure it out. Try that with some weird binary format.

But I do agree that there's too much hype around XML.
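
As a rough illustration of the "one parser, figure it out by looking" point, here is a small Python sketch that prints the shape of any well-formed XML without knowing its schema. The sample order document is invented for the example.

```python
import xml.etree.ElementTree as ET


def outline(xml_text):
    """Return one indented line per element: tag name plus attribute names."""
    lines = []

    def walk(elem, depth):
        attrs = " ".join("@" + name for name in sorted(elem.attrib))
        label = elem.tag + (" " + attrs if attrs else "")
        lines.append("  " * depth + label)
        for child in elem:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_text), 0)
    return lines


# Hypothetical vendor data, just to have something to inspect.
SAMPLE = '<order id="7"><item sku="A1"><qty>2</qty></item></order>'

if __name__ == "__main__":
    print("\n".join(outline(SAMPLE)))
```

The same function works on any XML input, which is the practical meaning of "all XML is the same" to a parser. What the tags mean is still something you have to work out yourself.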

postscript/PDF and XML? (3, Interesting)

SHEENmaster (581283) | more than 9 years ago | (#11228434)

XCircuit, a circuit layout app for X, uses postscript as its default format. If you have XCircuit, you can load the postscript file into it and edit it like any other circuit. If not, you can still print it or view it as you would any other postscript file.

XML is a good start, because it's easy for a new app (the fictional YCircuit) to add support for the format, but you are still stuck unable to print it if you don't have the skills to write a conversion script and no one else has written it for you.

Why not combine the two? XML embedded in a standard PDF file would allow any application with support for the creator's XML tagset to import the file, and at the very least those without any similar application could view and print the file.

SVG instead (1)

tepples (727027) | more than 9 years ago | (#11232160)

XML embedded in a standard PDF file would allow any application with support for the creator's XML tagset to import the file, and at the very least those without any similar application could view and print the file.

For a more pure XML solution, it'd be better to embed domain-specific XML data in an SVG document, which Adobe's SVG viewer [adobe.com] can display and print. In fact, it might even be possible to XSLT the XML into SVG.
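
A minimal sketch of that embedding idea, assuming a made-up circuit namespace (`http://example.com/circuit`) and made-up `circuit`/`resistor` elements: the domain data goes inside the SVG `metadata` element, and plain rectangles give a viewer something to draw. An application that knows the tagset can read the data back; one that doesn't can still display the picture.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
CIRCUIT_NS = "http://example.com/circuit"  # hypothetical domain namespace

ET.register_namespace("", SVG_NS)
ET.register_namespace("c", CIRCUIT_NS)


def build_svg(resistors):
    """Build an SVG string with drawable boxes plus embedded domain XML."""
    svg = ET.Element("{%s}svg" % SVG_NS, {"width": "200", "height": "100"})
    meta = ET.SubElement(svg, "{%s}metadata" % SVG_NS)
    circuit = ET.SubElement(meta, "{%s}circuit" % CIRCUIT_NS)
    for i, (name, ohms) in enumerate(resistors):
        # Machine-readable data for applications that know the tagset.
        ET.SubElement(
            circuit,
            "{%s}resistor" % CIRCUIT_NS,
            {"name": name, "ohms": str(ohms)},
        )
        # Something visible for viewers that do not.
        ET.SubElement(
            svg,
            "{%s}rect" % SVG_NS,
            {
                "x": str(10 + 60 * i), "y": "40",
                "width": "40", "height": "20",
                "fill": "none", "stroke": "black",
            },
        )
    return ET.tostring(svg, encoding="unicode")


def read_resistors(svg_text):
    """Recover the embedded domain data from the SVG string."""
    root = ET.fromstring(svg_text)
    return [
        (r.get("name"), int(r.get("ohms")))
        for r in root.iter("{%s}resistor" % CIRCUIT_NS)
    ]
```

Whether a given SVG viewer preserves or ignores the foreign-namespace content is something to check per tool; the sketch only shows that the round trip works at the XML level.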

PDF and the Things That Turn Into It (3, Interesting)

Anonymous Coward | more than 9 years ago | (#11228128)

What you need is a toolchain that allows conversion back and forth between several different types. For example, I could write a short paper in XML, SGML, or LaTeX, and convert any of the three to PDF. I could convert the XML or SGML versions to LaTeX, then use latex2html to turn it into an HTML document. I don't know of converters that turn XML,SGML->HTML, but they probably exist.

The point is that it doesn't matter which method I used to create the document; I can convert any of them into either of the other formats without losing information, and any of the three can be turned into HTML or PDF for display purposes.

You've probably got several different types of documents to mess with. Technical papers with plots, accounting spreadsheets, secretary generated memos, and presentations with pretty pictures so that management can understand what's going on. LaTeX alone could handle all of these situations. Create document types and environments to match the needs of each type of document. XML, being completely generic, could also handle any of the situations, but it's easier to type LaTeX markup than it is XML. There is at least one caveat: you have to be careful what type of images you feed TeX.

Heck, you could use Perl bindings to MS-Excel to snag data out of spreadsheets and export it into a format that some other chart making tool uses. You could use Excel itself to export as CSV files, which you could then use awk to convert into some other format.
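
The CSV leg of that pipeline might look something like this in Python (used here in place of awk). The column names and the choice of JSON as the target format are assumptions for the example, not anything a particular charting tool requires.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text with a header row into a list of plain dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical spreadsheet export.
SAMPLE = "region,units\nNorth,12\nSouth,7\n"

if __name__ == "__main__":
    print(json.dumps(csv_to_records(SAMPLE), indent=2))
```

Note that every value comes back as a string; converting `units` to a number is a decision about the data that the CSV file itself doesn't record.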

Basically, it doesn't matter what tool each person uses, as long as what they export off their own workstation is in a standard format.

Re:PDF and the Things That Turn Into It (2, Insightful)

GoofyBoy (44399) | more than 9 years ago | (#11228480)

Umm... you are moving from a vendor-specific system to in-house expertise-specific system.

Re:PDF and the Things That Turn Into It (0)

Anonymous Coward | more than 9 years ago | (#11229480)

from a vendor-specific system to in-house expertise-specific system.

Sounds good to me, but then again, I'm an engineer. A manager would probably rather be able to blame someone outside of the company.

From a technology standpoint, there aren't any mysteries involved in this. Someone who understands exactly what was needed out of each department in their company, and moderately understood the technologies involved could probably draw up rudimentary LaTeX environments in the same amount of time this fellow has been given to investigate possible policies.

Also, once initial scripts/templates have been drawn up, maintenance is an absolute breeze. Everything has been stored in a format where presentation is separate from content. Altering the look of documents later on is nothing.

Re:PDF and the Things That Turn Into It (1)

GoofyBoy (44399) | more than 9 years ago | (#11230926)

>From a technology standpoint, there aren't any mysteries involved in this. Someone who understands exactly what was needed out of each department in their company, and moderately understood the technologies involved could probably draw up rudimentary LaTeX environments ...

And this isn't a mystery?

Re:PDF and the Things That Turn Into It (1, Insightful)

Anonymous Coward | more than 9 years ago | (#11231198)

And this isn't a mystery?

No. It's a matter of researching documentation.

Re:PDF and the Things That Turn Into It (2, Interesting)

zedkineece (845146) | more than 9 years ago | (#11229184)

I agree with Anonymous Coward. Why not use XML as your standard format? You could use Word 2003 (or even the entire Office 2003 suite) or XMLSpy to author your documents, but store everything in XML. You could then write XSLTs (or hire consultants to develop them, as we did) to convert the XML to whatever format you or your vendors require. One source format to virtually any format you need. It is also somewhat painless to have another XSLT developed when a future format is required, which eliminates the need to do wholesale changes in the future. I highly recommend the company we used, "Docsoft" (http://www.docsoft.com [docsoft.com] I think). They have some smart guys with probably the best support of any vendor we have dealt with. Since implementation, we have discovered a lot of additional "pluses" that we didn't consider, such as using the XML as a DMS (Docsoft has a search tool that indexes XML in relation to how data is tagged, which has turned out to be invaluable to us). We can even store images with XML meta data to find out what the subject and author is. Finding extremely specific data sometimes took us 40 hours; it now takes us 15 minutes or less. All because of XML. Just my 2 cents worth.

Re:PDF and the Things That Turn Into It (2, Informative)

tepples (727027) | more than 9 years ago | (#11232172)

I don't know of converters that turn XML,SGML->HTML, but they probably exist.

The tool to convert from domain-specific XML to XHTML is called XSLT. For more info, Ask Google [google.com] .

Re:PDF and the Things That Turn Into It (1)

iammaxus (683241) | more than 9 years ago | (#11232229)

mod parent down. "XML" and "SGML" are not file formats. They are formats for formats. "I don't know of converters that turn XML,SGML->HTML, but they probably exist." is horribly ridiculous because HTML is an SGML file format. XML based formats are useful because of XSLT as other posters mentioned. You can create an XML format and then automatically convert it into any other XML format (XHTML for one) or even to non XML formats.
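
To make the "convert one XML format into another" idea concrete, here is a sketch in Python. It is not XSLT: Python's standard library has no XSLT processor, so this is a hand-written tree walk standing in for what a stylesheet would do declaratively. The `memo`/`subject`/`para` tagset and the tag mapping are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain tagset mapped onto XHTML element names.
TAG_MAP = {"memo": "div", "subject": "h1", "para": "p", "em": "em"}


def to_xhtml(elem):
    """Recursively copy an element tree, renaming tags via TAG_MAP.

    Unknown tags fall back to <span> so no text is lost.
    """
    out = ET.Element(TAG_MAP.get(elem.tag, "span"))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_xhtml(child))
    return out


SOURCE = "<memo><subject>Budget</subject><para>Due <em>Friday</em>.</para></memo>"

if __name__ == "__main__":
    print(ET.tostring(to_xhtml(ET.fromstring(SOURCE)), encoding="unicode"))
```

A real XSLT stylesheet expresses the same mapping as template rules and can be run by any conforming processor, which is the portability argument being made above.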

Vendor neutral is not always the answer.... (3, Insightful)

Alpha27 (211269) | more than 9 years ago | (#11228134)

The idea of switching applications for people can be a task no one wants to undertake for two reasons.

Comfort level:
It's like having designers switch from Photoshop to The GIMP, or MS Word to OO Writer. Granted, the apps accomplish the same thing, but it's not the *same* program. People will resist the change because they know how to use the first program, and the reason for the change isn't a concern for them.

Dominance:
Going vendor neutral when the majority still use vendor-specific formats requires you to check whether your users rely on vendor-specific features that are not available in the neutral format. If those features aren't there, then what do you do? Write code to compensate for the feature, get plugins, or do nothing if there's nothing you can do? Are there tools that can do as good a job as the old tools in this neutral environment?

It would help more if you stated your case in more detail.

Re:Vendor neutral is not always the answer.... (0)

Anonymous Coward | more than 9 years ago | (#11229176)

the gimp isn't even close to paint shop pro's level yet, comparing it to photoshop makes artists/designers/whatever weep.

Re:Vendor neutral is not always the answer.... (1)

tepples (727027) | more than 9 years ago | (#11232184)

the gimp isn't even close to paint shop pro's level yet

In what way, specifically?

Re:Vendor neutral is not always the answer.... (1)

Gi77 B4t35 (808520) | more than 8 years ago | (#11237927)

comparing it to photoshop makes artists/designers/whatever weep
So does having a little too much chocolate on their iced mochalattefrapinos. Bunch of one-button mouse using ponces.

Re:Vendor neutral is not always the answer.... (1)

amokk (465630) | more than 9 years ago | (#11234737)

I don't really like to bring it down to this level, but your assertion that Photoshop and Gimp accomplish the same thing is entirely wrong.

It is not possible to move from Photoshop to Gimp in many, incredibly common, situations. Assuming one would even want to.

Vendor Neutral? (1)

rueger (210566) | more than 9 years ago | (#11228355)

That seems like kind of an unclear idea. How many vendors do you have, and do they all use the same software in the same fashion?

Unless you have pretty carefully surveyed all of those people you really can't choose one file format over another.

In other words, you're asking the wrong question. Instead of trying to figure out what your employees can standardize on, you will first need to find out what the majority of your vendors have standardized on.

Of course you'll have problems. HTML or PDF are horrible if you're circulating documents that need to be edited or excerpted. And vendors and suppliers will still send you documents in whatever their house file format is.

Really, for this to be effective you need to involve your employees, management, vendors, and probably suppliers in order to get everyone working within the same set of file formats.

Right motivation, wrong question... (2, Insightful)

moreati (119629) | more than 9 years ago | (#11228417)

Avoiding vendor lockin is of course A Good Thing. However, as others have said, there is no format completely vendor neutral - each platform has its own set of unique features that don't translate directly and must be stored somewhere in an extension or custom tag. I'm certain the OASIS/OOo format has a few StarOfficeisms in it.

What matters is that the data you own is readily transformable into a Fully Open and documented format independent of your chosen platform; normally (but not necessarily) this will mean your native format is Fully Open and documented. This includes all data, styling, formatting, metadata and interrelationships. Basically you should be able to quickly jump ship, even if your vendor has been wiped off the earth or there are legal/technical issues preventing you from running the original platform, without loss or 'damage' of any information. There must be at least one other clear route to all your information, completely bypassing the original platform.

As an example .doc would be unsuitable since the format is undocumented and you would be reliant on the correct version of office to correctly and completely read/export it, hence you would depend on Microsoft.

Similarly, prior to its release as open source software, and even immediately after, .sxw would have been unsuitable (even though it was 'just zipped xml'), since OOo/StarOffice were the only way of performing any completely trustworthy export. Now that the format is formally documented and independent tools exist, it is suitable.

There are grey areas such as databases, which have no common datafile format but do expose Fully Open interfaces such as ODBC or JDBC.

With this in mind I would argue that forcing everyone to save documents in 'basic' formats such as HTML and RTF is counterproductive: they lack wide support for features such as styling and precise page layout. Any format will do as long as you can readily, fully & demonstrably extract all your information, independently of the platform that created it.

Alex

"Vendor Neutral"???!!! (4, Insightful)

fm6 (162816) | more than 9 years ago | (#11228499)

You think RTF is "vendor neutral"? It's simply a 7-bit-safe version of Word's native format. There are lots of third-party tools that read and write RTF, but the same is true of Word native. Either way, you run up against all the formatting issues you always get when you're importing and exporting unstructured formats.

HTML is only vendor neutral if you don't use any vendor-specific extensions. So you can't just say, "Everybody save your files as HTML". You also have to forbid anybody using apps (such as Word) that save to a non-standard HTML.
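
If you did adopt a "standard HTML only" rule, you would need some way to check it. Here is a rough Python sketch that flags three telltale signs of Word-exported HTML: a `schemas-microsoft-com` namespace declaration, `mso-` style properties, and namespaced tags such as `<o:p>`. The list of markers is an assumption and certainly incomplete; treat it as a starting point, not a validator.

```python
from html.parser import HTMLParser


class VendorMarkupFinder(HTMLParser):
    """Collect human-readable notes about vendor-specific markup."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # Tags like <o:p> are not part of standard HTML.
        if ":" in tag:
            self.findings.append("namespaced tag <%s>" % tag)
        for name, value in attrs:
            value = value or ""
            if name.startswith("xmlns") and "schemas-microsoft-com" in value:
                self.findings.append("Microsoft namespace on <%s>" % tag)
            if name == "style" and "mso-" in value:
                self.findings.append("mso- style on <%s>" % tag)


def find_vendor_markup(html_text):
    """Return a list of findings; an empty list means none were spotted."""
    finder = VendorMarkupFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


# Invented samples for illustration.
WORD_LIKE = (
    '<html xmlns:o="urn:schemas-microsoft-com:office:office"><body>'
    '<p style="mso-margin-top-alt:auto">Hi<o:p></o:p></p></body></html>'
)
PLAIN = "<html><body><p>Hi</p></body></html>"
```

An empty result only means none of these particular markers were found, not that the file is standards-conformant. For that you would want a real validator.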

In theory, you can create an XML-based format that looks the same in Word, OpenOffice, FrameMaker, and any other XML-aware app. But doing so means designing a schema in extreme nit-picking detail, and writing a lot of transformations to get that XML in and out of all the apps that need to read or write it. It's a lot of work, and nobody does it unless they have a specific application that requires highly-structured information. Like if you have a huge set of technical documentation that you need to update a lot. (I was involved in just such a project -- and the politics of converting all those documents to XML cost me my job.) Or if you have invoices or similar business documents that need to go into or out of a web services app.

But for the big mass of unstructured documents, there just isn't a vendor-neutral solution, and nobody has any real incentive to create one. The solution remains the same: standardize on certain specific applications. Which boils down to using OpenOffice if you hate giving money to Bill and/or want a platform-neutral solution. Otherwise you standardize on Microsoft Office, because it's what everybody knows how to use.

Re:"Vendor Neutral"???!!! (2, Informative)

Anonymous Coward | more than 9 years ago | (#11229323)

You think RTF is "vendor neutral"? It's simply a 7-bit-safe version of Word's native format.
That it is not.

RTF does contain, in theory, sufficient control words to describe everything that Word 2000 can do, but it's hardly a direct translation and things get lost a lot. Furthermore, RTF contains a few control words that Microsoft didn't put there: such as \collapsed (added by NeXT to describe paragraphs that had been hidden by the user).

There are lots of third-party tools that read and write RTF, but the same is true of Word native. Either way, you run up against all the formatting issues you always get when you're importing and exporting unstructured formats.
There is a huge difference. RTF is a formally published, open specification [microsoft.com] and Microsoft openly encourages [microsoft.com] third-party implementations. It's been stable for 5 years now. Word .doc files are a closed spec that Microsoft jealously guards and changes often.
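
For readers who have never looked inside an RTF file, here is a deliberately naive Python sketch that lists the control words (backslash, letters, optional numeric parameter) in a snippet. It ignores escapes such as `\\`, `\{` and `\'hh`, so it is an illustration of the syntax, not a parser; the sample string is a minimal hand-written document.

```python
import re

# A control word: backslash, letters, optional (possibly negative) number.
CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)?")


def control_words(rtf_text):
    """Return (word, parameter) pairs; parameter is None when absent.

    Naive: does not handle escaped backslashes or binary data.
    """
    return [
        (m.group(1), int(m.group(2)) if m.group(2) else None)
        for m in CONTROL_WORD.finditer(rtf_text)
    ]


SAMPLE = r"{\rtf1\ansi\deff0{\fonttbl{\f0 Helvetica;}}\f0\fs24 Hello\par}"

if __name__ == "__main__":
    for word, param in control_words(SAMPLE):
        print(word, param)
```

A reader that meets a control word it doesn't know (for instance one added by another vendor) can skip it and carry on, which is part of why extensions like the one mentioned above could coexist with Microsoft's own.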

Re:"Vendor Neutral"???!!! (1)

fm6 (162816) | more than 9 years ago | (#11230711)

Technically, I suppose you're right. But Microsoft's past attempts to promote RTF as an open format have little practical meaning nowadays. I mean, if an unsuccessful platform [fortunecity.com] is your best example of non-Microsoft development of RTF-based software, it doesn't say much for it as an industry standard. A "standard" technology that only one company fully implements is, for all practical purposes, proprietary.

And although it's easier to find documentation for RTF than for Word native, the latter does exist. You just have to have the right developer's license to see it. I don't know whether products like OpenOffice, AbiWord, and WordPerfect use that documentation, or whether they just reverse-engineer the files. But however they go about it, they don't do any worse a job reading Word native format than reading RTF.

So, yeah, characterizing RTF as "7-bit native" is a slight oversimplification. But not one that really matters to anybody trying to find a neutral format.

Re:"Vendor Neutral"???!!! (0)

Anonymous Coward | more than 9 years ago | (#11231643)

I mean, if an unsuccessful platform is your best example of non-Microsoft development of RTF-based software, it doesn't say much for it as an industry standard.
You do realize who bought NeXT, right? RTF is the primary format for Apple's Cocoa text objects. TextEdit.app is all RTF. Apple's code documentation is all in RTF. RTF is the chief text transfer format for Apple's cut/copy/paste facility.

NEXTSTEP is now Mac OS X (1)

tepples (727027) | more than 9 years ago | (#11232195)

I mean, if an unsuccessful platform is your best example of non-Microsoft development of RTF-based software

Unsuccessful my ass [apple.com] ; learn why [slashdot.org] .

Re:NEXTSTEP is now Mac OS X (1)

fm6 (162816) | more than 8 years ago | (#11239145)

NextStep may be the platform on which OS X was built. (Just as NextStep itself was built on Project Mach.) But OS X is hardly a continuation of NextStep. How many NextStep applications have migrated to OS X?

Because OpenStep is now Cocoa (1)

tepples (727027) | more than 9 years ago | (#11241370)

How many NextStep applications have migrated to OS X?

Depends on whether the developer is still around. Mac OS X implements the Mac OS Toolbox API as "Carbon" and the OpenStep API as "Cocoa". If the developer still has the source code and wants to reach thousands of Mac users, porting starts with a recompile. But if your developer has gone out of business, on the other hand...

Re:NEXTSTEP is now Mac OS X (1)

jbolden (176878) | more than 9 years ago | (#11242745)

It's been 10 years, and NeXTSTEP was primarily a development platform when Apple got it. But if you count apps in OS X like the Dock, Preview, NetInfo,... you get lots. If you count ideas from NeXT that moved to the whole of computing, like WYSIWYG fonts, then even more. The big one which is not OS-y and moved directly is Interface Builder.

Re:NEXTSTEP is now Mac OS X (0)

Anonymous Coward | more than 9 years ago | (#11290377)

Just off the top of my head: OmniWeb, Create! ChartSMITH, Mathematica, PStill. That was with about 15 seconds of thought.

RTF (1)

alexo (9335) | more than 9 years ago | (#11236538)


> RTF does contain, in theory, sufficient control words to describe
> everything that Word 2000 can do, but it's hardly a direct translation and
> things get lost a lot.


What gets lost?
Examples please.

Easy (1, Funny)

Anonymous Coward | more than 9 years ago | (#11228915)

Store everything in giant PNGs.

That is what I do... (1)

Uber Banker (655221) | more than 9 years ago | (#11229174)

...when I get locked PDFs. Just take a screenshot of the document. Easy.

You're asking the wrong questions (4, Insightful)

abb3w (696381) | more than 9 years ago | (#11229026)

The first question is not what, or how; the first question is WHY. As in, why do this? And therefore, is there a better way to achieve this goal?

Are they doing this to save money? to clamp down on the uppity workers? because the CEO got emailed an AppleWorks attachment with no file extension from some Mac user? to avoid the risks of single vendor lock-in?

Many document formats can be converted back-and-forth with some degree of effectiveness. Yes, if you open a document from WordPerfect in Microsoft Office, the word spacing may change a little. However, this happens if you move from a machine connected with a HP4000 printer to a HP2100 printer as well. However, some formats give different feature capabilities; saving from DOC to RTF will cause (as an example) tables to shift about a bit. TXT format is readable by most anything, but the formatting capabilities are nigh nonexistent. (Ooh! Tabs!) While WordPerfect and Word will each open the other's documents, they aren't so good for saving in open formats.

What formats are currently used? Why are they needed? Will everyone need to be able to write to them, or are pay-writer/free-reader combos acceptable? And, *ARE* there any "vendor neutral" formats out there? (For desktop publishing, the real answer is "no". Publisher is a joke, and while Adobe and Quark maintain some import compatibilities, the formats AREN'T neutral.)

For myself, working in a small department, "Let a thousand flowers bloom" is just fine. I accept that I will occasionally get forwarded an e-mail with an attachment that the user can't figure out how to open-- usually Mac/PC file extension name issues solved easily by renaming. Once in a blue moon I have to explain to someone that no, not everyone has FooBarBaz market research organizer, since for most the $800 license cost for it would be more beneficially used for other things, and they will probably need to examine such data files once in their career, if that.

Perhaps a list of universally accepted formats-- that is, formats that must be used for wide distribution-- would be more appropriate, after considering what features are needed in said formats. After all, Photoshop .PSD documents are harder to view outside Photoshop, but far more useful for subtle graphics work than JPEGs.

I suspect you are being sent out on a project inadequately considered. Depending on the pointy-hairyness of the person who assigned it to you, you may find some substantial benefit to reconsidering the ground assumptions.
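
For the "attachment with no file extension" case mentioned above, a small sketch of how a file can often be identified from its first few bytes instead of its name. The signatures listed are the well-known leading bytes of each format; the function names and labels are my own, and the list is deliberately short rather than a complete detector.

```python
# Leading byte signatures for a handful of common formats.
# The list is illustrative only; real-world detection needs far more entries.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"{\\rtf", "RTF"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2"),  # container used by Word .doc, Excel .xls
    (b"PK\x03\x04", "ZIP"),                          # container used by OpenOffice.org files
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
]


def sniff_format(first_bytes):
    """Guess a format label from the first bytes of a file."""
    for signature, label in SIGNATURES:
        if first_bytes.startswith(signature):
            return label
    return "unknown"


def sniff_file(path):
    """Read just enough of a file to compare against the signatures."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16))


if __name__ == "__main__":
    print(sniff_format(b"%PDF-1.4\n"))
    print(sniff_format(b"{\\rtf1\\ansi"))
```

Note that a container label such as OLE2 or ZIP only narrows things down; telling a Word file from an Excel file still means looking inside the container.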

Re:You're asking the wrong questions (1)

RevDobbs (313888) | more than 9 years ago | (#11229348)

Once in a blue moon I have to explain to someone that no, not everyone has FooBarBaz market research organizer, since for most the $800 license cost for it would be more beneficially used for other things, and they will probably need to examine such data files once in their career, if that.

I know it's illegal, but there was a torrent for the latest FooBarBaz on SuprNova just before it got shot down... you may be able to still find it out there.

Re:You're asking the wrong questions (0)

Anonymous Coward | more than 9 years ago | (#11230149)

I know it's illegal, but there was a torrent for the latest FooBarBaz on SuprNova just before it got shot down... you may be able to still find it out there.

Not worth the bother, it's a bad crack with an invalid key, and installs 180Solutions, Bullseye, and a rootkit to boot. :P

Re:You're asking the wrong questions (2, Informative)

timmyv (840681) | more than 9 years ago | (#11229406)

I guess I neglected to mention that the "corporation" I work for is a state government. Therefore Open Standards are essential to allow for:
The types of files we are talking about are essentially textual documents, spreadsheets, databases, etc. 2 of the 3 OOo provides, but I have a pretty good idea of how our user base would respond if we upped and replaced all their MS Office installations with Open Office, or for that matter how our DBAs would respond if we moved entirely to MySQL or MaxDB without a strong policy or incentives.

Re:You're asking the wrong questions (2, Informative)

abb3w (696381) | more than 9 years ago | (#11230248)

Ah, that's a somewhat more clear problem.

For free access to documents by citizens, PDF is pretty good. There are viewers for most platforms (I don't know about BSD or Solaris, but Mac/PC/Linux all are OK); and there are non-Acrobat print-to-PDF knockoffs at economical prices. Requiring PDF publication of all publicly available printed documents in, say, PDFv1.2, PDFv1.3 or PDFv1.4 would be a useful and not overly onerous step. (Adding forms-completion ability to the PDF requirement might well be too much.) The PDF standards are public, although copyrighted.
M$ Office has free viewers for older versions on Windows, but the Mac version isn't native on the current Apple OS, and OpenOffice is the only viewer I know for .DOC under Linux. =)

As far as permanence of data, nothing beats the long term unkillability of a bare TXT file; it also allows improved handicapped accessibility to the data in the process. For databases (w/o queries) and single-page spreadsheets, CSV comma-separated text format is similarly hard to destroy. Most office suites will read such files. For charts and other pictures, JPG may eventually be replaced, but will probably be readable for a long time. Of course, data corruption is always a risk (especially for JPG), so backups should be made redundantly, and be prepared for at least one major media format migration (e.g., CD to DVD-Blue, or whatever). Requiring that any software be able to import from and export to these as relevant would be a reasonable and not overly onerous step.
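
As a rough illustration of the CSV suggestion, here is a Python sketch that dumps a table to comma-separated text and reads it back, using only the standard csv module. The column names and records are invented, and the round trip deliberately includes a field with a comma and one with a quote mark, since those are the cases where hand-rolled exporters tend to go wrong.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialise a header and rows of strings to CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse CSV text back into a list of rows (each a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical records; note the embedded comma and quote characters.
HEADER = ["permit", "applicant", "filed"]
ROWS = [
    ["12", "Smith, J.", "2004-12-30"],
    ["13", 'Acme "Best" Paving', "2004-12-31"],
]

if __name__ == "__main__":
    text = rows_to_csv(HEADER, ROWS)
    print(text)
    print(csv_to_rows(text) == [HEADER] + ROWS)
```

Everything comes back as text: types, formulas, and multiple sheets do not survive, which matches the "databases without queries, single-page spreadsheets" caveat above.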

Security is a more problematic issue. Some documents are meant to be kept non-public, barring (or even given) FOIA requests. Were it in my demesne to do so, I would still require the creation of the files for archive purposes, but storage off-site at a secure abandoned-salt-mine-type facility. Given that Security is oft diametrically opposed to Accessibility and/or Permanence, this may be a problem.

Oh... and PDF has some built-in security features. Requiring them to be used only when such security is mandatory might be worth thinking about.

Re:You're asking the wrong questions (0)

Anonymous Coward | more than 9 years ago | (#11230375)

I think paper documents are the only form that can fully meet these requirements. Even if you use only .TXT files the media those files are written on may be unreadable in a decade or two.

Re:You're asking the wrong questions (0)

Anonymous Coward | more than 9 years ago | (#11235406)

But you have failed to consider the "access" part of the question. Paper implies physical access. It is very, very useful for long-term archiving but not for access.

Re:You're asking the wrong questions (1)

topham (32406) | more than 9 years ago | (#11234802)


I realize a lot of people do not like PDF; but any other format is asking for grief from end-users.

A company I currently do a lot of work for is slowly migrating towards PDF, and each step along the way has been pretty smooth. It's easy enough for the users to understand they 'print to PDF' to make a presentation version of a document.

I don't believe intermediate documents (works in process) should be stored in open formats. Not enough open formats support enough features, you would simply end up with a half dozen, or more 'open formats' and have more difficulties than necessary for everyone involved.

As described it is important you have the finished documents in a format that can be read without difficulty. PDF meets that need, as well as allowing re-printing of an archived document long after you've replaced the original program that created the document with something newer and different.

My employer has provided solutions for Intranet websites to upload the documents (as PDF) and allow the users to view them on kiosks. Having a universal viewer like PDF is much better than using multiple add-on viewers for different document types. (Excel/Word, etc).

If I were presenting drawings, as opposed to Documents I'd probably add SVG to the mix. But for presentation of a document as was originally intended PDF can't be beat.

Permanence (1)

jbolden (176878) | more than 9 years ago | (#11242871)

Permanence of public data.

I guess the question is: how permanent is permanent? It's very hard to store data electronically long term and have it be accessible years later. How many computer techs today could even deal with a 9 track data tape (a state of the art archival format 20 years ago)? While PCs can handle Bus and Tag data streams, the adapter card is $3k per. No one 30 years ago would have conceived of having individual users not connected in any meaningful way to an operations center.

I've done a lot of work taking data in "will be good forever" formats like code 1 and moving them to formats that are actually usable by non-mainframes. I see no reason to believe that .pdf deployed on modern tape archives will be meaningfully usable in 30 years. If by "permanent" you mean 10 years or less then no problem. If you mean 100 then in addition to all the other suggestions below, I'm going to say a Microfiche printer should be part of the solution. 100 years from now people may not have a clue what Microsoft Word was and thus no idea what to do with a ".doc file" on a DVD or whatever but they will know how to use a magnifying glass and a light source just fine.

With one of these printers your users either export .jpg, .pdf, .doc... or they "print" to this printer which captures 400 ppm very cheaply (server + printer + setup for a little over $10k). It may sound really really old fashioned but I think it is worth considering. Think about how you would get digital data from the systems you were using in 1975....

Re:You're asking the wrong questions (1)

fm6 (162816) | more than 9 years ago | (#11230756)

When you say "the word spacing may change a little", you're really underestimating the problem. If you ever do anything more than really simple memos with no nested lists, no complex tables, and no charts, you find yourself in a real mess trying to import documents from another vendor. It's something you can deal with if you just want to read other people's documents -- but normal business workflow requires that people pass documents back and forth, making changes and annotations. You simply can't do that without standardizing on a format. And I don't mean RTF, which is effectively a Microsoft proprietary format, despite Redmond's past attempts to get it adopted as "neutral".

LaTeX (2, Insightful)

KivlE (547859) | more than 9 years ago | (#11229580)

Hmm, I'd say LaTeX would be a good alternative? There are interpreters for most platforms, the source files are plain text, and it can output a variety of readable formats (pdf,ps,html etc).

Re:LaTeX (1)

Planesdragon (210349) | more than 9 years ago | (#11230960)

Show us how to move a MS Word file to LaTeX with no loss of information (yes, formatting counts as "information") or human editing.

If you can't do that, it's not worth his time.

Re:LaTeX (1)

homeobocks (744469) | more than 9 years ago | (#11231430)

It's possible, but most people who use MS Word don't form real headers/sections/numbering, they just increase the font size and centre things and do things manually. Because of that, it would be hard to turn style information into logical information.

Re:LaTeX (1)

Planesdragon (210349) | more than 9 years ago | (#11232148)

You can turn a set font size to a header easily enough. Heck, you can do it with VB script.

Got a link for "possible"?

Infer what? (1)

tepples (727027) | more than 9 years ago | (#11232228)

True, but given an RTF using visual formatting, how can a program know in advance which font size was meant to be "heading level 1", which was meant to be "heading level 2", whether italics represent emphasis or the title of a work, etc?

Re:Infer what? (1)

Planesdragon (210349) | more than 9 years ago | (#11234320)

Two ways.

Number one: the office tells them. I.e., "use everything that's size 14 as Heading 1, use italics as italics, etc."

Number two: write a program to figure it out. This could be done in Office VB to apply and redefine headings for any given document.
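
Both ways can be sketched in a few lines of Python. This is not Office VB and not tied to any real document API: paragraphs are represented as plain (text, font size) pairs, and the function names, the sample document, and the "most text wins as body size" heuristic are all assumptions made for illustration.

```python
from collections import Counter


def apply_office_rule(paragraphs, size_to_style):
    """Way one: the office supplies the size-to-style mapping."""
    return [(size_to_style.get(size, "Body"), text) for text, size in paragraphs]


def infer_rule(paragraphs):
    """Way two: guess the mapping from the document itself.

    Assumption: the font size carrying the most text is body text, and
    every larger size is a heading, with the largest being Heading 1.
    """
    weight = Counter()
    for text, size in paragraphs:
        weight[size] += len(text)
    body_size = weight.most_common(1)[0][0]
    larger = sorted((s for s in weight if s > body_size), reverse=True)
    return {size: f"Heading {level}" for level, size in enumerate(larger, start=1)}


# Hypothetical visually formatted document: (text, font size in points).
SAMPLE = [
    ("Annual Report", 18),
    ("Budget", 14),
    ("Spending rose slightly compared with last year.", 10),
    ("Staffing", 14),
    ("Headcount was flat across all departments.", 10),
]

if __name__ == "__main__":
    rule = infer_rule(SAMPLE)
    for style, text in apply_office_rule(SAMPLE, rule):
        print(f"{style:10} {text}")
```

Neither approach answers the italics question raised above: a program can see that text is italic, but not whether the author meant emphasis or a title, so that part still needs a stated office convention or a human.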

What is your point? (1)

jotaeleemeese (303437) | more than 9 years ago | (#11234882)

The article poster is explicitly stating they want to move to vendor neutral applications.

In such a situation why would they need to do such conversions?

Re:What is your point? (2, Interesting)

Planesdragon (210349) | more than 9 years ago | (#11234951)

That's not what he said. He said vendor neutral file formats.

This may result in dropping MS Office entirely -- or it may just result in changing the default "save as" settings for every install of Word, or the creation of an "archive and share" custom function that takes DOCs or WPSs or whatever and turns them into the new neutral format.

Bad Assignment (2, Insightful)

salesgeek (263995) | more than 9 years ago | (#11229604)

I'd recommend you find a way to get out of the assignment. You will not find what you seek as it is one of the holy grails of computing that should exist but does not and does not for good reason (money).

I always try and use portable files (3, Informative)

The_Dougster (308194) | more than 9 years ago | (#11232778)

Well, for CAD, it's a screwed up world. The best/most portable format is probably IGES, except it's such a huge specification that nobody's IGES file is compatible with anybody else's. I'm an engineer and for myself I use Turbocad 10 professional at home. It reads/writes AutoCAD files and numerous other formats, and is somewhere in between AutoCAD and Pro/Engineer in terms of its capabilities. You'll have a tough time convincing any corporation to use TurboCAD though.

For text documents, HTML would be good, except MS products tend to produce the most screwed up HTML files I've ever seen. All I can recommend is to use PDF files for important and official documents because they are essentially immutable and tend to produce consistent hardcopies from any computer.

OpenOffice formats are nice, and if I were starting up a new business I would of course set up Linux workstations to use OO exclusively, and put a Windows machine down in the IT room so the IT staff could convert any troublesome documents that come through the email.

For Visio, there is no equivalent, other than exporting the Visio file as a DXF or maybe a WMF. Windows MetaFiles never seem to load right in other apps though, so that's something to think about. SVG files will probably be the future here if Dia starts using them.

The Shot Heard Round the World (2, Informative)

garyedwards (317746) | more than 8 years ago | (#11240538)


There are no "StarOfficeisms" in the OASIS XML Open Document file format specification. Least ways not any we know of. By December of 2004, when the OASIS TC submitted the XML file format specification to ISO, all known references and anachronisms that might be called starisms were changed. Neutralizing changes were even made to such things as the file format extensions and mime type registrations. We even changed the name from OASIS Open Office to OASIS/ISO Open Document.

Separating the file format from any particular application or applications suite is a big deal. Especially if there is a rising demand from enterprise level end users for an applications independent universal structured file format solution.

So the OASIS/ISO TC chose to keep that most powerful of technology terms, the word "Open", but lose the direct reference and/or suggestion to OpenOffice.org.

The second reason for changing the name to "OASIS Open Document" is far more interesting, and directly relates to the European Union "TAC/IDA" task force recommendations based on the infamous Valoris Report. You will recall that by September of 2004, the EU had evaluated responses from both Sun and Microsoft regarding the Valoris recommendation that all EU information system purchases be required to support an open standards based XML file format specification.

Microsoft's open XML proposal was determined by the EU to be "not open enough". This criticism was in the original Valoris Report, and not altered by subsequent Microsoft arguments. After much squealing, squawking, finger pointing, complaining and outrageous misrepresentation, in mid November of 2004 Microsoft finally conceded and agreed to meet EU requirements. More about this in a moment, but for now the important thing to note is that the EU held firm. A remarkable feat even though there is currently a range of cross platform alternative solutions that meet EU requirements, including the open and free OpenOffice.org, Sun's StarOffice, IBM's WorkPlace, and Novell's Open Office. And if Microsoft had not sold their share in Corel to a vulture investor outfit for pennies on the dollar, an investor who then proceeded to cut XML out of Corel, WordPerfect Office would also be OASIS/ISO XML compliant.

Meanwhile, the EU was also not entirely satisfied with the OASIS XML specification as explained in Sun's response to the EU requirements recommendation. Three things in particular concerned the EU.

First, that OASIS submit the file format specification to ISO. In September of 2004, OASIS management and the OASIS TC came to agreement with ISO that the file format specification would be submitted to ISO before year's end, but maintenance and improvement would remain with the current OASIS TC. Hence the combo moniker "OASIS/ISO".

Second, there was a great deal of concern about "custom-defined schemas". Sometimes this issue is also referred to as "user-defined schemas". Others just call it a "forms" or "template" issue. Basically it refers to an application's ability to load (or consume) an externally defined schema template that might include specific user interfaces (forms), business - workgroup logic (routing), meta data interfaces, and other things related to the emerging world of collaborative computing.

Microsoft of course champions the auxiliary Office productivity application, InfoPath. However, in September of 2004, the OASIS TC finished work on extending the specification to include XForms, SVG, and SMiL. Current OOo -v.2 builds fully demonstrate the powerful capabilities of these extensions, including the binding of web services and data to graphical objects and forms/template widgets. Move over InfoPath. Hello OASIS UBL!

The third issue involves EU concerns for a more "universal" file format specification. Meaning, an XML file format specification that could be used in circumstances far beyond that of a typical Office Productivity Suite. Some of the Valoris people have referred to this as a "productivity environment" issue. One that would truly redefine the meaning of a "compound document". Imagine many applications, working with a variety of file formats, contributing services to a single document or collaborative workflow of related documents. To achieve this, the productivity environment really needs a universal XML file format that can scale from the output of single purpose applications like AbiWord, to the complexities of integrated workflows that could include project management, eMail, voice, contact management, database bindings, web services, presentations, intelligent forms and template logic, and of course all the customary aspects of an Office Productivity Suite.

So while there are many things the OASIS TC has planned to meet the emerging needs of a collaborative computing ecosystem, one of the recent considerations was to ditch the limiting term "Office", and run with the EU's concept of an XML-Open Standards based universal Open Document model. Hence the name change to OASIS/ISO Open Document.

Let's get back to the events of mid November 2004. Claiming victory, Microsoft finally concedes, agreeing to meet the EU requirements. Cleverly, the EU has not endorsed the OASIS/ISO Open Document specifically. They only stated the information systems requirement to support Open Standards based open XML file formats. The OASIS/ISO effort just happens to be the only specification that qualifies.

This is only speculation on my part, but I think there is clear evidence that the September 2004 requirements decision immediately impacted the EU purchase cycle. Right down to the departmental level.

Microsoft's not the only one to make concessions. In mid November, IBM loudly announced their support for the OASIS/ISO Open Document. Not hard to do since IBM's highly portable WorkPlace desktop environment is based on components from OpenOffice.org and Mozilla. The thing is though that IBM's WorkPlace is pulling an enormous chain of over 150 companies who now also can claim to support the OASIS/ISO Open Document. Most prominent, IMHO, is the significant presence in that chain of Adobe!

One of the things the OASIS/ISO TC is hopeful about is that in the wake of the November announcements, Adobe might join in the effort to extend the current metadata model along the lines of their XMP metadata templates.

Other interesting extensions to the OASIS/ISO specification work that were completed in September of 2004 include bibliography, DocBook, and accommodations for the OASIS UBL forms and templates. Although few people are aware of this, the bibliography and DocBook projects early on set their sights on the Library of Congress MODS XML Schema. A prominent "custom-defined schema" if ever there was one. So where the EU chose to focus on current and next generation collaborative computing systems, other aspects of the OASIS/ISO work target the legacy stores of mankind's knowledge.

It's been said that over 90% of mankind's knowledge is in an unstructured format. Over 85% of mankind's information growth is in the unstructured, application bound file formats of platform specific office productivity suites. Some people look at these figures and say that's interesting, but so what? What's the big deal about structured formats anyway?

Well, structured formats are machine ready. Our computers are the only machines ever invented that can interact and extend mankind's informational powers. These machines can converse and work volumes of structured digital information with only minimal human input and guidance. Provide them the logic, and our machines of the mind can do wonders with structured content, data, and streaming media. Feed them unstructured information though and they can barely crawl without persistent and intense human intervention.
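
A small illustration of what "machine ready" means in practice: with a structured format, a few lines of code can pull the outline out of a document without guessing from font sizes. The Python sketch below parses a hand-written fragment in the style of an Open Document content.xml; the fragment and the function name are made up for the example, and it assumes headings are marked as text:h elements with a text:outline-level attribute in the OpenDocument text namespace.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Hand-written sample in the style of content.xml, not a complete document.
SAMPLE = f"""<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body><office:text>
    <text:h text:outline-level="1">Annual Report</text:h>
    <text:p>Spending rose slightly.</text:p>
    <text:h text:outline-level="2">Staffing</text:h>
    <text:p>Headcount was flat.</text:p>
  </office:text></office:body>
</office:document-content>"""


def outline(content_xml):
    """Return (level, heading text) pairs in document order."""
    root = ET.fromstring(content_xml)
    result = []
    for heading in root.iter(f"{{{TEXT_NS}}}h"):
        level = int(heading.get(f"{{{TEXT_NS}}}outline-level", "1"))
        result.append((level, "".join(heading.itertext())))
    return result


if __name__ == "__main__":
    for level, title in outline(SAMPLE):
        print("  " * (level - 1) + title)
```

In a real file the content.xml sits inside a ZIP container, so it would be extracted first; the parsing step itself stays the same.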

So the question becomes, do we truly make use of our machines or not? Do we turn them loose and let them traverse the global info grid to master galactic flows of information? Or do we continue to carry them around on our backs?

That is the structured vs. unstructured question. And guess what? There is only one open standards based, open XML technologies based, structured file format that can even begin to handle the needs of inter application, integrated productivity environments.

By separating the file format from the application, the OASIS/ISO Open Document returns to the end user ownership of both their information, and their information processes. Two hundred years from now, when undreamed of computational power enables applications and devices beyond our imagination, our information, our knowledge will still be accessible. No application or platform vendor permission required.

And that's a good thing.

~ge~

Gary Edwards
OpenOffice.org volunteer serving on the OASIS/ISO Open Document Technical Committee