Big Blue on OOXML

Some will probably say “it’s about time too”…

IBM has made public an article written by Peter Seebach called “OOXML: What’s the big deal?”.

In it Peter explains in clear and unambiguous language why Microsoft’s OOXML document format (also known as DIS 29500 or ECMA-376) is not fit to be an international standard.

Stating what has already been said many times before might be construed as boring or repetitive, but in this case Peter gives a refreshingly concise review and summary of the main issues, many of which have been lost in the verbosity and plethora of opinion and conjecture that abounds on the web regarding OOXML.

Here are a couple of salient comments from the piece:

There have been a number of technical complaints made about OOXML. Every one of them comes down to the same base complaint: Rather than specifying a reasonable common interchange format, OOXML specifies the whole feature set of Microsoft Office, down to bug compatibility. This creates a burden on other implementers which is simply unreasonable (and in fact impossible) to meet, while conveniently being precisely what Microsoft is already shipping. That raises a lot of concerns.

He goes on to examine three categories of “showstopper problems” and gives examples in each. The final category, “Unique Features”, is quite damning in its final analysis…

Probably the most famous example is one of the optional settings provided in OOXML. The setting is called “useWord97LineBreakRules”, and it specifies to use the line-break rules that were used in Word ’97 for East Asian documents. Much like the previous examples, this is of course impossible for anyone else to do, as no specification of these rules is provided. In fact, the OOXML standard even warns implementers not to implement this:

The OOXML standard’s guidance for useWord97LineBreakRules

[Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]

This guidance is excellent. Given that there is no specification available of this feature, and it is deprecated, it makes all kinds of sense for people not to implement it. But wait; if it shouldn’t be implemented, why is it in the spec? Compatibility with existing documents is not a reason to add a feature to a standard aimed at interchanging data; users are worried about whether their text can be opened at all in another program, not whether every line break is in the exact same location!

This feature is in the spec because OOXML is not a document interchange format; it’s a careful, bit-for-bit, replication of Microsoft’s historical binary formats, wrapped up in angle brackets.

That’s a cracking analysis. OOXML is NOT a document interchange format. It’s MS Office binary wrapped in XML.
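
To make the setting a little more concrete, here is a rough Python sketch of how a consuming application might spot it. It assumes the flag appears as a `w:useWord97LineBreakRules` element inside `w:compat` in the document's settings part, and that a `w:val` of "0", "false" or "off" switches it off; the sample XML is hand-written for illustration and the function name is made up for this post.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by ECMA-376 documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, hand-written settings fragment (not taken from a real file).
SAMPLE_SETTINGS = f"""<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:useWord97LineBreakRules/>
  </w:compat>
</w:settings>"""


def uses_word97_line_breaks(settings_xml: str) -> bool:
    """Return True if the settings XML asks for Word 97 line-break rules."""
    root = ET.fromstring(settings_xml)
    # Look for <w:compat><w:useWord97LineBreakRules/> under the root element.
    flag = root.find(f"{{{W_NS}}}compat/{{{W_NS}}}useWord97LineBreakRules")
    if flag is None:
        return False
    # Assumption: a missing w:val means "on"; explicit off values disable it.
    val = flag.get(f"{{{W_NS}}}val")
    return val is None or val.lower() not in ("0", "false", "off")


if __name__ == "__main__":
    print(uses_word97_line_breaks(SAMPLE_SETTINGS))
```

Detecting the flag is the easy part. What the sketch cannot do, and what the guidance quoted above admits the standard does not describe, is reproduce the line-breaking behaviour the flag asks for.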

Peter’s conclusion says it all.

OOXML is a credible effort to solve a real problem: The problem of how to replace completely opaque binary files encoding ten years of accreted behaviour with partially-legible XML files encoding the same behaviour, down to the last bit. That problem, unfortunately, is not the problem of providing a usable, implementable, exchange format for office documents.

OOXML should not, and must not become an ISO standard. It is, as we have been saying all along, a proprietary vendor’s implementation of their proprietary document format. There will be only one beneficiary if this becomes a global standard, and it isn’t you or me…

3 Comments

  • James says:

    A couple of points.

    First, the OOXML standard embodies the functionality of Office. Since it is huge, it should be accompanied by examples. There is nothing in this article that says that to interchange with Office using this spec requires full support of all office features. For an Office to some other document processor, there will always be function and feature differences, and so a full featured Office document will lose “fidelity” in an interchange of this sort.

    The theme of protest that is out there has a behind the scenes agenda: mainly to continue to heckle MS over interoperability in order to keep it on the EU Anti-Trust front burner. This is increasingly sucking economists, politicians and attorneys into the software innovation process and the next guy up will be Google (just a matter of time). Do we really want a regulatory body overseeing software standards and designs? I say, let Microsoft have its standard (at least there will be a documented approach to interoperability that will be sustained), and then leave the rest of the world to how much and how well they handle the interoperability for their specific products.

    As most of us know, the biggest issues standing in the way of moving to another personal productivity product is the human need for comfort in the tools they use to do their job, and to move them to something else could be done with the right support, but nobody (other than a deep believer in Open Source) would spend the time or resources on it, given all the higher priorities that need a technology solution (i.e. the money is better spent on more pressing needs, and Office does just fine for now). But this horse has been out of the barn for years, and personally, there would have to be a MAJOR uptick in functionality that I would personally be interested in to make me consider anything other than Office.

    So, while everyone dreams about document processing on “the cloud”, or some of the free office suites out there, these are just misguided efforts in the wrong place, and what is left to innovate in this area? So the quest continues to try to convince the world that a dumbed down office suite of tools is what everyone wants. This is much the same as standing outside an auto dealer trying to convince new car owners that they really want a car without the radio or power windows that come with it.

    So, if the Web 2.0 stuff is for real (and I am VERY skeptical it is), focus on solutions in that market that people may want or could use to their benefit.

    Then, someday, if you are as lucky as Microsoft was, the EU will be calling you wondering if your mash up needs an interoperability “Standard”. And then imagine what kind of spec you would produce in that situation to cover all the bases.

  • Alan Lord says:

    Thanks for commenting James.

    I appreciate you taking the time to comment, although I feel that your points are somewhat misguided.

    Responses to individual points in-line below.

    “First, the OOXML standard embodies the functionality of Office. Since it is huge, it should be accompanied by examples. There is nothing in this article that says that to interchange with Office using this spec requires full support of all office features. For an Office to some other document processor, there will always be function and feature differences, and so a full featured Office document will lose “fidelity” in an interchange of this sort.”

    OOXML is the embodiment of MS Office. And that in itself is no bad thing. However, it is absolutely NOT what an ISO standard is for.

    Taken directly from ISO’s website (http://www.iso.org/iso/about/discover-iso_what-standards-do.htm):

    What [ISO] standards do

    * make the development, manufacturing and supply of products and services more efficient, safer and cleaner

    This is not the case with OOXML. We already have a document interchange format. OOXML attempts to replicate bugs and foibles of applications which go back decades. The purpose of a standard (for document formats) has nothing to do with the applications that read/write or interpret the schema.

    * facilitate trade between countries and make it fairer

    OOXML will certainly NOT make anything fairer. It is MS Office. It is basically impossible to fully replicate by anyone other than Microsoft.

    * share technological advances and good management practice

    OOXML does not advance anything. It endeavours to restrict developers to one application base for document interchange. The many thousands of comments and errors clearly do not bode well in terms of good management practice.

    * disseminate innovation

    There is no innovation in OOXML. It is MS Office regurgitated and wrapped in angle brackets.

    * safeguard consumers, and users in general, of products and services

    In no way will OOXML safeguard anyone from prolonging the world’s largest monopoly.

    * make life simpler by providing solutions to common problems

    OOXML does not solve any common problem. The only problem it solves is Microsoft’s lack of support for an approved ISO standard document format.

    “The theme of protest that is out there has a behind the scenes agenda: mainly to continue to heckle MS over interoperability in order to keep it on the EU Anti-Trust front burner. This is increasingly sucking economists, politicians and attorneys into the software innovation process and the next guy up will be Google (just a matter of time). Do we really want a regulatory body overseeing software standards and designs? I say, let Microsoft have its standard (at least there will be a documented approach to interoperability that will be sustained), and then leave the rest of the world to how much and how well they handle the interoperability for their specific products.”

    I for one most certainly do want an independent body overseeing what are global standards, yes. Microsoft run the world’s biggest monopoly, have been convicted many times over and have further cases pending. Their OOXML specification does not document interoperability; it documents MS Office’s binary formats to a certain extent, but misses important bits out and relies on application-specific processes to be fully implemented. It is also unreasonably hindered by patents for anyone else who develops applications which use it.

    “As most of us know, the biggest issues standing in the way of moving to another personal productivity product is the human need for comfort in the tools they use to do their job, and to move them to something else could be done with the right support, but nobody (other than a deep believer in Open Source) would spend the time or resources on it, given all the higher priorities that need a technology solution (i.e. the money is better spent on more pressing needs, and Office does just fine for now). But this horse has been out of the barn for years, and personally, there would have to be a MAJOR uptick in functionality that I would personally be interested in to make me consider anything other than Office.”

    This is so much hogwash it barely deserves a response. Millions of people are moving to OpenOffice.org. And hundreds of millions have probably done so already. Maybe you are wealthy or you get Microsoft’s software for free, but for most of us, spending £400 or so for some applications that enable you to write letters or create spreadsheets is unaffordable and just plain stupid to be honest. OpenOffice.org is free and does pretty much everything MS Office does.

    “So, while everyone dreams about document processing on “the cloud”, or some of the free office suites out there, these are just misguided efforts in the wrong place, and what is left to innovate in this area? So the quest continues to try to convince the world that a dumbed down office suite of tools is what everyone wants. This is much the same as standing outside an auto dealer trying to convince new car owners that they really want a car without the radio or power windows that come with it.”

    I’m not quite sure I get you here, perhaps you have been smoking something? Innovation happens constantly and you or I have absolutely NO IDEA what will happen in the next 5 years. OOXML is an attempt to perpetuate a monopoly, and to stifle, not stimulate, innovation. ODF (the existing ISO standard for document interchange) provides a platform for innovation by being freely available, application agnostic and unencumbered by patents or other proprietary restrictions. No one here is trying to convince the world about an office suite (that’s what Microsoft are trying to do). ISO and ODF are about file formats, not the applications that use the data therein.

    “Then, someday, if you are as lucky as Microsoft was, the EU will be calling you wondering if your mash up needs an interoperability “Standard”. And then imagine what kind of spec you would produce in that situation to cover all the bases.”

    You have lost the plot there really… There are many standards for “mashups”. That’s why they are so prevalent and are developing at a pace that leaves Microsoft in their wake. They have been developed in the public domain and are unencumbered. Just how TCP/IP was created. Just how XML was created. Just how… Oh never mind. Go and waste your money on some more Microsoft Software. I’m sure they will appreciate your contribution.

  • Louis Steinberg says:

    James:

    “As most of us know, the biggest issues standing in the way of moving to another personal productivity product is the human need for comfort in the tools they use to do their job”

    I take it, then, that your Vista machines run OpenOffice.org, not MS Office 2007? After all, the UI in the older MS Office versions is much more like the UI in OO.o than like the UI in Office 2007.

    In fact, the file format *is* a major issue in vendor lock-in, and Microsoft makes intentional use of file format lock-in. See http://www.groklaw.net/article.php?story=20071023002351958.

    To use your analogy, the anti-MSOOXML forces are trying to convince car buyers they don’t want a car that can only park in, say, Ford-manufactured driveways and can only run on Ford-brand gasoline.
