Corporate Alzheimer's: Coping With Forgotten File Formats
Debate over open file formats heats up as threat of lost data looms
By John K. Waters
Special to Law.com
April 4, 2006
What if the file formats in which we save text documents, spreadsheets, charts and presentations -- all that stuff generated by so-called productivity software -- were not supported by future versions of the programs used to create them today, or by some as-yet-unimagined successor products? Could drifting file formats cause a kind of corporate Alzheimer's that threatens our ability to recall contracts, insurance policies, financial records, payroll data and other critical documents?
"This is already happening today," says Simon Phipps, chief technology evangelist at Sun Microsystems. "People are finding that documents which are as little as 10 years old are inaccessible to them now. As long as the baseline file format continues to evolve at the rate of a new format every 18 months -- which it has for the past 20 years -- you can guarantee that the time to sunset of a particular format is going to be something like 10 or 11 years."
Does anyone remember WordStar?
"Every day there are fewer and fewer paper records," says attorney Andrew Updegrove. "Without a standard, how can we use an electronic filing system for documents that have to be kept indefinitely? What would happen if you had to produce a lot of documents that don't exist on paper, say, to comply with a discovery request? What would you do if you'd saved a deposition on an eight-track tape? It would be very difficult even to find a machine to play it back on today."
The question of how best to guarantee long-term access to corporate data and documents, free of the restrictions of proprietary software or the limitations of outdated technologies, lies at the heart of a debate currently heating up over two so-called open file formats, both of which are being offered as the cure for this potentially debilitating information-age malady.
One is the OpenDocument Format for Office Applications (ODF), which was developed by the OASIS standards consortium. The ODF was derived from a file format native to the OpenOffice suite, a free and open-source productivity bundle developed from Sun Microsystems' StarOffice suite. The ODF was designed to work with any office application, including Microsoft Office.
The other is Microsoft's Office Open XML, which the software maker has submitted for standardization to Ecma International, a group based in Europe. Microsoft has said that Open XML will be the default file format for the next version of the Office suite (Office 2007).
Both file formats are based on the Extensible Markup Language (XML). Designers use it to add customized tags, enabling the definition, transmission, validation and interpretation of data among applications and organizations. An XML-based file format is, therefore, also easier to share across apps and platforms than traditional binary formats.
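To make the idea of customized tags concrete, here is a minimal sketch in Python. The tag names and the sample record are invented for illustration and are not taken from the ODF or Open XML; the point is only that a generic XML parser, with no knowledge of the program that wrote the file, can still recover the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag names are made up for this example
# and do not come from the ODF or Open XML specifications.
SAMPLE = """<contract id="C-1001">
  <party role="insurer">Acme Insurance</party>
  <party role="insured">Jane Doe</party>
  <effective>2006-04-04</effective>
</contract>"""


def summarize(xml_text):
    """Parse the XML text and return the contract fields as a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        # Map each party's role attribute to the text inside the tag.
        "parties": {p.get("role"): p.text for p in root.findall("party")},
        "effective": root.findtext("effective"),
    }


print(summarize(SAMPLE))
```

Because the tags describe the data in plain text, any application with an XML parser could read this record, which is the property both camps are counting on for long-term access.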
Plenty of organizations have jumped on the ODF bandwagon since the OASIS technical committee was formed in 2002. Sun got onboard at that time; so did Boeing, Sony, the National Archives of Australia, the New York State Office of the Attorney General and the Society of Biblical Literature, among others. And yet, given the overwhelming market dominance of Microsoft Office, it's fair to wonder how relevant the ODF really is. According to analysts at Gartner, Microsoft owns more than 95 percent of the office applications space. Its closest competitor in market share, Corel's WordPerfect Office, currently has no plans to support the ODF.
Supporters argue that, because it was developed collaboratively by multiple vendors and other groups, the ODF is the true open standard. Open XML, they say, was developed by a single, powerful vendor, and the Ecma "standard" will reflect a bias toward the preferences of the format's creator.
But that's just as it should be, says Microsoft's Brian Jones, lead program manager for the Office suite. "It's true that the feature set Open XML enables is a reflection of the Office feature set," he says. "We had to go with a format that was, from the ground up, designed to support all of our legacy systems."
Microsoft has promised that the next version of Office will be fully compatible with older document formats, and it plans to release updates for Office 2000, Office XP, and Office 2003 that will let them read and edit Open XML formats, Jones says.
The company's decision to promote what amounts to its own standard makes sense to attorney Andrew Updegrove. One of the most outspoken and oft-quoted supporters of the ODF, Updegrove is a partner at the Boston law firm of Gesmer Updegrove, and the creator of Consortiuminfo.org, an information site covering standards, standard setting and open-source software.
"Microsoft has acted exactly the way you would expect anyone with such a valuable monopoly to act," Updegrove says. "You don't work that hard to get there and then just give it away. But don't confuse what they're doing with true openness. I can't see how getting more granular opens more doors. It just locks you in."
Jones points out that his company has, in fact, given Open XML away. "That's the revolutionary thing," he says. "The formats are no longer Microsoft formats; they are controlled by an international standards body. That they happen to be based on, and fully compatible with, our legacy binary format helps to get rid of any existing user pain. And going forward, anybody can build solutions on top of them."
Microsoft's announcement that it would be submitting Open XML to Ecma came hot on the heels of a much-publicized decision by the Commonwealth of Massachusetts to phase out use of proprietary formats for storing government documents and switch to the ODF. The Commonwealth made its announcement in September; Microsoft disclosed its Ecma plan in November.
The Massachusetts announcement threw a spotlight on the ODF, and supporters greeted the decision with an almost gleeful optimism. "We have broken [Microsoft's] control point at the document level," IBM's VP of standards Bob Sutor said at the time.
"The big vendors have been waiting for 20 years to find a wedge that they could drive into the Microsoft monopoly," Updegrove explains. "The Massachusetts decision shot this alternative to the top."
But Microsoft's Ecma strategy was very well received by the Commonwealth -- so well, in fact, that its administration and finance secretary declared that " ... we are optimistic that Office Open XML will meet our new standards for acceptable open formats ... " Adding drama to the confusion, Peter Quinn, the state CIO who championed the ODF, resigned, citing pressure from pro-Microsoft state politicians.
Whatever happens in New England, the ODF now has some serious momentum. In March, 35 associations, academic institutions, and industry groups from around the world formed The ODF Alliance to promote the standard to local and national government organizations. And OASIS itself has formed The ODF Adoption Committee to do the same thing for business users and software vendors.
The value of an open standard file format for office applications is obvious; where this format fracas is heading remains to be seen. Let's keep a lookout for further symptoms of corporate Alzheimer's.
PS: On the day I filed this column, Microsoft disclosed that it had just joined a group closely involved in approving OASIS candidates submitted to the ultimate standards arbiter, ISO (the International Organization for Standardization). It's a move the company says was intended to facilitate ratification of its own Open XML doc format, but which ODF proponents fear is designed to delay approval of the rival format. The group Microsoft has joined, the V1 Text Processing: Office and Publishing Systems Interface group, is a very small subcommittee made up of six member companies within the International Committee for Information Technology Standards (INCITS). V1 is the subcommittee charged with reconciling the votes cast for ISO adoption of the ODF. Microsoft has been a member of OASIS for years, and yet elected not to take part in the ODF Technical Committee.
Here are some links for those interested in following this unfolding drama:
Andrew Updegrove's blog
Pamela Jones's blog at Groklaw
Microsoft's OpenXML FAQ page
John K. Waters is a freelance journalist and author based in Silicon Valley. He serves as senior correspondent for Application Development Trends magazine. His books include "The Everything Computer Book," "John Chambers and the Cisco Way" and "Blobitecture: Waveform Architecture and Digital Design."
Law.com's ongoing IN FOCUS article series highlights opinion and analysis from our site's contributors and writers across the ALM network of publications.