
SoylentNews is people

posted by mrbluze on Thursday April 03 2014, @07:00AM
from the hieroglyphics-was-too-easy dept.

A well-known problem in computing is the existence of data in outdated or inaccessible formats. A common reason for this inability to access data is the use of proprietary file-formats that result in vendor lock-in. At the Libregraphics conference in Germany, project leader Fridrich Strba announced the Document Liberation Project sponsored by The Document Foundation, which aims to attract open source developers to help provide tools for the conversion of files to the ODF ISO standard document format.

The project goals are:

  • to try to understand the structure and details of proprietary, undocumented file-formats;
  • to use the understanding of the file-formats to implement libraries that are able to parse such documents and extract as much information as possible from them;
  • to use our existing framework to encode this data in a truly free and open standard file-format: the Open Document Format.

The project is associated with LibreOffice and is already helping compatibility with old formats in a number of FOSS projects.

This discussion has been archived. No new comments can be posted.
  • (Score: 2, Interesting) by FatPhil on Thursday April 03 2014, @07:03AM

    by FatPhil (863) <reversethis-{if.fdsa} {ta} {tnelyos-cp}> on Thursday April 03 2014, @07:03AM (#25366) Homepage
    ...we point our fingers at them and say "see! that's what happens if you use closed proprietary formats/software".

    The worse we make them look, the more likely people are to realise how bad they are.

    People will stop using closed proprietary formats when the US stops using Fahrenheit, and for the same reasons.
    --
    Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
    • (Score: 3, Insightful) by gishzida on Thursday April 03 2014, @07:39AM

      by gishzida (2870) on Thursday April 03 2014, @07:39AM (#25379) Journal

      But like many bits of finger pointing it does not solve the problem of having documents you can no longer open. And that does pose a challenge for free and open source citizens such as yourself... I use open source where and when I can... but there was a time when there were no such applications available... These old formats were long dead before open document format was a gleam in its designer's eyes.

      I have a few of those and have inherited a few [from my late father who was an avid user of WordStar for DOS]. I on the other hand have some documents in Q&A Write and JustWrite format... and old Visio docs that the current versions no longer support... and all of them created before "free and open source" applications were available.

      So am I to be blamed for not using what was not available???

      • (Score: 2) by aristarchus on Thursday April 03 2014, @08:20AM

        by aristarchus (2645) on Thursday April 03 2014, @08:20AM (#25402) Journal

        So am I to be blamed for not using what was not available???

        To put it simply, which evidently is required: yes. I was just a good German and knew nothing about the standardization of wordprocessing file formats! My family never owned Microsoft stock! Those who fail to resist monopoly by the enforcement of public standards are complicit in the domination of the monopolies. And college drop-out Bill Gates sticking his nose into educational reform. Do you think that none of this is your fault? We told you! Repeatedly! (I lose hope, . . .)

        • (Score: 1) by gishzida on Thursday April 03 2014, @09:06AM

          by gishzida (2870) on Thursday April 03 2014, @09:06AM (#25415) Journal

          We're talking past one another here... I don't own any Microsoft stock either... and apparently you are unaware of PC history... You couldn't tell me crap about "open source" when I was using VolksWriter for DOS on a TrueBlue IBM PC in 1983. The FSF did not come into being until 1985... and had nothing to show for itself on PC-class machines really until Linux 1 came along in 1991 [Stallman worked on UNIX workstations and got pissed off when Symbolics would not share their code any more]. I did not use a MS office product until 1996... and then only because MS had strangled the market...there was no open source word processor for IBM machines running DOS... In fact there were no open source word processors until much later.

          What I do have is files that were created before any open source apps were available [most of them created prior to 1995]... So tell me once again how I was supposed to use open source apps which did not exist THEN? I avoided MS products... they were overpriced. So the files I have are NOT even Microsoft products! JustWrite and Q&A were by Symantec. As for Visio... MS bought Visio just after rev 2 was released then soon scuttled the earlier format.... again without any open source equivalent being available...

          So what you are saying is that rather than open the doors of open source to help people with these old formats you'd rather wave your finger and say shame on you for not having open source before the FSF was even created???? Relatively speaking open source word processing is very new to the market place... and would not even be there if someone at the late great Sun Microsystems had not bought a German proprietary source product [Star Office] and then released it as open source...
           

          • (Score: 2) by aristarchus on Thursday April 03 2014, @09:08AM

            by aristarchus (2645) on Thursday April 03 2014, @09:08AM (#25416) Journal

            One word: ASCII. (wait, that is not a word! But it is a STANDARD!)

            • (Score: 1) by gishzida on Thursday April 03 2014, @09:43AM

              by gishzida (2870) on Thursday April 03 2014, @09:43AM (#25427) Journal

              In 1982 Wang word processing terminals were a standard too... until it was shown a $3000 PC could replace a $40,000 specialized word processing system.

              IBM included in DOS 1.1 a text editor called e.... but e was a text editor and not a word processor.

              The early "king of the hill" for DOS was WordStar. Later WordPerfect took the crown... there were a lot of other smaller competitors like JustWrite, Q&A, ClarisWorks, BrownBag, and VolksWriter. Then MS started playing dirty games of FUD... which no one caught wind of until the Anti-trust case... Frankly Microsoft should have been broken up then... too bad that the Bush Administration did not agree and let them off the hook.

              Alas... what might have been!

              • (Score: 2) by Hairyfeet on Thursday April 03 2014, @01:55PM

                by Hairyfeet (75) <bassbeast1968NO@SPAMgmail.com> on Thursday April 03 2014, @01:55PM (#25543) Journal

                Sorry, gotta throw a flag, bullshit on the field. Everybody blames "teh evil M$!" when in reality the ENTIRE HISTORY of MSFT can be encapsulated in the sentence "and then the other guy did something REALLY dumb". All that "EEE" memo bullshit? Was frankly more about a company trying to cover up the fact their entire history was being blessed with pants on head retarded competitors than it was any real brilliant plan, in fact the closest you can get to saying MSFT was brilliant was Bill gates managing to bullshit the press for nearly half a decade with the vaporware that was Cairo, but again I would argue that even being able to pull that off came down to the zeitgeist of the early 90s being all things sci-fi and VR and Gates just said "ummm, yeah we got that!" and the press bought it.

                Now how does this apply to Wordperfect and the rise of Word? Simple WP had had several megahits in the DOS era, had the market all but locked up (kinda like MSFT in the early 00s with PCs) and were riding high with rising stocks and glowing press, so what happened? Say it with me boys and girls....."and then they did something REALLY dumb"...the something really dumb was saying "Bah our customers are big business and they are on DOS and WFW 3.11, this Win95 thing is a buggy playtoy for the home users, no need to put any real effort behind it" so they put out a truly shittastic version of WP that was just the WFW 3.11 version with a DOS4GW wrapper...now anybody who has actually used a DOS4GW knows they are anything but stable, but to use the WFW 3.11 version instead of the pure DOS version? they couldn't have made it a worse product if they had someone on the line taking a dump into each and every box. Imagine the "fun" of being halfway through writing a very important business letter and go to make a line indent and have the program just crash and take your letter with it, fun huh?

                So like Netscape (putting out the buggy as fuck NS4), IBM with OS/2 (gave the finger to the OEMs with MCA bus and tried to charge $200 a copy to OEMs) BeOS (tied first to failed AT&T hobbit CPU, then tried to tie to PPC when Apple had dibs on over 80% of the chips in production) and Apple of the late 80s-mid 90s (fired CEO with taste for clueless corporate suits who filled the channel with crap while letting the OS fall waaay behind) Wordperfect GAVE the word processing business to MSFT by doing something forehead slappingly DUMB. It doesn't take an "evil genius" to win the battle when your competitor meets you on the battlefield, smiles, then promptly blows their own brains out, it only takes enough common sense to say "wow that was dumb, I shouldn't do that" and walking across the finish line without shooting yourself in the foot.

                I would argue this is why MSFT has had exactly zero luck making inroads into mobile computing, because Google and modern Apple show no signs of being willing to kill themselves with boneheaded ideas. The only real "skill" MSFT has ever had is seeing when a competitor really boned it and capitalizing on it, without that? they end up chasing last year's trends. But you can't blame WP on anybody but WP, they were the ones that put out a half baked alpha quality build as a finished product and you do NOT do that kind of shit when you are talking about software critical to day to day business ops, that's REALLY dumb.

                --
                ACs are never seen so don't bother. Always ready to show SJWs for the racists they are.
              • (Score: 2) by aristarchus on Thursday April 03 2014, @06:52PM

                by aristarchus (2645) on Thursday April 03 2014, @06:52PM (#25755) Journal

                Still talking past each other. Market dominance or widespread adoption does not a standard make. The point of standards is that they are owned by no one, and established by allegedly neutral professionals, like the SAE. Almost all "word-processors" from the beginning have attempted to lock in users with proprietary (read: non-standard in the sense above) file formats. And now the chickens are coming home to not be able to be read.

            • (Score: 2) by randmcnatt on Thursday April 03 2014, @10:05AM

              by randmcnatt (671) on Thursday April 03 2014, @10:05AM (#25432)

              I spent six months puzzling over an undocumented proprietary database. Turned out that, in some cases, the original programmers had used 6-bit ASCII to encode data, and stuffed unrelated info in the upper bits. They also used the same trick with 7-bit ASCII fields. So we had to deal with a proven standard used in a completely nonstandard way.
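For anyone who hasn't run into this kind of packing, here is a minimal sketch of how such fields could be pulled apart. It assumes, purely for illustration, that the low bits of each byte hold the character and the remaining upper bits hold the unrelated data, and that the 6-bit codes map onto ASCII 0x20 through 0x5F by adding an offset of 0x20. The actual layout of the database described above is not known, and `decode_field` is a made-up name.

```python
from typing import List, Tuple


def decode_field(raw: bytes, char_bits: int = 6, offset: int = 0x20) -> Tuple[str, List[int]]:
    """Split each byte into a character (low bits) and extra data (upper bits).

    Assumptions (hypothetical, not taken from any real format):
    - the low `char_bits` bits of every byte are a character code
    - adding `offset` to that code gives the ASCII code point
    - whatever sits above those bits is unrelated data
    """
    mask = (1 << char_bits) - 1
    chars = []
    extras = []
    for byte in raw:
        chars.append(chr((byte & mask) + offset))  # character part
        extras.append(byte >> char_bits)           # stuffed upper bits
    return "".join(chars), extras


# 6-bit field: two spare bits per byte
text6, extras6 = decode_field(bytes([0xA8, 0x69]))
# 7-bit field: one spare bit per byte, codes already plain ASCII
text7, extras7 = decode_field(bytes([0xC1, 0x42]), char_bits=7, offset=0)
```

The point is that a decoder written against the standard alone would see garbage in both cases; only once the upper bits are masked off does the text come back.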

              --
              The Wright brothers were not the first to fly: they were the first to land.
          • (Score: 3, Interesting) by zafiro17 on Thursday April 03 2014, @11:13AM

            by zafiro17 (234) on Thursday April 03 2014, @11:13AM (#25450) Homepage

            You don't need ancient, pre-FOSS software to have a problem either. The new version of Mac Keynote doesn't have backward file compatibility.

            https://discussions.apple.com/thread/5525581 [apple.com]

            http://presentationmagic.com/2013/10/26/its-not-all-bad/ [presentationmagic.com]

            These two URLs should give you the gist of the problem, but in sum, people are having problems with files created just a year or two ago, which is frankly unacceptable. I know it's turned me off a product I liked until not long ago.

            --
            Dad always thought laughter was the best medicine, which I guess is why several of us died of tuberculosis - Jack Handey
            • (Score: 2) by elf on Thursday April 03 2014, @12:27PM

              by elf (64) on Thursday April 03 2014, @12:27PM (#25483)

              They have already looked at that it seems

              http://www.freedesktop.org/wiki/Software/libetonyek/ [freedesktop.org]

              (This was a link on the project website)

              I like the idea of this project and think it should be applied to lots of different types of files

      • (Score: 2) by Hairyfeet on Thursday April 03 2014, @01:19PM

        by Hairyfeet (75) <bassbeast1968NO@SPAMgmail.com> on Thursday April 03 2014, @01:19PM (#25507) Journal

        But how often does that actually happen IRL anymore? I mean we have virtual machines now folks, no need to keep ancient hardware around anymore to run some dinosaur OS. Hell just for shits and giggles a few years back I searched across the net for installers or .ISOs of all the old OSes I ever used to see if I could get them running in VMs and...it really wasn't difficult at all.

        There were emulators that let me run the old Commodore BASIC, all the old DOS and Windows versions, OS/2, even the old Motorola Apple had an emulator that could run it and again REALLY straightforward. Oh and many of the emulators support CD and floppies so all it takes is a USB floppy and a CD/DVD drive in the host and you can run just about anything...hell I really wouldn't be surprised if somebody has a USB to cassette drive adapter for using the old Commodore and TRS80 cassettes.

        So I really don't understand the fuss, if you saved in some funky ancient format you can either use something that still supports said old funky format (which in the case of old MS Office file formats is beyond easy as MS Office 2K runs on Win 7 great and IIRC goes back to Word 3 as far as format support) or fire up a VM and run old funky software to convert it into something like RTF that everything supports.

        --
        ACs are never seen so don't bother. Always ready to show SJWs for the racists they are.
  • (Score: 1) by KritonK on Thursday April 03 2014, @11:20AM

    by KritonK (465) on Thursday April 03 2014, @11:20AM (#25454)

    The project is associated with LibreOffice and is already helping compatibility with old formats in a number of FOSS projects.

    That's interesting, considering that LibreOffice 4.0 dropped [documentfoundation.org] support for its own legacy formats!

    • (Score: 3, Insightful) by sigterm on Thursday April 03 2014, @12:45PM

      by sigterm (849) on Thursday April 03 2014, @12:45PM (#25492)

      That's interesting, considering that LibreOffice 4.0 dropped support for its own legacy formats!

      Newer versions of LibreOffice dropping support for older, LibreOffice-specific formats isn't a problem, because:

      - the old formats were (also) open, so we have the specs
      - older versions of LibreOffice will remain freely available, including source code

      Neither is true for proprietary formats and the software needed to parse them.

      • (Score: 2) by Hairyfeet on Thursday April 03 2014, @02:13PM

        by Hairyfeet (75) <bassbeast1968NO@SPAMgmail.com> on Thursday April 03 2014, @02:13PM (#25561) Journal

        OMG did you just honestly try to sell their dropping support as a fricking FEATURE, really? And you obviously DID NOT READ what he linked to as it was NOT just LO formats, but also StarOffice binary which AFAIK have NEVER been open source!

        To me this is just a perfect example of why FOSS only works in a few cases, because most of the time its like herding cats and not only does the left hand not know what the right is doing it also frankly don't give a shit. Here we have one side saying "we need to support turning more formats into FOSS formats" while at the very same moment you have the actual devs going "old code is old and we didn't write it so lets just toss it" LOL. And BTW what fucking good is the source code when its no longer in the program? Are you HONESTLY saying i should spend tens of thousands of dollars to hire a developer team to write my own version because some yahoos in LO decided they no longer want to support their OWN LEGACY FORMATS? Say what you will about MSFT but I can take a file created in MS Office 2K and have no problem opening in the latest and greatest, that is over 13 years of support. Staroffice binary was the default for OO.o as late as 06/07 IIRC so you can't even open documents half as old in LO...classic!

        If the LO foundation wanted to look disorganized and mickey mouse amateur hour by asking for more converters at the same time they are actually dropping some of the very same converters they are asking for? Congrats, mission accomplished.

        --
        ACs are never seen so don't bother. Always ready to show SJWs for the racists they are.
        • (Score: 2) by moondrake on Thursday April 03 2014, @03:21PM

          by moondrake (2658) on Thursday April 03 2014, @03:21PM (#25627)

          Two points (I considered posting this as AC :P):

          1) the code for reading/writing these files was free software. So it is irrelevant that the StarOffice binary was not (and actually, a certain cleaned-up source of StarOffice was made open and formed the basis of OpenOffice.org)
          2) Perhaps you did not read the link particularly well since your claim "Staroffice binary was the default for OO.o as late as 06/07 IIRC so you can't even open documents half as old in LO...classic!" is BS. From the linked page:
          "Note: the old OpenOffice.org XML file format (.sxw, .sxi etc.) which was used as the default format by StarOffice versions 6 and 7 is still supported."

          SO 5.2 was released in 2000. Every fileformat after that (from SO, OO.org and LO) is still supported.

          The older SO formats are no longer supported because it was a huge pile of (partially broken) code that few people could make sense of. Actually I am sure support would be accepted back into LO if someone rewrites it as a new import filter in maintainable code. This is exactly what this project hopes to address. Though I think there are more important things to address first.

          As for the tone of your post, particularly your last sentence, I can only hope that someone mods you down.

        • (Score: 1) by urza9814 on Thursday April 03 2014, @05:43PM

          by urza9814 (3954) on Thursday April 03 2014, @05:43PM (#25702) Journal

          Right, and I suppose you'd say the same thing about the various Windows 'compatibility mode' options -- why are some MS developers removing these features only to have other developers add them back in!

          Why should LibreOffice keep stagnant, legacy code for legacy formats that few people use which could create instability in the program or security vulnerabilities, as well as simply adding more garbage to the code base? The few people who need to open these file formats should be able to download an external module or even a complete external software package to do that conversion. There's no reason the rest of us need to be put at risk for their convenience.

    • (Score: 1) by KritonK on Thursday April 03 2014, @01:47PM

      by KritonK (465) on Thursday April 03 2014, @01:47PM (#25534)

      My point was that, on one hand, the LibreOffice folks remove support for reading their own legacy formats and on the other hand they participate in a project that tries to read as many legacy formats as possible. It sounds a bit schizophrenic!

      I guess it isn't, though; if there is a bunch of format conversion programs available, that can convert from arbitrary formats to ODF, then LibreOffice does not need to support a multitude of formats. It can run a format converter on the input file and only have to deal with reading the resulting ODF.
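As a rough sketch of that arrangement (not how LibreOffice or the Document Liberation Project libraries are actually wired up), the application keeps a single ODF reader and looks up an external converter by file extension. Every name here is hypothetical, and the `ODF:` prefix is only a stand-in for a real ODF package.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry: file extension -> function that turns legacy bytes into "ODF" bytes.
CONVERTERS: Dict[str, Callable[[bytes], bytes]] = {}

ODF_MARKER = b"ODF:"  # illustrative stand-in for a real ODF package


def register(extension: str):
    """Decorator that registers a converter for one legacy extension."""
    def decorator(func: Callable[[bytes], bytes]) -> Callable[[bytes], bytes]:
        CONVERTERS[extension.lower()] = func
        return func
    return decorator


@register(".ws")
def legacy_to_odf(data: bytes) -> bytes:
    # Placeholder conversion; a real filter would parse the legacy format here.
    return ODF_MARKER + data


def read_odf(data: bytes) -> str:
    """The only reader the application itself has to maintain."""
    if not data.startswith(ODF_MARKER):
        raise ValueError("not an ODF document")
    return data[len(ODF_MARKER):].decode("ascii")


def open_document(filename: str, data: bytes) -> str:
    """Convert anything that is not already ODF, then read the ODF."""
    extension = Path(filename).suffix.lower()
    if extension != ".odt":
        converter = CONVERTERS.get(extension)
        if converter is None:
            raise ValueError(f"no converter registered for {extension!r}")
        data = converter(data)
    return read_odf(data)
```

With this split, dropping or adding a legacy format means touching only the registry, while the core program keeps exactly one input path.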

    • (Score: 0) by Anonymous Coward on Friday April 04 2014, @07:22AM

      by Anonymous Coward on Friday April 04 2014, @07:22AM (#26078)

      They dropped support for those formats because they required the:
      "notorious binfilter, which was removed in LO 4.0 since it was unmaintainable."

      In other words that support was removed since it was a complete mess (and got in the way of the rest
      of LO development), and writing a new filter properly would make more sense -- the devs are still very
      open to having support for these (rare) legacy formats:
      https://wiki.documentfoundation.org/Development/GSoC/Ideas#Implement_legacy_StarOffice_binary_formats_import_filter [documentfoundation.org]

    • (Score: 0) by Anonymous Coward on Saturday April 05 2014, @12:49PM

      by Anonymous Coward on Saturday April 05 2014, @12:49PM (#26646)

      The StarOffice binary filter has been dropped because it was unmaintainable. It was essentially a copy of a big part of the StarOffice codebase as it existed 15 (or more; I think StarOffice 5 already had XML format) years ago, about half a million lines of code in total (for comparison, libcdr, which reads all versions of CorelDraw, is about 30000 lines of code). It got in the way of cleanups, like removing the obsolete string and container types, it would have been a lot of work to convert it to the new build system, etc. Since nobody had shown any interest in doing that, we (the LibreOffice Engineering Steering Committee) decided to drop it (btw, we were talking about dropping it since the very beginning of LibreOffice, so this could not have come as a surprise. There was plenty of time for people to volunteer and do the work.)

      -- David Tardon (I am posting as anonymous, because I am not going to register just for that one post. If you have further comments, feel free to post to discuss AT documentliberation DOT org.)