SoylentNews

Locking USPTO Patent Filings into Microsoft with DOCX

posted by Fnord666 on Wednesday May 26 2021, @01:13PM

from the vendor-capture dept.

canopic jug writes:

There are still a few months to fix this, but for now the US Patent and Trademark Office's (USPTO) Acting Commissioner for Patents, Andrew Faile, and Chief Information Officer, Jamie Holcombe, have announced that starting January 1st, 2022, the USPTO will institute a surcharge for applicants that are not locked into Microsoft products via the proprietary DOCX format. From that date onwards, the USPTO will move away from PDF and require all filers to use that proprietary format or face an arbitrary surcharge when filing.

First, we delayed the effective date for the non-DOCX surcharge fee to January 1, 2022, to provide more time for applicants to transition to this new process, and for the USPTO to continue our outreach efforts and address customer concerns. We've also made office actions available in DOCX and XML formats and further enhanced DOCX features, including accepting DOCX for drawings in addition to the specification, claims, and abstract for certain applications.

One of several major problems with the plans is that DOCX is a proprietary format. There are several variants of DOCX and each of them is really only supported by a single company's products. Some other products have made progress in reverse engineering it, but are hindered by the lack of documentation. DOCX is a competitor to the fully-documented, open standard OpenDocument Format, also known as ISO/IEC 26300.

DOCX is not to be confused with OOXML, though it often is. While OOXML, also known as ISO/IEC 29500, is technically standardized, it is incompletely documented and only vaguely related to DOCX. The DOCX format itself is neither fully documented nor standard. So the USPTO is also engaged in spreading disinformation by asserting that it is.

Previously:
(2015) Microsoft Threatened the UK Over Open Standards


Re:Format automation (Score: 0) by Anonymous Coward on Friday May 28 2021, @03:06PM (#1139653)

Thanks, but I know all that. I was there.
A fact that is impossible to tell from just a username.
Soon it became more common to display a postscript file to the screen than print it to paper.
Which I suspect had a large impact on Adobe's invention of PDF. PDF is the Postscript font and rendering engine hooked up to a different set of formatting instructions. Since Postscript was already calculating exact pixel positioning for every item drawn on a page, having PDF simply be a format that archived that positioning info meant that PDF was not much of a change from Postscript (the single biggest difference is dropping the general purpose programming language part of Postscript). PDF is largely what you get if you start with Postscript, remove all the general purpose programming language commands, and give the "drawing commands" different names.
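The "program versus archived positions" idea can be sketched in a few lines of Python. This is a toy model only: the command name `text_at`, the tuple layout, and the default coordinates are hypothetical, and none of it is real Postscript or PDF syntax. The point is just that running a page-description program once leaves behind a flat list of positioned drawing commands that needs no loops or variables to display.

```python
# Toy sketch: the command names and tuple layout are hypothetical,
# not real Postscript operators or PDF content-stream syntax.

def page_program(lines, left=72, top=720, leading=14):
    """A 'program' page description: a loop and arithmetic decide positions."""
    y = top
    for text in lines:                      # general-purpose control flow
        yield ("text_at", left, y, text)    # a drawing command with a computed position
        y -= leading                        # next position is calculated while running


def distill(program):
    """Run the program once and keep only the resulting drawing commands."""
    return list(program)


def replay(display_list):
    """Display the flat form: every position is already stored, nothing is computed."""
    return [f"{x},{y}: {text}" for _, x, y, text in display_list]


flat = distill(page_program(["Claim 1", "Claim 2", "Claim 3"]))
for line in replay(flat):
    print(line)
```

The flat list keeps the layout exactly and nothing else, which is roughly the sense in which the distilled form behaves like electronic paper.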
Once you get away from paper, and consider how best to store knowledge digitally, it's rather obvious that PDF is terrible.
100% agreement. PDF is not at all a good format in which to store data of any form. The one and only thing PDF does well is preserve the physical page layout of the printed document, hence my referring to PDF as electronic paper. It really is little more than electronic paper.
The interesting part of any document is the contents, not typesetting data. The two should not be jumbled all together. HTML has a lot of shortcomings, but it is just a plain better approach to the problem. The world isn't standardized on 8.5x11 inch paper.
Also full agreement. PDF is rigid: just as a physical sheet of paper can't change size to accommodate some difference in viewing, neither can a PDF. PDFs simply preserve the exact pixel positioning of everything on the page.
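A rough way to see the difference between a fixed layout and one that can adapt to the viewer, using only Python's standard `textwrap` module. The function names and the widths are made up for illustration, and clipping long lines merely stands in for the scrolling or zooming a reader would actually have to do with a fixed page.

```python
import textwrap

TEXT = ("The interesting part of any document is the contents, "
        "not typesetting data.")


def fixed_layout(text, author_width=40):
    """Line breaks are decided once, by the writer, and stored with the document."""
    return textwrap.wrap(text, width=author_width)


def render_fixed(stored_lines, reader_width):
    """A fixed layout ignores the reader's width; whatever does not fit is cut off."""
    return [line[:reader_width] for line in stored_lines]


def render_reflowed(text, reader_width):
    """A reflowable layout recomputes the line breaks for each reader."""
    return textwrap.wrap(text, width=reader_width)


stored = fixed_layout(TEXT)
print("fixed layout on a 20-column display:")
for line in render_fixed(stored, 20):
    print("  |" + line)
print("reflowed for a 20-column display:")
for line in render_reflowed(TEXT, 20):
    print("  |" + line)
```

In the reflowed case every word survives on the narrow display; in the fixed case the stored line breaks win and the reader loses text off the edge.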
PDF lets the writer dictate that and other such details to the reader. HTML gives the reader much more control.
Yup, and there is probably an underlying reason (beyond that Adobe simply distilled Postscript down to just the "drawing commands") for why PDF is so rigid. Have you ever had the misfortune to work with any of the "page designer" or "page layout" crowd? I.e., the folks one hires to do the magazine layout and decide how things should look? These folks, almost 100% of them, value the "design" (the layout, where things are positioned, how much space is here, how big this font is set) over the actual "content" of anything they produce. A huge part of this is because for them, often, when they are producing a layout design, the content is something like lorem ipsum [wikipedia.org] text (i.e., meaningless filler) and so the only thing they deal with, and the only thing they can use to pat each other on the back for "job well done" is the layout (i.e., the physical arrangement of stuff on the page).
These same folks are also almost rabid in their belief that an end recipient of their wondrous "design" should only ever be able to see their wondrous "design" in its exact, pixel perfect, positional glory. This comes in large part from the "design" being all they have to congratulate themselves about, since the content, for them, when they did the job was just lorem ipsum. And back in the late '80s and early '90s, this was the world to which Adobe was pandering with its software offerings: the page layout designer who was rabid in his/her belief that their design should never be modified from the beautiful work of art they created by anyone viewing it later on any medium. With this being their world, it is no wonder that the folks at Adobe who dreamed up converting Postscript into what became PDF saw no problems whatsoever with PDF's rigidity. The expectation in their world was that the document storage format should rigidly preserve their wondrous design for everyone to marvel at who later viewed it.
These same folks are also why HTML has been soiled by CSS that provides the ability to do pixel-exact, unchanging positioning and sizing. They simply could not handle the concept that something they "designed" might be modified by an end user's browser such that things were no longer exactly positioned where they, the designer, decided they should be positioned. Every single CSS declaration where there is the ability to exactly position some HTML element is there as pandering to this world view on the part of the layout artists.
You mention that all the word processors saved documents by basically dumping their working memories to files, and this resulted in nothing being compatible with anything else.
Nope, I said nothing of the sort. Someone else has mentioned that MSWord's old DOC format was basically a memory dump from Word's heap, and that fact has been known for some time. But whoever mentioned that wasn't me. What I said was there were something like 12 different word processors, each reading/writing its own file format (12 different formats, each specific to the WP that wrote it), and with none of the 12 providing much of any ability to interoperate with the others (i.e., read/write the other 11 formats that were not their own). But I did not say that all 12 were memory dumps. They might have all been memory dumps, or maybe only one of them was a memory dump (MSWord's DOC format). But I never said they were all memory dumps, just that they were all incompatible with each other.
but there was a standard then, and it was even free and open: LaTeX, and before that, TeX.
Indeed, yes, there was. And unless one was an academic going for their doctorate in one of the sciences that published via TeX/LaTeX, one generally knew nothing of the existence of those tools. A format based on TeX/LaTeX source, plus enough extra baggage to carry any custom fonts used by that source, would have been a far better way than PDF will ever be to exchange documents that are also useful as data sources.
