Office Open XML Parser
This parser can be used to compare Office Open XML documents, commonly known as Microsoft Word .docx files. The parser output is similar to a PDF conversion of the document, but extended with the semantic data of the original document.
The Office Open XML feature is partially based on open-source components and will be continuously improved.
Supported Features
This parser component is developed according to the Office Open XML specification ECMA-376-1. Where the specification is ambiguous, Microsoft Word 2019 is used as the reference. If a document was created by another application, the rendering result may differ slightly from the rendering in that application.
The current version does not yet support all features of the Office Open XML specification. The following features are currently supported:
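Because rendering can differ for documents produced by other applications, it can help to check which application wrote a file before comparing it. The following is a minimal sketch, not part of the parser itself. It assumes the document stores its extended properties at the conventional path `docProps/app.xml` (strictly, the part is located through the package relationships), and the function name `producing_application` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part (docProps/app.xml).
EP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"


def producing_application(docx):
    """Return (application, app_version) as stored in docProps/app.xml.

    `docx` is a path or a binary file-like object. Returns (None, None) if
    the package has no part at the conventional path (an assumption of this
    sketch; the part can legally live elsewhere).
    """
    with zipfile.ZipFile(docx) as package:
        try:
            raw = package.read("docProps/app.xml")
        except KeyError:
            return None, None
    root = ET.fromstring(raw)
    application = root.findtext(f"{{{EP_NS}}}Application")
    app_version = root.findtext(f"{{{EP_NS}}}AppVersion")
    return application, app_version
```

Both values are optional in the file, so either element of the returned tuple may be `None` even when the part exists.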
Paper format | |
---|---|
Margins | Orientation |
Format | Columns |
Breaks | Indentation |
Spacing | |
Text formatting | |
Size | Typeface (installed only) |
Styles | Color |
Lists | Justification |
Enumerations (standard) | Ligatures (partial) |
Spacing | Widow/Orphan control |
Pagination | Text wrapping |
Tables | |
Colors | Borders |
Merged cells | Embedded Excel tables |
Multi-page tables | Repeating headers |
Rotation/Scaling/Mirroring | |
Shapes | |
Basic forms | Free forms |
Line width | Connections |
Caps | Line styles |
Color | Rotation/Scaling/Mirroring |
Images | |
Positioning | Rotation |
Scaling | Mirroring |
Text fields | |
Positioning | Rotation |
Header and Footer | |
Multi-line | Section header/footer |
Even/Odd pages | Page numbers |
Draft | |
Watermarks | Page color |
References | |
Table of contents | Page numbers |
Not yet supported features
The following features are not yet supported. They are ignored if used in a document.
Text | |
---|---|
Contour | Shadows |
Mirroring | Glow |
Comments | Formulas and Symbols |
Automatic hyphenation | |
Shapes | |
Shadow | Glow |
Soft Edges | 3-D Rotation |
Arrows | |
Images | |
Shadows | Mirroring |
Glow | Soft Edges |
3-D Format | 3-D Rotation |
Art effect | Image corrections |
Charts | |
Font embedding | |
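
Since unsupported features are ignored rather than reported, a quick pre-check of the package can hint at whether a document uses some of them. The sketch below is illustrative only and covers just three entries from the table above. It assumes the part names that Microsoft Word conventionally writes (`word/comments.xml`, `word/charts/`, `word/fonts/`); other producers may use different names, and the function name `unsupported_feature_hints` is hypothetical.

```python
import zipfile

# Part names or prefixes that Microsoft Word conventionally uses.
# This mapping is an assumption of the sketch, not part of the parser.
HINTS = {
    "Comments": ("word/comments.xml",),
    "Charts": ("word/charts/",),
    "Font embedding": ("word/fonts/",),
}


def unsupported_feature_hints(docx):
    """Return a sorted list of feature names whose typical parts are present.

    `docx` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as package:
        names = package.namelist()
    return sorted(
        feature
        for feature, prefixes in HINTS.items()
        # str.startswith accepts a tuple of candidate prefixes.
        if any(name.startswith(prefixes) for name in names)
    )
```

An empty result does not guarantee full support: effects such as shadows, glow, or 3-D rotation are stored inside the document markup and are not visible from part names alone.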