Using PHPWord to generate text documents in PHP

I started working on a project where I need to generate reports as text documents that can then be edited with a word processor. The reports will be a combination of rich text blocks from HTML and data sets in tables. Open Document (odt) or Microsoft’s Office Open XML (docx) are both suitable choices for the file format since they are both open and supported by most word processors.

It’s relatively easy to generate Open Document spreadsheets, but converting HTML text into a formatted document is much more difficult. Luckily there are a handful of PHP libraries for working with Open Document or Office Open XML files that could make this project a lot easier. To meet my needs a library must be able to create a document, import HTML blocks and create tables. I’m going to test the most promising looking libraries, compare them and post the results here. I’ve also created a live demo for testing each library: http://sporkcode.com/sandbox/document

PHPWord

https://github.com/PHPOffice/PHPWord

Demo http://sporkcode.com/sandbox/document/phpWord

PHPWord looks very promising. It’s open source, actively maintained, has documentation and examples, supports many file formats and has an extensive API for working with documents. It’s also easy to install using Composer.

Unfortunately the library failed to format the document correctly in the demo. The first problem is that it doesn’t recognize <b> or <i> tags in the HTML. This seems like a pretty glaring omission, but it was easy to get around by using HTML Purifier to change the tags into <strong> and <em>. The library also has problems formatting text that contains inline HTML tags but is not wrapped in a block-level tag such as <p>. This is especially a problem since the text generated by the HTML editor in the demo is not always wrapped in block-level tags.
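To make the workaround concrete, here is a minimal sketch of that kind of cleanup. It is not the HTML Purifier setup used in the demo. It swaps in plain regular expressions so the example has no dependencies, and the function names normalizeInlineTags and wrapLooseInline are hypothetical. The wrapping step is deliberately conservative: it only wraps a fragment that contains no block-level tag at all and leaves mixed content untouched.

```php
<?php
// Hypothetical helper (not part of PHPWord or HTML Purifier): renames the
// presentational tags <b> and <i> to <strong> and <em>.
function normalizeInlineTags(string $html): string
{
    $map = ['b' => 'strong', 'i' => 'em'];
    foreach ($map as $from => $to) {
        // The lookahead requires whitespace, ">" or "/" right after the tag
        // name, so <b> and </b> match while <br> and <body> do not.
        $html = preg_replace('#<(/?)' . $from . '(?=[\s>/])#i', '<${1}' . $to, $html);
    }
    return $html;
}

// Hypothetical helper: wraps a fragment in <p> when it has no block-level
// tag of its own. Fragments that already contain one are returned as-is.
function wrapLooseInline(string $html): string
{
    $blockTags = 'p|div|h[1-6]|ul|ol|li|table|blockquote|pre';
    if (preg_match('#<(?:' . $blockTags . ')(?=[\s>/])#i', $html)) {
        return $html;
    }
    $trimmed = trim($html);
    return $trimmed === '' ? '' : '<p>' . $trimmed . '</p>';
}
```

The cleaned string would then be passed to the library's HTML import, for example as wrapLooseInline(normalizeInlineTags($html)). A regex approach like this can be fooled by unusual markup, which is one reason a real sanitizer such as HTML Purifier is the safer choice for user-generated HTML.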

Open Document Text

The Open Document text file had many rendering problems.

Screen capture of Open Document Text file created by PHPWord

As the image above shows, the heading tag was rendered without any formatting, inline tags did not render correctly when not inside a block-level tag (as mentioned above), the unordered and ordered lists did not render at all, and the table cells were rendered without any formatting.

Office Open XML

The Office Open XML document rendered better, but still had problems.

Screen capture of Office Open XML generated by PHPWord

Inline tags that are not wrapped in a block-level tag still cause problems. The table styles rendered correctly, but the first column took up the entire width of the document until I set a static column width.
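As a rough sketch of what setting static column widths can look like: PHPWord's addCell() takes a width in twips (1440 per inch), so the widths can be computed up front. The helper columnWidthsInTwips, the 6.5 inch usable width and the cell text are assumptions for illustration, not code from the demo. The PHPWord part only runs if the library is installed and Composer's autoloader has been loaded.

```php
<?php
// PHPWord cell widths are given in twips; 1 inch = 1440 twips.
const TWIPS_PER_INCH = 1440;

// Hypothetical helper: splits a usable page width into fixed column widths.
// $weights are relative sizes, e.g. [1, 2, 1] makes the middle column twice
// as wide as the outer ones.
function columnWidthsInTwips(array $weights, float $usableWidthInches): array
{
    $total = array_sum($weights);
    if ($total <= 0) {
        throw new InvalidArgumentException('Weights must sum to a positive number.');
    }
    $widths = [];
    foreach ($weights as $weight) {
        $widths[] = (int) round($weight / $total * $usableWidthInches * TWIPS_PER_INCH);
    }
    return $widths;
}

// Assumed usable width: US Letter with 1 inch margins on each side.
$widths = columnWidthsInTwips([1, 2, 1], 6.5);

// Skipped unless PHPWord is available through Composer's autoloader.
if (class_exists('PhpOffice\PhpWord\PhpWord')) {
    $phpWord = new \PhpOffice\PhpWord\PhpWord();
    $section = $phpWord->addSection();
    $table = $section->addTable();
    $table->addRow();
    foreach ($widths as $index => $width) {
        // Passing an explicit width keeps one column from claiming the page.
        $table->addCell($width)->addText('Column ' . ($index + 1));
    }
}
```

Computing the widths from relative weights keeps the table proportions in one place, so changing the page size or margins only means changing the usable width.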

Portable Document Format (pdf)

The library has the option to render to PDF, but it requires an additional library, which I didn’t test because of the problems with the Open Document and Office Open XML output.

Other Issues

I took a quick look at the source code to see how difficult it would be to fix some of the problems and was not very impressed with what I found. The code seems rather thrown together: there are unnecessary static functions, arguments passed by reference and uses of reflection classes. Not that there is anything outright wrong with the code, but it doesn’t follow what I would consider best practices. It appears that the project is a fork of a project of the same name originally hosted on CodePlex, which could explain the patchwork nature of the code.

Another thing that annoyed me is that all of the syntax for document styling uses older HTML attribute names (valign, bgcolor) instead of using CSS conventions.

Conclusion

Overall the library has impressive features and appears well supported, but still has plenty of issues. Even though the project is actively maintained, I have doubts that it will become stable. I was very disappointed by the Open Document results, and if I use this library I will definitely stick with Office Open XML.

Next: Using HTML52PDF to create text documents in PHP
