
A while ago, I explained how I was going to create PDF files on Google App Engine. Once I started to really test this solution, it turned out there were a few holes in my strategy. Well, actually there was only one, but it was a major one: importing HTML files into Google Docs is a mess. Hardly any of the HTML formatting is preserved by Google Docs. You might get away with a little right alignment and bold here and there, but any table is completely ruined, as are some of the more advanced formatting options.
So I set out to find another solution.
Turns out there are quite a few Java libraries that support PDF, although none of them seems to be in widespread use or to have a sizable community. Except for one: iText, and the reporting engine built on top of it, JasperReports.
I would have loved to use JasperReports, as it would suit my needs perfectly, but from my searches it appears there’s no way to get it to run on Google App Engine, and there’s also little interest from the JasperReports developers in making that happen (at least, that is what I understood from this message).
So my only option was to stick to the basics. iText has a number of issues on Google App Engine, but those are mainly with inserting images into PDFs, something I don’t yet need. I ran a few tests and, as long as you don’t use the “offending” code, there’s no need to make custom builds of iText. You can just grab the iText jar from the downloads page.
Be warned if you are going for the latest iText: there have been a few refactorings (recently?) that make many of the existing tutorials outdated. Most of them will still work up to a certain level. For instance, I found these iText tutorials very helpful to get me started. However, if you really want to get into the latest version of iText, it seems you’ll have to buy the early access version of the book.
BTW, if I should ever return to converting HTML to PDF, I think “The Flying Saucer Project” is a great place to start. Although it only supports XHTML and doesn’t seem to be very actively developed, all users agree that it’s a pretty great library.
5 Comments
There is a reporting tool called NextReports (www.next-reports.com) which is not pixel-perfect. The designer has a grid layout like Excel’s. The NextReports engine is free to use in your applications. For PDF generation, NextReports uses iText.
Check out http://pdfcrowd.com. They have an online HTML-to-PDF API that can be called from GAE.
PDFCrowd looks like a very capable converter. One thing isn’t very clear: will I have to pay to remove the PDFCrowd footer in the future? Right now you can sign up for an account and remove it, but I’m not sure whether this will stay free.
NextReports looks like a very capable product. I will have to try it out. The site wasn’t very clear on the licensing, specifically that of the engine. Do you happen to know if I’m allowed to use the engine freely in commercial products?
Yes, you are allowed to use the engine freely in commercial products. And starting from today, when a new version was released, the designer is also free to use. Only NextReports Server is a licensed product, but if you use it for other customers you may obtain an OEM license at half price.