PDF Templates via Rails
March 5, 2007
Recently, for a project, I needed a PDF generator that took existing PDFs as templates and filled in data. My one requirement was that changing the PDF’s look and feel wouldn’t require any coding. How hard could that be? Famous last words.
The Problem
The majority of PDF generators out there are exactly that: generators. They simply create new PDFs from program code. They don’t edit existing PDF files, let alone fill in data. So I looked into a couple of other solutions. First I checked out image-based solutions such as ImageMagick. I could of course edit existing image files and add data. The problem is that the added data would have to be positioned by coordinates. If the template image changed, the positions in my code would have to change too. So that was out.
I tried using PostScript files saved from Adobe Illustrator. I created a template and placed text like <% replace_me %>. That should work, right? Wrong. For whatever reason, Adobe Illustrator likes to create HUGE files, 100,000 lines long in some cases. It also splits up text arbitrarily. So “<% replace_me %>” became “<% r”, 20 lines of positioning and font information, “eplac”, 20 more lines of positioning and font information, and so on, making a search and replace impossible.
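To illustrate the failure, here is a small Ruby sketch. The fragments array is made up for the example and only imitates how the placeholder ended up scattered across separate text runs; it is not real Illustrator output.

```ruby
# Hypothetical stand-in for a PostScript file: the placeholder is split into
# separate text runs with positioning and font commands in between.
fragments = [
  '(<% r) show',
  '120 440 moveto /Helvetica findfont 12 scalefont setfont',
  '(eplac) show',
  '152 440 moveto /Helvetica findfont 12 scalefont setfont',
  '(e_me %>) show'
]
postscript = fragments.join("\n")

# A plain search and replace never sees the placeholder as one string.
replaced = postscript.gsub('<% replace_me %>', 'Acme Corp')

puts replaced == postscript  # prints "true": nothing was replaced
```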
iText to the Rescue
iText is an excellent Java PDF library. In fact, I believe that, other than its C# counterpart, it is the only PDF library that could do what I wanted. iText can take existing PDFs and manipulate them. It can also take blank PDF forms and fill in the data, similar to taking an HTML form and setting the value attribute of each form field tag. This is exactly what I wanted. Although it is a roundabout solution, it does work.
Creating PDF forms is pretty easy too. The downside is that only one product can create them: Adobe LiveCycle Designer, which comes with Adobe Acrobat Professional.
Solution
So first I needed to create a wrapper around iText using the excellent Ruby Java Bridge (Rjb).
require 'rjb'
require 'tmpdir' # provides Dir.tmpdir

# Load the iText jar into the JVM that Rjb starts.
Rjb::load('lib/itext-1.4.8.jar')

class PDFStamper
  attr_accessor :writer

  def initialize( template = "proposal_template.pdf" )
    filestream = Rjb::import('java.io.FileOutputStream')
    pdfreader  = Rjb::import('com.lowagie.text.pdf.PdfReader')
    pdfstamper = Rjb::import('com.lowagie.text.pdf.PdfStamper')
    # Read the template and stamp the filled-in copy into a temp file.
    reader = pdfreader.new( template )
    @stamp = pdfstamper.new( reader, filestream.new( tmpfile ) )
    @form  = @stamp.getAcroFields()
  end

  # Set the form field named +key+ to +value+.
  def set( key, value )
    @form.setField( key, value.to_s )
  end

  # Flatten the form so the fields are no longer editable, then write the file.
  def fill
    @stamp.setFormFlattening(true)
    @stamp.close
  end

  # Path of the generated PDF; computed once and then reused.
  def tmpfile
    @tmpfile ||= File.join( Dir.tmpdir, make_tmpname )
  end

  private

  def make_tmpname
    'proposal-' + rand(10000).to_s + '.pdf'
  end
end
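One thing to watch in make_tmpname: rand(10000) can produce the same name twice, which matters once several Mongrel processes write to the same temp directory. A possible alternative, sketched below, builds the name from SecureRandom.hex. The method name unique_tmp_path is made up for this example, and it assumes a Ruby version that ships SecureRandom in its standard library.

```ruby
require 'tmpdir'
require 'securerandom'

# Hypothetical replacement for tmpfile/make_tmpname: 16 random hex characters
# make a name collision between processes very unlikely.
def unique_tmp_path(prefix = 'proposal-', extension = '.pdf')
  File.join(Dir.tmpdir, "#{prefix}#{SecureRandom.hex(8)}#{extension}")
end

puts unique_tmp_path
```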
Then, using Adobe LiveCycle Designer, I simply created a PDF form and added text fields that would be filled out by iText. The text fields can be styled, so don’t think you have to keep that normal “text field” look. Make sure you give each text field a name; that is the name you pass to “set( textfield_name, value )” to set its value. In my PDF I simply named each text field after the matching database column. Then in my controller code, I had the following.
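def output
  order = Order.find(params[:id])
  pdf = PDFStamper.new
  # Each form field is named after a column, so copy the values across by name.
  for column in Order.content_columns
    pdf.set( column.name, order.send(column.name) )
  end
  pdf.fill
  # Read in binary mode and close the file handle once the data is in memory.
  send_data( File.open( pdf.tmpfile, 'rb' ) { |f| f.read },
    :filename => "order.pdf",
    :type => "application/pdf",
    :disposition => "inline"
  )
end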
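The loop in the controller works only because each form field carries the same name as a database column. The sketch below shows that mapping in isolation, without Rails or iText. FakeStamper and the order hash are stand-ins invented for the example; the real PDFStamper#set forwards to iText’s setField instead of storing values in a hash.

```ruby
# Stand-in for PDFStamper that records what would be written to the form.
class FakeStamper
  attr_reader :fields

  def initialize
    @fields = {}
  end

  # Same signature as PDFStamper#set: values are converted to strings.
  def set(key, value)
    @fields[key] = value.to_s
  end
end

# Stand-in for an Order record: column name => value.
order = { 'customer' => 'Acme Corp', 'total' => 149.5, 'paid' => true }

pdf = FakeStamper.new
order.each { |column, value| pdf.set(column, value) }

p pdf.fields
```

A column without a matching field name in the PDF would simply have nowhere to go, which is why the naming convention matters.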
Caveats
First off, the Java environment variables need to be set correctly before this will work.
export LD_LIBRARY_PATH=/usr/java/jdk1.6.0/jre/lib/i386/:/usr/java/jdk1.6.0/jre/lib/i386/client/:./
export JAVA_HOME=/usr/java/jdk1.6.0/
You can set these variables on the command line and start Mongrel manually with “mongrel_rails start”, which works fine. In production, though, this isn’t really a good solution.
I ended up using the mongrel_cluster init.d script that comes with Mongrel. Documentation is available here. I simply placed the export commands at the top of the script.
Another issue I hit was with Java’s startup. Java checks the total available system memory and then proceeds to claim a good portion of it. This isn’t a problem on a dedicated server. A virtual server, on the other hand, is allocated only a portion of the host’s memory. So if the host has 4 GB of memory, Java thinks it has 4 GB to play with, not the 256 MB allocated to your virtual server.
This caused a weird issue where one Mongrel process in my cluster would work and another wouldn’t. Each Mongrel instance starts its own Java process, so the first one would claim all the available memory, and the second couldn’t even start because no memory was left.
Making matters worse, Rails or Mongrel (I’m not sure which) would hide this memory error. I didn’t figure it out until I created a test script that forked, with each fork loading the iText jar. The test showed the error coming from Java.
To fix this, I set the _JAVA_OPTIONS environment variable. The options are passed to Java as it loads and limit the amount of RAM each Java instance can use. Just place this next to your other Java environment variables inside your Mongrel init.d script.
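A test script along those lines might look like the sketch below. It is a reconstruction, not the original script: run_in_forks is a made-up helper, and the block passed to it is a placeholder where the real test would call Rjb::load on the iText jar. Each child reports success or failure through its exit status, so an error raised while the JVM starts shows up directly instead of being swallowed. It relies on fork, so it will not run on Windows.

```ruby
# Run the given block in `count` forked child processes and return one
# boolean per child: true if the block finished without raising.
def run_in_forks(count, &block)
  pids = Array.new(count) do |index|
    fork do
      begin
        block.call(index)
        exit!(0)
      rescue Exception => e
        warn "child #{index} failed: #{e.class}: #{e.message}"
        exit!(1)
      end
    end
  end
  pids.map { |pid| Process.wait2(pid).last.success? }
end

# Placeholder workload. The real test would do something like:
#   require 'rjb'; Rjb::load('lib/itext-1.4.8.jar')
results = run_in_forks(2) { |index| raise 'out of memory' if index == 1 }
p results
```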
export _JAVA_OPTIONS='-Xms16m -Xmx32m'
You may have to fudge these numbers a little for your particular environment. If you are using a dedicated server, don’t worry about it.
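An alternative that should have the same effect is to pass the options from Ruby: Rjb::load takes an optional second argument, an array of JVM command-line options. The sketch below only builds that array. jvm_memory_args is a hypothetical helper name, and the Rjb::load call is left commented out so the snippet runs without Rjb or a JVM installed.

```ruby
# Build JVM heap options, e.g. ["-Xms16m", "-Xmx32m"].
def jvm_memory_args(min_mb, max_mb)
  raise ArgumentError, 'min_mb must not exceed max_mb' if min_mb > max_mb
  ["-Xms#{min_mb}m", "-Xmx#{max_mb}m"]
end

args = jvm_memory_args(16, 32)
p args

# With Rjb installed, the options would be handed to the JVM at load time:
#   require 'rjb'
#   Rjb::load('lib/itext-1.4.8.jar', args)
```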
Limitations
For my particular needs, I only needed text placeholders in the template. However, I believe that with LiveCycle Designer you can also place image placeholders and table-based data, then use iText to fill them in. Don’t take my word for it, though.
March 6, 2007 at 8:06 am
Thank you for the excellent blogpost. I have added a link to it in my FAQ: http://itext.ugent.be/library/question.php?id=35
Some extra pointers:
You are right when you say there is only one product that can create XFA forms: Adobe LiveCycle Designer. But there are plenty of other products that can create AcroForms.
See http://itext.ugent.be/library/question.php?id=31 to read more about the difference between the two flavors of PDF forms.
As for using Image placeholders: there’s a possible solution in the third comment of the FAQ entry on forms, but in a more recent iText version a new method was added to iText that makes it easier to replace a button (buttons are the best placeholders for images): search for AcroFields.replacePushbuttonField()
June 7, 2007 at 9:57 am
This is what I was looking for!
I have this solved in a Java application using a servlet; now I am trying to rewrite my application in Ruby/Rails.
Which file does the PDFStamper code go into? Would that be a model file?
Does it matter where you place the itext.jar, and if so, where do you have to place it?
Is there any rule regarding the path to the template?
Thank you
December 4, 2007 at 10:46 pm
I get this error. Can you help me?
Can’t start the AWT because Java was started on the first thread. Make sure StartOnFirstThread is not specified in your application’s Info.plist or on the command line
December 4, 2007 at 10:52 pm
Funny. I was getting this error a couple of days ago. I believe it is an issue on Leopard/OS X. I moved the code over to my Ubuntu box and it worked correctly.
I also created a RubyForge project for this code. It has been updated. You might want to check that out also.
http://rubyforge.org/projects/pdf-stamper/
The code on SVN is the latest and greatest and should work. It also works on JRuby.
E-mail me if you find a workaround.
December 4, 2007 at 11:49 pm
I think so too, but now I’ve got another error. It returns
/Users/magnum/Documents/webapps/gest/lib/pdfstamper.rb:29:in `method_missing'
every time I call
def set( key, value )
  puts(key+"="+value.to_s)
  begin
    @form.setField( key, value.to_s )
  rescue
  end
end
please, have you a skype account or something similar
? my skype account skype: reactiva , gtalk: reactiva@msn.com
thanks a lot !
Bye Antonio (magnum)
December 5, 2007 at 1:01 am
The funny thing is that I just installed the one from SVN on RubyForge.
I’m using this code:
def print_pdf
  @pdf = PDF::Stamper.new("/Users/magnum/webapps/gest/lib/test_template.pdf")
  @pdf.text :text_field01, 'test'
  @pdf.image :button_field01, "logo.gif"
  @pdf.save_as "test_output.pdf"
end
@pdf.image works perfectly,
BUT @pdf.text always throws an exception:
NoClassDefFoundError in InvoiceController#print_pdf
/Users/magnum/Documents/webapps/gest/lib/pdf/stamper/rjb.rb:66:in `method_missing'
/Users/magnum/Documents/webapps/gest/lib/pdf/stamper/rjb.rb:66:in `text'
/Users/magnum/Documents/webapps/gest/app/controllers/invoice_controller.rb:55:in `print_pdf'
/Library/Ruby/Gems/1.8/gems/actionpack-1.13.6/lib/action_controller/base.rb:1101:in `send’
…
…
…
December 5, 2007 at 6:26 pm
It would probably be best if you post this on the PDF::Stamper mailing list.
http://rubyforge.org/mailman/listinfo/pdf-stamper-devel
December 27, 2007 at 3:47 am
Hi, after much banging-of-the-head-against-the-wall I figured out how to circumvent the “Can’t start the AWT” error on Leopard. Rjb::load has an optional second argument that’s an array of command-line args for the Java VM. Adding -Djava.awt.headless=true seems to do the trick. Ultimately, line 12 in the current version should read as follows:
Rjb::load(File.join(File.dirname(__FILE__), '..', '..', 'ext', 'itext-2.0.4.jar'), ['-Djava.awt.headless=true'])
Dunno how it affects non-Mac environments, but my guess is that it won’t hurt. Anyway, thanks for this great code, and please continue to work on it; it’s quite useful.
December 7, 2008 at 12:35 am
All, I had this solution working until I had to move it to BackgrounDRb for automation purposes. Now every time it tries to load the PdfWriter it segfaults. Anyone have any thoughts? Any help appreciated.