PDF Templates via Rails

March 5, 2007

Recently, for a project, I needed a PDF generator that took existing PDFs as templates and filled in data. My one simple requirement was that changing the PDF’s look and feel wouldn’t require any coding. How hard could that be? Famous last words.

The Problem

The majority of PDF generators out there are exactly that: generators. They simply create new PDFs from program code. They don’t edit existing PDF files, let alone fill in data. So I looked into a couple of other solutions. First I checked out image-based solutions such as ImageMagick. I could of course edit existing image files and add data. The problem is that the added data would have to be positioned by coordinates in code. If the template image changed, my positions in code would have to change too. So that was out.

I tried using PostScript files saved from Adobe Illustrator. I created a template and placed text like <% replace_me %>. That should work, right? Wrong. For whatever reason Adobe Illustrator likes to create HUGE files, 100,000 lines long in some cases. It also splits up text arbitrarily. So “<% replace_me %>” became “<% r”, then 20 lines of positioning and font information, then “eplac”, then another 20 lines of positioning and font information, and so on, making a search and replace impossible.

iText to the Rescue

iText is an excellent Java PDF library. In fact, I believe that, other than its C# counterpart, it is the only PDF library that could do what I wanted. iText can take existing PDFs and manipulate them. It can also take blank PDF forms and fill in the data, similar to taking an HTML form and setting the value attribute of each form field tag. This is exactly what I wanted. Although it is a roundabout solution, it does work.

Creating PDF forms is pretty easy too. The downside is that only one product can create them: Adobe LiveCycle Designer, which comes with Adobe Acrobat Professional.

Solution

So first I needed to create a wrapper around iText using the excellent Ruby Java Bridge (Rjb).

require 'rjb'
require 'tmpdir'  # provides Dir.tmpdir

# Start the JVM with the iText jar on its classpath.
Rjb::load('lib/itext-1.4.8.jar')

class PDFStamper
  def initialize( template = "proposal_template.pdf" )
    filestream = Rjb::import('java.io.FileOutputStream')
    pdfreader  = Rjb::import('com.lowagie.text.pdf.PdfReader')
    pdfstamper = Rjb::import('com.lowagie.text.pdf.PdfStamper')

    # Read the template and stamp the result into a temporary file.
    reader = pdfreader.new( template )
    @stamp = pdfstamper.new( reader, filestream.new( tmpfile ) )
    @form  = @stamp.getAcroFields
  end

  # Set the form field named +key+ to +value+.
  def set( key, value )
    @form.setField( key, value.to_s )
  end

  # Flatten the form so the values become ordinary page content,
  # then close the stamper, which writes the output file.
  def fill
    @stamp.setFormFlattening(true)
    @stamp.close
  end

  # Path of the output file; chosen once and then reused.
  def tmpfile
    @tmpfile ||= File.join( Dir.tmpdir, make_tmpname )
  end

  private

  def make_tmpname
    'proposal-' + rand(10000).to_s + '.pdf'
  end
end
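One thing to watch in the class above: make_tmpname picks a number below 10,000, so two requests arriving close together could end up writing to the same file. A minimal sketch of a less collision-prone name, using only the standard library; the helper name unique_pdf_path is made up for illustration, and SecureRandom needs a Ruby newer than the one this post was written against.

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical replacement for make_tmpname/tmpfile: builds a path in the
# system temp directory from 16 random hex characters instead of a
# number below 10,000.
def unique_pdf_path(prefix = 'proposal-')
  File.join(Dir.tmpdir, "#{prefix}#{SecureRandom.hex(8)}.pdf")
end
```

Inside PDFStamper this could stand in for the body of tmpfile, still memoized so the path stays the same for the life of the object.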

Then, using Adobe LiveCycle Designer, I simply created a PDF form and added text fields to be filled in by iText. The text fields can be styled, so don’t think you have to keep that normal “text field” look. Make sure you give each text field a name; that name is the key you pass to “set( textfield_name, value )” to set its value. In my PDF I simply named each text field after the corresponding database column. Then in my controller code, I had the following.

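def output
  order = Order.find(params[:id])
  pdf = PDFStamper.new

  # Form fields are named after database columns, so copy column by column.
  Order.content_columns.each do |column|
    pdf.set( column.name, order.send(column.name) )
  end

  pdf.fill

  # Read in binary mode; the block form closes the file handle afterwards.
  send_data( File.open( pdf.tmpfile, 'rb' ) { |f| f.read },
    :filename => "order.pdf",
    :type => "application/pdf",
    :disposition => "inline"
  )
end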
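Note that nothing in the controller removes the generated file, so the temp directory will slowly fill up with proposals. A small sketch of one way to handle that; read_and_remove is a hypothetical helper, not part of the wrapper above.

```ruby
# Hypothetical helper: returns the file's bytes, then deletes the file
# whether or not the read succeeded.
def read_and_remove(path)
  File.open(path, 'rb') { |f| f.read }
ensure
  File.delete(path) if File.exist?(path)
end

# Intended use in the controller action:
#   send_data( read_and_remove( pdf.tmpfile ), :filename => "order.pdf", ... )
```

Reading the bytes into memory before deleting is fine for small documents like these; for large files a periodic cleanup job may be a better fit.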

Caveats

First off, you need the Java environment variables set correctly before this will work.

export LD_LIBRARY_PATH=/usr/java/jdk1.6.0/jre/lib/i386/:/usr/java/jdk1.6.0/jre/lib/i386/client/:./
export JAVA_HOME=/usr/java/jdk1.6.0/

You can set these variables on the command line and start Mongrel manually with “mongrel_rails start”, which works fine. In production, though, that isn’t really a good solution.

I ended up using the mongrel_cluster init.d script that comes with mongrel. Documentation is available here. I simply placed the export commands at the top of the script.
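A missing variable may only show up as an obscure failure once Rjb tries to start Java, so it can help to check for the variables before loading the jar. The following is a rough sketch, not something from the original setup; the constant and method names are invented, and the list contains just the two variables exported above.

```ruby
# Hypothetical boot-time check for the two variables exported above.
REQUIRED_JAVA_ENV = %w[JAVA_HOME LD_LIBRARY_PATH].freeze

# Returns the names of required variables that are unset or blank.
def missing_java_env(env = ENV, required = REQUIRED_JAVA_ENV)
  required.select { |name| env[name].to_s.strip.empty? }
end

# Raises with a readable message if anything is missing.
def check_java_env!(env = ENV)
  missing = missing_java_env(env)
  return if missing.empty?
  raise "Java environment not set: #{missing.join(', ')}"
end
```

Calling check_java_env! just before Rjb::load would turn a silent startup problem into an error that names the variable at fault.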

Another issue I hit was when Java starts. Java checks the total available system memory and then proceeds to claim a good portion of it. This isn’t a problem on a dedicated server. A virtual server, on the other hand, is allocated only a portion of the host’s memory. So if the host you are on has 4 GB of memory, Java thinks it has 4 GB to play with, not the 256 MB allocated to your virtual server.

This caused a weird issue where one mongrel process in my cluster would work and another wouldn’t. Each mongrel instance starts its own Java virtual machine. The first one would take all the available memory, and then the second couldn’t even start because no memory was left.

Making matters worse, Rails or Mongrel (I’m not sure which) would hide this memory error. I didn’t figure it out until I created a test script that forked, with each fork loading the iText jar. The test showed the error coming from Java.
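The original test script isn’t shown, so here is a sketch of what such a fork test might look like. The helper run_in_forks is made up for this example; it runs a block in several child processes and reports how each one exited, and it needs a platform that supports fork.

```ruby
# Runs the given block in `count` forked children and returns each child's
# exit status in order: 0 for success, 1 if the block raised, and nil if
# the child was killed by a signal (for example when the JVM aborts).
def run_in_forks(count)
  pids = (1..count).map do |i|
    fork do
      begin
        yield i
        exit!(0)  # exit! skips at_exit handlers inherited from the parent
      rescue Exception => e
        warn "child #{i} failed: #{e.class}: #{e.message}"
        exit!(1)
      end
    end
  end
  pids.map { |pid| Process.wait2(pid).last.exitstatus }
end

# Intended use, mirroring two mongrels each starting Java:
#   statuses = run_in_forks(2) do
#     require 'rjb'
#     Rjb::load('lib/itext-1.4.8.jar')
#   end
#   p statuses
```

Because the children run outside Rails and Mongrel, anything Java prints on failure goes straight to the terminal instead of being swallowed.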

To fix this, I set the _JAVA_OPTIONS environment variable. Its options are passed to Java as it loads, and here they limit the amount of RAM each Java instance can use. Just place this next to your other Java environment variables inside your init.d mongrel script.

export _JAVA_OPTIONS='-Xms16m -Xmx32m'

You may have to adjust these numbers a little for your particular environment. Or, if you are using a dedicated server, don’t worry about it.
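A possible alternative to the environment variable, as a sketch only: Rjb::load takes an optional second argument, an array of JVM arguments (comment 8 below uses it for -Djava.awt.headless=true). Assuming the heap flags are honoured the same way when passed like this, which is not verified here, the limits could live next to the code that loads the jar. The helper jvm_args is hypothetical.

```ruby
# Hypothetical helper: builds the JVM argument list for Rjb::load.
# Heap sizes are given in megabytes.
def jvm_args(min_heap_mb, max_heap_mb, headless = true)
  raise ArgumentError, 'min heap exceeds max heap' if min_heap_mb > max_heap_mb
  args = ["-Xms#{min_heap_mb}m", "-Xmx#{max_heap_mb}m"]
  args << '-Djava.awt.headless=true' if headless
  args
end

# Intended use (needs the rjb gem and the iText jar, so left commented out):
#   require 'rjb'
#   Rjb::load('lib/itext-1.4.8.jar', jvm_args(16, 32))
```

When picking the numbers, remember that every mongrel in the cluster starts its own JVM, so the maximum heap multiplied by the number of mongrels has to fit comfortably inside the memory allocated to the virtual server.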

Limitations

Now, for my particular needs, I only needed text placeholders in the template. However, I believe that with LiveCycle Designer you can also place image placeholders and table-based data, then use iText to fill them in. Don’t take my word for it, though.

10 Responses to “PDF Templates via Rails”


  1. Thank you for the excellent blogpost. I have added a link to it in my FAQ: http://itext.ugent.be/library/question.php?id=35

    Some extra pointers:

    You are right when you say there is only one product that can create XFA forms: Adobe LiveCycle Designer. But there are plenty of other products that can create AcroForms.
    See http://itext.ugent.be/library/question.php?id=31 to read more about the difference between the two flavors of PDF forms.

    As for using Image placeholders: there’s a possible solution in the third comment of the FAQ entry on forms, but in a more recent iText version a new method was added to iText that makes it easier to replace a button (buttons are the best placeholders for images): search for AcroFields.replacePushbuttonField()

  2. ed Says:

    This is what I was looking for!

    I have this solved in a java application using a servlet; now I am trying to rewrite my application in ruby/rails.

    Into which file does the pdfstamper code go ? Would that be a model file ?

    Does it matter where you place the itext.jar and if so , where do you have to place it ?

    Is there any rule regarding the path to the template ?

    Thank you

  3. magnum Says:

    i get this error, can you help me ?
    Can’t start the AWT because Java was started on the first thread. Make sure StartOnFirstThread is not specified in your application’s Info.plist or on the command line

    ?

  4. Jason Yates Says:

    Funny. I was getting this error a couple of days ago. I believe it is an issue on Leopard/OS X. I moved the code over to my Ubuntu box and it worked correctly.

    I also created a RubyForge project for this code. It has been updated. You might want to check that out also.

    http://rubyforge.org/projects/pdf-stamper/

    The code on SVN is the latest and greatest and should work. It also works on JRuby.

    E-mail me if you find a workaround.

  5. magnum Says:

    i think too, but now i’ve got another error, it return
    /Users/magnum/Documents/webapps/gest/lib/pdfstamper.rb:29:in `method_missing’

    always when i call

    def set( key, value )
    puts(key+"="+value.to_s)
    begin
    @form.setField( key, value.to_s )

    rescue

    end

    end

    please, have you a skype account or something similar :-) ? my skype account skype: reactiva , gtalk: reactiva@msn.com

    thanks a lot !
    Bye Antonio (magnum)

  6. magnum Says:

    the funny thing is that i just installed the one from sn on rubyforge

    i’m using this code:

    def print_pdf
    @pdf = PDF::Stamper.new("/Users/magnum/webapps/gest/lib/test_template.pdf")
    @pdf.text :text_field01, 'test'
    @pdf.image :button_field01, "logo.gif"
    @pdf.save_as "test_output.pdf"

    end

    @pdf.image works perfectly
    BUT @pdf.text always throws an exception, always:
    NoClassDefFoundError in InvoiceController#print_pdf
    /Users/magnum/Documents/webapps/gest/lib/pdf/stamper/rjb.rb:66:in `method_missing’
    /Users/magnum/Documents/webapps/gest/lib/pdf/stamper/rjb.rb:66:in `text’
    /Users/magnum/Documents/webapps/gest/app/controllers/invoice_controller.rb:55:in `print_pdf’
    /Library/Ruby/Gems/1.8/gems/actionpack-1.13.6/lib/action_controller/base.rb:1101:in `send’


  7. Jason Yates Says:

    It would probably be best if you post this on the PDF::Stamper mailing list.

    http://rubyforge.org/mailman/listinfo/pdf-stamper-devel

  8. Daniel Says:

    Hi, after much banging-of-the-head-against-the-wall I figured out how to circumvent the “Can’t start the AWT” error on Leopard. Rjb::load has an optional second argument that’s an array of command-line args for the Java VM. Adding -Djava.awt.headless=true seems to do the trick. Ultimately, line 12 in the current version should read as follows:

    Rjb::load(File.join(File.dirname(__FILE__), '..', '..', 'ext', 'itext-2.0.4.jar'), ['-Djava.awt.headless=true'])

    Dunno how it affects non-Mac environments but my guess is that it won’t hurt. Anyway, thanks for this great code, and please continue to work on it- it’s quite useful.


  9. [...] PDF Templates via Rails « Just a Techie Blog (tags: generator create howto pdf rails reports ruby rubyonrails template tutorial) [...]

  10. Constantine Mavros Says:

    All, I had this solution working until I had to move it to Backgrondrb for automation purposes. Now every time It tries to load the PdfWriter it seg faults. Anyone have any thoughts? Any help appreciated.

