1 May 2017

Fast PDF scaling with page numbering under Ubuntu

The problem

We want a backend process to scale PDF files and number pages. Currently, we're using some Java code based on the last LGPL iText version (2.1.7), which does PDF scaling and stamping. But the code includes some features for custom output formatting (text tables, barcodes) for footers and margins written in Java, so only software developers have the knowledge to customize and recompile the code. Wouldn't it be nicer if the customer could customize these output formats directly?

What we need:
  • PDF stamping feature
  • Page numbering
  • Page scaling
We've used PyPDF2 and xhtml2pdf in the past, but they may be too slow for big documents.

The proposed solution

pdfjam is a package with a bunch of scripts for PDF manipulation, based on the pdflatex / pdftex command line tools included in the TeX Live binaries package. On Ubuntu, you can get it from the standard repositories.

Scaling


The following command line scales a PDF input file:

pdfjam --scale 0.9 --outfile output.pdf input.pdf

It's very quick. On my machine it takes less than 1 s for a 120-page, 2.1 MB PDF file.
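
Since the goal is a backend process, the same call can be wrapped in a loop over a directory. This is only a minimal sketch, assuming a POSIX shell; scale_all and its directory arguments are hypothetical names chosen for illustration, not part of pdfjam:

```shell
# Sketch: scale every PDF in one directory into another directory.
# scale_all is a hypothetical helper, not something pdfjam provides.
# Usage: scale_all input_dir output_dir [scale]
scale_all() {
    in_dir=$1
    out_dir=$2
    scale=${3:-0.9}                      # same default as the example above
    mkdir -p "$out_dir" || return 1
    for f in "$in_dir"/*.pdf; do
        [ -e "$f" ] || continue          # skip the literal pattern if no PDFs exist
        pdfjam --scale "$scale" \
               --outfile "$out_dir/$(basename "$f")" "$f" || return 1
    done
}
```

Calling scale_all incoming scaled 0.9 would then write one scaled copy per input file, stopping at the first file pdfjam cannot process.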

Page Numbering

With some additions we can generate page numbers. Note: the following command should be a one-liner:

pdfjam  --preamble '\usepackage{fancyhdr} \topmargin 85pt \oddsidemargin 140pt \pagestyle{fancy} \rfoot{\Large\thepage} \cfoot{} \renewcommand {\headrulewidth}{0pt} \renewcommand {\footrulewidth}{0pt} '  --pagecommand '\thispagestyle{fancy}' --scale 0.9 --outfile output.pdf input.pdf

This is still very quick. Some explanations go here:
  • The preamble argument is just the text which goes into the .tex command file fed to pdflatex, just before the "\begin{document}" part.
  • The --pagecommand is an additional argument which goes into the "\includepdfmerge" command.
  • If you want to have a look into the generated .tex command file, add --no-tidy to the command line.
  • The "topmargin" and "oddsidemargin" are set for A4 page size. You may experiment with your own preferences.
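
Because the one-liner is hard to read and edit, one option is to build the preamble from one fragment per line and pass it to pdfjam as a single argument. A minimal sketch, assuming a POSIX shell; build_preamble and number_pages are hypothetical helper names, and the margins are the A4 values from the command above:

```shell
# Sketch: the page numbering call with a readable preamble.
# build_preamble and number_pages are hypothetical names, not part of pdfjam.

# Print the LaTeX preamble fragments separated by spaces. With a '%s'
# format, printf does not interpret the backslashes in its arguments.
build_preamble() {
    printf '%s ' \
        '\usepackage{fancyhdr}' \
        '\topmargin 85pt' \
        '\oddsidemargin 140pt' \
        '\pagestyle{fancy}' \
        '\rfoot{\Large\thepage}' \
        '\cfoot{}' \
        '\renewcommand {\headrulewidth}{0pt}' \
        '\renewcommand {\footrulewidth}{0pt}'
}

# Usage: number_pages input.pdf output.pdf [scale]
number_pages() {
    pdfjam --preamble "$(build_preamble)" \
           --pagecommand '\thispagestyle{fancy}' \
           --scale "${3:-0.9}" \
           --outfile "$2" "$1"
}
```

With these definitions loaded, number_pages input.pdf output.pdf should run the same pdfjam call as the one-liner, and a customer-specific footer only requires editing one line in build_preamble.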

Page Numbering with the "{page} of {pages}" format

If we would like to write out page numbers like this, we need the lastpage LaTeX package. Now the pdflatex command (called from pdfjam) must be invoked twice, because the total page count is only known after the first run. This requires changing the pdfjam shell script. Just replace the line:

$pdflatex $texFile > $msgFile || {

with something like this:

$pdflatex $texFile > $msgFile && if grep 'xdef' $auxFile > /dev/null ; then $pdflatex $texFile >> $msgFile ; fi || {

That is, if the aux file contains any xdef definition, we run a second pass.
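
The "&& if ... fi ||" chain is easy to misread, so here is a sketch of the control flow in isolation. It assumes a POSIX shell; fake_latex is a hypothetical stand-in for "$pdflatex $texFile" that only counts its calls, so the logic can be tried without TeX installed:

```shell
# Sketch of the control flow only; nothing here is taken from pdfjam itself.
# fake_latex is a hypothetical stand-in for "$pdflatex $texFile".
passes=0
fake_latex() { passes=$((passes + 1)); }

# Usage: run_passes file.aux
run_passes() {
    passes=0
    # The first pass always runs. The second pass runs only if grep finds
    # 'xdef' in the aux file. The braces after || run only if a pass failed.
    fake_latex && if grep 'xdef' "$1" > /dev/null; then fake_latex; fi || {
        echo "latex run failed" >&2
        return 1
    }
}
```

After run_passes some.aux, the variable passes holds the number of runs. When grep finds nothing, the if statement as a whole still exits with status 0, so the error branch is not triggered by a missing match.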

For Ubuntu, the lastpage TeX Live package is included in the texlive-latex-extra package. If you don't want to install the recommended documentation, you can run the following command:

sudo apt-get install --no-install-recommends texlive-latex-extra
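
To check that the package is actually visible to TeX after installation, one option is kpsewhich, the file lookup tool that ships with TeX Live. A small sketch; have_tex_package is a hypothetical helper name:

```shell
# Sketch: succeed if TeX can find the given .sty file.
# have_tex_package is a hypothetical helper; kpsewhich comes with TeX Live
# and exits with a non-zero status when the file is not found.
# Usage: have_tex_package lastpage
have_tex_package() {
    kpsewhich "$1.sty" > /dev/null 2>&1
}
```

For example, have_tex_package lastpage || echo "lastpage is missing" reports the problem before pdfjam fails with a LaTeX error.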

Now, let's change the page numbering format:

pdfjam --preamble '\usepackage{fancyhdr} \usepackage{lastpage} \topmargin 85pt \oddsidemargin 140pt \pagestyle{fancy} \rfoot{\Large\thepage\ of \pageref{LastPage}} \cfoot{} '  --pagecommand '\thispagestyle{fancy}' --scale 0.9 --outfile output.pdf input.pdf

This doubles the time required to generate the document, but it still takes only 1.8 s for my 120-page document.


1 comment:

  1. Thanks for this post. I am no latex expert and this post got me started in the right direction to solve my problem.
