Troubleshooters.Com and T.C Linux Library Present

Scanning to PNM, then Concatenating to PDF

Copyright (C) 2010 by Steve Litt, All rights reserved. Material provided as-is, use at your own risk. 




Disclaimer

DO NOT use the techniques on this page without backing up your original .pnm files and resulting .pdf file. Do not use these techniques without fully verifying that the resulting PDF is what the recipient needs. I am not responsible for any damage or injury caused by your use of this page, or caused by errors and/or omissions in this page. If that's not acceptable to you, you may not use this web page. By using this web page, you are accepting this disclaimer.

Introduction

Xsane's Multipage facility scans multiple pieces of paper to a sequence of .pnm files, and then when you click "Save multipage file" it concatenates those .pnm files into a single .pdf file. So far so good. But what happens if you need to make a change to one of those pages? You could use pdftk to split the PDF into individual pages, change one with Gimp, then use pdftk to join them back together. But Gimping PDF files pixelates them badly. It would be much better if you could use Gimp to directly edit the .pnm files, and then re-concatenate. Unfortunately, as far as I've been able to find, there's no easy way to do that within Xsane. Hence this web page.

Multipage Scanning With Xsane

Xsane window    
To the left is the basic Xsane window. Notice that in the upper right, just below the menubar, there is a dropdown control set to "Multipage". The dropdown has the following choices:
  • Viewer
  • Save
  • Copy
  • Multipage
  • Fax
  • E-mail
When you select "Multipage", Xsane pops up the "Xsane Multipage Project" screen, as follows...

Multipage project screen as it first pops up
  
The multipage project screen, as shown to the left, starts with the last project name in the field. You can see I'm calling mine "myscan" and putting it in my home directory. Then press the Create project button, which creates directory /home/slitt/myscan. Once you click that button, a directory structure is created and the screen changes.

Here is the directory structure:
slitt@mydesk:~$ find myscan
myscan
myscan/xsane-multipage-list
slitt@mydesk:~$


And here are the contents of myscan/xsane-multipage-list:
Project created@@@@@@@@@@@@@@@@
image-0001.pnm


Notice there's no image-0001.pnm. This is a dummy -- it's the next image to be created.
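
The two lines shown above (a status line padded with @ characters, then the name of the next image to be scanned) are easy to read from the shell. Here's a minimal sketch, assuming the file layout is exactly what you see on this page. The function names list_status and list_next are my own inventions, not part of Xsane.

```shell
# Print the status text from line 1, with the trailing @ padding removed.
list_status() {
    head -n 1 "$1" | sed 's/@*$//'
}

# Print line 2: the name Xsane will give the NEXT scan.
# That file does not exist on disk yet.
list_next() {
    sed -n '2p' "$1"
}
```

For the file shown above, list_status myscan/xsane-multipage-list prints Project created, and list_next myscan/xsane-multipage-list prints image-0001.pnm.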

Multipage project screen after project creation
  
As for the screen changes, look to the left.

Now the Multipage document filetype dropdown is enabled. Make sure it's "PDF". The "Pages" window is enabled but empty. This window will later be filled with individual scans, each of which is stored on disk as a .pnm file. Each scan, which will eventually be concatenated into a .pdf, can be shown, edited, deleted, or moved up or down in the list (in other words, the pages can be rearranged).

Multipage project screen, page 1
  
Now scan page 1. You see image-0001 in the scan window, and the directory structure has been changed as follows:
slitt@mydesk:~$ find myscan
myscan
myscan/xsane-multipage-list
myscan/image-0001.pnm
slitt@mydesk:~$ cat myscan/xsane-multipage-list
Project changed@@@@@@@@@@@@@@@@
image-0002.pnm
image-0001.pnm
slitt@mydesk:~$


Note once again that the image-0002.pnm mentioned in myscan/xsane-multipage-list doesn't yet exist, but instead is the name of the file corresponding to the next scan.
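
So the real pages are everything from line 3 onward. Here's a rough sketch that prints them in list order and checks that each one is really on disk. It assumes the layout shown above and filenames without spaces, which is what Xsane produced for me. The names list_pages and check_pages are made up for this example.

```shell
# Print the real pages in list order: drop line 1 (status) and
# line 2 (the not-yet-scanned placeholder), and skip any blank lines.
list_pages() {
    sed -e '1,2d' -e '/^$/d' "$1"
}

# Complain about any listed page missing from project directory $1.
# Returns 0 if every page exists, 1 otherwise.
check_pages() {
    dir=$1
    missing=0
    # Word splitting is intended here; page names contain no spaces.
    for page in $(list_pages "$dir/xsane-multipage-list"); do
        if [ ! -f "$dir/$page" ]; then
            echo "missing: $dir/$page" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

You'd call it as check_pages myscan from the directory that contains myscan.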

Multipage project screen, page 2
  
Scan the next page. Now image-0002 shows in the window. Here's the directory and file structure after the two scans:
slitt@mydesk:~$ find myscan
myscan
myscan/xsane-multipage-list
myscan/image-0001.pnm
myscan/image-0002.pnm
slitt@mydesk:~$ cat myscan/xsane-multipage-list
Project changed@@@@@@@@@@@@@@@@
image-0003.pnm
image-0001.pnm
image-0002.pnm
slitt@mydesk:~$


Notice the status in xsane-multipage-list is still "Project changed". This will change.

Click the "Save multipage file" button and watch what happens. Nothing changes, except where it used to say "Project changed", it now says "File has been saved".

Now look at your disk:
slitt@mydesk:~$ ls -ltr | tail -n4
drwxr-x---  2 slitt slitt    4096 2010-06-25 14:39 junkjunk
-rwx------  1 slitt slitt      76 2010-06-25 14:44 tempsegf1277491494.bat
-rw-r-----  1 slitt slitt  119476 2010-06-25 19:10 myscan.pdf
drwxr-x---  2 slitt slitt    4096 2010-06-25 19:10 myscan
slitt@mydesk:~$ find myscan
myscan
myscan/xsane-multipage-list
myscan/image-0001.pnm
myscan/image-0002.pnm
slitt@mydesk:~$ cat myscan/xsane-multipage-list
File has been saved@@@@@@@@@@@@
image-0003.pnm
image-0001.pnm
image-0002.pnm
slitt@mydesk:~$


In the preceding, you see the last two files changed were the myscan.pdf file, which is the multipage PDF, and the myscan directory, which contains the .pnm files and the list of .pnm files, in the proper order, in the form of xsane-multipage-list.

The following is a 2-up screenshot of the scanned document:
The scanned myscan.pdf

If you were to look at the two .pnm files within the myscan directory, they'd be identical to these two pages.

Modifying the Pages in .pnm Form

So what if you want to put Arabic numerals on each page, corresponding to the page number? Theoretically this could be done with the Edit buttons in Xsane, but that can present some problems. Instead, we'll do it manually.

First, install the graphicsmagick package. According to documents I've read, the graphicsmagick version of the convert program works better than the imagemagick version of that same program.

Next, you don't want to work directly on the myscan directory. This is for two reasons:
  1. You don't want to mess it up
  2. You might want the original plus the modified version
So do this:
cp -rp myscan myscan_modified
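
If you want a little more safety around that copy, here's a sketch of a wrapper. It refuses to run if the destination already exists (in that case cp -rp would put the copy INSIDE the existing directory, as myscan_modified/myscan), and it uses diff -r to confirm the copy matches the original. The name copy_project is hypothetical.

```shell
# Copy project directory $1 to $2 and confirm the copy is identical.
copy_project() {
    src=$1
    dst=$2
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    # diff -r is silent and returns 0 only when both trees match.
    cp -rp "$src" "$dst" && diff -r "$src" "$dst" >/dev/null
}
```

Usage would be copy_project myscan myscan_modified.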

Now go into the myscan_modified directory, use Gimp to put a page number on each of the two .pnm files, and save each. DO NOT change the size of either.
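
One way to make sure you didn't change the size is to compare the width and height in each file's header against the original. The following is a sketch, not a full PNM parser: it assumes an ordinary PNM header (magic number, then width, then height, with optional # comment lines in between). The names pnm_size and same_size are made up for this page.

```shell
# Print "WIDTH HEIGHT" from the header of PNM file $1.
pnm_size() {
    LC_ALL=C awk '
        { sub(/#.*/, "") }            # strip header comments
        {
            for (i = 1; i <= NF; i++) {
                tok[++n] = $i         # tok[1]=magic, tok[2]=width, tok[3]=height
                if (n == 3) { print tok[2], tok[3]; exit }
            }
        }
    ' "$1"
}

# Succeed only if both files report the same width and height.
same_size() {
    [ "$(pnm_size "$1")" = "$(pnm_size "$2")" ]
}
```

From within myscan_modified you could run same_size ../myscan/image-0001.pnm image-0001.pnm || echo "size changed" for each page.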

Now, from within the myscan_modified directory, copy xsane-multipage-list to danger.sh. Edit danger.sh, deleting the status line (the line with the @ symbols) and the first .pnm, which is really the next one that would have been scanned. Now put all the filenames on one line, separated by spaces. At the beginning of the line insert the command convert followed by a space, and at the end of the line put a space followed by the PDF filename, ../myscan_modified.pdf. So danger.sh should be a one line file that looks like this:

convert image-0001.pnm image-0002.pnm ../myscan_modified.pdf
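
If you'd rather not hand-edit, the same one-line file can be generated. This is a minimal sketch of the edit just described, assuming the list layout shown earlier on this page and page names without spaces. The function name make_convert_line is hypothetical.

```shell
# Print a one-line convert command for the pages in an Xsane list file.
# $1 = path to xsane-multipage-list, $2 = name of the PDF to create.
make_convert_line() {
    # Drop the status line and the placeholder, skip blank lines,
    # then join the remaining names with single spaces.
    pages=$(sed -e '1,2d' -e '/^$/d' "$1" | paste -s -d ' ' -)
    printf 'convert %s %s\n' "$pages" "$2"
}
```

From within myscan_modified, make_convert_line xsane-multipage-list ../myscan_modified.pdf > danger.sh writes the file. Look it over before running it.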

Now run that file from within the myscan_modified directory like this:

. ./danger.sh

It will create a myscan_modified.pdf file in the parent directory. Use acroread to look at the result, and here's what you see:

Modified PDF

Notice to the left that you see the Arabic page numbers you inserted in Gimp.

The reason you modified xsane-multipage-list to create danger.sh rather than just making a script that grabs (globs) all the .pnm files is that it's possible that in Xsane you rearranged the order, and that new order is available only within xsane-multipage-list.
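
To find out whether the order was in fact rearranged, you can check whether the page names in the list are already in sorted order, which is the order a glob like image-*.pnm would give you. A sketch, assuming the list layout shown on this page; list_is_sorted is a name I made up.

```shell
# Succeed if the pages in list file $1 are in sorted (glob) order.
list_is_sorted() {
    # sort -c only checks the order; it exits nonzero at the first
    # out-of-order line and writes nothing to stdout.
    sed -e '1,2d' -e '/^$/d' "$1" | LC_ALL=C sort -c 2>/dev/null
}
```

For example: list_is_sorted xsane-multipage-list || echo "pages were rearranged, do not glob". Either way, building danger.sh from the list is the safe choice.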

Summary

Here's the process:
  1. Recursively copy the directory structure made in Xsane.
  2. cd to the new directory
  3. Use xsane-multipage-list to create a danger.sh to convert the list of .pnm files, in their order in xsane-multipage-list, to a .pdf. Remember not to include the highest number .pnm, which represents the scan that WOULD HAVE been done if you hadn't finished. danger.sh would look something like this:
       convert image-0001.pnm image-0002.pnm ../myscan_modified.pdf
  4. . ./danger.sh
  5. Check the resulting .pdf in the parent directory.
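
Steps 2 through 4 can also be rolled into one function, to be run after you've made the copy and edited the pages. Treat this as a sketch under the same assumptions as the rest of this page (the list layout shown earlier, no spaces in page names, and a convert command on your path). The function name rebuild_pdf and the CONVERT override variable are hypothetical, not anything Xsane or graphicsmagick provides.

```shell
# Rebuild a PDF from an already edited copy of an Xsane project.
# $1 = project copy, e.g. myscan_modified.
# The PDF lands in the parent directory, named after the copy.
# CONVERT (hypothetical) overrides the converter; default is convert.
rebuild_pdf() (
    # The body runs in a subshell, so the cd does not affect the caller.
    cd "$1" || exit 1
    pages=$(sed -e '1,2d' -e '/^$/d' xsane-multipage-list)
    if [ -z "$pages" ]; then
        echo "no pages listed" >&2
        exit 1
    fi
    # $pages is unquoted on purpose: one argument per page name.
    ${CONVERT:-convert} $pages "../$(basename "$PWD").pdf"
)
```

Running rebuild_pdf myscan_modified from the directory that contains it should leave myscan_modified.pdf right beside it. You still need to do step 5 by eye.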


