Scanning and Re-assembling a Booklet



rumeyj
28.08.2014, 03:15 AM
The following outlines what I did to scan a booklet into a PDF document, and it serves as a tutorial. The only question posed is whether you can see any improvements to my roundabout strategy. All feedback is welcome. No product is endorsed; the tools are mentioned only because they are the ones I used.

I recently thought it might be a good idea to scan my daughter's workbooks and set about it without thinking it through. I post this as it may be helpful to others.

I had access to a page scanner with a feeder that could continuously scan loose pages. The books had been stapled at the centre, so after removing the staples and unfolding the sheets I had a stack of A3-size sheets. I scanned the cover separately.

The first book took me a couple of days, and I nearly gave up until I thought hard about how to improve the process. Here is the process so far. Any improvements and suggestions are welcome.

Scan the pages in order. Taking care to scan the pages in order saves heaps of time later. For this discussion, I will take a workbook that has 80 numbered pages. There are four pages to a sheet (two on each side), so there are 20 sheets. The page numbers on the scans will not be consecutive: for example, page 1 and page 80 will be on the same scan, with page 1 on the right and page 80 (the last) on the left.
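The layout described above can be sketched in a few lines of Python. This is only an illustration, not part of the original workflow: the function name `pages_on_scan` is made up, and it assumes the page count is a multiple of 4 and that the sheets go through the scanner outermost first, front side before back side.

```python
def pages_on_scan(scan_index, total_pages):
    """Return (left_page, right_page) for the scan_index-th scanned side.

    Assumptions (not stated by the scanner or any tool): scan_index is
    1-based, total_pages is a multiple of 4, and the sheets are scanned
    outermost first, front before back.
    """
    low = scan_index                      # page counted from the front of the book
    high = total_pages - scan_index + 1   # page counted from the back of the book
    if scan_index % 2 == 1:
        # Front of a sheet: high page number on the left, low on the right.
        return (high, low)
    # Back of a sheet: low page number on the left, high on the right.
    return (low, high)


if __name__ == "__main__":
    for n in (1, 2, 3, 40):
        print(n, pages_on_scan(n, 80))
```

For the 80-page workbook, the first scan holds pages 80 and 1, the second holds 2 and 79, and the last (40th) scan holds the centre spread, 40 and 41.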

Split the pages in half. This is where IrfanView comes to the rescue. I'd always had friends speak highly of IrfanView, but I found the Batch feature only when I needed to cut so many pages in half.

The scanner numbered each side of each sheet sequentially (it scanned both sides). I had to cut each scan in half.

I ran two batches on the same files, once cropping the right (or Odd) side and once the left (or Even) side.

These were the settings I used:

The image size (you can get it by pressing "I" in IrfanView when viewing the image) was 3307 wide by 2338 tall.

[Attachment 3850]
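For readers who cannot see the screenshots, here is a rough Python sketch of how the two crop rectangles follow from the image size. It is not a record of the actual settings used: the function name `half_page_boxes` is hypothetical, and it assumes the fold sits exactly at the horizontal centre of the scan.

```python
def half_page_boxes(width, height):
    """Return crop rectangles (x, y, w, h) for the left and right halves.

    Assumption: the fold is exactly at the horizontal centre of the scan.
    When the width is odd, the extra pixel column goes to the right half.
    """
    half = width // 2
    left_box = (0, 0, half, height)
    right_box = (half, 0, width - half, height)
    return left_box, right_box


if __name__ == "__main__":
    left, right = half_page_boxes(3307, 2338)
    print("left  (even) half:", left)
    print("right (odd) half: ", right)
```

With the 3307 x 2338 scan size, the left half starts at x = 0 and the right half starts at x = 1653. If the fold is off-centre on your scans, the x offset needs adjusting by eye.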

To crop the right side I used these settings (in File --> Batch --> Advanced):

[Attachment 3851]

And these rename options:

[Attachment 3852]

(Remember, at this stage I was doing the cover separately. Later I found a neat trick to do it together and save another few minutes.)

I had copied all the scanned files to a directory. Then, in B(atch) mode, I set that directory as the output directory, added all the files and ran the batch, first with the settings above to crop the right-hand (odd) pages, then with the settings below for the even pages.

[Attachment 3853]

[Attachment 3854]

If you are confident enough, you can ask IrfanView to delete the original scanned files after cropping at this stage by turning on this option (in B --> Advanced) before running the last batch command:

[Attachment 3855]

(Don't forget to turn it off again for the first pass of the next book.)

At the end of this, I had a series of odd pages numbered oNN and even pages numbered eNN, which seemed hopelessly out of order. However, if you scanned them in order, you'll notice that there is a pattern to the madness.

For example, in a book having only 20 pages:

[Attachment 3856]


You'll notice that e01 is the last page, o01 the first, e02 the second, o02 the last but one, etc.

So it's just a matter of renaming them to their page numbers. I do this at the DOS prompt (command line), and I generate the commands with an Excel macro that takes the number of pages and writes out the rename commands.

[Attachment 3857]


The macro (that the command button runs) looks like this. I have not yet figured out how to upload the Excel file :( so the code is below.




Sub Macro1()
    ' by rumeyj
    ' Writes one "rename" command per cropped half-page into column A, from A5 down.

    Dim i As Integer, lastPg As Integer
    Dim extN As String, othN As String

    lastPg = Cells(2, 2).Value ' cell B2 is where the total number of pages is input

    ' Round up to a multiple of 4, because every sheet carries four pages
    If lastPg Mod 4 > 0 Then lastPg = lastPg + 4 - (lastPg Mod 4)

    Range("A5:A100").Clear
    Range("A4").Select

    ' i counts the scanned sides; each side yields one "e" and one "o" file
    For i = 1 To lastPg \ 2
        extN = Format(i, "00")              ' page counted from the front
        othN = Format(lastPg - i + 1, "00") ' page counted from the back

        If i Mod 2 = 1 Then ' i is odd: left half is the high page number
            PonNextLine "rename e" & extN & ".jpg " & othN & ".jpg"
            PonNextLine "rename o" & extN & ".jpg " & extN & ".jpg"
        Else ' i is even: left half is the low page number
            PonNextLine "rename e" & extN & ".jpg " & extN & ".jpg"
            PonNextLine "rename o" & extN & ".jpg " & othN & ".jpg"
        End If
    Next i
End Sub

Sub PonNextLine(ln As String)
    ' print on next line: move one cell down and write the text there
    ActiveCell.Offset(1, 0).Select
    ActiveCell.Value = ln
End Sub

Running this macro with, say, 20 in cell B2 (that is to say, 20 pages) gives:

rename e01.jpg 20.jpg
rename o01.jpg 01.jpg
rename e02.jpg 02.jpg
rename o02.jpg 19.jpg
rename e03.jpg 18.jpg
rename o03.jpg 03.jpg
etc.
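For anyone without Excel, the same list of commands can be produced with a short Python script. This is a sketch that mirrors the macro's logic as I read it; the function name `rename_commands` and the `ext` parameter are my own additions.

```python
def rename_commands(total_pages, ext="jpg"):
    """Build the 'rename' command lines for the eNN/oNN half-page files.

    Mirrors the Excel macro: the page count is rounded up to a multiple
    of 4, and each scanned side i yields one 'e' and one 'o' file.
    """
    if total_pages % 4:
        total_pages += 4 - total_pages % 4

    commands = []
    for i in range(1, total_pages // 2 + 1):
        this_n = f"{i:02d}"                     # page counted from the front
        other_n = f"{total_pages - i + 1:02d}"  # page counted from the back
        if i % 2 == 1:
            # Odd scan: the left half (e) is the high page number.
            commands.append(f"rename e{this_n}.{ext} {other_n}.{ext}")
            commands.append(f"rename o{this_n}.{ext} {this_n}.{ext}")
        else:
            # Even scan: the left half (e) is the low page number.
            commands.append(f"rename e{this_n}.{ext} {this_n}.{ext}")
            commands.append(f"rename o{this_n}.{ext} {other_n}.{ext}")
    return commands


if __name__ == "__main__":
    print("\n".join(rename_commands(20)))
```

Running it for 20 pages prints the same lines as the macro, ready to paste into the command window.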

Now I go to a command window (press Win+R, type 'cmd' and press Enter).

Go into the directory where the files are saved. E.g. if they are in d:\user\home\book1\, type

d:
cd d:\user\home\book1\

Copy the generated commands from Excel (highlight and Ctrl+C) and paste them (right-click and Paste) into the command window.

This should rename all the files to their correct page numbers.
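As an alternative to pasting commands, the renaming could be done directly from Python. The sketch below is an assumption-laden illustration, not what I actually ran: the helper names `page_number` and `rename_halves` are hypothetical, the files are assumed to be named eNN.jpg and oNN.jpg as above, and the page count is assumed to be a multiple of 4.

```python
from pathlib import Path


def page_number(side, index, total_pages):
    """Page number for half-page file <side><index>; side is 'e' (left) or 'o' (right)."""
    from_back = total_pages - index + 1
    # Right half of an odd scan and left half of an even scan count from the front.
    if (side == "o") == (index % 2 == 1):
        return index
    return from_back


def rename_halves(folder, total_pages, dry_run=True):
    """Rename eNN.jpg / oNN.jpg in folder to their page numbers.

    Returns the (old_name, new_name) plan. With dry_run=True nothing is
    renamed, so the plan can be checked first.
    """
    folder = Path(folder)
    plan = []
    for index in range(1, total_pages // 2 + 1):
        for side in ("e", "o"):
            src = folder / f"{side}{index:02d}.jpg"
            dst = folder / f"{page_number(side, index, total_pages):02d}.jpg"
            plan.append((src, dst))

    # Refuse to start if any expected file is missing, to avoid a half-renamed set.
    missing = [src.name for src, _ in plan if not src.exists()]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(missing))

    if not dry_run:
        for src, dst in plan:
            src.rename(dst)
    return [(src.name, dst.name) for src, dst in plan]
```

Calling `rename_halves(r"d:\user\home\book1", 80)` would only return the plan; pass `dry_run=False` once the plan looks right.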

After the pages are renamed, it's time to do minor editing on each page, if necessary. You can use the IrfanView paint toolbar (F12 while viewing an image) and its eraser tool to remove minor blemishes. I used GIMP, which is overkill for what I was doing.

Once the images are spic and span, you can use IrfanView to combine them into a single PDF file with Options --> Multipage Images --> Create Multipage PDF (you need the plugin). I found the generated PDF rather large; simply dumping the images into MS Word and saving as PDF produced significantly smaller files. I am sure there is some dialog in IrfanView where I can set the output quality, but I haven't stumbled upon it yet.

Hope this helps… all comments and feedback welcome.

Bhikkhu Pesala
28.08.2014, 04:35 AM
It's something I have done a few times with poor quality books printed in Burma that I needed to edit. I didn't bother with batch processing as each page needed rotating, and perhaps some editing with IrfanPaint too.

1. I broke the book binding and scanned facing pages in order using batch scanning.
2. After each scan, I deskewed it with the IrfanPaint plugin, then applied a custom selection set up for the left page and cropped it.
3. I saved the left page with the correct page number, page 002 etc.
4. I clicked undo and deskewed to suit the right page.
5. I applied the custom selection and moved it to the right.
6. I cropped the right page and saved it as 003, etc.
7. I scanned the next pair of pages.
8. I combined all of the pages in a multipage TIF for easy viewing in IrfanView.

If you want to use batch processes, I suggest running two processes on your saved scans, cropping the left and right pages separately, then incrementing the page numbers by 2.
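The numbering for this simpler case, where each scan is a pair of facing pages, can be sketched as follows. The function name `facing_pages` and its `first_left_page` parameter are hypothetical; the default of 2 matches the 002/003 example above.

```python
def facing_pages(scan_index, first_left_page=2):
    """Return (left_page, right_page) for a scan of two facing pages.

    Assumptions: scan_index is 1-based, the scans are in reading order,
    and the first scan shows first_left_page on the left.
    """
    left = first_left_page + 2 * (scan_index - 1)
    return (left, left + 1)


if __name__ == "__main__":
    for n in range(1, 4):
        left, right = facing_pages(n)
        print(f"scan {n}: left {left:03d}, right {right:03d}")
```

Each batch pass then only needs a starting number and an increment of 2: the left-page pass starts at 2 and the right-page pass starts at 3.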

rumeyj
28.08.2014, 04:53 AM
"If you want to use batch processes, I suggest running two processes on your saved scans, cropping the left and right pages separately, then incrementing the page numbers by 2."

Thanks for your prompt feedback. I did do that. However, because there are 4 pages on a sheet and they are not consecutive, I got into the whole renaming macro thing. But I see where you are going: you can rename the odd pages, then the even ones separately, by reversing the sort order and incrementing by two. This works if you originally scanned sheets having 2 pages, not four.

Thanks again.

Mij
01.09.2014, 08:46 PM
In this case you would need to do the renaming in 4 passes of the batch process.
One trick I have used in the past for selecting sets of images, such as every 4th one, is to use IrfanView thumbnails. By adjusting the width of the window you can make every 4th image line up in columns, as in the image below. It is then fairly quick to hold down the Ctrl key while you click on each of the thumbnails in the appropriate columns.
Then press B and the Batch dialog opens with the selected set ready for your first pass.
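One possible reading of the four passes, sketched in Python. This is an assumption rather than a description of any IrfanView feature: it assumes the oNN/eNN file names from the first post, and the function name `four_passes` and the pass labels are made up. The point is that within each pass the page numbers go up or down in steps of 2, so a simple counter (with the sort order reversed for the descending passes) would be enough.

```python
def four_passes(total_pages):
    """Group the eNN/oNN files into four passes of (file_stem, page_number).

    Assumption: total_pages is a multiple of 4. Within each pass the page
    numbers change by a constant step of +2 or -2.
    """
    sides = total_pages // 2
    odd_scans = range(1, sides + 1, 2)
    even_scans = range(2, sides + 1, 2)
    return {
        # Right halves of odd scans: pages 1, 3, 5, ...
        "o, odd scans": [(f"o{i:02d}", i) for i in odd_scans],
        # Left halves of even scans: pages 2, 4, 6, ...
        "e, even scans": [(f"e{i:02d}", i) for i in even_scans],
        # Left halves of odd scans: last page, then down by 2.
        "e, odd scans": [(f"e{i:02d}", total_pages - i + 1) for i in odd_scans],
        # Right halves of even scans: last page but one, then down by 2.
        "o, even scans": [(f"o{i:02d}", total_pages - i + 1) for i in even_scans],
    }


if __name__ == "__main__":
    for label, items in four_passes(20).items():
        print(label, items)
```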

[Attachment 3869]