Pages extracted from Multipage PDF are larger than original file

    I am extracting pages from a multipage PDF.

    The original PDF is 974kb with 27 pages

    When I run this command:
    c:\progra~2\irfanv~1\i_view32.exe /filelist=c:\temp\filelist.txt /extract=(c:\temp,pdf) /killmesoftly

    ... then... each extracted page is somewhere between 2,526kb and 2,590kb.

    After extracting all pages, that adds up to 67.2 MB.

    That is a massive increase in file size.
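
    A quick sanity check of those numbers, as a rough sketch. It uses only the figures quoted above and assumes 1 MB = 1024 KB, which may not be how the file manager rounds:

```python
# Figures quoted in the post above.
original_kb = 974                        # size of the 27-page source PDF
pages = 27
page_kb_min, page_kb_max = 2526, 2590    # per-page size range after extraction
total_mb_reported = 67.2

# Expected total if every page sat at the low or high end of the range.
# Assumption: 1 MB = 1024 KB.
low_mb = pages * page_kb_min / 1024
high_mb = pages * page_kb_max / 1024

# Growth factor of the extracted set relative to the original file.
growth = total_mb_reported * 1024 / original_kb

print(f"expected total: {low_mb:.1f} to {high_mb:.1f} MB")
print(f"growth factor: about {growth:.0f}x")
```

    The reported total sits inside the expected range, so the growth comes from every page being large, not from a few outliers.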

    Any ideas on how to manage the size of the extracted pages?

    #2
    Originally posted by ccbamatx
    ... then... each extracted page is somewhere between 2,526kb and 2,590kb.
    IV (IrfanView) converts the text of each page to an image.

    Any ideas on how to manage the size of the extracted pages?
    Don't touch that PDF :-D

    Convert to PNG and optimize the images (pngout, optipng).

    Or look for another tool. When I wanted to extract some pages from a large PDF, I used PDFtk:
    Code:
    pdftk.exe big.pdf cat 4-12 output small.pdf
    To split into single pages, with text staying as text:
    Code:
    pdftk in.pdf burst
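
    If you want to script the two PDFtk calls above, a minimal sketch could look like this. The helper names are made up for illustration, the file names are the placeholders from the commands above, and pdftk is assumed to be on the PATH:

```python
import shutil
import subprocess
from pathlib import Path

def pdftk_cat(src, first, last, dst):
    """Argument list for copying pages first..last of src into dst."""
    return ["pdftk", src, "cat", f"{first}-{last}", "output", dst]

def pdftk_burst(src):
    """Argument list for splitting src into one PDF per page."""
    return ["pdftk", src, "burst"]

if __name__ == "__main__":
    cmd = pdftk_cat("big.pdf", 4, 12, "small.pdf")
    print(" ".join(cmd))
    # Only run when pdftk is installed and the placeholder input exists.
    if shutil.which("pdftk") and Path("big.pdf").exists():
        subprocess.run(cmd, check=True)
```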
    Last edited by RottenImp; 08.05.2016, 09:49 PM.
    IV 4.56 32-bit

      #3
      One other thing to add to the above post is that IV converts your PDF to a full color (24 bits per pixel) image even if the original was grayscale or just black and white.
      If you save that as a PDF it will indeed be a very big file.
      If it was grayscale or B&W you should convert it back to that before you save as PDF. That gives a big size reduction, although the result still may not be as small as your original PDF, because it is still an image, not text.
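
      To see why the bit depth matters so much, here is a rough calculation of the raw (uncompressed) size of one rasterized page. The page size and resolution are assumptions picked for illustration; the thread does not say what resolution IV renders at, and the saved PDF compresses the image, so real files will be smaller. The ratios are the point:

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size of a rasterized page, in bytes.

    Ignores row padding, so the 1 bpp figure is a slight underestimate.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

# Hypothetical page: US Letter (8.5 x 11 in) rendered at 150 DPI.
for label, bpp in (("24-bit color", 24), ("8-bit grayscale", 8), ("1-bit B&W", 1)):
    size_kb = raw_page_bytes(8.5, 11, 150, bpp) / 1024
    print(f"{label:>16}: {size_kb:,.0f} KB uncompressed")
```

      Going from 24 bpp to 8 bpp cuts the raw data to a third, and going to 1 bpp cuts it to roughly a twenty-fourth, before any compression is applied.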
      I find that conversion to B&W is usually better if you do it in 2 stages: first to grayscale, then to B&W. In your example, ccbamatx, that means inserting /gray /bpp=1 before the /extract option.
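
      As a sketch of what the modified call would look like, the snippet below assembles the argument list with /gray and /bpp=1 placed ahead of /extract. The paths are the ones from the first post, the function name is invented for illustration, and I have not verified how IV orders the operations internally, so treat the ordering as an assumption:

```python
def build_extract_command(exe, filelist, out_dir, out_ext="pdf", to_bw=True):
    """Return an IrfanView argument list for a batch page extraction."""
    args = [exe, f"/filelist={filelist}"]
    if to_bw:
        # Two-stage reduction described above: grayscale first, then 1 bpp.
        args += ["/gray", "/bpp=1"]
    args += [f"/extract=({out_dir},{out_ext})", "/killmesoftly"]
    return args

cmd = build_extract_command(
    r"c:\progra~2\irfanv~1\i_view32.exe",
    r"c:\temp\filelist.txt",
    r"c:\temp",
)
print(" ".join(cmd))
# To actually run it on Windows: import subprocess; subprocess.run(cmd, check=True)
```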

      One little wrinkle: the new PDF.DLL plugin does not yet handle saving of 1 bpp images properly. You need to open the PDF options in the Plugins tab of Properties/Settings and untick the box "Save to PDF using PDF.DLL". Make sure that you still have the old ImPDF.DLL plugin in your plugins folder so that IV can use that instead. Leave "Use PDF.DLL to Open PDF files" checked; that works fine.

      ImPDF has a lot of Save options that you may want to change before you let it run. You get there from the Batch process dialog by setting "Output format" to PDF and clicking on "Options" alongside. One you will definitely want to have ticked is "not needed" under "Preview of PDF during Save operation" in the "General" tab. Otherwise the operation will stop at every file to show you what it is about to save.

        #4
        I would recommend using PDFsam instead; that program is made for merging and extracting PDF files.
        If it hurts not to drink, don't waste the bottle then.
