This is an attempt at making old scanned journal papers viewable on
e-readers, particularly my Sony PRS300.  The devices themselves tend
not to deal well with image PDFs, and Calibre doesn't seem to do any
better.  This program splits each page into 4 images, which tends to
be enough to make them quite legible on a 5" screen.  As you may
imagine, this works well for two-column journal layouts split exactly
down the middle, and not at all for most other layouts.
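
To give a rough idea of the split, here is a minimal sketch that
assumes a plain 2x2 grid: each column is cut at the horizontal middle,
and the pieces are emitted in reading order.  The function names are
made up for illustration and the script itself may cut the page
differently.

```python
def quadrant_boxes(width, height):
    """Return four (left, upper, right, lower) boxes in reading order.

    Assumes a two-column page split exactly down the middle, with each
    column cut once more at half height (an illustrative assumption).
    """
    mid_x = width // 2
    mid_y = height // 2
    return [
        (0, 0, mid_x, mid_y),            # left column, top half
        (0, mid_y, mid_x, height),       # left column, bottom half
        (mid_x, 0, width, mid_y),        # right column, top half
        (mid_x, mid_y, width, height),   # right column, bottom half
    ]


def split_page(path):
    """Crop one page image into four PIL images (hypothetical helper)."""
    from PIL import Image  # imported here so quadrant_boxes needs no PIL

    page = Image.open(path)
    width, height = page.size
    return [page.crop(box) for box in quadrant_boxes(width, height)]
```

The boxes use PIL's crop convention, so they can be passed straight to
Image.crop.  Odd page sizes leave the extra pixel in the right-hand or
bottom piece.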

It uses Calibre-generated output as templates, and pdftohtml (part of
poppler, the PDF library behind evince) to do the actual conversion of
PDF to images.  It uses ImageMagick's convert to resize the images and
PIL to read the image sizes.  These last two tasks could be done by
either tool alone, but I just picked what was easiest to code.  All of
these packages should come preinstalled with any standard Linux
distribution (KDE installations might need to install evince).
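
The resize step can be sketched roughly as below.  This assumes the
images are scaled with convert's -resize option and the default
550x747 target; the helper names are hypothetical and the exact
arguments the script passes may differ.

```python
import subprocess


def resize_command(src, dst, width=550, height=747):
    """Build the ImageMagick argv for scaling src into dst.

    -resize WIDTHxHEIGHT keeps the aspect ratio and fits the image
    inside the given box.
    """
    return ["convert", src, "-resize", "{}x{}".format(width, height), dst]


def resize(src, dst, width=550, height=747):
    """Run convert; raises CalledProcessError if it fails."""
    subprocess.check_call(resize_command(src, dst, width, height))
```

Building the argument list separately makes it easy to print the
command at high verbosity before running it.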

Usage:
./converttoimageepub.py --help
Usage: converttoimageepub.py [-v] [-h] [options] <pdf file>

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbosity.  Invoke many times for higher verbosity
  -o OUTPUTFILENAME, --output=OUTPUTFILENAME
                        Output filename.  If not specified, it is like the
                        input with an epub extension
  --keep-work           Leave temporary directory behind
  --width=WIDTH         Width of resultant images (default: 550)
  --height=HEIGHT       Height of resultant images (default: 747)
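
For reference, the options above could be declared along these lines.
This is only a sketch using argparse; the script's own parser may be
built differently, and build_parser is a name invented here.

```python
import argparse


def build_parser():
    """Declare the options listed in the help text (illustrative only)."""
    parser = argparse.ArgumentParser(prog="converttoimageepub.py")
    # action="count" is what lets -v be repeated, e.g. -vvv gives 3.
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="Verbosity.  Invoke many times for higher verbosity")
    parser.add_argument("-o", "--output", dest="outputfilename", default=None,
                        help="Output filename")
    parser.add_argument("--keep-work", action="store_true",
                        help="Leave temporary directory behind")
    parser.add_argument("--width", type=int, default=550,
                        help="Width of resultant images (default: 550)")
    parser.add_argument("--height", type=int, default=747,
                        help="Height of resultant images (default: 747)")
    parser.add_argument("pdf", help="Input PDF file")
    return parser
```

For example, build_parser().parse_args(["-vvv", "paper.pdf"]) yields a
verbosity of 3 with the default width and height.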


In most cases, just calling it with a PDF as an argument will generate
a file with the same name but an .epub extension in your current
directory.  The default width and height match PRS300 output, but
you'll have to find what works for your device (hint: get the height
right, and make the width somewhat wider than you think is right).
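
The default output name described above amounts to something like
this sketch (default_output_name is a hypothetical helper, not
necessarily what the script calls it):

```python
import os


def default_output_name(pdf_path):
    """Map an input PDF path to an .epub name in the current directory."""
    # Drop the directory part so the result lands in the current directory,
    # then swap the extension for .epub.
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return stem + ".epub"
```

Passing -o overrides this name.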

pdftohtml takes longer than you think to finish, so run the program
with -vvv to watch it go.


License is GPLv3 since it uses Calibre output and Calibre is thusly
licensed.  The code in the script itself is GPLv3/BSD/WTFPL, take your pick.


