Hello all,
This article introduces a Python library that converts PDF files to images.
Install packages
The pdf2image library converts a PDF file into PIL images.
pip install pdf2image
Please follow this link to install Poppler for your OS. Poppler is the library that renders PDF files, and pdf2image relies on it.
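Before going further, it can help to confirm that Poppler is actually reachable from Python. The snippet below is only a rough sketch: it assumes Poppler's pdftoppm command-line tool is what needs to be on your PATH, and the function name poppler_available is made up for this example.

```python
import shutil


def poppler_available():
    """Return True if Poppler's pdftoppm tool can be found on the PATH."""
    # Assumption: finding pdftoppm is a good enough sign that Poppler is installed.
    return shutil.which("pdftoppm") is not None


if __name__ == "__main__":
    if poppler_available():
        print("Poppler found, you are good to go.")
    else:
        print("Poppler not found, please install it first.")
```

If this prints that Poppler was not found even after installing it, the install location is probably not on your PATH.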
Get a pdf file to test
I have taken a free sample PDF that is available online.
Let's see the steps now
You will be using convert_from_path() from the pdf2image library.
from pdf2image import convert_from_path
Now just pass the path of your PDF file to convert_from_path() and store the returned images in a list, one image per page.
pdf_images = convert_from_path('/Users/..../sample.pdf')
You can save the images with whatever filenames you want using save().
# Save every page as 1.jpg, 2.jpg, ... in the current directory
for i in range(len(pdf_images)):
    image_name = str(i+1) + '.jpg'
    pdf_images[i].save(image_name, 'JPEG')
JPEG is the format in which you want the file to be saved. Below is a sample output where you can see the different image files saved from the PDF file I used.
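To put the steps together, here is a small sketch that wraps the conversion and the saving in one function. The names page_filename and save_pdf_pages, and the output_dir parameter, are my own choices for illustration and not part of pdf2image. The import sits inside the function so that the filename helper can be used even on a machine without pdf2image or Poppler.

```python
from pathlib import Path


def page_filename(page_number, output_dir=".", extension="jpg"):
    """Build the output path for a 1-based page number, e.g. pages/1.jpg."""
    return Path(output_dir) / f"{page_number}.{extension}"


def save_pdf_pages(pdf_path, output_dir="."):
    """Convert each page of the PDF to a JPEG file and return the saved paths."""
    # Imported here so page_filename() works without pdf2image installed.
    from pdf2image import convert_from_path

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    pages = convert_from_path(pdf_path)  # list of PIL images, one per page

    saved = []
    # enumerate(..., start=1) gives page numbers 1, 2, 3, ... directly
    for page_number, page in enumerate(pages, start=1):
        target = page_filename(page_number, output_dir)
        page.save(target, "JPEG")
        saved.append(target)
    return saved
```

Calling save_pdf_pages() with the path to your PDF and a folder name should leave one numbered .jpg per page in that folder.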
You can also explore other options, such as changing the DPI and the size of the images, and many more.
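As a starting point for those options, here is a hedged sketch that uses the dpi and size parameters of convert_from_path(). The helper names render_options and convert_with_options are hypothetical, and the values are just examples. As far as I understand, passing size as (width, None) scales each page to that width and keeps the aspect ratio.

```python
def render_options(dpi=200, width=None):
    """Collect keyword arguments for convert_from_path()."""
    options = {"dpi": dpi}
    if width is not None:
        # (width, None) asks for a fixed width with the height scaled to match
        options["size"] = (width, None)
    return options


def convert_with_options(pdf_path, dpi=200, width=None):
    """Convert a PDF to PIL images using a custom DPI and optional width."""
    # Imported here so render_options() works without pdf2image installed.
    from pdf2image import convert_from_path

    return convert_from_path(pdf_path, **render_options(dpi, width))
```

A higher DPI gives sharper images but bigger files, so it is worth trying a few values on your own PDF.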
Try this out, guys. It's going to be fun. I would appreciate your feedback. See you all in the next article.