The PyPDF2 package allows you to do a lot of useful operations on existing PDFs. In this article, we will learn how to split a single PDF into multiple smaller ones. We will also learn how to take a series of PDFs and join them back together into a single PDF.

PyPDF2 doesn’t come as a part of the Python Standard Library, so you will need to install it yourself. The preferred way to do so is to use pip. Now that we have PyPDF2 installed, let’s learn how to split and merge PDFs!

The PyPDF2 package gives you the ability to split up a single PDF into multiple ones. You just need to tell it how many pages you want. For this example, we will download a W9 form from the IRS and loop over all six of its pages. We will split off each page and turn it into its own standalone PDF.

    import os
    from PyPDF2 import PdfFileReader, PdfFileWriter

    fname = os.path.splitext(os.path.basename(path))[0]

For this example, we need to import both the PdfFileReader and the PdfFileWriter. Then we create a fun little function called pdf_splitter. The first line of this function grabs the name of the input file, minus the extension. Next we open the PDF and create a reader object. Then we loop over all the pages using the reader object’s getNumPages method.

Inside the for loop, we create an instance of PdfFileWriter. We then add a page to our writer object using its addPage method. This method accepts a page object, so to get the page object, we call the reader object’s getPage method. Now we have added one page to our writer object. The next step is to create a unique file name, output_filename, which we do by using the original file name plus the word “page” plus the page number + 1. We add the one because PyPDF2’s page numbers are zero-based, so page 0 is actually page 1. Finally, we open the new file name in write-binary mode and use the PDF writer object’s write method to write the object’s contents to disk.

Now that we have a bunch of PDFs, let’s learn how we might take them and merge them back together. One practical use case is a business merging its dailies into a single PDF. I have needed to merge PDFs for work and for fun. One project that sticks out in my mind is scanning documents in. Depending on the scanner you have, you might end up scanning a document into multiple PDFs, so being able to join them together again can be wonderful.

When the original pyPdf came out, the only way to get it to merge multiple PDFs together was like this:

    from PyPDF2 import PdfFileWriter, PdfFileReader

    for page in range(pdf_reader.getNumPages()):
        pdf_writer.addPage(pdf_reader.getPage(page))

Here we create a PdfFileWriter object and several PdfFileReader objects. For each PDF path, we create a PdfFileReader object and then loop over its pages, adding each and every page to our writer object. Then we write out the writer object’s contents to disk.

PyPDF2 made this a bit simpler by providing a PdfFileMerger class. With it, we just need to create the PdfFileMerger object and then loop through the PDF paths, appending them to our merging object. PyPDF2 will automatically append the entire document, so you don’t need to loop through all the pages of each document yourself.

The PdfFileMerger class also has a merge method that you can use. Its code definition looks like this:

    def merge(self, position, fileobj, bookmark=None, pages=None, import_bookmarks=True):
        """
        Merges the pages from the given file into the output file at the
        specified page number.

        :param int position: The *page number* to insert this file.

        :param fileobj: A File Object or an object that supports the standard read
            and seek methods similar to a File Object. Could also be a
            string representing a path to a PDF file.

        :param str bookmark: Optionally, you may specify a bookmark to be applied at
            the beginning of the included file by supplying the text of the bookmark.

        :param pages: can be a :ref:`Page Range` or a ``(start, stop)`` tuple
            to merge only the specified range of pages from the source
            document into the output document.

        :param bool import_bookmarks: You may prevent the source document's bookmarks
            from being imported by specifying this as ``False``.
        """

Basically, the merge method allows you to tell PyPDF2 where to merge a document by page number. So if you have created a merging object with 3 pages in it, you can tell the merging object to merge the next document in at a specific position. This allows the developer to do some pretty complex merging operations.
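The splitting and merging steps described in this article can be sketched roughly as follows. This is a minimal sketch, not the article's original listing: it assumes a PyPDF2 release older than 3.0, where the PdfFileReader, PdfFileWriter and PdfFileMerger names used in the text are still usable. The helper names page_filename, merge_pdfs and insert_pdf are hypothetical, and the exact '{}_page_{}.pdf' naming pattern is an assumption based on the description "original file name plus the word page plus the page number + 1".

```python
import os


def page_filename(path, page_index):
    """Build the output name for one page; page_index is zero-based."""
    fname = os.path.splitext(os.path.basename(path))[0]
    # Assumed naming pattern: <original name>_page_<page number + 1>.pdf
    return '{}_page_{}.pdf'.format(fname, page_index + 1)


def pdf_splitter(path):
    """Write every page of the PDF at `path` to its own single-page PDF."""
    # Imported here so page_filename() can be used without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    pdf = PdfFileReader(path)
    for page in range(pdf.getNumPages()):
        # A fresh writer per page, so each output file holds exactly one page.
        pdf_writer = PdfFileWriter()
        pdf_writer.addPage(pdf.getPage(page))
        output_filename = page_filename(path, page)
        with open(output_filename, 'wb') as out:
            pdf_writer.write(out)


def merge_pdfs(input_paths, output_path):
    """Append each PDF in `input_paths`, in order, into one output PDF."""
    from PyPDF2 import PdfFileMerger

    pdf_merger = PdfFileMerger()
    for path in input_paths:
        # append() adds the whole document; no per-page loop is needed.
        pdf_merger.append(path)
    with open(output_path, 'wb') as out:
        pdf_merger.write(out)
    pdf_merger.close()


def insert_pdf(base_path, extra_path, position, output_path):
    """Merge `extra_path` into `base_path` at the given page position."""
    from PyPDF2 import PdfFileMerger

    pdf_merger = PdfFileMerger()
    pdf_merger.append(base_path)
    # merge() takes the page number at which to insert the new document.
    pdf_merger.merge(position, extra_path)
    with open(output_path, 'wb') as out:
        pdf_merger.write(out)
    pdf_merger.close()
```

With a local copy of the form saved under a name such as w9.pdf (a hypothetical file name), calling pdf_splitter('w9.pdf') would write one file per page, and merge_pdfs could then take that list of per-page files and join them back into a single document.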