
How to write PDF split tool by Python

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article shows how to write a PDF split tool in Python. It walks through a simple, easy-to-use approach: first the splitting logic itself, then a small GUI wrapped around it.

Requirements

You need to extract a few pages from a PDF and save them as a new PDF. So that it can be reused later, the tool should be a foolproof application with a GUI.

The user selects the source PDF file, specifies the name and location of the new PDF file and the pages to extract, and gets the new PDF file.

Requirements analysis

Python has many GUI options, so let's start with a quick side-by-side comparison.

At a high level, the big GUI tools are:

WxWindows

Tkinter

Custom libraries (Kivy, Toga, etc.)

Web-based solutions (HTML, Flask, etc.)

But the tool we choose today is appJar. It was created by an educator, so it offers a simpler GUI creation process, and it is built entirely on Tkinter, which ships with Python by default.

Code implementation

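First of all, to handle the PDF operations, I chose the PyPDF2 library. Note that the PdfFileWriter and PdfFileReader classes used below belong to the older PyPDF2 API; PyPDF2 3.0 and its successor pypdf replace them with PdfWriter and PdfReader.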

Let's first hard-code an example of input and output

from PyPDF2 import PdfFileWriter, PdfFileReader

infile = "Input.pdf"
outfile = "Output.pdf"
page_range = "1-2,6"

Next we instantiate the PdfFileWriter and PdfFileReader objects and create the actual Output.pdf file

output = PdfFileWriter()
input_pdf = PdfFileReader(open(infile, "rb"))
output_file = open(outfile, "wb")

The slightly more complex part comes next: parsing the page range string and expanding it into a list of the page numbers to extract

page_ranges = (x.split("-") for x in page_range.split(","))
range_list = [i for r in page_ranges for i in range(int(r[0]), int(r[-1]) + 1)]
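To see what this parsing step produces, here is a minimal, self-contained sketch. The function name parse_page_ranges is a hypothetical wrapper introduced only for illustration; the two lines inside it are the same logic as above.

```python
def parse_page_ranges(page_range):
    """Expand a string such as "1-2,6" into a list of 1-based page numbers."""
    # Split on commas, then split each piece on "-" to get [start] or [start, end]
    parts = (x.split("-") for x in page_range.split(","))
    # r[0] is the start and r[-1] the end; for a single page they are the same item
    return [i for r in parts for i in range(int(r[0]), int(r[-1]) + 1)]


print(parse_page_ranges("1-2,6"))     # [1, 2, 6]
print(parse_page_ranges("1,3,4-10"))  # [1, 3, 4, 5, 6, 7, 8, 9, 10]
```

Using r[-1] rather than r[1] is what lets a single page such as "6" be handled by the same expression as a range such as "1-2".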

Finally, copy the contents from the original file to the new file.

for p in range_list:
    # Subtract 1 because pages are 0-indexed
    output.addPage(input_pdf.getPage(p - 1))
output.write(output_file)
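The copy loop assumes every requested page exists. The sketch below shows one way to guard it. The helper copy_pages is hypothetical, and it is written against any Python sequence plus an "add" callback rather than against PyPDF2 itself, so a plain list stands in for the PDF pages here.

```python
def copy_pages(pages, add_page, page_numbers):
    """Copy 1-based page_numbers from `pages` using `add_page`.

    Stops at the first page number outside 1..len(pages) and returns
    how many pages were copied.
    """
    copied = 0
    for p in page_numbers:
        # Reject 0 and negatives explicitly: pages[p - 1] would otherwise
        # wrap around to the end of the sequence instead of failing.
        if not 1 <= p <= len(pages):
            break
        add_page(pages[p - 1])  # convert to a 0-based index
        copied += 1
    return copied


# Stand-in data: a list of strings plays the role of the PDF's pages
source = ["page1", "page2", "page3"]
result = []
count = copy_pages(source, result.append, [1, 2, 6])
print(count, result)  # 2 ['page1', 'page2']
```

With a real PDF, the assumption would be that the reader's pages and the writer's add-page method are passed in place of the list and result.append.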

Let's build the GUI interface

For this PDF splitting tool, the GUI needs the following features:

You can select a pdf file through a standard file browser

You can choose the location and file name of the output file

You can customize which pages to extract

It performs some basic error checking

After installing appJar with pip, we can start coding

from appJar import gui
from PyPDF2 import PdfFileWriter, PdfFileReader
from pathlib import Path

Create a GUI window

app = gui("PDF Splitter", useTtk=True)
app.setTtkTheme("default")
app.setSize(500, 200)

I use the default theme here; other themes can be selected in the same way.

Here are the labels and data entry widgets

app.addLabel("Choose Source PDF File")
app.addFileEntry("Input_File")
app.addLabel("Select Output Directory")
app.addDirectoryEntry("Output_Directory")
app.addLabel("Output file name")
app.addEntry("Output_name")
app.addLabel("Page Ranges: 1,3,4-10")
app.addEntry("Page_Ranges")

Next, add the "Process" and "Quit" buttons. Pressing either button calls the press function

app.addButtons(["Process", "Quit"], press)

Finally, run this app.

# start the GUI
app.go()

That completes the GUI, so let's write the internal processing logic. The program reads the user's input, checks that the source is a PDF, and splits it. In the final script, these functions must be defined before the app.addButtons call, and app.go() stays at the very end.

def press(button):
    if button == "Process":
        src_file = app.getEntry("Input_File")
        dest_dir = app.getEntry("Output_Directory")
        page_range = app.getEntry("Page_Ranges")
        out_file = app.getEntry("Output_name")
        errors, error_msg = validate_inputs(src_file, dest_dir, page_range, out_file)
        if errors:
            app.errorBox("Error", "\n".join(error_msg), parent=None)
        else:
            split_pages(src_file, page_range, Path(dest_dir, out_file))
    else:
        app.stop()

If you click the "Process" button, the function calls app.getEntry() to retrieve each input value, stores it in a variable, and validates all of them by calling validate_inputs(). Any other button (here, "Quit") stops the application.
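The output path is built by joining the chosen directory and the file name with Path. The original does not say what happens if the user types a name without a .pdf extension, so the following is only a sketch of one possible convention. The helper build_output_path is hypothetical and not part of the article's tool.

```python
from pathlib import Path


def build_output_path(dest_dir, file_name):
    """Join directory and file name, adding ".pdf" when no PDF suffix is present."""
    out_path = Path(dest_dir, file_name.strip())
    if out_path.suffix.lower() != ".pdf":
        # Append rather than replace, so "report.v2" becomes "report.v2.pdf"
        out_path = out_path.with_name(out_path.name + ".pdf")
    return out_path


print(build_output_path("out", "chapter1").name)      # chapter1.pdf
print(build_output_path("out", "chapter1.PDF").name)  # chapter1.PDF
```

The result could be passed to the split function in place of Path(dest_dir, out_file), assuming an empty file name has already been rejected by validation.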

Let's look at the validate_inputs function.

def validate_inputs(input_file, output_dir, page_range, file_name):
    errors = False
    error_msgs = []

    # Make sure a PDF is selected
    if Path(input_file).suffix.upper() != ".PDF":
        errors = True
        error_msgs.append("Please select a PDF input file")

    # Make sure a range is selected
    if len(page_range) < 1:
        errors = True
        error_msgs.append("Please enter a valid page range")

    # Check for a valid directory
    if not Path(output_dir).exists():
        errors = True
        error_msgs.append("Please Select a valid output directory")

    # Check for a file name
    if len(file_name) < 1:
        errors = True
        error_msgs.append("Please enter a file name")

    return (errors, error_msgs)

This function performs some checks to ensure that each input contains data and is valid
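The range check above only verifies that something was entered, so a value such as "abc" would pass and then fail later when it is converted with int(). A stricter check could look like the sketch below. The function is_valid_page_range and its pattern are assumptions added for illustration, not part of the original tool.

```python
import re

# One page ("3") or one range ("4-10"), separated by commas, optional spaces
_RANGE_PATTERN = re.compile(r"\s*\d+\s*(-\s*\d+\s*)?(,\s*\d+\s*(-\s*\d+\s*)?)*")


def is_valid_page_range(text):
    """Return True if text looks like "1,3,4-10" with ascending, 1-based ranges."""
    if not _RANGE_PATTERN.fullmatch(text):
        return False
    for part in text.split(","):
        bounds = [int(n) for n in part.split("-")]
        # Page numbers start at 1, and a range must not run backwards
        if bounds[0] < 1 or bounds[-1] < bounds[0]:
            return False
    return True


print(is_valid_page_range("1,3,4-10"))  # True
print(is_valid_page_range("5-2"))       # False
print(is_valid_page_range("abc"))       # False
```

A check like this could replace the length test on the range inside the validation function, so that malformed input produces an error message instead of an exception.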

After all the data has been collected and verified, the split function can be called to process the file.

def split_pages(input_file, page_range, out_file):
    output = PdfFileWriter()
    input_pdf = PdfFileReader(open(input_file, "rb"))
    output_file = open(out_file, "wb")

    page_ranges = (x.split("-") for x in page_range.split(","))
    range_list = [i for r in page_ranges for i in range(int(r[0]), int(r[-1]) + 1)]

    for p in range_list:
        # Need to subtract 1 because pages are 0 indexed
        try:
            output.addPage(input_pdf.getPage(p - 1))
        except IndexError:
            # Alert the user and stop adding pages
            app.infoBox("Info", "Range exceeded number of pages in input.\nFile will still be saved.")
            break
    output.write(output_file)
    # Close the file so the written PDF is flushed to disk
    output_file.close()

    if app.questionBox("File Save", "Output PDF saved. Do you want to quit?"):
        app.stop()

This concludes the walkthrough of writing a PDF split tool in Python. Combining theory with practice is the best way to learn, so go and try it!
