PDF stands for Portable Document Format. It was created by Adobe and has proved to be an easy and reliable way of exchanging documents. In this tutorial, we will build a PDF viewer with a graphical user interface (GUI). We will cover designing the GUI, extracting data from the PDF, and displaying the PDF pages in the app.
Building your own PDF viewer with a GUI is a great way to practice some useful Python concepts, so if you are excited, let us get right into it. We will use the Tkinter module for the GUI. For operations on PDF files, such as getting the metadata, pages, and text, we will use the fitz module, which is part of the PyMuPDF package.
Here is what we are going to build at the end of this tutorial:

This application will be built from the ground up and there will be an in-depth coverage of the concepts so that you understand everything.
If you're curious about the PDF document shown, it's the free chapter of our Ethical Hacking with Python EBook!
Let us begin by installing the required modules. Fortunately, we only need to install one module for this project; the other modules ship with Python. In your terminal, enter this command:
$ pip install pymupdf
Now let us create a Python file and name it pdfviewer.py. You can call the file anything you prefer, but the name should be meaningful. Open the file and add this code:

# importing everything from tkinter
from tkinter import *
# importing ttk for styling widgets from tkinter
from tkinter import ttk
# importing filedialog from tkinter
from tkinter import filedialog as fd
# importing os module
import os

The first line imports every public name from the tkinter module. The second line imports ttk from tkinter, which provides themed versions of the widgets (Buttons, Labels, Entries, etc.). The third line imports filedialog from tkinter under the alias fd, and the last line imports the os module, which will help us get the current working directory and extract the PDF file name from its path.
We will use an object-oriented approach to build this application. Now that the imports have been taken care of, let us create a class for the application. Below the imports, paste this code:

# creating a class called PDFViewer
class PDFViewer:
    # initializing the __init__ / special method
    def __init__(self, master):
        pass

The code snippet creates a PDFViewer class, and inside it we have a constructor, the __init__() method, whose arguments are self and master. Note that every instance method in a class takes self as its first argument. For now, the constructor does nothing because of the pass statement. To test the program, paste this code below the PDFViewer class:

# creating the root window using Tk() class
root = Tk()
# instantiating/creating object app for class PDFViewer
app = PDFViewer(root)
# calling the mainloop to run the app infinitely until user closes it
root.mainloop()

Here we are creating the root window using tkinter's built-in Tk() class, then creating a PDFViewer object and passing it the main window root. The root being passed is received as master inside the constructor. Finally, mainloop() keeps the main window running until the user closes it.
To run the program, enter this command in your terminal:
$ python pdfviewer.py

If you are using an editor like VS Code or PyCharm, you can use its built-in feature for running Python scripts instead.
The output will be as follows:

The output is just a basic window. Do not worry about it; we will work on it in a moment.
Now let us declare the variables that we will use in the application. Inside the constructor, paste these lines of code:

        # path for the pdf doc
        self.path = None
        # state of the pdf doc, open or closed
        self.fileisopen = None
        # author of the pdf doc
        self.author = None
        # name for the pdf doc
        self.name = None
        # the current page for the pdf
        self.current_page = 0
        # total number of pages for the pdf doc
        self.numPages = None

In the code snippet, we have declared these variables:
self.path: the path of the PDF document, currently set to None.
self.fileisopen: the state of the document, whether opened or closed, currently set to None.
self.author: the author of the document, currently set to None.
self.name: the name of the document, currently set to None.
self.current_page: the current page, currently set to 0.
self.numPages: the total number of pages, currently set to None.
The above variables will start to make sense the moment we start using them.
In this section, we will design the GUI for the application, focusing on improving the look of the basic window we just saw.
Now inside the constructor, replace the pass statement with this code:

        # creating the window
        self.master = master
        # gives title to the main window
        self.master.title('PDF Viewer')
        # gives dimensions to main window
        self.master.geometry('580x520+440+180')
        # this disables resizing (and the maximize button) on the main window
        self.master.resizable(width = 0, height = 0)
        # loads the icon and adds it to the main window
        self.master.iconbitmap('pdf_file_icon.ico')

We store the main window as self.master, then give it a title via the title() function.
To control the window's dimensions, we are using the geometry() function, which takes 580 as the width and 520 as the height. The remaining two values position the window: 440 is the horizontal offset from the left edge of the screen and 180 is the vertical offset from the top.
With the dimensions set, we make the window non-resizable using the resizable() function, with both width and height set to 0. Finally, we load the icon and add it to the main window using the iconbitmap() function. Note that .ico icons are a Windows format, so this call may fail on other platforms.
With the above code, we will get this output:

On the top left corner of the main window, we have an icon and a title:

Here just make sure the icon is in the same folder as the program file.
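The geometry string passed to geometry() follows the pattern WIDTHxHEIGHT+X+Y. As a minimal sketch of how that string is put together, the two helpers below (parse_geometry and centered_geometry are hypothetical names, not part of tkinter) split such a string into its four numbers and build one that centers a window on a screen of a given size. The sketch only handles non-negative offsets.

```python
import re


def parse_geometry(geometry):
    """Split a geometry string 'WIDTHxHEIGHT+X+Y' into four integers."""
    match = re.fullmatch(r"(\d+)x(\d+)\+(\d+)\+(\d+)", geometry)
    if match is None:
        raise ValueError(f"unsupported geometry string: {geometry!r}")
    width, height, x_offset, y_offset = (int(part) for part in match.groups())
    return width, height, x_offset, y_offset


def centered_geometry(width, height, screen_width, screen_height):
    """Build a geometry string that centers the window on the screen."""
    x_offset = (screen_width - width) // 2
    y_offset = (screen_height - height) // 2
    return f"{width}x{height}+{x_offset}+{y_offset}"


# the string used in this tutorial: 580 wide, 520 high, 440 from the left, 180 from the top
print(parse_geometry("580x520+440+180"))
# the same window centered on an assumed 1920x1080 screen
print(centered_geometry(580, 520, 1920, 1080))
```

In a running application, the screen size could be read from the root window with winfo_screenwidth() and winfo_screenheight() instead of being hard-coded.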
Now let us add a menu bar to the main window. This menu will have two items, the Open File item and the Exit item. Just below the main window's code, paste this code:

        # creating the menu
        self.menu = Menu(self.master)
        # adding it to the main window
        self.master.config(menu=self.menu)
        # creating a sub menu
        self.filemenu = Menu(self.menu)
        # giving the sub menu a label
        self.menu.add_cascade(label="File", menu=self.filemenu)
        # adding two buttons to the sub menu
        self.filemenu.add_command(label="Open File")
        self.filemenu.add_command(label="Exit")

We are creating a menu bar using the Menu() class, whose argument is the main window, and to attach the menu bar to the main window we are using the config() function.
To create the File menu, we use the same Menu() class, and to add it to the menu bar we use the add_cascade() function, which takes label and menu as arguments. To add items to the File menu, we use the add_command() function, which also takes label as an argument. In our case, we have two items, Open File and Exit.
Running the program, we will get this output:

If you click the File button, the menu will expand to show the two items so that you can click them. Note that the menu items behave like buttons, so they can take the command argument as well.
Let's move on to create the two frames, the top and bottom frames. Inside the top frame we will have a Canvas for displaying the PDF pages, and inside the bottom frame we will have the up and down buttons and the label for displaying the page number. Below the menu, add this code:

        # creating the top frame
        self.top_frame = ttk.Frame(self.master, width=580, height=460)
        # placing the frame inside the main window using grid()
        self.top_frame.grid(row=0, column=0)
        # the frame will not propagate
        self.top_frame.grid_propagate(False)
        # creating the bottom frame
        self.bottom_frame = ttk.Frame(self.master, width=580, height=50)
        # placing the frame inside the main window using grid()
        self.bottom_frame.grid(row=1, column=0)
        # the frame will not propagate
        self.bottom_frame.grid_propagate(False)

In the code snippet, we are creating two frames, top and bottom, using ttk.Frame(). We add the top frame to the main window, give it a width of 580 and a height of 460, and place it in row 0, column 0.
The same goes for the bottom frame: we add it to the main window, give it a width of 580 and a height of 50, and place it in row 1, column 0. Notice that both frames call grid_propagate() with False as the input. This makes each frame keep its defined size regardless of its contents.
In this section, we will create the vertical and horizontal scrollbars, which will help us view the PDF page in case it is bigger than the display canvas. Below the frames code, paste this code:

        # creating a vertical scrollbar
        self.scrolly = Scrollbar(self.top_frame, orient=VERTICAL)
        # adding the scrollbar
        self.scrolly.grid(row=0, column=1, sticky=(N,S))
        # creating a horizontal scrollbar
        self.scrollx = Scrollbar(self.top_frame, orient=HORIZONTAL)
        # adding the scrollbar
        self.scrollx.grid(row=1, column=0, sticky=(W, E))

We are creating two scrollbars inside the top frame; the first has a vertical orient and the second a horizontal orient. Using grid(), we place the vertical scrollbar in row 0, column 1 and stretch it in the N and S directions. The horizontal scrollbar is placed in row 1, column 0 and stretched in the W and E directions.
Now let us create the Canvas for displaying the pages and add it to the top frame. We will then connect the scrollbars to the Canvas. Below the scrollbars, paste these lines of code:

        # creating the canvas for displaying the PDF pages
        self.output = Canvas(self.top_frame, bg='#ECE8F3', width=560, height=435)
        # inserting both vertical and horizontal scrollbars to the canvas
        self.output.configure(yscrollcommand=self.scrolly.set, xscrollcommand=self.scrollx.set)
        # adding the canvas
        self.output.grid(row=0, column=0)
        # configuring the vertical scrollbar to the canvas
        self.scrolly.configure(command=self.output.yview)
        # configuring the horizontal scrollbar to the canvas
        self.scrollx.configure(command=self.output.xview)

Here, via the Canvas() class, we are creating a Canvas inside the top frame, giving it a background color, a width of 560, and a height of 435. To let the Canvas report its position to the scrollbars, we use the configure() function, passing the set method of the vertical and horizontal scrollbars as yscrollcommand and xscrollcommand. Then we place the Canvas in row 0, column 0 using the grid() function.
To make the scrollbars actually move the Canvas, we use the configure() function again, this time on each scrollbar. It takes command, whose value is the matching canvas view method: self.output.yview for the vertical scrollbar and self.output.xview for the horizontal scrollbar.
If we run the program, this is the output that we will get:

The scrollbars have been added to the Canvas. For now, they are disabled since we have no content inside the Canvas.
Now that the widgets in the top frame are taken care of, let us add widgets to the bottom frame. Below this line of code:

        self.scrollx.configure(command=self.output.xview)

Add this code:

        # loading the button icons
        self.uparrow_icon = PhotoImage(file='uparrow.png')
        self.downarrow_icon = PhotoImage(file='downarrow.png')
        # resizing the icons to fit on buttons
        self.uparrow = self.uparrow_icon.subsample(3, 3)
        self.downarrow = self.downarrow_icon.subsample(3, 3)
        # creating an up button with an icon
        self.upbutton = ttk.Button(self.bottom_frame, image=self.uparrow)
        # adding the button
        self.upbutton.grid(row=0, column=1, padx=(270, 5), pady=8)
        # creating a down button with an icon
        self.downbutton = ttk.Button(self.bottom_frame, image=self.downarrow)
        # adding the button
        self.downbutton.grid(row=0, column=3, pady=8)
        # label for displaying page numbers
        self.page_label = ttk.Label(self.bottom_frame, text='page')
        # adding the label
        self.page_label.grid(row=0, column=4, padx=5)

With this code snippet, we are loading two icons for the buttons using the PhotoImage() class, then shrinking the icons with subsample() so that they fit inside the buttons properly. To create the buttons, we use ttk.Button(), which takes the bottom frame and the image as arguments.
We place the first button in row 0, column 1 via grid(). To push it toward the center, we use padx=(270, 5), which adds 270 pixels of padding on its left and 5 on its right, and pady=8 adds 8 pixels of padding above and below it.
For the second button, using grid() again, we place it in row 0, column 3, and it also gets 8 pixels of vertical padding.
Finally, we are creating a label to display the page numbers, and we place it in the same row as the buttons.
Let us see how the application is looking:

Two buttons and a label have been added. As mentioned earlier, all the icons for your application must be in the same folder as your program file.
Let us wrap up the GUI design by making the Exit item of the menu close the application. This is simple; edit this line of code:

        self.filemenu.add_command(label="Exit")

And make it look like this:

        self.filemenu.add_command(label="Exit", command=self.master.destroy)

With this single line of code, we are able to close the application. destroy is a built-in tkinter method that closes the main window.
Congratulations on successfully building the GUI! Now let's get into the PDF stuff.

miner.py File

In this section, we will create a miner.py file. This file is for PDF operations like opening the document, zooming the document, getting the PDF metadata, getting a page, and getting the text. Now create the file and make sure it is in the same folder as the pdfviewer.py file:

Open it and do the following imports:

# this is for doing some math operations
import math
# this is for handling the PDF operations
import fitz
# importing PhotoImage from tkinter
from tkinter import PhotoImage

We are importing the math module, which will help us do some math conversions, and the fitz module, which handles operations on PDF documents. Finally, we are importing PhotoImage from tkinter for loading image data.
Just below the imports, let us create the PDFMiner class, and inside it we will have the constructor. Add this code:

class PDFMiner:
    def __init__(self, filepath):
        # creating the file path
        self.filepath = filepath
        # opening the pdf document
        self.pdf = fitz.open(self.filepath)
        # loading the first page of the pdf document
        self.first_page = self.pdf.load_page(0)
        # getting the height and width of the first page
        self.width, self.height = self.first_page.rect.width, self.first_page.rect.height
        # initializing the zoom values of the page
        zoomdict = {800:0.8, 700:0.6, 600:1.0, 500:1.0}
        # rounding the width down to the nearest hundred
        width = int(math.floor(self.width / 100.0) * 100)
        # picking the zoom for that width, falling back to 1.0 for other widths
        self.zoom = zoomdict.get(width, 1.0)

We are creating a PDFMiner class. Inside the constructor, we store the filepath, then open the document using the fitz.open() function and assign it to the self.pdf variable.
To get the first page of the PDF file, we use the load_page() function with 0 as its input, and to get the width and height of the page we read the rect.width and rect.height attributes respectively. We also have a dictionary of zoom values: a page width in the 800s is zoomed by 0.8, a width in the 700s by 0.6, and widths in the 600s and 500s by 1.0.
To get the lookup key, we divide self.width by 100, round the result down with math.floor(), multiply it by 100 again, and convert it to an integer. In other words, the width is rounded down to the nearest hundred. Finally, we pick the zoom factor for that width; any width that is not in the dictionary falls back to 1.0 instead of raising a KeyError.
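To see the zoom lookup in isolation, here is a small sketch that applies the same rounding to a few page widths. The function name pick_zoom and the fallback of 1.0 for widths outside the table are assumptions made for illustration; the widths used are the common US Letter (612 points) and A4 (about 595 points) sizes plus two larger values.

```python
import math

# the same zoom table used in the PDFMiner constructor
ZOOM_BY_WIDTH = {800: 0.8, 700: 0.6, 600: 1.0, 500: 1.0}


def pick_zoom(page_width, default=1.0):
    """Round the width down to the nearest hundred and look up its zoom."""
    bucket = int(math.floor(page_width / 100.0) * 100)
    # dict.get() avoids a KeyError for widths that are not in the table
    return ZOOM_BY_WIDTH.get(bucket, default)


for width in (612, 595.276, 842, 1224):
    print(width, pick_zoom(width))
```

A direct lookup such as ZOOM_BY_WIDTH[bucket] would fail for the 1224-point page, since 1200 is not a key; the fallback keeps such documents viewable at their natural size.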
Let us create another function for getting the PDF document metadata. Below the constructor, paste this code:

    # this will get the metadata from the document like
    # author, name of document, number of pages
    def get_metadata(self):
        # getting metadata from the open PDF document
        metadata = self.pdf.metadata
        # getting number of pages from the open PDF document
        numPages = self.pdf.page_count
        # returning the metadata and the numPages
        return metadata, numPages

In the code snippet, we are creating a get_metadata() function. Inside it, we retrieve the metadata from the opened PDF file via self.pdf.metadata. To get the number of pages, we use self.pdf.page_count, then we return the metadata and the numPages.
For more details, I invite you to check this tutorial, which is dedicated to extracting metadata from PDF documents.
We will create another function for getting the page. Below the get_metadata() function, paste this code:

    # the function for getting the page
    def get_page(self, page_num):
        # loading the page
        page = self.pdf.load_page(page_num)
        # checking if a zoom factor is set
        if self.zoom:
            # creating a Matrix whose zoom factor is self.zoom
            mat = fitz.Matrix(self.zoom, self.zoom)
            # rendering the page as an image at that zoom
            pix = page.get_pixmap(matrix=mat)
        # otherwise rendering the page at its default size
        else:
            pix = page.get_pixmap()
        # dropping the alpha (transparency) channel, if there is one
        px1 = fitz.Pixmap(pix, 0) if pix.alpha else pix
        # converting the image to bytes in PPM format
        imgdata = px1.tobytes("ppm")
        # returning the image data wrapped in a PhotoImage
        return PhotoImage(data=imgdata)

Here we are creating the get_page() function, which takes self and page_num as arguments, then we load the PDF page via the load_page() function. We have an if/else block: inside the if branch we create a matrix using Matrix(), whose zoom factor is self.zoom, and render the page with get_pixmap() using that matrix. Inside the else branch we render the page without any zoom.
Outside the if/else block, we create a variable holding the image without its alpha channel, since the PPM format has no transparency. This image is then converted to bytes by the tobytes() function, and finally we return the image wrapped in a PhotoImage object.
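The bytes handed to PhotoImage() above are in PPM, a very simple image format that Tk can read directly. To make that less of a black box, the sketch below builds a tiny binary PPM image by hand; make_ppm is a hypothetical helper written for illustration and is not part of PyMuPDF or tkinter. For an RGB image, the data is a short text header followed by three bytes per pixel.

```python
def make_ppm(width, height, rgb):
    """Return a binary PPM (P6) image filled with a single RGB color."""
    # header: magic number, dimensions, and the maximum channel value
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    # body: one red, green, and blue byte per pixel, row by row
    pixels = bytes(rgb) * (width * height)
    return header + pixels


# a 2x2 image in the canvas background color used earlier (#ECE8F3)
ppm = make_ppm(2, 2, (236, 232, 243))
print(ppm[:11])
print(len(ppm))
```

Because the format carries no alpha channel, it makes sense that get_page() strips transparency from the pixmap before calling tobytes("ppm").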
The last function in the miner.py file is get_text(), which extracts text from the current page. Below the get_page() function, paste this code:

    # function to get text from the current page
    def get_text(self, page_num):
        # loading the page
        page = self.pdf.load_page(page_num)
        # getting text from the loaded page
        text = page.get_text('text')
        # returning text
        return text

The get_text() function takes self and page_num as arguments. Inside it, we load the PDF document page, get its text using the page's get_text() method (older PyMuPDF releases spelled it getText()), and finally return the text.
Now it is time we start implementing the application’s functionalities since we have taken care of most parts of the application, so let’s dive in!
The first functionality to implement is selecting the PDF file to view. Open the pdfviewer.py file and, below the imports, paste this code:

# importing the PDFMiner class from the miner file
from miner import PDFMiner

Here we are just importing the PDFMiner class, which means that we will be able to access all its functions.
Below the PDFViewer class constructor, add the following code:

    # function for opening pdf files
    def open_file(self):
        # open the file dialog
        filepath = fd.askopenfilename(title='Select a PDF file', initialdir=os.getcwd(), filetypes=(('PDF', '*.pdf'), ))
        # checking if a file was selected
        if filepath:
            # declaring the path
            self.path = filepath
            # extracting the pdf file name from the path
            filename = os.path.basename(self.path)
            # passing the path to PDFMiner
            self.miner = PDFMiner(self.path)
            # getting data and numPages
            data, numPages = self.miner.get_metadata()
            # setting the current page to 0
            self.current_page = 0
            # checking if numPages exists
            if numPages:
                # getting the title, falling back to the file name when it is empty
                self.name = data.get('title') or filename[:-4]
                # getting the author
                self.author = data.get('author', None)
                self.numPages = numPages
                # setting fileopen to True
                self.fileisopen = True
                # calling the display_page() function
                self.display_page()
                # replacing the window title with the PDF document name
                self.master.title(self.name)

First of all, we are creating the open_file() function, in which we declare a variable filepath that holds the selected file's path. To get the path, we use the askopenfilename() function, which takes title, initialdir, and filetypes as arguments. The initial directory will be the current working directory because of os.getcwd(), and the files to select from will only be PDFs.
If filepath is not empty, meaning the user selected a file rather than cancelling the dialog, we set path to filepath, then we extract the PDF file name from the path using os.path.basename() and assign it to filename.
We then create the self.miner object from the PDFMiner class, passing self.path to the class as an input. Using this object, we get data and numPages from the get_metadata() function. Then we set current_page to 0.
The last if statement checks if numPages exists, then we are doing the following:
Getting the title and the author using the data.get() function. If the title is missing or empty, the file name without its .pdf extension is used instead.
Setting self.numPages to numPages.
Setting self.fileisopen to True.
Calling the self.display_page() function, which we will create in a moment.
Let us now bind the open_file() function to the Open File item in the menu. Edit this line:

        self.filemenu.add_command(label="Open File")

So that it looks like this:

        self.filemenu.add_command(label="Open File", command=self.open_file)

Run the program and click the Open File item. This is the output you will get:

The functionality is working just fine. The file dialog lists only PDF files because of the filetypes filter.
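One detail of open_file() worth looking at on its own is how the window title is chosen. When a PDF has no title set, its metadata dictionary may still carry an empty 'title' entry, so a plain dictionary default would not kick in. The sketch below shows one way to guard against that; document_title is a hypothetical helper, and it uses os.path.splitext() rather than slicing off the last four characters so that any extension length or case is handled.

```python
import os


def document_title(metadata, filepath):
    """Return the metadata title, or the file name without extension if it is empty."""
    # treat a missing key, None, and whitespace-only titles the same way
    title = (metadata.get("title") or "").strip()
    if title:
        return title
    stem, _extension = os.path.splitext(os.path.basename(filepath))
    return stem


print(document_title({"title": "Annual Report"}, "/docs/report.pdf"))
print(document_title({"title": ""}, "/docs/report.pdf"))
print(document_title({}, "/docs/scan.final.PDF"))
```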
Now let's work on the display page functionality. Just below the open_file() function, paste the following code:

    # the function to display the page
    def display_page(self):
        # checking that current_page is at least 0 and less than numPages
        if 0 <= self.current_page < self.numPages:
            # getting the page using get_page() function from miner
            self.img_file = self.miner.get_page(self.current_page)
            # inserting the page image inside the Canvas
            self.output.create_image(0, 0, anchor='nw', image=self.img_file)
            # the variable to be stringified
            self.stringified_current_page = self.current_page + 1
            # updating the page label with number of pages
            self.page_label['text'] = str(self.stringified_current_page) + ' of ' + str(self.numPages)
            # creating a region for inserting the page inside the Canvas
            region = self.output.bbox(ALL)
            # making the region to be scrollable
            self.output.configure(scrollregion=region)

We have an if statement that checks that current_page is greater than or equal to 0 and less than numPages. Then we get the image of the current page via get_page(), and this image is inserted inside the Canvas using the create_image() function.
We are then updating the page_label with the current page and the total number of pages. Since current_page is zero-based, we add 1 to it before displaying it.
Finally, we compute the bounding box of everything on the Canvas with bbox(ALL), and to make this region scrollable we use the configure() function, whose scrollregion argument is set to the region.
Note that you do not have to bind this function to any button, since it is called inside the open_file() function.
To test this functionality, run the program and select any PDF document, make sure you get this output:

Several things have changed: the window title has been replaced with the document name, both scrollbars have been enabled, and the page label has been updated as well.
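The scrollbars switched from disabled to enabled because the page image is now larger than the visible Canvas. A scrollbar is told which part of the content is visible as a pair of fractions between 0 and 1. The sketch below is a simplified model of that idea, not tkinter's actual implementation; visible_fractions is a hypothetical function and the sizes are made-up pixel values.

```python
def visible_fractions(offset, viewport, content):
    """Model the (first, last) fractions of the content that are visible."""
    # nothing to scroll: the whole content fits inside the viewport
    if content <= viewport:
        return 0.0, 1.0
    # keep the offset within the scrollable range
    offset = max(0, min(offset, content - viewport))
    return offset / content, (offset + viewport) / content


# empty canvas: everything is "visible", so the scrollbar has nothing to do
print(visible_fractions(0, 435, 0))
# a page twice as tall as the 435-pixel canvas, scrolled to the top
print(visible_fractions(0, 435, 870))
# the same page scrolled to the bottom
print(visible_fractions(435, 435, 870))
```

When the pair is (0.0, 1.0) the slider fills the whole trough, which is why the scrollbars looked inactive before a document was opened.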
Let us make the application more interactive, so that when the user clicks the down button the next page is displayed on the Canvas. We will create a function for that. Below the display_page() function, paste this code:

    # function for displaying next page
    def next_page(self):
        # checking if file is open
        if self.fileisopen:
            # checking if current_page is less than numPages-1
            if self.current_page < self.numPages - 1:
                # updating the page with value 1
                self.current_page += 1
                # displaying the new page
                self.display_page()

Here we are creating a next_page() function. The first if statement checks whether fileisopen is True, and the second if statement checks whether current_page is less than numPages - 1, that is, whether there is a next page to go to. A strict comparison is used so that current_page never moves past the last page. Then we increase current_page by 1 and call the display_page() function.
Let us bind this function to the down button so that the code looks like this:

        self.downbutton = ttk.Button(self.bottom_frame, image=self.downarrow, command=self.next_page)

Right after running the program and opening a document, the current page is 1:

If you click the down button, you will go to the next page and the page label will also update:

Great! The application is working as expected.
Now let us make it possible to go back to the previous page after clicking the up button. Below or above the next_page() function, paste this code:

    # function for displaying the previous page
    def previous_page(self):
        # checking if fileisopen
        if self.fileisopen:
            # checking if current_page is greater than 0
            if self.current_page > 0:
                # decrementing the current_page by 1
                self.current_page -= 1
                # displaying the previous page
                self.display_page()

For the previous_page() function, we are checking if fileisopen is True and if current_page is greater than 0. Then we decrement current_page by 1 and call the display_page() function.
As with the down button, bind this function to the up button by adding the command argument, so that the code looks like this:

        self.upbutton = ttk.Button(self.bottom_frame, image=self.uparrow, command=self.previous_page)
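The page navigation logic can be checked without any GUI. The sketch below is a stand-alone model of it, assuming a hypothetical PageCursor class that is not part of the viewer; it keeps a zero-based page index inside the valid range and produces the same "1 of N" label text that display_page() writes.

```python
class PageCursor:
    """Track a zero-based page index that never leaves 0 .. num_pages - 1."""

    def __init__(self, num_pages):
        if num_pages < 1:
            raise ValueError("a document needs at least one page")
        self.num_pages = num_pages
        self.current_page = 0

    def next_page(self):
        """Move forward one page; return True if the index changed."""
        if self.current_page < self.num_pages - 1:
            self.current_page += 1
            return True
        return False

    def previous_page(self):
        """Move back one page; return True if the index changed."""
        if self.current_page > 0:
            self.current_page -= 1
            return True
        return False

    def label(self):
        """Return the one-based label shown to the user."""
        return f"{self.current_page + 1} of {self.num_pages}"


cursor = PageCursor(num_pages=3)
cursor.next_page()
cursor.next_page()
moved = cursor.next_page()  # already on the last page, so nothing happens
print(moved, cursor.label())
```

Returning whether the index changed is a design choice for this sketch: a caller could use it to skip re-rendering the page when a click at either end of the document has no effect.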
Run the program and click the down button to go to any page:

And now click the up button to go back to the previous page:

That's it for this tutorial! This article has walked you through building a GUI PDF viewer using Tkinter and PyMuPDF in Python. We hope you have learned a lot and that the knowledge you have acquired will be useful in future projects.
Get the complete code here.
Happy coding ♥

