Commit bb35419

add converting pdf to image tutorial

1 parent 7d418f5, commit bb35419

5 files changed: +57 -0 lines changed

‎README.md

Lines changed: 1 addition & 0 deletions
@@ -96,6 +96,7 @@ This is a repository of all the tutorials of [The Python Code](https://www.thepy
 - [Highlighting Text in PDF with Python](https://www.thepythoncode.com/article/redact-and-highlight-text-in-pdf-with-python). ([code](handling-pdf-files/highlight-redact-text))
 - [How to Extract Text from Images in PDF Files with Python](https://www.thepythoncode.com/article/extract-text-from-images-or-scanned-pdf-python). ([code](handling-pdf-files/pdf-ocr))
 - [How to Convert PDF to Docx in Python](https://www.thepythoncode.com/article/convert-pdf-files-to-docx-in-python). ([code](handling-pdf-files/convert-pdf-to-docx))
+- [How to Convert PDF to Images in Python](https://www.thepythoncode.com/article/convert-pdf-files-to-images-in-python). ([code](handling-pdf-files/convert-pdf-to-image))
 
 
 - ### [Web Scraping](https://www.thepythoncode.com/topic/web-scraping)
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# [How to Convert PDF to Images in Python](https://www.thepythoncode.com/article/convert-pdf-files-to-images-in-python)
+To run this:
+- `pip3 install -r requirements.txt`
+- To convert the PDF file `bert-paper.pdf` into several images (one image per page):
+    ```
+    $ python convert_pdf2image.py bert-paper.pdf
+    ```
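The converted pages are written as PNG files named after the input PDF. As a rough sketch of that naming scheme, which mirrors the `_page{n}.png` pattern used by `convert_pdf2image.py` in this commit, the helper below lists the file names a run should produce. The function name `expected_outputs` and the page count of 3 are illustrative assumptions, not part of the commit.

```python
import os
from typing import List


def expected_outputs(input_file: str, page_count: int) -> List[str]:
    """List the PNG names for a PDF with `page_count` pages (hypothetical helper)."""
    # Drop the directory and the extension, as the script does.
    stem = os.path.splitext(os.path.basename(input_file))[0]
    # Page numbers in the file names start at 1.
    return [f"{stem}_page{n}.png" for n in range(1, page_count + 1)]


if __name__ == "__main__":
    for name in expected_outputs("bert-paper.pdf", 3):
        print(name)
```

The files are written to the current working directory, because only the base name of the input path is kept.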
Binary file not shown.
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+import fitz
+
+from typing import Optional, Tuple
+import os
+
+
+def convert_pdf2img(input_file: str, pages: Optional[Tuple[int, ...]] = None):
+    """Converts a PDF to images, writing one PNG per page (`pages` holds 0-based page indexes)"""
+    # Open the document
+    pdfIn = fitz.open(input_file)
+    output_files = []
+    # Iterate over the pages
+    for pg in range(pdfIn.pageCount):
+        if pages is not None:
+            if pg not in pages:
+                continue
+        # Select a page
+        page = pdfIn[pg]
+        rotate = 0
+        # Each PDF page is rendered as one whole picture; the zoom factor scales its resolution.
+        # zoom = 1.33333333 ---> image size = 1056*816
+        # zoom = 2 ---> 2 * default resolution (text is clear, image text is hard to read), small file size, image size = 1584*1224
+        # zoom = 4 ---> 4 * default resolution (text is clear, image text is barely readable), large file size
+        # zoom = 8 ---> 8 * default resolution (text is clear, image text is readable), large file size
+        zoom_x = 2
+        zoom_y = 2
+        # The zoom factor is 2 to keep the text clear
+        # preRotate applies a rotation if needed (0 degrees here).
+        mat = fitz.Matrix(zoom_x, zoom_y).preRotate(rotate)
+        pix = page.getPixmap(matrix=mat, alpha=False)
+        output_file = f"{os.path.splitext(os.path.basename(input_file))[0]}_page{pg+1}.png"
+        pix.writePNG(output_file)
+        output_files.append(output_file)
+    pdfIn.close()
+    summary = {
+        "File": input_file, "Pages": str(pages), "Output File(s)": str(output_files)
+    }
+    # Printing Summary
+    print("## Summary ########################################################")
+    print("\n".join("{}:{}".format(i, j) for i, j in summary.items()))
+    print("###################################################################")
+    return output_files
+
+
+if __name__ == "__main__":
+    import sys
+    input_file = sys.argv[1]
+    convert_pdf2img(input_file)
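Two details of this script can be checked without PyMuPDF: which pages get selected, and how the zoom factor maps to image size. The sketch below is an illustration under stated assumptions: the helper names `select_pages` and `scaled_size` are hypothetical, the page is assumed to be US Letter (612 x 792 points), and the size is a rounded estimate rather than PyMuPDF's exact output. Page selection uses a membership test on 0-based indexes, because comparing string forms would treat page 1 as selected whenever page 10 is requested.

```python
from typing import List, Optional, Tuple


def select_pages(page_count: int, pages: Optional[Tuple[int, ...]] = None) -> List[int]:
    """Return the 0-based page indexes to convert (hypothetical helper)."""
    if pages is None:
        # No filter given: convert every page.
        return list(range(page_count))
    # Membership test on integers, not a substring match on strings.
    return [pg for pg in range(page_count) if pg in pages]


def scaled_size(width_pt: float, height_pt: float, zoom: float) -> Tuple[int, int]:
    """Estimate the pixel size of a page rendered at `zoom` (1 point = 1 pixel at zoom 1)."""
    return round(width_pt * zoom), round(height_pt * zoom)


if __name__ == "__main__":
    print(select_pages(12, (10,)))
    # Assumed US Letter page: 612 x 792 points.
    print(scaled_size(612, 792, 2))
```

For a Letter page at zoom 2 this gives 1224 x 1584 pixels, which matches the 1584*1224 figure in the script's comments.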
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+PyMuPDF==1.18.9
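The pin matters because the script calls camelCase names such as `pageCount` and `getPixmap`, which later PyMuPDF releases renamed to snake_case (`page_count`, `get_pixmap`). A minimal sketch for checking that the installed version matches a pin, using only the standard library; the helper names `parse_pin` and `installed_version` are hypothetical, and Python 3.8 or newer is assumed for `importlib.metadata`.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_pin(requirement: str) -> Tuple[str, str]:
    """Split an exact pin such as 'PyMuPDF==1.18.9' into (name, version)."""
    name, _, version = requirement.strip().partition("==")
    return name, version


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    name, wanted = parse_pin("PyMuPDF==1.18.9")
    found = installed_version(name)
    if found == wanted:
        print(f"{name} {found} matches the pin")
    else:
        print(f"{name}: found {found}, expected {wanted}")
```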
