Automatically extract documents from images and perspectively correct them with classic computer-vision algorithms. Check out Perspec for a GUI alternative.
ad-si/Perspectra
Software and corresponding workflow to scan documents and books with as little hardware as possible.
Check out github:adius/awesome-scanning for an extensive list of alternative solutions.
Command | Input | Result |
---|---|---|
perspectra correct --binary=gauss-diff 01.jpeg | ![]() | ![]() |
perspectra correct --binary=gauss-diff 02.jpeg | ![]() | ![]() |
perspectra correct --gray 03.jpeg | ![]() | ![]() |
We recommend using `uv` instead of `pip` to install the package.
uv tool install perspectra
To install from source:
    git clone https://github.com/ad-si/Perspectra
    cd Perspectra
    make install
    usage: perspectra [-h] [--debug] {binarize,correct,corners,renumber-pages} ...

    options:
      -h, --help            show this help message and exit
      --debug               Render debugging view

    subcommands:
      subcommands to handle files and correct photos

      {binarize,correct,corners,renumber-pages}
                            additional help
        binarize            Binarize image
        correct             Pespectively correct and crop photos of documents.
        corners             Returns the corners of the document in the image as
                            [top-left, top-right, bottom-right, bottom-left]
        renumber-pages      Renames the images in a directory according to their
                            page numbers. The assumed layout is `cover -> odd
                            pages -> even pages reversed`
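The layout assumed by `renumber-pages` (`cover -> odd pages -> even pages reversed`) can be illustrated with a small sketch. This is not Perspectra's implementation: the function name `page_order` and the choice to number the cover as page 0 are assumptions made for illustration only.

```python
def page_order(num_images: int) -> list[int]:
    """Return the page number of each image, in the order the photos were taken.

    Assumed capture order: cover first, then all odd pages ascending,
    then all even pages descending. The cover is numbered 0 here,
    which is an illustrative convention, not Perspectra's.
    """
    if num_images < 1:
        raise ValueError("at least one image (the cover) is required")

    num_pages = num_images - 1  # every image except the cover
    odd_pages = list(range(1, num_pages + 1, 2))
    even_pages_reversed = list(range(2, num_pages + 1, 2))[::-1]
    return [0] + odd_pages + even_pages_reversed


# Seven photos: the cover plus pages 1 to 6.
print(page_order(7))  # [0, 1, 3, 5, 6, 4, 2]
```

Sorting the images by these numbers would restore the reading order of the book.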
Your photos should ideally have the following properties:
- Photos with 10 - 20 Mpx
- Contain 1 document
- Rectangular
- Pronounced corners
- Only black content on white or light-colored paper
- On dark background
- Maximum of 30° rotation
    # Rule of thumb is the inverse of your focal length,
    # but motion blur is pretty much the worst for readable documents,
    # therefore use at least half of it and never slower than 1/50 s.
    shutter: 1/50 - 1/200 s

    # The whole document must be sharp even if you photograph it from an angle.
    # Therefore at least 8 f.
    aperture: 8-12 f

    # Noise is less bad than motion blur => relatively high ISO
    # Should be the last thing you set:
    # As high as necessary, as low as possible
    iso: 800-6400
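The shutter rule in the settings above can be worked through as a short calculation. The sketch below is one reading of that rule, not part of Perspectra: it takes the inverse of the focal length, halves the resulting exposure time, and caps it at 1/50 s. The function name `max_shutter_time` is hypothetical.

```python
from fractions import Fraction


def max_shutter_time(focal_length_mm: int) -> Fraction:
    """Longest recommended exposure time in seconds for a given focal length.

    Interpretation (an assumption): start from the 1/focal-length rule of
    thumb, halve the exposure time to reduce motion blur, and never expose
    longer than 1/50 s.
    """
    if focal_length_mm <= 0:
        raise ValueError("focal length must be positive")

    rule_of_thumb = Fraction(1, focal_length_mm)
    halved = rule_of_thumb / 2
    return min(halved, Fraction(1, 50))


for focal_length in (20, 50, 100):
    print(f"{focal_length} mm -> {max_shutter_time(focal_length)} s")
```

Under this reading, a 50 mm lens gives 1/100 s and a 100 mm lens gives 1/200 s, which matches the recommended range of 1/50 to 1/200 s.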
When using `Tv` (Time Value) or `Av` (Aperture Value) mode, use exposure compensation to set the lightness value below 0. You really don't want to overexpose your photos, as the bright pages are the first thing to clip.
On the other hand, it doesn't matter if you lose parts of the background because they are too dark.
A good tool for extracting the individual pages from a video of page turning is PySceneDetect. It's a Python/OpenCV-based scene detection program that uses threshold/content analysis on a given video.
For easy installation you can use the Docker image.
Find good values for threshold:
    docker run \
      --rm \
      --volume (pwd):/video \
      handflucht/pyscenedetect \
      --input /video/page-turning.mp4 \
      --downscale-factor 2 \
      --detector content \
      --statsfile page-turning-stats.csv
To launch the image run:
    docker run \
      --interactive \
      --tty \
      --volume=(pwd):/video \
      --entrypoint=bash \
      handflucht/pyscenedetect
Then run in the shell:
    cd /video
    scenedetect \
      --input page-turning.mp4 \
      --downscale-factor 2 \
      --detector content \
      --threshold 3 \
      --min-scene-length 80 \
      --save-images

TODO: The correct way to do this (after Breakthrough/PySceneDetect#45 is implemented):

    docker run \
      --rm \
      --volume (pwd):/video \
      handflucht/pyscenedetect \
      --input /video/page-turning.mp4 \
      --downscale-factor 2 \
      --detector content \
      --threshold 3 \
      --min-scene-length 80 \
      --save-images <TODO: path>
Aim for a low threshold and a long minimum scene length. I.e. turn the page really fast and show it for a long time.
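To see why this combination works, here is a simplified sketch of threshold-based cut detection with a minimum scene length. It is not PySceneDetect's actual implementation: the function `detect_cuts` and the per-frame change scores are made up for illustration.

```python
def detect_cuts(scores: list[float], threshold: float, min_scene_length: int) -> list[int]:
    """Return the frame indices at which a new scene (page) starts.

    A cut is registered when the frame-to-frame change score reaches the
    threshold and at least `min_scene_length` frames have passed since the
    previous cut (or since the start of the video).
    """
    cuts = []
    last_cut = 0
    for frame, score in enumerate(scores):
        if score >= threshold and frame - last_cut >= min_scene_length:
            cuts.append(frame)
            last_cut = frame
    return cuts


# Hypothetical scores: a still page, a fast page turn spanning two frames,
# another still page, a second page turn, and a final still page.
scores = [0.5] * 100 + [9.0, 8.0] + [0.5] * 100 + [7.5] + [0.5] * 50

print(detect_cuts(scores, threshold=3, min_scene_length=80))  # [100, 202]
print(detect_cuts(scores, threshold=3, min_scene_length=1))   # [100, 101, 202]
```

With a long minimum scene length, the second high-scoring frame of the same page turn is ignored, so each page yields exactly one scene. A low threshold makes sure that even a quick page turn is picked up.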