Last updated on July 4, 2021.

4:18am. Alarm blaring. Still dark outside. The bed is warm. And the floor will feel so cold on my bare feet.
But I got out of bed. I braved the morning, and I took the ice cold floor on my feet like a champ.
Why?
Because I’m excited.
Excited to share something very special with you today…
You see, over the past few weeks I’ve gotten some really great emails from fellow PyImageSearch readers. These emails were short, sweet, and to the point. They were simple “thank you’s” for posting actual, honest-to-goodness Python and OpenCV code that you could take and use to solve your own computer vision and image processing problems.
And upon reflection last night, I realized that I’m not doing a good enough job sharing the libraries, packages, and code that I have developed for myself for everyday use — so that’s exactly what I’m going to do today.
In this blog post I’m going to show you the functions in my transform.py module. I use these functions whenever I need to do a 4 point cv2.getPerspectiveTransform using OpenCV.
And I think you’ll find the code in here quite interesting… and you’ll even be able to utilize it in your own projects.
So read on. And check out my 4 point OpenCV cv2.getPerspectiveTransform example.
- Update July 2021: Added two new sections. The first covers how to automatically find the top-left, top-right, bottom-right, and bottom-left coordinates for a perspective transform. The second section discusses how to improve perspective transform results by taking into account the aspect ratio of the input ROI.
OpenCV and Python versions:
This example will run on Python 2.7/Python 3.4+ and OpenCV 2.4.X/OpenCV 3.0+.
4 Point OpenCV getPerspectiveTransform Example
You may remember back to my posts on building a real-life Pokedex, specifically, my post on OpenCV and Perspective Warping.
In that post I mentioned how you could use a perspective transform to obtain a top-down, “birds eye view” of an image — provided that you could find reference points, of course.
This post will continue the discussion on the top-down, “birds eye view” of an image. But this time I’m going to share with you personal code that I use every single time I need to do a 4 point perspective transform.
So let’s not waste any more time. Open up a new file, name it transform.py, and let’s get started.
```python
# import the necessary packages
import numpy as np
import cv2

def order_points(pts):
    # initialize a list of coordinates that will be ordered
    # such that the first entry in the list is the top-left,
    # the second entry is the top-right, the third is the
    # bottom-right, and the fourth is the bottom-left
    rect = np.zeros((4, 2), dtype = "float32")

    # the top-left point will have the smallest sum, whereas
    # the bottom-right point will have the largest sum
    s = pts.sum(axis = 1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    # now, compute the difference between the points, the
    # top-right point will have the smallest difference,
    # whereas the bottom-left will have the largest difference
    diff = np.diff(pts, axis = 1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    # return the ordered coordinates
    return rect
```
We’ll start off by importing the packages we’ll need: NumPy for numerical processing and cv2 for our OpenCV bindings.
Next up, let’s define the order_points function on Line 5. This function takes a single argument, pts, which is a list of four points specifying the (x, y) coordinates of each point of the rectangle.
It is absolutely crucial that we have a consistent ordering of the points in the rectangle. The actual ordering itself can be arbitrary, as long as it is consistent throughout the implementation.
Personally, I like to specify my points in top-left, top-right, bottom-right, and bottom-left order.
We’ll start by allocating memory for the four ordered points on Line 10.
Then, we’ll find the top-left point, which will have the smallest x + y sum, and the bottom-right point, which will have the largest x + y sum. This is handled on Lines 14-16.
Of course, now we’ll have to find the top-right and bottom-left points. Here we’ll take the difference (i.e. y – x, which is what np.diff computes along axis 1) between the points using the np.diff function on Line 21.
The coordinates associated with the smallest difference will be the top-right point, whereas the coordinates with the largest difference will be the bottom-left point (Lines 22 and 23).
Finally, we return our ordered coordinates to the calling function on Line 26.
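To make the ordering concrete, here is a quick sanity check of my own (a standalone sketch, using the four corner points from the first example image later in this post):

```python
# a quick sanity check of the sum/difference ordering trick,
# using the corner points from the first example image below
import numpy as np

pts = np.array([(73, 239), (356, 117), (475, 265), (187, 443)], dtype = "float32")

print(pts.sum(axis = 1))                 # [312. 473. 740. 630.]
print(np.diff(pts, axis = 1).flatten())  # y - x: [ 166. -239. -210.  256.]

# smallest sum  (312)  -> (73, 239)   top-left
# largest sum   (740)  -> (475, 265)  bottom-right
# smallest diff (-239) -> (356, 117)  top-right
# largest diff  (256)  -> (187, 443)  bottom-left
```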
Again, I can’t stress enough how important it is to maintain a consistent ordering of points.
And you’ll see exactly why in this next function:
```python
def four_point_transform(image, pts):
    # obtain a consistent order of the points and unpack them
    # individually
    rect = order_points(pts)
    (tl, tr, br, bl) = rect

    # compute the width of the new image, which will be the
    # maximum distance between the bottom-right and bottom-left
    # points or the top-right and top-left points
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    # compute the height of the new image, which will be the
    # maximum distance between the top-right and bottom-right
    # points or the top-left and bottom-left points
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    # now that we have the dimensions of the new image, construct
    # the set of destination points to obtain a "birds eye view",
    # (i.e. top-down view) of the image, again specifying points
    # in the top-left, top-right, bottom-right, and bottom-left
    # order
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype = "float32")

    # compute the perspective transform matrix and then apply it
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    # return the warped image
    return warped
```
We start off by defining the four_point_transform function on Line 28, which requires two arguments: image and pts.
The image variable is the image we want to apply the perspective transform to. And the pts list is the list of four points containing the ROI of the image we want to transform.
We make a call to our order_points function on Line 31, which places our pts variable in a consistent order. We then unpack these coordinates on Line 32 for convenience.
Now we need to determine the dimensions of our new warped image.
We determine the width of the new image on Lines 37-39, where the width is the maximum of the distance between the bottom-right and bottom-left points and the distance between the top-right and top-left points.
In a similar fashion, we determine the height of the new image on Lines 44-46, where the height is the maximum of the distance between the top-right and bottom-right points and the distance between the top-left and bottom-left points.
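If you prefer, the exact same width and height computation can be written more compactly with np.linalg.norm (an equivalent sketch of mine, not the code from the listing above):

```python
# equivalent, more compact distance computation using np.linalg.norm
# (tl, tr, br, bl are the ordered corners returned by order_points)
maxWidth = int(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl)))
maxHeight = int(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl)))
```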
Note: Big thanks to Tom Lowell who emailed in and made sure I fixed the width and height calculation!
So here’s the part where you really need to pay attention.
Remember how I said that we are trying to obtain a top-down, “birds eye view” of the ROI in the original image? And remember how I said that a consistent ordering of the four points representing the ROI is crucial?
On Lines 53-57 you can see why. Here, we define 4 points representing our “top-down” view of the image. The first entry in the list is (0, 0), indicating the top-left corner. The second entry is (maxWidth - 1, 0), which corresponds to the top-right corner. Then we have (maxWidth - 1, maxHeight - 1), which is the bottom-right corner. Finally, we have (0, maxHeight - 1), which is the bottom-left corner.
The takeaway here is that these points are defined in a consistent order, which allows us to obtain the top-down view of the image.
To actually obtain the top-down, “birds eye view” of the image we’ll utilize the cv2.getPerspectiveTransform function on Line 60. This function requires two arguments: rect, which is the list of 4 ROI points in the original image, and dst, which is our list of transformed points. The cv2.getPerspectiveTransform function returns M, which is the actual transformation matrix.
We apply the transformation matrix on Line 61 using the cv2.warpPerspective function. We pass in the image and our transform matrix M, along with the width and height of our output image.
The output of cv2.warpPerspective is our warped image, which is our top-down view.
We return this top-down view on Line 64 to the calling function.
Now that we have code to perform the transformation, we need some code to drive it and actually apply it to images.
Open up a new file, call it transform_example.py, and let’s finish this up:
```python
# import the necessary packages
from pyimagesearch.transform import four_point_transform
import numpy as np
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", help = "path to the image file")
ap.add_argument("-c", "--coords",
    help = "comma separated list of source points")
args = vars(ap.parse_args())

# load the image and grab the source coordinates (i.e. the list
# of (x, y) points)
# NOTE: using the 'eval' function is bad form, but for this example
# let's just roll with it -- in future posts I'll show you how to
# automatically determine the coordinates without pre-supplying them
image = cv2.imread(args["image"])
pts = np.array(eval(args["coords"]), dtype = "float32")

# apply the four point transform to obtain a "birds eye view" of
# the image
warped = four_point_transform(image, pts)

# show the original and warped images
cv2.imshow("Original", image)
cv2.imshow("Warped", warped)
cv2.waitKey(0)
```

The first thing we’ll do is import our four_point_transform function on Line 2. I decided to put it in the pyimagesearch sub-module for organizational purposes.
We’ll then use NumPy for the array functionality, argparse for parsing command line arguments, and cv2 for OpenCV bindings.
We parse our command line arguments on Lines 8-12. We’ll use two switches: --image, which is the image that we want to apply the transform to, and --coords, which is the list of 4 points representing the region of the image we want to obtain a top-down, “birds eye view” of.
We then load the image on Line 19 and convert the points to a NumPy array on Line 20.
Now before you get all upset at me for using the eval function, please remember, this is just an example. I don’t condone performing a perspective transform this way.
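That said, if you want a safer stand-in that accepts the same --coords string, ast.literal_eval from the standard library only parses Python literals and never executes arbitrary code (a small sketch, not part of the example script above):

```python
# safer alternative to eval: ast.literal_eval only parses Python
# literals (tuples, lists, numbers), never arbitrary expressions
import ast
import numpy as np

pts = np.array(ast.literal_eval(args["coords"]), dtype = "float32")
```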
And, as you’ll see in next week’s post, I’ll show you how to automatically determine the four points needed for the perspective transform, with no manual work on your part!
Next, we can apply our perspective transform on Line 24.
Finally, let’s display the original image and the warped, top-down view of the image on Lines 27-29.
Obtaining a Top-Down View of the Image
Alright, let’s see this code in action.
Open up a shell and execute the following command:
$ python transform_example.py --image images/example_01.png --coords "[(73, 239), (356, 117), (475, 265), (187, 443)]"
You should see a top-down view of the notecard, similar to below:

Let’s try another image:
$ python transform_example.py --image images/example_02.png --coords "[(101, 185), (393, 151), (479, 323), (187, 441)]"

And a third for good measure:
$ python transform_example.py --image images/example_03.png --coords "[(63, 242), (291, 110), (361, 252), (78, 386)]"

As you can see, we have successfully obtained a top-down, “birds eye view” of the notecard!
In some cases the notecard looks a little warped — this is because the angle the photo was taken at is quite severe. The closer we come to the 90-degree angle of “looking down” on the notecard, the better the results will be.
Automatically finding the corners for the transform
In order to obtain our top-down transform of our input image we had to manually supply/hardcode the input top-left, top-right, bottom-right, and bottom-left coordinates.
That raises the question:
Is there a way to automatically obtain these coordinates?
You bet there is. The following three tutorials show you how to do exactly that, and a rough sketch of the core idea follows the list:
- Building a document scanner with OpenCV
- Bubble sheet multiple choice scanner and test grader using OMR, Python, and OpenCV
- OpenCV Sudoku Solver and OCR
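Those tutorials walk through the full pipelines, but the core idea is the same in each: detect edges, examine the largest contours, and keep the first one that can be approximated by exactly four points. Here is a rough sketch of that idea (my own condensed version, not the exact code from the posts above):

```python
# rough sketch: automatically find a 4-point ROI for the transform
from pyimagesearch.transform import four_point_transform
import cv2

image = cv2.imread("images/example_01.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 75, 200)

# NOTE: cv2.findContours returns (contours, hierarchy) in
# OpenCV 2.4/4.x but (image, contours, hierarchy) in OpenCV 3.x;
# grabbing index -2 works across these versions
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)[-2]

# examine the largest contours first and keep the first one that
# can be approximated by exactly four points
for c in sorted(cnts, key = cv2.contourArea, reverse = True):
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    if len(approx) == 4:
        warped = four_point_transform(image, approx.reshape(4, 2))
        break
```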
Improving your top-down transform results by computing the aspect ratio
The aspect ratio of an image is defined as the ratio of the width to the height. When resizing an image or performing a perspective transform, it’s important to consider the aspect ratio of the image.
For example, if you’ve ever seen an image that looks “squished” or “crunched,” it’s because the aspect ratio is off:

On the left, we have our original image. And on the right, we have two images that have been distorted by not preserving the aspect ratio. They have been resized by ignoring the ratio of the width to the height of the image.
To obtain better, more aesthetically pleasing perspective transforms, you should consider taking into account the aspect ratio of the input image/ROI. This thread on StackOverflow will show you how to do that.
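The thread covers the full ROI-aware correction. As a simpler illustration of the underlying idea, here is a minimal sketch of mine that resizes an image to a target width while preserving its aspect ratio (the helper name is hypothetical):

```python
import cv2

def resize_keep_aspect(image, width):
    # compute the new height from the target width so the
    # width-to-height ratio of the original image is preserved
    (h, w) = image.shape[:2]
    ratio = width / float(w)
    return cv2.resize(image, (width, int(h * ratio)))
```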
Summary
In this blog post I provided an OpenCV cv2.getPerspectiveTransform example using Python.
I even shared code from my personal library on how to do it!
But the fun doesn’t stop here.
You know those iPhone and Android “scanner” apps that let you snap a photo of a document and then have it “scanned” into your phone?
That’s right — I’ll show you how to use the 4 point OpenCV getPerspectiveTransform example code to build one of those document scanner apps!
I’m definitely excited about it, and I hope you are too.
Anyway, be sure to sign up for the PyImageSearch Newsletter to hear when the post goes live!
