Watch Now: This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Pointers and Objects in Python
If you’ve ever worked with lower level languages like C or C++, then you’ve probably heard of pointers. Pointers allow you to create great efficiency in parts of your code. They also cause confusion for beginners and can lead to various memory management bugs, even for experts. So where are they in Python, and how can you simulate pointers in Python?
Pointers are widely used in C and C++. Essentially, they are variables that hold the memory address of another variable. For a refresher on pointers, you might consider checking out this overview on C Pointers.
In this article, you’ll gain a better understanding of Python’s object model and learn why pointers in Python don’t really exist. For the cases where you need to mimic pointer behavior, you’ll learn ways to simulate pointers in Python without the memory-management nightmare.
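In this article, you’ll:
- Learn why pointers in Python don’t exist
- Explore the difference between C variables and Python names
- Simulate pointers in Python
- Experiment with real pointers using ctypes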
Note: In this article, “Python” will refer to the reference implementation of Python in C, otherwise known as CPython. As the article discusses some internals of the language, these notes are true for CPython 3.7 but may not be true in future or past iterations of the language.
The truth is that I don’t know. Could pointers in Python exist natively? Probably, but pointers seem to go against the Zen of Python. Pointers encourage implicit changes rather than explicit. Often, they are complex instead of simple, especially for beginners. Even worse, they beg for ways to shoot yourself in the foot, or do something really dangerous like read from a section of memory you were not supposed to.
Python tends to try to abstract away implementation details like memory addresses from its users. Python often focuses on usability instead of speed. As a result, pointers in Python don’t really make sense. Not to fear though, Python does, by default, give you some of the benefits of using pointers.
Understanding pointers in Python requires a short detour into Python’s implementation details. Specifically, you’ll need to understand:
- Immutable vs mutable objects
- Python variables and names
Hold onto your memory addresses, and let’s get started.
In Python, everything is an object. For proof, you can open up a REPL and explore using isinstance():
    >>> isinstance(1, object)
    True
    >>> isinstance(list(), object)
    True
    >>> isinstance(True, object)
    True
    >>> def foo():
    ...     pass
    ...
    >>> isinstance(foo, object)
    True
This code shows you that everything in Python is indeed an object. Each object contains at least three pieces of data:
- Reference count
- Type
- Value
The reference count is for memory management. For an in-depth look at the internals of memory management in Python, you can read Memory Management in Python.
The type is used at the CPython layer to ensure type safety during runtime. Finally, there’s the value, which is the actual value associated with the object.
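As a rough illustration of the three pieces of data mentioned above, the sketch below reads them from ordinary Python code. The helper name describe() is made up for this example, and sys.getrefcount() is CPython-specific; the number it reports includes temporary references created by the call itself, so treat it as approximate.

```python
import sys


def describe(obj):
    """Collect the reference count, type, and value of any object."""
    return {
        # CPython-specific; includes temporary references from this call.
        "refcount": sys.getrefcount(obj),
        "type": type(obj).__name__,
        "value": repr(obj),
    }


info = describe([1, 2, 3])
print(info["type"], info["value"], info["refcount"])
```

The type and value are stable across runs, while the reference count depends on how many names and temporaries happen to refer to the object at that moment.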
Not all objects are the same though. There is one other important distinction you’ll need to understand: immutable vs mutable objects. Understanding the difference between the types of objects really helps clarify the first layer of the onion that is pointers in Python.
In Python, there are two types of objects:
- Immutable objects can’t be changed.
- Mutable objects can be changed.
Understanding this difference is the first key to navigating the landscape of pointers in Python. Here’s a breakdown of common types and whether or not they are mutable or immutable:
Type | Immutable? |
---|---|
int | Yes |
float | Yes |
bool | Yes |
complex | Yes |
tuple | Yes |
frozenset | Yes |
str | Yes |
list | No |
set | No |
dict | No |
As you can see, lots of commonly used primitive types are immutable. You can prove this yourself by writing some Python. You’ll need a couple of tools from the Python standard library:
- id() returns the object’s memory address.
- is returns True if and only if two objects have the same memory address.
Once again, you can use these in a REPL environment:
    >>> x = 5
    >>> id(x)
    94529957049376
In the above code, you have assigned the value 5 to x. If you tried to modify this value with addition, then you’d get a new object:
    >>> x += 1
    >>> x
    6
    >>> id(x)
    94529957049408
Even though the above code appears to modify the value of x, you’re getting a new object as a response.
The str type is also immutable:
    >>> s = "real_python"
    >>> id(s)
    140637819584048
    >>> s += "_rocks"
    >>> s
    'real_python_rocks'
    >>> id(s)
    140637819609424
Again, s ends up with a different memory address after the += operation.
Bonus: The += operator translates to various method calls.
For some objects like list, += will translate into __iadd__() (in-place add). This will modify self and return the same ID. However, str and int don’t have these methods and result in __add__() calls instead of __iadd__().
For more detailed information, check out the Python data model docs.
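The difference between __iadd__() and __add__() can be checked with a short sketch. The helper name stays_same_object() is hypothetical; it keeps a second name bound to the starting object so the comparison uses is rather than raw id() values.

```python
def stays_same_object(obj, extra):
    """Apply += and report whether the name still refers to the same object."""
    original = obj   # keep a second name bound to the starting object
    obj += extra     # __iadd__() if the type defines it, otherwise __add__()
    return obj is original


list_in_place = stays_same_object([1, 2, 3], [4])
str_in_place = stays_same_object("real_python", "_rocks")

print(hasattr(list, "__iadd__"), list_in_place)
print(hasattr(str, "__iadd__"), str_in_place)
```

The list is extended in place, so the name keeps referring to the same object. The string has no __iadd__(), so += builds a new object and rebinds the name to it.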
Trying to directly mutate the string s results in an error:
    >>> s[0] = "R"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'str' object does not support item assignment
The above code fails, and Python indicates that str doesn’t support this mutation, which is in line with the definition that the str type is immutable.
Contrast that with a mutable object, like list:
    >>> my_list = [1, 2, 3]
    >>> id(my_list)
    140637819575368
    >>> my_list.append(4)
    >>> my_list
    [1, 2, 3, 4]
    >>> id(my_list)
    140637819575368
This code shows a major difference in the two types of objects. my_list has an id originally. Even after 4 is appended to the list, my_list has the same id. This is because the list type is mutable.
Another way to demonstrate that the list is mutable is with assignment:
    >>> my_list[0] = 0
    >>> my_list
    [0, 2, 3, 4]
    >>> id(my_list)
    140637819575368
In this code, you mutate my_list and set its first element to 0. However, it maintains the same id even after this assignment. With mutable and immutable objects out of the way, the next step on your journey to Python enlightenment is understanding Python’s variable ecosystem.
Python variables are fundamentally different than variables in C or C++. In fact, Python doesn’t even have variables. Python has names, not variables.
This might seem pedantic, and for the most part, it is. Most of the time, it’s perfectly acceptable to think about Python names as variables, but understanding the difference is important. This is especially true when you’re navigating the tricky subject of pointers in Python.
To help drive home the difference, you can take a look at how variables work in C, what they represent, and then contrast that with how names work in Python.
Let’s say you had the following code that defines the variable x:
    int x = 2337;
This one line of code has several distinct steps when executed:
1. Allocate enough memory for an integer
2. Assign the value 2337 to that memory location
3. Indicate that x points to that value
Shown in a simplified view of memory, it might look like this:
Here, you can see that the variable x has a fake memory location of 0x7f1 and the value 2337. If, later in the program, you want to change the value of x, you can do the following:
    x = 2338;
The above code assigns a new value (2338) to the variable x, thereby overwriting the previous value. This means that the variable x is mutable. The updated memory layout shows the new value:
Notice that the location of x didn’t change, just the value itself. This is a significant point. It means that x is the memory location, not just a name for it.
Another way to think of this concept is in terms of ownership. In one sense, x owns the memory location. x is, at first, an empty box that can hold exactly one integer.
When you assign a value to x, you’re placing a value in the box that x owns. If you wanted to introduce a new variable (y), you could add this line of code:
    int y = x;
This code creates a new box called y and copies the value from x into the box. Now the memory layout will look like this:
Notice the new location 0x7f5 of y. Even though the value of x was copied to y, the variable y owns some new address in memory. Therefore, you could overwrite the value of y without affecting x:
    y = 2339;
Now the memory layout will look like this:
Again, you have modified the value at y, but not its location. In addition, you have not affected the original x variable at all. This is in stark contrast with how Python names work.
Python does not have variables. It has names. Yes, this is a pedantic point, and you can certainly use the term variables as much as you like. It is important to know that there is a difference between variables and names.
Let’s take the equivalent code from the above C example and write it in Python:
    >>> x = 2337
Much like in C, the above code is broken down into several distinct steps during execution:
1. Create a PyObject
2. Set the typecode to integer for the PyObject
3. Set the value to 2337 for the PyObject
4. Create a name called x
5. Point x to the new PyObject
6. Increase the refcount of the PyObject by 1
Note: The PyObject is not the same as Python’s object. It’s specific to CPython and represents the base structure for all Python objects.
PyObject is defined as a C struct, so if you’re wondering why you can’t call typecode or refcount directly, it’s because you don’t have access to the structures directly. Method calls like sys.getrefcount() can help get some internals.
In memory, it might look something like this:
You can see that the memory layout is vastly different than the C layout from before. Instead of x owning the block of memory where the value 2337 resides, the newly created Python object owns the memory where 2337 lives. The Python name x doesn’t directly own any memory address in the way the C variable x owned a static slot in memory.
If you were to try to assign a new value to x, you could try the following:
    >>> x = 2338
What’s happening here is different than the C equivalent, but not too different from the original bind in Python.
This code:
- Creates a new PyObject
- Sets the typecode to integer for the PyObject
- Sets the value to 2338 for the PyObject
- Points x to the new PyObject
- Increases the refcount of the new PyObject by 1
- Decreases the refcount of the old PyObject by 1
Now in memory, it would look something like this:
This diagram helps illustrate that x points to a reference to an object and doesn’t own the memory space as before. It also shows that the x = 2338 command is not an assignment, but rather binding the name x to a reference.
In addition, the previous object (which held the 2337 value) is now sitting in memory with a ref count of 0 and will get cleaned up by the garbage collector.
You could introduce a new name, y, to the mix as in the C example:
    >>> y = x
In memory, you would have a new name, but not necessarily a new object:
Now you can see that a new Python object has not been created, just a new name that points to the same object. Also, the object’s refcount has increased by one. You could check for object identity equality to confirm that they are the same:
    >>> y is x
    True
The above code indicates that x and y are the same object. Make no mistake though: the integer that y refers to is still immutable.
For example, you could perform addition on y:
    >>> y += 1
    >>> y is x
    False
After the addition call, you get back a new Python object. Now, the memory looks like this:
A new object has been created, and y now points to the new object. Interestingly, this is the same end-state if you had bound y to 2339 directly:
    >>> y = 2339
The above statement results in the same end-memory state as the addition. To recap, in Python, you don’t assign variables. Instead, you bind names to references.
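The binding steps in this recap can be observed from Python itself. The sketch below assumes CPython, since sys.getrefcount() reports CPython’s reference counts; the function name binding_demo() is made up for the example, and only the differences between counts are meaningful.

```python
import sys


def binding_demo():
    x = object()                        # one object, one name
    before = sys.getrefcount(x)
    y = x                               # bind a second name; nothing is copied
    shared = y is x
    after_bind = sys.getrefcount(x)
    y = object()                        # rebind y; the object behind x is untouched
    after_rebind = sys.getrefcount(x)
    return shared, after_bind - before, after_rebind - before


shared, bind_delta, rebind_delta = binding_demo()
print(shared, bind_delta, rebind_delta)
```

Binding y adds one reference to the existing object, and rebinding y removes it again without affecting x.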
Now that you understand how Python objects get created and names get bound to those objects, it’s time to throw a wrench in the machinery. That wrench goes by the name of interned objects.
Suppose you have the following Python code:
    >>> x = 1000
    >>> y = x
    >>> x is y
    True
As above, x and y are both names that point to the same Python object. But the Python object that holds the value 1000 is not always guaranteed to have the same memory address. For example, if you were to assign a literal 1000 to y as well, you would end up with a different memory address:
    >>> x = 1000
    >>> y = 1000
    >>> x is y
    False
This time, the line x is y returns False. If this is confusing, then don’t worry. Here are the steps that occur when this code is executed:
1. Create a Python object (1000)
2. Assign the name x to that object
3. Create a Python object (1000)
4. Assign the name y to that object
Note: The above steps occur only when this code is executed inside a REPL. If you were to take the example above, paste it into a file, and run the file, then you would find that the x is y line would return True.
This occurs because compilers are smart. The CPython compiler attempts to make optimizations called peephole optimizations, which help save execution steps whenever possible.
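One way to explore the file-versus-REPL difference without creating a file is to compile both statements as a single unit, which is roughly what happens when CPython runs a script. This is a sketch that assumes CPython; the filename "<example>" is a placeholder, and the identity result is an implementation detail rather than a language guarantee.

```python
source = "x = 1000\ny = 1000\n"

# Compile both statements together, the way a whole file is compiled.
code = compile(source, "<example>", "exec")
namespace = {}
exec(code, namespace)

print(code.co_consts)                     # constants stored for this unit
print(namespace["x"] is namespace["y"])   # identity of the two bindings
```

Looking at co_consts shows how many separate constants the compiler kept for the unit, which helps explain whatever identity result you see.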
Isn’t this wasteful? Well, yes it is, but that’s the price you pay for all of the great benefits of Python. You never have to worry about cleaning up these intermediate objects or even need to know that they exist! The joy is that these operations are relatively fast, and you never had to know any of those details until now.
The core Python developers, in their wisdom, also noticed this waste and decided to make a few optimizations. These optimizations result in behavior that can be surprising to newcomers:
    >>> x = 20
    >>> y = 20
    >>> x is y
    True
In this example, you see nearly the same code as before, except this time the result is True. This is the result of interned objects. Python pre-creates a certain subset of objects in memory and keeps them in the global namespace for everyday use.
Which objects are interned depends on the implementation of Python. CPython 3.7 interns the following:
- Integer numbers between -5 and 256
- Strings that contain ASCII letters (A to Z and a to z), digits, and underscores only
The reasoning behind this is that these objects are likely to be used in many programs. For example, most variable names in Python are covered by the second bullet point. By interning these objects, Python prevents memory allocation calls for consistently used objects.
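The integer range quoted above can be probed with values that are created at runtime, so the compiler has no chance to share a single constant. This is a sketch that assumes CPython; the helper name built_at_runtime() is made up, and the exact cache boundaries are an implementation detail that may differ between versions.

```python
def built_at_runtime(text):
    """Parse the same text twice and compare the identities of the results."""
    first = int(text)
    second = int(text)
    return first is second


print(built_at_runtime("20"))     # inside the range quoted above
print(built_at_runtime("2337"))   # outside that range
```

Comparing the two printed results shows whether a given value is served from the pre-created set of integers or allocated fresh each time.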
Strings that are less than 20 characters and contain ASCII letters, digits, or underscores will be interned. This can make these strings more effective to use in comparisons:
    >>> s1 = "realpython"
    >>> id(s1)
    140696485006960
    >>> s2 = "realpython"
    >>> id(s2)
    140696485006960
    >>> s1 is s2
    True
Here you can see that s1 and s2 both point to the same address in memory. If you were to introduce a character that isn’t an ASCII letter, digit, or underscore, then you would get a different result:
    >>> s1 = "Real Python!"
    >>> s2 = "Real Python!"
    >>> s1 is s2
    False
Because this example has an exclamation mark (!) in it, these strings are not interned and are different objects in memory.
Bonus: If you really want these objects to reference the same internal object, then you may want to check out sys.intern(). One of the use cases for this function is outlined in the documentation:
Interning strings is useful to gain a little performance on dictionary lookup—if the keys in a dictionary are interned, and the lookup key is interned, the key comparisons (after hashing) can be done by a pointer compare instead of a string compare. (Source)
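A minimal sketch of sys.intern() in action follows. The strings are built at runtime with str.join() so that identical literals cannot be merged by the compiler; the variable names are only for illustration.

```python
import sys

# Build the strings at runtime so identical literals cannot be merged.
parts = ["Real", " ", "Python!"]
s1 = "".join(parts)
s2 = "".join(parts)

print(s1 == s2)   # equal contents
print(s1 is s2)   # separate objects

i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)   # both names now refer to the interned copy
```

After interning, the two names refer to one shared string object, which is what makes the pointer comparison described in the documentation possible.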
Interned objects are often a source of confusion. Just remember, if you’re ever in doubt, that you can always use id() and is to determine whether two names refer to the same object.
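Since is and id() answer a different question than ==, a short side-by-side comparison may help. The names a, b, and c are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a second list with the same contents
c = a           # a second name for the first list

print(a == b, a is b)   # equal values, different objects
print(a == c, a is c)   # equal values, same object
print(id(a) == id(c))   # id() agrees with is
```

Use == when you care about values and is when you care about whether two names share one object.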
Just because pointers in Python don’t exist natively doesn’t mean you can’t get the benefits of using pointers. In fact, there are multiple ways to simulate pointers in Python. You’ll learn two in this section:
- Using mutable types as pointers
- Using custom Python objects
Okay, let’s get to the point.
You’ve already learned about mutable types. Because these objects are mutable, you can treat them as if they were pointers to simulate pointer behavior. Suppose you wanted to replicate the following C code:
    void add_one(int *x) {
        *x += 1;
    }
This code takes a pointer to an integer (*x) and then increments the value by one. Here is a main function to exercise the code:
    #include <stdio.h>

    int main(void) {
        int y = 2337;
        printf("y = %d\n", y);
        add_one(&y);
        printf("y = %d\n", y);
        return 0;
    }
In the above code, you assign 2337 to y, print out the current value, increment the value by one, and then print out the modified value. The output of executing this code would be the following:
    y = 2337
    y = 2338
One way to replicate this type of behavior in Python is by using a mutable type. Consider using a list and modifying the first element:
    >>> def add_one(x):
    ...     x[0] += 1
    ...
    >>> y = [2337]
    >>> add_one(y)
    >>> y[0]
    2338
Here, add_one(x) accesses the first element and increments its value by one. Using a list means that the end result appears to have modified the value. So pointers in Python do exist? Well, no. This is only possible because list is a mutable type. If you tried to use a tuple, you would get an error:
    >>> z = (2337,)
    >>> add_one(z)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in add_one
    TypeError: 'tuple' object does not support item assignment
The above code demonstrates that tuple is immutable. Therefore, it does not support item assignment. list is not the only mutable type. Another common approach to mimicking pointers in Python is to use a dict.
Let’s say you had an application where you wanted to keep track of every time an interesting event happened. One way to achieve this would be to create a dict and use one of the items as a counter:
    >>> counters = {"func_calls": 0}
    >>> def bar():
    ...     counters["func_calls"] += 1
    ...
    >>> def foo():
    ...     counters["func_calls"] += 1
    ...     bar()
    ...
    >>> foo()
    >>> counters["func_calls"]
    2
In this example, the counters dictionary is used to keep track of the number of function calls. After you call foo(), the counter has increased to 2 as expected. This works because dict is mutable.
Keep in mind, this only simulates pointer behavior and does not directly map to true pointers in C or C++. That is to say, these operations are more expensive than they would be in C or C++.
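To see why a mutable container is needed at all, it can help to contrast rebinding a name inside a function with mutating an object the caller also refers to. The function names rebind() and mutate() are made up for this sketch.

```python
def rebind(x):
    # Rebinds the local name only; the caller's object is untouched.
    x = x + 1


def mutate(box):
    # Changes the list object that the caller also refers to.
    box[0] += 1


n = 2337
rebind(n)

boxed = [2337]
mutate(boxed)

print(n, boxed[0])
```

Only the mutation is visible to the caller, which is the behavior the list and dict examples above rely on.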
The dict option is a great way to emulate pointers in Python, but sometimes it gets tedious to remember the key name you used. This is especially true if you’re using the dictionary in various parts of your application. This is where a custom Python class can really help.
To build on the last example, assume that you want to track metrics in your application. Creating a class is a great way to abstract the pesky details:
    class Metrics(object):
        def __init__(self):
            self._metrics = {
                "func_calls": 0,
                "cat_pictures_served": 0,
            }
This code defines a Metrics class. This class still uses a dict for holding the actual data, which is in the _metrics member variable. This will give you the mutability you need. Now you just need to be able to access these values. One nice way to do this is with properties:
    class Metrics(object):
        # ...

        @property
        def func_calls(self):
            return self._metrics["func_calls"]

        @property
        def cat_pictures_served(self):
            return self._metrics["cat_pictures_served"]
This code makes use of @property. If you’re not familiar with decorators, you can check out this Primer on Python Decorators. The @property decorator here allows you to access func_calls and cat_pictures_served as if they were attributes:
    >>> metrics = Metrics()
    >>> metrics.func_calls
    0
    >>> metrics.cat_pictures_served
    0
The fact that you can access these names as attributes means that you abstracted the fact that these values are in a dict. You also make it more explicit what the names of the attributes are. Of course, you need to be able to increment these values:
    class Metrics(object):
        # ...

        def inc_func_calls(self):
            self._metrics["func_calls"] += 1

        def inc_cat_pics(self):
            self._metrics["cat_pictures_served"] += 1
You have introduced two new methods:
- inc_func_calls()
- inc_cat_pics()
These methods modify the values in the metrics dict. You now have a class that you modify as if you’re modifying a pointer:
    >>> metrics = Metrics()
    >>> metrics.inc_func_calls()
    >>> metrics.inc_func_calls()
    >>> metrics.func_calls
    2
Here, you can access func_calls and call inc_func_calls() in various places in your applications and simulate pointers in Python. This is useful when you have something like metrics that need to be used and updated frequently in various parts of your applications.
Note: In this class in particular, making inc_func_calls() and inc_cat_pics() explicit instead of using @property.setter prevents users from setting these values to an arbitrary int or an invalid value like a dict.
Here’s the full source for the Metrics class:
    class Metrics(object):
        def __init__(self):
            self._metrics = {
                "func_calls": 0,
                "cat_pictures_served": 0,
            }

        @property
        def func_calls(self):
            return self._metrics["func_calls"]

        @property
        def cat_pictures_served(self):
            return self._metrics["cat_pictures_served"]

        def inc_func_calls(self):
            self._metrics["func_calls"] += 1

        def inc_cat_pics(self):
            self._metrics["cat_pictures_served"] += 1
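The sketch below shows how such an object might be shared between functions, and what happens when code tries to assign to a property that has no setter. It uses a trimmed copy of the class with a single counter so that it runs on its own; handle_request() is a hypothetical caller invented for this example.

```python
class Metrics:
    """Trimmed copy of the class above, with a single counter."""

    def __init__(self):
        self._metrics = {"func_calls": 0}

    @property
    def func_calls(self):
        return self._metrics["func_calls"]

    def inc_func_calls(self):
        self._metrics["func_calls"] += 1


def handle_request(metrics):
    # Hypothetical caller: it receives the shared object and updates it.
    metrics.inc_func_calls()


metrics = Metrics()
handle_request(metrics)
handle_request(metrics)
print(metrics.func_calls)

try:
    metrics.func_calls = 100   # no setter was defined for this property
except AttributeError:
    rejected = True
else:
    rejected = False
print(rejected)
```

Because every caller receives a reference to the same Metrics instance, updates made in one place are visible everywhere, while the read-only property keeps the counter from being overwritten directly.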
ctypes
Okay, so maybe there are pointers in Python, specifically CPython. Using the builtin ctypes module, you can create real C-style pointers in Python. If you are unfamiliar with ctypes, then you can take a look at Extending Python With C Libraries and the “ctypes” Module.
The real reason you would use this is if you needed to make a function call to a C library that requires a pointer. Let’s go back to the add_one() C-function from before:
    void add_one(int *x) {
        *x += 1;
    }
Here again, this code is incrementing the value of x by one. To use this, first compile it into a shared object. Assuming the above file is stored in add.c, you could accomplish this with gcc:
    $ gcc -c -Wall -Werror -fpic add.c
    $ gcc -shared -o libadd1.so add.o
The first command compiles the C source file into an object called add.o. The second command takes that unlinked object file and produces a shared object called libadd1.so.
libadd1.so should be in your current directory. You can load it into Python using ctypes:
    >>> import ctypes
    >>> add_lib = ctypes.CDLL("./libadd1.so")
    >>> add_lib.add_one
    <_FuncPtr object at 0x7f9f3b8852a0>
The ctypes.CDLL code returns an object that represents the libadd1 shared object. Because you defined add_one() in this shared object, you can access it as if it were any other Python object. Before you call the function though, you should specify the function signature. This helps Python ensure that you pass the right type to the function.
In this case, the function signature is a pointer to an integer. ctypes will allow you to specify this using the following code:
    >>> add_one = add_lib.add_one
    >>> add_one.argtypes = [ctypes.POINTER(ctypes.c_int)]
In this code, you’re setting the function signature to match what C is expecting. Now, if you were to try to call this code with the wrong type, then you would get a clear error instead of undefined behavior:
    >>> add_one(1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ctypes.ArgumentError: argument 1: <class 'TypeError'>: expected LP_c_int instance instead of int
Python throws an error, explaining that add_one() wants a pointer instead of just an integer. Luckily, ctypes has a way to pass pointers to these functions. First, declare a C-style integer:
    >>> x = ctypes.c_int()
    >>> x
    c_int(0)
The above code creates a C-style integer x with a value of 0. ctypes provides the handy byref() to allow passing a variable by reference.
Note: The term by reference is opposed to passing a variable by value.
When passing by reference, you’re passing the reference to the original variable, and thus modifications will be reflected in the original variable. Passing by value results in a copy of the original variable, and modifications are not reflected in the original.
For more information on passing by reference in Python, check out Pass by Reference in Python: Background and Best Practices.
You can use this to call add_one():
    >>> add_one(ctypes.byref(x))
    998793640
    >>> x
    c_int(1)
Nice! Your integer was incremented by one. The number echoed after the call is not meaningful: add_one() returns void, but ctypes assumes an int return type unless you set restype. Congratulations, you have successfully used real pointers in Python.
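If you don’t have a compiler handy, ctypes pointers can still be explored in pure Python. The sketch below uses ctypes.pointer() and the contents attribute instead of the compiled add_one() function, so it shows the pointer mechanics only and is not a replacement for the library call above.

```python
import ctypes

x = ctypes.c_int(2337)
ptr = ctypes.pointer(x)        # a pointer object of type POINTER(c_int)

ptr.contents.value += 1        # write through the pointer
print(x.value)                 # the original c_int sees the change

same_type = isinstance(ptr, ctypes.POINTER(ctypes.c_int))
print(same_type)
```

Here ptr plays the role that &y played in the C example: the write goes through the pointer and lands in the memory that x owns.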
You now have a better understanding of the intersection between Python objects and pointers. Even though some of the distinctions between names and variables seem pedantic, fundamentally understanding these key terms expands your understanding of how Python handles variables.
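You’ve also learned some excellent ways to simulate pointers in Python:
- Using mutable objects as pointers
- Creating custom Python objects
- Using real pointers with the ctypes module
These methods allow you to simulate pointers in Python without sacrificing the memory safety that Python provides.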
Thanks for reading. If you still have questions, feel free to reach out either in the comments section or on Twitter.
About Logan Jones
Hi, I'm Logan, an open source contributor, writer for Real Python, software developer, and always trying to get better. Feel free to reach out and let's get better together!