Primer on Python for R Users

Source: vignettes/python_primer.Rmd


library(reticulate)


You may find yourself wanting to read and understand some Python, or even port some Python to R. This guide is designed to enable you to do these tasks as quickly as possible. As you’ll see, R and Python are similar enough that this is possible without necessarily learning all of Python. We start with the basics of container types and work up to the mechanics of classes, dunders, the iterator protocol, the context protocol, and more!

Whitespace

Whitespace matters in Python. In R, expressions are grouped into a code block with {}. In Python, that is done by making the expressions share an indentation level. For example, an expression with an R code block might be:

if (TRUE) {
  cat("This is one expression. \n")
  cat("This is another expression. \n")
}
#> This is one expression.
#> This is another expression.

The equivalent in Python:

if True:
    print("This is one expression.")
    print("This is another expression.")
#> This is one expression.
#> This is another expression.

Python accepts tabs or spaces as the indentation spacer, but the rules get tricky when they’re mixed. Most style guides suggest (and IDEs default to) using spaces only.

Container Types

In R, the list() is a container you can use to organize R objects. R’s list() is feature packed, and there is no single direct equivalent in Python that supports all the same features. Instead there are (at least) 4 different Python container types you need to be aware of: lists, dictionaries, tuples, and sets.

Lists

Python lists are typically created using bare brackets []. The Python built-in list() function is more of a coercion function, closer in spirit to R’s as.list(). The most important thing to know about Python lists is that they are modified in place. Note in the example below that y reflects the changes made to x, because the underlying list object which both symbols point to is modified in place.

x = [1, 2, 3]
y = x  # `y` and `x` now refer to the same list!
x.append(4)
print("x is", x)
#> x is [1, 2, 3, 4]
print("y is", y)
#> y is [1, 2, 3, 4]

One Python idiom that might be concerning to R users is that of growing lists through the append() method. Growing lists in R is typically slow and best avoided. But because Python’s lists are modified in place (and a full copy of the list is avoided when appending items), it is efficient to grow Python lists in place.

Some syntactic sugar around Python lists you might encounter is the usage of + and * with lists. These are concatenation and replication operators, akin to R’s c() and rep().

x = [1]
x
#> [1]
x + x
#> [1, 1]
x * 3
#> [1, 1, 1]

You can index into lists with integers using trailing [], but note that indexing is 0-based.

x = [1, 2, 3]
x[0]
#> 1
x[1]
#> 2
x[2]
#> 3
try:
    x[3]
except Exception as e:
    print(e)
#> list index out of range

When indexing, negative numbers count from the end of the container.

x = [1, 2, 3]
x[-1]
#> 3
x[-2]
#> 2
x[-3]
#> 1

You can slice ranges of lists using the : inside brackets. Note that the slice syntax is not inclusive of the end of the slice range. You can optionally also specify a stride.

x = [1, 2, 3, 4, 5, 6]
x[0:2]  # get items at index positions 0, 1
#> [1, 2]
x[1:]  # get items from index position 1 to the end
#> [2, 3, 4, 5, 6]
x[:-2]  # get items from beginning up to the 2nd to last.
#> [1, 2, 3, 4]
x[:]  # get all the items (idiom used to copy the list so as not to modify in place)
#> [1, 2, 3, 4, 5, 6]
x[::2]  # get all the items, with a stride of 2
#> [1, 3, 5]
x[1::2]  # get all the items from index 1 to the end, with a stride of 2
#> [2, 4, 6]
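
The x[:] copy idiom noted in the comments above can be checked directly. The following is a minimal sketch (the names original, alias, and shallow_copy are illustrative, not from the text above) contrasting plain assignment, which shares one list, with a full slice, which creates a new shallow copy.

```python
original = [1, 2, 3]
alias = original           # plain assignment: both names refer to the same list
shallow_copy = original[:]  # full slice: a new (shallow) copy of the list

original.append(4)          # modifies the shared list in place

print(alias)
#> [1, 2, 3, 4]
print(shallow_copy)
#> [1, 2, 3]
print(alias is original, shallow_copy is original)
#> True False
```

Note that the copy is shallow: nested mutable items inside the list would still be shared between the two lists.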

Tuples

Tuples behave like lists, except they are not mutable, and they don’t have the same modify-in-place methods like append(). They are typically constructed using bare (), but parentheses are not strictly required, and you may see an implicit tuple being defined just from a comma separated series of expressions. Because parentheses can also be used to specify order of operations in expressions like (x + 3) * 4, a special syntax is required to define tuples of length 1: a trailing comma. Tuples are most commonly encountered in functions that take a variable number of arguments.

x = (1, 2)  # tuple of length 2
type(x)
#> <class 'tuple'>
len(x)
#> 2
x
#> (1, 2)
x = (1,)  # tuple of length 1
type(x)
#> <class 'tuple'>
len(x)
#> 1
x
#> (1,)
x = ()  # tuple of length 0
print(f"{type(x) = }; {len(x) = }; {x = }")
#> type(x) = <class 'tuple'>; len(x) = 0; x = ()
# example of an interpolated string literal
x = 1, 2  # also a tuple
type(x)
#> <class 'tuple'>
len(x)
#> 2
x = 1,  # beware a single trailing comma! This is a tuple!
type(x)
#> <class 'tuple'>
len(x)
#> 1

Packing and Unpacking

Tuples are the container that powers the packing and unpacking semantics in Python. Python provides the convenience of allowing you to assign multiple symbols in one expression. This is called unpacking.

For example:

x = (1, 2, 3)
a, b, c = x
a
#> 1
b
#> 2
c
#> 3

(You can access similar unpacking behavior from R using zeallot::`%<-%`).

Tuple unpacking can occur in a variety of contexts, such as iteration:

xx = (("a", 1),
      ("b", 2))
for x1, x2 in xx:
    print("x1 = ", x1)
    print("x2 = ", x2)
#> x1 =  a
#> x2 =  1
#> x1 =  b
#> x2 =  2

If you attempt to unpack a container to the wrong number of symbols, Python raises an error:

x = (1, 2, 3)
a, b, c = x  # success
a, b = x  # error, x has too many values to unpack
#> ValueError: too many values to unpack (expected 2)
a, b, c, d = x  # error, x has not enough values to unpack
#> ValueError: not enough values to unpack (expected 4, got 3)

It is possible to unpack a variable number of arguments, using * as a prefix to a symbol. (You’ll see the * prefix again when we talk about functions.)

x = (1, 2, 3)
a, *the_rest = x
a
#> 1
the_rest
#> [2, 3]

You can also unpack nested structures:

x = ((1, 2), (3, 4))
(a, b), (c, d) = x

Dictionaries

Dictionaries are most similar to R environments. They are a container where you can retrieve items by name, though in Python the name (called a key in Python’s parlance) does not need to be a string like in R. It can be any hashable Python object, that is, any object that supports hash() (which covers strings, numbers, and most other objects, though not mutable containers such as lists). They can be created using syntax like {key: value}. Like Python lists, they are modified in place. Note that r_to_py() converts R named lists to dictionaries.

d = {"key1": 1, "key2": 2}
d2 = d
d
#> {'key1': 1, 'key2': 2}
d["key1"]
#> 1
d["key3"] = 3
d2  # modified in place!
#> {'key1': 1, 'key2': 2, 'key3': 3}

Like R environments (and unlike R’s named lists), you cannot index into a dictionary with an integer to get an item at a specific index position. Dictionaries are best thought of as unordered containers. (However, beginning with Python 3.7, dictionaries do preserve the item insertion order.)

d = {"key1": 1, "key2": 2}
d[1]  # error
#> KeyError: 1

The container that most closely matches the semantics of R’s named list is the OrderedDict, but that’s relatively uncommon in Python code so we don’t cover it further.
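
To make the ordering behavior concrete, here is a small sketch, assuming Python 3.7 or later. The variable names are illustrative. It shows that keys come back in insertion order, and that positional access is still possible if you first materialize the items into a list.

```python
d = {"b": 1, "a": 2}
d["c"] = 3  # new keys are appended to the end of the insertion order

keys = list(d)  # iterating a dict yields its keys, in insertion order
print(keys)
#> ['b', 'a', 'c']

# There is no d[0]; to get "the first item", materialize the items first
first_key, first_value = list(d.items())[0]
print(first_key, first_value)
#> b 1
```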

Sets

Sets are a container that can be used to efficiently track unique items or deduplicate lists. They are constructed using {val1, val2} (like a dictionary, but without :). Think of them as a dictionary where you only use the keys. Sets have many efficient methods for membership operations, like intersection(), issubset(), union() and so on.

s = {1, 2, 3}
type(s)
#> <class 'set'>
s
#> {1, 2, 3}
s.add(1)
s
#> {1, 2, 3}
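
As a brief illustration of the membership methods named above, the following sketch deduplicates a list and then uses intersection(), union(), and issubset(). The variable names are made up for the example, and sorted() is used when printing because sets carry no guaranteed display order.

```python
values = [3, 1, 3, 2, 1]
unique = set(values)  # deduplicate a list by converting it to a set

a = {1, 2, 3}
b = {3, 4}
common = a.intersection(b)    # items present in both sets
combined = a.union(b)         # items present in either set
is_subset = {1, 2}.issubset(a)
has_two = 2 in a              # fast membership test

print(sorted(unique))
#> [1, 2, 3]
print(sorted(common), sorted(combined))
#> [3] [1, 2, 3, 4]
print(is_subset, has_two)
#> True True
```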

Iteration with `for`

The for statement in Python can be used to iterate over any kind of container.

for x in [1, 2, 3]:
    print(x)
#> 1
#> 2
#> 3

R has a relatively limited set of objects that can be passed to for. Python, by comparison, provides an iterator protocol interface, which means that authors can define custom objects, with custom behavior that is invoked by for. (We’ll have an example for how to define a custom iterable when we get to classes.) You may want to use a Python iterable from R using reticulate, so it’s helpful to peel back the syntactic sugar a little to show what the for statement is doing in Python, and how you can step through it manually.

There are two things that happen: first, an iterator is constructed from the supplied object. Then, the new iterator object is repeatedly called with next() until it is exhausted.

l = [1, 2, 3]
it = iter(l)  # create an iterator object
it
#> <list_iterator object at 0x1402267a0>
# call `next` on the iterator until it is exhausted:
next(it)
#> 1
next(it)
#> 2
next(it)
#> 3
next(it)
#> StopIteration

In R, you can use reticulate to step through an iterator the same way.

library(reticulate)
l <- r_to_py(list(1, 2, 3))
it <- as_iterator(l)
iter_next(it)
#> 1.0
iter_next(it)
#> 2.0
iter_next(it)
#> 3.0
iter_next(it, completed = "StopIteration")
#> [1] "StopIteration"

Iterating over dictionaries first requires understanding if you are iterating over the keys, values, or both. Dictionaries have methods that allow you to specify which.

d = {"key1": 1, "key2": 2}
for key in d:
    print(key)
#> key1
#> key2
for value in d.values():
    print(value)
#> 1
#> 2
for key, value in d.items():
    print(key, ":", value)
#> key1 : 1
#> key2 : 2

Comprehensions

Comprehensions are special syntax that allow you to construct a container like a list or a dict, while also executing a small operation or single expression on each element. You can think of it as special syntax for R’s lapply.

For example:

x = [1, 2, 3]
# a list comprehension built from x, where you add 100 to each element
l = [element + 100 for element in x]
l
#> [101, 102, 103]
# a dict comprehension built from x, where the key is a string.
# Python's str() is like R's as.character()
d = {str(element): element + 100 for element in x}
d
#> {'1': 101, '2': 102, '3': 103}
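
Comprehensions you meet in real code often include a trailing if clause that filters elements, which goes slightly beyond the lapply analogy (it is closer to combining Filter() with lapply() in R). A short sketch, with illustrative variable names, including a set comprehension for completeness:

```python
x = [1, 2, 3, 4, 5, 6]

# keep only the even elements, then square each one
evens_squared = [element ** 2 for element in x if element % 2 == 0]
print(evens_squared)
#> [4, 16, 36]

# braces without `key: value` pairs build a set instead of a dict
remainders = {element % 3 for element in x}
print(sorted(remainders))
#> [0, 1, 2]
```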

Defining Functions with `def`

Python functions are defined with the def statement. The syntax for specifying function arguments and default values is very similar to R.

def my_function(name="World"):
    print("Hello", name)
my_function()
#> Hello World
my_function("Friend")
#> Hello Friend

The equivalent R snippet would be:

my_function <- function(name = "World") {
  cat("Hello", name, "\n")
}
my_function()
#> Hello World
my_function("Friend")
#> Hello Friend

Unlike R functions, the last value in a function is not automatically returned. Python requires an explicit return statement.

def fn():
    1
print(fn())
#> None
def fn():
    return 1
print(fn())
#> 1

(Note for advanced R users: Python has no equivalent of R’s argument “promises”. Function argument default values are evaluated once, when the function is constructed. This can be surprising if you define a Python function with a mutable object as a default argument value, like a Python list!)

def my_func(x=[]):
    x.append("was called")
    print(x)
my_func()
#> ['was called']
my_func()
#> ['was called', 'was called']
my_func()
#> ['was called', 'was called', 'was called']
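
When reading Python code you will often see a None default paired with an `if x is None` check. This is the common workaround for the shared-default behavior shown above. A minimal sketch, reusing the my_func name for illustration (here it returns the list rather than printing it):

```python
def my_func(x=None):
    # `None` is immutable, so it is safe as a default; a fresh list is
    # created inside the body on every call that omits `x`.
    if x is None:
        x = []
    x.append("was called")
    return x

first = my_func()
second = my_func()
print(first)
#> ['was called']
print(second)
#> ['was called']
print(first is second)
#> False
```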

You can also define Python functions that take a variable number of arguments, similar to ... in R. A notable difference is that R’s ... makes no distinction between named and unnamed arguments, but Python does. In Python, prefixing a parameter with a single * captures unnamed arguments, and prefixing it with two (**) captures keyword arguments.

def my_func(*args, **kwargs):
    print("args = ", args)  # args is a tuple
    print("kwargs = ", kwargs)  # kwargs is a dictionary
my_func(1, 2, 3, a=4, b=5, c=6)
#> args =  (1, 2, 3)
#> kwargs =  {'a': 4, 'b': 5, 'c': 6}

Whereas the * and ** in a function definition signature pack arguments, in a function call they unpack arguments. Unpacking arguments in a function call is equivalent to using do.call() in R.

def my_func(a, b, c):
    print(a, b, c)
args = (1, 2, 3)
my_func(*args)
#> 1 2 3
kwargs = {"a": 1, "b": 2, "c": 3}
my_func(**kwargs)
#> 1 2 3

Defining Classes with `class`

One could argue that in R, the preeminent unit of composition for code is the function, and in Python, it’s the class. You can be a very productive R user and never use R6, reference classes, or similar R equivalents to the object-oriented style of Python classes.

In Python, however, understanding the basics of how class objects work is requisite knowledge, because classes are how you organize and find methods in Python. (In contrast to R’s approach, where methods are found by dispatching from a generic.) Fortunately, the basics of classes are accessible.

Don’t be intimidated if this is your first exposure to object oriented programming. We’ll start by building up a simple Python class for demonstration purposes.

class MyClass:
    pass  # `pass` means do nothing.
MyClass
#> <class '__main__.MyClass'>
type(MyClass)
#> <class 'type'>
instance = MyClass()
instance
#> <__main__.MyClass object at 0x14023b260>
type(instance)
#> <class '__main__.MyClass'>

Like the def statement, the class statement binds a new callable symbol, MyClass. First note the strong naming convention: classes are typically CamelCase, and functions are typically snake_case. After defining MyClass, you can interact with it, and see that it has type 'type'. Calling MyClass() creates a new object instance of the class, which has type 'MyClass' (ignore the __main__. prefix for now). The instance prints with its memory address, which is a strong hint that it’s common to be managing many instances of a class, and that the instance is mutable (modified-in-place by default).

In the first example, we defined an empty class, but when we inspect it we see that it already comes with a bunch of attributes (dir() in Python is equivalent to names() in R):

dir(MyClass)
#> ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__']

What are all the underscores?

Python typically indicates that something is special by wrapping the name in double underscores. A special double-underscore-wrapped token is commonly called a “dunder”. “Special” is not a technical term; it just means that the token invokes a Python language feature. Some dunder tokens are merely ways code authors can plug into specific syntactic sugars, others are values provided by the interpreter that would be otherwise hard to acquire, yet others are for extending language interfaces (e.g., the iteration protocol), and finally, a small handful of dunders are truly complicated to understand. Fortunately, as an R user looking to use some Python features through reticulate, you only need to know about a few easy-to-understand dunders.

The most common dunder method you’ll encounter when reading Python code is __init__(). This is a function that is called when the class constructor is called, that is, when a class is instantiated. It is meant to initialize the new class instance. (In very sophisticated code bases, you may also encounter classes where __new__ is defined; this is called before __init__.)

class MyClass:
    print("MyClass's definition body is being evaluated")
    def __init__(self):
        print(self, "is initializing")
#> MyClass's definition body is being evaluated
print("MyClass is finished being created")
#> MyClass is finished being created
instance = MyClass()
#> <__main__.MyClass object at 0x140266330> is initializing
print(instance)
#> <__main__.MyClass object at 0x140266330>
instance2 = MyClass()
#> <__main__.MyClass object at 0x11e3ad490> is initializing
print(instance2)
#> <__main__.MyClass object at 0x11e3ad490>

A few things to note:

The class statement takes a code block that is defined by a common indentation level. The code block has the same exact semantics as any other expression that takes a code block, like if and def. The body of the class is evaluated only once, when the class constructor is first being created. Beware that any objects defined here are shared by all instances of the class!
__init__ is just a normal function, defined with def like any other function. Except it’s inside the class body.
__init__ takes an argument: self. self is the class instance being initialized (note the identical memory address between self and instance). Also note that we didn’t provide self when calling MyClass() to create the class instance; self was spliced into the function call by the interpreter.
__init__ is called each time a new instance is created.

Functions defined inside a class code block are called methods, and the important thing to know about methods is that each time they are called from a class instance, the instance is spliced into the function call as the first argument. This applies to all functions defined in a class, including dunders. (The sole exception is if the function is decorated with something like @classmethod or @staticmethod.)

class MyClass:
    def a_method(self):
        print("MyClass.a_method() was called with", self)
instance = MyClass()
instance.a_method()
#> MyClass.a_method() was called with <__main__.MyClass object at 0x11e3c7f20>
MyClass.a_method()  # error, missing required argument `self`
#> TypeError: MyClass.a_method() missing 1 required positional argument: 'self'
MyClass.a_method(instance)  # identical to instance.a_method()
#> MyClass.a_method() was called with <__main__.MyClass object at 0x11e3c7f20>
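
The @classmethod and @staticmethod exceptions mentioned above can be seen side by side in the following sketch. The method names a_class_method and a_static_method are invented for illustration, and the methods return values rather than printing so the behavior is easy to check.

```python
class MyClass:
    def a_method(self):
        # regular method: the instance is spliced in as `self`
        return ("instance", self)

    @classmethod
    def a_class_method(cls):
        # class method: the class itself is spliced in as `cls`
        return ("class", cls)

    @staticmethod
    def a_static_method():
        # static method: nothing is spliced in
        return "nothing spliced in"

instance = MyClass()
print(instance.a_method()[1] is instance)
#> True
print(instance.a_class_method()[1] is MyClass)
#> True
print(MyClass.a_static_method())
#> nothing spliced in
```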

Other dunders worth knowing about are:

__getitem__: the function invoked when subsetting an instance with [ (equivalent to defining a [ S3 method in R).
__getattr__: the function invoked when subsetting with . (equivalent to defining a $ S3 method in R).
__iter__ and __next__: functions invoked by for.
__call__: invoked when a class instance is called like a function (e.g., instance()).
__bool__: invoked by if and while (equivalent to as.logical() in R, but returning only a scalar, not a vector).
__repr__, __str__: functions invoked for formatting and pretty printing (akin to format(), dput(), and print() methods in R).
__enter__ and __exit__: functions invoked by with.
Many built-in Python functions are just sugar for invoking the dunder. For example: calling repr(x) is identical to x.__repr__(). Other builtins that are just sugar for invoking the dunder are next(), iter(), str(), list(), dict(), bool(), dir(), hash() and more!
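
To see a few of these dunders working together, here is a small sketch of a hypothetical Deck class (the class and its card values are invented for illustration). It defines __getitem__, __bool__, __repr__, and __call__, then triggers each one through ordinary syntax.

```python
class Deck:
    def __init__(self, cards):
        self.cards = list(cards)

    def __getitem__(self, index):
        # invoked by deck[index]
        return self.cards[index]

    def __bool__(self):
        # invoked by `if deck:` and bool(deck)
        return len(self.cards) > 0

    def __repr__(self):
        # invoked by repr(deck), and when printing at the REPL
        return f"Deck({self.cards!r})"

    def __call__(self, n):
        # invoked by deck(n)
        return self.cards[:n]

deck = Deck(["ace", "king", "queen"])
print(deck[0])
#> ace
print(bool(deck), bool(Deck([])))
#> True False
print(repr(deck))
#> Deck(['ace', 'king', 'queen'])
print(deck(2))
#> ['ace', 'king']
```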

Iterators, revisited

Now that we have the basics of class, it’s time to revisit iterators. First, some terminology:

iterable: something that can be iterated over. Concretely, a class that defines an __iter__ method, whose job is to return an iterator.

iterator: something that iterates. Concretely, a class that defines a __next__ method, whose job is to return the next element each time it is called, and then raises a StopIteration exception once it’s exhausted.

It’s common to see classes that are both iterables and iterators, where the __iter__ method is just a stub that returns self.

Here is a custom iterable / iterator implementation of Python’s range (similar to seq in R):

class MyRange:
    def __init__(self, start, end):
        self.start = start
        self.end = end
    def __iter__(self):
        # reset our counter.
        self._index = self.start - 1
        return self
    def __next__(self):
        if self._index < self.end:
            self._index += 1  # increment
            return self._index
        else:
            raise StopIteration
for x in MyRange(1, 3):
    print(x)
#> 1
#> 2
#> 3
# doing what `for` does, but manually
r = MyRange(1, 3)
it = iter(r)
next(it)
#> 1
next(it)
#> 2
next(it)
#> 3
next(it)
#> StopIteration

Defining Generators with `yield`

Generators are special Python functions that contain one or more yield statements. As soon as yield is included in a code block passed to def, the semantics change substantially. You’re no longer defining a mere function, but a generator constructor! In turn, calling a generator constructor creates a generator object, which is just another type of iterator.

Here is an example:

def my_generator_constructor():
    yield 1
    yield 2
    yield 3
# At first glance it presents like a regular function
my_generator_constructor
#> <function my_generator_constructor at 0x1402579c0>
type(my_generator_constructor)
#> <class 'function'>
# But calling it returns something special, a 'generator object'
my_generator = my_generator_constructor()
my_generator
#> <generator object my_generator_constructor at 0x11e3ff530>
type(my_generator)
#> <class 'generator'>
# The generator object is both an iterable and an iterator
# its __iter__ method is just a stub that returns `self`
iter(my_generator) == my_generator == my_generator.__iter__()
#> True
# step through it like any other iterator
next(my_generator)
#> 1
my_generator.__next__()  # next() is just sugar for calling the dunder
#> 2
next(my_generator)
#> 3
next(my_generator)
#> StopIteration

Encountering yield is like hitting the pause button on a function’s execution: it preserves the state of everything in the function body and returns control to whatever is iterating over the generator object. Calling next() on the generator object resumes execution of the function body until the next yield is encountered, or the function finishes.
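
The pause-and-resume behavior can be made visible by recording when each part of the function body runs. The sketch below uses a hypothetical events list and a generator named paused_generator, both invented for illustration. It also passes a default to next(), so that exhaustion returns a value instead of raising StopIteration.

```python
events = []

def paused_generator():
    events.append("started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("finished")

gen = paused_generator()
events_before = list(events)  # creating the generator runs none of the body

first = next(gen)             # runs up to the first `yield`, then pauses
events_after_first = list(events)

second = next(gen)            # resumes, runs up to the second `yield`
leftover = next(gen, "exhausted")  # body finishes; the default is returned

print(events_before)
#> []
print(first, events_after_first)
#> 1 ['started']
print(second, leftover)
#> 2 exhausted
print(events)
#> ['started', 'resumed after first yield', 'finished']
```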

Iteration closing remarks

Iteration is deeply baked into the Python language, and R users may be surprised by how many things in Python are iterable, iterators, or powered by the iterator protocol under the hood. For example, the built-in map() (equivalent to R’s lapply()) yields an iterator, not a list. Similarly, a generator expression like (elem for elem in x), which looks like a tuple comprehension, produces an iterator. Most features dealing with files are iterators, and so on.

Any time you find an iterator inconvenient, you can materialize all the elements into a list using the Python built-in list(), or reticulate::iterate() in R. Also, if you like the readability of for, you can utilize similar semantics to Python’s for using coro::loop().
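
One practical consequence worth seeing once: an iterator can only be consumed a single time. The following sketch (variable names are illustrative) materializes the result of map() with list(), and shows that a second pass over the same iterator comes back empty.

```python
x = [1, 2, 3]

doubled = map(lambda element: element * 2, x)
print(isinstance(doubled, list))
#> False

first_pass = list(doubled)   # materialize all the elements into a list
second_pass = list(doubled)  # the iterator is now exhausted
print(first_pass, second_pass)
#> [2, 4, 6] []

# a parenthesized comprehension is a generator, not a tuple
gen = (element + 100 for element in x)
print(type(gen).__name__, list(gen))
#> generator [101, 102, 103]
```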

`import` and Modules

In R, authors can bundle their code into shareable extensions called R packages, and R users can access objects from R packages via library() or ::. In Python, authors bundle code into modules, and users access modules using import. Consider the line:

import numpy

This statement has Python go out to the file system, find an installed Python module named ‘numpy’, load it (commonly meaning: evaluate its __init__.py file and construct a module type), and bind it to the symbol numpy.

The closest equivalent to this in R might be:

dplyr <- loadNamespace("dplyr")

Where are modules found?

In Python, the file system locations where modules are searched can be accessed (and modified) from the list found at sys.path. This is Python’s equivalent to R’s .libPaths(). sys.path will typically contain paths to the current working directory, the Python installation which contains the built-in standard library, administrator installed modules, user installed modules, values from environment variables like PYTHONPATH, and any modifications made directly to sys.path by other code in the current Python session (though this is relatively uncommon in practice).

import sys
sys.path
#> ['', '/Users/tomasz/.pyenv/versions/3.12.4/bin', '/Users/tomasz/.pyenv/versions/3.12.4/lib/python312.zip', '/Users/tomasz/.pyenv/versions/3.12.4/lib/python3.12', '/Users/tomasz/.pyenv/versions/3.12.4/lib/python3.12/lib-dynload', '/Users/tomasz/.virtualenvs/r-reticulate/lib/python3.12/site-packages', '/Users/tomasz/github/rstudio/reticulate/inst/python', '/Users/tomasz/.virtualenvs/r-reticulate/lib/python312.zip', '/Users/tomasz/.virtualenvs/r-reticulate/lib/python3.12', '/Users/tomasz/.virtualenvs/r-reticulate/lib/python3.12/lib-dynload']

You can inspect where a module was loaded from by accessing the dunder __path__ or __file__ (especially useful when troubleshooting installation issues):

import os
os.__file__
#> '/Users/tomasz/.virtualenvs/r-reticulate/lib/python3.12/os.py'
numpy.__path__
#> ['/Users/tomasz/.virtualenvs/r-reticulate/lib/python3.12/site-packages/numpy']

Once a module is loaded, you can access symbols from the module using . (equivalent to ::, or maybe $.environment, in R).

numpy.abs(-1)
#> 1

There is also special syntax for specifying the symbol a module is bound to upon import, and for importing only some specific symbols.

import numpy  # import
import numpy as np  # import and bind to a custom symbol `np`
np is numpy  # test for identicalness, similar to identical(np, numpy)
#> True
from numpy import abs  # import only `numpy.abs`, bind it to `abs`
abs is numpy.abs
#> True
from numpy import abs as abs2  # import only `numpy.abs`, bind it to `abs2`
abs2 is numpy.abs
#> True

If you’re looking for the Python equivalent of R’s library(), which makes all of a package’s exported symbols available, it might be using import with a * wildcard, though it’s relatively uncommon to do so. The * wildcard will expand to include all the symbols in the module, or all the symbols listed in __all__, if it is defined.

from numpy import *

Python doesn’t make a distinction like R does between package exported and internal symbols. In Python, all module symbols are equal, though there is the naming convention that intended-to-be-internal symbols are prefixed with a single leading underscore. (Two leading underscores invoke an advanced language feature called “name mangling”, which is outside the scope of this introduction.)

Integers and Floats

R users generally don’t need to be aware of the difference between integers and floating point numbers, but that’s not the case in Python. If this is your first exposure to numeric data types, here are the essentials:

integer types can only represent whole numbers like 1 or 2, not floating point numbers like 1.2.
floating-point types can represent any number, but with some degree of imprecision.

In R, writing a bare literal number like 12 produces a floating point type, whereas in Python, it produces an integer. You can produce an integer literal in R by appending an L, as in 12L. Many Python functions expect integers, and will error when provided a float.

For example, say we have a Python function that expects an integer:

def a_strict_Python_function(x):
    assert isinstance(x, int), "x is not an int"
    print("Yay! x was an int")

When calling it from R, you must be sure to call it with an integer:

library(reticulate)
py$a_strict_Python_function(3)  # error
#> x is not an int
py$a_strict_Python_function(3L)  # success
#> Yay! x was an int
py$a_strict_Python_function(as.integer(3))  # success
#> Yay! x was an int
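
The same integer/float distinction can be explored from the Python side. The sketch below (variable names are illustrative) shows how literals are typed, how / and // differ, an instance of floating-point imprecision, and a built-in, range(), that rejects a float even when it holds a whole number.

```python
whole = 12    # a bare literal without a decimal point is an int
real = 12.0   # a decimal point makes it a float
print(type(whole).__name__, type(real).__name__)
#> int float

true_division = 7 / 2    # `/` always returns a float
floor_division = 7 // 2  # `//` on two ints returns an int
print(true_division, floor_division)
#> 3.5 3

# floats carry some imprecision
sums_exactly = (0.1 + 0.2 == 0.3)
print(sums_exactly)
#> False

# some built-ins refuse floats outright, even whole-valued ones
try:
    range(3.0)
    range_error = None
except TypeError as e:
    range_error = str(e)

converted = int(3.0)  # explicit conversion, like as.integer() in R
print(converted, type(converted).__name__)
#> 3 int
```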

What about R vectors?

R is a language designed for numerical computing first. Numeric vector data types are baked deep into the R language, to the point that the language doesn’t even distinguish scalars from vectors. By comparison, numerical computing capabilities in Python are generally provided by third party packages (modules, in Python parlance).

In Python, the numpy module is most commonly used to handle contiguous arrays of data. The closest equivalent to an R numeric vector is a numpy array, or sometimes, a list of scalar numbers (some Pythonistas might argue for array.array() here, but that’s so rarely encountered in actual Python code we don’t mention it further).

Teaching the NumPy interface is beyond the scope of this primer, but it’s worth pointing out some potential tripping hazards for users accustomed to R arrays:

When indexing into multidimensional numpy arrays, trailing dimensions can be omitted and are implicitly treated as missing. The consequence is that iterating over arrays means iterating over the first dimension. For example, this iterates over the rows of a matrix.

import numpy as np
m = np.arange(12).reshape((3, 4))
m
#> array([[ 0,  1,  2,  3],
#>        [ 4,  5,  6,  7],
#>        [ 8,  9, 10, 11]])
m[0, :]  # first row
#> array([0, 1, 2, 3])
m[0]  # also first row
#> array([0, 1, 2, 3])
for row in m:
    print(row)
#> [0 1 2 3]
#> [4 5 6 7]
#> [ 8  9 10 11]

Many numpy operations modify the array in place! This is surprising to R users, who are used to the convenience and safety of R’s copy-on-modify semantics. Unfortunately, there is no simple scheme or naming convention you can rely on to quickly determine if a particular method modifies in-place or creates a new array copy. The only reliable way is to consult the documentation, and conduct small experiments at the reticulate::repl_python().

Decorators

Decorators are just functions that take a function as an argument, and then typically return another function. Any function can be invoked as a decorator with the @ syntax, which is just sugar for this simple action:

def my_decorator(func):
    func.x = "a decorator modified this function by adding an attribute `x`"
    return func
def my_function():
    pass
my_function = my_decorator(my_function)
# @ is just fancy syntax for the above two lines
@my_decorator
def my_function():
    pass

One decorator you might encounter frequently is:

@property, which automatically calls a class method when the attribute is accessed (similar to makeActiveBinding() in R).
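
As a sketch of how @property reads in practice, consider a hypothetical Circle class (invented for illustration) whose area is computed on access. Note the absence of parentheses when reading c.area.

```python
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def area(self):
        # recomputed on every attribute access
        return math.pi * self.radius ** 2

c = Circle(1.0)
area_before = c.area  # no parentheses: looks like a plain attribute
c.radius = 2.0
area_after = c.area   # reflects the updated radius

print(round(area_before, 4), round(area_after, 4))
#> 3.1416 12.5664
```

Because no setter is defined in this sketch, assigning to c.area would raise an AttributeError.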

`with` and context management

Any object that defines __enter__ and __exit__ methods implements the “context” protocol, and can be passed to with. For example, here is a custom implementation of a context manager that temporarily changes the current working directory (equivalent to R’s withr::with_dir()):

from os import getcwd, chdir
class wd_context:
    def __init__(self, wd):
        self.new_wd = wd
    def __enter__(self):
        self.original_wd = getcwd()
        chdir(self.new_wd)
    def __exit__(self, *args):
        # __exit__ takes some additional arguments that are commonly ignored
        chdir(self.original_wd)
getcwd()
#> '/Users/tomasz/github/rstudio/reticulate/vignettes'
with wd_context(".."):
    print("in the context, wd is:", getcwd())
#> in the context, wd is: /Users/tomasz/github/rstudio/reticulate
getcwd()
#> '/Users/tomasz/github/rstudio/reticulate/vignettes'
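
Two details of the protocol are easy to miss when reading code: the value returned by __enter__ is what `as` binds, and __exit__ runs even when the body raises. The sketch below uses a hypothetical Tracker class (invented for illustration) that records the order of events; its __exit__ spells out the three extra arguments that the example above collects with *args.

```python
class Tracker:
    def __init__(self):
        self.log = []

    def __enter__(self):
        self.log.append("enter")
        return self  # this is the object bound by `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # called on normal exit and on error alike
        self.log.append("exit")
        return False  # a falsy return lets any exception propagate

tracker = Tracker()
try:
    with tracker as t:
        t.log.append("body")
        raise ValueError("boom")
except ValueError:
    tracker.log.append("error propagated")

print(tracker.log)
#> ['enter', 'body', 'exit', 'error propagated']
```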

Learning More

Hopefully, this short primer to Python has provided a good foundation for confidently reading Python documentation and code, and using Python modules from R via reticulate. Of course, there is much, much more to learn about Python. Googling questions about Python reliably brings up pages of results, but not always sorted in order of most useful. Blog posts and tutorials targeting beginners can be valuable, but remember that Python’s official documentation is generally excellent, and it should be your first destination when you have questions.

https://docs.Python.org/3/

https://docs.Python.org/3/library/index.html

To learn Python more fully, the built-in official tutorial is also excellent and comprehensive (but does require a time commitment to get value out of it): https://docs.Python.org/3/tutorial/index.html

Finally, don’t forget to solidify your understanding by conducting small experiments at the reticulate::repl_python().

Thank you for reading!
