Python call by ref call by value using ctypes

Question:

I am trying to write a program to illustrate to A level students the difference between call by reference and call by value using Python. I had succeeded by passing mutable objects as variables to functions, but found I could also do the same using the ctypes library.

I don’t quite understand how it works because there is a function byref() in the ctypes library, but it didn’t work in my example. However, calling the function without byref() did work!

My working code:

"""
Program to illustrate call by ref
"""

from  ctypes import *  #allows call by ref

test = c_int(56)  #Python call by reference eg address
t = 67            #Python call by value eg copy


#expects a ctypes argument
def byRefExample(x):
    x.value= x.value + 2
    

#expects a normal Python variable
def byValueExample(x):
    x = x + 2
         

if __name__ == "__main__":

    print "Before call test is",test
    byRefExample(test)                
    print "After call test is",test

    print "Before call t is",t
    byValueExample(t)
    print "After call t is",t

Question

When passing a normal Python variable to byValueExample() it works as expected. The copy of the function argument t changes but the variable t in the header does not. However, when I pass the ctypes variable test both the local and the header variable change, thus it is acting like a C pointer variable. Although my program works, I am not sure how and why the byref() function doesn’t work when used like this:

byRefExample(byref(test))
Asked By: Timothy Lawman


Answers:

You’re actually using terminology that’s not exactly correct, and potentially very misleading. I’ll explain at the end. But first I’ll answer in terms of your wording.


I had succeeded by passing mutable objects as variables to functions but found I could also do the same using the ctypes library.

That’s because those ctypes objects are mutable objects, so you’re just doing the same thing you already did. In particular, a ctypes.c_int is a mutable object holding an integer value, which you can mutate by setting its value member. So you’re already doing the exact same thing you’d done without ctypes.

In more detail, compare these:

def by_ref_using_list(x):
    x[0] += 1
value = [10]
by_ref_using_list(value)
print(value[0])

def by_ref_using_dict(x):
    x['value'] += 1
value = {'value': 10}
by_ref_using_dict(value)
print(value['value'])

class ValueHolder(object):
    def __init__(self, value):
        self.value = value
def by_ref_using_int_holder(x):
    x.value += 1
value = ValueHolder(10)
by_ref_using_int_holder(value)
print(value.value)

You’d expect all three of those to print out 11, because they’re just three different ways of passing different kinds of mutable objects and mutating them.

And that’s exactly what you’re doing with c_int.
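To make the parallel concrete, here is a minimal sketch of the same pattern with c_int. The function name by_ref_using_c_int is made up for this illustration; only c_int and its value attribute come from ctypes.

```python
from ctypes import c_int

# Hypothetical helper name, chosen to match the three examples above.
def by_ref_using_c_int(x):
    # Mutates the object the caller passed in; no pointer is involved.
    x.value += 1

value = c_int(10)
by_ref_using_c_int(value)
print(value.value)  # 11
```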

You may want to read the FAQ “How do I write a function with output parameters (call by reference)?”, although it seems like you already know the answers there, and just wanted to know how ctypes fits in…


So, what is byref even for, then?

It’s used for calling a C function that takes values by reference C-style: by using explicit pointer types. For example:

void by_ref_in_c(int *x) {
    *x += 1;
}

You can’t pass this a c_int object, because it needs a pointer to a c_int. And you can’t pass it an uninitialized POINTER(c_int), because then it’s just going to be writing to random memory. You need to get the pointer to an actual c_int. Which you can do like this:

x = c_int(10)
xp = pointer(x)
by_ref_in_c(xp)
print(x)

That works just fine. But it’s overkill, because you’ve created an extra Python ctypes object, xp, that you don’t really need for anything. And that’s what byref is for: it gives you a lightweight pointer to an object, that can only be used for passing that object by reference:

x = c_int(10)
by_ref_in_c(byref(x))
print(x)
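If you want to try byref without compiling any C, one option is to fake the C side with a ctypes callback. The sketch below assumes that a CFUNCTYPE prototype wrapping a Python function is an acceptable stand-in for by_ref_in_c; a real program would load the function from a shared library instead.

```python
from ctypes import CFUNCTYPE, POINTER, byref, c_int

# Stand-in for the C function: void by_ref_in_c(int *x).
# Assumption for demonstration only; this is not loaded from a real C library.
@CFUNCTYPE(None, POINTER(c_int))
def by_ref_in_c(xp):
    # xp arrives as a POINTER(c_int); xp[0] is the int it points to.
    xp[0] += 1

x = c_int(10)
by_ref_in_c(byref(x))   # pass a lightweight pointer to x
print(x.value)  # 11
```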

And that explains why this doesn’t work:

byRefExample(byref(test))

That call is making a lightweight pointer to test, and passing that pointer to byRefExample. But byRefExample doesn’t want a pointer to a c_int, it wants a c_int.

Of course this is all in Python, not C, so there’s no static type checking going on. The function call works just fine, and your code doesn’t care what type it gets, so long as it has a value member that you can increment. But the object byref returns doesn’t have a value member. (A full pointer made with pointer() has a contents member instead; the lightweight byref object doesn’t even have that, because it’s only meant to be handed to a foreign function.) So, you get an AttributeError trying to access x.value.
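You can check this yourself in a few lines. This is a sketch against CPython’s ctypes; the variable names ref and ptr are just for illustration, and the exact wording of the error message may differ between versions.

```python
from ctypes import byref, c_int, pointer

test = c_int(56)

ref = byref(test)
try:
    ref.value += 2          # what byRefExample(byref(test)) ends up doing
except AttributeError as exc:
    print("byref object:", exc)

ptr = pointer(test)
ptr.contents.value += 2     # a full pointer reaches the c_int via contents
print(test.value)  # 58
```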


So, how do you do this kind of thing?

Well, using a single-element-list is a well-known hack to get around the fact that you need to share something mutable but you only have something immutable. If you use it, experienced Python programmers will know what you’re up to.

That being said, if you think you need this, you’re usually wrong. Often the right answer is to just return the new value. It’s easier to reason about functions that don’t mutate anything. You can string them together in any way you want, turn them inside-out with generators and iterators, ship them off to child processes to take advantage of those extra cores in your CPU, etc. And even if you don’t do any of that stuff, it’s usually faster to return a new value than to modify one in-place, even in cases where you wouldn’t expect that (e.g., deleting 75% of the values in a list).
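Applied to the example in the question, returning the new value might look like the sketch below. The name add_two is hypothetical; the point is only that the caller rebinds its own name to the result.

```python
def add_two(x):
    # Returns a new value instead of trying to change the caller's object.
    return x + 2

t = 67
t = add_two(t)   # the caller decides what to rebind
print(t)  # 69
```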

And often, when you really do need mutable values, there’s already an obvious place for them to live, such as instance attributes of a class.

But sometimes you do need the single-element list hack, so it’s worth having in your repertoire; just don’t use it when you don’t need it.


So, what’s wrong with your terminology?

In a sense (the sense Ruby and Lisp programmers use), everything in Python is pass-by-reference. In another sense (the sense many Java and VB programmers use), it’s all pass-by-value. But really, it’s best to not call it either.* What you’re passing is neither a copy of the value of a variable, nor a reference to a variable, but a reference to a value. When you call that byValueExample(t) function, you’re not passing a new integer with the value 67 the way you would in C, you’re passing a reference to the same integer 67 that’s bound to the name t. If you could mutate 67 (you can’t, because ints are immutable), the caller would see the change.
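A quick way to see that no copy is made is to compare object identities on both sides of the call. This is a minimal sketch; id_seen_by_callee is a made-up helper, and id() is used only as an identity check.

```python
def id_seen_by_callee(x):
    # Hypothetical helper: reports which object the parameter is bound to.
    return id(x)

t = 67
items = [1, 2, 3]

# In both cases the function receives the very same object, not a copy.
print(id_seen_by_callee(t) == id(t))          # True
print(id_seen_by_callee(items) == id(items))  # True
```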

Second, Python names are not even variables in the sense you’re thinking of. In C, a variable is an lvalue. It has a type and, more importantly, an address. So, you can pass around a reference to the variable itself, rather than to its value. In Python, a name is just a name (usually a key in a module, local, or object dictionary). It doesn’t have a type or an address. It’s not a thing you can pass around. So, there is no way to pass the variable x by reference.**

Finally, = in Python isn’t an assignment operator that copies a value to a variable; it’s a binding operator that gives a value a name. So, in C, when you write x = x + 1, that copies the value x + 1 to the location of the variable x, but in Python, when you write x = x + 1, that just rebinds the local variable x to refer to the new value x + 1. That won’t have any effect on whatever value x used to be bound to. (Well, if it was the only reference to that value, the garbage collector might clean it up… but that’s it.)
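The difference between rebinding a name and mutating an object can be sketched with a list. The function names rebind and mutate are illustrative only.

```python
def rebind(items):
    # Builds a new list and rebinds the local name; the caller's list is untouched.
    items = items + [99]
    return items

def mutate(items):
    # Changes the object itself, so every name bound to it sees the change.
    items.append(99)

data = [1, 2]
new = rebind(data)
print(data, new, new is data)  # [1, 2] [1, 2, 99] False

mutate(data)
print(data)  # [1, 2, 99]
```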

This is actually a lot easier to understand if you’re coming from C++, which really forces you to understand rvalues and lvalues and different kinds of references and copy construction vs. copy assignment and so on… In C, it’s all deceptively simple, which makes it harder to realize how very different it is from the equally-simple Python.


* Some people in the Python community like to call it “pass-by-sharing”. Some researchers call it “pass-by-object”. Others choose to first differentiate between value semantics and reference semantics, before describing calling styles, so you can call this “reference-semantics pass-by-copy”. But, while at least those names aren’t ambiguous, they also aren’t very well known, so they’re not likely to help anyone. I think it’s better to describe it than to try to figure out the best name for it…

** Of course, because Python is fully reflective, you can always pass the string x and the context in which it’s found, directly or indirectly… If your byRefExample did globals()['x'] = x + 2, that would affect the global x. But… don’t do that.

Answered By: abarnert

Python uses neither “call-by-reference” nor “call-by-value” but “call-by-object”. Assignment gives names to objects.

test = c_int(56)
t = 67

test is a name given to a ctypes.c_int object that internally has a value name assigned to an int object.

t is a name given to an int object.

When calling byRefExample(test), x is another name given to the ctypes.c_int object referenced by test.

x.value = x.value + 2

The above reassigns the ‘value’ name stored in the ctypes.c_int object to a completely new int object with a different value. Since value is an attribute of the same ctypes.c_int object referred to by the names test and x, x.value and test.value are referring to the same value.

When calling byValueExample(t), x is another name given to the int object referenced by t.

x = x + 2

The above reassigns the name x to a completely new int object with a different value. x and t no longer refer to the same object, so t will not observe the change. It still refers to the original int object.

You can observe this by printing the id() of the objects at different points in time:

from ctypes import *

test = c_int(56)
t = 67

print('test id =',id(test))
print('t    id =',id(t))

#expects a ctypes argument
def byRefExample(x):
    print('ByRef x',x,id(x))
    print('ByRef x.value',x.value,id(x.value))
    x.value = x.value + 2
    print('ByRef x.value',x.value,id(x.value))
    print('ByRef x',x,id(x))

#expects a normal Python variable
def byValueExample(x):
    print('ByVal x',x,id(x))
    x = x + 2
    print('ByVal x',x,id(x))

print("Before call test is",test,id(test))
print("Before call test.value is",test.value,id(test.value))
byRefExample(test)                
print("After call test.value is",test.value,id(test.value))
print("After call test is",test,id(test))

print("Before call t is",t,id(t))
byValueExample(t)
print("After call t is",t,id(t))

Output (with comments):

test id = 80548680
t    id = 507083328
Before call test is c_long(56) 80548680
Before call test.value is 56 507082976
ByRef x c_long(56) 80548680                 # same id as test
ByRef x.value 56 507082976
ByRef x.value 58 507083040                  # x.value is new object!
ByRef x c_long(58) 80548680                 # but x is still the same.
After call test.value is 58 507083040       # test.value sees new object because...
After call test is c_long(58) 80548680      # test is same object as x.
Before call t is 67 507083328
ByVal x 67 507083328                        # same id as t
ByVal x 69 507083392                        # x is new object!
After call t is 67 507083328                # t is still the same old object.
Answered By: Mark Tolonen