What does the underscore suffix in PyTorch functions mean?
Question:
In PyTorch, many methods of a tensor exist in two versions – one with an underscore suffix, and one without. If I try them out, they seem to do the same thing:
In [1]: import torch
In [2]: a = torch.tensor([2, 4, 6])
In [3]: a.add(10)
Out[3]: tensor([12, 14, 16])
In [4]: a.add_(10)
Out[4]: tensor([12, 14, 16])
What is the difference between
torch.add
and torch.add_
torch.sub
and torch.sub_
- …and so on?
Answers:
According to the documentation, methods which end in an underscore change the tensor in-place. That means that no new memory is allocated for the result, which can save memory and sometimes improve performance, but in-place operations can also cause problems with autograd and, in some cases, even worse performance in PyTorch.
In [2]: a = torch.tensor([2, 4, 6])
tensor.add():
In [3]: b = a.add(10)
In [4]: a is b
Out[4]: False # b is a new tensor, new memory was allocated
tensor.add_():
In [3]: b = a.add_(10)
In [4]: a is b
Out[4]: True # Same object, no new memory was allocated
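The calls in the question only look identical because each one happened to print the same numbers. A quick sketch (plain PyTorch, no extra assumptions) makes the mutation itself visible:

```python
import torch

a = torch.tensor([2, 4, 6])
b = a.add(10)    # out-of-place: result goes into a new tensor
print(a)         # tensor([2, 4, 6])   -- a is unchanged
print(b)         # tensor([12, 14, 16])

a.add_(10)       # in-place: a itself is modified
print(a)         # tensor([12, 14, 16])
```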
Notice that the operators + and += are also two different implementations: + creates a new tensor using .add(), while += modifies the tensor in place using .add_():
In [2]: a = torch.tensor([2, 4, 6])
In [3]: id(a)
Out[3]: 140250660654104
In [4]: a += 10
In [5]: id(a)
Out[5]: 140250660654104 # Still the same object, no memory allocation was required
In [6]: a = a + 10
In [7]: id(a)
Out[7]: 140250649668272 # New object was created
You have already answered your own question: the underscore indicates in-place operations in PyTorch. However, I want to briefly point out why in-place operations can be problematic:
- First of all, the PyTorch documentation recommends avoiding in-place operations in most cases. Unless you are working under heavy memory pressure, it is usually more efficient not to use them: https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd
- Secondly, there can be problems calculating the gradients when using in-place operations:
Every tensor keeps a version counter, that is incremented every time
it is marked dirty in any operation. When a Function saves any tensors
for backward, a version counter of their containing Tensor is saved as
well. Once you access self.saved_tensors
it is checked, and if it is
greater than the saved value an error is raised. This ensures that if
you’re using in-place functions and not seeing any errors, you can be
sure that the computed gradients are correct.
Same source as above.
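The version counter the quote describes can actually be inspected through the `_version` attribute. Note this is an internal attribute, so its exact behavior may change between PyTorch versions – the sketch below is only meant to illustrate the mechanism:

```python
import torch

a = torch.tensor([2.0, 4.0, 6.0])
print(a._version)   # 0 -- freshly created tensor

a.add_(10)          # in-place op marks the tensor dirty
print(a._version)   # counter was incremented to 1

b = a.add(10)       # out-of-place op leaves a's counter alone
print(a._version)   # still 1; b starts with its own counter at 0
```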
Here is a short and slightly modified example taken from the answer you’ve posted:
First the in-place version:
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
b = a.add_(adding_tensor)
c = torch.sum(b)
c.backward()
print(c.grad_fn)
Which leads to this error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-27-c38b252ffe5f> in <module>
2 a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
3 adding_tensor = torch.rand(3)
----> 4 b = a.add_(adding_tensor)
5 c = torch.sum(b)
6 c.backward()
RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.
Secondly the non in-place version:
import torch
a = torch.tensor([2, 4, 6], requires_grad=True, dtype=torch.float)
adding_tensor = torch.rand(3)
b = a.add(adding_tensor)
c = torch.sum(b)
c.backward()
print(c.grad_fn)
Which works just fine – output:
<SumBackward0 object at 0x7f06b27a1da0>
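If you do need to update a leaf tensor that requires grad in place – which is essentially what optimizers do with model parameters – one common way around the error above is to perform the update inside torch.no_grad(). A sketch under that assumption:

```python
import torch

w = torch.tensor([2.0, 4.0, 6.0], requires_grad=True)
loss = (w * w).sum()
loss.backward()             # w.grad is now 2 * w

with torch.no_grad():
    w -= 0.1 * w.grad       # in-place update, not tracked by autograd

print(w)                    # updated in place; still a leaf with requires_grad=True
```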
So as a take-away I just wanted to point out to carefully use in-place operations in PyTorch.
In PyTorch, a method name that ends with an underscore follows a convention indicating that the method will not return a new tensor but will instead modify the tensor in place. For example, scatter_.
https://yuyangyy.medium.com/understand-torch-scatter-b0fd6275331c