What is the difference between torch.tensor and torch.Tensor?
Question:
Since version 0.4.0, it is possible to use both `torch.tensor` and `torch.Tensor`.
What is the difference? What was the reasoning for providing these two very similar and confusing alternatives?
Answers:
According to a discussion on the PyTorch forums, the `torch.Tensor` constructor is overloaded to do the same thing as both `torch.tensor` and `torch.empty`. This overload was thought to make code confusing, so the functionality of `torch.Tensor` was split into `torch.tensor` and `torch.empty`.
So yes, to some extent, `torch.tensor` works similarly to `torch.Tensor` (when you pass in data). No, neither should be more efficient than the other. It’s just that `torch.empty` and `torch.tensor` have a nicer API than the `torch.Tensor` constructor.
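To make the overload concrete, here is a minimal sketch (my own illustration, not taken from the linked discussion) showing the two roles of the `torch.Tensor` constructor next to the functions that replace them. The variable names are arbitrary.

```python
import torch

# Called with integer sizes, the legacy constructor allocates an
# uninitialized tensor of that shape, like torch.empty.
a = torch.Tensor(2, 3)
b = torch.empty(2, 3)
print(a.shape, b.shape)

# Called with a sequence, it copies the data, like torch.tensor
# (except that the result always has the default dtype).
c = torch.Tensor([1.0, 2.0, 3.0])
d = torch.tensor([1.0, 2.0, 3.0])
print(torch.equal(c, d))
```

With the split API, the intent (allocate a shape versus wrap existing data) is visible from the function name instead of depending on the argument types.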
In PyTorch, `torch.Tensor` is the main tensor class, so all tensors are instances of `torch.Tensor`.
When you call `torch.Tensor()` you get an empty tensor without any data.
In contrast, `torch.tensor` is a function which returns a tensor. The documentation says:
torch.tensor(data, dtype=None, device=None, requires_grad=False) → Tensor
Constructs a tensor with data.
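The class-versus-function distinction can be checked directly. This is a small sketch using the standard-library `inspect` module; it is only meant to illustrate the point above.

```python
import inspect

import torch

t = torch.tensor([1, 2, 3])

# torch.Tensor is a class; torch.tensor is a plain callable (a factory function).
print(inspect.isclass(torch.Tensor))
print(inspect.isclass(torch.tensor))
print(callable(torch.tensor))

# Whatever torch.tensor returns is an instance of the torch.Tensor class.
print(isinstance(t, torch.Tensor))
```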
This also explains why you can create an empty instance of `torch.Tensor` without `data` by calling:
tensor_without_data = torch.Tensor()
On the other hand:
tensor_without_data = torch.tensor()
leads to an error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-ebc3ceaa76d2> in <module>()
----> 1 torch.tensor()
TypeError: tensor() missing 1 required positional arguments: "data"
But in general there is no reason to choose `torch.Tensor` over `torch.tensor`. Also `torch.Tensor` lacks a docstring.
Behaviour similar to `torch.Tensor()`, that is, creating a tensor without data, can be achieved using:
torch.tensor(())
Output:
tensor([])
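As a quick check of that equivalence, the sketch below compares the two empty tensors. It assumes the default dtype has not been changed from its initial setting; the names `legacy` and `modern` are just labels for this example.

```python
import torch

legacy = torch.Tensor()     # class constructor, no arguments
modern = torch.tensor(())   # factory function, empty tuple as data

# Both are one-dimensional tensors with zero elements.
print(legacy.shape, modern.shape)
print(legacy.numel(), modern.numel())

# Both use the default dtype, since there is no data to infer a type from.
print(legacy.dtype, modern.dtype)
```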
In addition to the above answers, I noticed that `torch.Tensor()` creates a tensor with the default data type, as defined by `torch.get_default_dtype()`, whereas `torch.tensor()` infers the data type from the data.
For example:
>>> torch.Tensor([1, 2, 3]).dtype
torch.float32
>>> torch.tensor([1, 2, 3]).dtype
torch.int64
https://discuss.pytorch.org/t/difference-between-torch-tensor-and-torch-tensor/30786/2
torch.tensor infers the dtype automatically, while torch.Tensor
returns a torch.FloatTensor. I would recommend to stick to
torch.tensor, which also has arguments like dtype, if you would like
to change the type.
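A slightly longer sketch of the dtype behaviour described above, including the `dtype` argument. It assumes the default dtype is left at its initial setting; the variable names are only for illustration.

```python
import torch

default = torch.get_default_dtype()

ints = torch.tensor([1, 2, 3])            # inferred from Python ints
floats = torch.tensor([1.0, 2.0, 3.0])    # inferred from Python floats
bools = torch.tensor([True, False])       # inferred from Python bools
forced = torch.tensor([1, 2, 3], dtype=torch.float32)  # explicit override
legacy = torch.Tensor([1, 2, 3])          # always the default dtype

for name, t in [("ints", ints), ("floats", floats), ("bools", bools),
                ("forced", forced), ("legacy", legacy)]:
    print(name, t.dtype)
```

Note that `torch.Tensor` silently converts integer input to floating point, which is one more reason to prefer `torch.tensor` when the dtype matters.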
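`torch.Tensor` is a favorite way of creating parameters (for instance in `nn.Linear` and `nn._ConvNd`).
Why? Because it is very fast: it only allocates memory and does not fill it. In the timings below it is on par with `torch.empty()`; the small difference between the two is within measurement noise.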
import torch
torch.set_default_dtype(torch.float32) # default
%timeit torch.empty(1000,1000)
%timeit torch.Tensor(1000,1000)
%timeit torch.ones(1000,1000)
%timeit torch.tensor([[1]*1000]*1000)
Out:
68.4 µs ± 789 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
67.9 µs ± 349 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
1.26 ms ± 8.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
36.1 ms ± 610 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
`torch.Tensor()` and `torch.empty()` are very similar and return a tensor filled with uninitialized data.
Why do we not initialize parameters in `__init__`, even though this is technically possible?
Here is `torch.Tensor` in practice, used inside `nn.Linear` to create the `weight` parameter:
self.weight = nn.Parameter(torch.Tensor(out_features, in_features))
The parameter is left uninitialized by design. Initialization lives in a separate `reset_parameters()` method, because during training the parameters may need to be "reset" again; `reset_parameters()` is called at the end of the `__init__()` method.
Maybe in the future `torch.empty()` will replace `torch.Tensor()`, because the two are the same in effect.
Another nice property of `reset_parameters()` is that you can write your own version and alter the original initialization procedure if needed.
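As an illustration of overriding the initialization, here is a minimal sketch. `ConstantInitLinear` is a hypothetical subclass invented for this example, and the constant value 0.5 is arbitrary; it relies on `nn.Linear.__init__` calling `reset_parameters()` as described above.

```python
import torch
from torch import nn


class ConstantInitLinear(nn.Linear):
    """Hypothetical nn.Linear subclass with a custom initialization."""

    def reset_parameters(self):
        # nn.Linear.__init__ allocates uninitialized weight and bias and then
        # calls reset_parameters(), so this override runs on construction.
        nn.init.constant_(self.weight, 0.5)
        if self.bias is not None:
            nn.init.zeros_(self.bias)


layer = ConstantInitLinear(in_features=4, out_features=2)
print(layer.weight)

# The same method can be called later to re-initialize the layer.
with torch.no_grad():
    layer.weight.add_(1.0)
layer.reset_parameters()
print(layer.weight)
```

Keeping allocation (in `__init__`) separate from initialization (in `reset_parameters()`) is what makes this kind of override a one-method change.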