How to access the network weights while using PyTorch 'nn.Sequential'?
Question:
I’m building a neural network and I don’t know how to access the model weights for each layer.
I’ve tried
model.input_size.weight
Code:
input_size = 784
hidden_sizes = [128, 64]
output_size = 10
# Build a feed-forward network
model = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[1], output_size),
                      nn.Softmax(dim=1))
I expected to get the weights but I got
‘Sequential’ object has no attribute ‘input_size’
Answers:
You can use model[0].weight to display the weights of the first layer (model[0].weight.grad holds its gradients, and is None until a backward pass has run).
As per the official PyTorch discussion forum, you can access the weights of a specific module in nn.Sequential() using
model[0].weight  # weights of the first layer in a bare nn.Sequential()
(If the Sequential is stored as an attribute, e.g. self.layer = nn.Sequential(...), the equivalent is model.layer[0].weight.)
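To make the weight/gradient distinction concrete, here is a minimal sketch (layer sizes taken from the question; the third Linear size is shortened for brevity):

```python
import torch
import torch.nn as nn

# In a bare nn.Sequential the submodules are indexed directly,
# so model[0] is the first nn.Linear and model[0].weight its weights.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

print(model[0].weight.shape)       # torch.Size([128, 784])
print(model[0].weight.grad)        # None -- no backward pass has run yet

# Gradients exist only after backward():
loss = model(torch.randn(2, 784)).sum()
loss.backward()
print(model[0].weight.grad.shape)  # torch.Size([128, 784])
```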
I’ve tried many ways, and one clean way is to name each layer by passing an OrderedDict:
from collections import OrderedDict
model = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(input_size, hidden_sizes[0])),
    ('relu1', nn.ReLU()),
    ('fc2', nn.Linear(hidden_sizes[0], hidden_sizes[1])),
    ('relu2', nn.ReLU()),
    ('output', nn.Linear(hidden_sizes[1], output_size)),
    ('softmax', nn.Softmax(dim=1))]))
So to access the weights of a given layer, we call it by its unique name.
For example, to access the weights of the first layer: model.fc1.weight
Parameter containing:
tensor([[-7.3584e-03, -2.3753e-02, -2.2565e-02, ..., 2.1965e-02,
1.0699e-02, -2.8968e-02],
[ 2.2930e-02, -2.4317e-02, 2.9939e-02, ..., 1.1536e-02,
1.9830e-02, -1.4294e-02],
[ 3.0891e-02, 2.5781e-02, -2.5248e-02, ..., -1.5813e-02,
6.1708e-03, -1.8673e-02],
...,
[-1.2596e-03, -1.2320e-05, 1.9106e-02, ..., 2.1987e-02,
-3.3817e-02, -9.4880e-03],
[ 1.4234e-02, 2.1246e-02, -1.0369e-02, ..., -1.2366e-02,
-4.7024e-04, -2.5259e-02],
[ 7.5356e-03, 3.4400e-02, -1.0673e-02, ..., 2.8880e-02,
-1.0365e-02, -1.2916e-02]], requires_grad=True)
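A short sketch of the named-layer approach, using the question's sizes. Besides attribute access (model.fc1.weight), named_parameters() walks every parameter with its dotted name:

```python
from collections import OrderedDict
import torch.nn as nn

# Named submodules: each layer is addressable by its OrderedDict key.
model = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(784, 128)),
    ('relu1', nn.ReLU()),
    ('fc2', nn.Linear(128, 64)),
    ('relu2', nn.ReLU()),
    ('output', nn.Linear(64, 10)),
    ('softmax', nn.Softmax(dim=1)),
]))

print(tuple(model.fc1.weight.shape))   # (128, 784)

# named_parameters() yields ('fc1.weight', tensor), ('fc1.bias', tensor), ...
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
```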
If you print out the model using print(model), you would get
Sequential(
  (0): Linear(in_features=784, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=64, bias=True)
  (3): ReLU()
  (4): Linear(in_features=64, out_features=10, bias=True)
  (5): Softmax(dim=1)
)
Now you have access to all the layer indices, so you can get the weights of (let’s say) the third Linear layer via model[4].weight (the Linear layers sit at indices 0, 2, and 4).
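A sketch of using those indices programmatically: nn.Sequential is iterable, so you can collect every Linear layer's weights by type rather than hard-coding indices:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10), nn.Softmax(dim=1),
)

# Iterating a Sequential yields its submodules in order;
# keep only the Linear layers and grab their weight tensors.
linear_weights = [m.weight for m in model if isinstance(m, nn.Linear)]
print([tuple(w.shape) for w in linear_weights])
# [(128, 784), (64, 128), (10, 64)]
```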
Let’s say you define the model as a class. Then you can call model.parameters().
# Build a feed-forward network
class FFN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(input_size, hidden_sizes[0])
        self.layer2 = nn.Linear(hidden_sizes[0], hidden_sizes[1])
        self.layer3 = nn.Linear(hidden_sizes[1], output_size)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.relu(self.layer1(x))
        x = self.relu(self.layer2(x))
        x = self.softmax(self.layer3(x))
        return x

model = FFN()
print(model.parameters())
Which will print something like <generator object Module.parameters at 0x7f99886d0d58>, so you can pass it to an optimizer right away!
But if you want to access particular weights or inspect them manually, you can convert the generator to a list: print(list(model.parameters())), which will print out every weight and bias tensor in the model.
And if you only want the last parameter, you can do print(list(model.parameters())[-1]); since parameters are yielded as weight, bias, weight, bias, … in registration order, this prints the bias vector of the final layer: tensor([-0.0347, -0.0289, -0.0652, -0.1233, 0.1093, 0.1187, -0.0407, 0.0885, -0.0045, -0.1238], requires_grad=True)
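A quick sketch of that ordering: the last element of list(model.parameters()) is the final layer's bias, and the second-to-last is its weight matrix:

```python
import torch.nn as nn

# parameters() yields weight, bias, weight, bias, ... in registration
# order, so [-1] is the last bias and [-2] the last weight matrix.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
params = list(model.parameters())

print(tuple(params[-1].shape))   # (10,)      -- last layer's bias
print(tuple(params[-2].shape))   # (10, 128)  -- last layer's weight
```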
For both the sequential model and the class model, you can access the layer weights via the children() method:
for layer in model.children():
    if isinstance(layer, nn.Linear):
        print(layer.state_dict())
This will give you the output like this:
OrderedDict([
    ('weight', tensor([[-0.0039, -0.0045...]])),
    ('bias', tensor([-0.0019, -0.0025...]))
])
Or like this:
for layer in model.children():
    if isinstance(layer, nn.Linear):
        print('weight:', layer.weight)
        print('bias:', layer.bias)
For class-based models, the iteration order matches the order in which the layers are defined in __init__.
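A runnable sketch of the children() approach: it yields only the direct submodules, and each Linear layer's state_dict() maps 'weight' and 'bias' to its tensors.

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

# children() walks direct submodules in order; filter for Linear layers
# and inspect each one's state_dict entries.
for layer in model.children():
    if isinstance(layer, nn.Linear):
        sd = layer.state_dict()
        print(sorted(sd.keys()), tuple(sd['weight'].shape))
```

Note that children() does not recurse into nested containers; for deeply nested models, modules() (or named_modules()) visits every submodule.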