How to extract and visualize feature value for an arbitrary layer during inference with YOLOv7?
Question:
In my case, I would like to extract and visualize the features output by layers 102, 103, and 104 in the following code from cfg/training/yolov7.yaml:
# yolov7 head
head:
  [[-1, 1, SPPCSPC, [512]], # 51
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [37, 1, Conv, [256, 1, 1]], # route backbone P4
   [[-1, -2], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], # 63
   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [24, 1, Conv, [128, 1, 1]], # route backbone P3
   [[-1, -2], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]],
   [-2, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [-1, 1, Conv, [64, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1]], # 75
   [-1, 1, MP, []],
   [-1, 1, Conv, [128, 1, 1]],
   [-3, 1, Conv, [128, 1, 1]],
   [-1, 1, Conv, [128, 3, 2]],
   [[-1, -3, 63], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],
   [-2, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [-1, 1, Conv, [128, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]], # 88
   [-1, 1, MP, []],
   [-1, 1, Conv, [256, 1, 1]],
   [-3, 1, Conv, [256, 1, 1]],
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, -3, 51], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],
   [-2, 1, Conv, [512, 1, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [-1, 1, Conv, [256, 3, 1]],
   [[-1, -2, -3, -4, -5, -6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]], # 101
   [75, 1, RepConv, [256, 3, 1]], # extract
   [88, 1, RepConv, [512, 3, 1]], # extract
   [101, 1, RepConv, [1024, 3, 1]], # extract
   [[102,103,104], 1, IDetect, [nc, anchors]], # Detect(P3, P4, P5)
  ]
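For reading the config: each entry is [from, number, module, args], and a layer's number is its position in the combined backbone + head list. Below is a minimal sketch of how the from field resolves to absolute layer numbers, using only the last five head entries above. The helper name resolve_sources is made up for illustration and is not part of the YOLOv7 code base.

```python
# Last five head entries from the config above, as (from, module) pairs.
# The first of them is layer 101, per the "# 101" comment in the YAML.
FIRST_INDEX = 101
tail = [
    (-1, "Conv"),
    (75, "RepConv"),
    (88, "RepConv"),
    (101, "RepConv"),
    ([102, 103, 104], "IDetect"),
]


def resolve_sources(layer_index, source):
    """Turn a `from` field into a list of absolute layer indices.

    Negative values are relative to the current layer (-1 is the previous
    layer); non-negative values are already absolute indices.
    """
    sources = source if isinstance(source, list) else [source]
    return [layer_index + s if s < 0 else s for s in sources]


layer_inputs = {}
for offset, (source, module) in enumerate(tail):
    index = FIRST_INDEX + offset
    layer_inputs[index] = resolve_sources(index, source)
    print(f"layer {index:3d} {module:8s} <- {layer_inputs[index]}")
```

Read this way, layers 102, 103 and 104 take their inputs from layers 75, 88 and 101, and their outputs are the three tensors consumed by IDetect (layer 105).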
Also, the following is the result of printing out the model.
Model(
  (model): Sequential(
    (0): Conv(
      (conv): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (act): SiLU(inplace=True)
    )
    (1): Conv(
      (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
      (act): SiLU(inplace=True)
    )
    (2): Conv(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (act): SiLU(inplace=True)
    )
    ----------------------------------------------------
    (102): RepConv(
      (act): SiLU(inplace=True)
      (rbr_reparam): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) # extract
    )
    (103): RepConv(
      (act): SiLU(inplace=True)
      (rbr_reparam): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) # extract
    )
    (104): RepConv(
      (act): SiLU(inplace=True)
      (rbr_reparam): Conv2d(512, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) # extract
    )
    (105): IDetect(
      (m): ModuleList(
        (0): Conv2d(256, 21, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(512, 21, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(1024, 21, kernel_size=(1, 1), stride=(1, 1))
      )
      (ia): ModuleList(
        (0): ImplicitA()
        (1): ImplicitA()
        (2): ImplicitA()
      )
      (im): ModuleList(
        (0): ImplicitM()
        (1): ImplicitM()
        (2): ImplicitM()
      )
    )
  )
)
However, I would like to be able to extract the features of any layer if possible, since I may later need features from layers other than these.
How can I do this?
I tried to do the extraction and visualization from the Model class in models/yolo.py with reference to https://github.com/ultralytics/yolov5/issues/3089, but could not figure out which code to edit and how.
I tried to do the same with the IDetect class, but could not figure it out either.
Answers:
You can register a forward hook on the layer(s) in question. Per the PyTorch documentation, "The hook will be called every time after forward() has computed an output."
Essentially, the forward hook function writes to a variable defined outside the hook (here, a dictionary in an enclosing scope), so the value persists after the layer's forward call returns. You store the output of the layer's forward call (by way of the forward hook function) in this variable, and you can then reference it later.
(Explicitly, the module keeps a reference to every hook registered on it and calls each one after its own forward; the stored value persists simply because the dictionary outlives that call. See the PyTorch docs for more on this.)
In any case, the forward hook function needs the following signature:
hook(module, input, output) -> None or modified output
So, a trivial example would be:
def make_hook(key):
    def hook(model, input, output):
        intermediate_output[key] = output.detach()
    return hook
The outer function returns the hook function itself, since the argument to register_forward_hook must be a function with the above signature. Wrapping it this way lets each hook store its output under its own key.
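Why the factory is useful can be seen without PyTorch at all. The sketch below calls two hooks by hand with stand-in values; the strings and numbers are placeholders, not real modules or tensors.

```python
intermediate_output = {}


def make_hook(key):
    # `key` is captured by the inner function, so every hook created
    # here remembers the key it was built with.
    def hook(module, input, output):
        intermediate_output[key] = output
    return hook


hook_a = make_hook("layer_a")
hook_b = make_hook("layer_b")

# Simulate what a module does after its forward pass: call the hook
# with (module, input, output).
hook_a("fake_module_a", ("fake_input",), 1.0)
hook_b("fake_module_b", ("fake_input",), 2.0)

print(intermediate_output)
```

Each hook writes under its own key, so one shared dictionary can collect the outputs of many layers.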
Then we can add the forward hook to any module with:
model.<layer_name>.register_forward_hook(make_hook("example_key"))
So, in summary, your code would look something like:
import torch

def make_hook(key):
    def hook(model, input, output):
        intermediate_output[key] = output.detach()
    return hook

# define model
model = Yolo5() # I know this is wrong but you didn't include the actual model in your question so this is just an example
intermediate_output = {}

# register hook to as many layers as you want
model.conv4.register_forward_hook(make_hook("conv4")) # same here, I made these layer names up
model.maxpool8.register_forward_hook(make_hook("maxpool8"))

# dummy input
inp = torch.rand(1, 3, 1080, 1920)

# forward pass
model(inp)

# reference intermediate_output
intermediate_output["conv4"] # should have the output from this layer stored as value
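Since the model and layer names above are placeholders, here is a minimal runnable sketch of the same pattern on a toy network. TinyNet and its layer names conv1 and pool are invented for this example and have nothing to do with YOLOv7.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Toy stand-in model; not a YOLO architecture."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv2(self.pool(self.conv1(x)))


intermediate_output = {}


def make_hook(key):
    def hook(module, input, output):
        intermediate_output[key] = output.detach()
    return hook


model = TinyNet().eval()

# register_forward_hook returns a handle that can later remove the hook
handles = [
    model.conv1.register_forward_hook(make_hook("conv1")),
    model.pool.register_forward_hook(make_hook("pool")),
]

with torch.no_grad():
    model(torch.rand(1, 3, 32, 32))

# Remove the hooks once the outputs have been captured
for handle in handles:
    handle.remove()

print({k: tuple(v.shape) for k, v in intermediate_output.items()})
```

Keeping the handles and calling remove() means later forward passes run without the hooks attached.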
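Do note that the PyTorch docs describe the global variant, register_module_forward_hook, as adding global state to the nn.module module and as intended only for debugging/profiling purposes. Per-module hooks registered with register_forward_hook are narrower in scope, but it is still good practice to keep the handle that register_forward_hook returns and call handle.remove() once the hook is no longer needed. For a longer-term solution you could modify the forward pass of the main model architecture to store these values as intermediate outputs and return all of them at the end.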
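A hook-free variant along those lines might look like the sketch below. FeatureNet and its return_features argument are hypothetical names on a toy model; the real YOLOv7 Model.forward is more involved and is not reproduced here.

```python
import torch
import torch.nn as nn


class FeatureNet(nn.Module):
    """Toy model whose forward pass can also return chosen layer outputs."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(3, 8, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1),
        ])

    def forward(self, x, return_features=()):
        features = {}
        for index, layer in enumerate(self.layers):
            x = layer(x)
            if index in return_features:
                features[index] = x
        # Only change the return type when features were requested
        return (x, features) if return_features else x


model = FeatureNet().eval()
with torch.no_grad():
    out, feats = model(torch.rand(1, 3, 16, 16), return_features=(0, 2))

print(tuple(out.shape), {k: tuple(v.shape) for k, v in feats.items()})
```

The trade-off is that this requires editing the model code, whereas hooks work on a model you cannot or do not want to modify.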
Thanks to @DerekG for helping me figure this out!
The following is the code in yolov7/detect.py after the fix.
A ----- line indicates omitted code.
-------------------------------------------------------------
from utils.plots import plot_one_box, plot_ts_feature_maps # Add plot_ts_feature_maps method
-------------------------------------------------------------
def detect(save_img=False):
-------------------------------------------------------------
    # Load model
    model = attempt_load(weights, map_location=device) # load FP32 model
---------------------------------------------------------------------
    # Set Dataloader
    vid_path, vid_writer = None, None
    if webcam:
        view_img = check_imshow()
        cudnn.benchmark = True # set True to speed up constant image size inference
        dataset = LoadStreams(source, img_size=imgsz, stride=stride)
    else:
        dataset = LoadImages(source, img_size=imgsz, stride=stride)
--------------------------------------------------------------------------
    for path, img, im0s, vid_cap in dataset:
        img = torch.from_numpy(img).to(device)
        img = img.half() if half else img.float() # uint8 to fp16/32
        img /= 255.0 # 0 - 255 to 0.0 - 1.0
        if img.ndimension() == 3:
            img = img.unsqueeze(0)
------------------------------------------------------------------
        # Start of postscript
        def make_hook(key):
            def hook(model, input, output):
                intermediate_output[key] = output.detach()
            return hook
        layer_num = 104 # Intermediate layer number
        intermediate_output = {}
        handle = model.model[layer_num].register_forward_hook(make_hook(layer_num))
        # forward pass
        model(img)
        handle.remove() # detach the hook so hooks do not accumulate across loop iterations
        # print feature map shape
        feature_maps = intermediate_output[layer_num]
        print(feature_maps.shape)
        # Outputs a feature map of the intermediate layer
        plot_ts_feature_maps(feature_maps)
        # End of postscript
        t2 = time_synchronized()
------------------------------------------------------------------
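To capture several layers in one pass (for example 102, 103 and 104) without leaving hooks attached, the registration can be wrapped in a context manager. This is a sketch on a toy nn.Sequential; capture_outputs is a made-up helper, and with the real model you would presumably pass model.model and the layer numbers instead.

```python
from contextlib import contextmanager

import torch
import torch.nn as nn


@contextmanager
def capture_outputs(sequential, indices):
    """Collect the outputs of sequential[i] for each i in indices."""
    captured = {}
    handles = []

    def make_hook(key):
        def hook(module, input, output):
            captured[key] = output.detach()
        return hook

    for i in indices:
        handles.append(sequential[i].register_forward_hook(make_hook(i)))
    try:
        yield captured
    finally:
        # Always remove the hooks, even if the forward pass raised
        for handle in handles:
            handle.remove()


# Toy stand-in for model.model; layer indices work the same way
toy = nn.Sequential(
    nn.Conv2d(3, 4, 3, padding=1),
    nn.SiLU(),
    nn.Conv2d(4, 8, 3, stride=2, padding=1),
).eval()

with torch.no_grad(), capture_outputs(toy, [0, 2]) as feats:
    toy(torch.rand(1, 3, 32, 32))

print({i: tuple(t.shape) for i, t in feats.items()})
```

The dictionary stays usable after the with block, while the hooks themselves are gone.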
Also, the following was added to yolov7/utils/plots.py.
Torchshow is a module for visualizing tensors. Here is the official GitHub: https://github.com/xwying/torchshow
-------------------------------------------------------------------------
# Add module
import torchshow as ts
-------------------------------------------------------------------------
# Add plot_ts_feature_maps method at the bottom
def plot_ts_feature_maps(feature_maps):
    import matplotlib
    matplotlib.use('TkAgg')
    feature_maps = feature_maps.to(torch.float32)
    ts.show(feature_maps[0])
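If torchshow is not available, the channels can be tiled into a single image tensor by hand. Below is a minimal sketch using only PyTorch; tile_feature_maps is a hypothetical helper, and scaling each channel to [0, 1] independently is one choice among several.

```python
import math

import torch


def tile_feature_maps(feature_maps, max_channels=16):
    """Arrange the first channels of a (C, H, W) tensor in one 2D grid."""
    fm = feature_maps[:max_channels].to(torch.float32)

    # Scale every channel to [0, 1] on its own so weak channels stay visible
    lo = fm.amin(dim=(1, 2), keepdim=True)
    hi = fm.amax(dim=(1, 2), keepdim=True)
    fm = (fm - lo) / (hi - lo).clamp_min(1e-12)

    c, h, w = fm.shape
    cols = math.ceil(math.sqrt(c))
    rows = math.ceil(c / cols)

    # new_zeros keeps the grid on the same device as the feature maps
    grid = fm.new_zeros(rows * h, cols * w)
    for i in range(c):
        r, col = divmod(i, cols)
        grid[r * h:(r + 1) * h, col * w:(col + 1) * w] = fm[i]
    return grid


grid = tile_feature_maps(torch.rand(6, 20, 30))
print(tuple(grid.shape))
```

With the tensors from detect.py this would be called as tile_feature_maps(feature_maps[0]), and the resulting 2D tensor can then be displayed or saved as a single grayscale image.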
As a test, to extract 4 feature maps from the second layer, I changed layer_num = 1 in detect.py and ts.show(feature_maps[0][:4]) in plots.py, and ran the following command.
python detect.py --weights yolov7.pt --source inference/images/horses.jpg --device 0 --no-trace
The inference results and feature maps were then output as follows.
[Image: inference results]
[Image: feature map]