nncore.nn
Blocks
- class nncore.nn.blocks.Clamp(*args: Any, **kwargs: Any)
Clamp activation layer.
- Parameters:
min (float, optional) – The lower-bound of the range. Default: -1.
max (float, optional) – The upper-bound of the range. Default: 1.
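As a rough illustration of what a clamp activation computes, the sketch below applies the documented default range to plain Python floats. The helper name `clamp` and its argument names are hypothetical; the real layer operates on tensors.

```python
def clamp(x, min_value=-1.0, max_value=1.0):
    """Limit x to the closed range [min_value, max_value]."""
    return max(min_value, min(max_value, x))


values = [-2.5, -0.3, 0.0, 0.7, 3.0]
clamped = [clamp(v) for v in values]
```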
- class nncore.nn.blocks.EffMish(*args: Any, **kwargs: Any)
An efficient implementation of Mish activation layer introduced in [1].
References
Misra et al. (https://arxiv.org/abs/1908.08681)
- class nncore.nn.blocks.EffSwish(*args: Any, **kwargs: Any)
An efficient implementation of Swish activation layer introduced in [1].
References
Ramachandran et al. (https://arxiv.org/abs/1710.05941)
- class nncore.nn.blocks.Mish(*args: Any, **kwargs: Any)
Mish activation layer introduced in [1].
References
Misra et al. (https://arxiv.org/abs/1908.08681)
- class nncore.nn.blocks.Swish(*args: Any, **kwargs: Any)
Swish activation layer introduced in [1].
References
Ramachandran et al. (https://arxiv.org/abs/1710.05941)
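For reference, the two activation functions can be written as scalar formulas: Mish is x * tanh(softplus(x)) and Swish is x * sigmoid(x). The sketch below uses plain Python floats and hypothetical helper names; it illustrates the math only and says nothing about how the efficient variants are implemented in nncore.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def mish(x):
    return x * math.tanh(softplus(x))


def swish(x):
    return x * sigmoid(x)
```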
- class nncore.nn.blocks.GAT(*args: Any, **kwargs: Any)
Graph Attention Layer introduced in [1].
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
heads (int, optional) – Number of attention heads. Default: 1.
p (float, optional) – The dropout probability. Default: 0.
negative_slope (float, optional) – The negative slope of LeakyReLU. Default: 0.2.
concat (bool, optional) – Whether to concatenate the features from different attention heads. Default: True.
residual (bool, optional) – Whether to add residual connections. Default: True.
bias (bool, optional) – Whether to add the bias term. Default: True.
References
Veličković et al. (https://arxiv.org/abs/1710.10903)
- class nncore.nn.blocks.GCN(*args: Any, **kwargs: Any)
Graph Convolutional Layer introduced in [1].
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (bool, optional) – Whether to add the bias term. Default: True.
References
Kipf et al. (https://arxiv.org/abs/1609.02907)
- class nncore.nn.blocks.SGC(*args: Any, **kwargs: Any)
Simple Graph Convolutional Layer introduced in [1].
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
k (int, optional) – Number of layers to be stacked.
bias (bool, optional) – Whether to add the bias term. Default: True.
References
Wu et al. (https://arxiv.org/abs/1902.07153)
- class nncore.nn.blocks.CrossAttentionLayer(*args: Any, **kwargs: Any)
Cross Attention Layer.
- Parameters:
dims (int) – The input feature dimensions.
heads (int, optional) – The number of attention heads. Default: 8.
ratio (int, optional) – The ratio of hidden layer dimensions in the feed forward network. Default: 4.
p (float, optional) – The dropout probability. Default: 0.1.
pre_norm (bool, optional) – Whether to apply the normalization before instead of after each layer. Default: True.
norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: dict(type='LN').
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
- class nncore.nn.blocks.FeedForwardNetwork(*args: Any, **kwargs: Any)
Feed Forward Network introduced in [1].
- Parameters:
dims (int) – The input feature dimensions.
ratio (float, optional) – The ratio of hidden layer dimensions with respect to the input dimensions. Default: 4.
p (float, optional) – The dropout probability. Default: 0.1.
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
References
Vaswani et al. (https://arxiv.org/abs/1706.03762)
- class nncore.nn.blocks.MultiHeadAttention(*args: Any, **kwargs: Any)
Multi-Head Attention introduced in [1].
- Parameters:
dims (int) – The input feature dimensions.
k_dims (int | None, optional) – The dimensions of key matrix. If not specified, it will be the same as dims. Default: None.
v_dims (int | None, optional) – The dimensions of value matrix. If not specified, it will be the same as dims. Default: None.
h_dims (int | None, optional) – The hidden dimensions. If not specified, it will be the same as dims. Default: None.
o_dims (int | None, optional) – The output dimensions. If not specified, it will be the same as dims. Default: None.
heads (int, optional) – The number of attention heads. Default: 8.
p (float, optional) – The dropout probability. Default: 0.1.
bias (bool, optional) – Whether to add the bias term. Default: True.
References
Vaswani et al. (https://arxiv.org/abs/1706.03762)
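The core operation inside each attention head is scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, as described by Vaswani et al. The sketch below shows a single head on nested Python lists. The helper names are hypothetical, and the input/output projections, dropout, and head splitting of the real module are omitted.

```python
import math


def softmax(row):
    # Subtract the maximum for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Compute softmax(q k^T / sqrt(d)) v for lists of row vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out
```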
- class nncore.nn.blocks.PositionalEncoding(*args: Any, **kwargs: Any)
Positional Encoding introduced in [1].
- Parameters:
dims (int) – The input feature dimensions.
learnable (bool, optional) – Whether the positional encoding is learnable. Default: True.
p (float, optional) – The dropout probability. Default: 0.1.
max_len (int, optional) – The maximum length of the input sequence. Default: 5000.
References
Vaswani et al. (https://arxiv.org/abs/1706.03762)
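The fixed (non-learnable) encoding proposed by Vaswani et al. assigns sine values to even channels and cosine values to odd channels at geometrically spaced frequencies. The sketch below builds such a table in plain Python. Whether nncore uses exactly this formula when learnable is False is an assumption, and the function name is hypothetical.

```python
import math


def sinusoidal_encoding(max_len, dims):
    """Return a max_len x dims table of fixed sinusoidal position codes."""
    table = []
    for pos in range(max_len):
        row = []
        for i in range(dims):
            # Channels 2j and 2j + 1 share the same frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / dims))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_encoding(max_len=4, dims=6)
```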
- class nncore.nn.blocks.TransformerDecoderLayer(*args: Any, **kwargs: Any)
Transformer Decoder Layer introduced in [1].
- Parameters:
dims (int) – The input feature dimensions.
heads (int, optional) – The number of attention heads. Default: 8.
ratio (int, optional) – The ratio of hidden layer dimensions in the feed forward network. Default: 4.
p (float, optional) – The dropout probability. Default: 0.1.
pre_norm (bool, optional) – Whether to apply the normalization before instead of after each layer. Default: True.
norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: dict(type='LN').
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
References
Vaswani et al. (https://arxiv.org/abs/1706.03762)
- class nncore.nn.blocks.TransformerEncoderLayer(*args: Any, **kwargs: Any)
Transformer Encoder Layer introduced in [1].
- Parameters:
dims (int) – The input feature dimensions.
heads (int, optional) – The number of attention heads. Default: 8.
ratio (float, optional) – The ratio of hidden layer dimensions in the feed forward network. Default: 4.
p (float, optional) – The dropout probability. Default: 0.1.
pre_norm (bool, optional) – Whether to apply the normalization before instead of after each layer. Default: True.
norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: dict(type='LN').
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
References
Vaswani et al. (https://arxiv.org/abs/1706.03762)
Losses
- class nncore.nn.losses.DynamicBCELoss(*args: Any, **kwargs: Any)
Dynamic Binary Cross Entropy Loss that supports dynamic loss weights during training.
- Parameters:
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
pos_weight (float | None, optional) – Weight of the positive examples. Default: None.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
- class nncore.nn.losses.InfoNCELoss(*args: Any, **kwargs: Any)
InfoNCE Loss introduced in [1].
- Parameters:
temperature (float, optional) – The initial temperature for softmax. Default: 0.07.
max_scale (float, optional) – The maximum value of learnable scale. Default: 100.
learnable (bool, optional) – Whether the logit scale is learnable. Default: True.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
References
Oord et al. (https://arxiv.org/abs/1807.03748)
- class nncore.nn.losses.TripletLoss(*args: Any, **kwargs: Any)
Triplet Loss.
- Parameters:
margin (float, optional) – The margin between positive and negative samples. Default: 0.5.
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
- nncore.nn.losses.infonce_loss(a, b, temperature=0.07, scale=None, max_scale=100)
InfoNCE Loss introduced in [1].
- Parameters:
a (torch.Tensor) – The first group of samples.
b (torch.Tensor) – The second group of samples.
temperature (float, optional) – The temperature for softmax. Default: 0.07.
scale (torch.Tensor | None, optional) – The logit scale to use. If not specified, the scale will be calculated from temperature. Default: None.
max_scale (float, optional) – The maximum logit scale value. Default: 100.
References
Oord et al. (https://arxiv.org/abs/1807.03748)
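To make the roles of temperature and max_scale concrete, the sketch below computes a one-directional InfoNCE loss on plain Python lists, where a[i] should match b[i] among all rows of b. Several details are assumptions rather than documented behaviour: the scale is taken as 1 / temperature capped at max_scale, inputs are L2-normalized, and only the a-to-b direction is scored. The function names are hypothetical.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def infonce(a, b, temperature=0.07, max_scale=100.0):
    """One-directional InfoNCE: a[i] should match b[i] among all b[j]."""
    scale = min(1.0 / temperature, max_scale)  # assumed scale rule
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(ai, bj)) for bj in b]
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]  # cross entropy with target index i
    return total / len(a)
```

With matching pairs the loss is close to zero, and it grows when the rows of b are shuffled.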
- nncore.nn.losses.triplet_loss(pos, neg, anchor, margin=0.5)
Triplet Loss.
- Parameters:
pos (torch.Tensor) – Positive samples.
neg (torch.Tensor) – Negative samples.
anchor (torch.Tensor) – Anchors for distance calculation.
margin (float, optional) – The margin between positive and negative samples. Default: 0.5.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
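A triplet loss penalizes an anchor that is not closer to its positive sample than to its negative sample by at least the margin. The sketch below uses Euclidean distance on plain Python lists; the distance metric is an assumption, since the documentation above does not state which one nncore uses, and the helper names are hypothetical.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet(pos, neg, anchor, margin=0.5):
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    return max(0.0, euclidean(anchor, pos) - euclidean(anchor, neg) + margin)
```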
- class nncore.nn.losses.FocalLoss(*args: Any, **kwargs: Any)
Focal Loss introduced in [1].
- Parameters:
alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.
gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 2.0.
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
References
Lin et al. (https://arxiv.org/abs/1708.02002)
- class nncore.nn.losses.FocalLossStar(*args: Any, **kwargs: Any)
Focal Loss* introduced in [1].
- Parameters:
alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.
gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 1.0.
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
References
Lin et al. (https://arxiv.org/abs/1708.02002)
- class nncore.nn.losses.GaussianFocalLoss(*args: Any, **kwargs: Any)
Focal Loss introduced in [1] for targets in gaussian distribution.
- Parameters:
alpha (float, optional) – Weighting factor to balance positive and negative examples. Default: 2.0.
gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 4.0.
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
References
Lin et al. (https://arxiv.org/abs/1708.02002)
- nncore.nn.losses.focal_loss(pred, target, alpha=-1, gamma=2.0)
Focal Loss introduced in [1].
- Parameters:
pred (torch.Tensor) – The predictions.
target (torch.Tensor) – The binary classification label for each element (0 for negative classes and 1 for positive classes).
alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.
gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 2.0.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
References
Lin et al. (https://arxiv.org/abs/1708.02002)
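The effect of alpha and gamma is easiest to see on a single element. The sketch below follows the formula of Lin et al., FL = -alpha_t * (1 - p_t)^gamma * log(p_t), on one scalar prediction. Treating pred as a logit that is passed through a sigmoid is an assumption, and the function name is hypothetical.

```python
import math


def focal(logit, target, alpha=-1, gamma=2.0):
    """Focal loss for one element; target is 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-logit))
    # p_t is the probability assigned to the true class.
    p_t = p if target == 1 else 1.0 - p
    loss = -math.log(p_t) * (1.0 - p_t) ** gamma
    if alpha >= 0:
        # alpha weights positives, (1 - alpha) weights negatives.
        loss *= alpha if target == 1 else 1.0 - alpha
    return loss
```

With gamma set to 0 and alpha set to -1, this reduces to ordinary binary cross entropy. Larger gamma values shrink the loss of well-classified elements.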
- nncore.nn.losses.focal_loss_star(pred, target, alpha=-1, gamma=1.0)
Focal Loss* introduced in [1].
- Parameters:
pred (torch.Tensor) – The predictions.
target (torch.Tensor) – The binary classification label for each element (0 for negative classes and 1 for positive classes).
alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.
gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 1.0.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
References
Lin et al. (https://arxiv.org/abs/1708.02002)
- nncore.nn.losses.gaussian_focal_loss(pred, target, alpha=2.0, gamma=4.0)
Focal Loss introduced in [1] for targets in gaussian distribution.
- Parameters:
pred (torch.Tensor) – The predictions.
target (torch.Tensor) – The learning targets in gaussian distribution.
alpha (float, optional) – Weighting factor to balance positive and negative examples. Default: 2.0.
gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 4.0.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
References
Law et al. (https://arxiv.org/abs/1808.01244)
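In the formulation of Law et al., elements whose target equals 1 are treated as positives, and all other elements are negatives whose penalty is reduced by (1 - target)^gamma near a peak. The sketch below evaluates this for one element with pred given as a probability. The eps term and the function name are hypothetical additions for numerical safety and illustration.

```python
import math


def gaussian_focal(pred, target, alpha=2.0, gamma=4.0, eps=1e-12):
    """Gaussian focal loss for one element; pred is a probability in (0, 1)."""
    pos_weight = 1.0 if target == 1 else 0.0
    # Negatives close to a peak (target near 1) are penalized less.
    neg_weight = (1.0 - target) ** gamma
    pos_loss = -math.log(pred + eps) * (1.0 - pred) ** alpha * pos_weight
    neg_loss = -math.log(1.0 - pred + eps) * pred ** alpha * neg_weight
    return pos_loss + neg_loss
```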
- class nncore.nn.losses.GHMCLoss(*args: Any, **kwargs: Any)
Gradient Harmonized Classification Loss introduced in [1].
- Parameters:
bins (int, optional) – Number of the unit regions for distribution calculation. Default: 10.
momentum (float, optional) – The parameter for moving average. Default: 0.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
References
Li et al. (https://arxiv.org/abs/1811.05181)
- class nncore.nn.losses.BalancedL1Loss(*args: Any, **kwargs: Any)
Balanced L1 Loss introduced in [1].
- Parameters:
beta (float, optional) – The threshold in the piecewise function. Default: 1.0.
alpha (float, optional) – The dominator of the loss. Default: 0.5.
gamma (float, optional) – The promotion controller of the loss. Default: 1.5.
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
References
Pang et al. (https://arxiv.org/abs/1904.02701)
- class nncore.nn.losses.L1Loss(*args: Any, **kwargs: Any)
L1 Loss.
- Parameters:
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
- class nncore.nn.losses.SmoothL1Loss(*args: Any, **kwargs: Any)
Smooth L1 Loss introduced in [1].
- Parameters:
beta (float, optional) – The threshold in the piecewise function. Default: 1.0.
reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.
loss_weight (float, optional) – Weight of the loss. Default: 1.0.
References
Girshick et al. (https://arxiv.org/abs/1504.08083)
- nncore.nn.losses.balanced_l1_loss(pred, target, beta=1.0, alpha=0.5, gamma=1.5)
Balanced L1 Loss introduced in [1].
- Parameters:
pred (torch.Tensor) – The predictions.
target (torch.Tensor) – The learning targets.
beta (float, optional) – The threshold in the piecewise function. Default: 1.0.
alpha (float, optional) – The dominator of the loss. Default: 0.5.
gamma (float, optional) – The promotion controller of the loss. Default: 1.5.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
References
Pang et al. (https://arxiv.org/abs/1904.02701)
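The piecewise function of Pang et al. ties its constants together through b = exp(gamma / alpha) - 1, which keeps the two branches continuous at the threshold. The sketch below evaluates a commonly used beta-scaled form for a single element. Whether nncore follows exactly this form is an assumption, and the function name is hypothetical.

```python
import math


def balanced_l1(pred, target, beta=1.0, alpha=0.5, gamma=1.5):
    """Balanced L1 loss for one element."""
    diff = abs(pred - target)
    # b is chosen so that alpha * ln(b + 1) == gamma.
    b = math.exp(gamma / alpha) - 1.0
    if diff < beta:
        return (alpha / b * (b * diff + 1.0) * math.log(b * diff / beta + 1.0)
                - alpha * diff)
    return gamma * diff + gamma / b - alpha * beta
```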
- nncore.nn.losses.l1_loss(pred, target)
L1 Loss.
- Parameters:
pred (torch.Tensor) – The predictions.
target (torch.Tensor) – The learning targets.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
- nncore.nn.losses.smooth_l1_loss(pred, target, beta=1.0)
Smooth L1 Loss introduced in [1].
- Parameters:
pred (torch.Tensor) – The predictions.
target (torch.Tensor) – The learning targets.
beta (float, optional) – The threshold in the piecewise function. Default: 1.0.
- Returns:
The loss tensor.
- Return type:
torch.Tensor
References
Girshick et al. (https://arxiv.org/abs/1504.08083)
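Smooth L1 is quadratic for errors below beta and linear above it, with the two pieces meeting at the threshold. The sketch below shows the usual piecewise definition for a single element; the function name is hypothetical and reduction is omitted.

```python
def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss for one element."""
    diff = abs(pred - target)
    if diff < beta:
        # Quadratic region for small errors.
        return 0.5 * diff * diff / beta
    # Linear region for large errors.
    return diff - 0.5 * beta
```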
Modules
- class nncore.nn.modules.ConvModule(*args: Any, **kwargs: Any)
A module that bundles convolution, normalization, and activation layers.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
kernel_size (tuple[int] | int) – Size of the convolution kernel.
stride (tuple[int] | int, optional) – Stride of the convolution. Default: 1.
padding (tuple[int] | int | str, optional) – Padding added to the input. Default: 0.
dilation (tuple[int] | int, optional) – Spacing among neighbouring kernel elements. Default: 1.
groups (int, optional) – Number of blocked connections from input to output channels. Default: 1.
bias (str | bool, optional) – Whether to add the bias term in the convolution layer. If bias='auto', the module will decide it automatically based on whether it has a normalization layer. Default: 'auto'.
conv_cfg (dict | str | None, optional) – The config or name of the convolution layer. If not specified, nn.Conv2d will be used. Default: None.
norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: None.
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
order (tuple[str], optional) – The order of layers. It is expected to be a sequence of 'conv', 'norm', and 'act'. Default: ('conv', 'norm', 'act').
- nncore.nn.modules.build_conv_modules(dims, kernels, last_norm=False, last_act=False, default=None, **kwargs)
Build a sequential module list containing convolution, normalization, and activation layers.
- Parameters:
dims (list[int]) – The sequence of numbers of dimensions of channels.
kernels (list[int] | int) – The size or list of sizes of the convolution kernels.
last_norm (bool, optional) – Whether to add a normalization layer after the last convolution layer. Default: False.
last_act (bool, optional) – Whether to add an activation layer after the last convolution layer. Default: False.
default (any, optional) – The default value when the dims is not valid. Default: None.
- Returns:
The constructed module.
- Return type:
nn.Sequential | ConvModule
- class nncore.nn.modules.LinearModule(*args: Any, **kwargs: Any)
A module that bundles linear, normalization, and activation layers.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (str | bool, optional) – Whether to add the bias term in the linear layer. If bias='auto', the module will decide it automatically based on whether it has a normalization layer. Default: 'auto'.
norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: None.
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
order (tuple[str], optional) – The order of layers. It is expected to be a sequence of 'linear', 'norm', and 'act'. Default: ('linear', 'norm', 'act').
- nncore.nn.modules.build_linear_modules(dims, last_norm=False, last_act=False, default=None, **kwargs)
Build a multi-layer perceptron (MLP).
- Parameters:
dims (list[int]) – The sequence of numbers of dimensions of features.
last_norm (bool, optional) – Whether to add a normalization layer after the last linear layer. Default: False.
last_act (bool, optional) – Whether to add an activation layer after the last linear layer. Default: False.
default (any, optional) – The default value when the dims is not valid. Default: None.
- Returns:
The constructed module.
- Return type:
nn.Sequential | LinearModule
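One plausible reading of the dims, last_norm, and last_act arguments is that consecutive entries of dims form the input and output sizes of each layer, and that the two flags only affect the final layer. The sketch below expresses that reading in plain Python without building any modules. The function name and the tuple layout are hypothetical, and the actual builder may differ.

```python
def plan_layers(dims, last_norm=False, last_act=False):
    """List (in_features, out_features, keep_norm, keep_act) per layer."""
    plan = []
    num_layers = len(dims) - 1
    for i, (in_f, out_f) in enumerate(zip(dims[:-1], dims[1:])):
        is_last = i == num_layers - 1
        # Only the final layer is controlled by last_norm / last_act.
        keep_norm = last_norm if is_last else True
        keep_act = last_act if is_last else True
        plan.append((in_f, out_f, keep_norm, keep_act))
    return plan


plan = plan_layers([256, 128, 10])
```

Under this reading, dims=[256, 128, 10] describes two layers, and the last one emits raw outputs unless the flags are set.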
- class nncore.nn.modules.MsgPassModule(*args: Any, **kwargs: Any)
A module that bundles message passing, normalization, and activation layers.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (str | bool, optional) – Whether to add the bias term in the message passing layer. If bias='auto', the module will decide it automatically based on whether it has a normalization layer. Default: 'auto'.
msg_pass_cfg (dict | str | None, optional) – The config or name of the message passing layer. If not specified, GCN will be used. Default: None.
norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: None.
act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).
order (tuple[str], optional) – The order of layers. It is expected to be a sequence of 'msg_pass', 'norm', and 'act'. Default: ('msg_pass', 'norm', 'act').
- nncore.nn.modules.build_msg_pass_modules(dims, last_norm=False, last_act=False, default=None, **kwargs)
Build a module list containing message passing, normalization, and activation layers.
- Parameters:
dims (list[int]) – The sequence of numbers of dimensions of features.
last_norm (bool, optional) – Whether to add a normalization layer after the last message passing layer. Default: False.
last_act (bool, optional) – Whether to add an activation layer after the last message passing layer. Default: False.
default (any, optional) – The default value when the dims is not valid. Default: None.
- Returns:
The constructed module list.
- Return type:
nn.ModuleList
Builder
- nncore.nn.builder.build_model(cfg, *args, bundler='sequential', dist=None, **kwargs)
Build a general model from a dict or str. This method searches for modules in MODELS first, and then falls back to torch.nn.
- Parameters:
cfg (dict | str) – The config or name of the model.
bundler (str | None, optional) – The type of bundler for multiple models. Expected values include 'sequential' and 'modulelist'. Default: 'sequential'.
dist (bool | None, optional) – Whether the model is distributed. If not specified, the model will not be wrapped. Default: None.
- Returns:
The constructed model.
- Return type:
nn.Module
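The builders in this section share one lookup pattern: resolve the name in a registry first, then fall back to a default namespace. The sketch below imitates that pattern with stand-ins only. REGISTRY plays the role of MODELS, FALLBACK plays the role of torch.nn, and the classes and the build function are hypothetical. The 'type' key follows the dict(type=...) configs shown elsewhere in this reference.

```python
from types import SimpleNamespace


class Identity:
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class CustomBlock(Identity):
    pass


REGISTRY = {'CustomBlock': CustomBlock}        # stand-in for MODELS
FALLBACK = SimpleNamespace(Identity=Identity)  # stand-in for torch.nn


def build(cfg, **kwargs):
    """Resolve cfg (dict or str) in REGISTRY first, then in FALLBACK."""
    if isinstance(cfg, str):
        cfg = dict(type=cfg)
    cfg = dict(cfg)  # copy so the caller's config is not mutated
    name = cfg.pop('type')
    cls = REGISTRY.get(name) or getattr(FALLBACK, name)
    return cls(**cfg, **kwargs)
```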
- nncore.nn.builder.build_act_layer(cfg, *args, **kwargs)
Build an activation layer from a dict or str. This method searches for layers in ACTIVATIONS first, and then falls back to torch.nn.
- Parameters:
cfg (dict | str) – The config or name of the layer.
- Returns:
The constructed layer.
- Return type:
nn.Module
- nncore.nn.builder.build_conv_layer(cfg, *args, **kwargs)
Build a convolution layer from a dict or str. This method searches for layers in CONVS first, and then falls back to torch.nn.
- Parameters:
cfg (dict | str) – The config or name of the layer.
- Returns:
The constructed layer.
- Return type:
nn.Module
- nncore.nn.builder.build_msg_pass_layer(cfg, *args, **kwargs)
Build a message passing layer from a dict or str. This method searches for layers in MESSAGE_PASSINGS first, and then falls back to torch.nn.
- Parameters:
cfg (dict | str) – The config or name of the layer.
- Returns:
The constructed layer.
- Return type:
nn.Module
- nncore.nn.builder.build_norm_layer(cfg, *args, dims=None, **kwargs)
Build a normalization layer from a dict or str. This method searches for layers in NORMS first, and then falls back to torch.nn.
- Parameters:
cfg (dict | str) – The config or name of the layer.
dims (int | None, optional) – The input dimensions of the layer. Default: None.
- Returns:
The constructed layer.
- Return type:
nn.Module
- nncore.nn.builder.build_loss(cfg, *args, **kwargs)
Build a loss module from a dict or str. This method searches for modules in LOSSES first, and then falls back to torch.nn.
- Parameters:
cfg (dict | str) – The config or name of the module.
- Returns:
The constructed module.
- Return type:
nn.Module
Init
- nncore.nn.init.constant_init_(module, value=1, bias=0)
Initialize a module using a constant.
- Parameters:
module (nn.Module) – The module to be initialized.
value (int, optional) – The value to be filled. Default: 1.
bias (int, optional) – The bias of the module. Default: 0.
- nncore.nn.init.normal_init_(module, mean=0, std=1, bias=0)
Initialize a module using normal distribution.
- Parameters:
module (nn.Module) – The module to be initialized.
mean (int, optional) – Mean of the distribution. Default: 0.
std (int, optional) – Standard deviation of the distribution. Default: 1.
bias (int, optional) – The bias of the module. Default: 0.
- nncore.nn.init.uniform_init_(module, a=0, b=1, bias=0)
Initialize a module using uniform distribution.
- Parameters:
module (nn.Module) – The module to be initialized.
a (int, optional) – Lower bound of the distribution. Default: 0.
b (int, optional) – Upper bound of the distribution. Default: 1.
bias (int, optional) – The bias of the module. Default: 0.
- nncore.nn.init.xavier_init_(module, gain=1, bias=0, distribution='normal')
Initialize a module using the method introduced in [1].
- Parameters:
module (nn.Module) – The module to be initialized.
gain (int, optional) – The scaling factor. Default: 1.
bias (int, optional) – The bias of the module. Default: 0.
distribution (str, optional) – The type of distribution to use. Expected values include 'normal' and 'uniform'. Default: 'normal'.
References
Glorot et al. (http://proceedings.mlr.press/v9/glorot10a)
- nncore.nn.init.kaiming_init_(module, a=0, mode='fan_in', nonlinearity='leaky_relu', bias=0, distribution='normal')
Initialize a module using the method introduced in [1].
- Parameters:
module (nn.Module) – The module to be initialized.
a (int, optional) – The negative slope of LeakyReLU. Default: 0.
mode (str, optional) – The pass in which the magnitude of the weight variance is preserved. Expected values include 'fan_in' and 'fan_out'. Default: 'fan_in'.
nonlinearity (str, optional) – The nonlinearity after the parameterized layers. The expected values are 'relu' and 'leaky_relu'. Default: 'leaky_relu'.
bias (int, optional) – The bias of the module. Default: 0.
distribution (str, optional) – The type of distribution to use. Expected values include 'normal' and 'uniform'. Default: 'normal'.
References
He et al. (https://arxiv.org/abs/1502.01852)
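For orientation, the standard deviations behind the Xavier and Kaiming schemes (as defined for the normal distribution in torch.nn.init) can be computed directly from the fan sizes. The sketch below assumes the nncore wrappers forward to those schemes unchanged; the function names are hypothetical. For the uniform distribution, the bound is sqrt(3) times the standard deviation.

```python
import math


def xavier_std(fan_in, fan_out, gain=1.0):
    """Standard deviation used by Xavier (Glorot) normal initialization."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


def kaiming_std(fan_in, fan_out, a=0.0, mode='fan_in',
                nonlinearity='leaky_relu'):
    """Standard deviation used by Kaiming (He) normal initialization."""
    fan = fan_in if mode == 'fan_in' else fan_out
    if nonlinearity == 'relu':
        gain = math.sqrt(2.0)
    else:
        # leaky_relu gain depends on the negative slope a.
        gain = math.sqrt(2.0 / (1.0 + a * a))
    return gain / math.sqrt(fan)
```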
Utils
- nncore.nn.utils.move_to_device(data, device='cpu')
Recursively move a tensor or a collection of tensors to the specific device.
- Parameters:
data (dict | list | torch.Tensor) – The tensor or collection of tensors to be moved.
device (torch.device | str, optional) – The destination device. Default: 'cpu'.
- Returns:
The moved tensor or collection of tensors.
- Return type:
dict | list | torch.Tensor
- nncore.nn.utils.fuse_bn_(model)
During inference, batch norm layers stop updating their statistics and only apply the stored mean and variance, which makes it possible to fuse them with the preceding convolution or linear layers to simplify the network structure and save computation.
- Parameters:
model (nn.Module) – The model whose Conv-BN and Linear-BN structures are to be fused.
- Returns:
The model whose Conv-BN and Linear-BN structures have been fused.
- Return type:
nn.Module
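The fusion works because batch norm in inference mode is an affine map, so it can be folded into the weights and bias of the layer before it. The sketch below checks this for a single output channel of a linear layer using plain Python floats. The helper names and the eps default are hypothetical, and the in-place module rewriting done by the real function is not shown.

```python
import math


def fuse_linear_bn(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold BN statistics of one output channel into its weight row and bias."""
    scale = gamma / math.sqrt(var + eps)
    fused_weight = [w * scale for w in weight]
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias


def linear(x, weight, bias):
    return sum(a * b for a, b in zip(x, weight)) + bias


def batch_norm_eval(y, mean, var, gamma, beta, eps=1e-5):
    return (y - mean) / math.sqrt(var + eps) * gamma + beta


x = [0.5, -1.0, 2.0]
w, b = [0.2, 0.4, -0.1], 0.3
stats = dict(mean=0.1, var=0.8, gamma=1.5, beta=-0.2)

# Unfused path: linear layer followed by batch norm.
reference = batch_norm_eval(linear(x, w, b), **stats)
# Fused path: a single linear layer with folded parameters.
fused_w, fused_b = fuse_linear_bn(w, b, **stats)
fused = linear(x, fused_w, fused_b)
```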
- nncore.nn.utils.update_bn_stats_(model, data_loader, num_iters=200, **kwargs)
Recompute and update the BN stats to make them more precise. During training, both the BN stats and the weights change after every iteration, so the running average cannot precisely reflect the actual stats of the current model. In this function, the BN stats are recomputed with fixed weights to make the running average more precise. Specifically, it computes the true average of per-batch mean/variance instead of the running average.
- Parameters:
model (nn.Module) – The model whose BN stats will be recomputed. Note that:
  - This function will not alter the training mode of the given model. Users are responsible for setting the layers that need Precise-BN to training mode prior to calling this function.
  - Be careful if your model contains other stateful layers in addition to BN, i.e. layers whose state can change in forward iterations. This function will alter their state. If you want them unchanged, either pass in a submodule without those layers or back up the states.
data_loader (iterator) – The data loader to use.
num_iters (int, optional) – Number of iterations to compute the stats. Default: 200.
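The difference between a running average and the true average of per-batch statistics can be seen on a handful of numbers. The sketch below compares the two on made-up batch means, using a momentum of 0.1 and an initial value of 0 as in PyTorch's default running mean. The function names and the sample values are hypothetical.

```python
def running_average(values, momentum=0.1, init=0.0):
    """Exponential moving average, as kept by BN layers during training."""
    avg = init
    for v in values:
        avg = (1.0 - momentum) * avg + momentum * v
    return avg


def true_average(values):
    """Plain arithmetic mean over all batches."""
    return sum(values) / len(values)


batch_means = [1.0, 1.2, 0.8, 1.1, 0.9]
ema = running_average(batch_means)
exact = true_average(batch_means)
```

After only five batches the running average is still far from the true mean of 1.0, which is the kind of imprecision that recomputing the stats removes.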
- nncore.nn.utils.publish_model(checkpoint, out='model.pth', keys_to_keep=['state_dict', 'meta'], device='cpu', meta=None, hash_type='sha256', hash_len=8)
Publish a model by removing needless data in the checkpoint, moving the weights to the specified device, and hashing the output model file.
- Parameters:
checkpoint (dict | str) – The checkpoint or path to the checkpoint.
out (str, optional) – Path to the output checkpoint file. Default: 'model.pth'.
keys_to_keep (list[str], optional) – The list of keys to be kept from the checkpoint. Default: ['state_dict', 'meta'].
device (torch.device | str) – The destination device. Default: 'cpu'.
meta (dict | None, optional) – The meta data to be saved. Note that the keys nncore_version and create_time are reserved by the method. Default: None.
hash_type (str, optional) – Type of the hash algorithm. Currently supported algorithms include 'md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512', 'blake2b', 'blake2s', 'sha3_224', 'sha3_256', 'sha3_384', 'sha3_512', 'shake_128', and 'shake_256'. Default: 'sha256'.
hash_len (int, optional) – Length of the hash value. Default: 8.
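The hash_type and hash_len arguments suggest a truncated hex digest of the saved file. The sketch below shows how such a digest could be computed with Python's hashlib, including the SHAKE algorithms, which need an explicit output length. How nncore reads the file and where it places the hash are not documented here, so the function below is a hypothetical illustration operating on raw bytes.

```python
import hashlib


def short_hash(data, hash_type='sha256', hash_len=8):
    """Return the first hash_len hex characters of the digest of data."""
    hasher = hashlib.new(hash_type, data)
    if hash_type.startswith('shake_'):
        # SHAKE digests are variable-length, so request enough bytes
        # to cover hash_len hex characters.
        digest = hasher.hexdigest((hash_len + 1) // 2)
    else:
        digest = hasher.hexdigest()
    return digest[:hash_len]
```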
- nncore.nn.utils.model_soup(model1, model2, out='model.pth', device='cpu')
Combine two models by calculating the element-wise average of their weight matrices (i.e. cooking model soups [1]). The output model is expected to have better performance compared with the original ones.
- Parameters:
model1 (dict | str) – The checkpoint or path to the checkpoint of the first model.
model2 (dict | str) – The checkpoint or path to the checkpoint of the second model.
out (str, optional) – Path to the output checkpoint file. Default: 'model.pth'.
device (torch.device | str) – The destination device. Default: 'cpu'.
References
Wortsman et al. (https://arxiv.org/abs/2203.05482)
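The averaging step itself is simple: every parameter in the output is the element-wise mean of the matching parameters in the two inputs. The sketch below applies this to two toy state dicts whose values are flat Python lists instead of tensors. The function name and the requirement that both dicts share identical keys are assumptions made for illustration.

```python
def average_weights(state1, state2):
    """Element-wise mean of two state dicts with identical keys and shapes."""
    assert state1.keys() == state2.keys(), 'models must share the same keys'
    return {
        key: [(x + y) / 2.0 for x, y in zip(state1[key], state2[key])]
        for key in state1
    }


soup = average_weights(
    {'w': [1.0, 3.0], 'b': [0.0]},
    {'w': [3.0, 5.0], 'b': [1.0]},
)
```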