nncore.nn

Blocks

class nncore.nn.blocks.Clamp(*args: Any, **kwargs: Any)[source]

Clamp activation layer.

Parameters:
  • min (float, optional) – The lower bound of the range. Default: -1.

  • max (float, optional) – The upper bound of the range. Default: 1.
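
A minimal usage sketch (assuming the layer simply forwards its min and max arguments to torch.clamp):

    import torch
    from nncore.nn.blocks import Clamp

    act = Clamp(min=-0.5, max=0.5)
    x = torch.tensor([-2.0, 0.1, 3.0])
    out = act(x)  # values clipped to [-0.5, 0.5]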

class nncore.nn.blocks.EffMish(*args: Any, **kwargs: Any)[source]

An efficient implementation of the Mish activation layer introduced in [1].

References

  1. Misra et al. (https://arxiv.org/abs/1908.08681)

class nncore.nn.blocks.EffSwish(*args: Any, **kwargs: Any)[source]

An efficient implementation of the Swish activation layer introduced in [1].

References

  1. Ramachandran et al. (https://arxiv.org/abs/1710.05941)

class nncore.nn.blocks.Mish(*args: Any, **kwargs: Any)[source]

Mish activation layer introduced in [1].

References

  1. Misra et al. (https://arxiv.org/abs/1908.08681)

class nncore.nn.blocks.Swish(*args: Any, **kwargs: Any)[source]

Swish activation layer introduced in [1].

References

  1. Ramachandran et al. (https://arxiv.org/abs/1710.05941)
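
Both activations are element-wise. A short sketch of their expected behavior; the Eff* variants above are assumed to compute the same functions with a memory-efficient backward pass:

    import torch
    from nncore.nn.blocks import Mish, Swish

    x = torch.randn(2, 3)
    y_mish = Mish()(x)    # x * tanh(softplus(x))
    y_swish = Swish()(x)  # x * sigmoid(x)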

class nncore.nn.blocks.GAT(*args: Any, **kwargs: Any)[source]

Graph Attention Layer introduced in [1].

Parameters:
  • in_features (int) – Number of input features.

  • out_features (int) – Number of output features.

  • heads (int, optional) – Number of attention heads. Default: 1.

  • p (float, optional) – The dropout probability. Default: 0.

  • negative_slope (float, optional) – The negative slope of LeakyReLU. Default: 0.2.

  • concat (bool, optional) – Whether to concatenate the features from different attention heads. Default: True.

  • residual (bool, optional) – Whether to add residual connections. Default: True.

  • bias (bool, optional) – Whether to add the bias term. Default: True.

References

  1. Veličković et al. (https://arxiv.org/abs/1710.10903)

forward(x, graph)[source]
Parameters:
  • x (torch.Tensor[N, M]) – The input node features.

  • graph (torch.Tensor[N, N]) – The graph structure where graph[i, j] == n (n > 0) means there is a link from node i to node j, while graph[i, j] == 0 means there is no link.
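
A minimal forward-pass sketch (the adjacency matrix below is an illustrative assumption):

    import torch
    from nncore.nn.blocks import GAT

    x = torch.randn(4, 16)  # 4 nodes with 16-dim features
    graph = torch.tensor([[1, 1, 0, 0],
                          [0, 1, 1, 0],
                          [0, 0, 1, 1],
                          [1, 0, 0, 1]], dtype=torch.float)

    layer = GAT(in_features=16, out_features=32, heads=4)
    out = layer(x, graph)  # output node features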

class nncore.nn.blocks.GCN(*args: Any, **kwargs: Any)[source]

Graph Convolutional Layer introduced in [1].

Parameters:
  • in_features (int) – Number of input features.

  • out_features (int) – Number of output features.

  • bias (bool, optional) – Whether to add the bias term. Default: True.

References

  1. Kipf et al. (https://arxiv.org/abs/1609.02907)

forward(x, graph)[source]
Parameters:
  • x (torch.Tensor[N, M]) – The input node features.

  • graph (torch.Tensor[N, N]) – The graph structure where graph[i, j] == n (n > 0) means there is a link with weight n from node i to node j, while graph[i, j] == 0 means there is no link.
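
A minimal forward-pass sketch (the weighted adjacency matrix is an illustrative assumption):

    import torch
    from nncore.nn.blocks import GCN

    x = torch.randn(4, 16)  # 4 nodes with 16-dim features
    # self-loops plus a weighted link from each node i to node i + 1
    graph = torch.eye(4) + 0.5 * torch.diag(torch.ones(3), 1)

    layer = GCN(in_features=16, out_features=32)
    out = layer(x, graph)  # (4, 32) output node features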

class nncore.nn.blocks.SGC(*args: Any, **kwargs: Any)[source]

Simple Graph Convolutional Layer introduced in [1].

Parameters:
  • in_features (int) – Number of input features.

  • out_features (int) – Number of output features.

  • k (int, optional) – Number of layers to be stacked.

  • bias (bool, optional) – Whether to add the bias term. Default: True.

References

  1. Wu et al. (https://arxiv.org/abs/1902.07153)

class nncore.nn.blocks.CrossAttentionLayer(*args: Any, **kwargs: Any)[source]

Cross Attention Layer.

Parameters:
  • dims (int) – The input feature dimensions.

  • heads (int, optional) – The number of attention heads. Default: 8.

  • ratio (int, optional) – The ratio of hidden layer dimensions in the feed forward network. Default: 4.

  • p (float, optional) – The dropout probability. Default: 0.1.

  • pre_norm (bool, optional) – Whether to apply the normalization before instead of after each layer. Default: True.

  • norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: dict(type='LN').

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

class nncore.nn.blocks.FeedForwardNetwork(*args: Any, **kwargs: Any)[source]

Feed Forward Network introduced in [1].

Parameters:
  • dims (int) – The input feature dimensions.

  • ratio (float, optional) – The ratio of hidden layer dimensions with respect to the input dimensions. Default: 4.

  • p (float, optional) – The dropout probability. Default: 0.1.

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

References

  1. Vaswani et al. (https://arxiv.org/abs/1706.03762)

class nncore.nn.blocks.MultiHeadAttention(*args: Any, **kwargs: Any)[source]

Multi-Head Attention introduced in [1].

Parameters:
  • dims (int) – The input feature dimensions.

  • k_dims (int | None, optional) – The dimensions of the key matrix. If not specified, it will be the same as dims. Default: None.

  • v_dims (int | None, optional) – The dimensions of the value matrix. If not specified, it will be the same as dims. Default: None.

  • h_dims (int | None, optional) – The hidden dimensions. If not specified, it will be the same as dims. Default: None.

  • o_dims (int | None, optional) – The output dimensions. If not specified, it will be the same as dims. Default: None.

  • heads (int, optional) – The number of attention heads. Default: 8.

  • p (float, optional) – The dropout probability. Default: 0.1.

  • bias (bool, optional) – Whether to add the bias term. Default: True.

References

  1. Vaswani et al. (https://arxiv.org/abs/1706.03762)
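
A minimal self-attention sketch; the exact forward signature (e.g. separate query/key/value arguments or attention masks) is an assumption here:

    import torch
    from nncore.nn.blocks import MultiHeadAttention

    attn = MultiHeadAttention(dims=256, heads=8)
    x = torch.randn(16, 10, 256)  # assumed (batch, length, dims) layout
    out = attn(x)  # self-attention, with key and value defaulting to the query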

class nncore.nn.blocks.PositionalEncoding(*args: Any, **kwargs: Any)[source]

Positional Encoding introduced in [1].

Parameters:
  • dims (int) – The input feature dimensions.

  • learnable (bool, optional) – Whether the positional encoding is learnable. Default: True.

  • p (float, optional) – The dropout probability. Default: 0.1.

  • max_len (int, optional) – The maximum length of the input sequence. Default: 5000.

References

  1. Vaswani et al. (https://arxiv.org/abs/1706.03762)
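
A minimal usage sketch, assuming the module applies its encodings to a (batch, length, dims) input:

    import torch
    from nncore.nn.blocks import PositionalEncoding

    pe = PositionalEncoding(dims=256, learnable=False, max_len=512)
    x = torch.randn(16, 100, 256)
    out = pe(x)  # encodings for the first 100 positions, with dropout applied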

class nncore.nn.blocks.TransformerDecoderLayer(*args: Any, **kwargs: Any)[source]

Transformer Decoder Layer introduced in [1].

Parameters:
  • dims (int) – The input feature dimensions.

  • heads (int, optional) – The number of attention heads. Default: 8.

  • ratio (int, optional) – The ratio of hidden layer dimensions in the feed forward network. Default: 4.

  • p (float, optional) – The dropout probability. Default: 0.1.

  • pre_norm (bool, optional) – Whether to apply the normalization before instead of after each layer. Default: True.

  • norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: dict(type='LN').

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

References

  1. Vaswani et al. (https://arxiv.org/abs/1706.03762)

class nncore.nn.blocks.TransformerEncoderLayer(*args: Any, **kwargs: Any)[source]

Transformer Encoder Layer introduced in [1].

Parameters:
  • dims (int) – The input feature dimensions.

  • heads (int, optional) – The number of attention heads. Default: 8.

  • ratio (float, optional) – The ratio of hidden layer dimensions in the feed forward network. Default: 4.

  • p (float, optional) – The dropout probability. Default: 0.1.

  • pre_norm (bool, optional) – Whether to apply the normalization before instead of after each layer. Default: True.

  • norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: dict(type='LN').

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

References

  1. Vaswani et al. (https://arxiv.org/abs/1706.03762)
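
A minimal usage sketch, assuming a mask-free call on a (batch, length, dims) input:

    import torch
    from nncore.nn.blocks import TransformerEncoderLayer

    layer = TransformerEncoderLayer(dims=256, heads=8, p=0.1)
    x = torch.randn(16, 100, 256)
    out = layer(x)  # same shape as the input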

Losses

class nncore.nn.losses.DynamicBCELoss(*args: Any, **kwargs: Any)[source]

Dynamic Binary Cross Entropy Loss that supports dynamic loss weights during training.

Parameters:
  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • pos_weight (float | None, optional) – Weight of the positive examples. Default: None.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

class nncore.nn.losses.InfoNCELoss(*args: Any, **kwargs: Any)[source]

InfoNCE Loss introduced in [1].

Parameters:
  • temperature (float, optional) – The initial temperature for softmax. Default: 0.07.

  • max_scale (float, optional) – The maximum value of the learnable scale. Default: 100.

  • learnable (bool, optional) – Whether the logit scale is learnable. Default: True.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

References

  1. Oord et al. (https://arxiv.org/abs/1807.03748)

class nncore.nn.losses.TripletLoss(*args: Any, **kwargs: Any)[source]

Triplet Loss.

Parameters:
  • margin (float, optional) – The margin between positive and negative samples. Default: 0.5.

  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

nncore.nn.losses.infonce_loss(a, b, temperature=0.07, scale=None, max_scale=100)[source]

InfoNCE Loss introduced in [1].

Parameters:
  • a (torch.Tensor) – The first group of samples.

  • b (torch.Tensor) – The second group of samples.

  • temperature (float, optional) – The temperature for softmax. Default: 0.07.

  • scale (torch.Tensor | None, optional) – The logit scale to use. If not specified, the scale will be calculated from temperature. Default: None.

  • max_scale (float, optional) – The maximum logit scale value. Default: 100.

References

  1. Oord et al. (https://arxiv.org/abs/1807.03748)
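
A minimal sketch using the documented signature; the embedding sizes are illustrative, and row i of a is assumed to match row i of b:

    import torch
    from nncore.nn.losses import infonce_loss

    a = torch.randn(32, 256)  # e.g. image embeddings
    b = torch.randn(32, 256)  # e.g. text embeddings
    loss = infonce_loss(a, b, temperature=0.07)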

nncore.nn.losses.triplet_loss(pos, neg, anchor, margin=0.5)[source]

Triplet Loss.

Parameters:
  • pos (torch.Tensor) – Positive samples.

  • neg (torch.Tensor) – Negative samples.

  • anchor (torch.Tensor) – Anchors for distance calculation.

  • margin (float, optional) – The margin between positive and negative samples. Default: 0.5.

Returns:

The loss tensor.

Return type:

torch.Tensor

class nncore.nn.losses.FocalLoss(*args: Any, **kwargs: Any)[source]

Focal Loss introduced in [1].

Parameters:
  • alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.

  • gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 2.0.

  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

References

  1. Lin et al. (https://arxiv.org/abs/1708.02002)

class nncore.nn.losses.FocalLossStar(*args: Any, **kwargs: Any)[source]

Focal Loss* introduced in [1].

Parameters:
  • alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.

  • gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 1.0.

  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

References

  1. Lin et al. (https://arxiv.org/abs/1708.02002)

class nncore.nn.losses.GaussianFocalLoss(*args: Any, **kwargs: Any)[source]

Focal Loss introduced in [1] for targets in a Gaussian distribution.

Parameters:
  • alpha (float, optional) – Weighting factor to balance positive and negative examples. Default: 2.0.

  • gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 4.0.

  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

References

  1. Law et al. (https://arxiv.org/abs/1808.01244)

nncore.nn.losses.focal_loss(pred, target, alpha=-1, gamma=2.0)[source]

Focal Loss introduced in [1].

Parameters:
  • pred (torch.Tensor) – The predictions.

  • target (torch.Tensor) – The binary classification label for each element (0 for negative classes and 1 for positive classes).

  • alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.

  • gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 2.0.

Returns:

The loss tensor.

Return type:

torch.Tensor

References

  1. Lin et al. (https://arxiv.org/abs/1708.02002)
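
A minimal sketch using the documented signature; the predictions are assumed to be raw logits, as in the common sigmoid focal loss formulation:

    import torch
    from nncore.nn.losses import focal_loss

    pred = torch.randn(8)                       # assumed raw logits
    target = torch.randint(0, 2, (8,)).float()  # binary labels in {0, 1}
    loss = focal_loss(pred, target, alpha=0.25, gamma=2.0)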

nncore.nn.losses.focal_loss_star(pred, target, alpha=-1, gamma=1.0)[source]

Focal Loss* introduced in [1].

Parameters:
  • pred (torch.Tensor) – The predictions.

  • target (torch.Tensor) – The binary classification label for each element (0 for negative classes and 1 for positive classes).

  • alpha (float, optional) – Weighting factor in range (0, 1) to balance positive and negative examples. -1 means no weighting. Default: -1.

  • gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 1.0.

Returns:

The loss tensor.

Return type:

torch.Tensor

References

  1. Lin et al. (https://arxiv.org/abs/1708.02002)

nncore.nn.losses.gaussian_focal_loss(pred, target, alpha=2.0, gamma=4.0)[source]

Focal Loss introduced in [1] for targets in a Gaussian distribution.

Parameters:
  • pred (torch.Tensor) – The predictions.

  • target (torch.Tensor) – The learning targets in a Gaussian distribution.

  • alpha (float, optional) – Weighting factor to balance positive and negative examples. Default: 2.0.

  • gamma (float, optional) – Exponent of the modulating factor to balance easy and hard examples. Default: 4.0.

Returns:

The loss tensor.

Return type:

torch.Tensor

References

  1. Law et al. (https://arxiv.org/abs/1808.01244)

class nncore.nn.losses.GHMCLoss(*args: Any, **kwargs: Any)[source]

Gradient Harmonized Classification Loss introduced in [1].

Parameters:
  • bins (int, optional) – Number of unit regions used for distribution calculation. Default: 10.

  • momentum (float, optional) – The parameter for moving average. Default: 0.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

References

  1. Li et al. (https://arxiv.org/abs/1811.05181)

class nncore.nn.losses.BalancedL1Loss(*args: Any, **kwargs: Any)[source]

Balanced L1 Loss introduced in [1].

Parameters:
  • beta (float, optional) – The threshold in the piecewise function. Default: 1.0.

  • alpha (float, optional) – The denominator of the loss. Default: 0.5.

  • gamma (float, optional) – The promotion controller of the loss. Default: 1.5.

  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

References

  1. Pang et al. (https://arxiv.org/abs/1904.02701)

class nncore.nn.losses.L1Loss(*args: Any, **kwargs: Any)[source]

L1 Loss.

Parameters:
  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

class nncore.nn.losses.SmoothL1Loss(*args: Any, **kwargs: Any)[source]

Smooth L1 Loss introduced in [1].

Parameters:
  • beta (float, optional) – The threshold in the piecewise function. Default: 1.0.

  • reduction (str, optional) – Reduction method. Currently supported values include 'mean', 'sum', and 'none'. Default: 'mean'.

  • loss_weight (float, optional) – Weight of the loss. Default: 1.0.

References

  1. Girshick et al. (https://arxiv.org/abs/1504.08083)

nncore.nn.losses.balanced_l1_loss(pred, target, beta=1.0, alpha=0.5, gamma=1.5)[source]

Balanced L1 Loss introduced in [1].

Parameters:
  • pred (torch.Tensor) – The predictions.

  • target (torch.Tensor) – The learning targets.

  • beta (float, optional) – The threshold in the piecewise function. Default: 1.0.

  • alpha (float, optional) – The denominator of the loss. Default: 0.5.

  • gamma (float, optional) – The promotion controller of the loss. Default: 1.5.

Returns:

The loss tensor.

Return type:

torch.Tensor

References

  1. Pang et al. (https://arxiv.org/abs/1904.02701)

nncore.nn.losses.l1_loss(pred, target)[source]

L1 Loss.

Parameters:
  • pred (torch.Tensor) – The predictions.

  • target (torch.Tensor) – The learning targets.

Returns:

The loss tensor.

Return type:

torch.Tensor

nncore.nn.losses.smooth_l1_loss(pred, target, beta=1.0)[source]

Smooth L1 Loss introduced in [1].

Parameters:
  • pred (torch.Tensor) – The predictions.

  • target (torch.Tensor) – The learning targets.

  • beta (float, optional) – The threshold in the piecewise function. Default: 1.0.

Returns:

The loss tensor.

Return type:

torch.Tensor

References

  1. Girshick et al. (https://arxiv.org/abs/1504.08083)

nncore.nn.losses.weighted_loss(func)[source]

Syntactic sugar for loss functions with dynamic weights and average factors. This method is expected to be used as a decorator.
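
A sketch of the intended decorator pattern; the exact keyword arguments added by the wrapper (e.g. weight and avg_factor) are assumptions based on the description:

    import torch
    from nncore.nn.losses import weighted_loss

    @weighted_loss
    def l2_loss(pred, target):  # a hypothetical element-wise loss
        return (pred - target) ** 2

    pred, target = torch.randn(4), torch.zeros(4)
    loss = l2_loss(pred, target, weight=torch.ones(4))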

Modules

class nncore.nn.modules.ConvModule(*args: Any, **kwargs: Any)[source]

A module that bundles convolution, normalization, and activation layers.

Parameters:
  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • kernel_size (tuple[int] | int) – Size of the convolution kernel.

  • stride (tuple[int] | int, optional) – Stride of the convolution. Default: 1.

  • padding (tuple[int] | int | str, optional) – Padding added to the input. Default: 0.

  • dilation (tuple[int] | int, optional) – Spacing between neighbouring kernel elements. Default: 1.

  • groups (int, optional) – Number of blocked connections from input to output channels. Default: 1.

  • bias (str | bool, optional) – Whether to add the bias term in the convolution layer. If bias='auto', the module will decide it automatically based on whether it has a normalization layer. Default: 'auto'.

  • conv_cfg (dict | str | None, optional) – The config or name of the convolution layer. If not specified, nn.Conv2d will be used. Default: None.

  • norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: None.

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

  • order (tuple[str], optional) – The order of layers. It is expected to be a sequence of 'conv', 'norm', and 'act'. Default: ('conv', 'norm', 'act').

nncore.nn.modules.build_conv_modules(dims, kernels, last_norm=False, last_act=False, default=None, **kwargs)[source]

Build a sequential module list containing convolution, normalization, and activation layers.

Parameters:
  • dims (list[int]) – The sequence of channel dimensions.

  • kernels (list[int] | int) – The size or list of sizes of the convolution kernels.

  • last_norm (bool, optional) – Whether to add a normalization layer after the last convolution layer. Default: False.

  • last_act (bool, optional) – Whether to add an activation layer after the last convolution layer. Default: False.

  • default (any, optional) – The default value to return when dims is not valid. Default: None.

Returns:

The constructed module.

Return type:

nn.Sequential | ConvModule

class nncore.nn.modules.LinearModule(*args: Any, **kwargs: Any)[source]

A module that bundles linear, normalization, and activation layers.

Parameters:
  • in_features (int) – Number of input features.

  • out_features (int) – Number of output features.

  • bias (str | bool, optional) – Whether to add the bias term in the linear layer. If bias='auto', the module will decide it automatically based on whether it has a normalization layer. Default: 'auto'.

  • norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: None.

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

  • order (tuple[str], optional) – The order of layers. It is expected to be a sequence of 'linear', 'norm', and 'act'. Default: ('linear', 'norm', 'act').

nncore.nn.modules.build_linear_modules(dims, last_norm=False, last_act=False, default=None, **kwargs)[source]

Build a multi-layer perceptron (MLP).

Parameters:
  • dims (list[int]) – The sequence of feature dimensions.

  • last_norm (bool, optional) – Whether to add a normalization layer after the last linear layer. Default: False.

  • last_act (bool, optional) – Whether to add an activation layer after the last linear layer. Default: False.

  • default (any, optional) – The default value to return when dims is not valid. Default: None.

Returns:

The constructed module.

Return type:

nn.Sequential | LinearModule
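
For example, a sketch of a two-layer MLP with LayerNorm and ReLU between the layers but not after the last one (extra keyword arguments are assumed to be forwarded to LinearModule):

    from nncore.nn.modules import build_linear_modules

    # dims [256, 128, 10] yields two linear layers: 256 -> 128 -> 10
    mlp = build_linear_modules([256, 128, 10], norm_cfg=dict(type='LN'))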

class nncore.nn.modules.MsgPassModule(*args: Any, **kwargs: Any)[source]

A module that bundles message passing, normalization, and activation layers.

Parameters:
  • in_features (int) – Number of input features.

  • out_features (int) – Number of output features.

  • bias (str | bool, optional) – Whether to add the bias term in the message passing layer. If bias='auto', the module will decide it automatically based on whether it has a normalization layer. Default: 'auto'.

  • msg_pass_cfg (dict | str | None, optional) – The config or name of the message passing layer. If not specified, GCN will be used. Default: None.

  • norm_cfg (dict | str | None, optional) – The config or name of the normalization layer. Default: None.

  • act_cfg (dict | str | None, optional) – The config or name of the activation layer. Default: dict(type='ReLU', inplace=True).

  • order (tuple[str], optional) – The order of layers. It is expected to be a sequence of 'msg_pass', 'norm', and 'act'. Default: ('msg_pass', 'norm', 'act').

forward(x, graph)[source]
Parameters:
  • x (torch.Tensor[N, M]) – The input node features.

  • graph (torch.Tensor[N, N]) – The graph structure where graph[i, j] == n (n > 0) means there is a link with weight n from node i to node j, while graph[i, j] == 0 means there is no link.

nncore.nn.modules.build_msg_pass_modules(dims, last_norm=False, last_act=False, default=None, **kwargs)[source]

Build a module list containing message passing, normalization, and activation layers.

Parameters:
  • dims (list[int]) – The sequence of feature dimensions.

  • last_norm (bool, optional) – Whether to add a normalization layer after the last message passing layer. Default: False.

  • last_act (bool, optional) – Whether to add an activation layer after the last message passing layer. Default: False.

  • default (any, optional) – The default value to return when dims is not valid. Default: None.

Returns:

The constructed module list.

Return type:

nn.ModuleList

Builder

nncore.nn.builder.build_model(cfg, *args, bundler='sequential', dist=None, **kwargs)[source]

Build a general model from a dict or str. This method searches for modules in MODELS first, and then falls back to torch.nn.

Parameters:
  • cfg (dict | str) – The config or name of the model.

  • bundler (str | None, optional) – The type of bundler for multiple models. Expected values include 'sequential' and 'modulelist'. Default: 'sequential'.

  • dist (bool | None, optional) – Whether the model is distributed. If not specified, the model will not be wrapped. Default: None.

Returns:

The constructed model.

Return type:

nn.Module
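
A minimal sketch; passing a list of configs to build several bundled models is an assumption based on the bundler argument:

    from nncore.nn.builder import build_model

    # a single module, resolved from MODELS first and then torch.nn
    fc = build_model(dict(type='Linear', in_features=16, out_features=4))

    # multiple configs, assumed to be bundled into an nn.Sequential
    net = build_model([
        dict(type='Linear', in_features=16, out_features=16),
        dict(type='ReLU', inplace=True),
    ], bundler='sequential')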

nncore.nn.builder.build_act_layer(cfg, *args, **kwargs)[source]

Build an activation layer from a dict or str. This method searches for layers in ACTIVATIONS first, and then falls back to torch.nn.

Parameters:

cfg (dict | str) – The config or name of the layer.

Returns:

The constructed layer.

Return type:

nn.Module

nncore.nn.builder.build_conv_layer(cfg, *args, **kwargs)[source]

Build a convolution layer from a dict or str. This method searches for layers in CONVS first, and then falls back to torch.nn.

Parameters:

cfg (dict | str) – The config or name of the layer.

Returns:

The constructed layer.

Return type:

nn.Module

nncore.nn.builder.build_msg_pass_layer(cfg, *args, **kwargs)[source]

Build a message passing layer from a dict or str. This method searches for layers in MESSAGE_PASSINGS first, and then falls back to torch.nn.

Parameters:

cfg (dict | str) – The config or name of the layer.

Returns:

The constructed layer.

Return type:

nn.Module

nncore.nn.builder.build_norm_layer(cfg, *args, dims=None, **kwargs)[source]

Build a normalization layer from a dict or str. This method searches for layers in NORMS first, and then falls back to torch.nn.

Parameters:
  • cfg (dict | str) – The config or name of the layer.

  • dims (int | None, optional) – The input dimensions of the layer. Default: None.

Returns:

The constructed layer.

Return type:

nn.Module

nncore.nn.builder.build_loss(cfg, *args, **kwargs)[source]

Build a loss module from a dict or str. This method searches for modules in LOSSES first, and then falls back to torch.nn.

Parameters:

cfg (dict | str) – The config or name of the module.

Returns:

The constructed module.

Return type:

nn.Module

Init

nncore.nn.init.constant_init_(module, value=1, bias=0)[source]

Initialize a module using a constant.

Parameters:
  • module (nn.Module) – The module to be initialized.

  • value (int, optional) – The value to be filled. Default: 1.

  • bias (int, optional) – The bias of the module. Default: 0.

nncore.nn.init.normal_init_(module, mean=0, std=1, bias=0)[source]

Initialize a module using a normal distribution.

Parameters:
  • module (nn.Module) – The module to be initialized.

  • mean (int, optional) – Mean of the distribution. Default: 0.

  • std (int, optional) – Standard deviation of the distribution. Default: 1.

  • bias (int, optional) – The bias of the module. Default: 0.

nncore.nn.init.uniform_init_(module, a=0, b=1, bias=0)[source]

Initialize a module using a uniform distribution.

Parameters:
  • module (nn.Module) – The module to be initialized.

  • a (int, optional) – Lower bound of the distribution. Default: 0.

  • b (int, optional) – Upper bound of the distribution. Default: 1.

  • bias (int, optional) – The bias of the module. Default: 0.

nncore.nn.init.xavier_init_(module, gain=1, bias=0, distribution='normal')[source]

Initialize a module using the method introduced in [1].

Parameters:
  • module (nn.Module) – The module to be initialized.

  • gain (int, optional) – The scaling factor. Default: 1.

  • bias (int, optional) – The bias of the module. Default: 0.

  • distribution (str, optional) – The type of distribution to use. Expected values include 'normal' and 'uniform'. Default: 'normal'.

References

  1. Glorot et al. (http://proceedings.mlr.press/v9/glorot10a)

nncore.nn.init.kaiming_init_(module, a=0, mode='fan_in', nonlinearity='leaky_relu', bias=0, distribution='normal')[source]

Initialize a module using the method introduced in [1].

Parameters:
  • module (nn.Module) – The module to be initialized.

  • a (int, optional) – The negative slope of LeakyReLU. Default: 0.

  • mode (str, optional) – Determines which pass preserves the magnitude of the variance of the weights: 'fan_in' preserves it in the forward pass, while 'fan_out' preserves it in the backward pass. Default: 'fan_in'.

  • nonlinearity (str, optional) – The nonlinearity after the parameterized layers. The expected values are 'relu' and 'leaky_relu'. Default: 'leaky_relu'.

  • bias (int, optional) – The bias of the module. Default: 0.

  • distribution (str, optional) – The type of distribution to use. Expected values include 'normal' and 'uniform'. Default: 'normal'.

References

  1. He et al. (https://arxiv.org/abs/1502.01852)
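
A minimal sketch using the documented signature:

    import torch.nn as nn
    from nncore.nn.init import kaiming_init_

    conv = nn.Conv2d(3, 16, kernel_size=3)
    kaiming_init_(conv, mode='fan_out', nonlinearity='relu')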

nncore.nn.init.init_module_(module, method, **kwargs)[source]

Initialize a module using the specified method.

Parameters:
  • module (nn.Module) – The module to be initialized.

  • method (str) – The initialization method. Expected methods include 'constant', 'normal', 'uniform', 'xavier', and 'kaiming'.

Utils

nncore.nn.utils.move_to_device(data, device='cpu')[source]

Recursively move a tensor or a collection of tensors to the specified device.

Parameters:
  • data (dict | list | torch.Tensor) – The tensor or collection of tensors to be moved.

  • device (torch.device | str, optional) – The destination device. Default: 'cpu'.

Returns:

The moved tensor or collection of tensors.

Return type:

dict | list | torch.Tensor
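
A minimal sketch moving a nested batch to the GPU (assuming a CUDA device is available):

    import torch
    from nncore.nn.utils import move_to_device

    batch = dict(inputs=torch.randn(2, 3), labels=[torch.tensor(0)])
    batch = move_to_device(batch, device='cuda')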

nncore.nn.utils.fuse_bn_(model)[source]

During inference, the statistics-tracking functionality of batch norm layers is turned off and only the stored mean and variance are used. This makes it possible to fuse these layers with the preceding convolution or linear layers to simplify the network structure and save computations.

Parameters:

model (nn.Module) – The model whose Conv-BN and Linear-BN structures are to be fused.

Returns:

The model whose Conv-BN and Linear-BN structures have been fused.

Return type:

nn.Module

nncore.nn.utils.update_bn_stats_(model, data_loader, num_iters=200, **kwargs)[source]

Recompute and update the BN stats to make them more precise. During training, both the BN stats and the weights change after every iteration, so the running average cannot precisely reflect the actual stats of the current model. In this function, the BN stats are recomputed with fixed weights to make the running average more precise. Specifically, it computes the true average of per-batch mean/variance instead of the running average.

Parameters:
  • model (nn.Module) –

    The model whose BN stats will be recomputed. Note that:

    1. This function will not alter the training mode of the given model. Users are responsible for setting the layers that need Precise-BN to training mode prior to calling this function.

    2. Be careful if your models contain other stateful layers in addition to BN, i.e. layers whose state can change across forward iterations. This function will alter their state. If you want them to remain unchanged, you need to either pass in a submodule without those layers or back up their states.

  • data_loader (iterator) – The data loader to use.

  • num_iters (int, optional) – Number of iterations to compute the stats. Default: 200.

nncore.nn.utils.publish_model(checkpoint, out='model.pth', keys_to_keep=['state_dict', 'meta'], device='cpu', meta=None, hash_type='sha256', hash_len=8)[source]

Publish a model by removing unnecessary data from the checkpoint, moving the weights to the specified device, and hashing the output model file.

Parameters:
  • checkpoint (dict | str) – The checkpoint or path to the checkpoint.

  • out (str, optional) – Path to the output checkpoint file. Default: 'model.pth'.

  • keys_to_keep (list[str], optional) – The list of keys to be kept from the checkpoint. Default: ['state_dict', 'meta'].

  • device (torch.device | str) – The destination device. Default: 'cpu'.

  • meta (dict | None, optional) – The meta data to be saved. Note that the keys nncore_version and create_time are reserved by the method. Default: None.

  • hash_type (str, optional) – Type of the hash algorithm. Currently supported algorithms include 'md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512', 'blake2b', 'blake2s', 'sha3_224', 'sha3_256', 'sha3_384', 'sha3_512', 'shake_128', and 'shake_256'. Default: 'sha256'.

  • hash_len (int, optional) – Length of the hash value. Default: 8.

nncore.nn.utils.model_soup(model1, model2, out='model.pth', device='cpu')[source]

Combine two models by calculating the element-wise average of their weight matrices (i.e. cooking model soups [1]). The output model is expected to have better performance compared with the original ones.

Parameters:
  • model1 (dict | str) – The checkpoint or path to the checkpoint of the first model.

  • model2 (dict | str) – The checkpoint or path to the checkpoint of the second model.

  • out (str, optional) – Path to the output checkpoint file. Default: 'model.pth'.

  • device (torch.device | str) – The destination device. Default: 'cpu'.

References

  1. Wortsman et al. (https://arxiv.org/abs/2203.05482)