PyTorch Wrappers

Training and inference

dpipe.torch.model.optimizer_step(optimizer: torch.optim.Optimizer, loss: torch.Tensor, scaler: Optional[torch.cuda.amp.GradScaler] = None, clip_grad: Optional[float] = None, accumulate: bool = False, **params) torch.Tensor[source]

Performs the backward pass with respect to loss, as well as a gradient step or gradient accumulation.

If a scaler is passed, it is used to perform the gradient step (automatic mixed precision support). If clip_grad is passed, the gradient is clipped so that its l2 norm does not exceed this value. accumulate indicates whether to perform a gradient step or just accumulate gradients. params is used to change the optimizer’s parameters.

Examples

>>> optimizer = Adam(model.parameters(), lr=1)
>>> optimizer_step(optimizer, loss) # perform a gradient step
>>> optimizer_step(optimizer, loss, lr=1e-3) # set lr to 1e-3 and perform a gradient step
>>> optimizer_step(optimizer, loss, betas=(0, 0)) # set betas to 0 and perform a gradient step
>>> optimizer_step(optimizer, loss, accumulate=True) # perform a gradient accumulation

Notes

The incoming optimizer’s parameters are not restored to their original values.

dpipe.torch.model.train_step(*inputs: ndarray, architecture: torch.nn.Module, criterion: Callable, optimizer: torch.optim.Optimizer, n_targets: int = 1, loss_key: Optional[str] = None, scaler: Optional[torch.cuda.amp.GradScaler] = None, clip_grad: Optional[float] = None, accumulate: bool = False, gradient_accumulation_steps: int = 1, **optimizer_params) ndarray[source]

Performs a forward-backward pass and makes a gradient step or accumulation for the given inputs.

Parameters
  • inputs – input batches. The last n_targets batches are passed to criterion. The remaining batches are fed into the architecture.

  • architecture – the neural network architecture.

  • criterion – the loss function. Returns either a scalar or a dictionary of scalars. In the latter case loss_key must be provided.

  • optimizer

  • n_targets – how many of the trailing inputs should be considered targets.

  • loss_key – in case criterion returns a dictionary of scalars, indicates which key should be used for gradient computation.

  • scaler – a gradient scaler used to operate in automatic mixed precision mode.

  • clip_grad – if passed, the gradient is clipped to this maximum l2 norm.

  • accumulate – whether to accumulate gradients instead of performing an optimizer step.

  • gradient_accumulation_steps

  • optimizer_params – additional parameters that will override the optimizer’s current parameters (e.g. lr).

Notes

Note that neither the inputs nor the output are of type torch.Tensor - the conversion to and from torch.Tensor is performed inside this function.

References

optimizer_step
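
A minimal usage sketch for a single training iteration; the model, batches and hyperparameters below are hypothetical, and any nn.Module, numpy batches and criterion can be used instead:

>>> import numpy as np
>>> import torch.nn as nn
>>> from torch.optim import Adam
>>> from dpipe.torch.model import train_step
>>> model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
>>> optimizer = Adam(model.parameters(), lr=1e-3)
>>> xs = np.random.randn(8, 1, 28, 28).astype(np.float32)  # fed into the architecture
>>> ys = np.random.randint(0, 10, size=8)                   # the last n_targets=1 batch, passed to criterion
>>> loss = train_step(xs, ys, architecture=model, criterion=nn.CrossEntropyLoss(), optimizer=optimizer)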

dpipe.torch.model.inference_step(*inputs: ndarray, architecture: torch.nn.Module, activation: Callable = identity, amp: bool = False) ndarray[source]

Returns the prediction for the given inputs.

Notes

Note that neither the inputs nor the output are of type torch.Tensor - the conversion to and from torch.Tensor is performed inside this function.
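
A usage sketch (the model and batch are hypothetical):

>>> from functools import partial
>>> import numpy as np
>>> import torch
>>> from dpipe.torch.model import inference_step
>>> model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
>>> xs = np.random.randn(8, 1, 28, 28).astype(np.float32)
>>> probas = inference_step(xs, architecture=model, activation=partial(torch.softmax, dim=1))
>>> probas.shape
(8, 10)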

dpipe.torch.model.multi_inference_step(*inputs: ndarray, architecture: torch.nn.Module, activations: Union[Callable, Sequence[Optional[Callable]]] = identity, amp: bool = False) list[source]

Returns the prediction for the given inputs.

The architecture is expected to return a sequence of torch.Tensor objects.

Notes

Note that neither the inputs nor the output are of type torch.Tensor - the conversion to and from torch.Tensor is performed inside this function.
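
A sketch with a hypothetical two-headed architecture; any module returning a sequence of tensors works, and a single callable passed as activations is assumed to be applied to every output:

>>> import numpy as np
>>> import torch
>>> from dpipe.torch.model import multi_inference_step
>>> class TwoHeads(torch.nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.conv = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)
...     def forward(self, x):
...         features = self.conv(x)
...         return features, features.mean(dim=(1, 2, 3))
>>> xs = np.random.randn(8, 3, 16, 16).astype(np.float32)
>>> masks, scores = multi_inference_step(xs, architecture=TwoHeads(), activations=torch.sigmoid)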

Loss functions

dpipe.torch.functional.focal_loss_with_logits(logits: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, gamma: float = 2, alpha: float = 0.25, reduce: Optional[Callable] = torch.mean)[source]

Function that measures Focal Loss between target and output logits.

Parameters
  • logits (torch.Tensor) – tensor of an arbitrary shape.

  • target (torch.Tensor) – tensor of the same shape as logits.

  • weight (torch.Tensor, None, optional) – a manual rescaling weight. Must be broadcastable to logits.

  • gamma (float) – the power of focal loss factor. Defaults to 2.

  • alpha (float, None, optional) – weighting factor of the focal loss. If None, no weighting will be performed. Defaults to 0.25.

  • reduce (Callable, None, optional) – the reduction operation to be applied to the final loss. Defaults to torch.mean. If None, no reduction will be performed.

References

Focal Loss
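
A small sketch with random binary-segmentation-style tensors (the shapes are arbitrary):

>>> import torch
>>> from dpipe.torch.functional import focal_loss_with_logits
>>> logits = torch.randn(4, 1, 32, 32)
>>> target = torch.randint(0, 2, (4, 1, 32, 32)).float()
>>> loss = focal_loss_with_logits(logits, target)                    # reduced with torch.mean
>>> unreduced = focal_loss_with_logits(logits, target, reduce=None)  # element-wise loss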

dpipe.torch.functional.linear_focal_loss_with_logits(logits: torch.Tensor, target: torch.Tensor, gamma: float, beta: float, weight: Optional[torch.Tensor] = None, reduce: Optional[Callable] = torch.mean)[source]

Function that measures Linear Focal Loss between target and output logits. Equivalent to BinaryCrossEntropy(gamma * logits + beta, target, weights).

Parameters
  • logits (torch.Tensor) – tensor of an arbitrary shape.

  • target (torch.Tensor) – tensor of the same shape as logits.

  • gamma (float) – multiplication coefficient for logits tensor.

  • beta (float) – coefficient to be added to all the elements in logits tensor.

  • weight (torch.Tensor) – a manual rescaling weight. Must be broadcastable to logits.

  • reduce (Callable, None, optional) – the reduction operation to be applied to the final loss. Defaults to torch.mean. If None - no reduction will be performed.

References

Focal Loss

dpipe.torch.functional.weighted_cross_entropy_with_logits(logit: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, alpha: float = 1, adaptive: bool = False, reduce: Optional[Callable] = torch.mean)[source]

Function that measures Binary Cross Entropy between target and output logits. This version of BCE has additional options of constant or adaptive weighting of positive examples.

Parameters
  • logit (torch.Tensor) – tensor of an arbitrary shape.

  • target (torch.Tensor) – tensor of the same shape as logits.

  • weight (torch.Tensor) – a manual rescaling weight. Must be broadcastable to logits.

  • alpha (float, optional) – a weight for the positive class examples.

  • adaptive (bool, optional) – If True, uses the adaptive weight [N - sum(p_i)] / sum(p_i) for positive class examples.

  • reduce (Callable, None, optional) – the reduction operation to be applied to the final loss. Defaults to torch.mean. If None - no reduction will be performed.

References

WCE
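
A usage sketch with hypothetical tensors:

>>> import torch
>>> from dpipe.torch.functional import weighted_cross_entropy_with_logits
>>> logits = torch.randn(4, 1, 32, 32)
>>> target = torch.randint(0, 2, (4, 1, 32, 32)).float()
>>> loss = weighted_cross_entropy_with_logits(logits, target, alpha=5)        # constant positive weight
>>> loss = weighted_cross_entropy_with_logits(logits, target, adaptive=True)  # weight derived from the predictions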

dpipe.torch.functional.tversky_loss(pred: torch.Tensor, target: torch.Tensor, alpha=0.5, epsilon=1e-07, reduce: Optional[Callable] = torch.mean)[source]

References

Tversky Loss (https://arxiv.org/abs/1706.05721)

dpipe.torch.functional.focal_tversky_loss(pred: torch.Tensor, target: torch.Tensor, gamma=1.3333333333333333, alpha=0.5, epsilon=1e-07)[source]

References

Focal Tversky Loss (https://arxiv.org/abs/1810.07842)

dpipe.torch.functional.dice_loss(pred: torch.Tensor, target: torch.Tensor, epsilon=1e-07)[source]

References

Dice Loss
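
A usage sketch; pred is assumed to contain probabilities (e.g. after a sigmoid) rather than logits, since the function name carries no _with_logits suffix:

>>> import torch
>>> from dpipe.torch.functional import dice_loss
>>> pred = torch.rand(4, 1, 32, 32)                        # probabilities
>>> target = torch.randint(0, 2, (4, 1, 32, 32)).float()
>>> loss = dice_loss(pred, target)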

dpipe.torch.functional.masked_loss(mask: torch.Tensor, criterion: Callable, prediction: torch.Tensor, target: torch.Tensor, **kwargs)[source]

Calculates the criterion between the masked prediction and target. kwargs are passed to criterion as additional arguments.

If the mask is empty - returns 0 wrapped in a torch tensor.
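
A sketch that restricts a mean squared error criterion to the region selected by a boolean mask (all tensors are hypothetical):

>>> import torch
>>> import torch.nn.functional as F
>>> from dpipe.torch.functional import masked_loss
>>> prediction = torch.randn(2, 1, 16, 16)
>>> target = torch.randn(2, 1, 16, 16)
>>> mask = target > 0                      # boolean mask of the same shape
>>> loss = masked_loss(mask, F.mse_loss, prediction, target)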

dpipe.torch.functional.moveaxis(x: torch.Tensor, source: Union[int, Sequence[int]], destination: Union[int, Sequence[int]])[source]

Move axes of a torch.Tensor to new positions. Other axes remain in their original order.
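
Assuming numpy.moveaxis semantics, for example:

>>> import torch
>>> from dpipe.torch.functional import moveaxis
>>> x = torch.zeros(2, 3, 4)
>>> moveaxis(x, 0, -1).shape
torch.Size([3, 4, 2])
>>> moveaxis(x, (0, 1), (1, 0)).shape
torch.Size([3, 2, 4])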

dpipe.torch.functional.softmax(x: torch.Tensor, axis: Union[int, Sequence[int]])[source]

A multidimensional version of softmax.
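
For example, applying softmax jointly over the channel and depth axes (a sketch; the result is assumed to sum to one over the given axes):

>>> import torch
>>> from dpipe.torch.functional import softmax
>>> x = torch.randn(2, 3, 4, 5)
>>> probas = softmax(x, axis=(1, 2))
>>> probas.sum(dim=(1, 2))      # approximately all ones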

Utils

dpipe.torch.utils.load_model_state(module: torch.nn.Module, path: Union[Path, str], modify_state_fn: Optional[Callable] = None, strict: bool = True)[source]

Updates the module’s state dict with the one located at path.

Parameters
  • module (nn.Module) –

  • path (PathLike) –

  • modify_state_fn (Callable(current_state, state_to_load)) – if not None, two arguments will be passed to the function: the current state of the model and the state loaded from the path. The function should modify the states as needed and return the final state to load. For example, it can help transfer weights from a similar but not identical architecture; see the sketch after this parameter list.

  • strict (bool) –
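
A hedged sketch of a modify_state_fn that keeps the freshly initialized weights of the final layer while loading everything else from the checkpoint; the parameter names 'head.weight' and 'head.bias', the model and the path are hypothetical:

>>> def keep_new_head(current_state, state_to_load):
...     state_to_load = dict(state_to_load)
...     for key in ('head.weight', 'head.bias'):   # hypothetical parameter names
...         state_to_load[key] = current_state[key]
...     return state_to_load
>>> load_model_state(model, 'checkpoint.pth', modify_state_fn=keep_new_head)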

dpipe.torch.utils.save_model_state(module: torch.nn.Module, path: Union[Path, str])[source]

Saves the module’s state dict to path.

dpipe.torch.utils.get_device(x: Optional[Union[torch.device, torch.nn.Module, torch.Tensor, str]] = None) torch.device[source]

Determines the correct device based on the input.

Parameters

x (torch.device, torch.nn.Module, torch.Tensor, str, None) –

if torch.Tensor - returns the device on which it is located
if torch.nn.Module - returns the device on which its parameters are located
if str or torch.device - returns torch.device(x)
if None - same as ‘cuda’ if CUDA is available, ‘cpu’ otherwise.
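
For example:

>>> import torch
>>> from dpipe.torch.utils import get_device
>>> get_device('cpu')
device(type='cpu')
>>> get_device(torch.zeros(1))
device(type='cpu')
>>> get_device(torch.nn.Linear(2, 2))
device(type='cpu')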

dpipe.torch.utils.to_device(x: Union[torch.nn.Module, torch.Tensor], device: Optional[Union[torch.device, torch.nn.Module, torch.Tensor, str]] = 'cpu')[source]

Move x to device.

Parameters
  • x

  • device – the device on which to move x. See get_device for details.

dpipe.torch.utils.to_cuda(x, cuda: Optional[Union[torch.nn.Module, torch.Tensor, bool]] = None)[source]

Move x to cuda if specified.

Parameters
  • x

  • cuda – whether to move to cuda. If None, torch.cuda.is_available() is used to determine that.

dpipe.torch.utils.to_var(*arrays: Union[Iterable, int, float], device: Union[torch.device, torch.nn.Module, torch.Tensor, str] = 'cpu', requires_grad: bool = False)[source]

Convert numpy arrays to torch Tensors.

Parameters
  • arrays (array-like) – objects that will be converted to torch Tensors.

  • device – the device to which the resulting tensors are moved. See get_device for details.

  • requires_grad – whether the tensors require grad.

Notes

If arrays contains a single argument the result will not be contained in a tuple:

>>> x = to_var(x)
>>> x, y = to_var(x, y)

If this is not the desired behaviour, use sequence_to_var, which always returns a tuple of tensors.

dpipe.torch.utils.to_np(*tensors: torch.Tensor)[source]

Convert torch Tensors to numpy arrays.

Notes

If tensors contains a single argument the result will not be contained in a tuple:

>>> x = to_np(x)
>>> x, y = to_np(x, y)

If this is not the desired behaviour, use sequence_to_np, which always returns a tuple of arrays.

dpipe.torch.utils.set_params(optimizer: torch.optim.Optimizer, **params) torch.optim.Optimizer[source]

Change an optimizer’s parameters to the ones passed in params.

dpipe.torch.utils.set_lr(optimizer: torch.optim.Optimizer, lr: float) torch.optim.Optimizer[source]

Change an optimizer’s learning rate to lr.
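
For example (a sketch; the model is hypothetical, and any torch optimizer works):

>>> from torch.optim import Adam
>>> from dpipe.torch.utils import set_params, set_lr
>>> optimizer = Adam(model.parameters(), lr=1e-3)
>>> optimizer = set_params(optimizer, lr=3e-4, weight_decay=1e-5)
>>> optimizer = set_lr(optimizer, 1e-4)    # same effect as set_params(optimizer, lr=1e-4)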

dpipe.torch.utils.get_parameters(optimizer: torch.optim.Optimizer) Iterator[torch.nn.parameter.Parameter][source]

Returns an iterator over model parameters stored in optimizer.

dpipe.torch.utils.has_batchnorm(architecture: torch.nn.Module) bool[source]

Check whether architecture contains a BatchNorm module.

dpipe.torch.utils.order_to_mode(order: int, dim: int)[source]

Converts the order of interpolation to a “mode” string.

Examples

>>> order_to_mode(1, 3)
'trilinear'