DeepPipe documentation

DeepPipe is a collection of utils for deep learning experiments mainly aimed at medical imaging applications.

Contents

Experiments management

Layouts

class dpipe.layout.base.Flat(split: Iterable[Sequence], prefixes: Sequence[str] = ('train', 'val', 'test'))[source]

Bases: Layout

Generates an experiment with a ‘flat’ structure. Creates a subdirectory of experiment_path for each entry of split. Each subdirectory contains the corresponding structure of identifiers.

Also, the config file from config_path is copied to experiment_path/resources.config.

Parameters
  • split – an iterable with groups of ids.

  • prefixes (Sequence[str]) – the corresponding prefixes for each identifier group of split which will be used to generate appropriate filenames. Default is ('train', 'val', 'test').

Examples

>>> ids = [
>>>     [[1, 2, 3], [4, 5, 6], [7, 8]],
>>>     [[1, 4, 8], [7, 5, 2], [6, 3]],
>>> ]
>>> Flat(ids).build('some_path.config', 'experiments/base')
# resulting folder structure:
# experiments/base:
#   - resources.config
#   - experiment_0:
#       - train_ids.json # 1, 2, 3
#       - val_ids.json # 4, 5, 6
#       - test_ids.json # 7, 8
#   - experiment_1:
#       - train_ids.json # 1, 4, 8
#       - val_ids.json # 7, 5, 2
#       - test_ids.json # 6, 3
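The folder structure above can be reproduced with a few lines of standard-library code. This is only a sketch of the resulting layout, not DeepPipe's implementation: build_flat_layout is a hypothetical helper, and copying the config file to resources.config is left out.

```python
import json
import tempfile
from pathlib import Path


def build_flat_layout(split, root, prefixes=('train', 'val', 'test')):
    """Write one experiment_<i> folder per entry of split (hypothetical helper)."""
    root = Path(root)
    for i, groups in enumerate(split):
        folder = root / f'experiment_{i}'
        folder.mkdir(parents=True, exist_ok=True)
        # one <prefix>_ids.json file per group of identifiers
        for prefix, group in zip(prefixes, groups):
            (folder / f'{prefix}_ids.json').write_text(json.dumps(list(group)))
    return root


ids = [
    [[1, 2, 3], [4, 5, 6], [7, 8]],
    [[1, 4, 8], [7, 5, 2], [6, 3]],
]
layout_root = build_flat_layout(ids, tempfile.mkdtemp())
print(sorted(p.name for p in (layout_root / 'experiment_0').iterdir()))
```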

Splitters

Train-val-test
dpipe.split.cv.split(ids, *, n_splits, random_state=42)[source]
dpipe.split.cv.leave_group_out(ids, groups, *, val_size=None, random_state=42)[source]

Leave one group out CV. Validation subset will be selected randomly.

dpipe.split.cv.train_val_test_split(ids, *, val_size, n_splits, random_state=42)[source]

Splits the dataset’s ids into triplets (train, validation, test). The test ids are determined as in the standard K-fold cross-validation setting: for each fold a different portion of 1/K ids is kept for testing. The remaining (K - 1) / K ids are split into train and validation sets according to val_size.

Parameters
  • ids

  • val_size (float, int) – If float, should be between 0.0 and 1.0 and represents the proportion of the train set to include in the validation set. If int, represents the absolute number of validation samples.

  • n_splits (int) – the number of cross-validation folds.

Returns

splits

Return type

Sequence of triplets
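The K-fold scheme described above can be sketched in plain Python. The function name train_val_test_sketch is hypothetical, and two details are assumptions rather than documented behaviour: a float val_size is read as a fraction of the ids left after removing the test fold, and the validation ids are simply the first ones of the shuffled remainder.

```python
import random


def train_val_test_sketch(ids, *, val_size, n_splits, random_state=42):
    """Return a list of (train, val, test) triplets, one per fold."""
    ids = list(ids)
    random.Random(random_state).shuffle(ids)
    # K disjoint test folds, each holding roughly 1 / K of the ids
    folds = [ids[k::n_splits] for k in range(n_splits)]
    triplets = []
    for k, test in enumerate(folds):
        rest = [i for j, fold in enumerate(folds) if j != k for i in fold]
        # assumption: a float val_size is a fraction of the remaining ids
        n_val = val_size if isinstance(val_size, int) else round(val_size * len(rest))
        triplets.append((rest[n_val:], rest[:n_val], test))
    return triplets


splits = train_val_test_sketch(range(10), val_size=2, n_splits=5)
for train, val, test in splits:
    print(len(train), len(val), len(test))
```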

dpipe.split.cv.group_train_val_test_split(ids: Sequence, groups: Union[Callable, Sequence], *, val_size, n_splits, random_state=42)[source]

Splits the dataset’s ids into triplets (train, validation, test) keeping all the objects from a group in the same set (either train, validation or test). The test ids are determined as in the standard K-fold cross-validation setting: for each fold a different portion of 1 / K ids is kept for testing. The remaining (K - 1) / K ids are split into train and validation sets according to val_size.

The splitter guarantees that no objects belonging to the same group will end up in different sets.

Parameters
  • ids

  • groups (np.ndarray[int]) –

  • val_size (float, int) – If float, should be between 0.0 and 1.0 and represents the proportion of the train set to include in the validation set. If int, represents the absolute number of validation samples.

  • n_splits (int) – the number of cross-validation folds
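The group guarantee can be illustrated by splitting the groups rather than the ids and then mapping back. This is a sketch under simplifying assumptions: group_test_folds_sketch is a hypothetical name, groups is given as a sequence (not a callable), only the test folds are built, and no shuffling is done.

```python
from collections import defaultdict


def group_test_folds_sketch(ids, groups, n_splits):
    """Build test folds so that all ids of one group share a fold."""
    by_group = defaultdict(list)
    for i, g in zip(ids, groups):
        by_group[g].append(i)
    unique_groups = sorted(by_group)
    # split the groups into folds, then expand each group back into its ids
    group_folds = [unique_groups[k::n_splits] for k in range(n_splits)]
    return [[i for g in fold for i in by_group[g]] for fold in group_folds]


test_folds = group_test_folds_sketch(range(6), ['a', 'a', 'b', 'b', 'c', 'c'], n_splits=3)
print(test_folds)
```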

dpipe.split.cv.stratified_train_val_test_split(ids: Sequence, labels: Union[Callable, Sequence], *, val_size, n_splits, random_state=42)[source]

Imaging utils

Preprocessing

dpipe.im.preprocessing.normalize(x: ndarray, mean: bool = True, std: bool = True, percentiles: Optional[Union[float, Sequence[float]]] = None, axis: Optional[Union[int, Sequence[int]]] = None, dtype=None) ndarray[source]

Normalize x’s values to make mean and std independently along axes equal to 0 and 1 respectively (if specified).

Parameters
  • x

  • mean – whether to make mean == zero

  • std – whether to make std == 1

  • percentiles – if a pair (a, b) - the percentiles between which mean and/or std will be estimated; if a scalar (s) - same as (s, 100 - s); if None - same as (0, 100).

  • axis – axes along which mean and/or std will be estimated independently. If None - the statistics will be estimated globally.

  • dtype – the dtype of the output.
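The effect of percentiles can be sketched with NumPy for the global case (axis=None). normalize_sketch is a hypothetical name; the sketch assumes that values outside the percentile range are only excluded from the statistics, while the whole array is still shifted and scaled, and it does not guard against a zero standard deviation.

```python
import numpy as np


def normalize_sketch(x, percentiles=None):
    """Zero-mean / unit-std normalization with statistics taken between percentiles."""
    x = np.asarray(x, dtype=float)
    if percentiles is None:
        low, high = 0, 100
    elif np.isscalar(percentiles):
        low, high = percentiles, 100 - percentiles
    else:
        low, high = percentiles
    lo, hi = np.percentile(x, [low, high])
    # estimate the statistics only from the values inside [lo, hi]
    inside = x[(x >= lo) & (x <= hi)]
    return (x - inside.mean()) / inside.std()


plain = normalize_sketch(np.arange(10.0))
robust = normalize_sketch([0.0, 1.0, 2.0, 3.0, 1000.0], percentiles=(0, 80))
```

In the second call the outlier 1000 does not influence the estimated mean and std, so the first four values end up centred around zero.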

dpipe.im.preprocessing.min_max_scale(x: ndarray, axis: Optional[Union[int, Sequence[int]]] = None) ndarray[source]

Scale x’s values so that its minimum and maximum become 0 and 1 respectively independently along axes.

dpipe.im.preprocessing.bytescale(x: ndarray) ndarray[source]

Scales x’s values so that its minimum and maximum become 0 and 255 respectively. Afterwards converts it to uint8.
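Both scalings can be written in a couple of NumPy lines. The names below are hypothetical, the sketch works globally (no axis argument), assumes the input is not constant, and rounds to the nearest integer before the uint8 conversion, which may differ from the library.

```python
import numpy as np


def min_max_scale_sketch(x):
    """Map the global minimum to 0 and the global maximum to 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())


def bytescale_sketch(x):
    """Map the value range to [0, 255] and convert to uint8."""
    return np.round(255 * min_max_scale_sketch(x)).astype(np.uint8)


scaled = bytescale_sketch([2, 4, 10])
```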

dpipe.im.preprocessing.describe_connected_components(mask: ndarray, background: int = 0, drop_background: bool = True)[source]

Get the connected components of mask as well as their labels and volumes.

Parameters
  • mask

  • background – the label of the background. The pixels with this label will be marked as the background component (even if it is not connected).

  • drop_background – whether to exclude the background from the returned components’ descriptions.

Returns

  • labeled_mask – array of the same shape as mask.

  • labels – a list of labels from the labeled_mask. The background label is always 0. The labels are sorted according to their corresponding volumes.

  • volumes – a list of corresponding labels’ volumes.

dpipe.im.preprocessing.get_greatest_component(mask: ndarray, background: int = 0, drop_background: bool = True) ndarray[source]

Get the greatest connected component from mask. See describe_connected_components for details.

Shape operations

dpipe.im.shape_ops.zoom(x: ndarray, scale_factor: Union[float, Sequence[float]], axis: Optional[Union[int, Sequence[int]]] = None, order: int = 1, fill_value: Union[float, Callable] = 0, num_threads: int = -1, backend: Optional[Union[str, Backend, Type[Backend]]] = None) ndarray[source]

Rescale x according to scale_factor along the axis.

Uses a fast parallelizable implementation for fp32 / fp64 (and bool-int16-32-64 if order == 0) inputs, ndim <= 4 and order = 0 or 1.

Parameters
  • x (np.ndarray) – n-dimensional array

  • scale_factor (AxesParams) – float or sequence of floats describing how to scale along axes

  • axis (AxesLike) – axis along which array will be scaled

  • order (int) – order of interpolation

  • fill_value (float | Callable) – value to fill past edges. If Callable (e.g. numpy.min) - fill_value(x) will be used

  • num_threads (int) – the number of threads to use for computation. Default = the cpu count. If a negative value is passed, cpu count + num_threads + 1 threads will be used

  • backend (BackendLike) – which backend to use. numba, cython and scipy are available, cython is used by default

Returns

zoomed – zoomed array

Return type

np.ndarray

Examples

>>> zoomed = zoom(x, 2, axis=[0, 1])  # 3d array
>>> zoomed = zoom(x, [1, 2, 3])  # different scales along each axes
>>> zoomed = zoom(x.astype(int), 2)  # will fall back to scipy's implementation because of int dtype
dpipe.im.shape_ops.zoom_to_shape(x: ndarray, shape: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, order: int = 1, fill_value: Union[float, Callable] = 0, num_threads: int = -1, backend: Optional[Union[str, Backend, Type[Backend]]] = None) ndarray[source]

Rescale x to match shape along the axis.

Uses a fast parallelizable implementation for fp32 / fp64 (and bool-int16-32-64 if order == 0) inputs, ndim <= 4 and order = 0 or 1.

Parameters
  • x (np.ndarray) – n-dimensional array

  • shape (AxesLike) – int or sequence of ints describing desired lengths along axes

  • axis (AxesLike) – axis along which array will be scaled

  • order (int) – order of interpolation

  • fill_value (float | Callable) – value to fill past edges. If Callable (e.g. numpy.min) - fill_value(x) will be used

  • num_threads (int) – the number of threads to use for computation. Default = the cpu count. If a negative value is passed, cpu count + num_threads + 1 threads will be used

  • backend (BackendLike) – which backend to use. numba, cython and scipy are available, cython is used by default

Returns

zoomed – zoomed array

Return type

np.ndarray

Examples

>>> zoomed = zoom_to_shape(x, [3, 4, 5])  # 3d array
>>> zoomed = zoom_to_shape(x, [6, 7], axis=[1, 2])  # zoom to shape along specified axes
>>> zoomed = zoom_to_shape(x.astype(int), [3, 4, 5])  # will fall back to scipy's implementation because of int dtype
dpipe.im.shape_ops.proportional_zoom_to_shape(x: ndarray, shape: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Union[float, Sequence[float], Callable] = 0, order: int = 1) ndarray[source]

Proportionally rescale x to fit shape along axes then pad it to that shape.

Parameters
  • x

  • shape – final shape.

  • axis – axes along which x will be padded. If None - the last len(shape) axes are used.

  • padding_values – values to pad with.

  • order – order of interpolation.

dpipe.im.shape_ops.crop_to_shape(x: ndarray, shape: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, ratio: Union[float, Sequence[float]] = 0.5) ndarray[source]

Crop x to match shape along axes.

Parameters
  • x

  • shape – final shape.

  • axis – axes along which x will be cropped. If None - the last len(shape) axes are used.

  • ratio – the fraction of the crop that will be applied to the left, 1 - ratio will be applied to the right.
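The role of ratio can be sketched as follows. crop_to_shape_sketch is a hypothetical name, it crops along all axes (no axis argument), and the way the left part is rounded down is an assumption.

```python
import numpy as np


def crop_to_shape_sketch(x, shape, ratio=0.5):
    """Crop every axis of x to the requested length."""
    x = np.asarray(x)
    slices = []
    for old, new in zip(x.shape, shape):
        if new > old:
            raise ValueError('the requested shape exceeds the input shape')
        # ratio of the removed elements is taken from the left side
        start = int(ratio * (old - new))
        slices.append(slice(start, start + new))
    return x[tuple(slices)]


x = np.arange(36).reshape(6, 6)
centered = crop_to_shape_sketch(x, (2, 4))          # same as x[2:4, 1:5]
left_kept = crop_to_shape_sketch(x, (2, 4), ratio=0)  # same as x[:2, :4]
```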

dpipe.im.shape_ops.crop_to_box(x: ndarray, box: ndarray, axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Optional[Union[float, Sequence[float]]] = None) ndarray[source]

Crop x according to box along axis.

Parameters
  • x (np.ndarray) – n-dimensional array

  • box (np.ndarray) – array of shape (2, x.ndim or len(axis) if axis is passed) describing crop boundaries

  • axis (AxesLike) – axis along which x will be cropped

  • padding_values (AxesParams) – values to pad with if box exceeds the input’s limits

Returns

cropped – cropped array

Return type

np.ndarray

Examples

>>> x  # array of shape [2, 3, 4]
>>> cropped = crop_to_box(x, np.array([[0, 0, 0], [1, 1, 1]]))  # crop to shape [1, 1, 1]
>>> cropped = crop_to_box(x, np.array([[0, 0, 0], [5, 5, 5]]))  # fail, box exceeds the input's limits
>>> cropped = crop_to_box(x, np.array([[0], [5]]), axis=0, padding_values=0)  # pad with 0-s to shape [5, 3, 4]
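A box is a (2, n) array of start and stop coordinates, so cropping without padding reduces to building slices. The sketch below uses a hypothetical name, covers only the case where the box spans all axes, and raises instead of padding when the box exceeds the array.

```python
import numpy as np


def crop_to_box_sketch(x, box):
    """Crop x to the half-open box [start, stop) given as a (2, x.ndim) array."""
    start, stop = np.asarray(box)
    if (start < 0).any() or (stop > np.array(x.shape)).any():
        raise ValueError('the box exceeds the input limits')
    return x[tuple(slice(a, b) for a, b in zip(start, stop))]


x = np.zeros((2, 3, 4))
cropped = crop_to_box_sketch(x, np.array([[0, 0, 0], [1, 1, 1]]))
print(cropped.shape)
```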
dpipe.im.shape_ops.restore_crop(x: ndarray, box: ndarray, shape: Union[int, Sequence[int]], padding_values: Union[float, Sequence[float], Callable] = 0) ndarray[source]

Pad x to match shape. The left padding is taken equal to box’s start.

Parameters
  • x (np.ndarray) – n-dimensional array to pad

  • box (np.ndarray) – array of shape (2, x.ndim) describing crop boundaries

  • shape (AxesLike) – shape to restore crop to

  • padding_values (Union[AxesParams, Callable]) – values to pad with. If Callable (e.g. numpy.min) - padding_values(x) will be used

Returns

padded – padded array

Return type

np.ndarray

Examples

>>> x  # array of shape [2, 3, 4]
>>> padded = restore_crop(x, np.array([[0, 0, 0], [2, 3, 4]]), [4, 4, 4])  # pad to shape [4, 4, 4]
>>> padded = restore_crop(x, np.array([[0, 0, 0], [1, 1, 1]]), [4, 4, 4])  # fail, box is inconsistent with an array
>>> padded = restore_crop(x, np.array([[1, 2, 3], [3, 5, 7]]), [3, 5, 7])  # pad to shape [3, 5, 7]
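Restoring a crop amounts to padding with box's start on the left and shape minus box's stop on the right. A sketch with numpy.pad, using the hypothetical name restore_crop_sketch and a scalar padding value only:

```python
import numpy as np


def restore_crop_sketch(x, box, shape, padding_value=0):
    """Pad x back to shape, placing it at the position described by box."""
    start, stop = np.asarray(box)
    if tuple(stop - start) != x.shape:
        raise ValueError('the box is inconsistent with the array shape')
    # (left, right) padding for every axis
    pad_width = [(a, s - b) for a, b, s in zip(start, stop, shape)]
    return np.pad(x, pad_width, mode='constant', constant_values=padding_value)


x = np.ones((2, 3, 4))
padded = restore_crop_sketch(x, np.array([[1, 2, 3], [3, 5, 7]]), (3, 5, 7))
print(padded.shape)
```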
dpipe.im.shape_ops.pad(x: ndarray, padding: Union[int, Sequence[int], Sequence[Sequence[int]]], axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Union[float, Sequence[float], Callable] = 0) ndarray[source]

Pad x according to padding along the axis.

Parameters
  • x (np.ndarray) – n-dimensional array to pad

  • padding (Union[AxesLike, Sequence[Sequence[int]]]) – if 2D array [[start_1, stop_1], …, [start_n, stop_n]] - specifies individual padding for each axis from axis. The length of the array must either be equal to 1 or match the length of axis. If 1D array [val_1, …, val_n] - same as [[val_1, val_1], …, [val_n, val_n]]. If scalar (val) - same as [[val, val]]

  • axis (AxesLike) – axis along which x will be padded

  • padding_values (Union[AxesParams, Callable]) – values to pad with, must be broadcastable to the resulting array. If Callable (e.g. numpy.min) - padding_values(x) will be used

Returns

padded – padded array

Return type

np.ndarray

Examples

>>> padded = pad(x, 2)  # pad 2 zeros on each side of each axes
>>> padded = pad(x, [1, 1], axis=(-1, -2))  # pad 1 zero on each side of last 2 axes
dpipe.im.shape_ops.pad_to_shape(x: ndarray, shape: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Union[float, Sequence[float], Callable] = 0, ratio: Union[float, Sequence[float]] = 0.5) ndarray[source]

Pad x to match shape along the axis.

Parameters
  • x (np.ndarray) – n-dimensional array to pad

  • shape (AxesLike) – final shape

  • axis (AxesLike) – axis along which x will be padded

  • padding_values (Union[AxesParams, Callable]) – values to pad with, must be broadcastable to the resulting array. If Callable (e.g. numpy.min) - padding_values(x) will be used

  • ratio (AxesParams) – float or sequence of floats describing what proportion of padding to apply on the left sides of padding axes. Remaining ratio of padding will be applied on the right sides

Returns

padded – padded array

Return type

np.ndarray

Examples

>>> padded = pad_to_shape(x, [4, 5, 6])  # pad 3d array
>>> padded = pad_to_shape(x, [4, 5], axis=[0, 1], ratio=0)  # pad first 2 axes on the right
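How ratio distributes the padding between the two sides can be sketched with numpy.pad. pad_to_shape_sketch is a hypothetical name; it pads all axes, accepts a scalar padding value only, and rounds the left part down, which is an assumption.

```python
import numpy as np


def pad_to_shape_sketch(x, shape, ratio=0.5, padding_value=0):
    """Pad every axis of x up to the requested length."""
    pad_width = []
    for old, new in zip(x.shape, shape):
        delta = new - old
        if delta < 0:
            raise ValueError('the requested shape is smaller than the input shape')
        left = int(ratio * delta)  # share of the padding placed on the left
        pad_width.append((left, delta - left))
    return np.pad(x, pad_width, mode='constant', constant_values=padding_value)


x = np.ones((2, 3))
right_padded = pad_to_shape_sketch(x, (4, 5), ratio=0)  # data stays in the top-left corner
left_padded = pad_to_shape_sketch(x, (4, 5), ratio=1)   # data moves to the bottom-right corner
```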
dpipe.im.shape_ops.pad_to_divisible(x: ndarray, divisor: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Union[float, Sequence[float], Callable] = 0, ratio: Union[float, Sequence[float]] = 0.5, remainder: Union[int, Sequence[int]] = 0) ndarray[source]

Pad x to be divisible by divisor along the axis.

Parameters
  • x (np.ndarray) – n-dimensional array to pad

  • divisor (AxesLike) – int or sequence of ints the resulting array shape will be divisible by

  • axis (AxesLike) – axis along which the array will be padded. If None - the last len(divisor) axes are used

  • padding_values (Union[AxesParams, Callable]) – values to pad with. If Callable (e.g. numpy.min) - padding_values(x) will be used

  • ratio (AxesParams) – float or sequence of floats describing what proportion of padding to apply on the left sides of padding axes. Remaining ratio of padding will be applied on the right sides

  • remainder (AxesLike) – x will be padded such that its shape gives the remainder remainder when divided by divisor

Returns

padded – padded array

Return type

np.ndarray

Examples

>>> x  # array of shape [2, 3, 4]
>>> padded = pad_to_divisible(x, 6)  # pad to shape [6, 6, 6]
>>> padded = pad_to_divisible(x, [4, 3], axis=[0, 1], ratio=1)  # pad first 2 axes on the left, shape - [4, 3, 4]
>>> padded = pad_to_divisible(x, 3, remainder=1)  # pad to shape [4, 4, 4]
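The target shapes in the examples above follow from modular arithmetic: each axis grows by the smallest amount that makes its length congruent to remainder modulo divisor. A sketch with a hypothetical helper and a single scalar divisor:

```python
def divisible_shape_sketch(shape, divisor, remainder=0):
    """Smallest shape >= the input whose sizes equal remainder modulo divisor."""
    return tuple(size + (remainder - size) % divisor for size in shape)


print(divisible_shape_sketch((2, 3, 4), 6))
print(divisible_shape_sketch((2, 3, 4), 3, remainder=1))
```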

Data augmentation

dpipe.im.augmentation.elastic_transform(x: ndarray, amplitude: float, axis: Optional[Union[int, Sequence[int]]] = None, order: int = 1)[source]

Apply a gaussian elastic distortion with a given amplitude to a tensor along the given axes.

Metrics

dpipe.im.metrics.dice_score(x: ndarray, y: ndarray) float[source]
dpipe.im.metrics.sensitivity(y_true, y_pred)[source]
dpipe.im.metrics.specificity(y_true, y_pred)[source]
dpipe.im.metrics.precision(y_true, y_pred)[source]
dpipe.im.metrics.recall(y_true, y_pred)[source]
dpipe.im.metrics.iou(x: ndarray, y: ndarray) float[source]
dpipe.im.metrics.assd(x, y, voxel_shape=None)[source]
dpipe.im.metrics.hausdorff_distance(x, y, voxel_shape=None)[source]
dpipe.im.metrics.cross_entropy_with_logits(target: ndarray, logits: ndarray, axis: int = 1, reduce: Optional[Callable] = <function mean>)[source]

A numerically stable cross entropy for numpy arrays. target and logits must have the same shape except for axis.

Parameters
  • target – integer array of shape (d1, …, di, dj, …, dn)

  • logits – array of shape (d1, …, di, k, dj, …, dn)

  • axis – the axis containing the logits for each class: logits.shape[axis] == k

  • reduce – the reduction operation to be applied to the final loss. If None - no reduction will be performed.
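Numerical stability is usually obtained by subtracting the per-sample maximum before exponentiating (the log-sum-exp trick). The sketch below follows the documented shapes but uses a hypothetical name and is not claimed to match the library's code.

```python
import numpy as np


def cross_entropy_with_logits_sketch(target, logits, axis=1, reduce=np.mean):
    """Cross entropy computed from logits via a shifted log-softmax."""
    # move the class axis to the end: (..., k)
    logits = np.moveaxis(np.asarray(logits, dtype=float), axis, -1)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # pick the log-probability of the target class for every element
    index = np.asarray(target, dtype=int)[..., None]
    loss = -np.take_along_axis(log_softmax, index, axis=-1)[..., 0]
    return loss if reduce is None else reduce(loss)


confident = cross_entropy_with_logits_sketch([0, 1], [[1000.0, 0.0], [0.0, 1000.0]])
uniform = cross_entropy_with_logits_sketch([2], np.zeros((1, 3)))
```

A naive softmax would overflow on logits of 1000, while the shifted version stays finite; uniform logits over three classes give a loss of log(3).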

dpipe.im.metrics.convert_to_aggregated(metrics: Dict[str, Callable], aggregate_fn: Callable = <function mean>, key_prefix: str = '', key_suffix: str = '', *args, **kwargs)[source]
dpipe.im.metrics.to_aggregated(metric: Callable, aggregate: Callable = <function mean>, *args, **kwargs)[source]

Converts a metric that receives two values to a metric that receives two sequences and returns an aggregated value.

args and kwargs are passed as additional arguments to aggregate.

Examples

>>> mean_dice = to_aggregated(dice_score)
>>> worst_dice = to_aggregated(dice_score, aggregate=np.min)
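The wrapping can be sketched as a closure. Both names below are hypothetical; in particular dice_score_sketch is a stand-in for the library's dice_score, and returning 1 for two empty masks is an assumption.

```python
import numpy as np


def dice_score_sketch(x, y):
    """Dice coefficient of two binary masks (stand-in for dice_score)."""
    x, y = np.asarray(x, dtype=bool), np.asarray(y, dtype=bool)
    denominator = x.sum() + y.sum()
    # assumption: two empty masks are treated as a perfect match
    return 1.0 if denominator == 0 else 2 * (x & y).sum() / denominator


def to_aggregated_sketch(metric, aggregate=np.mean, *args, **kwargs):
    """Turn a pairwise metric into one that aggregates over two sequences."""
    def wrapper(xs, ys):
        return aggregate([metric(x, y) for x, y in zip(xs, ys)], *args, **kwargs)
    return wrapper


xs = [np.array([1, 1, 0]), np.array([1, 0, 0])]
ys = [np.array([1, 0, 0]), np.array([1, 0, 0])]
mean_dice = to_aggregated_sketch(dice_score_sketch)(xs, ys)
worst_dice = to_aggregated_sketch(dice_score_sketch, aggregate=np.min)(xs, ys)
```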
dpipe.im.metrics.fraction(numerator, denominator, empty_val: float = 1)[source]

Box

Functions to work with boxes: immutable numpy arrays of shape (2, n) which represent the coordinates of the upper left and lower right corners of an n-dimensional rectangle.

In slicing operations, as everywhere in Python, the left corner is inclusive, and the right one is non-inclusive.

dpipe.im.box.make_box_(iterable) ndarray[source]

Returns a box, generated inplace from the iterable. If iterable was a numpy array, will make it immutable and return.

dpipe.im.box.returns_box(func: Callable) Callable[source]

Returns function, decorated so that it returns a box.

dpipe.im.box.get_containing_box(shape: tuple) ndarray[source]

Returns box that contains complete array of shape shape.

dpipe.im.box.broadcast_box(box: ndarray, shape: tuple, dims: tuple) ndarray[source]

Returns box, such that it contains box across dims and whole array with shape shape across other dimensions.

dpipe.im.box.limit_box(box, limit) ndarray[source]

Returns a box, maximum subset of the input box so that start would be non-negative and stop would be limited by the limit.

dpipe.im.box.get_box_padding(box: ndarray, limit)[source]

Returns padding that is necessary to get box from array of shape limit.

Returns padding in numpy form, so it can be given to numpy.pad.
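Limiting a box and computing the padding it would need are two sides of the same clipping arithmetic. The sketch below uses hypothetical names and plain (mutable) NumPy arrays instead of the library's immutable boxes.

```python
import numpy as np


def limit_box_sketch(box, limit):
    """Clip the box so that start >= 0 and stop <= limit."""
    start, stop = np.asarray(box)
    return np.stack([np.maximum(start, 0), np.minimum(stop, limit)])


def box_padding_sketch(box, limit):
    """Padding, in numpy.pad form, that an array of shape limit needs to contain box."""
    start, stop = np.asarray(box)
    before = np.maximum(-start, 0)
    after = np.maximum(stop - np.asarray(limit), 0)
    return np.stack([before, after], axis=1)


box = np.array([[-1, 2], [4, 9]])
limited = limit_box_sketch(box, (5, 8))
padding = box_padding_sketch(box, (5, 8))
padded = np.pad(np.zeros((5, 8)), padding)
```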

dpipe.im.box.add_margin(box: ndarray, margin) ndarray[source]

Returns a box with size increased by the margin (must be broadcastable to the box) compared to the input box.

dpipe.im.box.get_centered_box(center: ndarray, box_size: ndarray) ndarray[source]

Get box of size box_size, centered in the center. If box_size is odd, center will be closer to the right.

dpipe.im.box.mask2bounding_box(mask: ndarray) ndarray[source]

Find the smallest box that contains all true values of the mask.
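One way to compute such a bounding box is through the coordinates of the true values. This is a sketch with a hypothetical name; it assumes the mask contains at least one true value.

```python
import numpy as np


def mask_to_bounding_box_sketch(mask):
    """Smallest half-open box (start, stop) containing all true values of mask."""
    coords = np.argwhere(mask)
    # stop is non-inclusive, hence the + 1
    return np.stack([coords.min(axis=0), coords.max(axis=0) + 1])


mask = np.zeros((5, 6), dtype=bool)
mask[1:3, 2:5] = True
bounding_box = mask_to_bounding_box_sketch(mask)
print(bounding_box)
```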

Grid splitters

Functions for working with patches from tensors. See the Working with patches tutorial for more details.

dpipe.im.grid.get_boxes(shape: Union[int, Sequence[int]], box_size: Union[int, Sequence[int]], stride: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, valid: bool = True) Iterable[ndarray][source]

Yield boxes appropriate for a tensor of shape shape in a convolution-like fashion.

Parameters
  • shape – the input tensor’s shape.

  • box_size

  • axis – axes along which the slices will be taken.

  • stride – the stride (step-size) of the slice.

  • valid – whether boxes of size smaller than box_size should be left out.

References

See the Working with patches tutorial for more details.
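The convolution-like enumeration can be sketched with itertools.product. get_boxes_sketch is a hypothetical name; it covers only the valid=True case, where boxes that would not fit completely are left out, and takes one value per axis for every argument.

```python
from itertools import product

import numpy as np


def get_boxes_sketch(shape, box_size, stride):
    """Yield (2, n) boxes that slide over shape with the given stride."""
    box_size = np.array(box_size)
    starts = [range(0, s - b + 1, st) for s, b, st in zip(shape, box_size, stride)]
    for start in product(*starts):
        start = np.array(start)
        yield np.stack([start, start + box_size])


boxes = list(get_boxes_sketch((4, 4), (2, 2), (2, 2)))
print(len(boxes))
```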

dpipe.im.grid.divide(x: ndarray, patch_size: Union[int, Sequence[int]], stride: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, valid: bool = False, get_boxes: Callable = <function get_boxes>) Iterable[ndarray][source]

A convolution-like approach to generating patches from a tensor.

Parameters
  • x

  • patch_size

  • axis – dimensions along which the slices will be taken.

  • stride – the stride (step-size) of the slice.

  • valid – whether patches of size smaller than patch_size should be left out.

  • get_boxes – function that yields boxes, for signature see get_boxes

References

See the Working with patches tutorial for more details.

dpipe.im.grid.combine(patches: Iterable[ndarray], output_shape: Union[int, Sequence[int]], stride: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, valid: bool = False, combiner: Type[PatchCombiner] = <class 'dpipe.im.grid.Average'>, get_boxes: Callable = <function get_boxes>) ndarray[source]

Build a tensor of shape output_shape from patches obtained in a convolution-like approach with corresponding parameters. The overlapping parts are aggregated using the strategy from combiner - Average by default.

References

See the Working with patches tutorial for more details.
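Averaging the overlapping parts can be sketched by accumulating a running sum and a hit counter. Unlike combine, the hypothetical combine_average_sketch below receives the boxes explicitly instead of deriving them from stride.

```python
import numpy as np


def combine_average_sketch(patches, boxes, output_shape):
    """Rebuild a tensor from patches, averaging wherever they overlap."""
    total = np.zeros(output_shape, dtype=float)
    counts = np.zeros(output_shape, dtype=int)
    for patch, (start, stop) in zip(patches, boxes):
        region = tuple(slice(a, b) for a, b in zip(start, stop))
        total[region] += patch
        counts[region] += 1
    # positions never covered by a patch stay zero
    return total / np.maximum(counts, 1)


x = np.arange(6.0)
boxes = [np.array([[0], [4]]), np.array([[2], [6]])]  # two overlapping 1d boxes
patches = [x[0:4], x[2:6]]
restored = combine_average_sketch(patches, boxes, x.shape)
```

Since both patches agree on the overlap, averaging reproduces the original array.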

class dpipe.im.grid.PatchCombiner(shape: Tuple[int, ...], dtype: dtype)[source]

Bases: object

update(box: ndarray, patch: ndarray)[source]
build() ndarray[source]
class dpipe.im.grid.Average(shape: Tuple[int, ...], dtype: dtype)[source]

Bases: PatchCombiner

update(box: ndarray, patch: ndarray)[source]
build()[source]

Patch

Tools for patch extraction and generation.

dpipe.im.patch.uniform(shape, random_state: Optional[RandomState] = None)[source]
dpipe.im.patch.sample_box_center_uniformly(shape, box_size: array, random_state: Optional[RandomState] = None)[source]

Returns the center of a uniformly sampled box of size box_size, contained in the array of shape shape.

dpipe.im.patch.get_random_patch(*arrays: ndarray, patch_size: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, distribution: Callable = <function uniform>)[source]

Get a random patch of size patch_size along the axes for each of the arrays. The patch position is equal for all the arrays.

Parameters
  • arrays

  • patch_size

  • axis

  • distribution (Callable(shape)) – function that samples a random number in the range [0, n) for each axis. Defaults to a uniform distribution.
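Taking the same patch position from several arrays, for example an image and its mask, can be sketched as follows. random_patch_sketch and its random_state argument are hypothetical; the sketch samples uniformly and works along all axes.

```python
import numpy as np


def random_patch_sketch(*arrays, patch_size, random_state=None):
    """Cut the same randomly positioned patch out of every array."""
    rng = np.random.RandomState(random_state)
    shape = arrays[0].shape
    # a start position is valid if the whole patch fits inside the array
    start = [rng.randint(0, s - p + 1) for s, p in zip(shape, patch_size)]
    region = tuple(slice(a, a + p) for a, p in zip(start, patch_size))
    return [a[region] for a in arrays]


image = np.arange(100).reshape(10, 10)
mask = image % 2 == 0
image_patch, mask_patch = random_patch_sketch(image, mask, patch_size=(4, 4), random_state=0)
```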

dpipe.im.patch.get_random_box(shape: Union[int, Sequence[int]], box_shape: Union[int, Sequence[int]], axis: Union[int, Sequence[int]] = None, distribution: Callable = <function uniform>) ndarray[source]

Get a random box of shape box_shape that fits in the shape along the given axes.

Distributions

Module for calculation of various statistics given a discrete or piecewise-linear distribution.

dpipe.im.dist.weighted_sum(weights: Union[ndarray, torch.Tensor], axis: Union[int, Sequence[int]], values_range: Callable) Union[ndarray, torch.Tensor][source]

Calculates a weighted sum of values returned by values_range with the corresponding weights along a given axis.

Parameters
  • weights

  • axis

  • values_range – takes n as input and returns an array of n values where n = weights.shape[axis].

dpipe.im.dist.expectation(distribution: Union[ndarray, torch.Tensor], axis: int, integral: Callable = <function polynomial>, *args, **kwargs) Union[ndarray, torch.Tensor][source]

Calculates the expectation of a function h given its integral and a distribution.

args and kwargs are passed to integral as additional arguments.

Parameters
  • distribution – the distribution by which the expectation will be calculated. Must sum to 1 along the axis.

  • axis – the axis along which the expectation is calculated.

  • integral – the definite integral of the function h. See polynomial for an example.

Notes

This function calculates the expectation by a piecewise-linear distribution in the range \([0, N]\) where N = distribution.shape[axis] + 1:

\[\mathbb{E}_F[h] = \int\limits_0^N h(x) dF(x) = \sum\limits_{i=0}^{N-1} \int\limits_i^{i+1} h(x) dF(x) = \sum\limits_{i=0}^{N-1} distribution_i \int\limits_i^{i+1} h(x) dx = \sum\limits_{i=0}^{N-1} distribution_i \cdot (H(i+1) - H(i)),\]

where \(distribution_i\) are taken along axis, \(H(i) = \int\limits_0^{i} h(x) dx\) are returned by integral.

References

polynomial
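The last expression of the formula in the Notes, a weighted sum of differences H(i+1) - H(i), can be sketched for a 1d distribution. Both names are hypothetical, and the sketch assumes one weight per unit interval [i, i+1].

```python
import numpy as np


def polynomial_sketch(n, order=1):
    """Integral of x ** order from 0 to i, for i = 0, ..., n - 1."""
    return np.arange(n) ** (order + 1) / (order + 1)


def expectation_sketch(distribution, integral=polynomial_sketch, **kwargs):
    """Sum of distribution_i * (H(i + 1) - H(i)) for a 1d distribution."""
    distribution = np.asarray(distribution, dtype=float)
    # H(0), ..., H(n): one more value than there are weights
    values = integral(len(distribution) + 1, **kwargs)
    return (distribution * np.diff(values)).sum()


print(expectation_sketch([0, 1, 0]))  # all mass on the interval [1, 2]
```

With all the mass on [1, 2] and h(x) = x, the result is the midpoint of that interval.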

dpipe.im.dist.marginal_expectation(distribution: Union[ndarray, torch.Tensor], axis: Union[int, Sequence[int]], integrals: Union[Callable, Sequence[Callable]] = <function polynomial>, *args, **kwargs) list[source]

Computes expectations along the axis according to integrals independently.

args and kwargs are passed to integral as additional arguments.

dpipe.im.dist.polynomial(n: int, order=1) ndarray[source]

The definite integral for a polynomial function of a given order from 0 to n - 1.

Examples

>>> polynomial(10, 1) # x ** 2 / 2 from 0 to 9
array([ 0. ,  0.5,  2. ,  4.5,  8. , 12.5, 18. , 24.5, 32. , 40.5])

Slicing

dpipe.im.slices.iterate_slices(*data: ndarray, axis: int)[source]

Iterate over slices of a series of tensors along a given axis.

dpipe.im.slices.iterate_axis(x: ndarray, axis: int)[source]

Images visualization

dpipe.im.visualize.slice3d(*data: ndarray, axis: int = -1, scale: int = 5, max_columns: Optional[int] = None, colorbar: bool = False, show_axes: bool = False, cmap: Union[Colormap, str] = 'gray', vlim: Optional[Union[float, Sequence[float]]] = None, titles: Optional[Sequence[Optional[str]]] = None)[source]

Creates an interactive plot, simultaneously showing slices along a given axis for all the passed images.

Parameters
  • data

  • axis

  • scale – the figure scale.

  • max_columns – the maximal number of figures in a row. If None - all figures will be in the same row.

  • colorbar – Whether to display a colorbar.

  • show_axes – Whether to display the grid on the image.

  • cmap

  • vlim – used to normalize luminance data. If None - the limits are determined automatically. Must be broadcastable to (len(data), 2). See matplotlib.pyplot.imshow (vmin and vmax) for details.

dpipe.im.visualize.animate3d(*data: ndarray, output_path: Union[Path, str], axis: int = -1, scale: int = 5, max_columns: Optional[int] = None, colorbar: bool = False, show_axes: bool = False, cmap: str = 'gray', vlim=(None, None), fps: int = 30, writer: str = 'imagemagick', repeat: bool = True)[source]

Saves an animation to output_path, simultaneously showing slices along a given axis for all the passed images.

Parameters
  • data (np.ndarray) –

  • output_path (str) –

  • axis (int) –

  • scale (int) – the figure scale.

  • max_columns (int) – the maximal number of figures in a row. If None - all figures will be in the same row.

  • colorbar (bool) – Whether to display a colorbar. Works only if the vlim values are not None.

  • show_axes (bool) – Whether to display the grid on the image.

  • cmap – parameters passed to matplotlib.pyplot.imshow

  • vlim – parameters passed to matplotlib.pyplot.imshow

  • fps (int) –

  • writer (str) –

  • repeat (bool) – whether the animation should repeat when the sequence of frames is completed.

dpipe.im.visualize.default_clip(image, body_organ='Brain')[source]

Clips image (CT) pixels/voxels to ranges, typically used for different body organs.

Parameters
  • image (numpy.array) –

  • body_organ (str) – possible values: Brain, Lungs.

Color space conversion

dpipe.im.hsv.hsv_image(hue, saturation, value)[source]

Creates image in HSV format from HSV data.

dpipe.im.hsv.rgb_from_hsv_data(hue, saturation, value)[source]

Creates image in RGB format from HSV data.

dpipe.im.hsv.gray_image_colored_mask(gray_image, mask, hue)[source]

Creates gray image with colored mask. Keeps intensities intact, so dark areas on gray image will be hard to see even after colorization.

dpipe.im.hsv.gray_image_bright_colored_mask(gray_image, mask, hue)[source]

Creates gray image with colored mask. Changes mask intensities, so dark areas on gray image will be easy to see after colorization.

dpipe.im.hsv.segmentation_probabilities(image, probabilities, hue)[source]
dpipe.im.hsv.masked_segmentation_probabilities(image, probabilities, hue, mask)[source]

Various utils

dpipe.im.utils.apply_along_axes(func: Callable, x: ndarray, axis: Union[int, Sequence[int]], *args, **kwargs)[source]

Apply func to slices from x taken along axes. args and kwargs are passed as additional arguments.

Notes

func must return an array of the same shape as it received.

dpipe.im.utils.build_slices(start: Sequence[int], stop: Optional[Sequence[int]] = None) Tuple[slice, ...][source]

Returns a tuple of slices built from start and stop.

Examples

>>> build_slices([1, 2, 3], [4, 5, 6])
(slice(1, 4), slice(2, 5), slice(3, 6))
>>> build_slices([10, 11])
(slice(10), slice(11))
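The behaviour shown in the examples can be sketched in a few lines; build_slices_sketch is a hypothetical name.

```python
def build_slices_sketch(start, stop=None):
    """Tuple of slices from start to stop; with stop omitted, from 0 to start."""
    if stop is None:
        return tuple(slice(s) for s in start)
    return tuple(slice(a, b) for a, b in zip(start, stop))


region = build_slices_sketch([1, 2, 3], [4, 5, 6])
prefix = build_slices_sketch([10, 11])
```

The resulting tuple can be used directly for indexing, e.g. array[region].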
dpipe.im.utils.composition(func: Callable, *args, **kwargs)[source]

Applies func to the output of the decorated function. args and kwargs are passed as additional positional and keyword arguments respectively.
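Such a decorator is handy for turning generators into lists, tuples or sorted sequences. A sketch with the hypothetical name composition_sketch:

```python
from functools import wraps


def composition_sketch(func, *args, **kwargs):
    """Decorator factory: apply func to the result of the decorated function."""
    def decorator(decorated):
        @wraps(decorated)
        def wrapper(*call_args, **call_kwargs):
            return func(decorated(*call_args, **call_kwargs), *args, **kwargs)
        return wrapper
    return decorator


@composition_sketch(sorted, reverse=True)
def squares(n):
    for i in range(n):
        yield i * i


print(squares(4))
```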

dpipe.im.utils.get_mask_volume(mask: ndarray, *spacing: Union[float, Sequence[float]], location: bool = False) float[source]

Calculates the mask volume given its spatial spacing.

Parameters
  • mask

  • spacing – each value represents the spacing for the corresponding axis. If float - the values are uniformly spaced along this axis. If Sequence[float] - the values are non-uniformly spaced.

  • location – whether to interpret the Sequence[float] in spacing as values’ locations or spacings. If True - the deltas are used as spacings.

Shape utils

dpipe.im.shape_utils.extract_dims(array, ndim=1)[source]

Decrease the dimensionality of array by extracting ndim leading singleton dimensions.

dpipe.im.shape_utils.prepend_dims(array, ndim=1)[source]

Increase the dimensionality of array by adding ndim leading singleton dimensions.

dpipe.im.shape_utils.append_dims(array, ndim=1)[source]

Increase the dimensionality of array by adding ndim singleton dimensions to the end of its shape.

dpipe.im.shape_utils.insert_dims(array, index=0, ndim=1)[source]

Increase the dimensionality of array by adding ndim singleton dimensions before the specified index of its shape.
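All three operations (prepend, append, insert) differ only in where the singleton dimensions go, which a reshape makes explicit. insert_dims_sketch is a hypothetical name, and only non-negative index values are handled.

```python
import numpy as np


def insert_dims_sketch(array, index=0, ndim=1):
    """Insert ndim singleton dimensions before position index of the shape."""
    array = np.asarray(array)
    shape = array.shape[:index] + (1,) * ndim + array.shape[index:]
    return array.reshape(shape)


x = np.zeros((2, 3))
print(insert_dims_sketch(x, index=1, ndim=2).shape)
```

With index=0 this prepends dimensions, and with index=x.ndim it appends them.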

dpipe.im.shape_utils.shape_after_convolution(shape: Union[int, Sequence[int]], kernel_size: Union[int, Sequence[int]], stride: Union[int, Sequence[int]] = 1, padding: Union[int, Sequence[int]] = 0, dilation: Union[int, Sequence[int]] = 1, valid: bool = True) tuple[source]

Get the shape of a tensor after applying a convolution with corresponding parameters.
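For the valid=True case the result follows the usual convolution arithmetic, floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride) + 1 per axis. The sketch below uses a hypothetical name and scalar parameters shared by all axes.

```python
def shape_after_convolution_sketch(shape, kernel_size, stride=1, padding=0, dilation=1):
    """Output shape of a convolution that keeps only complete windows."""
    return tuple(
        (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
        for size in shape
    )


print(shape_after_convolution_sketch((32, 32), kernel_size=3, padding=1))
print(shape_after_convolution_sketch((32,), kernel_size=3, stride=2))
```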

dpipe.im.shape_utils.shape_after_full_convolution(shape: Union[int, Sequence[int]], kernel_size: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, stride: Union[int, Sequence[int]] = 1, padding: Union[int, Sequence[int]] = 0, dilation: Union[int, Sequence[int]] = 1, valid: bool = True) tuple[source]

Get the shape of a tensor after applying a convolution with corresponding parameters along the given axes. The dimensions along the remaining axes will become singleton.

Input/Output

Input/Output operations.

All the loading functions have the interface load(path, **kwargs) where kwargs are loader-specific keyword arguments.

Similarly, all the saving functions have the interface save(value, path, **kwargs).

class dpipe.io.ConsoleArguments[source]

Bases: object

A class that simplifies access to console arguments.

dpipe.io.load_or_create(path: Union[Path, str], create: Callable, *args, save: Callable = <function save>, load: Callable = <function load>, **kwargs)[source]

Load a file from path if it exists. Otherwise create the value, save it to path, and return it.

args and kwargs are passed to create as additional arguments.
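This is a simple caching pattern. The sketch below reproduces it with JSON helpers from the standard library; all names are hypothetical and the library's own save and load are replaced by these stand-ins.

```python
import json
import tempfile
from pathlib import Path


def save_json_sketch(value, path):
    Path(path).write_text(json.dumps(value))


def load_json_sketch(path):
    return json.loads(Path(path).read_text())


def load_or_create_sketch(path, create, *args, save=save_json_sketch, load=load_json_sketch, **kwargs):
    """Load the value from path if present, otherwise create, save and return it."""
    path = Path(path)
    if path.exists():
        return load(path)
    value = create(*args, **kwargs)
    save(value, path)
    return value


calls = []


def expensive_split(n):
    calls.append(n)  # record every real computation
    return list(range(n))


cache = Path(tempfile.mkdtemp()) / 'ids.json'
first = load_or_create_sketch(cache, expensive_split, 3)
second = load_or_create_sketch(cache, expensive_split, 3)  # served from the file
```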

dpipe.io.choose_existing(*paths: Union[Path, str]) Path[source]

Returns the first existing path from a list of paths.

dpipe.io.load(path: Union[Path, str], ext: Optional[str] = None, **kwargs)[source]

Load a file located at path. kwargs are format-specific keyword arguments.

The following extensions are supported:

npy, tif, png, jpg, bmp, hdr, img, csv, dcm, nii, nii.gz, json, mhd, txt, pickle, pkl, config

dpipe.io.save(value, path: Union[Path, str], **kwargs)[source]

Save value to a file located at path. kwargs are format-specific keyword arguments.

The following extensions are supported:

npy, npy.gz, tif, png, jpg, bmp, hdr, img, csv, nii, nii.gz, json, mhd, txt, pickle, pkl

dpipe.io.load_json(path: Union[Path, str])[source]

Load the contents of a json file.

dpipe.io.save_json(value, path: Union[Path, str], *, indent: Optional[int] = None)[source]

Dump a json-serializable object to a json file.

dpipe.io.load_pickle(path: Union[Path, str])[source]

Load a pickled value from path.

dpipe.io.save_pickle(value, path: Union[Path, str])[source]

Pickle a value to path.

dpipe.io.load_numpy(path: Union[Path, str], *, allow_pickle: bool = True, fix_imports: bool = True, decompress: bool = False)[source]

A wrapper around np.load with allow_pickle set to True by default.

dpipe.io.save_numpy(value, path: Union[Path, str], *, allow_pickle: bool = True, fix_imports: bool = True, compression: Optional[int] = None, timestamp: Optional[int] = None)[source]

A wrapper around np.save that matches the interface save(what, where).

dpipe.io.load_csv(path: Union[Path, str], **kwargs)[source]
dpipe.io.save_csv(value, path: Union[Path, str], *, compression: Optional[int] = None, **kwargs)[source]
dpipe.io.load_text(path: Union[Path, str])[source]
dpipe.io.save_text(value: str, path: Union[Path, str])[source]

Training

Checkpoints

class dpipe.train.checkpoint.Checkpoints(base_path: Union[Path, str], objects: Iterable, frequency: Optional[int] = None)[source]

Bases: object

Saves the most recent iteration to base_path and removes the previous one.

Parameters
  • base_path (str) – path to save/restore checkpoint object in/from.

  • objects (Dict[PathLike, Any]) – objects to save. Each key-value pair represents the path relative to base_path and the corresponding object.

  • frequency (int) – the frequency with which the objects are stored. By default only the latest checkpoint is saved.

save(iteration: int, train_losses: Optional[Sequence] = None, metrics: Optional[dict] = None)[source]

Save the states of all tracked objects.

restore() -> int[source]

Restore the most recent states of all tracked objects and return next iteration’s index.

dpipe.train.checkpoint.CheckpointManager

alias of Checkpoints

Policies

class dpipe.train.policy.Policy[source]

Bases: object

Interface for various policies.

epoch_started(epoch: int)[source]

Update the policy before an epoch starts. Epoch numbering starts at zero.

train_step_started(epoch: int, iteration: int)[source]

Update the policy before a new train step. iteration denotes the iteration index inside the current epoch. Epoch and iteration numbering starts at zero.

train_step_finished(epoch: int, iteration: int, loss: Any)[source]

Update the policy after a train step. iteration denotes the iteration index inside the current epoch. loss is the value returned by the last train step. Epoch and iteration numbering starts at zero.

validation_started(epoch: int, train_losses: Sequence)[source]

Update the policy after the batch iterator was depleted. Epoch numbering starts at zero.

The history of train_losses from the entire epoch is provided as additional information.

epoch_finished(epoch: int, train_losses: Sequence, metrics: Optional[dict] = None, policies: Optional[dict] = None)[source]

Update the policy after an epoch is finished. Epoch numbering starts at zero.

The history of train_losses, metrics and policies from the entire epoch is provided as additional information.

class dpipe.train.policy.ValuePolicy(initial)[source]

Bases: Policy

Interface for policies that have a value which changes over time.

value

The current value carried by the policy.

dpipe.train.policy.Constant

alias of ValuePolicy

class dpipe.train.policy.DecreasingOnPlateau(*, initial: float, multiplier: float, patience: int, rtol, atol)[source]

Bases: ValuePolicy

Policy that tracks the average train loss and multiplies value by multiplier if the loss did not decrease, according to atol or rtol, for patience epochs. atol is the absolute tolerance for detecting a change in the training loss value; rtol is the relative tolerance.

class dpipe.train.policy.Exponential(initial: float, multiplier: float, step_length: int = 1, floordiv: bool = True, min_value: float = -inf, max_value: float = inf)[source]

Bases: ValuePolicy

Exponentially change the value by a factor of multiplier every step_length epochs. If floordiv is False, the value changes continuously.
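
The resulting value per epoch can be sketched as a closed-form expression. This is an illustrative sketch, not the class itself: exponential_value is a hypothetical helper, and it assumes the value equals initial * multiplier ** (epoch / step_length), with the exponent floored when floordiv is True and the result clipped to [min_value, max_value].

```python
def exponential_value(epoch, initial, multiplier, step_length=1, floordiv=True,
                      min_value=float("-inf"), max_value=float("inf")):
    # Hypothetical helper: floor the exponent for a step-wise schedule,
    # keep it fractional for a continuous one.
    power = epoch // step_length if floordiv else epoch / step_length
    value = initial * multiplier ** power
    return min(max(value, min_value), max_value)


# Halve the value every 2 epochs.
halving = [exponential_value(e, 1.0, 0.5, step_length=2) for e in range(5)]
print(halving)  # [1.0, 1.0, 0.5, 0.5, 0.25]
```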

class dpipe.train.policy.Schedule(initial: float, epoch2value_multiplier: Dict[int, float])[source]

Bases: ValuePolicy

Multiply value by multipliers given by epoch2value_multiplier at corresponding epochs.
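
A short sketch of how such a schedule accumulates over epochs. It assumes that each multiplier is applied once, when its epoch is reached, and stays in effect afterwards; scheduled_value is a hypothetical helper, not part of the library.

```python
def scheduled_value(epoch, initial, epoch2value_multiplier):
    # Hypothetical helper: apply every multiplier whose epoch has been reached.
    value = initial
    for start in sorted(epoch2value_multiplier):
        if start <= epoch:
            value *= epoch2value_multiplier[start]
    return value


multipliers = {2: 0.5, 4: 0.5}
schedule = [scheduled_value(e, 1.0, multipliers) for e in range(6)]
print(schedule)  # [1.0, 1.0, 0.5, 0.5, 0.25, 0.25]
```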

class dpipe.train.policy.Switch(initial: float, epoch_to_value: Dict[int, Any])[source]

Bases: ValuePolicy

Changes the value at specific epochs to the values given in epoch_to_value.

class dpipe.train.policy.LambdaEpoch(func: Callable, *args, **kwargs)[source]

Bases: ValuePolicy

Use the passed function to calculate the value for the current epoch (starting with 0).

exception dpipe.train.policy.EarlyStopping[source]

Bases: StopIteration

Exception raised by policies in order to trigger early stopping.

class dpipe.train.policy.TQDM(loss: bool = True)[source]

Bases: Policy

Adds a tqdm progressbar. If loss is True - the progressbar will also display the current train loss.

Logging

Validation

Batch iterators

Tools for creating batch iterators. See the Batch iterators tutorial for more details.

Pipeline

class dpipe.batch_iter.pipeline.Infinite(source: Iterable, *transformers: Union[Callable, Transform], batch_size: Union[int, Callable], batches_per_epoch: int, buffer_size: int = 1, combiner: Callable = <function combine_to_arrays>, **kwargs)[source]

Bases: object

Combine source and transformers into a batch iterator that yields batches of size batch_size.

Parameters
  • source (Iterable) – an infinite iterable.

  • transformers (Callable) – the callable that transforms the objects generated by the previous element of the pipeline.

  • batch_size (int, Callable) – the size of batch.

  • batches_per_epoch (int) – the number of batches to yield each epoch.

  • buffer_size (int) – the number of objects to keep buffered in each pipeline element. Default is 1.

  • combiner (Callable) – combines chunks of single batches in multiple batches, e.g. combiner([(x, y), (x, y)]) -> ([x, x], [y, y]). Default is combine_to_arrays.

  • kwargs – additional keyword arguments passed to the combiner.

References

See the Batch iterators tutorial for more details.
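
The data flow described by the parameters above (source, then transformers, then batching, then combiner) can be sketched without any threading or buffering. Everything here is a simplified assumption: make_batch_iter, combine and double are hypothetical names, batch_size is a plain int, and the real class is not used.

```python
from itertools import count, islice


def make_batch_iter(source, *transformers, batch_size, batches_per_epoch, combiner):
    # Chain the transformers lazily over the (infinite) source.
    stream = iter(source)
    for transform in transformers:
        stream = map(transform, stream)

    def epoch():
        # Each call continues where the previous epoch stopped.
        for _ in range(batches_per_epoch):
            yield combiner(list(islice(stream, batch_size)))

    return epoch


def combine(pairs):
    # [(x, y), (x, y)] -> ((x, x), (y, y))
    return tuple(zip(*pairs))


source = ((i, i % 2) for i in count())        # endless (input, label) stand-ins
double = lambda pair: (pair[0] * 2, pair[1])  # a per-object transformer

epoch = make_batch_iter(source, double, batch_size=2, batches_per_epoch=2, combiner=combine)
first_epoch = list(epoch())
second_epoch = list(epoch())
print(first_epoch)   # [((0, 2), (0, 1)), ((4, 6), (0, 1))]
print(second_epoch)  # [((8, 10), (0, 1)), ((12, 14), (0, 1))]
```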

close()[source]

Stop all background processes.

property closing_callback

A callback that makes this interface compatible with Lightning and allows for a safe release of resources.

Examples

>>> batch_iter = Infinite(...)
>>> trainer = Trainer(callbacks=[batch_iter.closing_callback, ...])
class dpipe.batch_iter.pipeline.Threads(func: Callable, *args, n_workers: int = 1, buffer_size: int = 1, **kwargs)[source]

Bases: Iterator

Apply func concurrently to each object in the batch iterator by moving it to n_workers threads.

Parameters
  • transform (Callable(Iterable) -> Iterable) – a function that takes an iterable and yields transformed values.

  • n_workers (int) – the number of threads to which transform will be moved.

  • buffer_size (int) – the number of objects to keep buffered.

  • args – additional positional arguments passed to transform.

  • kwargs – additional keyword arguments passed to transform.

References

See the Batch iterators tutorial for more details.

class dpipe.batch_iter.pipeline.Loky(func: Callable, *args, n_workers: int = 1, buffer_size: int = 1, **kwargs)[source]

Bases: Transform

Apply func concurrently to each object in the batch iterator by moving it to n_workers processes.

Parameters
  • transform (Callable(Iterable) -> Iterable) – a function that takes an iterable and yields transformed values.

  • n_workers (int) – the number of processes to which transform will be moved.

  • buffer_size (int) – the number of objects to keep buffered.

  • args – additional positional arguments passed to transform.

  • kwargs – additional keyword arguments passed to transform.

Notes

Process-based parallelism is implemented with the loky backend.

References

See the Batch iterators tutorial for more details.

class dpipe.batch_iter.pipeline.Iterator(transform: Callable, *args, n_workers: int = 1, buffer_size: int = 1, **kwargs)[source]

Bases: Transform

Apply transform to the iterator of values that flow through the batch iterator.

Parameters
  • transform (Callable(Iterable) -> Iterable) – a function that takes an iterable and yields transformed values.

  • n_workers (int) – the number of threads to which transform will be moved.

  • buffer_size (int) – the number of objects to keep buffered.

  • args – additional positional arguments passed to transform.

  • kwargs – additional keyword arguments passed to transform.

References

See the Batch iterators tutorial for more details.

dpipe.batch_iter.pipeline.combine_batches(inputs)[source]

Combines tuples from inputs into batches: [(x, y), (x, y)] -> [(x, x), (y, y)]

dpipe.batch_iter.pipeline.combine_to_arrays(inputs)[source]

Combines tuples from inputs into batches of numpy arrays.

dpipe.batch_iter.pipeline.combine_pad(inputs, padding_values: Union[float, Sequence[float]] = 0, ratio: Union[float, Sequence[float]] = 0.5)[source]

Combines tuples from inputs into batches and pads each batch in order to obtain a correctly shaped numpy array.

Parameters
  • inputs

  • padding_values – values to pad with. If Callable (e.g. numpy.min) - padding_values(x) will be used.

  • ratio – the fraction of the padding that will be applied to the left, 1.0 - ratio will be applied to the right. By default ratio is 0.5, i.e. the padding is applied uniformly to the left and right.

References

pad_to_shape

Sources

dpipe.batch_iter.sources.sample(sequence: Sequence, weights: Optional[Sequence[float]] = None, random_state: Optional[Union[RandomState, int]] = None)[source]

Infinitely yield samples from sequence according to weights.

Parameters
  • sequence (Sequence) – the sequence of elements to sample from.

  • weights (Sequence[float], None, optional) – the weights associated with each element. If None, the weights are assumed to be equal. Should be the same size as sequence.

  • random_state (int, np.random.RandomState, None, optional) – if not None, used to set the random seed for reproducibility reasons.
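
A minimal sketch of infinite weighted sampling using only the standard library. It is not the library's implementation: sample_sketch is a hypothetical name, and random_state is assumed to be an int or None (NumPy's RandomState is not handled here).

```python
import random
from itertools import islice


def sample_sketch(sequence, weights=None, random_state=None):
    # random.Random(None) seeds from system entropy; an int makes runs repeatable.
    rng = random.Random(random_state)
    while True:
        # choices() treats weights=None as uniform sampling.
        yield rng.choices(sequence, weights=weights, k=1)[0]


ids = ["a", "b", "c"]
only_c = list(islice(sample_sketch(ids, weights=[0, 0, 1]), 4))
print(only_c)  # ['c', 'c', 'c', 'c']

run_1 = list(islice(sample_sketch(ids, random_state=0), 5))
run_2 = list(islice(sample_sketch(ids, random_state=0), 5))
print(run_1 == run_2)  # True
```

Because the generator never terminates, it has to be consumed with something like islice or a loop with an explicit break.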

dpipe.batch_iter.sources.load_by_random_id(*loaders: Callable, ids: Sequence, weights: Optional[Sequence[float]] = None, random_state: Optional[Union[RandomState, int]] = None)[source]

Infinitely yield objects loaded by loaders according to the identifier from ids. The identifiers are randomly sampled from ids according to the weights.

Parameters
  • loaders (Callable) – function, which loads object by its id.

  • ids (Sequence) – the sequence of identifiers to sample from.

  • weights (Sequence[float], None, optional) – The weights associated with each id. If None, the weights are assumed to be equal. Should be the same size as ids.

  • random_state (int, np.random.RandomState, None, optional) – if not None, used to set the random seed for reproducibility reasons.

Blocks

class dpipe.batch_iter.expiration_pool.ExpirationPool(pool_size: int, repetitions: int, iterations: int = 1)[source]

Bases: Iterator

A simple expiration pool for time consuming operations that don’t fit into RAM. See expiration_pool for details.

Examples

>>> batch_iter = Infinite(
    # ... some expensive operations, e.g. loading from disk, or preprocessing
    ExpirationPool(pool_size, repetitions),
    # ... here are the values from pool
    # ... other lightweight operations
    # ...
)
dpipe.batch_iter.expiration_pool.expiration_pool(iterable: Iterable, pool_size: int, repetitions: int, iterations: int = 1)[source]

Caches pool_size items from iterable. The item is removed from cache after it was generated repetitions times. After an item is removed, a new one is extracted from the iterable. Finally, iterations controls how many values are generated after a new value is added, thus speeding up the pipeline at early stages.
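
The caching rule above can be sketched as a generator. This is an illustration under assumptions rather than the library's code: expiration_pool_sketch is a hypothetical name, items are drawn uniformly at random from the pool, and a seed argument is added only to make the run repeatable.

```python
import random
from collections import Counter


def expiration_pool_sketch(iterable, pool_size, repetitions, iterations=1, seed=0):
    rng = random.Random(seed)
    pool = {}  # key -> [value, number of times it was yielded]

    def draw():
        key = rng.choice(list(pool))
        entry = pool[key]
        entry[1] += 1
        if entry[1] >= repetitions:
            del pool[key]  # the item has expired
        return entry[0]

    for key, value in enumerate(iterable):
        pool[key] = [value, 0]
        # Generate a few values right after a new item arrives.
        for _ in range(iterations):
            if pool:
                yield draw()
        # Keep generating while the pool is full.
        while len(pool) >= pool_size:
            yield draw()

    while pool:  # the source is exhausted: drain what is left
        yield draw()


generated = list(expiration_pool_sketch("abcde", pool_size=2, repetitions=3))
print(len(generated))                      # 15
print(sorted(Counter(generated).items()))  # [('a', 3), ('b', 3), ('c', 3), ('d', 3), ('e', 3)]
```

Each of the five source items is produced exactly three times, so the expensive upstream step runs five times for fifteen outputs.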

Utils

dpipe.batch_iter.utils.pad_batch_equal(batch, padding_values: Union[float, Sequence[float]] = 0, ratio: Union[float, Sequence[float]] = 0.5)[source]

Pad each element of batch to obtain a correctly shaped array.

References

pad_to_shape

dpipe.batch_iter.utils.unpack_args(func: Callable, *args, **kwargs)[source]

Returns a function that takes an iterable and unpacks it while calling func.

args and kwargs are passed to func as additional arguments.

Examples

>>> def add(x, y):
>>>     return x + y
>>>
>>> add_ = unpack_args(add)
>>> add(1, 2) == add_([1, 2])
True
dpipe.batch_iter.utils.multiply(func: Callable, *args, **kwargs)[source]

Returns a function that takes an iterable and maps func over it. Useful when multiple batches require the same function.

args and kwargs are passed to func as additional arguments.
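
A minimal sketch of the mapping behaviour, assuming the result is returned as a tuple; multiply_sketch is a hypothetical stand-in, not the library function.

```python
def multiply_sketch(func, *args, **kwargs):
    # Hypothetical stand-in: map func over every element of the incoming iterable.
    def wrapped(values):
        return tuple(func(value, *args, **kwargs) for value in values)
    return wrapped


square_all = multiply_sketch(pow, 2)  # same as pow(value, 2) for each value
squares = square_all([1, 2, 3])
print(squares)  # (1, 4, 9)
```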

dpipe.batch_iter.utils.apply_at(index: Union[int, Sequence[int]], func: Callable, *args, **kwargs)[source]

Returns a function that takes an iterable and applies func to the values at the corresponding index.

args and kwargs are passed to func as additional arguments.

Examples

>>> first_sqr = apply_at(0, np.square)
>>> first_sqr([3, 2, 1])
(9, 2, 1)
dpipe.batch_iter.utils.zip_apply(*functions: Callable, **kwargs)[source]

Returns a function that takes an iterable and zips functions over it.

kwargs are passed to each function as additional arguments.

Examples

>>> zipper = zip_apply(np.square, np.sqrt)
>>> zipper([4, 9])
(16, 3)
dpipe.batch_iter.utils.random_apply(p: float, func: Callable, *args, **kwargs)[source]

Returns a function that applies func with a given probability p.

args and kwargs are passed to func as additional arguments.
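
A minimal sketch of probabilistic application, e.g. for augmentations. The names random_apply_sketch and flip are hypothetical, and the rng argument is an addition made here for testability.

```python
import random


def random_apply_sketch(p, func, *args, rng=random, **kwargs):
    def wrapped(value):
        # rng.random() is uniform on [0, 1), so func fires with probability p.
        if rng.random() < p:
            return func(value, *args, **kwargs)
        return value
    return wrapped


flip = lambda seq: seq[::-1]  # stand-in for an augmentation
always = random_apply_sketch(1.0, flip)
never = random_apply_sketch(0.0, flip)
print(always([1, 2, 3]))  # [3, 2, 1]
print(never([1, 2, 3]))   # [1, 2, 3]
```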

dpipe.batch_iter.utils.sample_args(func: Callable, *args: Callable, **kwargs: Callable)[source]

Returns a function that samples arguments for func from args and kwargs.

Each argument in args and kwargs must be a callable that samples a random value.

Examples

>>> from scipy.ndimage import rotate
>>>
>>> random_rotate = sample_args(rotate, angle=np.random.normal)
>>> random_rotate(x)
>>> # same as
>>> rotate(x, angle=np.random.normal())

Prediction

Various functions for prediction with neural networks. See the Predict tutorial for more details.

Predictors

Ready-to-use predictors.

dpipe.predict.shape.add_extract_dims(n_add: int = 1, n_extract: Optional[int] = None, sequence: bool = False)[source]

Adds n_add dimensions before a prediction and extracts n_extract dimensions after this prediction.

Parameters
  • n_add (int) – number of dimensions to add.

  • n_extract (int, None, optional) – number of dimensions to extract. If None, extracts the same number of dimensions as were added (n_add).

  • sequence – if True - the output is expected to be a sequence, and the dims are extracted for each element of the sequence.

dpipe.predict.shape.divisible_shape(divisor: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Union[float, Sequence[float], Callable] = 0, ratio: Union[float, Sequence[float]] = 0.5)[source]

Pads an incoming array to be divisible by divisor along the axes. Afterwards the padding is removed.

Parameters
  • divisor – a value an incoming array should be divisible by.

  • axis – axes along which the array will be padded. If None - the last len(divisor) axes are used.

  • padding_values – values to pad with. If Callable (e.g. numpy.min) - padding_values(x) will be used.

  • ratio – the fraction of the padding that will be applied to the left, 1 - ratio will be applied to the right.

References

pad_to_divisible
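
The amount of padding implied by divisor and ratio can be worked out per axis. This is a sketch under assumptions: divisible_padding is a hypothetical helper, and rounding the left part down is a guess about how the fraction is split.

```python
def divisible_padding(shape, divisor, ratio=0.5):
    # Hypothetical helper: (left, right) padding per axis so that each size
    # becomes a multiple of the divisor. Rounding the left part down is an assumption.
    divisors = (divisor,) * len(shape) if isinstance(divisor, int) else tuple(divisor)
    padding = []
    for size, d in zip(shape, divisors):
        delta = -size % d  # distance to the next multiple of d
        left = int(delta * ratio)
        padding.append((left, delta - left))
    return padding


print(divisible_padding((30, 45), divisor=16))                # [(1, 1), (1, 2)]
print(divisible_padding((32, 45), divisor=(16, 8), ratio=0))  # [(0, 0), (0, 3)]
```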

dpipe.predict.shape.patches_grid(patch_size: Union[int, Sequence[int]], stride: Union[int, Sequence[int]], axis: Optional[Union[int, Sequence[int]]] = None, padding_values: Union[float, Sequence[float], Callable] = 0, ratio: Union[float, Sequence[float]] = 0.5, combiner: Type[PatchCombiner] = <class 'dpipe.im.grid.Average'>, get_boxes: Callable = <function get_boxes>)[source]

Divide an incoming array into patches of corresponding patch_size and stride and then combine the predicted patches by aggregating the overlapping regions using the combiner - Average by default.

If padding_values is not None, the array will be padded to an appropriate shape to make a valid division. Afterwards the padding is removed. Otherwise, if the input cannot be patched without remainder, a ValueError is raised.

References

grid.divide, grid.combine, pad_to_shape
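
The divide, predict and average scheme can be illustrated in one dimension with plain lists. This is a simplified sketch, not the library's code: predict_by_patches and add_patch_mean are hypothetical names, padding is not modelled, and the stride is assumed not to exceed the patch size.

```python
def predict_by_patches(values, predict, patch_size, stride):
    # 1D illustration: split into patches, predict each one, average the overlaps.
    n = len(values)
    if stride > patch_size:
        raise ValueError("stride must not exceed patch_size")
    if n < patch_size or (n - patch_size) % stride:
        raise ValueError("the input cannot be patched without remainder")

    total = [0.0] * n
    counts = [0] * n
    for start in range(0, n - patch_size + 1, stride):
        patch = predict(values[start:start + patch_size])
        for offset, item in enumerate(patch):
            total[start + offset] += item
            counts[start + offset] += 1
    return [t / c for t, c in zip(total, counts)]


def add_patch_mean(patch):
    # A toy "model" whose output depends on the whole patch.
    mean = sum(patch) / len(patch)
    return [item + mean for item in patch]


averaged = predict_by_patches([0, 2, 4, 6], add_patch_mean, patch_size=2, stride=1)
print(averaged)  # [1.0, 4.0, 8.0, 11.0]
```

The two inner positions are covered by two patches each, so their outputs are the mean of both predictions.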

Functions

Various functions that can be used to build predictors.

dpipe.predict.functional.chain_decorators(*decorators: Callable, predict: Callable, **kwargs)[source]

Wraps predict into a series of decorators.

kwargs are passed as additional arguments to predict.

Examples

>>> @decorator1
>>> @decorator2
>>> def f(x):
>>>     return x + 1
>>> # same as:
>>> def f(x):
>>>     return x + 1
>>>
>>> f = chain_decorators(decorator1, decorator2, predict=f)
dpipe.predict.functional.preprocess(func, *args, **kwargs)[source]

Applies function func with given parameters before making a prediction.

Examples

>>> from dpipe.im.shape_ops import pad
>>> from dpipe.predict.functional import preprocess
>>>
>>> @preprocess(pad, padding=[10, 10, 10], padding_values=np.min)
>>> def predict(x):
>>>     return model.do_inf_step(x)

The decorated predict performs spatial padding before prediction.

References

postprocess

dpipe.predict.functional.postprocess(func, *args, **kwargs)[source]

Applies function func with given parameters after making a prediction.

References

preprocess
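
How the two decorators compose can be sketched with plain closures. The names preprocess_sketch and postprocess_sketch are hypothetical stand-ins; the sketch assumes the first positional argument of predict is the one being transformed.

```python
from functools import wraps


def preprocess_sketch(func, *args, **kwargs):
    def decorator(predict):
        @wraps(predict)
        def wrapper(x, *predict_args, **predict_kwargs):
            # Transform the input first, then predict.
            return predict(func(x, *args, **kwargs), *predict_args, **predict_kwargs)
        return wrapper
    return decorator


def postprocess_sketch(func, *args, **kwargs):
    def decorator(predict):
        @wraps(predict)
        def wrapper(*predict_args, **predict_kwargs):
            # Predict first, then transform the output.
            return func(predict(*predict_args, **predict_kwargs), *args, **kwargs)
        return wrapper
    return decorator


@postprocess_sketch(round, 1)  # round the prediction to one decimal
@preprocess_sketch(abs)        # feed the absolute value to the "model"
def predict(x):
    return x / 3


print(predict(-2))  # 0.7
```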

NN Layers

Residual Blocks

class dpipe.layers.resblock.ResBlock(*args: Any, **kwargs: Any)[source]

Bases: Module

Performs a sequence of two convolutions with residual connection (Residual Block).

Parameters
  • in_channels (int) – the number of incoming channels.

  • out_channels (int) – the number of the ResBlock output channels. Note, if in_channels != out_channels, then a linear transform will be applied to the shortcut.

  • kernel_size (int, tuple) – size of the convolving kernel.

  • stride (int, tuple, optional) – stride of the convolution. Default is 1. Note, if stride is greater than 1, then a linear transform will be applied to the shortcut.

  • padding (int, tuple, optional) – zero-padding added to all spatial sides of the input. Default is 0.

  • dilation (int, tuple, optional) – spacing between kernel elements. Default is 1.

  • bias (bool) – if True, adds a learnable bias to the output. Default is False.

  • activation_module (None, nn.Module, optional) – module to build up activation layer. Default is torch.nn.ReLU.

  • conv_module (nn.Module) – module to build up convolution layer with given parameters, e.g. torch.nn.Conv3d.

  • batch_norm_module (nn.Module) – module to build up batch normalization layer, e.g. torch.nn.BatchNorm3d.

  • kwargs – additional arguments passed to conv_module.

FPN

class dpipe.layers.fpn.FPN(*args: Any, **kwargs: Any)[source]

Bases: Module

Feature Pyramid Network - a generalization of UNet.

Parameters
  • layer (Callable) – the structural block of each level, e.g. torch.nn.Conv2d.

  • downsample (nn.Module) – the downsampling layer, e.g. torch.nn.MaxPool2d.

  • upsample (nn.Module) – the upsampling layer, e.g. torch.nn.Upsample.

  • merge (Callable(left, down)) – a function that merges the upsampled features map with the one coming from the left branch, e.g. torch.add.

  • structure (Sequence[Union[Sequence[int], nn.Module]]) – a collection of channels sequences, see Examples section for details.

  • last_level (bool) – If True only the result of the last level is returned (as in UNet), otherwise the results from all levels are returned (as in FPN).

  • kwargs – additional arguments passed to layer.

Examples

>>> from dpipe.layers import ResBlock2d
>>>
>>> structure = [
>>>     [[16, 16, 16],       [16, 16, 16]],  # level 1, left and right
>>>     [[16, 32, 32],       [32, 32, 16]],  # level 2, left and right
>>>                [32, 64, 32]              # final level
>>> ]
>>>
>>> upsample = nn.Upsample(scale_factor=2, mode='bilinear')
>>> downsample = nn.MaxPool2d(kernel_size=2)
>>>
>>> ResUNet = FPN(
>>>     ResBlock2d, downsample, upsample, torch.add,
>>>     structure, kernel_size=3, dilation=1, padding=1, last_level=True
>>> )

References

make_consistent_seq, FPN, UNet

Structure

dpipe.layers.structure.make_consistent_seq(layer: Callable, channels: Sequence[int], *args, **kwargs)[source]

Builds a sequence of layers that have consistent input and output channels/features.

args and kwargs are passed as additional parameters.

Examples

>>> make_consistent_seq(nn.Conv2d, [16, 32, 64, 128], kernel_size=3, padding=1)
>>> # same as
>>> nn.Sequential(
>>>     nn.Conv2d(16, 32, kernel_size=3, padding=1),
>>>     nn.Conv2d(32, 64, kernel_size=3, padding=1),
>>>     nn.Conv2d(64, 128, kernel_size=3, padding=1),
>>> )
class dpipe.layers.structure.ConsistentSequential(*args: Any, **kwargs: Any)[source]

Bases: Sequential

A sequence of layers that have consistent input and output channels/features.

args and kwargs are passed as additional parameters.

Examples

>>> ConsistentSequential(nn.Conv2d, [16, 32, 64, 128], kernel_size=3, padding=1)
>>> # same as
>>> nn.Sequential(
>>>     nn.Conv2d(16, 32, kernel_size=3, padding=1),
>>>     nn.Conv2d(32, 64, kernel_size=3, padding=1),
>>>     nn.Conv2d(64, 128, kernel_size=3, padding=1),
>>> )
class dpipe.layers.structure.PreActivation(*args: Any, **kwargs: Any)[source]

Bases: Module

Runs a sequence of batch_norm, activation, and layer.

in -> (BN -> activation -> layer) -> out

Parameters
  • in_features (int) – the number of incoming features/channels.

  • out_features (int) – the number of the output features/channels.

  • batch_norm_module – module to build up batch normalization layer, e.g. torch.nn.BatchNorm3d.

  • activation_module – module to build up activation layer. Default is torch.nn.ReLU.

  • layer_module (Callable(in_features, out_features, **kwargs)) – module to build up the main layer, e.g. torch.nn.Conv3d or torch.nn.Linear.

  • kwargs – additional arguments passed to layer_module.

class dpipe.layers.structure.PostActivation(*args: Any, **kwargs: Any)[source]

Bases: Module

Performs a sequence of layer, batch_norm and activation:

in -> (layer -> BN -> activation) -> out

Parameters
  • in_features (int) – the number of incoming features/channels.

  • out_features (int) – the number of the output features/channels.

  • batch_norm_module – module to build up batch normalization layer, e.g. torch.nn.BatchNorm3d.

  • activation_module – module to build up activation layer. Default is torch.nn.ReLU.

  • layer_module (Callable(in_features, out_features, **kwargs)) – module to build up the main layer, e.g. torch.nn.Conv3d or torch.nn.Linear.

  • kwargs – additional arguments passed to layer_module.

Notes

If layer supports a bias term, make sure to pass bias=False.

class dpipe.layers.structure.Lambda(*args: Any, **kwargs: Any)[source]

Bases: Module

Applies func to the incoming tensor.

kwargs are passed as additional arguments.

class dpipe.layers.conv.PreActivationND(*args: Any, **kwargs: Any)[source]

Bases: PreActivation

Performs a sequence of batch_norm, activation, and convolution

in -> (BN -> activation -> Conv) -> out

Parameters
  • in_channels (int) – the number of incoming channels.

  • out_channels (int) – the number of the PreActivation output channels.

  • kernel_size (int, tuple) – size of the convolving kernel.

  • stride (int, tuple, optional) – stride of the convolution. Default is 1.

  • padding (int, tuple, optional) – zero-padding added to all spatial sides of the input. Default is 0.

  • dilation (int, tuple, optional) – spacing between kernel elements. Default is 1.

  • groups (int, optional) – number of blocked connections from input channels to output channels. Default is 1.

  • bias (bool) – if True, adds a learnable bias to the output. Default is False

  • batch_norm_module (nn.Module) – module to build up batch normalization layer, e.g. torch.nn.BatchNorm3d.

  • activation_module (nn.Module) – module to build up activation layer. Default is torch.nn.ReLU.

  • conv_module (nn.Module) – module to build up convolution layer with given parameters, e.g. torch.nn.Conv3d.

  • kwargs – additional arguments passed to layer_module

class dpipe.layers.conv.PostActivationND(*args: Any, **kwargs: Any)[source]

Bases: PostActivation

Performs a sequence of convolution, batch_norm and activation:

in -> (Conv -> BN -> activation) -> out

Parameters
  • in_channels (int) – the number of incoming channels.

  • out_channels (int) – the number of the PostActivation output channels.

  • kernel_size (int, tuple) – size of the convolving kernel.

  • stride (int, tuple, optional) – stride of the convolution. Default is 1.

  • padding (int, tuple, optional) – zero-padding added to all spatial sides of the input. Default is 0.

  • dilation (int, tuple, optional) – spacing between kernel elements. Default is 1.

  • groups (int, optional) – number of blocked connections from input channels to output channels. Default is 1.

  • batch_norm_module (nn.Module) – module to build up batch normalization layer, e.g. torch.nn.BatchNorm3d.

  • activation_module (nn.Module) – module to build up activation layer. Default is torch.nn.ReLU.

  • conv_module (nn.Module) – module to build up convolution layer with given parameters, e.g. torch.nn.Conv3d.

  • kwargs – additional arguments passed to layer_module

Shape Operations

class dpipe.layers.shape.InterpolateToInput(*args: Any, **kwargs: Any)[source]

Bases: Module

Interpolates the result of path to the original shape along the spatial axis.

Parameters
  • path (nn.Module) – arbitrary neural network module to calculate the result.

  • mode (str) – algorithm used for upsampling. Should be one of ‘nearest’ | ‘linear’ | ‘bilinear’ | ‘trilinear’ | ‘area’. Default is ‘nearest’.

  • axis (AxesLike, None, optional) – spatial axes to interpolate result along. If axis is None, the result is interpolated along all the spatial axes.

class dpipe.layers.shape.Reshape(*args: Any, **kwargs: Any)[source]

Bases: Module

Reshape the incoming tensor to the given shape.

Parameters
  • shape (Union[int, str]) – the resulting shape. String values denote indices in the input tensor’s shape.

Examples

>>> layer = Reshape('0', '1', 500, 500)
>>> layer(x)
>>> # same as
>>> x.reshape(x.shape[0], x.shape[1], 500, 500)
class dpipe.layers.shape.Softmax(*args: Any, **kwargs: Any)[source]

Bases: Module

A multidimensional version of softmax.

class dpipe.layers.shape.PyramidPooling(*args: Any, **kwargs: Any)[source]

Bases: Module

Implements the pyramid pooling operation.

Parameters
  • pooling (Callable) – the pooling to be applied, e.g. torch.nn.functional.max_pool2d.

  • levels (int) – the number of pyramid levels, default is 1 which is the global pooling operation.

PyTorch Wrappers

Training and inference

dpipe.torch.model.optimizer_step(optimizer: torch.optim.Optimizer, loss: torch.Tensor, scaler: Optional[torch.cuda.amp.GradScaler] = None, clip_grad: Optional[float] = None, accumulate: bool = False, **params) -> torch.Tensor[source]

Performs the backward pass with respect to loss, as well as a gradient step or gradient accumulation.

If a scaler is passed, it is used to perform the gradient step (automatic mixed precision support). If clip_grad is passed, the gradient is clipped so that its l2 norm does not exceed this value. accumulate indicates whether to perform a gradient step or just accumulate gradients. params is used to change the optimizer’s parameters.

Examples

>>> optimizer = Adam(model.parameters(), lr=1)
>>> optimizer_step(optimizer, loss) # perform a gradient step
>>> optimizer_step(optimizer, loss, lr=1e-3) # set lr to 1e-3 and perform a gradient step
>>> optimizer_step(optimizer, loss, betas=(0, 0)) # set betas to 0 and perform a gradient step
>>> optimizer_step(optimizer, loss, accumulate=True) # perform a gradient accumulation

Notes

The incoming optimizer’s parameters are not restored to their original values.

dpipe.torch.model.train_step(*inputs: ndarray, architecture: torch.nn.Module, criterion: Callable, optimizer: torch.optim.Optimizer, n_targets: int = 1, loss_key: Optional[str] = None, scaler: Optional[torch.cuda.amp.GradScaler] = None, clip_grad: Optional[float] = None, accumulate: bool = False, gradient_accumulation_steps: int = 1, **optimizer_params) -> ndarray[source]

Performs a forward-backward pass and makes a gradient step or accumulates gradients, according to the given inputs.

Parameters
  • inputs – inputs batches. The last n_targets batches are passed to criterion. The remaining batches are fed into the architecture.

  • architecture – the neural network architecture.

  • criterion – the loss function. Returns either a scalar or a dictionary of scalars. In the latter case loss_key must be provided.

  • optimizer

  • n_targets – how many values from inputs to be considered as targets.

  • loss_key – in case criterion returns a dictionary of scalars, indicates which key should be used for gradient computation.

  • scaler – a gradient scaler used to operate in automatic mixed precision mode.

  • clip_grad – maximum l2 norm of the gradient to clip it by.

  • accumulate – whether to accumulate gradients or perform optimizer step.

  • gradient_accumulation_steps

  • optimizer_params – additional parameters that will override the optimizer’s current parameters (e.g. lr).

Notes

Note that both input and output are not of type torch.Tensor - the conversion to and from torch.Tensor is made inside this function.

References

optimizer_step
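
The way inputs is divided between the architecture and the criterion can be sketched as a simple split. split_inputs is a hypothetical helper; computing the number of network inputs explicitly is a choice made here so that n_targets=0 also works (a plain inputs[:-0] slice would be empty).

```python
def split_inputs(inputs, n_targets=1):
    # Hypothetical helper: the last n_targets entries go to the criterion,
    # everything before them goes to the architecture.
    n_inputs = len(inputs) - n_targets
    return inputs[:n_inputs], inputs[n_inputs:]


batch = ("image", "mask", "weights")
print(split_inputs(batch, n_targets=1))  # (('image', 'mask'), ('weights',))
print(split_inputs(batch, n_targets=2))  # (('image',), ('mask', 'weights'))
print(split_inputs(batch, n_targets=0))  # (('image', 'mask', 'weights'), ())
```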

dpipe.torch.model.inference_step(*inputs: ndarray, architecture: torch.nn.Module, activation: Callable = <function identity>, amp: bool = False) -> ndarray[source]

Returns the prediction for the given inputs.

Notes

Note that both input and output are not of type torch.Tensor - the conversion to and from torch.Tensor is made inside this function. Inputs will be converted to fp16 if amp is True.

dpipe.torch.model.multi_inference_step(*inputs: ndarray, architecture: torch.nn.Module, activations: Union[Callable, Sequence[Optional[Callable]]] = <function identity>, amp: bool = False) -> list[source]

Returns the prediction for the given inputs.

The architecture is expected to return a sequence of torch.Tensor objects.

Notes

Note that both input and output are not of type torch.Tensor - the conversion to and from torch.Tensor is made inside this function. Inputs will be converted to fp16 if amp is True.

Loss functions

dpipe.torch.functional.focal_loss_with_logits(logits: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, gamma: float = 2, alpha: float = 0.25, reduce: Optional[Callable] = torch.mean)[source]

Function that measures Focal Loss between target and output logits.

Parameters
  • logits (torch.Tensor) – tensor of an arbitrary shape.

  • target (torch.Tensor) – tensor of the same shape as logits.

  • weight (torch.Tensor, None, optional) – a manual rescaling weight. Must be broadcastable to logits.

  • gamma (float) – the power of focal loss factor. Defaults to 2.

  • alpha (float, None, optional) – weighting factor of the focal loss. If None, no weighting will be performed. Defaults to 0.25.

  • reduce (Callable, None, optional) – the reduction operation to be applied to the final loss. Defaults to torch.mean. If None, no reduction will be performed.

References

Focal Loss
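For a single element, the focal loss from the referenced paper can be written out by hand. The sketch below is a pure-Python illustration of that formula, not dpipe's implementation: the helper name focal_loss_scalar is hypothetical, and the tensor version additionally handles weight and reduce.

```python
import math


def focal_loss_scalar(logit, target, gamma=2.0, alpha=0.25):
    """Focal loss for one logit and a binary target (0 or 1)."""
    prob = 1.0 / (1.0 + math.exp(-logit))       # sigmoid of the logit
    p_t = prob if target == 1 else 1.0 - prob   # probability of the true class
    # (1 - p_t) ** gamma down-weights elements that are already well classified
    loss = -((1.0 - p_t) ** gamma) * math.log(p_t)
    if alpha is not None:
        # alpha weights the positives, (1 - alpha) the negatives
        loss *= alpha if target == 1 else 1.0 - alpha
    return loss
```

With gamma=0 and alpha=None the expression reduces to the ordinary binary cross entropy, which is a quick way to sanity-check the formula.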

dpipe.torch.functional.linear_focal_loss_with_logits(logits: torch.Tensor, target: torch.Tensor, gamma: float, beta: float, weight: Optional[torch.Tensor] = None, reduce: Optional[Callable] = torch.mean)[source]

Function that measures Linear Focal Loss between target and output logits. Equal to BinaryCrossEntropy(gamma * logits + beta, target, weights).

Parameters
  • logits (torch.Tensor) – tensor of an arbitrary shape.

  • target (torch.Tensor) – tensor of the same shape as logits.

  • gamma (float) – multiplication coefficient for logits tensor.

  • beta (float) – coefficient to be added to all the elements in logits tensor.

  • weight (torch.Tensor) – a manual rescaling weight. Must be broadcastable to logits.

  • reduce (Callable, None, optional) – the reduction operation to be applied to the final loss. Defaults to torch.mean. If None - no reduction will be performed.

References

Focal Loss
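The identity stated above, BinaryCrossEntropy(gamma * logits + beta, target), can be checked on scalars. This is a minimal pure-Python sketch under that description only; both helper names are hypothetical, and weight and reduce are left out.

```python
import math


def bce_with_logits(logit, target):
    """Binary cross entropy for one logit, in the numerically stable form."""
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def linear_focal_loss_scalar(logit, target, gamma, beta):
    """Apply the linear map to the logit, then the usual cross entropy."""
    return bce_with_logits(gamma * logit + beta, target)
```

With gamma=1 and beta=0 the function is exactly the plain cross entropy.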

dpipe.torch.functional.weighted_cross_entropy_with_logits(logit: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, alpha: float = 1, adaptive: bool = False, reduce: Optional[Callable] = torch.mean)[source]

Function that measures Binary Cross Entropy between target and output logits. This version of BCE has additional options of constant or adaptive weighting of positive examples.

Parameters
  • logit (torch.Tensor) – tensor of an arbitrary shape.

  • target (torch.Tensor) – tensor of the same shape as logits.

  • weight (torch.Tensor) – a manual rescaling weight. Must be broadcastable to logits.

  • alpha (float, optional) – a weight for the positive class examples.

  • adaptive (bool, optional) – If True, uses adaptive weight [N - sum(p_i)] / sum(p_i) for positive class examples.

  • reduce (Callable, None, optional) – the reduction operation to be applied to the final loss. Defaults to torch.mean. If None - no reduction will be performed.

References

WCE

dpipe.torch.functional.tversky_loss(pred: torch.Tensor, target: torch.Tensor, alpha=0.5, epsilon=1e-07, reduce: Optional[Callable] = torch.mean)[source]

References

Tversky Loss: https://arxiv.org/abs/1706.05721

dpipe.torch.functional.focal_tversky_loss(pred: torch.Tensor, target: torch.Tensor, gamma=1.3333333333333333, alpha=0.5, epsilon=1e-07)[source]

References

Focal Tversky Loss: https://arxiv.org/abs/1810.07842

dpipe.torch.functional.dice_loss(pred: torch.Tensor, target: torch.Tensor, epsilon=1e-07)[source]

References

Dice Loss

dpipe.torch.functional.masked_loss(mask: torch.Tensor, criterion: Callable, prediction: torch.Tensor, target: torch.Tensor, **kwargs)[source]

Calculates the criterion between the masked prediction and target. kwargs are passed to criterion as additional arguments.

If the mask is empty - returns 0 wrapped in a torch tensor.

dpipe.torch.functional.moveaxis(x: torch.Tensor, source: Union[int, Sequence[int]], destination: Union[int, Sequence[int]])[source]

Move axes of a torch.Tensor to new positions. Other axes remain in their original order.

dpipe.torch.functional.softmax(x: torch.Tensor, axis: Union[int, Sequence[int]])[source]

A multidimensional version of softmax.

Utils

dpipe.torch.utils.load_model_state(module: torch.nn.Module, path: Union[Path, str], modify_state_fn: Optional[Callable] = None, strict: bool = True)[source]

Updates the module’s state dict by the one located at path.

Parameters
  • module (nn.Module) –

  • path (PathLike) –

  • modify_state_fn (Callable(current_state, state_to_load)) – if not None, two arguments will be passed to the function: the current state of the model and the state loaded from path. The function should modify the states as needed and return the final state to load. For example, it can help transfer weights from a similar but not identical architecture.

  • strict (bool) –
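As an illustration of modify_state_fn, the sketch below keeps only the loaded entries whose names also exist in the current model. The function name keep_matching_keys is hypothetical, and plain dictionaries stand in for state dicts; with real tensors you would probably also want to compare shapes.

```python
def keep_matching_keys(current_state, state_to_load):
    """Hypothetical modify_state_fn: load only entries the current model knows about."""
    final_state = dict(current_state)  # start from the current weights
    for name, value in state_to_load.items():
        if name in current_state:
            final_state[name] = value  # overwrite with the loaded weight
    return final_state
```

It would be passed as load_model_state(module, path, modify_state_fn=keep_matching_keys).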

dpipe.torch.utils.save_model_state(module: torch.nn.Module, path: Union[Path, str])[source]

Saves the module’s state dict to path.

dpipe.torch.utils.get_device(x: Optional[Union[torch.device, torch.nn.Module, torch.Tensor, str]] = None) torch.device[source]

Determines the correct device based on the input.

Parameters

x (torch.device, torch.nn.Module, torch.Tensor, str, None) –

if torch.Tensor - returns the device on which it is located
if torch.nn.Module - returns the device on which its parameters are located
if str or torch.device - returns torch.device(x)
if None - same as ‘cuda’ if CUDA is available, ‘cpu’ otherwise.

dpipe.torch.utils.to_device(x: Union[torch.nn.Module, torch.Tensor], device: Optional[Union[torch.device, torch.nn.Module, torch.Tensor, str]] = 'cpu')[source]

Move x to device.

Parameters
  • x

  • device – the device on which to move x. See get_device for details.

dpipe.torch.utils.to_cuda(x, cuda: Optional[Union[torch.nn.Module, torch.Tensor, bool]] = None)[source]

Move x to cuda if specified.

Parameters
  • x

  • cuda – whether to move to cuda. If None, torch.cuda.is_available() is used to determine that.

dpipe.torch.utils.to_var(*arrays: Union[Iterable, int, float], device: Union[torch.device, torch.nn.Module, torch.Tensor, str] = 'cpu', requires_grad: bool = False)[source]

Convert numpy arrays to torch Tensors.

Parameters
  • arrays (array-like) – objects, that will be converted to torch Tensors.

  • device – the device on which to move x. See get_device for details.

  • requires_grad – whether the tensors require grad.

Notes

If arrays contains a single argument, the result will not be contained in a tuple:

>>> x = to_var(x)
>>> x, y = to_var(x, y)

If this is not the desired behaviour, use sequence_to_var, which always returns a tuple of tensors.

dpipe.torch.utils.to_np(*tensors: torch.Tensor)[source]

Convert torch Tensors to numpy arrays.

Notes

If tensors contains a single argument, the result will not be contained in a tuple:

>>> x = to_np(x)
>>> x, y = to_np(x, y)

If this is not the desired behaviour, use sequence_to_np, which always returns a tuple of arrays.

dpipe.torch.utils.set_params(optimizer: torch.optim.Optimizer, **params) torch.optim.Optimizer[source]

Change an optimizer’s parameters by the ones passed in params.

dpipe.torch.utils.set_lr(optimizer: torch.optim.Optimizer, lr: float) torch.optim.Optimizer[source]

Change an optimizer’s learning rate to lr.

dpipe.torch.utils.get_parameters(optimizer: torch.optim.Optimizer) Iterator[torch.nn.parameter.Parameter][source]

Returns an iterator over model parameters stored in optimizer.

dpipe.torch.utils.has_batchnorm(architecture: torch.nn.Module) bool[source]

Check whether architecture has a BatchNorm module.

dpipe.torch.utils.order_to_mode(order: int, dim: int)[source]

Converts the order of interpolation to a “mode” string.

Examples

>>> order_to_mode(1, 3)
'trilinear'

Iterator utils

dpipe.itertools.pam(functions: Iterable[Callable], *args, **kwargs)[source]

Inverse of map. Apply a sequence of callables to fixed arguments.

Examples

>>> list(pam([np.sqrt, np.square, np.cbrt], 64))
[8, 4096, 4]
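The idea behind pam fits in a few lines. This is a sketch of the described behaviour with a hypothetical name (pam_sketch); whether the real function is lazy is an assumption based on the list(...) call in the example.

```python
import math


def pam_sketch(functions, *args, **kwargs):
    """Call every function with the same fixed arguments."""
    for func in functions:
        yield func(*args, **kwargs)


results = list(pam_sketch([math.sqrt, abs, str], 64))
```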
dpipe.itertools.zip_equal(*args: Union[Sized, Iterable]) Iterable[Tuple][source]

zip over the given iterables, but enforce that all of them exhaust simultaneously.

Examples

>>> zip_equal([1, 2, 3], [4, 5, 6]) # ok
>>> zip_equal([1, 2, 3], [4, 5, 6, 7]) # raises ValueError
# ValueError is raised even if the lengths are not known
>>> zip_equal([1, 2, 3], map(np.sqrt, [4, 5, 6])) # ok
>>> zip_equal([1, 2, 3], map(np.sqrt, [4, 5, 6, 7])) # raises ValueError
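One way to get this behaviour for iterables of unknown length is a sentinel value combined with itertools.zip_longest. The sketch below is an assumption about the technique, not dpipe's code, and it raises lazily, i.e. only when the shorter iterable runs out during iteration.

```python
from itertools import zip_longest


def zip_equal_sketch(*iterables):
    """zip that raises ValueError if the iterables have different lengths."""
    sentinel = object()  # unique marker that cannot appear in user data
    for values in zip_longest(*iterables, fillvalue=sentinel):
        if any(value is sentinel for value in values):
            raise ValueError('The iterables have different lengths')
        yield values
```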
dpipe.itertools.head_tail(iterable: Iterable) Tuple[Any, Iterable][source]

Split the iterable into the first and the rest of the elements.

Examples

>>> head, tail = head_tail(map(np.square, [1, 2, 3]))
>>> head, list(tail)
1, [4, 9]
dpipe.itertools.peek(iterable: Iterable) Tuple[Any, Iterable][source]

Return the first element from iterable and the whole iterable.

Notes

The incoming iterable might be mutated, use the returned iterable instead.

Examples

>>> original_iterable = map(np.square, [1, 2, 3])
>>> head, iterable = peek(original_iterable)
>>> head, list(iterable)
1, [1, 4, 9]
# list(original_iterable) would return [4, 9]
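The note about mutation follows from how such a function is typically written: the first element is consumed and then chained back in front of the rest. A minimal sketch with a hypothetical name (peek_sketch), assuming this approach:

```python
from itertools import chain


def peek_sketch(iterable):
    """Return the first element and an iterable equivalent to the original."""
    iterator = iter(iterable)
    head = next(iterator)                   # consumes one element
    return head, chain([head], iterator)    # put it back in front


original = map(lambda x: x ** 2, [1, 2, 3])
head, restored = peek_sketch(original)
values = list(restored)
```

After this, original is exhausted, which is why only the returned iterable should be used.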
dpipe.itertools.lmap(func: Callable, *iterables: Iterable) list[source]

Composition of list and map.

dpipe.itertools.pmap(func: Callable, iterable: Iterable, *args, **kwargs) Iterable[source]

Partial map. Maps func over iterable using args and kwargs as additional arguments.

dpipe.itertools.dmap(func: Callable, dictionary: dict, *args, **kwargs)[source]

Transform the dictionary by mapping func over its values. args and kwargs are passed as additional arguments.

Examples

>>> dmap(np.square, {'a': 1, 'b': 2})
{'a': 1, 'b': 4}
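A dictionary comprehension captures the described behaviour, including the additional arguments. The name dmap_sketch is hypothetical, and the sketch follows the description above rather than dpipe's source.

```python
def dmap_sketch(func, dictionary, *args, **kwargs):
    """Map func over the values, keeping the keys unchanged."""
    return {key: func(value, *args, **kwargs) for key, value in dictionary.items()}


squared = dmap_sketch(pow, {'a': 2, 'b': 3}, 2)  # pow(value, 2) for each value
```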
dpipe.itertools.zdict(keys: Iterable, values: Iterable) dict[source]

Create a dictionary from keys and values.

dpipe.itertools.squeeze_first(inputs)[source]

Remove the first dimension in case it is singleton.

dpipe.itertools.flatten(iterable: Iterable, iterable_types: Optional[Union[tuple, type]] = None) list[source]

Recursively flattens an iterable as long as it is an instance of iterable_types.

Examples

>>> flatten([1, [2, 3], [[4]]])
[1, 2, 3, 4]
>>> flatten([1, (2, 3), [[4]]])
[1, (2, 3), 4]
>>> flatten([1, (2, 3), [[4]]], iterable_types=(list, tuple))
[1, 2, 3, 4]
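The examples above suggest that, when iterable_types is None, only nested objects of the same type as the outer iterable are flattened. The sketch below reproduces the three examples under that assumption; flatten_sketch is a hypothetical name.

```python
def flatten_sketch(iterable, iterable_types=None):
    """Recursively flatten nested containers of the given types."""
    if iterable_types is None:
        # assumption: default to the type of the outer container
        iterable_types = type(iterable)

    def walk(value):
        if isinstance(value, iterable_types):
            for item in value:
                yield from walk(item)
        else:
            yield value  # leaves and containers of other types are kept as-is

    return list(walk(iterable))
```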
dpipe.itertools.filter_mask(iterable: Iterable, mask: Iterable[bool]) Iterable[source]

Filter values from iterable according to mask.

dpipe.itertools.extract(sequence: Sequence, indices: Iterable)[source]

Extract indices from sequence.

dpipe.itertools.negate_indices(indices: Iterable, length: int)[source]

Return valid indices for a sequence of len length that are not present in indices.

dpipe.itertools.make_chunks(iterable: Iterable, chunk_size: int, incomplete: bool = True)[source]

Group iterable into chunks of size chunk_size.

Parameters
  • iterable

  • chunk_size

  • incomplete – whether to yield the last chunk in case it has a smaller size.
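The effect of incomplete is easiest to see on a small input. This sketch uses a hypothetical name (make_chunks_sketch); yielding tuples is an assumption, since the return type is not documented here.

```python
def make_chunks_sketch(iterable, chunk_size, incomplete=True):
    """Group values into chunks of chunk_size elements."""
    chunk = []
    for value in iterable:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield tuple(chunk)
            chunk = []
    # the leftover chunk is smaller than chunk_size
    if chunk and incomplete:
        yield tuple(chunk)


with_tail = list(make_chunks_sketch(range(5), 2))
without_tail = list(make_chunks_sketch(range(5), 2, incomplete=False))
```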

dpipe.itertools.collect(func: Callable)[source]

Make a function that returns a list from a function that returns an iterator.

Examples

>>> @collect
>>> def squares(n):
>>>     for i in range(n):
>>>         yield i ** 2
>>>
>>> squares(3)
[0, 1, 4]
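A decorator with this behaviour can be sketched as follows. The name collect_sketch is hypothetical and the code is an illustration, not dpipe's implementation.

```python
from functools import wraps


def collect_sketch(func):
    """Turn a generator function into a function that returns a list."""
    @wraps(func)  # keep the name and docstring of the wrapped function
    def wrapper(*args, **kwargs):
        return list(func(*args, **kwargs))

    return wrapper


@collect_sketch
def squares(n):
    for i in range(n):
        yield i ** 2
```

Calling squares(3) then returns a list built from the values yielded for i = 0, 1, 2.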
dpipe.itertools.stack(axis: int = 0, dtype: Optional[dtype] = None)[source]

Stack the values yielded by a generator function along a given axis. dtype (if any) determines the data type of the resulting array.

Examples

>>> @stack(1)
>>> def consecutive(n):
>>>     for i in range(n):
>>>         yield i, i+1
>>>
>>> consecutive(3)
array([[0, 1, 2],
       [1, 2, 3]])
dpipe.itertools.recursive_conditional_map(xr, f, condition)[source]

Walks recursively through iterable data structure xr. Applies f on objects that satisfy condition.
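A possible reading of this description is sketched below. The name is hypothetical, and returning nested tuples is an assumption, since the return type is not documented here. Elements that neither satisfy condition nor are iterable would raise a TypeError in this sketch.

```python
def recursive_conditional_map_sketch(xr, f, condition):
    """Apply f to elements satisfying condition, recursing into the others."""
    return tuple(
        f(x) if condition(x) else recursive_conditional_map_sketch(x, f, condition)
        for x in xr
    )


scaled = recursive_conditional_map_sketch(
    [1, [2, [3]]], lambda x: x * 10, lambda x: isinstance(x, int)
)
```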

Commands

Contains a few more sophisticated commands that are usually accessed directly inside configs.

dpipe.commands.populate(path: Union[Path, str], func: Callable, *args, **kwargs)[source]

Call func with args and kwargs if path doesn’t exist.

Examples

>>> populate('metrics.json', save_metrics, targets, predictions)
# if `metrics.json` doesn't exist, the following call will be performed:
>>> save_metrics(targets, predictions)
Raises

FileNotFoundError – if after calling func the path still doesn’t exist.
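The control flow described above (skip if the path exists, otherwise call func and verify the result) can be sketched with pathlib. The name populate_sketch is hypothetical; this is an illustration, not dpipe's code.

```python
from pathlib import Path


def populate_sketch(path, func, *args, **kwargs):
    """Call func(*args, **kwargs) only if path does not exist yet."""
    path = Path(path)
    if path.exists():
        return  # nothing to do, the result is already on disk
    func(*args, **kwargs)
    if not path.exists():
        # func was expected to create the file
        raise FileNotFoundError(f'{path} was not created by the call')
```

This makes a pipeline step idempotent: rerunning it does not recompute results that are already saved.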

dpipe.commands.lock_dir(folder: Union[Path, str] = '.', lock: str = '.lock')[source]

Lock the given folder by generating a special lock file - lock.

Raises

FileExistsError – if lock already exists, i.e. the folder is already locked.

dpipe.commands.transform(input_path, output_path, transform_fn)[source]
dpipe.commands.load_from_folder(path: ~typing.Union[~pathlib.Path, str], loader=<function load>, ext='.npy')[source]

Yields (id, object) pairs loaded from path.

dpipe.commands.map_ids_to_disk(func: ~typing.Callable[str, object], ids: ~typing.Iterable[str], output_path: str, exist_ok: bool = False, save: ~typing.Callable = <function save>, ext: str = '.npy')[source]

Apply func to each id from ids and save each output to output_path using save. If exist_ok is True the existing files will be ignored, otherwise an exception is raised.

dpipe.commands.predict(ids, output_path, load_x, predict_fn, exist_ok=False, save: ~typing.Callable = <function save>, ext='.npy')[source]
dpipe.commands.evaluate_aggregated_metrics(load_y_true, metrics: dict, predictions_path, results_path, exist_ok=False, loader: ~typing.Callable = <function load>, ext='.npy')[source]
dpipe.commands.evaluate_individual_metrics(load_y_true, metrics: dict, predictions_path, results_path, exist_ok=False, loader: ~typing.Callable = <function load>, ext='.npy')[source]

Dataset

Datasets are used for data and metadata loading.

Interfaces

class dpipe.dataset.base.Dataset(*args, **kwargs)[source]

Bases: object

Interface for datasets.

Its subclasses must define the ids attribute - a tuple of identifiers, one for each dataset entry, as well as methods for loading an entry by its identifier.

ids – a tuple of identifiers, one for each dataset entry.
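The contract amounts to an ids attribute plus loader methods that take an identifier. The sketch below uses a plain class so that it runs without dpipe; ToyDataset and its entries are hypothetical, and a real implementation would inherit from Dataset.

```python
class ToyDataset:
    """Stand-in for a Dataset subclass, backed by an in-memory dictionary."""

    def __init__(self, entries):
        self._entries = dict(entries)
        # one identifier per dataset entry
        self.ids = tuple(sorted(self._entries))

    def load_image(self, identifier):
        """Load an entry by its identifier."""
        return self._entries[identifier]


dataset = ToyDataset({'b': [1], 'a': [2]})
```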

Helpers

class dpipe.dataset.csv.CSV(path: ~typing.Union[~pathlib.Path, str], filename: str = 'meta.csv', index_col: str = 'id', loader: ~typing.Callable = <function load>)[source]

Bases: Dataset

A small wrapper for dataframes that contain paths to data.

Parameters
  • path (PathLike) – the path to the data.

  • filename (str) – the relative path to the csv dataframe. Default is meta.csv.

  • index_col (str, None, optional) – the column that will be used as index. Must contain unique values. Default is id.

  • loader (Callable) – the function to load an object by the path located in a corresponding dataset entry. Default is load_by_ext.

get(index, col)[source]

Returns dataframe element from index and col.

get_global_path(index: str, col: str) str[source]

Get the global path at index and col. Data frames often contain paths to data, and this is a convenient way to obtain the global path.

load(index: str, col: str, loader=None)[source]

Loads the object from the path located in index and col positions in dataframe.

Wrappers

Wrappers change the dataset’s behaviour. See the Wrappers tutorial for more details.

class dpipe.dataset.wrappers.Proxy(shadowed)[source]

Bases: object

Base class for all wrappers.

dpipe.dataset.wrappers.cache_methods(instance, methods: Optional[Iterable[str]] = None, maxsize: Optional[int] = None)[source]

Cache the instance’s methods. If methods is None, all public methods will be cached.
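In-memory caching of this kind can be sketched with functools.lru_cache. The sketch below is an assumption about the technique: it patches the instance in place for brevity, whereas the real function may return a wrapped object instead. The names cache_methods_sketch and Slow are hypothetical.

```python
from functools import lru_cache


def cache_methods_sketch(instance, methods, maxsize=None):
    """Replace the listed bound methods with cached versions."""
    for name in methods:
        cached = lru_cache(maxsize)(getattr(instance, name))
        setattr(instance, name, cached)  # shadows the method on this instance only
    return instance


class Slow:
    def __init__(self):
        self.calls = 0

    def load(self, identifier):
        self.calls += 1  # count how often the real method runs
        return identifier * 2


obj = cache_methods_sketch(Slow(), ['load'])
first, second = obj.load(3), obj.load(3)
```

The second call is served from the cache, so the underlying method runs only once.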

dpipe.dataset.wrappers.cache_methods_to_disk(instance, base_path: ~typing.Union[~pathlib.Path, str], loader: ~typing.Callable = <function load_numpy>, saver: ~typing.Callable = <function save_numpy>, **methods: str)[source]

Cache the instance’s methods to disk.

Parameters
  • instance – arbitrary object

  • base_path (str) – the path that all other method paths are relative to.

  • methods (str) – each keyword argument has the form method_name=path_to_cache. The methods are assumed to take a single argument of type str.

  • loader – loads a single object given its path.

  • saver (Callable(value, path)) – saves a single object to the given path.

dpipe.dataset.wrappers.apply(instance, **methods: Callable)[source]

Applies a given function to the output of a given method.

Parameters
  • instance – arbitrary object

  • methods (Callable) – each keyword argument has the form method_name=func_to_apply. func_to_apply is applied to the method_name method.

Examples

>>> # normalize will be applied to the output of load_image
>>> dataset = apply(base_dataset, load_image=normalize)
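The wrapping idea can be sketched with a small proxy class that forwards attribute access and post-processes the output of selected methods. ApplySketch, Base and normalize are hypothetical names, and this is not dpipe's implementation.

```python
class ApplySketch:
    """Proxy that applies a function to the output of selected methods."""

    def __init__(self, shadowed, **methods):
        self._shadowed = shadowed
        self._methods = methods

    def __getattr__(self, name):
        # only called for attributes that are not found on the proxy itself
        value = getattr(self._shadowed, name)
        if name in self._methods:
            func = self._methods[name]
            return lambda *args, **kwargs: func(value(*args, **kwargs))
        return value


class Base:
    ids = ('a',)

    def load_image(self, identifier):
        return [1, 2, 4]


def normalize(image):
    return [v / max(image) for v in image]


wrapped = ApplySketch(Base(), load_image=normalize)
```

Attributes that are not listed, such as ids, pass through unchanged.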
dpipe.dataset.wrappers.set_attributes(instance, **attributes)[source]

Sets or overwrites attributes with those provided as keyword arguments.

Parameters
  • instance – arbitrary object

  • attributes – each keyword argument has the form attr_name=attr_value.

dpipe.dataset.wrappers.change_ids(dataset: Dataset, change_id: Callable, methods: Optional[Iterable[str]] = None) Dataset[source]

Change the dataset’s ids according to the change_id function and adapt the provided methods to work with the new ids.

Parameters
  • dataset (Dataset) – the dataset to perform ids changing on.

  • change_id (Callable(str) -> str) – the function that maps an old id to a new one. The new ids must be unique, just like the old ones.

  • methods (Iterable[str]) – the list of methods to be adapted. Each method takes a single argument - the identifier.

dpipe.dataset.wrappers.merge(*datasets: Dataset, methods: Optional[Sequence[str]] = None, attributes: Sequence[str] = ()) Dataset[source]

Merge several datasets into one by preserving the provided methods and attributes.

Parameters
  • datasets (Dataset) – sequence of datasets.

  • methods (Sequence[str], None, optional) – the list of methods to be preserved. Each method should take an identifier as its first argument. If None, all the common methods will be preserved.

  • attributes (Sequence[str]) – the list of attributes to be preserved. For each dataset their values should be the same. Default is the empty sequence ().

dpipe.dataset.wrappers.apply_mask(dataset: Dataset, mask_modality_id: int = -1, mask_value: Optional[int] = None) Dataset[source]

Applies the mask_modality_id modality as a binary mask to the other modalities and removes the mask from the sequence of modalities.

Parameters
  • dataset (Dataset) – dataset which is used in the current task.

  • mask_modality_id (int) – the index of mask in the sequence of modalities. Default is -1, which means the last modality will be used as the mask.

  • mask_value (int, None, optional) – the value in the mask to filter other modalities with. If None, greater than zero filtering will be applied. Default is None.

Examples

>>> modalities = ['flair', 't1', 'brain_mask']  # we are to apply brain mask to other modalities
>>> target = 'target'
>>>
>>> dataset = apply_mask(
>>>     dataset=Wmh2017(
>>>         data_path=data_path,
>>>         modalities=modalities,
>>>         target=target
>>>     ),
>>>     mask_modality_id=-1,
>>>     mask_value=1
>>> )

Tutorials

This section contains various tutorials generated from jupyter notebooks located here.

Batch iterators

Batch iterators are built using the following constructor:

from dpipe.batch_iter import Infinite

Its only required argument is source - an infinite iterable that yields entries from your data.

We’ll build an example batch iterator that yields batches from the MNIST dataset:

from torchvision.datasets import MNIST
from pathlib import Path
import numpy as np


# download to ~/tests/MNIST, if necessary
dataset = MNIST(Path('~/tests/MNIST').expanduser(), transform=np.array, download=True)

Sampling

from dpipe.batch_iter import sample

# yield 10 batches of size 30 each epoch:

batch_iter = Infinite(
    sample(dataset), # randomly sample from the dataset
    batch_size=30, batches_per_epoch=10,
)

sample infinitely yields data randomly sampled from the dataset:

for x, y in sample(dataset):
    print(x.shape, y)
    break
(28, 28) 7

We use infinite sources because our batch iterators are executed in a background thread, which allows us to use the resources more efficiently. For example, a new batch can be prepared while the network’s forward and backward passes are performed in the main thread.

Now we can simply iterate over batch_iter:

# give 10 batches of size 30
for xs, ys in batch_iter():
    print(xs.shape, ys.shape)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)

… and reuse it again:

# give another 10 batches of size 30
for xs, ys in batch_iter():
    print(xs.shape, ys.shape)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)

After the training is over you must close the batch iterator in order to stop all the background processes:

batch_iter.close()

Or you can use it as a context manager:

batch_iter = Infinite(
    sample(dataset),
    batch_size=30, batches_per_epoch=10,
)

with batch_iter:
    for xs, ys in batch_iter():
        print(xs.shape, ys.shape)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)
(30, 28, 28) (30,)

Transformations


Let’s add more transformations to the data.

from dpipe.im import zoom

def zoom_image(pair):
    image, label = pair
    return zoom(image, scale_factor=[2, 2]), label
batch_iter = Infinite(
    sample(dataset), # yields pairs
    zoom_image, # zoom the images by a factor of 2

    batch_size=30, batches_per_epoch=3,
)

You can think of Infinite as a pipe through which the data flows.

Each function takes the data as input (an [image, label] pair in this case), applies a transformation, and the result is propagated further.

with batch_iter:
    for xs, ys in batch_iter():
        print(xs.shape, ys.shape)
(30, 56, 56) (30,)
(30, 56, 56) (30,)
(30, 56, 56) (30,)

Note that because sample yields pairs, the argument of zoom_image is a pair. This is not very user-friendly, which is why there are a number of wrappers for transformers:

from dpipe.batch_iter import unpack_args

# a better version of zoom
def zoom_image(image, label):
    return zoom(image, scale_factor=[2, 2]), label


batch_iter = Infinite(
    sample(dataset),
    unpack_args(zoom_image), # unpack the arguments before calling the function

    batch_size=30, batches_per_epoch=3)

# or use a lambda directly
batch_iter = Infinite(
    sample(dataset),
    unpack_args(lambda image, label: [zoom(image, scale_factor=[2, 2]), label]),

    batch_size=30, batches_per_epoch=3)

However, there is still redundancy: the label argument is simply passed through, only the image is transformed. Let’s fix that:

from dpipe.batch_iter import apply_at

batch_iter = Infinite(
    sample(dataset),
    # apply zoom at index 0 of the pair with scale_factor=[2, 2] as an additional argument
    apply_at(0, zoom, scale_factor=[2, 2]),

    batch_size=30, batches_per_epoch=3)
with batch_iter:
    for xs, ys in batch_iter():
        print(xs.shape, ys.shape)
(30, 56, 56) (30,)
(30, 56, 56) (30,)
(30, 56, 56) (30,)

Now we don’t even have to create another function!

Check dpipe.batch_iter.utils for other helper functions.

Parallel execution

The batch iterator supports both thread-based and process-based execution.

Threads

Wrap the function in Threads in order to enable thread-based parallelism:

%%time

import time
import itertools
from dpipe.batch_iter import Threads


def do_stuff(x):
    time.sleep(1)
    return x ** 2,

batch_iter = Infinite(
    range(10),
    do_stuff, # sleep for 10 seconds
    batch_size=10, batches_per_epoch=1
)

for value in batch_iter():
    pass
CPU times: user 33.3 ms, sys: 9.17 ms, total: 42.5 ms
Wall time: 10 s
%%time

batch_iter = Infinite(
    range(10),
    Threads(do_stuff, n_workers=2), # sleep for 5 seconds
    batch_size=10, batches_per_epoch=1
)

for value in batch_iter():
    pass
CPU times: user 21.4 ms, sys: 7.75 ms, total: 29.1 ms
Wall time: 5.01 s

Processes


Similarly, wrap the function in Loky in order to enable process-based parallelism:

from dpipe.batch_iter import Loky
%%time

batch_iter = Infinite(
    range(10),
    Loky(do_stuff, n_workers=2), # sleep for 5 seconds
    batch_size=10, batches_per_epoch=1
)

for value in batch_iter():
    pass
CPU times: user 43.6 ms, sys: 27.6 ms, total: 71.2 ms
Wall time: 5.56 s

Combining objects into batches


If your dataset contains items of various shapes, you can’t just stack them into batches. For example you may want to pad them to a common shape. To do this, pass a custom combiner to Infinite:

# random 3D images of random shapes:

images = [np.random.randn(10, 10, np.random.randint(2, 40)) for _ in range(100)]
labels = np.random.randint(0, 2, size=len(images))  # one label per image
images[0].shape, images[1].shape
((10, 10, 34), (10, 10, 34))
from dpipe.batch_iter import combine_pad

batch_iter = Infinite(
    sample(list(zip(images, labels))),
    batch_size=5, batches_per_epoch=3,
    # pad and combine
    combiner=combine_pad
)

with batch_iter:
    for xs, ys in batch_iter():
        print(xs.shape, ys.shape)
(5, 10, 10, 39) (5,)
(5, 10, 10, 34) (5,)
(5, 10, 10, 39) (5,)

Adaptive batch size


If samples in your pipeline have various sizes, a constant batch size can be too wasteful.

You can pass a function to batch_size instead of an integer.

Let’s say we are classifying 3D images of different shapes along the last axis. We want a batch to contain at most 100 slices along the last axis.

def should_add(seq, item):
    # seq - sequence of already added objects to the batch
    # item - the next item

    count = 0
    for image, label in seq + [item]:
        count += image.shape[-1]

    return count <= 100
from dpipe.batch_iter import combine_pad

batch_iter = Infinite(
    sample(list(zip(images, labels))),

    batch_size=should_add, batches_per_epoch=3,
    combiner=combine_pad
)

with batch_iter:
    for xs, ys in batch_iter():
        print(xs.shape, ys.shape)
(5, 10, 10, 34) (5,)
(4, 10, 10, 25) (4,)
(4, 10, 10, 32) (4,)

Note that the batch sizes are different: 5, 4, 4
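To see how such a predicate shapes the batches, the grouping logic can be imitated in pure Python. This is a sketch under assumptions: group_by_predicate and fits are hypothetical names, plain integers stand in for the number of slices of each image, and how Infinite treats the item that does not fit is assumed here (it starts the next batch).

```python
def group_by_predicate(items, should_add):
    """Greedily fill a batch until should_add rejects the next item."""
    batch = []
    for item in items:
        if batch and not should_add(batch, item):
            yield batch
            batch = []  # the rejected item starts the next batch
        batch.append(item)
    if batch:
        yield batch


def fits(seq, item, limit=100):
    # seq - slice counts already in the batch, item - the next slice count
    return sum(seq) + item <= limit


batches = list(group_by_predicate([34, 30, 30, 25, 60, 20, 50], fits))
```

Each resulting batch holds at most 100 slices in total, so batches of large images contain fewer items.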

Training

deep_pipe has a unified interface for training models. We will show an example for a model written in PyTorch.

from dpipe.train import train

This is the main function. It requires a batch iterator and a train_step function that performs a forward-backward pass for a given batch.

Let’s build all the required components.

Batch iterator

The batch iterators are covered in a separate tutorial (Batch iterators), we’ll reuse the code from it:

from torchvision.datasets import MNIST
from dpipe.batch_iter import Infinite, sample, apply_at
from pathlib import Path
import numpy as np

# download to ~/tests/MNIST, if necessary
dataset = MNIST(Path('~/tests/MNIST').expanduser(), transform=np.array, download=True)


# yield 10 batches of size 30 each epoch:

batch_iter = Infinite(
    sample(dataset),
    apply_at(0, lambda x: x[None].astype('float32')), # add channels dim
    batch_size=30, batches_per_epoch=10,
)

Train Step


Next, we will implement the function that performs a train_step. But first we need an architecture:

import torch
from torch import nn
from dpipe import layers


architecture = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3),

    nn.AdaptiveMaxPool2d((1, 1)),
    nn.Flatten(),

    nn.ReLU(),
    nn.Linear(128, 10),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(architecture.parameters(), lr=1e-3)
from dpipe.torch import to_var, to_np

def cls_train_step(images, labels):
    # move images and labels to same device as architecture
    images, labels = to_var(images, labels, device=architecture)
    architecture.train()

    logits = architecture(images)
    loss = criterion(logits, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # `train_step` must return the loss, which will later be used for logging
    return to_np(loss)

Training the model


Next, we just run the train function:

train(cls_train_step, batch_iter, n_epochs=10)

A more general version of the function cls_train_step is already available in dpipe:

from dpipe.torch import train_step

Apart from the input batches it requires the following arguments: architecture, optimizer, criterion. We can pass these arguments directly to train, so the previous call is equivalent to:

train(
    train_step, batch_iter, n_epochs=10,
    architecture=architecture, optimizer=optimizer, criterion=criterion
)

Logging


After calling train the interpreter just “hangs” until the training is over. In order to log various information about the training process, you can pass a logger:

from dpipe.train import ConsoleLogger

train(
    train_step, batch_iter, n_epochs=3, logger=ConsoleLogger(),
    architecture=architecture, optimizer=optimizer, criterion=criterion
)
00000: train loss: 0.29427966475486755
00001: train loss: 0.26119616627693176
00002: train loss: 0.2186189591884613

There are various logger implementations, e.g. TBLogger, which writes in a format readable by TensorBoard.

Checkpoints

It is often useful to keep checkpoints (or snapshots) of your model and optimizer in case you want to restore them. To do that, pass the checkpoints argument:

from dpipe.train import Checkpoints


checkpoints = Checkpoints(
    'PATH/TO/CHECKPOINTS/FOLDER',
    [architecture, optimizer],
)

train(
    train_step, batch_iter, n_epochs=3, checkpoints=checkpoints,
    architecture=architecture, optimizer=optimizer, criterion=criterion
)

The cool part is that if the training is prematurely stopped, e.g. by an exception, you can resume the training from the same point instead of starting over:

train(
    train_step, batch_iter, n_epochs=3, checkpoints=checkpoints,
    architecture=architecture, optimizer=optimizer, criterion=criterion
)
# ... something bad happened, e.g. KeyboardInterrupt

# start from where you left off
train(
    train_step, batch_iter, n_epochs=3, checkpoints=checkpoints,
    architecture=architecture, optimizer=optimizer, criterion=criterion
)

Value Policies


You can further customize the training process by passing additional values to train_step that change over time.

For example, train_step takes an optional argument lr - used to update the optimizer’s learning rate.

We can change this value after each training epoch using the ValuePolicy interface. Let’s use an exponential learning rate:

from dpipe.train import Exponential

train(
    train_step, batch_iter, n_epochs=10,
    architecture=architecture, optimizer=optimizer, criterion=criterion,
    lr=Exponential(initial=1e-3, multiplier=0.5, step_length=3) # decrease by a factor of 2 every 3 epochs
)
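The schedule in the comment above (halve the value every 3 epochs) can be written out explicitly. This sketch assumes the value changes in steps, once per step_length full epochs; exponential_lr is a hypothetical helper, not the Exponential class itself.

```python
def exponential_lr(epoch, initial=1e-3, multiplier=0.5, step_length=3):
    """Learning rate at a given epoch under a stepwise exponential decay."""
    # integer division keeps the value constant within each block of epochs
    return initial * multiplier ** (epoch // step_length)


schedule = [exponential_lr(epoch) for epoch in range(7)]
```

Under this assumption the rate stays at 1e-3 for epochs 0 to 2, drops to 5e-4 for epochs 3 to 5, and to 2.5e-4 at epoch 6.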

Validation


Finally, you may want to evaluate your network on a separate validation set after each epoch. This is done by the validate argument. It expects a function that simply returns a dictionary with the calculated metrics, e.g.:

from sklearn.metrics import accuracy_score  # assumed source of accuracy_score

def validate():
    architecture.eval()

    # ... predict on validation set
    pred = ...
    ys = ...

    acc = accuracy_score(ys, pred)
    return {
        'accuracy': acc
    }

train(
    train_step, batch_iter, n_epochs=10, validate=validate,
    architecture=architecture, optimizer=optimizer, criterion=criterion,
)

Predict

Usually when dealing with neural networks, at inference time the input data may require some preprocessing before being fed into the network. Also, the network’s output might need postprocessing in order to obtain a final prediction.

Padding and cropping

Let’s suppose that we have a network for segmentation that can only work with images larger than 256x256 pixels.

Before feeding a given image into the network you may want to pad it:

from dpipe.medim.shape_ops import pad_to_shape

padded = pad_to_shape(image, np.maximum(image.shape, (256, 256)))
mask = network(padded)

Now you need to remove the padding so that the mask has the same shape as image:

from dpipe.medim.shape_ops import crop_to_shape

mask = crop_to_shape(mask, image.shape)

Let’s make a function that implements the whole pipeline:

import numpy as np
from dpipe.medim.shape_ops import pad_to_shape, crop_to_shape

def predict_pad(image, network, min_shape):
    # pad
    padded = pad_to_shape(image, np.maximum(image.shape, min_shape))
    # predict
    mask = network(padded)
    # restore
    mask = crop_to_shape(mask, image.shape)
    return mask

Now we have a perfectly reusable function.

Scale

Now let’s write a function that downsamples the input by a factor of 2 and then zooms the output by 2.

import numpy as np
from dpipe.medim.shape_ops import zoom, zoom_to_shape

def predict_zoom(image, network, scale_factor=0.5):
    # zoom
    zoomed = zoom(image, scale_factor)
    # predict
    mask = network(zoomed)
    # restore
    mask = zoom_to_shape(mask, image.shape)
    return mask

Combining

Now suppose we want to combine zooming and padding. We could do something like:

import numpy as np
from dpipe.medim.shape_ops import pad_to_shape, crop_to_shape, zoom, zoom_to_shape

def predict(image, network, min_shape, scale_factor):
    # zoom
    zoomed = zoom(image, scale_factor)

    # ---
    # pad the zoomed image, not the original one
    padded = pad_to_shape(zoomed, np.maximum(zoomed.shape, min_shape))
    # predict
    mask = network(padded)
    # restore
    mask = crop_to_shape(mask, np.minimum(mask.shape, zoomed.shape))
    # ---

    mask = zoom_to_shape(mask, image.shape)
    return mask

Note how the body of predict is divided into two regions: it looks like the function predict_zoom, but with the line

mask = network(zoomed)

replaced by the body of predict_pad.

This means that we can wrap the network with predict_pad, pass the result as the network argument of predict_zoom, and reuse the functions we defined above:

def predict(image, network, min_shape, scale_factor):
    def network_(x):
        return predict_pad(x, network, min_shape)

    return predict_zoom(image, network_, scale_factor)

predict_pad “wraps” the original network - it behaves like network, and predict_zoom doesn’t really care whether it received the original network or a wrapped one.

This sounds just like a decorator.

If we implement predict_pad and predict_zoom as decorators we can more easily reuse them:

def predict_pad(min_shape):
    def decorator(network):
        def predict(image):
            # pad
            padded = pad_to_shape(image, np.maximum(image.shape, min_shape))
            # predict
            mask = network(padded)
            # restore
            mask = crop_to_shape(mask, np.minimum(mask.shape, image.shape))
            return mask

        return predict
    return decorator

def predict_zoom(scale_factor):
    def decorator(network):
        def predict(image):
            # zoom
            zoomed = zoom(image, scale_factor)
            # predict
            mask = network(zoomed)
            # restore
            mask = zoom_to_shape(mask, image.shape)
            return mask

        return predict
    return decorator

Then the same predict can be defined like so:

@predict_zoom(0.5)
@predict_pad((256, 256))
def predict(image):
    # here the image is already zoomed and padded
    return network(image)

Now predict is just a function that receives a single argument - the image.

If you don’t like the decorator approach you can use a handy function for that:

from dpipe.predict.functional import chain_decorators

predict = chain_decorators(
    predict_zoom(0.5),
    predict_pad((256, 256)),
    predict=network,
)

which gives the same function.
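For intuition, a helper of this kind can be written in a few lines. The chain function below is a hypothetical stand-in, not the actual implementation of chain_decorators; it assumes that the decorators are listed outermost first, mirroring the order of stacked @ lines. The two toy decorators work on plain numbers to keep the sketch self-contained.

```python
from functools import reduce


def chain(*decorators, predict):
    """Apply decorators so that the first one listed ends up outermost."""
    return reduce(lambda func, decorator: decorator(func), reversed(decorators), predict)


def scale_input(factor):
    def decorator(func):
        def wrapped(x):
            return func(x * factor)
        return wrapped
    return decorator


def shift_input(offset):
    def decorator(func):
        def wrapped(x):
            return func(x + offset)
        return wrapped
    return decorator


def network(x):
    return x ** 2


@scale_input(0.5)
@shift_input(1)
def stacked(x):
    return network(x)


chained = chain(scale_input(0.5), shift_input(1), predict=network)
```

Both stacked and chained first scale the input, then shift it, and only then call network, so they return the same value for any input.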

Working with patches

If your pipeline requires images of a given shape, you may want to split larger images into patches, perform some operations and then combine the results.

!wget https://www.bluecross.org.uk/sites/default/files/d8/assets/images/118809lprLR.jpg
import numpy as np
from imageio import imread
import matplotlib.pyplot as plt
%matplotlib inline

image = imread('118809lprLR.jpg')
plt.imshow(image)

Probability maps

from torchvision.models import resnet50
from torchvision.transforms import Normalize

model = resnet50(pretrained=True)
# inference mode: batch norm layers use their running statistics
model.eval()
# resnet requires normalization
normalize = Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

We’ll build a probability map for this image by classifying each patch separately. The patches are taken in a convolution-like fashion, i.e. with a fixed stride.
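Since the patches are taken like a convolution without padding, the number of patches along each axis follows the usual formula (size - patch_size) // stride + 1. The helpers below are a hypothetical pure-Python illustration of that arithmetic (they are not dpipe's shape_after_convolution), and the 512x768 image size is made up for the example.

```python
def count_patches(shape, patch_size, stride):
    """Number of full patches along each axis for a fixed stride and no padding."""
    return tuple((size - patch_size) // stride + 1 for size in shape)


def patch_starts(size, patch_size, stride):
    """Start offsets of the patches along a single axis."""
    return list(range(0, size - patch_size + 1, stride))


grid_shape = count_patches((512, 768), patch_size=256, stride=32)
starts = patch_starts(512, patch_size=256, stride=32)
```

With these numbers there are 9 x 17 patches, so a map with one value per patch would have shape (9, 17).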

from dpipe.medim import grid
from dpipe.torch import to_var, to_np
from scipy.special import softmax
from dpipe.medim.shape_utils import shape_after_convolution

x = np.moveaxis(image.astype('float32'), -1, 0) # move channels forward
x = x / 256

probas = []
for patch in grid.divide(x, patch_size=(256, 256), stride=32, valid=True):
    # move the patch to the same device as the model
    patch = to_var(patch, device=model)
    patch = normalize(patch)
    pred = to_np(model(patch[None])[0])
    pred = softmax(pred)

    # according to https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a
    # 281 is "tabby, tabby cat"
    probas.append(pred[281][None, None])

output_shape = shape_after_convolution(x.shape[1:], kernel_size=256, stride=32)
# combine "patches" of shape (1, 1) into an image of `output_shape` with stride 1
heatmap = grid.combine(probas, output_shape, stride=(1, 1))
plt.figure(figsize=(20, 10))
plt.subplot(1, 2, 1)
plt.imshow(heatmap)
plt.subplot(1, 2, 2)
plt.imshow(image)

Patches segmentation

from torchvision.models.segmentation import fcn_resnet101

model = fcn_resnet101(pretrained=True)
# inference mode: batch norm uses its running statistics, dropout is disabled
model.eval()
x = np.moveaxis(image.astype('float32'), -1, 0) # move channels forward
x = x / 256

probas = []
for patch in grid.divide(x, patch_size=(256, 256), stride=32):
    # move the patch to the same device as the model
    patch = to_var(patch, device=model)
    patch = normalize(patch)

    pred = model(patch[None])['out'][0]
    pred = to_np(pred)
    # 'cat' is 8
    pred = pred[8]

    probas.append(pred)

segmentation = grid.combine(probas, x.shape[1:], stride=(32, 32))
plt.figure(figsize=(20, 10))
plt.subplot(1, 2, 1)
plt.imshow(segmentation)
plt.subplot(1, 2, 2)
plt.imshow(image)

Using predictors

The previous approach follows a common pattern: split -> segment -> combine. For this reason there is a predictor that reduces the boilerplate code:

from dpipe.predict import patches_grid


@patches_grid(patch_size=(256, 256), stride=(32, 32), padding_values=None)
def segment(patch):
    patch = to_var(patch, device=model)
    patch = normalize(patch)

    pred = model(patch[None])['out'][0]
    # 'cat' is 8
    return to_np(pred[8])

You can then reuse this function:

# pass the channels-first, rescaled array `x`, as in the explicit loop above
segmentation = segment(x)
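One way to picture such a predictor: it splits the input into overlapping patches, applies the function to each patch, and merges the predictions. The one-dimensional sketch below illustrates the idea with plain lists. patches_1d is a hypothetical decorator, not dpipe's patches_grid; averaging the overlapping regions and requiring the patches to cover the input exactly are assumptions of the sketch.

```python
def patches_1d(patch_size, stride):
    """Hypothetical decorator: split -> predict -> combine by averaging overlaps."""
    def decorator(predict):
        def wrapped(signal):
            n = len(signal)
            # assumption of this sketch: the last patch ends exactly at the border
            assert n >= patch_size and (n - patch_size) % stride == 0
            total = [0.0] * n
            count = [0] * n
            for start in range(0, n - patch_size + 1, stride):
                pred = predict(signal[start:start + patch_size])
                for offset, value in enumerate(pred):
                    total[start + offset] += value
                    count[start + offset] += 1
            return [t / c for t, c in zip(total, count)]
        return wrapped
    return decorator


@patches_1d(patch_size=4, stride=2)
def double(patch):
    # a pointwise "model", so the combined result is easy to verify
    return [2 * value for value in patch]


result = double([1, 2, 3, 4, 5, 6, 7, 8])
```

The decorated function receives the whole signal, while the undecorated body only ever sees patches of length 4.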

Wrappers

Consider the following dataset, which is a simple loader for MNIST:

class MNIST:
    # ...

    def load_image(self, identifier: str):
        return self.xs[int(identifier)]

    def load_label(self, identifier: str):
        return self.ys[int(identifier)]

# The full implementation can be found at `dpipe.tests.mnist.resources`:
# from dpipe.tests.mnist.resources import MNIST

dataset = MNIST('PATH TO DATA')
dataset.load_image(0).shape, dataset.load_label(0)
((1, 28, 28), 5)

Next, suppose you want to upsample the images by a factor of 2.

There are several solutions:

  • Rewrite the dataset - breaks compatibility, not reusable

  • Write a new dataset - not reusable, generates a lot of repetitive code

  • Subclass the dataset - not reusable

  • Wrap the dataset

Wrappers are handy when you need to change the dataset’s behaviour in a reusable way.

You can think of a wrapper as an additional layer around the original dataset. In the case of upsampling it could look something like this:

from dpipe.dataset.wrappers import Proxy
from dpipe.medim.shape_ops import zoom

class UpsampleWrapper(Proxy):
    def load_image(self, identifier):
        # self._shadowed is the original dataset
        image = self._shadowed.load_image(identifier)
        image = zoom(image, [2, 2])
        return image

upsampled = UpsampleWrapper(dataset)
upsampled.load_image(0).shape, upsampled.load_label(0)
((1, 56, 56), 5)

Now this wrapper can be reused with other datasets that have the load_image method. Note that load_label is also working, even though it wasn’t defined in the wrapper.
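This forwarding behaviour can be reproduced with __getattr__, which Python calls only when normal attribute lookup fails. The classes below are a hypothetical, library-free sketch of the idea and are not dpipe's actual Proxy implementation; ToyDataset and DoubleWrapper are made up for the example.

```python
class SimpleProxy:
    """Hypothetical stand-in for a proxy base class."""

    def __init__(self, shadowed):
        self._shadowed = shadowed

    def __getattr__(self, name):
        # reached only for attributes that the wrapper itself does not define
        return getattr(self._shadowed, name)


class ToyDataset:
    def load_image(self, identifier):
        return [int(identifier)] * 3

    def load_label(self, identifier):
        return int(identifier) % 2


class DoubleWrapper(SimpleProxy):
    def load_image(self, identifier):
        # override one method, everything else is forwarded
        return [2 * value for value in self._shadowed.load_image(identifier)]


wrapped = DoubleWrapper(ToyDataset())
image, label = wrapped.load_image('3'), wrapped.load_label('3')
```

Here load_image is found on the wrapper itself, while load_label falls through to the wrapped dataset.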

dpipe already has a collection of predefined wrappers. For example, you can apply upsampling as follows:

from dpipe.dataset.wrappers import apply

upsampled = apply(dataset, load_image=lambda image: zoom(image, [2, 2]))

or in a more functional fashion:

from functools import partial

upsampled = apply(dataset, load_image=partial(zoom, scale_factor=[2, 2]))
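A function with this interface could be sketched as follows. apply_sketch is hypothetical and only mimics the call signature shown above: each keyword names a method of the dataset, and the given function is applied to that method's return value. The actual apply in dpipe may be implemented differently.

```python
class _Applied:
    """Wraps a dataset and post-processes the output of selected methods."""

    def __init__(self, shadowed, functions):
        self._shadowed = shadowed
        self._functions = functions

    def __getattr__(self, name):
        attribute = getattr(self._shadowed, name)
        if name not in self._functions:
            return attribute
        function = self._functions[name]

        def method(*args, **kwargs):
            # call the original method, then transform its result
            return function(attribute(*args, **kwargs))

        return method


def apply_sketch(dataset, **functions):
    return _Applied(dataset, functions)


class ToyDataset:
    def load_image(self, identifier):
        return [1, 2, 3]

    def load_label(self, identifier):
        return 5


dataset = apply_sketch(ToyDataset(), load_image=lambda image: [2 * v for v in image])
image, label = dataset.load_image('0'), dataset.load_label('0')
```

Methods that are not named in the keyword arguments, such as load_label here, are returned unchanged.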