histomicstk.segmentation

This package contains functions for segmenting a variety of objects/structures (e.g. nuclei, tissue, cytoplasm) found in histopathology images.

histomicstk.segmentation.embed_boundaries(im_input, im_perim, color=None)

Embeds object boundaries into an RGB color, grayscale or binary image, returning a color rendering of the image and object boundaries.

Takes as input a grayscale or color image, a perimeter mask of object boundaries, and an RGB triplet, and embeds the object boundaries into the input image at the prescribed color. Returns an RGB color image of type unsigned char. If the input image is of type double and all of its pixel values lie in the range [0, 1], it is scaled to the range [0, 255]. Otherwise it is assumed to already be in the range of an unsigned char image.

Parameters:
  • im_input (array_like) – A color or grayscale image.

  • im_perim (array_like) – A binary image where object perimeter pixels have value 1, and non-perimeter pixels have value 0.

  • color (array_like) – A 1 x 3 array of RGB values in the range [0, 255].

Returns:

im_embed – A color image of type unsigned char where boundary pixels take on the color defined by the RGB-triplet ‘color’.

Return type:

array_like
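The behavior described above can be illustrated with a minimal, dependency-free sketch. This is not the library's implementation: the name embed_boundaries_sketch is hypothetical, nested lists stand in for arrays, only grayscale input is handled, and the red default color is an assumption (the documented default is None).

```python
def embed_boundaries_sketch(im_gray, im_perim, color=(255, 0, 0)):
    """Paint perimeter pixels of a grayscale image with an RGB color.

    im_gray  : rows of intensities, floats in [0, 1] or ints in [0, 255]
    im_perim : rows of 0/1 flags, 1 marking an object boundary pixel
    color    : RGB triplet in [0, 255] (red default is an assumption)
    """
    # Treat the image as "double in [0, 1]" only if every value is a float
    # inside that range; otherwise assume it is already in [0, 255].
    is_unit_float = all(
        isinstance(v, float) and 0.0 <= v <= 1.0 for row in im_gray for v in row
    )
    scale = 255 if is_unit_float else 1

    im_embed = []
    for gray_row, perim_row in zip(im_gray, im_perim):
        out_row = []
        for value, on_boundary in zip(gray_row, perim_row):
            if on_boundary:
                # Boundary pixels take on the requested color.
                out_row.append(tuple(color))
            else:
                # Other pixels keep their gray level, replicated to R, G and B.
                level = int(round(value * scale))
                out_row.append((level, level, level))
        im_embed.append(out_row)
    return im_embed


im_gray = [[0.0, 0.5], [1.0, 0.25]]
im_perim = [[0, 1], [0, 0]]
im_embed = embed_boundaries_sketch(im_gray, im_perim)
```

Here the single perimeter pixel at row 0, column 1 becomes red, while the remaining pixels are rescaled from [0, 1] to [0, 255] and rendered as gray RGB triplets.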

histomicstk.segmentation.rag(im_label, neigh_conn=4)

Constructs a region adjacency graph for a label image using either 4-neighbor or 8-neighbor connectivity. Background pixels (im_label == 0) are not included. Not intended for building large graphs from individual pixels.

Parameters:
  • im_label (array_like) – A label image where positive values (im_label > 0) correspond to foreground objects of interest.

  • neigh_conn (float) – The neighbor connectivity to use, either ‘4’ or ‘8’. Default value = 4.

Returns:

adj_mat – A binary matrix of size N x N, where N is the number of objects in im_label. A value of ‘True’ at adj_mat(i,j) indicates that objects ‘i’ and ‘j’ are neighbors under the chosen connectivity (neigh_conn).

Return type:

array_like
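To make the construction concrete, the following is a rough sketch of how such an adjacency matrix could be built. It is not the library's code: rag_sketch is a hypothetical name, nested lists stand in for arrays, and labels are assumed to be the contiguous integers 1..N so that object k maps to row and column k - 1.

```python
def rag_sketch(im_label, neigh_conn=4):
    """Build an N x N boolean adjacency matrix from a label image.

    Assumes labels are contiguous integers 1..N; 0 is background.
    """
    n_objects = max(max(row) for row in im_label)
    adj_mat = [[False] * n_objects for _ in range(n_objects)]

    # Only "forward" offsets are needed: each pixel pair is visited once
    # and the matrix is filled symmetrically.
    offsets = [(0, 1), (1, 0)]
    if neigh_conn == 8:
        offsets += [(1, 1), (1, -1)]  # add the two forward diagonals

    n_rows, n_cols = len(im_label), len(im_label[0])
    for r in range(n_rows):
        for c in range(n_cols):
            a = im_label[r][c]
            if a == 0:
                continue  # background pixels are not part of the graph
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    b = im_label[rr][cc]
                    if b != 0 and b != a:
                        adj_mat[a - 1][b - 1] = True
                        adj_mat[b - 1][a - 1] = True
    return adj_mat


im_label = [
    [1, 1, 0],
    [1, 0, 2],
    [3, 3, 2],
]
adj4 = rag_sketch(im_label, neigh_conn=4)
adj8 = rag_sketch(im_label, neigh_conn=8)
```

In this toy image, objects 1 and 2 touch only diagonally, so they are neighbors under 8-connectivity but not under 4-connectivity.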

histomicstk.segmentation.rag_add_layer(adj_mat)

Adds an additional layer of dependence to a region adjacency graph, connecting each node to the neighbors of its immediate neighbors.

Parameters:

adj_mat (array_like) – A binary matrix of size N x N, where N is the number of objects in the label image. A value of ‘True’ at adj_mat(i,j) indicates that objects ‘i’ and ‘j’ are neighbors.

Returns:

Layered – A version of ‘adj_mat’ with additional edges to connect 2-neighbors.

Return type:

array_like
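One way to picture the added layer is as a two-hop expansion of the graph. The sketch below is illustrative only: rag_add_layer_sketch is a hypothetical name, nested lists stand in for arrays, and leaving the diagonal unset (no self-edges) is an assumption rather than documented behavior.

```python
def rag_add_layer_sketch(adj_mat):
    """Connect every node to the neighbors of its immediate neighbors."""
    n = len(adj_mat)
    layered = [row[:] for row in adj_mat]  # keep the original edges
    for i in range(n):
        for k in range(n):
            if not adj_mat[i][k]:
                continue  # k is not an immediate neighbor of i
            for j in range(n):
                # j is a neighbor of k, hence a 2-neighbor of i.
                # Self-edges are skipped (assumption).
                if adj_mat[k][j] and j != i:
                    layered[i][j] = True
    return layered


# Path graph 0 - 1 - 2 - 3
path = [
    [False, True, False, False],
    [True, False, True, False],
    [False, True, False, True],
    [False, False, True, False],
]
layered = rag_add_layer_sketch(path)
```

For the path graph, the result gains the edges 0-2 and 1-3, while nodes 0 and 3 (three hops apart) remain unconnected.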

histomicstk.segmentation.rag_color(adj_mat)

Generates a coloring of an adjacency graph using the sequential coloring algorithm. Used to bin regions from a label image into a small number of independent groups that can be processed separately with algorithms like multi-label graph cuts or individual active contours. The rationale is to color adjacent objects with distinct colors so that their contours can be co-evolved.

Parameters:

adj_mat (array_like) – A binary matrix of size N x N, where N is the number of objects in the label image. A value of ‘True’ at adj_mat(i,j) indicates that objects ‘i’ and ‘j’ are neighbors. Does not contain entries for background objects.

Returns:

Colors – A list of colors for the objects encoded in ‘adj_mat’. No two objects that are connected in ‘adj_mat’ will share the same color.

Return type:

array_like
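As an illustration of sequential (greedy) coloring, the sketch below visits nodes in index order and gives each one the smallest color not already used by a colored neighbor. The function name rag_color_sketch, the visiting order, and the 1-based integer colors are assumptions; the library may order nodes or encode colors differently.

```python
def rag_color_sketch(adj_mat):
    """Greedy sequential coloring of a graph given as a boolean matrix."""
    n = len(adj_mat)
    colors = [0] * n  # 0 means "not yet colored" (assumed encoding)
    for i in range(n):
        # Colors already taken by neighbors that have been colored.
        taken = {colors[j] for j in range(n) if adj_mat[i][j] and colors[j]}
        color = 1
        while color in taken:
            color += 1
        colors[i] = color
    return colors


# Nodes 0, 1, 2 form a triangle; node 3 touches node 0 only.
adj_mat = [
    [False, True, True, True],
    [True, False, True, False],
    [True, True, False, False],
    [True, False, False, False],
]
colors = rag_color_sketch(adj_mat)
```

Objects that share a color are guaranteed not to be adjacent, so each color group can be handed to a separate pass of a contour or graph-cut routine.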

histomicstk.segmentation.simple_mask(im_rgb, bandwidth=2, bgnd_std=2.5, tissue_std=30, min_peak_width=10, max_peak_width=25, fraction=0.1, min_tissue_prob=0.05)

Segments the foreground (tissue) in brightfield H&E images. Uses a simple two-component Gaussian mixture model to separate tissue areas from background. Kernel density estimation is used to create a smoothed image histogram, which is then analyzed to identify the modes corresponding to tissue and background. The widths of the mode peaks are estimated, and a constrained optimization fits Gaussians directly to the histogram (rather than running expectation-maximization on the data, which is more prone to local minima). A maximum-likelihood threshold is then derived and used to mask the tissue area in a binarized image.

Parameters:
  • im_rgb (array_like) – An RGB image of type unsigned char.

  • bandwidth (double, optional) – Bandwidth for kernel density estimation - used for smoothing the grayscale histogram. Default value = 2.

  • bgnd_std (double, optional) – Standard deviation of background gaussian to be used if estimation fails. Default value = 2.5.

  • tissue_std (double, optional) – Standard deviation of tissue gaussian to be used if estimation fails. Default value = 30.

  • min_peak_width (double, optional) – Minimum peak width for finding peaks in KDE histogram. Used to initialize curve fitting process. Default value = 10.

  • max_peak_width (double, optional) – Maximum peak width for finding peaks in KDE histogram. Used to initialize curve fitting process. Default value = 25.

  • fraction (double, optional) – Fraction of pixels to sample for building foreground/background model. Default value = 0.10.

  • min_tissue_prob (double, optional) – Minimum probability to qualify as tissue pixel. Default value = 0.05.

Returns:

im_mask – A binarized version of im_rgb where foreground (tissue) has value ‘1’.

Return type:

array_like
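The final masking step can be sketched in isolation. The example below assumes the two Gaussians have already been fitted and skips the kernel density estimation, peak detection, and constrained optimization entirely. The function names, the (weight, mean, std) parameter layout, the numeric values, and the use of a posterior tissue probability compared against min_tissue_prob are all assumptions made for illustration, not the library's actual procedure.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def tissue_mask_sketch(im_gray, tissue, bgnd, min_tissue_prob=0.05):
    """Mask pixels whose posterior tissue probability is high enough.

    tissue, bgnd : (weight, mean, std) of each mixture component
                   (hypothetical layout, values supplied by the caller)
    """
    im_mask = []
    for row in im_gray:
        mask_row = []
        for value in row:
            p_tissue = tissue[0] * gaussian_pdf(value, tissue[1], tissue[2])
            p_bgnd = bgnd[0] * gaussian_pdf(value, bgnd[1], bgnd[2])
            total = p_tissue + p_bgnd
            # Guard against both densities underflowing to zero.
            posterior = p_tissue / total if total > 0 else 0.0
            mask_row.append(1 if posterior >= min_tissue_prob else 0)
        im_mask.append(mask_row)
    return im_mask


# Bright pixels near 240 resemble slide background; darker ones resemble tissue.
im_gray = [[240, 238, 120], [242, 90, 150]]
im_mask = tissue_mask_sketch(
    im_gray,
    tissue=(0.4, 130.0, 30.0),  # made-up fit, std matches the tissue_std default
    bgnd=(0.6, 240.0, 2.5),     # made-up fit, std matches the bgnd_std default
)
```

With a narrow background component centered near white, only the pixels close to 240 are rejected; everything darker is overwhelmingly more likely under the broad tissue component and is kept in the mask.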

** Sub-packages **