This package contains functions for computing a variety of image-based features
that quantify the appearance and/or morphology of objects/regions in an
image. These are needed for classifying objects (e.g. nuclei) and
regions (e.g. tissues) found in histopathology images.
Calculates Fourier shape descriptors for each object.
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
K (int, optional) – Number of points for boundary resampling to calculate Fourier
descriptors. Default value = 128.
Fs (int, optional) – Number of frequency bins for calculating FSDs. Default value = 6.
Delta (int, optional) – Used to dilate nuclei and define cytoplasm region. Default value = 8.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed then it will be computed inside which will increase the
computation time.
fdata – Pandas data frame containing the FSD features for each
object/label.
Return type: pandas.DataFrame
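The idea behind Fourier shape descriptors can be sketched with NumPy alone. This is a generic illustration, not the library's implementation: the function names are hypothetical, the input is assumed to be one closed boundary given as x and y coordinate sequences, and the way the library groups frequencies into Fs bins is not reproduced here. The sketch resamples the boundary to a fixed number of points, takes the FFT of the complex boundary signal, and normalizes the magnitudes so the result does not depend on position or scale.

```python
import numpy as np


def resample_closed_boundary(x, y, num_points=128):
    """Resample a closed boundary to points evenly spaced by arc length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Close the contour by appending the first point at the end.
    xc = np.append(x, x[0])
    yc = np.append(y, y[0])
    seg = np.hypot(np.diff(xc), np.diff(yc))
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, arc[-1], num_points, endpoint=False)
    return np.interp(targets, arc, xc), np.interp(targets, arc, yc)


def fourier_shape_descriptors(x, y, num_points=128, num_coeffs=6):
    """Normalized magnitudes of the first harmonics of a closed boundary."""
    xs, ys = resample_closed_boundary(x, y, num_points)
    # Signed area (shoelace formula). Reverse the traversal if it is negative
    # so that the first positive harmonic is the dominant one.
    signed_area = 0.5 * np.sum(xs * np.roll(ys, -1) - np.roll(xs, -1) * ys)
    if signed_area < 0:
        xs, ys = xs[::-1], ys[::-1]
    mags = np.abs(np.fft.fft(xs + 1j * ys))
    # Index 0 (the mean position) is skipped for translation invariance;
    # dividing by the first harmonic gives scale invariance.
    return mags[1:num_coeffs + 1] / mags[1]
```

For a circular boundary the first descriptor is 1 and the remaining ones are close to 0; elongated or irregular shapes put more energy into the higher harmonics.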
Compute global (i.e., not per-nucleus) features of the nuclei with
the given centroids based on the partitioning of the space into
Voronoi cells and on the induced graph structure.
centroids (array_like) – Nx2 numpy array of nuclear centroids
neighbor_distances (array_like) – Radii to count neighbors in
neighbor_counts (sequence) – Sequence of numbers of neighbors, each of which is used to
compute statistics relating to the distance required to reach
that many neighbors.
props – A single-row DataFrame with the following columns:
voronoi_…: Voronoi diagram features
    area_…: Polygon area features
    peri_…: Polygon perimeter features
    max_dist_…: Maximum distance in polygon features
delaunay_…: Delaunay triangulation features
    sides_…: Triangle side length features
    area_…: Triangle area features
mst_branches_…: Minimum spanning tree branch features
density_…: Density features
    0, 1, …, len(neighbor_distances) - 1: Neighbor count
    within given radius features.
    0, 1, …, len(neighbor_counts) - 1: Minimum distance to
    enclose count neighbors features.
The “…”s are meant to signify that what precedes is the
start of a column name. At the end of each column name is one
of ‘mean’, ‘stddev’, ‘min_max_ratio’, and ‘disorder’.
‘min_max_ratio’ is the minimum-to-maximum ratio, and disorder
is stddev / (mean + stddev).
Return type: pandas.DataFrame
The indices for the density features are with respect to the
sorted values of the corresponding argument sequence.
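The four column suffixes described above can be illustrated with a small helper. This is a sketch rather than the library's code: the function name and the example prefix are hypothetical, and the use of the population standard deviation is an assumption, since the source does not say which estimator is used.

```python
import numpy as np


def summarize(values, prefix):
    """Return the four summary statistics named by the column suffixes."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    stddev = values.std()  # population standard deviation (assumption)
    return {
        prefix + 'mean': mean,
        prefix + 'stddev': stddev,
        # Ratio of the smallest to the largest value.
        prefix + 'min_max_ratio': values.min() / values.max(),
        # Disorder as defined above: stddev / (mean + stddev).
        prefix + 'disorder': stddev / (mean + stddev),
    }


# Example: side lengths of a single 3-4-5 triangle.
stats = summarize([3.0, 4.0, 5.0], 'sides_')
```

Disorder is 0 when all values are equal and approaches 1 as the spread grows relative to the mean.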
Calculates gradient features from an intensity image.
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
im_intensity (array_like) – Intensity image
num_hist_bins (int, optional) – Number of bins used to compute the gradient histogram of an object.
The histogram is used to compute the energy and entropy features. Default is 10.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed then it will be computed inside which will increase the
computation time.
fdata – A pandas dataframe containing the gradient features listed below for
each object/label.
Return type: pandas.DataFrame
List of gradient features computed by this function:
Mean of gradient data.
Standard deviation of gradient data.
Skewness of gradient data. Value is 0 when all values are equal.
Kurtosis of gradient data. Value is -3 when all values are equal.
Energy of the gradient magnitude histogram of object pixels
Entropy of the gradient magnitude histogram of object pixels.
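As a rough illustration of per-object gradient statistics, the sketch below computes the gradient magnitude with finite differences and summarizes it for every non-zero label. The function name is hypothetical, and the choice of np.gradient as the derivative operator is an assumption; the library may use a different gradient filter.

```python
import numpy as np


def gradient_magnitude_stats(im_label, im_intensity):
    """Mean and standard deviation of gradient magnitude per labeled object."""
    im_label = np.asarray(im_label)
    # Finite-difference derivatives along rows (gy) and columns (gx).
    gy, gx = np.gradient(np.asarray(im_intensity, dtype=float))
    magnitude = np.hypot(gx, gy)
    stats = {}
    for label in np.unique(im_label):
        if label == 0:  # zero is background
            continue
        values = magnitude[im_label == label]
        stats[int(label)] = (values.mean(), values.std())
    return stats
```

On an image whose intensity increases linearly along one axis, every object gets the same mean gradient magnitude and a standard deviation of zero.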
Calculates 26 Haralick texture features for each object in the given label
mask.
These features are derived from the gray-level co-occurrence matrix (GLCM),
which is a two-dimensional histogram containing the counts/probabilities of
co-occurring intensity values with a given neighborhood offset in the
region occupied by an object in the image.
im_label (array_like) – An ND labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
im_intensity (array_like) – An ND single channel intensity image.
offsets (array_like, optional) –
A (num_offsets, num_image_dims) array of offset vectors
specifying the distance between the pixel-of-interest and
its neighbor. Note that the first dimension corresponds to
the rows.
See histomicstk.features.graycomatrixext for more details.
num_levels (unsigned int, optional) –
An integer specifying the number of gray levels. For example, if
num_levels is 8, the intensity values of the input image are
scaled so they are integers between 1 and 8. The number of gray
levels determines the size of the gray-level co-occurrence matrix.
Default: 2 for binary/logical image, 32 for numeric image
gray_limits (array_like, optional) –
A two-element array specifying the desired input intensity range.
Intensity values in the input image will be clipped into this range.
Default: [0, 1] for boolean-valued image, [0, 255] for integer-valued
image, and [0.0, 1.0] for real-valued image.
fdata – A pandas dataframe containing the haralick features.
Return type: pandas.DataFrame
This function computes the following list of haralick features derived
from normalized GLCMs (P) of the given list of neighborhood offsets:
Haralick.ASM.Mean, Haralick.ASM.Range: float
Mean and range of the angular second moment (ASM) feature for GLCMs
of all offsets. It is a measure of image homogeneity and is computed
as the sum of the squared entries of P: ASM = \sum_{i,j} P(i,j)^2.
Mean and range of the Contrast feature for GLCMs of all offsets. It is
a measure of the amount of variation between intensities of
neighboring pixels. It is equal to zero for a constant image and
increases as the amount of variation increases. It is computed as
Contrast = \sum_{i,j} (i - j)^2 P(i,j).
Mean and range of the Correlation feature for GLCMs of all offsets. It
is a measure of correlation between the intensity values of
neighboring pixels. It is computed as
Correlation = (\sum_{i,j} i j P(i,j) - \mu_x \mu_y) / (\sigma_x \sigma_y),
where \mu_x, \mu_y, \sigma_x, and \sigma_y are the means and standard
deviations of the row and column marginal distributions of P.
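The three features described above follow the standard Haralick definitions and can be computed directly from a co-occurrence matrix. The sketch below is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, and "range" is taken to mean the maximum minus the minimum across offsets.

```python
import numpy as np


def haralick_basic(glcm):
    """ASM, contrast and correlation of a single co-occurrence matrix."""
    p = np.asarray(glcm, dtype=float)
    p = p / p.sum()  # normalize counts to probabilities
    i, j = np.indices(p.shape)
    asm = np.sum(p ** 2)
    contrast = np.sum((i - j) ** 2 * p)
    # Means and standard deviations of the row and column marginals.
    mu_x = np.sum(i * p)
    mu_y = np.sum(j * p)
    sigma_x = np.sqrt(np.sum((i - mu_x) ** 2 * p))
    sigma_y = np.sqrt(np.sum((j - mu_y) ** 2 * p))
    correlation = (np.sum(i * j * p) - mu_x * mu_y) / (sigma_x * sigma_y)
    return asm, contrast, correlation


def mean_and_range(values):
    """Summarize one feature across the GLCMs of all offsets."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.max() - values.min()
```

Correlation is undefined when either marginal has zero variance (for example, a constant region), so a caller would need to handle that case separately.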
Calculate intensity features from an intensity image.
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
im_intensity (array_like) – Intensity image.
num_hist_bins (int, optional) – Number of bins used to compute the intensity histogram of an object.
The histogram is used to compute the energy and entropy features. Default is 10.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed then it will be computed inside which will increase the
computation time.
feature_list (list, default is None) – List of intensity features to return.
If None, all intensity features are returned.
fdata – A pandas dataframe containing the intensity features listed below for
each object/label.
Return type: pandas.DataFrame
List of intensity features computed by this function:
Minimum intensity of object pixels.
Maximum intensity of object pixels.
Mean intensity of object pixels
Median intensity of object pixels
Difference between mean and median intensities of object pixels.
Standard deviation of the intensities of object pixels
Intensity.IQR: float
Inter-quartile range of the intensities of object pixels
Intensity.MAD: float
Median absolute deviation of the intensities of object pixels
Skewness of the intensities of object pixels. Value is 0 when all
intensity values are equal.
Kurtosis of the intensities of object pixels. Value is -3 when all
values are equal.
Energy of the intensity histogram of object pixels
Entropy of the intensity histogram of object pixels.
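Several of the less familiar statistics in this list can be sketched for the pixels of a single object as follows. The function name and dictionary keys are hypothetical, and the use of the natural logarithm for entropy is an assumption; the source does not state the logarithm base.

```python
import numpy as np


def intensity_summary(pixels, num_hist_bins=10):
    """Robust spread and histogram statistics for one object's pixels."""
    pixels = np.asarray(pixels, dtype=float)
    median = np.median(pixels)
    q25, q75 = np.percentile(pixels, [25, 75])
    hist, _ = np.histogram(pixels, bins=num_hist_bins)
    prob = hist / hist.sum()  # histogram as a probability distribution
    nonzero = prob[prob > 0]  # empty bins contribute nothing to entropy
    return {
        'mean_median_diff': pixels.mean() - median,
        'iqr': q75 - q25,
        'mad': np.median(np.abs(pixels - median)),
        'hist_energy': np.sum(prob ** 2),
        'hist_entropy': -np.sum(nonzero * np.log(nonzero)),
    }
```

Histogram energy is highest (1) when all pixels fall into one bin, while entropy is highest when the pixels are spread evenly across the bins.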
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed then it will be computed inside which will increase the
computation time.
fdata – A pandas dataframe containing the morphometry features for each
object/label listed below.
Return type: pandas.DataFrame
List of morphometry features computed by this function:
Angle between the horizontal axis and the major axis of the ellipse
that has the same second moments as the region,
ranging from -pi/2 to pi/2 counter-clockwise.
Number of pixels the object occupies.
Number of pixels of convex hull image, which is the smallest convex
polygon that encloses the region.
The length of the major axis of the ellipse that has the same
normalized second central moments as the object.
The length of the minor axis of the ellipse that has the same
normalized second central moments as the region.
Perimeter of object which approximates the contour as a line
through the centers of border pixels using a 4-connectivity.
Shape.Circularity: float
A measure of how similar the shape of an object is to the circle
A measure of aspect ratio computed to be the eccentricity of the
ellipse that has the same second-moments as the object region.
Eccentricity of an ellipse is the ratio of the focal distance
(distance between focal points) over the major axis length. The value
is in the interval [0, 1). When it is 0, the ellipse becomes a circle.
The diameter of a circle with the same area as the object.
Ratio of area of the object to its axis-aligned bounding box.
A measure of aspect ratio. Ratio of minor to major axis of the ellipse
that has the same second-moments as the object region
A measure of convexity computed as the ratio of the number of pixels
in the object to that of its convex hull.
The seven Hu moment features, where k ranges from 1 to 7. The first six
moments are translation, scale and rotation invariant, while the
seventh moment flips its sign if the shape is a mirror image.
Same as the Hu moments, but computed from the intensity image instead of
the binary mask.
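Some of the shape measures above are simple functions of the area, perimeter and ellipse axis lengths. The sketch below shows how they relate; it is an illustration, not the library's code. The function name and dictionary keys are hypothetical, and the circularity formula 4 * pi * area / perimeter^2 is the commonly used definition, assumed here because the source does not give one.

```python
import math


def shape_measures(area, perimeter, major_axis_length, minor_axis_length):
    """Derived shape measures from basic size measurements."""
    axis_ratio = minor_axis_length / major_axis_length
    return {
        # Equals 1 for a perfect circle and decreases for other shapes.
        'circularity': 4.0 * math.pi * area / perimeter ** 2,
        # Diameter of the circle that has the same area as the object.
        'equivalent_diameter': math.sqrt(4.0 * area / math.pi),
        # Focal distance over major axis length of the fitted ellipse.
        'eccentricity': math.sqrt(1.0 - axis_ratio ** 2),
        'minor_major_axis_ratio': axis_ratio,
    }


# Example: an ideal circle of radius 5.
circle = shape_measures(25.0 * math.pi, 10.0 * math.pi, 10.0, 10.0)
```

On pixelated objects the measured perimeter is only an approximation, so the circularity of a digitized circle will generally differ slightly from 1.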
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
fsd_bnd_pts (int, optional) – Number of points for boundary resampling to calculate Fourier
descriptors. Default value = 128.
fsd_freq_bins (int, optional) – Number of frequency bins for calculating FSDs. Default value = 6.
cyto_width (float, optional) – Estimated width of the ring-like neighborhood region around each
nucleus to be considered as its cytoplasm. Default value = 8.
An integer specifying the number of gray levels. For example, if
NumLevels is 32, the intensity values of the input image are
scaled so they are integers between 0 and 31. The number of gray
levels determines the size of the gray-level co-occurrence matrix.
Default: 32
morphometry_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
morphometry (size and shape) features.
See histomicstk.features.compute_morphometry_features for more details.
fsd_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
Fourier shape descriptor (FSD) features.
See histomicstk.features.compute_fsd_features for more details.
intensity_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
intensity features from the nucleus and cytoplasm channels.
See histomicstk.features.compute_intensity_features for more details.
gradient_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
gradient/edge features from the nucleus and cytoplasm channels.
See histomicstk.features.compute_gradient_features for more details.
haralick_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
Haralick features from the nucleus and cytoplasm channels.
See histomicstk.features.compute_haralick_features for more details.
return_nuclei_annotation (bool, optional) – If True, the nuclei annotations are also returned.
fdata (pandas.DataFrame) – A pandas data frame containing the features listed below for each
object/label.
nuclei_annot_list (List) – List containing the boundaries of segmented nuclei in the input image.
List of features computed by this function
Location of the nucleus and its code in the input labeled mask.
Columns are prefixed by ‘Identifier.’ and include:
Identifier.Label (int) - nucleus label in the input labeled mask
Identifier.Xmin (int) - Left bound
Identifier.Ymin (int) - Upper bound
Identifier.Xmax (int) - Right bound
Identifier.Ymax (int) - Lower bound
Identifier.CentroidX (float) - X centroid (columns)
Identifier.CentroidY (float) - Y centroid (rows)
Identifier.WeightedCentroidX (float) - intensity-weighted X centroid
Identifier.WeightedCentroidY (float) - intensity-weighted Y centroid
Morphometry (size, shape, and orientation) features of the nuclei
See histomicstk.features.compute_morphometry_features for more details.
Feature names are prefixed by ‘Size.’, ‘Shape.’, or ‘Orientation.’.
Fourier shape descriptor features
See histomicstk.features.compute_fsd_features for more details.
Feature names are prefixed by FSD.
Intensity features for the nucleus and cytoplasm channels
See histomicstk.features.compute_intensity_features for more details.
Feature names are prefixed by Nucleus.Intensity. for nucleus features
and Cytoplasm.Intensity. for cytoplasm features.
Gradient/edge features for the nucleus and cytoplasm channels
See histomicstk.features.compute_gradient_features for more details.
Feature names are prefixed by Nucleus.Gradient. for nucleus features
and Cytoplasm.Gradient. for cytoplasm features.
Haralick features for the nucleus and cytoplasm channels
See histomicstk.features.compute_haralick_features for more details.
Feature names are prefixed by Nucleus.Haralick. for nucleus features
and Cytoplasm.Haralick. for cytoplasm features.
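Because every feature family uses its own column prefix, the columns of the returned table can be split into groups by name. The helper below is a hypothetical convenience, not part of the library, and works on a plain list of column names. In the example, 'Nucleus.Intensity.IQR' is an assumed name built by combining a prefix and a feature name mentioned above.

```python
def group_columns_by_prefix(columns, prefixes):
    """Assign each column name to the first prefix it starts with."""
    groups = {prefix: [] for prefix in prefixes}
    for name in columns:
        for prefix in prefixes:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
    return groups


# Illustrative column names only; the real table has many more columns.
columns = [
    'Identifier.Label',
    'Identifier.CentroidX',
    'Shape.Circularity',
    'Nucleus.Intensity.IQR',
]
groups = group_columns_by_prefix(
    columns, ['Identifier.', 'Shape.', 'Nucleus.Intensity.'])
```

With a pandas data frame, the same list of names can be obtained from its columns attribute, and each group can then be used to select a sub-table.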
Computes the gray-level co-occurrence matrix (GLCM) within a region of
interest (ROI) of an image. The GLCM is a 2D histogram/matrix containing the
counts/probabilities of co-occurring intensity values at a given offset
within an ROI of an image.
See the parameter descriptions for the default values used for each
optional parameter in different scenarios.
im_input (array_like) – Input single channel intensity image
im_roi_mask (array_like, optional) –
A binary mask specifying the region of interest within which
to compute the GLCM. If not specified, the GLCM is computed for
the entire image.
Default: None
offsets (array_like, optional) –
A (num_offsets, num_image_dims) array of offset vectors
specifying the distance between the pixel-of-interest and
its neighbor. Note that the first dimension corresponds to
the rows.
Because this offset is often expressed as an angle, the
following table lists the offset values that specify common
angles for a 2D image, given the pixel distance D.
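The relationship between angles and offsets can be sketched as a small lookup. The values below follow the convention commonly used for co-occurrence matrices, with offsets given as (row, column) and row indices increasing downward; they are an assumption, and the function name is hypothetical.

```python
def offset_for_angle(angle_degrees, distance=1):
    """(row, column) offset for one of the four common 2D directions."""
    d = int(distance)
    table = {
        0: (0, d),      # neighbor to the right
        45: (-d, d),    # neighbor up and to the right
        90: (-d, 0),    # neighbor directly above
        135: (-d, -d),  # neighbor up and to the left
    }
    return table[angle_degrees]


# Offsets for all four directions at a pixel distance of 1.
offsets = [offset_for_angle(angle) for angle in (0, 45, 90, 135)]
```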
An integer specifying the number of gray levels. For example, if
num_levels is 8, the intensity values of the input image are
scaled so they are integers between 1 and 8. The number of gray
levels determines the size of the gray-level co-occurrence matrix.
Default: 2 for binary/logical image, 32 for numeric image
gray_limits (array_like, optional) –
A two-element array specifying the desired input intensity range.
Intensity values in the input image will be clipped into this range.
Default: [0, 1] for boolean-valued image, [0, 255] for integer-valued
image, and [0.0, 1.0] for real-valued image.
A boolean value that specifies whether or not the ordering of values
in pixel pairs is considered while creating the GLCM matrix.
For example, if Symmetric is True, then while calculating the
number of times the value 1 is adjacent to the value 2, both
1,2 and 2,1 pairings are counted. GLCM created in this way is
symmetric across its diagonal.
Specifies whether or not to exclude a pixel-pair if the
neighboring pixel in the pair is outside im_roi_mask.
Has an effect only when im_roi_mask is specified.
Default: False
glcm – num_levels x num_levels x num_offsets array containing the GLCM
for each offset.
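To make the behavior described above concrete, here is a minimal NumPy sketch of a 2D co-occurrence matrix computation. It is not the library's implementation: the function name is hypothetical, the quantization rule (levels numbered from 0) is an assumption, and the parameter names symmetric and exclude_boundary are assumed names for the two boolean options described above.

```python
import numpy as np


def glcm_2d(im_input, offsets, num_levels=8, gray_limits=(0, 255),
            symmetric=False, im_roi_mask=None, exclude_boundary=False):
    """Count co-occurring gray levels for each (row, column) offset."""
    lo, hi = gray_limits
    im = np.clip(np.asarray(im_input, dtype=float), lo, hi)
    # Quantize into integer levels 0 .. num_levels - 1 (assumed rule).
    scaled = (im - lo) / (hi - lo)
    levels = np.minimum((scaled * num_levels).astype(int), num_levels - 1)
    rows, cols = levels.shape
    if im_roi_mask is None:
        roi = np.ones(levels.shape, dtype=bool)
    else:
        roi = np.asarray(im_roi_mask, dtype=bool)
    glcm = np.zeros((num_levels, num_levels, len(offsets)), dtype=np.int64)
    for k, (dr, dc) in enumerate(offsets):
        if abs(dr) >= rows or abs(dc) >= cols:
            continue  # no pixel has a neighbor at this offset
        # Reference pixels whose neighbor stays inside the image.
        r0, r1 = max(0, -dr), min(rows, rows - dr)
        c0, c1 = max(0, -dc), min(cols, cols - dc)
        ref = levels[r0:r1, c0:c1]
        nbr = levels[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
        keep = roi[r0:r1, c0:c1].copy()
        if exclude_boundary:
            # Drop pairs whose neighbor lies outside the ROI.
            keep &= roi[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
        counts = np.zeros((num_levels, num_levels), dtype=np.int64)
        np.add.at(counts, (ref[keep], nbr[keep]), 1)
        if symmetric:
            counts = counts + counts.T  # count (a, b) and (b, a) together
        glcm[:, :, k] = counts
    return glcm
```

The result has shape num_levels x num_levels x num_offsets, matching the description of glcm above. Dividing each slice by its sum turns the counts into probabilities.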