This package contains functions for computing a variety of image-based features
that quantify the appearance and/or morphology of objects/regions in the
image. These are needed for classifying objects (e.g. nuclei) and
regions (e.g. tissues) found in histopathology images.
Calculates Fourier shape descriptors for each object.
Parameters:
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
objects.
K (int, optional) – Number of points for boundary resampling to calculate Fourier
descriptors. Default value = 128.
Fs (int, optional) – Number of frequency bins for calculating FSDs. Default value = 6.
Delta (int, optional) – Used to dilate nuclei and define cytoplasm region. Default value = 8.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed, it is computed internally, which increases the
computation time.
Returns:
fdata – A pandas data frame containing the FSD features for each
object/label.
Return type:
pandas.DataFrame
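The idea behind a Fourier shape descriptor can be sketched in a few lines: resample a closed boundary at equally spaced arc-length positions, take the FFT of the complex coordinates, and normalize the magnitudes. This is a minimal illustration, not the library's implementation: the function name `fourier_descriptors` and its arguments are hypothetical, and it simply keeps the first few harmonics rather than reproducing how the function above groups frequencies into Fs bins.

```python
import numpy as np

def fourier_descriptors(boundary_xy, num_points=128, num_coeffs=6):
    """Scale-normalized Fourier magnitudes of a closed boundary.

    boundary_xy: (N, 2) array of ordered boundary points (x, y).
    num_points plays the role of K; num_coeffs is a simplification of Fs.
    """
    pts = np.asarray(boundary_xy, dtype=float)
    closed = np.vstack([pts, pts[:1]])              # close the contour
    seg = np.hypot(*np.diff(closed, axis=0).T)      # segment lengths
    arc = np.concatenate([[0.0], np.cumsum(seg)])   # cumulative arc length
    # Resample at num_points positions equally spaced along the arc length.
    t = np.linspace(0.0, arc[-1], num_points, endpoint=False)
    x = np.interp(t, arc, closed[:, 0])
    y = np.interp(t, arc, closed[:, 1])
    mags = np.abs(np.fft.fft(x + 1j * y))
    # Skip the DC term (translation) and divide by the first harmonic (scale).
    return mags[2:2 + num_coeffs] / mags[1]

# A circle has energy only in the first harmonic, so the higher
# descriptors are numerically zero.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([3.0 + 5.0 * np.cos(theta), -2.0 + 5.0 * np.sin(theta)])
fd = fourier_descriptors(circle)
```

Shapes that depart from a circle put energy into the higher harmonics, which is what makes these magnitudes useful as shape features.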
Compute global (i.e., not per-nucleus) features of the nuclei with
the given centroids based on the partitioning of the space into
Voronoi cells and on the induced graph structure.
Parameters:
centroids (array_like) – Nx2 numpy array of nuclear centroids
neighbor_distances (array_like) – Radii within which to count neighbors
neighbor_counts (sequence) – Sequence of numbers of neighbors, each of which is used to
compute statistics relating to the distance required to reach
that many neighbors.
Returns:
props – A single-row DataFrame with the following columns:
voronoi_…: Voronoi diagram features
    area_…: Polygon area features
    peri_…: Polygon perimeter features
    max_dist_…: Maximum distance in polygon features
delaunay_…: Delaunay triangulation features
    sides_…: Triangle side length features
    area_…: Triangle area features
mst_branches_…: Minimum spanning tree branch features
density_…: Density features
    neighbors_in_distance_…
        0, 1, …, len(neighbor_distances) - 1: Neighbor count
        within given radius features.
    distance_for_neighbors_…
        0, 1, …, len(neighbor_counts) - 1: Minimum distance to
        enclose count neighbors features
The “…”s are meant to signify that what precedes is the
start of a column name. At the end of each column name is one
of ‘mean’, ‘stddev’, ‘min_max_ratio’, and ‘disorder’.
‘min_max_ratio’ is the minimum-to-maximum ratio, and disorder
is stddev / (mean + stddev).
Return type:
pandas.DataFrame
Note
The indices for the density features are with respect to the
sorted values of the corresponding argument sequence.
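The four column suffixes and the neighbor-count idea can be illustrated with a short sketch. The helper names `summarize` and `neighbors_within` are hypothetical, and the use of the population standard deviation (ddof=0) is an assumption, since the text above does not say which estimator is used.

```python
import numpy as np

def summarize(values):
    """Return the four summary statistics named in the column suffixes."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    stddev = values.std()  # population standard deviation: an assumption
    return {
        'mean': mean,
        'stddev': stddev,
        'min_max_ratio': values.min() / values.max(),
        'disorder': stddev / (mean + stddev),
    }

def neighbors_within(centroids, radius):
    """Count, for every centroid, the other centroids within `radius`."""
    centroids = np.asarray(centroids, dtype=float)
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return (dists <= radius).sum(axis=1) - 1  # exclude the point itself

centroids = np.array([[0, 0], [0, 1], [0, 2], [5, 5]])
counts = neighbors_within(centroids, radius=1.5)
stats = summarize([1.0, 3.0])
```

Summarizing `counts` with `summarize` would give one group of neighbors_in_distance-style values for a single radius; the function described above repeats this for every entry of neighbor_distances.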
Calculates gradient features from an intensity image.
Parameters:
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
objects.
im_intensity (array_like) – Intensity image
num_hist_bins (int, optional) – Number of bins used to compute the gradient histogram of an object.
The histogram is used to compute the energy and entropy features. Default is 10.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed, it is computed internally, which increases the
computation time.
Returns:
fdata – A pandas dataframe containing the gradient features listed below for
each object/label.
Return type:
pandas.DataFrame
Notes
List of gradient features computed by this function:
Gradient.Mag.Mean: float
Mean of gradient data.
Gradient.Mag.Std: float
Standard deviation of gradient data.
Gradient.Mag.Skewness: float
Skewness of gradient data. Value is 0 when all values are equal.
Gradient.Mag.Kurtosis: float
Kurtosis of gradient data. Value is -3 when all values are equal.
Gradient.Mag.HistEnergy: float
Energy of the gradient magnitude histogram of object pixels.
Gradient.Mag.HistEntropy: float
Entropy of the gradient magnitude histogram of object pixels.
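As a rough illustration of the quantities listed above, the sketch below computes a gradient magnitude image and the energy and entropy of a normalized histogram. The helper names are hypothetical, central differences via `np.gradient` stand in for whatever gradient operator the library uses, and the natural logarithm in the entropy is an assumption.

```python
import numpy as np

def gradient_magnitude(im_intensity):
    """Per-pixel gradient magnitude using finite differences."""
    d_row, d_col = np.gradient(np.asarray(im_intensity, dtype=float))
    return np.hypot(d_row, d_col)

def hist_energy_entropy(values, num_bins=10):
    """Energy and entropy of the normalized histogram of `values`."""
    counts, _ = np.histogram(values, bins=num_bins)
    p = counts / counts.sum()
    energy = np.sum(p ** 2)
    nonzero = p[p > 0]                              # treat 0 * log(0) as 0
    entropy = -np.sum(nonzero * np.log(nonzero))    # natural log: an assumption
    return energy, entropy

# A horizontal ramp whose intensity rises by 2 per column.
ramp = np.tile(2.0 * np.arange(5), (4, 1))
mag = gradient_magnitude(ramp)

# Four values spread evenly over four bins give a uniform histogram.
energy, entropy = hist_energy_entropy([0, 1, 2, 3], num_bins=4)
```

For a per-object feature, the same histogram statistics would be computed over only the gradient magnitudes of the pixels carrying that object's label.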
Calculates 26 Haralick texture features for each object in the given label
mask.
These features are derived from the gray-level co-occurrence matrix (GLCM),
which is a two-dimensional histogram containing the counts/probabilities of
co-occurring intensity values with a given neighborhood offset in the
region occupied by an object in the image.
Parameters:
im_label (array_like) – An ND labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
objects.
im_intensity (array_like) – An ND single channel intensity image.
offsets (array_like, optional) –
A (num_offsets, num_image_dims) array of offset vectors
specifying the distance between the pixel-of-interest and
its neighbor. Note that the first dimension corresponds to
the rows.
See histomicstk.features.graycomatrixext for more details.
num_levels (unsigned int, optional) –
An integer specifying the number of gray levels. For example, if
num_levels is 8, the intensity values of the input image are
scaled so they are integers between 1 and 8. The number of gray
levels determines the size of the gray-level co-occurrence matrix.
Default: 2 for binary/logical image, 32 for numeric image
gray_limits (array_like, optional) –
A two-element array specifying the desired input intensity range.
Intensity values in the input image will be clipped into this range.
Default: [0, 1] for boolean-valued image, [0, 255] for integer-valued
image, and [0.0, 1.0] for real-valued image
Returns:
fdata – A pandas dataframe containing the Haralick features.
Return type:
pandas.DataFrame
Notes
This function computes the following list of Haralick features derived
from normalized GLCMs (P) of the given list of neighborhood offsets:
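Haralick.ASM.Mean, Haralick.ASM.Range: float
Mean and range of the angular second moment (ASM) feature for GLCMs
of all offsets. It is a measure of image homogeneity and is computed
as follows:
$$\mathrm{ASM} = \sum_{i,j} P(i,j)^2$$
Haralick.Contrast.Mean, Haralick.Contrast.Range: float
Mean and range of the Contrast feature for GLCMs of all offsets. It is
a measure of the amount of variation between intensities of
neighboring pixels. It is equal to zero for a constant image and
increases as the amount of variation increases. It is computed as
follows:
$$\mathrm{Contrast} = \sum_{i,j} (i - j)^2 \, P(i,j)$$
Haralick.Correlation.Mean, Haralick.Correlation.Range: float
Mean and range of the Correlation feature for GLCMs of all offsets. It
is a measure of correlation between the intensity values of
neighboring pixels. It is computed as follows:
$$\mathrm{Correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j) \, P(i,j)}{\sigma_i \sigma_j}$$
where $\mu_i, \mu_j$ and $\sigma_i, \sigma_j$ are the means and standard
deviations of the row and column marginals of P.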
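The three features described above can be computed from a normalized GLCM using their standard textbook definitions, as in the sketch below. The function name `glcm_features` and the two example matrices are hypothetical, and "range" is taken to mean maximum minus minimum across offsets, which is an assumption.

```python
import numpy as np

def glcm_features(P):
    """ASM, contrast and correlation of one normalized GLCM `P`."""
    P = np.asarray(P, dtype=float)
    i, j = np.indices(P.shape)
    asm = np.sum(P ** 2)
    contrast = np.sum((i - j) ** 2 * P)
    # Means and standard deviations of the row and column marginals.
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sigma_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sigma_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    correlation = np.sum((i - mu_i) * (j - mu_j) * P) / (sigma_i * sigma_j)
    return np.array([asm, contrast, correlation])

# Two hypothetical normalized GLCMs, standing in for two offsets.
glcms = [np.array([[0.5, 0.0], [0.0, 0.5]]),
         np.array([[0.25, 0.25], [0.25, 0.25]])]
per_offset = np.array([glcm_features(P) for P in glcms])

# Mean and range (max - min, an assumption) across offsets.
feature_mean = per_offset.mean(axis=0)
feature_range = per_offset.max(axis=0) - per_offset.min(axis=0)
```

The first matrix describes perfectly correlated neighbors (contrast 0, correlation 1), the second describes independent neighbors (correlation 0), so the pair shows how the range across offsets captures directional differences in texture.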
Calculate intensity features from an intensity image.
Parameters:
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
objects.
im_intensity (array_like) – Intensity image.
num_hist_bins (int, optional) – Number of bins used to compute the intensity histogram of an object.
The histogram is used to compute the energy and entropy features. Default is 10.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed, it is computed internally, which increases the
computation time.
feature_list (list, default is None) – List of intensity features to return.
If None, all intensity features are returned.
Returns:
fdata – A pandas dataframe containing the intensity features listed below for
each object/label.
Return type:
pandas.DataFrame
Notes
List of intensity features computed by this function:
Intensity.Min: float
Minimum intensity of object pixels.
Intensity.Max: float
Maximum intensity of object pixels.
Intensity.Mean: float
Mean intensity of object pixels.
Intensity.Median: float
Median intensity of object pixels.
Intensity.MeanMedianDiff: float
Difference between mean and median intensities of object pixels.
Intensity.Std: float
Standard deviation of the intensities of object pixels.
Intensity.IQR: float
Inter-quartile range of the intensities of object pixels.
Intensity.MAD: float
Median absolute deviation of the intensities of object pixels.
Intensity.Skewness: float
Skewness of the intensities of object pixels. Value is 0 when all
intensity values are equal.
Intensity.Kurtosis: float
Kurtosis of the intensities of object pixels. Value is -3 when all
values are equal.
Intensity.HistEnergy: float
Energy of the intensity histogram of object pixels.
Intensity.HistEntropy: float
Entropy of the intensity histogram of object pixels.
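Several of the statistics above reduce to one-line NumPy calls once the pixels of an object are selected by its label. The sketch below is illustrative only: `intensity_stats` is a hypothetical helper, the quartiles use NumPy's default linear interpolation, and the MAD is left unscaled, both of which are assumptions about conventions the text does not specify.

```python
import numpy as np

def intensity_stats(im_label, im_intensity):
    """Return {label: dict of basic intensity statistics} for non-zero labels."""
    im_label = np.asarray(im_label)
    im_intensity = np.asarray(im_intensity, dtype=float)
    out = {}
    for label in np.unique(im_label):
        if label == 0:              # zero is background
            continue
        px = im_intensity[im_label == label]
        q25, q75 = np.percentile(px, [25, 75])
        median = np.median(px)
        out[int(label)] = {
            'Intensity.Min': px.min(),
            'Intensity.Max': px.max(),
            'Intensity.Mean': px.mean(),
            'Intensity.Median': median,
            'Intensity.MeanMedianDiff': px.mean() - median,
            'Intensity.IQR': q75 - q25,
            'Intensity.MAD': np.median(np.abs(px - median)),  # unscaled
        }
    return out

im_label = np.array([[1, 1, 0],
                     [1, 1, 2]])
im_intensity = np.array([[1, 2, 9],
                         [3, 4, 7]])
stats = intensity_stats(im_label, im_intensity)
```

Object 1 here covers the pixels 1, 2, 3 and 4, while the background pixel with value 9 is ignored.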
Parameters:
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
objects.
rprops (output of skimage.measure.regionprops, optional) – rprops = skimage.measure.regionprops( im_label ). If rprops is not
passed, it is computed internally, which increases the
computation time.
Returns:
fdata – A pandas dataframe containing the morphometry features listed below
for each object/label.
Return type:
pandas.DataFrame
Notes
List of morphometry features computed by this function:
Orientation.Orientation: float
Angle between the horizontal axis and the major axis of the ellipse
that has the same second moments as the region,
ranging from -pi/2 to pi/2 counter-clockwise.
Size.Area: int
Number of pixels the object occupies.
Size.ConvexHullArea: int
Number of pixels of convex hull image, which is the smallest convex
polygon that encloses the region.
Size.MajorAxisLength: float
The length of the major axis of the ellipse that has the same
normalized second central moments as the object.
Size.MinorAxisLength: float
The length of the minor axis of the ellipse that has the same
normalized second central moments as the region.
Size.Perimeter: float
Perimeter of object which approximates the contour as a line
through the centers of border pixels using a 4-connectivity.
Shape.Circularity: float
A measure of how similar the shape of an object is to a circle.
Shape.Eccentricity: float
A measure of aspect ratio computed to be the eccentricity of the
ellipse that has the same second-moments as the object region.
Eccentricity of an ellipse is the ratio of the focal distance
(distance between focal points) over the major axis length. The value
is in the interval [0, 1). When it is 0, the ellipse becomes a circle.
Shape.EquivalentDiameter: float
The diameter of a circle with the same area as the object.
Shape.Extent: float
Ratio of area of the object to its axis-aligned bounding box.
Shape.MinorMajorAxisRatio: float
A measure of aspect ratio. Ratio of minor to major axis of the ellipse
that has the same second-moments as the object region.
Shape.Solidity: float
A measure of convexity computed as the ratio of the number of pixels
in the object to that of its convex hull.
Shape.HuMoments-k: float
The seven Hu moment features, where k ranges from 1 to 7. The first six
moments are translation, scale and rotation invariant, while the
seventh moment flips its sign if the shape is a mirror image.
See https://learnopencv.com/shape-matching-using-hu-moments-c-python/
Shape.WeightedHuMoments-k: float
Same as Hu moments, but computed from the intensity image instead of
the binary mask.
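A few of the shape measures above follow directly from area, perimeter and axis lengths. The sketch below uses hypothetical helper names. The eccentricity follows the definition given above (focal distance over major axis length), while the circularity formula 4πA/P² is the common definition and is assumed here, since the text does not state which formula the library uses.

```python
import math

def circularity(area, perimeter):
    """4*pi*A / P^2: equals 1 for a perfect circle (assumed definition)."""
    return 4.0 * math.pi * area / perimeter ** 2

def equivalent_diameter(area):
    """Diameter of the circle with the same area as the object."""
    return math.sqrt(4.0 * area / math.pi)

def eccentricity(major_axis_length, minor_axis_length):
    """Focal distance divided by the major axis length of the ellipse."""
    return math.sqrt(1.0 - (minor_axis_length / major_axis_length) ** 2)

def solidity(area, convex_hull_area):
    """Fraction of the convex hull that the object fills."""
    return area / convex_hull_area

# An ideal circle of radius 5.
r = 5.0
circle_area = math.pi * r ** 2
circle_perimeter = 2.0 * math.pi * r
circ = circularity(circle_area, circle_perimeter)
diam = equivalent_diameter(circle_area)

# An ellipse with full axis lengths 10 and 6.
ecc = eccentricity(10.0, 6.0)
```

On real pixel masks the perimeter is only an approximation of the true contour length, so measured circularity values for round objects will generally differ somewhat from 1.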
Parameters:
im_label (array_like) – A labeled mask image wherein intensity of a pixel is the ID of the
object it belongs to. Non-zero values are considered to be foreground
objects.
fsd_bnd_pts (int, optional) – Number of points for boundary resampling to calculate Fourier
descriptors. Default value = 128.
fsd_freq_bins (int, optional) – Number of frequency bins for calculating FSDs. Default value = 6.
cyto_width (float, optional) – Estimated width of the ring-like neighborhood region around each
nucleus to be considered as its cytoplasm. Default value = 8.
num_glcm_levels (int, optional) –
An integer specifying the number of gray levels. For example, if
NumLevels is 32, the intensity values of the input image are
scaled so they are integers between 0 and 31. The number of gray
levels determines the size of the gray-level co-occurrence matrix.
Default: 32
morphometry_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
morphometry (size and shape) features.
See histomicstk.features.compute_morphometry_features for more details.
fsd_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
Fourier shape descriptor (FSD) features.
See histomicstk.features.compute_fsd_features for more details.
intensity_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
intensity features from the nucleus and cytoplasm channels.
See histomicstk.features.compute_intensity_features for more details.
gradient_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
gradient/edge features from the nucleus and cytoplasm channels.
See histomicstk.features.compute_gradient_features for more details.
haralick_features_flag (bool, optional) – A flag that can be used to specify whether or not to compute
Haralick features from the nucleus and cytoplasm channels.
See histomicstk.features.compute_haralick_features for more details.
return_nuclei_annotation (bool, optional) – If True, the nuclei annotations are returned as well.
Returns:
fdata (pandas.DataFrame) – A pandas data frame containing the features listed below for each
object/label
nuclei_annot_list (List) – List containing the boundaries of segmented nuclei in the input image.
Notes
List of features computed by this function
Identifier
Location of the nucleus and its code in the input labeled mask.
Columns are prefixed by Identifier. and include:
Identifier.Label (int) - nucleus label in the input labeled mask
Identifier.Xmin (int) - Left bound
Identifier.Ymin (int) - Upper bound
Identifier.Xmax (int) - Right bound
Identifier.Ymax (int) - Lower bound
Identifier.CentroidX (float) - X centroid (columns)
Identifier.CentroidY (float) - Y centroid (rows)
Identifier.WeightedCentroidX (float) - intensity-weighted X centroid
Identifier.WeightedCentroidY (float) - intensity-weighted Y centroid
Morphometry (size, shape, and orientation) features of the nuclei
See histomicstk.features.compute_morphometry_features for more details.
Feature names are prefixed by Size., Shape., or Orientation.
Fourier shape descriptor features
See histomicstk.features.compute_fsd_features for more details.
Feature names are prefixed by FSD.
Intensity features for the nucleus and cytoplasm channels
See histomicstk.features.compute_intensity_features for more details.
Feature names are prefixed by Nucleus.Intensity. for nucleus features
and Cytoplasm.Intensity. for cytoplasm features.
Gradient/edge features for the nucleus and cytoplasm channels
See histomicstk.features.compute_gradient_features for more details.
Feature names are prefixed by Nucleus.Gradient. for nucleus features
and Cytoplasm.Gradient. for cytoplasm features.
Haralick features for the nucleus and cytoplasm channels
See histomicstk.features.compute_haralick_features for more details.
Feature names are prefixed by Nucleus.Haralick. for nucleus features
and Cytoplasm.Haralick. for cytoplasm features.
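Because each feature family is identified by its column prefix, the columns of the returned table can be separated by name alone. The sketch below works on a plain list of column names. The prefixes are the ones listed above, but the full column names are illustrative examples rather than a guaranteed listing of the library's output, and `group_by_family` is a hypothetical helper.

```python
# Prefixes as listed in the notes above.
FAMILY_PREFIXES = [
    'Identifier.', 'Size.', 'Shape.', 'Orientation.', 'FSD.',
    'Nucleus.Intensity.', 'Cytoplasm.Intensity.',
    'Nucleus.Gradient.', 'Cytoplasm.Gradient.',
    'Nucleus.Haralick.', 'Cytoplasm.Haralick.',
]

def group_by_family(columns):
    """Map each prefix to the column names that start with it."""
    groups = {}
    for name in columns:
        for prefix in FAMILY_PREFIXES:
            if name.startswith(prefix):
                groups.setdefault(prefix, []).append(name)
                break
    return groups

# Illustrative column names only.
columns = [
    'Identifier.Label', 'Identifier.CentroidX',
    'Size.Area', 'Shape.Solidity',
    'Nucleus.Intensity.Mean', 'Cytoplasm.Intensity.Mean',
]
groups = group_by_family(columns)
```

With a pandas DataFrame, the same selection could be done on `fdata.columns`, for example to drop the Identifier columns before training a classifier.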
Computes the gray-level co-occurrence matrix (GLCM) within a region of
interest (ROI) of an image. The GLCM is a 2D histogram/matrix containing the
counts/probabilities of co-occurring intensity values at a given offset
within an ROI of an image.
Read the documentation to know the default values used for each of the
optional parameters in different scenarios.
Parameters:
im_input (array_like) – Input single channel intensity image
im_roi_mask (array_like, optional) –
A binary mask specifying the region of interest within which
to compute the GLCM. If not specified, the GLCM is computed for the
entire image.
Default: None
offsets (array_like, optional) –
A (num_offsets, num_image_dims) array of offset vectors
specifying the distance between the pixel-of-interest and
its neighbor. Note that the first dimension corresponds to
the rows.
Because this offset is often expressed as an angle, the
following table lists the offset values that specify common
angles for a 2D image, given the pixel distance D.
Angle (degrees)    Offset [row, column]
0                  [0, D]
45                 [-D, D]
90                 [-D, 0]
135                [-D, -D]
num_levels (unsigned int, optional) –
An integer specifying the number of gray levels. For example, if
num_levels is 8, the intensity values of the input image are
scaled so they are integers between 1 and 8. The number of gray
levels determines the size of the gray-level co-occurrence matrix.
Default: 2 for binary/logical image, 32 for numeric image
gray_limits (array_like, optional) –
A two-element array specifying the desired input intensity range.
Intensity values in the input image will be clipped into this range.
Default: [0, 1] for boolean-valued image, [0, 255] for integer-valued
image, and [0.0, 1.0] for real-valued image
symmetric (bool, optional) –
A boolean value that specifies whether or not the ordering of values
in pixel pairs is considered while creating the GLCM matrix.
For example, if Symmetric is True, then while calculating the
number of times the value 1 is adjacent to the value 2, both
1,2 and 2,1 pairings are counted. GLCM created in this way is
symmetric across its diagonal.
exclude_boundary (bool, optional) –
Specifies whether or not to exclude a pixel-pair if the
neighboring pixel in the pair is outside im_roi_mask.
Has an effect only when im_roi_mask is specified.
Default: False
Returns:
glcm – num_levels x num_levels x num_offsets array containing the GLCM
for each offset.
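The counting behind a GLCM can be sketched for a single offset on a small 2D image. This is a deliberately simple illustration, not the library's implementation: `quantize` and `glcm_single_offset` are hypothetical helpers, gray levels are zero-based (an assumption), and no ROI mask or normalization is handled.

```python
import numpy as np

def quantize(im, gray_limits, num_levels):
    """Clip to gray_limits and scale to integer levels 0 .. num_levels - 1."""
    lo, hi = gray_limits
    scaled = (np.clip(np.asarray(im, dtype=float), lo, hi) - lo) / (hi - lo)
    return np.minimum((scaled * num_levels).astype(int), num_levels - 1)

def glcm_single_offset(im, offset, num_levels, symmetric=False):
    """Count co-occurring level pairs for one (row, column) offset."""
    im = np.asarray(im)
    d_row, d_col = offset
    rows, cols = im.shape
    glcm = np.zeros((num_levels, num_levels), dtype=int)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            # Skip pairs whose neighbor falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                glcm[im[r, c], im[r2, c2]] += 1
    if symmetric:
        glcm = glcm + glcm.T   # count (a, b) and (b, a) together
    return glcm

im = np.array([[0, 0, 1],
               [1, 2, 2]])
# Offset [0, 1]: the neighbor one column to the right (0 degrees, D = 1).
glcm = glcm_single_offset(im, (0, 1), num_levels=3)
glcm_sym = glcm_single_offset(im, (0, 1), num_levels=3, symmetric=True)
levels = quantize(np.array([0.0, 127.5, 255.0]), (0, 255), num_levels=4)
```

Dividing a count matrix by its sum gives the normalized (probability) form, and stacking the matrices for several offsets along a third axis gives an array shaped like the return value described above.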