{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Converting masks back to annotations\n", "\n", "**Overview:**\n", "\n", "![masks_to_annotations](https://user-images.githubusercontent.com/22067552/80078415-e2de0100-851c-11ea-81ce-3b2d74ee6246.png)\n", "\n", "Most segmentation algorithms produce outputs in an image format. Visualizing these outputs in HistomicsUI requires converting the mask images into an annotation document containing (x,y) coordinates in the whole-slide image coordinate frame. This notebook demonstrates this conversion process in two steps:\n", "\n", "- Converting a mask image into contours (coordinates in the mask frame).\n", "\n", "- Placing the contour data into a format that follows the annotation document schema and can be pushed to DSA for visualization in HistomicsUI.\n", "\n", "This notebook is based on work described in Amgad et al., 2019:\n", "\n", "_Mohamed Amgad, Habiba Elfandy, Hagar Hussein, ..., Jonathan Beezley, Deepak R Chittajallu, David Manthey, David A Gutman, Lee A D Cooper, Structured crowdsourcing enables convolutional segmentation of histology images, Bioinformatics, 2019, btz083_\n", "\n", "**Where to look?**\n", "\n", "```\n", "|_ histomicstk/\n", " |_annotations_and_masks/\n", " | |_masks_to_annotations_handler.py\n", " |_tests/\n", " |_test_masks_to_annotations_handler.py\n", "```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "CWD = os.getcwd()\n", "import girder_client\n", "from pandas import read_csv\n", "from imageio import imread\n", "from histomicstk.annotations_and_masks.masks_to_annotations_handler import (\n", " get_contours_from_mask,\n", " get_single_annotation_document_from_contours,\n", " get_annotation_documents_from_contours)\n", "\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "plt.rcParams['figure.figsize'] = 7, 7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Connect girder client and set parameters\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": "{'_accessLevel': 2,\n '_id': '59bc677892ca9a0017c2e855',\n '_modelType': 'user',\n 'admin': True,\n 'created': '2017-09-15T23:51:20.203000+00:00',\n 'email': 'mtageld@emory.edu',\n 'emailVerified': False,\n 'firstName': 'Mohamed',\n 'groupInvites': [],\n 'groups': ['59f7713a92ca9a0017a29765',\n '5c607488e62914004d0ff4a6',\n '5e44a2e0ddda5f8398785304',\n '5e76b3f3ddda5f83982beb9a'],\n 'lastName': 'Tageldin',\n 'login': 'kheffah',\n 'otp': False,\n 'public': True,\n 'size': 0,\n 'status': 'enabled'}" }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# APIURL = 'http://demo.kitware.com/histomicstk/api/v1/'\n", "# SAMPLE_SLIDE_ID = '5bbdee92e629140048d01b5d'\n", "APIURL = 'http://candygram.neurology.emory.edu:8080/api/v1/'\n", "SAMPLE_SLIDE_ID = '5d586d76bd4404c6b1f286ae'\n", "\n", "# Connect to girder client\n", "gc = girder_client.GirderClient(apiUrl=APIURL)\n", "gc.authenticate(interactive=True)\n", "# gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's inspect the ground truth codes file\n", "\n", "This file contains the ground truth codes and information dataframe. The dataframe is indexed by the annotation group name and has the following columns:\n", "\n", "- ``group``: str, group name of the annotation, e.g. \"mostly_tumor\".\n", "- ``GT_code``: int, desired ground truth code (in the mask). Pixels of this value belong to the corresponding group (class).\n", "- ``color``: str, rgb format, e.g. rgb(255,0,0).\n", "\n", "**NOTE:**\n", "\n", "Zero pixels have special meaning and do not encode a specific ground truth class. Instead, they simply mean 'Outside ROI' and should be ignored during model training or evaluation."
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# read GTCodes dataframe\n", "GTCODE_PATH = os.path.join(\n", " CWD, '..', '..', 'tests', 'test_files', 'sample_GTcodes.csv')\n", "GTCodes_df = read_csv(GTCODE_PATH)\n", "GTCodes_df.index = GTCodes_df.loc[:, 'group']" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": "
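<div>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\"><th></th><th>group</th><th>overlay_order</th><th>GT_code</th><th>is_roi</th><th>is_background_class</th><th>color</th><th>comments</th></tr>\n<tr><th>group</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n</thead>\n<tbody>\n<tr><th>roi</th><td>roi</td><td>0</td><td>254</td><td>1</td><td>0</td><td>rgb(200,0,150)</td><td>NaN</td></tr>\n<tr><th>evaluation_roi</th><td>evaluation_roi</td><td>0</td><td>253</td><td>1</td><td>0</td><td>rgb(255,0,0)</td><td>NaN</td></tr>\n<tr><th>mostly_tumor</th><td>mostly_tumor</td><td>1</td><td>1</td><td>0</td><td>0</td><td>rgb(255,0,0)</td><td>core class</td></tr>\n<tr><th>mostly_stroma</th><td>mostly_stroma</td><td>2</td><td>2</td><td>0</td><td>1</td><td>rgb(255,125,0)</td><td>core class</td></tr>\n<tr><th>mostly_lymphocytic_infiltrate</th><td>mostly_lymphocytic_infiltrate</td><td>1</td><td>3</td><td>0</td><td>0</td><td>rgb(0,0,255)</td><td>core class</td></tr>\n</tbody>\n</table>\n</div>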
", "text/plain": " group overlay_order \\\ngroup \nroi roi 0 \nevaluation_roi evaluation_roi 0 \nmostly_tumor mostly_tumor 1 \nmostly_stroma mostly_stroma 2 \nmostly_lymphocytic_infiltrate mostly_lymphocytic_infiltrate 1 \n\n GT_code is_roi is_background_class \\\ngroup \nroi 254 1 0 \nevaluation_roi 253 1 0 \nmostly_tumor 1 0 0 \nmostly_stroma 2 0 1 \nmostly_lymphocytic_infiltrate 3 0 0 \n\n color comments \ngroup \nroi rgb(200,0,150) NaN \nevaluation_roi rgb(255,0,0) NaN \nmostly_tumor rgb(255,0,0) core class \nmostly_stroma rgb(255,125,0) core class \nmostly_lymphocytic_infiltrate rgb(0,0,255) core class " }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "GTCodes_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Read and visualize mask" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# read mask\n", "X_OFFSET = 59206\n", "Y_OFFSET = 33505\n", "MASKNAME = 'TCGA-A2-A0YE-01Z-00-DX1.8A2E3094-5755-42BC-969D-7F0A2ECA0F39' + \\\n", " '_left-%d_top-%d_mag-BASE.png' % (X_OFFSET, Y_OFFSET)\n", "MASKPATH = os.path.join(CWD, '..', '..', 'tests', 'test_files', 'annotations_and_masks', MASKNAME)\n", "MASK = imread(MASKPATH)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": "
" }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(7,7))\n", "plt.imshow(MASK)\n", "plt.title(MASKNAME[:23])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Get contours from mask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This function ``get_contours_from_mask()`` generates contours from a mask image. There are many parameters that can be set but most have defaults set for the most common use cases. The only required parameters you must provide are ``MASK`` and ``GTCodes_df``, but you may want to consider setting the following parameters based on your specific needs: ``get_roi_contour``, ``roi_group``, ``discard_nonenclosed_background``, ``background_group``, that control behaviour regarding region of interest (ROI) boundary and background pixel class (e.g. stroma)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parse ground truth mask and gets countours for annotations.\n", "\n", " Parameters\n", " -----------\n", " MASK : nd array\n", " ground truth mask (m,n) where pixel values encode group membership.\n", " GTCodes_df : pandas Dataframe\n", " the ground truth codes and information dataframe.\n", " This is a dataframe that is indexed by the annotation group name and\n", " has the following columns.\n", "\n", " group: str\n", " group name of annotation, eg. mostly_tumor.\n", " GT_code: int\n", " desired ground truth code (in the mask). Pixels of this value\n", " belong to corresponding group (class).\n", " color: str\n", " rgb format. eg. rgb(255,0,0).\n", " groups_to_get : None\n", " if None (default) then all groups (ground truth labels) will be\n", " extracted. Otherwise pass a list fo strings like ['mostly_tumor',].\n", " MIN_SIZE : int\n", " minimum bounding box size of contour\n", " MAX_SIZE : None\n", " if not None, int. 
Maximum bounding box size of contour. Sometimes\n", " very large contours cause segmentation faults that originate from\n", " opencv and are not caught by python, causing the python process\n", " to unexpectedly hault. If you would like to set a maximum size to\n", " defend against this, a suggested maximum would be 15000.\n", " get_roi_contour : bool\n", " whether to get contour for boundary of region of interest (ROI). This\n", " is most relevant when dealing with multiple ROIs per slide and with\n", " rotated rectangular or polygonal ROIs.\n", " roi_group : str\n", " name of roi group in the GT_Codes dataframe (eg roi)\n", " discard_nonenclosed_background : bool\n", " If a background group contour is NOT fully enclosed, discard it.\n", " This is a purely aesthetic method, makes sure that the background group\n", " contours (eg stroma) are discarded by default to avoid cluttering the\n", " field when posted to DSA for viewing online. The only exception is\n", " if they are enclosed within something else (eg tumor), in which case\n", " they are kept since they represent holes. This is related to\n", " https://github.com/DigitalSlideArchive/HistomicsTK/issues/675\n", " WARNING - This is a bit slower since the contours will have to be\n", " converted to shapely polygons. It is not noticeable for hundreds of\n", " contours, but you will notice the speed difference if you are parsing\n", " thousands of contours. Default, for this reason, is False.\n", " background_group : str\n", " name of background group in the GT_codes dataframe (eg mostly_stroma)\n", " verbose : bool\n", " Print progress to screen?\n", " monitorPrefix : str\n", " text to prepend to printed statements\n", "\n", " Returns\n", " --------\n", " pandas DataFrame\n", " contours extracted from input mask. 
The following columns are output.\n", "\n", " group : str\n", " annotation group (ground truth label).\n", " color : str\n", " annotation color if it were to be posted to DSA.\n", " is_roi : bool\n", " whether this annotation is a region of interest boundary\n", " ymin : int\n", " minimun y coordinate\n", " ymax : int\n", " maximum y coordinate\n", " xmin : int\n", " minimum x coordinate\n", " xmax : int\n", " maximum x coordinate\n", " has_holes : bool\n", " whether this contour has holes\n", " touches_edge-top : bool\n", " whether this contour touches top mask edge\n", " touches_edge-bottom : bool\n", " whether this contour touches bottom mask edge\n", " touches_edge-left : bool\n", " whether this contour touches left mask edge\n", " touches_edge-right : bool\n", " whether this contour touches right mask edge\n", " coords_x : str\n", " vertix x coordinates comma-separated values\n", " coords_y\n", " vertix y coordinated comma-separated values\n", "\n", " \n" ] } ], "source": [ "print(get_contours_from_mask.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extract contours" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "TCGA-A2-A0YE: getting contours: non-roi: roi: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: evaluation_roi: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_tumor: getting contours\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_tumor: adding contours\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_stroma: getting contours\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_stroma: adding contours\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 1 of 11: TOO SIMPLE (1 coordinates) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 2 of 11: TOO SIMPLE (2 coordinates) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 3 of 11: TOO SIMPLE (1 coordinates) -- IGNORED\n", 
"TCGA-A2-A0YE: getting contours: non-roi: nest 4 of 11: TOO SIMPLE (1 coordinates) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 5 of 11: TOO SMALL (10 x 18 pixels) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 6 of 11: TOO SIMPLE (1 coordinates) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 8 of 11: TOO SIMPLE (1 coordinates) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 9 of 11: TOO SIMPLE (1 coordinates) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_lymphocytic_infiltrate: getting contours\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_lymphocytic_infiltrate: adding contours\n", "TCGA-A2-A0YE: getting contours: non-roi: nest 5 of 14: TOO SMALL (23 x 74 pixels) -- IGNORED\n", "TCGA-A2-A0YE: getting contours: non-roi: necrosis_or_debris: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: glandular_secretions: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_blood: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: exclude: getting contours\n", "TCGA-A2-A0YE: getting contours: non-roi: exclude: adding contours\n", "TCGA-A2-A0YE: getting contours: non-roi: metaplasia_NOS: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_fat: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_plasma_cells: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: other_immune_infiltrate: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_mucoid_material: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: normal_acinus_or_duct: getting contours\n", "TCGA-A2-A0YE: getting contours: non-roi: normal_acinus_or_duct: adding contours\n", "TCGA-A2-A0YE: getting contours: non-roi: lymphatics: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: undetermined: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: nerve: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: skin_adnexia: NO OBJECTS!!\n", 
"TCGA-A2-A0YE: getting contours: non-roi: blood_vessel: getting contours\n", "TCGA-A2-A0YE: getting contours: non-roi: blood_vessel: adding contours\n", "TCGA-A2-A0YE: getting contours: non-roi: angioinvasion: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: mostly_dcis: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: non-roi: other: NO OBJECTS!!\n", "TCGA-A2-A0YE: getting contours: discarding backgrnd: discarded 3 contours\n", "TCGA-A2-A0YE: getting contours: roi: roi: getting contours\n", "TCGA-A2-A0YE: getting contours: roi: roi: adding contours\n" ] } ], "source": [ "# Let's extract all contours from a mask, including ROI boundary. We will\n", "# be discarding any stromal contours that are not fully enclosed within a\n", "# non-stromal contour since we already know that stroma is the background\n", "# group. This is so things look uncluttered when posted to DSA.\n", "groups_to_get = None\n", "contours_df = get_contours_from_mask(\n", " MASK=MASK, GTCodes_df=GTCodes_df, groups_to_get=groups_to_get,\n", " get_roi_contour=True, roi_group='roi',\n", " discard_nonenclosed_background=True,\n", " background_group='mostly_stroma',\n", " MIN_SIZE=30, MAX_SIZE=None, verbose=True,\n", " monitorPrefix=MASKNAME[:12] + ': getting contours')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's inspect the contours dataframe\n", "\n", "The columns that really matter here are ``group``, ``color``, ``coords_x``, and ``coords_y``." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": "
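<div>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\"><th></th><th>group</th><th>color</th><th>ymin</th><th>ymax</th><th>xmin</th><th>xmax</th><th>has_holes</th><th>touches_edge-top</th><th>touches_edge-left</th><th>touches_edge-bottom</th><th>touches_edge-right</th><th>coords_x</th><th>coords_y</th></tr>\n</thead>\n<tbody>\n<tr><th>0</th><td>roi</td><td>rgb(200,0,150)</td><td>0.0</td><td>4593.0</td><td>0.0</td><td>4541.0</td><td>0.0</td><td>1.0</td><td>1.0</td><td>1.0</td><td>1.0</td><td>2835,2834,2833,2832,2831,2830,2829,2827,2826,2...</td><td>0,1,1,2,2,3,3,5,5,6,6,8,8,9,9,10,10,12,12,13,1...</td></tr>\n<tr><th>1</th><td>mostly_tumor</td><td>rgb(255,0,0)</td><td>4269.0</td><td>4560.0</td><td>1639.0</td><td>2039.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1673,1672,1668,1667,1662,1661,1659,1658,1658,1...</td><td>4269,4270,4270,4271,4271,4272,4272,4273,4274,4...</td></tr>\n<tr><th>2</th><td>mostly_tumor</td><td>rgb(255,0,0)</td><td>3764.0</td><td>4282.0</td><td>1607.0</td><td>2187.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1770,1769,1768,1767,1765,1764,1762,1761,1760,1...</td><td>3764,3765,3765,3766,3766,3767,3767,3768,3768,3...</td></tr>\n<tr><th>3</th><td>mostly_tumor</td><td>rgb(255,0,0)</td><td>3712.0</td><td>4051.0</td><td>1201.0</td><td>1411.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1214,1213,1211,1210,1208,1207,1206,1205,1203,1...</td><td>3712,3713,3713,3714,3714,3715,3715,3716,3716,3...</td></tr>\n<tr><th>4</th><td>mostly_tumor</td><td>rgb(255,0,0)</td><td>3356.0</td><td>3748.0</td><td>3108.0</td><td>3540.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>3342,3341,3337,3336,3332,3331,3328,3327,3326,3...</td><td>3356,3357,3357,3358,3358,3359,3359,3360,3360,3...</td></tr>\n</tbody>\n</table>\n</div>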
", "text/plain": " group color ymin ymax xmin xmax has_holes \\\n0 roi rgb(200,0,150) 0.0 4593.0 0.0 4541.0 0.0 \n1 mostly_tumor rgb(255,0,0) 4269.0 4560.0 1639.0 2039.0 1.0 \n2 mostly_tumor rgb(255,0,0) 3764.0 4282.0 1607.0 2187.0 0.0 \n3 mostly_tumor rgb(255,0,0) 3712.0 4051.0 1201.0 1411.0 0.0 \n4 mostly_tumor rgb(255,0,0) 3356.0 3748.0 3108.0 3540.0 0.0 \n\n touches_edge-top touches_edge-left touches_edge-bottom \\\n0 1.0 1.0 1.0 \n1 0.0 0.0 0.0 \n2 0.0 0.0 0.0 \n3 0.0 0.0 0.0 \n4 0.0 0.0 0.0 \n\n touches_edge-right coords_x \\\n0 1.0 2835,2834,2833,2832,2831,2830,2829,2827,2826,2... \n1 0.0 1673,1672,1668,1667,1662,1661,1659,1658,1658,1... \n2 0.0 1770,1769,1768,1767,1765,1764,1762,1761,1760,1... \n3 0.0 1214,1213,1211,1210,1208,1207,1206,1205,1203,1... \n4 0.0 3342,3341,3337,3336,3332,3331,3328,3327,3326,3... \n\n coords_y \n0 0,1,1,2,2,3,3,5,5,6,6,8,8,9,9,10,10,12,12,13,1... \n1 4269,4270,4270,4271,4271,4272,4272,4273,4274,4... \n2 3764,3765,3765,3766,3766,3767,3767,3768,3768,3... \n3 3712,3713,3713,3714,3714,3715,3715,3716,3716,3... \n4 3356,3357,3357,3358,3358,3359,3359,3360,3360,3... " }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "contours_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Get annotation documents from contours\n", "\n", "This method ``get_annotation_documents_from_contours()`` generates formatted annotation documents from contours that can be posted to the DSA server." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Given dataframe of contours, get list of annotation documents.\n", "\n", " This method parses a dataframe of contours to a list of dictionaries, each\n", " of which represents and large_image style annotation. 
This is a wrapper\n", " that extends the functionality of the method\n", " get_single_annotation_document_from_contours(), whose docstring should\n", " be referenced for implementation details and further explanation.\n", "\n", " Parameters\n", " -----------\n", " contours_df : pandas DataFrame\n", " WARNING - This is modified inside the function, so pass a copy.\n", " This dataframe includes data on contours extracted from input mask\n", " using get_contours_from_mask(). If you have contours using some other\n", " method, just make sure the dataframe follows the same schema as the\n", " output from get_contours_from_mask(). You may find a sample dataframe\n", " in thie repo at ./tests/test_files/annotations_and_masks/sample_contours_df.tsv\n", " The following columns are relevant for this method.\n", "\n", " group : str\n", " annotation group (ground truth label).\n", " color : str\n", " annotation color if it were to be posted to DSA.\n", " coords_x : str\n", " vertix x coordinates comma-separated values\n", " coords_y\n", " vertix y coordinated comma-separated values\n", " separate_docs_by_group : bool\n", " if set to True, you get one or more annotation documents (dicts)\n", " for each group (eg tumor) independently.\n", " annots_per_doc : int\n", " maximum number of annotation elements (polygons) per dict. The smaller\n", " this number, the more numerous the annotation documents, but the more\n", " seamless it is to post this data to the DSA server or to view using the\n", " HistomicsTK interface since you will be loading smaller chunks of data\n", " at a time.\n", " annprops : dict\n", " properties of annotation elements. Contains the following keys\n", " F, X_OFFSET, Y_OFFSET, opacity, lineWidth. 
Refer to\n", " get_single_annotation_document_from_contours() for details.\n", " docnamePrefix : str\n", " test to prepend to annotation document name\n", " verbose : bool\n", " Print progress to screen?\n", " monitorPrefix : str\n", " text to prepend to printed statements\n", "\n", " Returns\n", " --------\n", " list of dicts\n", " DSA-style annotation document.\n", "\n", " \n" ] } ], "source": [ "print(get_annotation_documents_from_contours.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned in the docs, this function wraps ``get_single_annotation_document_from_contours()``" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Given dataframe of contours, get annotation document.\n", "\n", " This uses the large_image annotation schema to create an annotation\n", " document that maybe posted to DSA for viewing using something like:\n", " resp = gc.post(\"/annotation?itemId=\" + slide_id, json=annotation_doc)\n", " The annotation schema can be found at:\n", " github.com/girder/large_image/blob/master/docs/annotations.md .\n", "\n", " Parameters\n", " -----------\n", " contours_df_slice : pandas DataFrame\n", " The following columns are of relevance and must be contained.\n", "\n", " group : str\n", " annotation group (ground truth label).\n", " color : str\n", " annotation color if it were to be posted to DSA.\n", " coords_x : str\n", " vertix x coordinates comma-separated values\n", " coords_y\n", " vertix y coordinated comma-separated values\n", " docname : str\n", " annotation document name\n", " F : float\n", " how much smaller is the mask where the contours come from is relative\n", " to the slide scan magnification. 
For example, if the mask is at 10x\n", " whereas the slide scan magnification is 20x, then F would be 2.0.\n", " X_OFFSET : int\n", " x offset to add to contours at BASE (SCAN) magnification\n", " Y_OFFSET : int\n", " y offset to add to contours at BASE (SCAN) magnification\n", " opacity : float\n", " opacity of annotation elements (in the range [0, 1])\n", " lineWidth : float\n", " width of boarders of annotation elements\n", " verbose : bool\n", " Print progress to screen?\n", " monitorPrefix : str\n", " text to prepend to printed statements\n", "\n", " Returns\n", " --------\n", " dict\n", " DSA-style annotation document ready to be post for viewing.\n", "\n", " \n" ] } ], "source": [ "print(get_single_annotation_document_from_contours.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get a list of annotation documents (each is a dictionary). For the purpose of this tutorial,\n", "we separate the documents by group (i.e. each document is composed of polygons from the same\n", "style/group). You could instead allow heterogeneous groups in the same annotation document by\n", "setting ``separate_docs_by_group`` to ``False``. For illustration, this demo sets ``annots_per_doc=10``.\n", "In practice you would want each document to contain several hundred polygons, depending on their complexity. Placing too many polygons in a single document can lead to performance issues when rendering in HistomicsUI."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get annotation documents" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 1 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 2 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 3 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 4 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 5 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 6 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 7 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 8 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 9 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 10 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 11 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 12 of 13\n", "TCGA-A2-A0YE: annotation docs: mostly_lymphocytic_infiltrate: doc 1 of 1: contour 13 of 13\n", "TCGA-A2-A0YE: annotation docs: exclude: doc 1 of 1: contour 1 of 1\n", "TCGA-A2-A0YE: annotation docs: blood_vessel: doc 1 of 1: contour 1 of 3\n", "TCGA-A2-A0YE: annotation docs: blood_vessel: doc 1 of 1: contour 2 of 3\n", "TCGA-A2-A0YE: annotation docs: blood_vessel: doc 1 of 1: contour 3 of 3\n", "TCGA-A2-A0YE: annotation docs: roi: doc 1 of 1: contour 1 of 1\n", "TCGA-A2-A0YE: annotation docs: normal_acinus_or_duct: doc 1 of 1: contour 1 of 2\n", "TCGA-A2-A0YE: annotation docs: normal_acinus_or_duct: doc 1 of 1: contour 2 
of 2\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 1 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 2 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 3 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 4 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 5 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 6 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 7 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 8 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 9 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 1 of 2: contour 10 of 10\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 1 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 2 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 3 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 4 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 5 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 6 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 7 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 8 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 9 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 10 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 11 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 12 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 13 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 14 of 15\n", "TCGA-A2-A0YE: annotation docs: mostly_tumor: doc 2 of 2: contour 15 of 15\n" ] } ], "source": [ "# get list of 
annotation documents\n", "annprops = {\n", " 'X_OFFSET': X_OFFSET,\n", " 'Y_OFFSET': Y_OFFSET,\n", " 'opacity': 0.2,\n", " 'lineWidth': 4.0,\n", "}\n", "annotation_docs = get_annotation_documents_from_contours(\n", " contours_df.copy(), separate_docs_by_group=True, annots_per_doc=10,\n", " docnamePrefix='demo', annprops=annprops,\n", " verbose=True, monitorPrefix=MASKNAME[:12] + ': annotation docs')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's examine one of the documents\n", "\n", "Limit display to the first two elements (polygons) and cap the vertices for clarity." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "import copy\n", "\n", "# Deep-copy the document: a shallow .copy() shares the element dicts, so\n", "# truncating points for display would also truncate what gets posted to DSA.\n", "ann_doc = copy.deepcopy(annotation_docs[0])\n", "ann_doc['elements'] = ann_doc['elements'][:2]\n", "for element in ann_doc['elements']:\n", "    element['points'] = element['points'][:5]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": "{'name': 'demo_mostly_lymphocytic_infiltrate-0',\n 'description': '',\n 'elements': [{'group': 'mostly_lymphocytic_infiltrate',\n 'type': 'polyline',\n 'lineColor': 'rgb(0,0,255)',\n 'lineWidth': 4.0,\n 'closed': True,\n 'points': [[61974.0, 37427.0, 0.0],\n [61975.0, 37428.0, 0.0],\n [61975.0, 37429.0, 0.0],\n [61976.0, 37430.0, 0.0],\n [61976.0, 37431.0, 0.0]],\n 'label': {'value': 'mostly_lymphocytic_infiltrate'},\n 'fillColor': 'rgba(0,0,255,0.2)'},\n {'group': 'mostly_lymphocytic_infiltrate',\n 'type': 'polyline',\n 'lineColor': 'rgb(0,0,255)',\n 'lineWidth': 4.0,\n 'closed': True,\n 'points': [[60531.0, 37045.0, 0.0],\n [60528.0, 37048.0, 0.0],\n [60527.0, 37048.0, 0.0],\n [60522.0, 37053.0, 0.0],\n [60522.0, 37054.0, 0.0]],\n 'label': {'value': 'mostly_lymphocytic_infiltrate'},\n 'fillColor': 'rgba(0,0,255,0.2)'}]}" }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ann_doc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 
Post the annotation to the correct item/slide in DSA" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# deleting existing annotations in target slide (if any)\n", "existing_annotations = gc.get('/annotation/item/' + SAMPLE_SLIDE_ID)\n", "for ann in existing_annotations:\n", " gc.delete('/annotation/%s' % ann['_id'])\n", "\n", "# post the annotation documents you created\n", "for annotation_doc in annotation_docs:\n", " resp = gc.post(\n", " '/annotation?itemId=' + SAMPLE_SLIDE_ID, json=annotation_doc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you can go to HistomicsUI and confirm that the posted annotations make\n", "sense and correspond to tissue boundaries and expected labels." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }