{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Analyzing remotely hosted images with the girder client\n", "\n", "This example describes how to run image analysis workflows on collections of slides hosted on a Digital Slide Archive (DSA) server. Pixel data is retrieved from the server through the API, the analysis runs locally, and the results are pushed back to the server as annotations for visualization. Note that this differs from using girder tasks to run jobs remotely as a user through HistomicsUI. This utility is intended for developers of image analysis and machine learning algorithms.\n", "\n", "In this example, we run a cellularity detection workflow on\n", "slides from the following [source girder directory](http://candygram.neurology.emory.edu:8080/#collection/59a5c4e692ca9a00174d77d6/folder/5d5c28c6bd4404c6b1f3d598) and post the results to the following [results girder directory](http://candygram.neurology.emory.edu:8080/#collection/59a5c4e692ca9a00174d77d6/folder/5d9246f6bd4404c6b1faaa89).\n", "\n", "**Where to look?**\n", "\n", "```\n", "|_ histomicstk/\n", " |_workflows/\n", " |_workflow_runner.py \n", " |_specific_workflows.py \n", " |_tests/\n", " |_test_workflow_runner.py\n", "```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import tempfile\n", "import girder_client\n", "# import numpy as np\n", "from pandas import read_csv\n", "# from histomicstk.saliency.cellularity_detection import (\n", "# Cellularity_detector_superpixels)\n", "from histomicstk.saliency.cellularity_detection_thresholding import (\n", " Cellularity_detector_thresholding)\n", "from histomicstk.workflows.workflow_runner import (\n", " Workflow_runner, Slide_iterator)\n", "from histomicstk.workflows.specific_workflows import (\n", " cellularity_detection_workflow)" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "## Connect girder client and set analysis parameters" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "APIURL = 'http://candygram.neurology.emory.edu:8080/api/v1/'\n", "SAMPLE_SOURCE_FOLDER_ID = '5d5c28c6bd4404c6b1f3d598'\n", "SAMPLE_DESTINATION_FOLDER_ID = '5d9246f6bd4404c6b1faaa89'\n", "\n", "# girder client\n", "gc = girder_client.GirderClient(apiUrl=APIURL)\n", "# gc.authenticate(interactive=True)\n", "gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')\n", "\n", "# This is where the run logs will be saved\n", "logging_savepath = tempfile.mkdtemp()\n", "\n", "# params for cellularity thresholding\n", "cdt_params = {\n", " 'gc': gc,\n", " 'slide_id': '', # this will be handled by the slide iterator\n", " 'GTcodes': read_csv('../../histomicstk/saliency/tests/saliency_GTcodes.csv'),\n", " 'MAG': 3.0,\n", " 'visualize': True,\n", " 'verbose': 2,\n", " 'logging_savepath': logging_savepath,\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explore the docs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Workflow_runner\n", "\n", "You will need to use the `Workflow_runner()` object." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Init Workflow_runner object.\n", "\n", " Arguments\n", " -----------\n", " slide_iterator : object\n", " Slide_iterator object\n", " workflow : method\n", " method whose parameters include slide_id and monitorPrefix,\n", " which is called for each slide\n", " workflow_kwargs : dict\n", " keyword arguments for the workflow method\n", " kwargs : key-value pairs\n", " The following are already assigned defaults by Base_HTK_Class\n", " but can be passed here to override defaults\n", " [verbose, monitorPrefix, logging_savepath, suppress_warnings]\n", "\n", " \n" ] } ], "source": [ "print(Workflow_runner.__init__.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note how this requires:\n", "- `Slide_iterator` instance - which yields information\n", "about the slides you want to run the workflow on.\n", "- `workflow` - a method that you define, which runs on a single slide.\n", "- `workflow_kwargs` - parameters for your defined method. \n", "\n", "In this example, we use `cellularity_detection_workflow()` as the workflow\n", "to run; it is defined in the `histomicstk.workflows.specific_workflows` module." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Slide_iterator" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Init Slide_iterator object.\n", "\n", " Arguments\n", " -----------\n", " gc : object\n", " girder client object\n", " source_folder_id : str\n", " girder ID of folder in which slides are located\n", " keep_slides : list\n", " List of slide names to keep. 
If None, all are kept.\n", " discard_slides : list\n", " List of slide names to discard.\n", " kwargs : key-value pairs\n", " The following are already assigned defaults by Base_HTK_Class\n", " but can be passed here to override defaults\n", " [verbose, monitorPrefix, logger, logging_savepath,\n", " suppress_warnings]\n", "\n", " \n" ] } ], "source": [ "print(Slide_iterator.__init__.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Specific workflow to run" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Run cellularity detection for single slide.\n", "\n", " The cellularity detection algorithm can either be\n", " Cellularity_detector_superpixels or Cellularity_detector_thresholding.\n", "\n", " Arguments\n", " -----------\n", " gc : object\n", " girder client object\n", " cdo : object\n", " Cellularity_detector object instance. Can either be\n", " Cellularity_detector_superpixels() or\n", " Cellularity_detector_thresholding(). 
The thresholding-based workflow\n", " seems to be more robust, despite being simpler.\n", " slide_id : str\n", " girder id of slide on which workflow is done\n", " monitoPrefix : str\n", " this will set the cds monitorPrefix attribute\n", " destination_folder_id : str or None\n", " if not None, copy slide to this girder folder and post results\n", " there instead of original slide.\n", " keep_existing_annotations : bool\n", " keep existing annotations in slide when posting results?\n", "\n", " \n" ] } ], "source": [ "print(cellularity_detection_workflow.__doc__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize the workflow runner" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving logs to: /tmp/tmpz7t9bo3m/2019-10-27_18-53.log\n", "Saving logs to: /tmp/tmpz7t9bo3m/2019-10-27_18-53.log\n" ] } ], "source": [ "# Init specific workflow (Cellularity_detector_thresholding)\n", "cdt = Cellularity_detector_thresholding(**cdt_params)\n", "\n", "# Init workflow runner\n", "workflow_runner = Workflow_runner(\n", " slide_iterator=Slide_iterator(\n", " gc, source_folder_id=SAMPLE_SOURCE_FOLDER_ID,\n", " # keep_slides=None), # run all slides in girder directory\n", " keep_slides=[ # run specific slides only\n", " 'TCGA-A1-A0SK-01Z-00-DX1_POST.svs',\n", " 'TCGA-A2-A04Q-01Z-00-DX1_POST.svs',\n", " ]),\n", " workflow=cellularity_detection_workflow,\n", " workflow_kwargs={\n", " 'gc': gc,\n", " 'cdo': cdt,\n", " 'destination_folder_id': SAMPLE_DESTINATION_FOLDER_ID,\n", " 'keep_existing_annotations': False },\n", " logging_savepath=cdt.logging_savepath,\n", " monitorPrefix='test')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run the detector" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): copying slide to 
destination folder\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): set_slide_info_and_get_tissue_mask()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: set_tissue_rgb()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: initialize_labeled_mask()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: assign_components_by_thresholding()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- get HSI and LAB images ...\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- thresholding blue_sharpie ...\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- thresholding blood ...\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- thresholding whitespace ...\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: color_normalize_unspecified_components()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- macenko normalization ...\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: find_potentially_cellular_regions()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: find_top_cellular_regions()\n", "test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: visualize_results()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): copying slide to destination folder\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): set_slide_info_and_get_tissue_mask()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: set_tissue_rgb()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: 
initialize_labeled_mask()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: assign_components_by_thresholding()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- get HSI and LAB images ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- thresholding blue_sharpie ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- thresholding blood ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- thresholding whitespace ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: color_normalize_unspecified_components()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- macenko normalization ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: find_potentially_cellular_regions()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: find_top_cellular_regions()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: visualize_results()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: set_tissue_rgb()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: initialize_labeled_mask()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: assign_components_by_thresholding()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- get HSI and LAB images ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- thresholding blue_sharpie ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- thresholding blood ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- thresholding whitespace ...\n", 
"test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: color_normalize_unspecified_components()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- macenko normalization ...\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: find_potentially_cellular_regions()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: find_top_cellular_regions()\n", "test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: visualize_results()\n" ] } ], "source": [ "workflow_runner.run()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Check the DSA/HistomicsTK visualization\n", "\n", "Now you may go to the Digital Slide Archive and check the posted results at the [results girder directory](http://candygram.neurology.emory.edu:8080/#collection/59a5c4e692ca9a00174d77d6/folder/5d9246f6bd4404c6b1faaa89)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }