Usage¶
Basic Usage¶
Note
To load images, we rely on the OpenSlide library. Please only use images that are supported by OpenSlide.
If you encounter any issues with loading images, please check the OpenSlide documentation for supported formats.
If you are using a virtual environment, ensure it is activated before running the command.
If you want to use .tiff or other plain images that are not supported by OpenSlide, install libvips and convert them to pyramid format.
For example, you can use the following command to convert a plain image to pyramid format:
vips tiffsave input.tiff output.tiff --tile --tile-width 256 --tile-height 256 --pyramid --compression jpeg --Q 90 --vips-progress
This will create a pyramid image that can be used with CellViT.
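To convert a whole folder of plain TIFF files, the command above can be wrapped in a small script. The following is a minimal sketch and not part of CellViT: the function names, the folder layout, and the `_pyramid` file suffix are assumptions chosen for illustration, and it presumes the `vips` executable is available on your PATH.

```python
import subprocess
from pathlib import Path


def build_vips_command(src: Path, dst: Path, tile: int = 256, quality: int = 90) -> list[str]:
    """Return the argument list for the vips tiffsave call shown above."""
    return [
        "vips", "tiffsave", str(src), str(dst),
        "--tile",
        "--tile-width", str(tile),
        "--tile-height", str(tile),
        "--pyramid",
        "--compression", "jpeg",
        "--Q", str(quality),
        "--vips-progress",
    ]


def convert_folder(in_dir: Path, out_dir: Path) -> None:
    """Convert every .tiff file in in_dir to a pyramid TIFF in out_dir (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(in_dir.glob("*.tiff")):
        dst = out_dir / f"{src.stem}_pyramid.tiff"
        # check=True raises CalledProcessError if vips exits with an error.
        subprocess.run(build_vips_command(src, dst), check=True)
```

Calling `convert_folder(Path("plain_tiffs"), Path("pyramid_tiffs"))` would then process each file in turn; both folder names are placeholders.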
This package is designed as a command-line tool. Configuration can be provided either directly via the CellViT CLI or by using a configuration file. The configuration file is a YAML file containing the settings for the inference pipeline. The main script is located in the cellvit module, and can be run using the following command:
cellvit-inference
You then have to either specify a configuration file
cellvit-inference --config <path_to_config_file>
or provide the required parameters directly in the command line. To list all available parameters, run:
cellvit-inference --help
You can run inference on a single slide or on a batch of slides.
Configuration Options¶
The configuration options are divided into several sections, each with its own purpose. Below is a summary of the main sections and their parameters. They appear in the .yaml file as well as in the CLI.
Danger
Exactly one of the process_wsi or process_dataset options must be selected. They are mutually exclusive.
| Section | Name | Description | Type | Default | Required | Further Information |
|---|---|---|---|---|---|---|
| General | | | | | | |
| | model | Segmentation model to use. Choices: [“SAM”, “HIPT”] | str | ➖ | ✔️ | |
| | nuclei_taxonomy | Defines the nuclei classification taxonomy. Choices: [“binary”, “pannuke”, “consep”, “lizard”, “midog”, “nucls_main”, “nucls_super”, “ocelot”, “panoptils”] | str | “pannuke” | ➖ | |
| Inference | | | | | | |
| | gpu | GPU ID to use for inference | int | 0 | ➖ | |
| | enforce_amp | Whether to use Automatic Mixed Precision (AMP) for inference | bool | false | ➖ | |
| | batch_size | Number of images (1024 x 1024 patches) processed per batch | int | 8 | ➖ | |
| Output Settings | | | | | | |
| | outdir | Path to the output directory where results will be stored | str | ➖ | ✔️ | |
| | geojson | Whether to export results in GeoJSON format (for QuPath or other tools) | bool | false | ➖ | |
| | graph | Whether to generate a cell graph representation | bool | false | ➖ | |
| | compression | Whether to use Snappy compression for output files | bool | false | ➖ | |
| System | | | | | | |
| | cpu_count | Number of CPU cores to use for inference | int | System configuration | ➖ | |
| | ray_worker | Number of Ray workers to use for inference (limited by cpu_count) | int | System configuration | ➖ | |
| | ray_remote_cpus | Number of CPUs per Ray worker | int | System configuration | ➖ | |
| | memory | RAM in MB to use | int | System configuration | ➖ | |
| Debug | | | | | | |
| | debug | Whether to enable debug mode. This changes the logger level and requires ray[default]. Also outputs the segmentation mask of the tissue preprocessing | bool | false | ➖ | |
| Processing Mode: Process a Single Whole Slide Image (WSI) | | | | | | |
| | wsi_path | Path to the Whole Slide Image (WSI) file | str | ➖ | ✔️ | |
| | wsi_mpp | Microns per pixel (spatial resolution of the slide) | float | Extracted automatically from file (if available) | ➖ | |
| | wsi_magnification | Magnification level of the slide (e.g., 40) | int | Extracted automatically from file (if available) | ➖ | |
| Processing Mode: Process a Dataset (Multiple WSI Files) | | | | | | |
| | wsi_folder | Path to a folder containing multiple WSI files | str | ➖ | ✔️ (if wsi_filelist is NOT used) | |
| | wsi_filelist | Path to a CSV file listing WSI files (must have a ‘path’ column) | str | ➖ | ✔️ (if wsi_folder is NOT used) | |
| | wsi_extension | File extension of WSI files (used for detection within wsi_folder) | str | svs | ➖ | |
| | wsi_mpp | Microns per pixel (spatial resolution of the slide). Overrides slide settings and also the mpp set in the filelist | float | Extracted automatically from file (if available) | ➖ | |
| | wsi_magnification | Magnification level of the slide (e.g., 40). Overrides slide settings and also the magnification set in the filelist | int | Extracted automatically from file (if available) | ➖ | |
YAML-Configuration¶
The configuration file for CellViT Inference is structured in .yaml format. Below is an example configuration with explanations for each setting.
# ==========================
# CellViT Inference Config
# ==========================
# Model selection (REQUIRED)
model: # REQUIRED | str: Segmentation model to use.
# Choices: ["SAM", "HIPT"]
# Nuclei classification taxonomy (OPTIONAL)
nuclei_taxonomy: # OPTIONAL | str: Defines the nuclei classification taxonomy.
# Choices: ["binary", "pannuke", "consep", "lizard", "midog", "nucls_main", "nucls_super", "ocelot", "panoptils"]
# Default: "pannuke"
# ==========================
# Inference Settings (OPTIONAL)
# ==========================
inference:
  gpu: # OPTIONAL | int: GPU ID to use for inference.
       # Default: 0 (use first available GPU)
  enforce_amp: # OPTIONAL | bool: Whether to use Automatic Mixed Precision (AMP) for inference.
               # Default: false (disabled)
  batch_size: # OPTIONAL | int: Number of images (1024 x 1024 patches) processed per batch.
              # Default: 8
# ==========================
# Output Settings
# ==========================
output_format:
  outdir: # REQUIRED | str: Path to the output directory where results will be stored.
  geojson: # OPTIONAL | bool: Whether to export results in GeoJSON format (for QuPath or other tools).
           # Default: false (disabled)
  graph: # OPTIONAL | bool: Whether to generate a cell graph representation.
         # Default: false (disabled)
  compression: # OPTIONAL | bool: Whether to use Snappy compression for output files.
               # Default: false (disabled)
# ==========================
# Processing Mode (Choose One)
# ==========================
# Either `process_wsi` (for a single image) or `process_dataset` (for multiple images) should be used.
# --- Process a Single Whole Slide Image (WSI) ---
process_wsi:
  wsi_path: # REQUIRED | str: Path to the Whole Slide Image (WSI) file.
  wsi_mpp: # OPTIONAL | float: Microns per pixel (spatial resolution of the slide).
           # Default: Extracted automatically from file (if available).
  wsi_magnification: # OPTIONAL | int: Magnification level of the slide (e.g., 20 or 40).
                     # Default: Extracted automatically from file (if available).
# --- Process a Dataset (Multiple WSI Files) ---
process_dataset:
  wsi_folder: # REQUIRED (if `wsi_filelist` is NOT used) | str: Path to a folder containing multiple WSI files.
              # Either `wsi_folder` OR `wsi_filelist` must be provided (not both).
  wsi_extension: # OPTIONAL | str: File extension of WSI files (used for detection within wsi_folder).
                 # Default: "svs"
  wsi_filelist: # REQUIRED (if `wsi_folder` is NOT used) | str: Path to a CSV file listing WSI files.
                # CSV must have a 'path' column, with optional 'wsi_mpp' and 'wsi_magnification' columns.
                # If 'wsi_mpp' and 'wsi_magnification' are provided, they override global settings.
  wsi_mpp: # OPTIONAL | float: Microns per pixel (spatial resolution).
           # Default: Extracted automatically from file (if available).
           # Can be used with both `wsi_folder` and `wsi_filelist`.
  wsi_magnification: # OPTIONAL | int: Magnification level of the slides.
                     # Default: Extracted automatically from file (if available).
                     # Can be used with both `wsi_folder` and `wsi_filelist`.
# ==========================
# System Settings (OPTIONAL)
# ==========================
system:
  cpu_count: # OPTIONAL | int: Number of CPU cores to use for inference.
             # Default: Uses system configuration.
  ray_worker: # OPTIONAL | int: Number of ray workers to use for inference. Limited by cpu_count.
              # Default: Uses system configuration.
  ray_remote_cpus: # OPTIONAL | int: Number of CPUs per ray worker.
                   # Default: Uses system configuration.
  memory: # OPTIONAL | int: RAM in MB to use.
          # Default: Uses system configuration.
# ==========================
# Debug Settings (OPTIONAL)
# ==========================
debug: # OPTIONAL | bool: Whether to enable debug mode. This changes the logger level and requires ray[default].
       # Default: false
Examples for .yaml configuration files can be found in the Examples section.
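As a concrete illustration of a filled-in file, the sketch below writes a minimal single-slide configuration. It assumes the nested layout of the template above (sections such as `inference`, `output_format`, and `process_wsi` containing their settings as indented keys) and uses plain string formatting rather than a YAML library to stay dependency-free. The helper `write_minimal_config` and all paths are hypothetical.

```python
from pathlib import Path

# Only `model`, `outdir` and `wsi_path` are required; the rest repeats documented defaults
# or enables GeoJSON export as an example of an optional setting.
TEMPLATE = """\
model: {model}
nuclei_taxonomy: pannuke
inference:
  gpu: 0
  batch_size: 8
output_format:
  outdir: {outdir}
  geojson: true
process_wsi:
  wsi_path: {wsi_path}
"""


def write_minimal_config(path: Path, model: str, outdir: str, wsi_path: str) -> str:
    """Write a single-slide config file and return its text (hypothetical helper).

    Paths containing characters that are special in YAML (e.g. ':' or '#')
    would need quoting, which this sketch does not handle.
    """
    if model not in ("SAM", "HIPT"):
        raise ValueError("model must be 'SAM' or 'HIPT'")
    text = TEMPLATE.format(model=model, outdir=outdir, wsi_path=wsi_path)
    path.write_text(text, encoding="utf-8")
    return text
```

The resulting file can then be passed to the tool with cellvit-inference --config <path_to_config_file>.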
Note
The configuration file must be in YAML format.
Either run a single WSI or a dataset of WSIs, but not both at the same time.
The wsi_path and wsi_folder or wsi_filelist parameters are mutually exclusive.
The wsi_mpp and wsi_magnification parameters can be set globally or per WSI in the file list.
The output_format section allows you to customize the output format and compression settings.
The system section allows you to customize the CPU and memory settings for inference.
The debug section allows you to enable debug mode for more detailed logging.
The configuration file can be passed as a command-line argument using the --config flag.
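The rules in this note can be checked before a run is started. The sketch below is illustrative only and is not the validation that CellViT performs internally; `validate_config` is a hypothetical helper that expects a dictionary laid out like the YAML file, for example as returned by a YAML parser.

```python
MODELS = {"SAM", "HIPT"}
TAXONOMIES = {
    "binary", "pannuke", "consep", "lizard", "midog",
    "nucls_main", "nucls_super", "ocelot", "panoptils",
}


def validate_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic rules are met."""
    errors = []

    if cfg.get("model") not in MODELS:
        errors.append("model is required and must be one of " + ", ".join(sorted(MODELS)))
    # An omitted or empty taxonomy falls back to the documented default.
    if (cfg.get("nuclei_taxonomy") or "pannuke") not in TAXONOMIES:
        errors.append("nuclei_taxonomy is not a known taxonomy")
    if not (cfg.get("output_format") or {}).get("outdir"):
        errors.append("output_format.outdir is required")

    # Exactly one processing mode may be present.
    modes = [m for m in ("process_wsi", "process_dataset") if cfg.get(m)]
    if len(modes) != 1:
        errors.append("exactly one of process_wsi or process_dataset must be set")
    elif modes[0] == "process_wsi":
        if not cfg["process_wsi"].get("wsi_path"):
            errors.append("process_wsi.wsi_path is required")
    else:
        dataset = cfg["process_dataset"]
        # wsi_folder and wsi_filelist are alternatives: one of them, not both.
        if bool(dataset.get("wsi_folder")) == bool(dataset.get("wsi_filelist")):
            errors.append("set either wsi_folder or wsi_filelist, not both")
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every issue in one pass.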
CLI-Configuration¶
The CLI configuration allows you to specify the parameters directly in the command line.
General configuration¶
usage: cellvit-inference [-h] [--config CONFIG] [--model {SAM,HIPT}] [--nuclei_taxonomy {binary,pannuke,consep,lizard,midog,nucls_main,nucls_super,ocelot,panoptils}] [--gpu GPU]
[--enforce_amp] [--batch_size BATCH_SIZE] [--outdir OUTDIR] [--geojson] [--graph] [--compression] [--cpu_count CPU_COUNT] [--ray_worker RAY_WORKER]
[--ray_remote_cpus RAY_REMOTE_CPUS] [--memory MEMORY] [--debug]
{process_wsi,process_dataset} ...
Perform CellViT++ inference
positional arguments:
{process_wsi,process_dataset}
Select processing mode
process_wsi Process a single Whole Slide Image
process_dataset Process multiple WSI files
options:
-h, --help show this help message and exit
--config CONFIG Path to a YAML configuration file. If provided, CLI arguments are ignored. (default: None)
--model {SAM,HIPT} Segmentation model to use (default: None), REQUIRED
--nuclei_taxonomy {binary,pannuke,consep,lizard,midog,nucls_main,nucls_super,ocelot,panoptils}
Defines the nuclei classification taxonomy (default: pannuke), OPTIONAL
--debug Enable debug mode (changes logger level and requires ray[default]) (default: False), OPTIONAL
Inference Settings:
--gpu GPU GPU ID to use for inference (default: 0), OPTIONAL
--enforce_amp Whether to use Automatic Mixed Precision (AMP) for inference (default: False), OPTIONAL
--batch_size BATCH_SIZE
Number of images processed per batch (default: 8), OPTIONAL
Output Settings:
--outdir OUTDIR Path to the output directory where results will be stored (default: None), REQUIRED
--geojson Whether to export results in GeoJSON format (for QuPath or other tools) (default: False), OPTIONAL
--graph Whether to generate a cell graph representation (default: False), OPTIONAL
--compression Whether to use Snappy compression for output files (default: False), OPTIONAL
System Settings:
--cpu_count CPU_COUNT
Number of CPU cores to use for inference (default: None), OPTIONAL
--ray_worker RAY_WORKER
Number of ray worker to use for inference (limited by cpu-count) (default: None), OPTIONAL
--ray_remote_cpus RAY_REMOTE_CPUS
Number of CPUs per ray worker (default: None), OPTIONAL
--memory MEMORY RAM in MB to use (default: None), OPTIONAL
Process a single image¶
All previous configuration options need to be set before running the command with process_wsi:
cellvit-inference [previous options] process_wsi [wsi_options]
The process_wsi options are:
usage: cellvit-inference process_wsi [-h] --wsi_path WSI_PATH [--wsi_mpp WSI_MPP] [--wsi_magnification WSI_MAGNIFICATION]
options:
-h, --help show this help message and exit
--wsi_path WSI_PATH Path to the Whole Slide Image (WSI) file, REQUIRED
--wsi_mpp WSI_MPP Microns per pixel (spatial resolution of the slide), OPTIONAL
Default: Extracted automatically from file (if available)
--wsi_magnification WSI_MAGNIFICATION
Magnification level of the slide (e.g., 40), OPTIONAL
Default: Extracted automatically from file (if available)
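The ordering matters here: general options come before the process_wsi subcommand, and the slide-specific options come after it. A minimal sketch of assembling such a call from Python follows. The helper `build_wsi_command` and the example paths are hypothetical; only flags listed in the help output above are used.

```python
import shlex


def build_wsi_command(model, outdir, wsi_path, geojson=False, wsi_mpp=None):
    """Assemble the argument list: general options first, then the subcommand and its options."""
    cmd = ["cellvit-inference", "--model", model, "--outdir", outdir]
    if geojson:
        cmd.append("--geojson")  # boolean flag, takes no value
    cmd += ["process_wsi", "--wsi_path", wsi_path]
    if wsi_mpp is not None:
        cmd += ["--wsi_mpp", str(wsi_mpp)]
    return cmd


cmd = build_wsi_command("SAM", "./results", "./slides/example.svs", geojson=True, wsi_mpp=0.25)
# Print the command as a single shell-safe line for inspection.
print(shlex.join(cmd))
```

Once cellvit-inference is installed, the list can be handed to `subprocess.run(cmd, check=True)` to start the run.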
Process a dataset¶
cellvit-inference [previous options] process_dataset [wsi_options]
The process_dataset options are:
usage: cellvit-inference process_dataset [-h] (--wsi_folder WSI_FOLDER | --wsi_filelist WSI_FILELIST) [--wsi_extension WSI_EXTENSION] [--wsi_mpp WSI_MPP]
[--wsi_magnification WSI_MAGNIFICATION]
options:
-h, --help show this help message and exit
--wsi_folder WSI_FOLDER
Path to a folder containing multiple WSI files, REQUIRED if wsi_filelist is NOT used
--wsi_filelist WSI_FILELIST
Path to a CSV file listing WSI files (must have a 'path' column), REQUIRED if wsi_folder is NOT used
--wsi_extension WSI_EXTENSION
File extension of WSI files (used for detection within wsi_folder), OPTIONAL
--wsi_mpp WSI_MPP Microns per pixel (spatial resolution), OPTIONAL
Default: Extracted automatically from file (if available)
Can be used with both wsi_folder and wsi_filelist
--wsi_magnification WSI_MAGNIFICATION
Magnification level of the slides, OPTIONAL
Default: Extracted automatically from file (if available)
Can be used with both wsi_folder and wsi_filelist
Note
The wsi_path and wsi_folder or wsi_filelist parameters are mutually exclusive.
The wsi_mpp and wsi_magnification parameters can be set globally or per WSI in the file list.
The output_format section allows you to customize the output format and compression settings.
The system section allows you to customize the CPU and memory settings for inference.
The debug section allows you to enable debug mode for more detailed logging.
The configuration file can be passed as a command-line argument using the --config flag.
The --wsi_folder option is used to specify a folder containing multiple WSI files.
The --wsi_filelist option is used to specify a CSV file listing WSI files, even from different folders. Provide the full WSI paths in the path column.
The --wsi_extension option is used to specify the file extension of WSI files (e.g., “svs”).
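A file list for --wsi_filelist can be generated with Python's csv module. The sketch below is one possible way to do it, not an official utility: `write_filelist` is a hypothetical helper, and the column names other than the mandatory path column follow the optional 'wsi_mpp' and 'wsi_magnification' columns mentioned in the YAML template.

```python
import csv


def write_filelist(csv_path, slides):
    """Write a CSV file list; every entry is a dict with a 'path' key and the same optional keys."""
    if not slides:
        raise ValueError("at least one slide is required")
    fields = list(slides[0])  # column order follows the first entry
    if "path" not in fields:
        raise ValueError("every entry needs a 'path' key")
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        writer.writerows(slides)


# Hypothetical slides from two different folders, given as full paths.
EXAMPLE_SLIDES = [
    {"path": "/data/cohort_a/slide_01.svs", "wsi_mpp": 0.25, "wsi_magnification": 40},
    {"path": "/data/cohort_b/slide_02.svs", "wsi_mpp": 0.5, "wsi_magnification": 20},
]
```

After `write_filelist("filelist.csv", EXAMPLE_SLIDES)`, the file can be used with cellvit-inference [previous options] process_dataset --wsi_filelist filelist.csv.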