Cell Segmentation
SMINT provides tools for segmenting cells and nuclei in whole-slide images, with support for distributed processing across multiple GPUs. This guide covers the segmentation pipeline from data preparation to post-processing.
Overview of Segmentation Pipeline
The SMINT segmentation pipeline consists of five stages:
- Image Preprocessing - Preparing and normalizing input images
- Cell Segmentation - Identifying whole cells using Cellpose
- Nuclei Segmentation - Identifying cell nuclei (optional)
- Post-processing - Refining segmentation results and extracting features
- Visualization - Generating visual outputs for quality control
Input Data Requirements
Supported Image Formats
SMINT supports the following image formats:
- OME-TIFF (preferred) - Multi-channel, multi-resolution format with metadata
- TIFF - Standard format, single or multi-channel
- CZI - Carl Zeiss format, requires additional processing
Required Image Properties
For optimal segmentation results, input images should have:
- Resolution: Ideally 0.5-1 μm/pixel for cell segmentation, 0.25-0.5 μm/pixel for nuclei
- Channels:
  - At least one membrane/cytoplasm channel (e.g., WGA, phalloidin)
  - At least one nuclear channel (e.g., DAPI, Hoechst) for nuclei segmentation
- Bit depth: 8-bit or 16-bit grayscale per channel
- Quality: Minimal noise, good contrast between cells and background
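Because the diameter parameters used later in this guide are expressed in pixels, it can help to convert a physical size into pixels using the image resolution. The following is a minimal sketch; `diameter_in_pixels` is a hypothetical helper (not part of SMINT), and the 30 μm and 10 μm sizes are illustrative values only.

```python
def diameter_in_pixels(diameter_um: float, um_per_pixel: float) -> float:
    """Convert a physical diameter (micrometres) to pixels.

    Hypothetical helper for illustration; not part of SMINT.
    """
    if um_per_pixel <= 0:
        raise ValueError("um_per_pixel must be positive")
    return diameter_um / um_per_pixel


# Illustrative values: a 30 um cell imaged at 0.5 um/pixel,
# and a 10 um nucleus imaged at 0.25 um/pixel.
cell_px = diameter_in_pixels(30.0, 0.5)
nucleus_px = diameter_in_pixels(10.0, 0.25)
print(cell_px, nucleus_px)
```

The resulting values can then be used as starting points for the cell and nuclei diameter settings and refined by inspecting the visualizations.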
Quick Start
Single-Process Segmentation
For standard whole-slide images that fit in memory:
python -m scripts.run_segmentation \
    --image path/to/image.ome.tiff \
    --output-dir results/segmentation \
    --cell-channel 0 \
    --nuclei-channel 1 \
    --cell-diameter 60 \
    --nuclei-diameter 30 \
    --visualize
Multi-GPU Distributed Segmentation
For very large images requiring multiple GPUs:
python -m scripts.run_distributed_segmentation \
    --image path/to/large_image.ome.tiff \
    --output-dir results/segmentation \
    --cell-channel 0 \
    --nuclei-channel 1 \
    --cell-diameter 60 \
    --nuclei-diameter 30 \
    --chunk-size 2048 2048 \
    --gpus 0 1 2 3 \
    --visualize
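When runs are launched from a Python batch script, the command lines above can be assembled programmatically. The sketch below only builds an argument list from the flags shown in this guide. `build_segmentation_command` is a hypothetical helper, not part of SMINT, and it assumes that passing a GPU list means the distributed script should be used.

```python
import shlex


def build_segmentation_command(image, output_dir, *, cell_channel=0,
                               nuclei_channel=1, cell_diameter=80,
                               nuclei_diameter=40, chunk_size=None,
                               gpus=None, visualize=False):
    """Build an argv list for the SMINT segmentation scripts.

    Hypothetical helper: only the flags documented in this guide are used.
    """
    # Assumption: a GPU list selects the distributed entry point.
    module = ("scripts.run_distributed_segmentation" if gpus is not None
              else "scripts.run_segmentation")
    cmd = [
        "python", "-m", module,
        "--image", str(image),
        "--output-dir", str(output_dir),
        "--cell-channel", str(cell_channel),
        "--nuclei-channel", str(nuclei_channel),
        "--cell-diameter", str(cell_diameter),
        "--nuclei-diameter", str(nuclei_diameter),
    ]
    if chunk_size is not None:
        cmd += ["--chunk-size", *(str(v) for v in chunk_size)]
    if gpus is not None:
        cmd += ["--gpus", *(str(g) for g in gpus)]
    if visualize:
        cmd.append("--visualize")
    return cmd


cmd = build_segmentation_command(
    "path/to/large_image.ome.tiff", "results/segmentation",
    cell_diameter=60, nuclei_diameter=30,
    chunk_size=(2048, 2048), gpus=[0, 1, 2, 3], visualize=True,
)
print(shlex.join(cmd))  # shell-quoted form, useful for logging
```

The returned list can be passed to `subprocess.run(cmd, check=True)` so that a failed run raises an exception instead of passing silently.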
Detailed Parameter Guide
Common Parameters
| Parameter | Description | Default | Recommended Range |
|---|---|---|---|
| --image | Path to input image | Required | - |
| --output-dir | Directory to save results | Required | - |
| --cell-channel | Channel index for cell segmentation | 0 | Depends on staining |
| --nuclei-channel | Channel index for nuclei segmentation | 1 | Depends on staining |
| --cell-diameter | Expected cell diameter in pixels | 80 | 40 to 120 |
| --nuclei-diameter | Expected nuclei diameter in pixels | 40 | 20 to 60 |
| --flow-threshold | Flow threshold for Cellpose | 0.4 | 0.2 to 0.8 |
| --cellprob-threshold | Cell probability threshold | -1.0 | -3.0 to 0.0 |
| --visualize | Generate visualizations | False | - |
| --chunk-size | Size of image chunks to process | 2048 2048 | Depends on GPU memory |
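Before starting a long run, it can be useful to check chosen values against the recommended ranges listed above. The following is a minimal sketch; `RECOMMENDED_RANGES` simply copies the ranges from the table, and `out_of_range` is a hypothetical helper rather than a SMINT function.

```python
# Recommended ranges copied from the Common Parameters table.
RECOMMENDED_RANGES = {
    "cell_diameter": (40, 120),
    "nuclei_diameter": (20, 60),
    "flow_threshold": (0.2, 0.8),
    "cellprob_threshold": (-3.0, 0.0),
}


def out_of_range(params):
    """Return {name: (value, (low, high))} for values outside the range.

    Hypothetical helper; parameters without a listed range are ignored.
    """
    flagged = {}
    for name, value in params.items():
        bounds = RECOMMENDED_RANGES.get(name)
        if bounds is None:
            continue
        low, high = bounds
        if not (low <= value <= high):
            flagged[name] = (value, bounds)
    return flagged


for name, (value, (low, high)) in out_of_range(
        {"cell_diameter": 60, "flow_threshold": 0.9}).items():
    print(f"{name}={value} is outside the recommended range {low} to {high}")
```

A value outside the range is not necessarily wrong; the ranges are recommendations, so this check is best treated as a warning.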
Advanced Parameters
| Parameter | Description | Default | Notes |
|---|---|---|---|
| --pretrained-model | Path to custom Cellpose model | None | Use for specialized cell types |
| --model-type | Cellpose model type | "cyto" | Options: "cyto", "nuclei", "cyto2", or custom path |
| --min-cell-size | Minimum cell size in pixels | 15 | Filters out small objects |
| --omit-overlap | Remove overlapping cell masks | False | Useful for densely packed cells |
| --adaptive-threshold | Use adaptive thresholding | False | Helps with variable image intensity |
| --normalize-channels | Normalize channel intensities | True | Improves segmentation quality |
| --save-zarr | Save results in Zarr format | False | Useful for very large datasets |
| --no-nuclei | Skip nuclei segmentation | False | Speeds up processing |
Distributed Processing Parameters
| Parameter | Description | Default | Notes |
|---|---|---|---|
| --gpus | GPU device IDs to use | None | Space-separated list of GPU IDs |
| --workers-per-gpu | Dask workers per GPU | 1 | Increase for CPU-bound tasks |
| --memory-limit | Memory limit per worker | "10GB" | Adjust based on available RAM |
| --scheduler-address | Dask scheduler address | None | For connecting to existing cluster |
| --overlap | Overlap between chunks | 64 | Prevents boundary artifacts |
| --batch-size | Number of chunks per batch | 4 | Adjust based on GPU memory |
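To illustrate how the chunk size and overlap settings interact, here is a minimal sketch of tiling an image into overlapping chunks. It is not SMINT's implementation: `iter_chunks` is a hypothetical function, and the padding scheme (each chunk extended by `overlap` pixels on every side and clipped to the image bounds) is an assumption made for illustration.

```python
def iter_chunks(height, width, chunk=(2048, 2048), overlap=64):
    """Yield ((y0, x0), (y_start, y_stop, x_start, x_stop)) per chunk.

    (y0, x0) is the origin of the non-overlapping core tile; the second
    tuple is the padded region that would actually be read, clipped to
    the image bounds. Hypothetical sketch, not SMINT's tiling code.
    """
    chunk_h, chunk_w = chunk
    for y0 in range(0, height, chunk_h):
        for x0 in range(0, width, chunk_w):
            y_start = max(0, y0 - overlap)
            x_start = max(0, x0 - overlap)
            y_stop = min(height, y0 + chunk_h + overlap)
            x_stop = min(width, x0 + chunk_w + overlap)
            yield (y0, x0), (y_start, y_stop, x_start, x_stop)


chunks = list(iter_chunks(10_000, 10_000))
print(len(chunks))   # 25 chunks: a 5 x 5 grid for a 10k x 10k image
print(chunks[0])     # ((0, 0), (0, 2112, 0, 2112))
```

With padding like this, an object that straddles a chunk border appears whole in at least one padded region, as long as it is smaller than the overlap. This is the usual reason an overlap reduces boundary artifacts. A larger overlap also means more pixels are processed per chunk.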
Segmentation API
For programmatic usage within Python scripts:
from smint.segmentation import process_large_image

# Basic usage
results = process_large_image(
    image_path="path/to/image.ome.tiff",
    csv_base_path="results/segmentation",
    chunk_size=(2048, 2048),
    # Cell parameters
    cell_model_path="cyto",
    cells_diameter=80.0,
    cells_flow_threshold=0.4,
    cells_cellprob_threshold=-1.5,
    cells_channels=[0, 0],  # [channel, 0] for grayscale, [channel1, channel2] for RGB
    # Nuclei parameters
    nuclei_model_path="nuclei",
    nuclei_diameter=40.0,
    nuclei_flow_threshold=0.4,
    nuclei_cellprob_threshold=-1.5,
    nuclei_channels=[1, 0],  # [channel, 0] for grayscale
    # Visualization
    visualize=True,
    visualize_output_dir="results/visualization",
    num_visualize_chunks=5,
    visualize_roi_size=(512, 512),
)

# Access the results
cell_outlines = results["cell_outlines"]
nuclei_outlines = results["nuclei_outlines"]
For distributed processing:
from smint.segmentation.distributed_seg import process_large_image_distributed

# Basic distributed usage
results = process_large_image_distributed(
    image_path="path/to/large_image.ome.tiff",
    output_zarr_path="results/segmentation.zarr",
    csv_path="results/segmentation.csv",
    blocksize=(2048, 2048),
    channel=0,            # Main channel for segmentation
    gpus=[0, 1, 2, 3],    # List of GPU IDs
    overlap=64,           # Overlap between chunks
    batch_size=4,         # Number of chunks per batch
    model_type="cyto",    # Cellpose model type
    diameter=80.0,        # Expected cell diameter
    flow_threshold=0.4,
    cellprob_threshold=-1.5,
)
Advanced Adaptive Segmentation
SMINT supports adaptive segmentation that dynamically adjusts parameters based on local image characteristics:
from smint.segmentation import process_large_image

results = process_large_image(
    # Basic parameters as above, plus:
    enable_adaptive_nuclei=True,
    nuclei_adaptive_flow_min=0.1,
    nuclei_adaptive_flow_step_decrement=0.1,
    nuclei_max_adaptive_attempts=5,
    adaptive_nuclei_trigger_ratio=0.05,  # Retry if nuclei count < 5% of cells
)
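The parameter names above suggest a retry loop: when too few nuclei are found relative to cells, nuclei segmentation is repeated with a lower flow threshold until a floor or an attempt limit is reached. The sketch below shows one way such a loop could work. It is an assumption based only on the parameter names, not SMINT's actual logic; `adaptive_nuclei_retry` and the `segment_nuclei` callable are hypothetical.

```python
def adaptive_nuclei_retry(segment_nuclei, n_cells, flow_start=0.4,
                          flow_min=0.1, step=0.1, max_attempts=5,
                          trigger_ratio=0.05):
    """Retry nuclei segmentation with a decreasing flow threshold.

    segment_nuclei(flow_threshold) is a caller-supplied function that
    returns the number of nuclei found (hypothetical interface).
    Returns (final_flow_threshold, nuclei_count, retries_used).
    """
    flow = flow_start
    n_nuclei = segment_nuclei(flow)
    retries = 0
    # Retry while the nuclei count is below the trigger ratio, the
    # attempt budget is not exhausted, and the next threshold would
    # not fall below the configured minimum.
    while (n_nuclei < trigger_ratio * n_cells
           and retries < max_attempts
           and flow - step >= flow_min - 1e-9):
        flow = round(flow - step, 10)  # avoid floating-point drift
        n_nuclei = segment_nuclei(flow)
        retries += 1
    return flow, n_nuclei, retries


# Stand-in for a real segmentation call: pretends that nuclei are
# only detected once the threshold drops to 0.25 or below.
def fake_segment(flow_threshold):
    return 10 if flow_threshold <= 0.25 else 2


print(adaptive_nuclei_retry(fake_segment, n_cells=100))  # (0.2, 10, 2)
```

Passing the segmentation step in as a callable keeps the retry policy separate from the model call, so the policy can be tested without a GPU.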
Output Files and Formats
SMINT generates the following output files:
| File | Description | Format |
|---|---|---|
| cells_outlines.csv | Cell outline coordinates | CSV with columns: cell_id, x, y, chunk_id |
| nuclei_outlines.csv | Nuclei outline coordinates | CSV with columns: nuclei_id, x, y, chunk_id |
| cell_features.csv | Extracted cell features | CSV with measurements for each cell |
| segmentation_metadata.json | Segmentation parameters and stats | JSON |
| visualization/*.png | Visualization images | PNG |
| chunks/*.npy | Raw segmentation masks (if saved) | NumPy arrays |
| *.zarr | Zarr store (for distributed processing) | Zarr directory structure |
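The outline CSV files can be read with the standard library. The sketch below groups outline points per object and computes each polygon's area with the shoelace formula. It relies on the column names listed above and makes two assumptions that should be checked against real output: points appear in order along each outline, and IDs may repeat across chunks (so objects are keyed by chunk and ID). `load_outlines` and `polygon_area` are hypothetical helpers.

```python
import csv
import io
from collections import defaultdict


def load_outlines(csv_file, id_column="cell_id"):
    """Group (x, y) outline points by (chunk_id, object id).

    csv_file is an open text file or any iterable of CSV lines.
    """
    outlines = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (row["chunk_id"], row[id_column])
        outlines[key].append((float(row["x"]), float(row["y"])))
    return dict(outlines)


def polygon_area(points):
    """Area of a simple polygon (shoelace formula), in square pixels."""
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


# In-memory example with made-up data: a 2 x 2 pixel square.
sample = io.StringIO(
    "cell_id,x,y,chunk_id\n"
    "1,0,0,0\n1,2,0,0\n1,2,2,0\n1,0,2,0\n"
)
areas = {key: polygon_area(pts) for key, pts in load_outlines(sample).items()}
print(areas)  # {('0', '1'): 4.0}
```

For a real file, pass an open file handle instead of the `StringIO` object, and use `id_column="nuclei_id"` for nuclei_outlines.csv.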
Live Segmentation Viewer
SMINT includes a live segmentation viewer for monitoring the segmentation process in real time:
from smint.visualization.live_scan_viewer import LiveScanViewer
import tkinter as tk

# Initialize the viewer
root = tk.Tk()
viewer = LiveScanViewer(
    master=root,
    full_scan_path="path/to/image.ome.tiff",
    segmentation_history_dir="results/segmentation",
    tile_info_path="results/tile_info.json",
    update_interval_ms=1000,  # Update every 1 second
)

# Start the viewer
viewer.pack(fill=tk.BOTH, expand=True)
root.mainloop()
Common Issues and Troubleshooting
Poor Segmentation Quality
- Problem: Cells or nuclei not properly detected
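- Solution: Adjust diameter, flow_threshold, and cellprob_threshold. Use a larger diameter for bigger cells. For weakly stained cells that are being missed, lower cellprob_threshold; in Cellpose, raising flow_threshold also accepts more masks, while lowering it rejects ill-shaped ones.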
Memory Errors
- Problem: "CUDA out of memory" or other memory-related errors
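- Solution: Reduce chunk_size or batch_size, or use distributed processing with multiple GPUs. Note that a larger overlap increases the number of pixels processed per chunk, so it does not reduce memory use.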
Processing Speed
- Problem: Segmentation taking too long
- Solution: Use multi-GPU processing, disable or reduce visualization, and skip nuclei segmentation (--no-nuclei) if it is not needed.
Boundary Artifacts
- Problem: Cell masks cut off at chunk boundaries
- Solution: Increase overlap between chunks or post-process with stitch_masks=True.
Performance Benchmarks
SMINT segmentation performance on different hardware configurations:
| Image Size | Hardware | Processing Time | Memory Usage |
|---|---|---|---|
| 10k × 10k | Single GPU (RTX 3090) | ~5 minutes | ~8 GB VRAM |
| 50k × 50k | Single GPU (RTX 3090) | ~1 hour | ~10 GB VRAM |
| 50k × 50k | 4× GPUs (RTX 3090) | ~15 minutes | ~8 GB VRAM per GPU |
| 100k × 100k | 4× GPUs (RTX 3090) | ~1 hour | ~8 GB VRAM per GPU |
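The figures in the table can be converted into an approximate throughput to compare configurations. The sketch below uses only the approximate times listed above (they are not additional measurements), and `megapixels_per_minute` is a hypothetical helper.

```python
# (image side in pixels, number of GPUs, approximate minutes),
# taken from the benchmark table above.
BENCHMARKS = [
    (10_000, 1, 5),
    (50_000, 1, 60),
    (50_000, 4, 15),
    (100_000, 4, 60),
]


def megapixels_per_minute(side_px, minutes):
    """Approximate throughput for a square image of side_px pixels."""
    return side_px * side_px / 1e6 / minutes


for side, gpus, minutes in BENCHMARKS:
    rate = megapixels_per_minute(side, minutes)
    print(f"{side} px, {gpus} GPU(s): ~{rate:.0f} MP/min")

# Speed-up of 4 GPUs over 1 GPU on the 50k x 50k image.
speedup = megapixels_per_minute(50_000, 15) / megapixels_per_minute(50_000, 60)
```

On these figures the four-GPU run of the 50k × 50k image is about four times faster than the single-GPU run, which suggests near-linear scaling for that image size.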
Tips for Best Results
- Image Quality: Start with high-quality, well-stained images for best results
- Parameter Tuning: Optimize cell_diameter, flow_threshold, and cellprob_threshold for your specific images
- Chunk Size: Balance between processing speed (larger chunks) and memory usage (smaller chunks)
- Custom Models: Train custom Cellpose models for specialized cell types
- Channel Selection: Choose channels with strongest cell/nuclei signal for segmentation
- Validation: Always validate segmentation quality with visualizations
- Adaptive Approach: Use adaptive parameters for images with varying intensity or cell density