ratiopath.ray.read_slides
FILE_EXTENSIONS = ['svs', 'tif', 'dcm', 'ndpi', 'vms', 'vmu', 'scn', 'mrxs', 'tiff', 'svslide', 'bif', 'czi', 'ome.tiff', 'ome.tif']
module-attribute
read_slides(paths, *, tile_extent, stride, mpp=None, level=None, filesystem=None, ray_remote_args=None, meta_provider=None, partition_filter=None, partitioning=None, shuffle=None, ignore_missing_paths=False, file_extensions=FILE_EXTENSIONS, concurrency=None, override_num_blocks=None)
Creates a `ray.data.Dataset` from whole slide image files.
This function reads metadata from whole slide image (WSI) files and creates a Ray Dataset where each row corresponds to a single slide. The dataset contains metadata required for subsequent tiled processing, such as slide dimensions, resolution (MPP), and tiling parameters.
It automatically selects the best slide level based on the specified `mpp` (microns per pixel) or uses the given `level`.
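The level selection described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `closest_level` and the per-level MPP list are hypothetical, and the sketch assumes that "best" means the level whose MPP has the smallest absolute difference from the requested value.

```python
def closest_level(level_mpps: list[float], target_mpp: float) -> int:
    """Return the index of the level whose MPP is nearest to target_mpp.

    Hypothetical helper for illustration only; ties resolve to the lower index.
    """
    if not level_mpps:
        raise ValueError("level_mpps must not be empty")
    return min(
        range(len(level_mpps)),
        key=lambda i: abs(level_mpps[i] - target_mpp),
    )


# Illustrative pyramid (assumed values): one MPP per slide level.
example_level_mpps = [0.25, 0.5, 1.0, 4.0]
chosen_level = closest_level(example_level_mpps, target_mpp=0.5)
```

With an explicit `level`, no such search would be needed, which is presumably why the two parameters are alternatives.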
Examples:
Read a single slide and create a metadata dataset.
>>> import ray
>>> from ratiopath.ray import read_slides
>>> ds = read_slides(
...     "path/to/slide.svs",
...     tile_extent=256,
...     stride=256,
...     mpp=0.5,
... )
>>> ds.schema()
Column         Type
------         ----
path           string
extent_x       int64
extent_y       int64
tile_extent_x  int64
tile_extent_y  int64
stride_x       int64
stride_y       int64
mpp_x          double
mpp_y          double
level          int64
downsample     double
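To show how the `extent_*`, `tile_extent_*`, and `stride_*` columns could drive a later tiling step, here is a sketch that enumerates tile origins along each axis and combines them into a grid. `axis_origins` and `tile_grid` are hypothetical helpers, not part of `ratiopath`, and the sketch assumes that only tiles lying entirely inside the slide are kept; the library may handle edge tiles differently.

```python
from itertools import product


def axis_origins(extent: int, tile_extent: int, stride: int) -> list[int]:
    """Start offsets of tiles along one axis that fit fully within extent."""
    if tile_extent <= 0 or stride <= 0:
        raise ValueError("tile_extent and stride must be positive")
    if extent < tile_extent:
        return []
    return list(range(0, extent - tile_extent + 1, stride))


def tile_grid(row: dict) -> list[tuple[int, int]]:
    """Return (x, y) tile origins in row-major order for one metadata row."""
    xs = axis_origins(row["extent_x"], row["tile_extent_x"], row["stride_x"])
    ys = axis_origins(row["extent_y"], row["tile_extent_y"], row["stride_y"])
    return [(x, y) for y, x in product(ys, xs)]


# A made-up metadata row using the column names from the schema above.
example_row = {
    "extent_x": 1000, "extent_y": 600,
    "tile_extent_x": 256, "tile_extent_y": 256,
    "stride_x": 256, "stride_y": 256,
}
origins = tile_grid(example_row)
```

A stride equal to the tile extent gives non-overlapping tiles; a smaller stride gives overlapping ones.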
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`paths` | `str \| list[str]` | A single file path or a list of file paths to whole slide images. | required |
`tile_extent` | `int \| tuple[int, int]` | The size of the tiles to be generated, as | required |
`stride` | `int \| tuple[int, int]` | The step size between consecutive tiles, as | required |
`mpp` | `float \| None` | The desired microns per pixel. The datasource will select the slide level with the closest MPP. Exactly one of | `None` |
`level` | `int \| None` | The desired slide level to use. Exactly one of | `None` |
`filesystem` | `FileSystem \| None` | The PyArrow filesystem implementation to read from. If not provided, it will be inferred from the file paths. | `None` |
`ray_remote_args` | `dict[str, Any] \| None` | kwargs passed to :func: | `None` |
`meta_provider` | `BaseFileMetadataProvider \| None` | Custom metadata providers may be able to resolve file metadata more quickly and/or accurately. In most cases you do not need to set this parameter. | `None` |
`partition_filter` | `PathPartitionFilter \| None` | A filter to read only selected partitions of a dataset. | `None` |
`partitioning` | `Partitioning \| None` | A :class: | `None` |
`shuffle` | `Literal['files'] \| FileShuffleConfig \| None` | If set to "files", randomly shuffles the input file order. | `None` |
`ignore_missing_paths` | `bool` | If | `False` |
`file_extensions` | `list[str] \| None` | A list of file extensions to filter files by. If | `FILE_EXTENSIONS` |
`concurrency` | `int \| None` | The maximum number of Ray tasks to run concurrently. | `None` |
`override_num_blocks` | `int \| None` | Override the number of output blocks from all read tasks. | `None` |
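As an illustration of how a `file_extensions` list such as `FILE_EXTENSIONS` can be applied, the sketch below matches paths by suffix so that compound extensions like `ome.tiff` work. `has_allowed_extension` is a hypothetical helper; the case-insensitive suffix matching and the treatment of `None` as "no filtering" are assumptions, not confirmed behavior of `read_slides`.

```python
from typing import Optional

FILE_EXTENSIONS = [
    "svs", "tif", "dcm", "ndpi", "vms", "vmu", "scn", "mrxs",
    "tiff", "svslide", "bif", "czi", "ome.tiff", "ome.tif",
]


def has_allowed_extension(path: str, extensions: Optional[list[str]]) -> bool:
    """Return True if path ends with one of the given extensions.

    Assumption: None disables filtering, and matching ignores case.
    """
    if extensions is None:
        return True
    lowered = path.lower()
    return any(lowered.endswith("." + ext.lower()) for ext in extensions)


# Made-up paths for demonstration.
candidate_paths = ["a/slide.SVS", "a/notes.txt", "b/scan.ome.tiff"]
selected = [p for p in candidate_paths if has_allowed_extension(p, FILE_EXTENSIONS)]
```

Suffix matching is used instead of splitting on the last dot because `os.path.splitext` would reduce `scan.ome.tiff` to `.tiff` and lose the `ome` part.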
Returns:
Type | Description |
---|---|
`Dataset` | A `Dataset` in which each row corresponds to a single slide, ready for tiling operations. |
Source code in ratiopath/ray/read_slides.py