dgp.datasets package

dgp.datasets.base_dataset module

Base dataset class compliant with the TRI-ML Dataset Governance Policy (DGP), which standardizes TRI’s data formats.

Please refer to dgp/proto/dataset.proto for the exact specifications of our DGP and to dgp/proto/annotations.proto for the expected structure for annotations.

class dgp.datasets.base_dataset.BaseDataset(dataset_metadata, scenes, datum_names, requested_annotations=None, requested_autolabels=None, split=None, autolabel_root=None, ignore_raw_datum=None)

Bases: object

A base class representing a Dataset. Provides utilities for parsing and slicing DGP format datasets.

dataset_metadata: DatasetMetadata

Dataset metadata object that encapsulates dataset-level metadata for both operating modes (scene or JSON).

scenes: list[SceneContainer]

List of SceneContainer objects to be included in the dataset.

datum_names: list, default: None

List of datum names (str) to be considered in the dataset.

requested_annotations: tuple[str], default: None

Tuple of desired annotation keys, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each key should match the name of the directory containing those annotations, relative to the dataset root.

requested_autolabels: tuple[str], default: None

Tuple of annotation keys similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<autolabel_model>/<annotation_key>”.

split: str, default: None

Split of dataset to read (“train” | “val” | “test” | “train_overfit”). If the split is None, the split type is not known and the dataset can be used for unsupervised / self-supervised learning.

autolabel_root: str, default: None

Optional path to autolabel root directory.

ignore_raw_datum: Optional[list[str]], default: None

Optionally pass a list of datum types to skip loading their raw data (but still load their annotations). For example, ignore_raw_datum=[‘image’] will skip loading the image rgb data. The rgb key will be set to None. This is useful when only annotations or extrinsics are needed. Allowed values are any combination of ‘image’, ‘point_cloud’, and ‘radar_point_cloud’.
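As a minimal sketch, the allowed ignore_raw_datum values listed above could be checked before a dataset is constructed. The helper name validate_ignore_raw_datum is hypothetical and not part of dgp; only the three allowed values come from the description above.

```python
# Allowed datum types, taken from the ignore_raw_datum description above.
ALLOWED_RAW_DATUM_TYPES = frozenset({"image", "point_cloud", "radar_point_cloud"})


def validate_ignore_raw_datum(ignore_raw_datum=None):
    """Return a clean list of datum types to skip (hypothetical helper, not part of dgp)."""
    if ignore_raw_datum is None:
        return []
    unknown = sorted(set(ignore_raw_datum) - ALLOWED_RAW_DATUM_TYPES)
    if unknown:
        raise ValueError(f"Unsupported datum types in ignore_raw_datum: {unknown}")
    return list(ignore_raw_datum)


print(validate_ignore_raw_datum(["image"]))
```

Validating up front keeps a misspelled datum type from being silently ignored.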

static get_annotations(datum)
datum: Datum

Datum of type image, point cloud, etc.

annotations: annotations_pb2

Annotation proto object corresponding to the datum.

get_autolabels_for_datum(scene_idx, sample_idx_in_scene, datum_name)

Get autolabels associated with a datum if available

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

autolabels: dict

Map of <autolabel_model>/<annotation_key> : <annotation_path>. Returns empty dictionary if no autolabels exist for that datum.
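The keys of the returned map follow the <autolabel_model>/<annotation_key> format. A small sketch of splitting such a key is shown here; split_autolabel_key is a hypothetical helper, and the example path is made up for illustration.

```python
def split_autolabel_key(autolabel_key):
    """Split '<autolabel_model>/<annotation_key>' into its two parts.

    Hypothetical helper for illustration; not part of dgp.
    """
    model, sep, annotation_key = autolabel_key.partition("/")
    if not sep or not model or not annotation_key:
        raise ValueError(
            f"Expected '<autolabel_model>/<annotation_key>', got {autolabel_key!r}"
        )
    return model, annotation_key


# Example map shaped like the documented return value; the path is invented.
autolabels = {"model_a/bounding_box_3d": "autolabels/model_a/bounding_box_3d/000.json"}
for key, path in autolabels.items():
    model, annotation_key = split_autolabel_key(key)
    print(model, annotation_key, path)
```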

get_camera_calibration(calibration_key, datum_name)

Get camera calibration given its calibration key and datum name.

calibration_key: str

Calibration key.

datum_name: str

Datum name whose calibration is requested.

camera: Camera

Calibrated camera with extrinsics/intrinsics set.

get_datum(scene_idx, sample_idx_in_scene, datum_name)

Get datum given its scene index, sample_idx_in_scene, and datum_name

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

datum: Datum

Datum indexed at scene_idx, sample_idx_in_scene with the given datum_name.

get_datum_pose(datum)

Get the ego-pose associated with datum

datum: Datum

Datum of type image, point cloud, etc.

datum_pose: Pose

Pose object of datum’s ego pose

TypeError

Raised if datum type is unsupported.

get_file_meta_from_datum(scene_idx, sample_idx_in_scene, datum_name)

Get the sample file info from file datum.

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

data: OrderedDict

“timestamp”: int

Timestamp of the image in microseconds.

“datum_name”: str

Sensor name from which the data was collected

“filename”: str

File name associated with the file datum.

annotations: dict

Map from annotation key to annotation file for datum

get_image_from_datum(scene_idx, sample_idx_in_scene, datum_name)

Get the sample image data from image datum.

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

data: OrderedDict

“timestamp”: int

Timestamp of the image in microseconds.

“datum_name”: str

Sensor name from which the data was collected

“rgb”: PIL.Image (mode=RGB)

Image in RGB format.

“intrinsics”: np.ndarray

Camera intrinsics if available.

“extrinsics”: Pose

Camera extrinsics with respect to the vehicle frame, if available.

“pose”: Pose

Pose of sensor with respect to the world/global/local frame (reference frame that is initialized at start-time). (i.e. this provides the ego-pose in pose_WC).

annotations: dict

Map from annotation key to annotation file for datum

get_point_cloud_from_datum(scene_idx, sample_idx_in_scene, datum_name)

Get the sample lidar data from point cloud datum.

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

data: OrderedDict

“timestamp”: int

Timestamp of the lidar in microseconds.

“datum_name”: str

Sensor name from which the data was collected

“extrinsics”: Pose

Sensor extrinsics with respect to the vehicle frame.

“point_cloud”: np.ndarray (N x 3)

Point cloud in the local/world (L) frame returning X, Y and Z coordinates. The local frame is consistent across multiple timesteps in a scene.

“extra_channels”: np.ndarray (N x M)

Remaining channels from point_cloud (i.e. lidar intensity I or pixel colors RGB)

“pose”: Pose

Pose of sensor with respect to the world/global/local frame (reference frame that is initialized at start-time). (i.e. this provides the ego-pose in pose_WS where S refers to the point cloud sensor (S)).

annotations: dict

Map from annotation key to annotation file for datum
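The “point_cloud” and “extra_channels” entries above divide each lidar return into its X, Y, Z coordinates and the remaining channels. The sketch below illustrates that split on plain Python lists; split_point_channels is a hypothetical helper and the values are made up, whereas dgp itself returns NumPy arrays.

```python
def split_point_channels(rows):
    """Split rows of [x, y, z, extra...] into (xyz, extra_channels) lists.

    Hypothetical helper for illustration; dgp returns NumPy arrays instead.
    """
    xyz = [list(row[:3]) for row in rows]
    extra = [list(row[3:]) for row in rows]
    return xyz, extra


# Two invented lidar returns: X, Y, Z plus one intensity channel.
raw = [[1.0, 2.0, 0.5, 0.9], [4.0, -1.0, 0.2, 0.1]]
point_cloud, extra_channels = split_point_channels(raw)
print(point_cloud)     # N x 3 coordinates
print(extra_channels)  # N x M remaining channels (here M = 1)
```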

get_radar_point_cloud_from_datum(scene_idx, sample_idx_in_scene, datum_name)

Get the sample radar data from radar point cloud datum.

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

data: OrderedDict

“timestamp”: int

Timestamp of the radar point cloud in microseconds.

“datum_name”: str

Sensor name from which the data was collected

“extrinsics”: Pose

Sensor extrinsics with respect to the vehicle frame.

“point_cloud”: np.ndarray (N x 3)

Point cloud in the local/world (L) frame returning X, Y and Z coordinates. The local frame is consistent across multiple timesteps in a scene.

“velocity”: np.ndarray(N x 3)

Velocity vectors in sensor frame.

“covariance”: np.ndarray(N x 3 x 3)

Covariance matrix of point positions in sensor frame.

“extra_channels”: np.ndarray (N x M)

Remaining channels from radar, rcs_dbm, probability, sensor_id etc

“pose”: Pose

Pose of sensor with respect to the world/global/local frame (reference frame that is initialized at start-time). (i.e. this provides the ego-pose in pose_WS where S refers to the point cloud sensor (S)).

annotations: dict

Map from annotation key to annotation file for datum

get_sample(scene_idx, sample_idx_in_scene)

Get sample given its scene index and sample_idx_in_scene.

NOTE: Some samples may be removed during indexing. These samples will NOT be returned by this function. An unmodified list of samples can be accessed via the samples property on each SceneContainer.

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

sample: Sample

Sample indexed at scene_idx and sample_idx_in_scene.

get_scene_metadata(scene_idx)

Get scene-level metadata for the scene index.

scene_idx: int

Index of scene.

scene_metadata: OrderedDict

Additional scene-level metadata for the dataset item at index. Note: This is used for traceability and sampling purposes.

get_sensor_extrinsics(calibration_key, datum_name)

Get sensor extrinsics given its calibration key and datum name.

calibration_key: str

Calibration key.

datum_name: str

Datum name whose calibration is requested.

p_WS: Pose

Extrinsics of sensor (S) with respect to the world (W)

property image_mean
property image_stddev
list_datum_names_available_in_all_scenes()

Gets the set union of available datum names across all scenes. We assume that all samples in a scene have the same datums available.

available_datum_names: list

DatumId.name which are available across all scenes.

load_annotations(scene_idx, sample_idx_in_scene, datum_name)

Get annotations for a specified datum

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

annotations: dict

Dictionary mapping annotation key to Annotation object for given annotation type.

Exception

Raised if we cannot load an annotation type due to not finding an ontology for a requested annotation.

load_datum(scene_idx, sample_idx_in_scene, datum_name)

Load a datum given a sample and a datum name

scene_idx: int

Index of the scene.

sample_idx_in_scene: int

Index of the sample within the scene at scene_idx.

datum_name: str

Name of the datum within sample

datum: parsed datum type

For different datums, we return different types. For image types, we return a PIL.Image. For point cloud types, we return a numpy float64 array.

TypeError

Raised if the datum type is unsupported.

Exception

Raised if datum’s filename is not supported

property metadata_index

Builds an index of metadata items keyed by scene index and sample index.

metadata_index: dict

Dictionary of metadata for tuple key (scene_idx, sample_idx_in_scene) returning a dictionary of additional metadata information for the requested sample.

class dgp.datasets.base_dataset.DatasetMetadata(scenes, directory, ontology_table=None)

Bases: object

A wrapper dataset metadata class to support two entrypoints for datasets (reading from dataset.json OR from a scene_dataset.json). Aggregates statistics and ontology_table when constructing a DatasetMetadata object for SceneDataset.

scenes: list[SceneContainer]

List of SceneContainer objects to be included in the dataset.

directory: str

Directory of dataset.

ontology_table: dict, default: None

A dictionary mapping annotation key(s) to Ontology(s), e.g.:

    {
        “bounding_box_2d”: BoundingBoxOntology[<ontology_sha>],
        “autolabel_model_1/bounding_box_2d”: BoundingBoxOntology[<ontology_sha>],
        “semantic_segmentation_2d”: SemanticSegmentationOntology[<ontology_sha>]
    }

classmethod from_scene_containers(scene_containers, requested_annotations=None, requested_autolabels=None, autolabel_root=None)

Load DatasetMetadata from Scene Dataset JSON.

scene_containers: list of SceneContainer

List of SceneContainer objects.

requested_annotations: List(str)

List of annotations, such as [‘bounding_box_3d’, ‘bounding_box_2d’]

requested_autolabels: List(str)

List of autolabels, such as [‘model_a/bounding_box_3d’, ‘model_a/bounding_box_2d’]

autolabel_root: str, optional

Optional path to autolabel root directory. Default: None.

Exception

Raised if an ontology in a scene has no corresponding implementation yet.

static get_dataset_splits(dataset_json)

Get a list of splits in the dataset.json.

dataset_json: str

Full path to the dataset json holding dataset metadata, ontology, and image and annotation paths.

dataset_splits: list of str

List of dataset splits (train | val | test | train_overfit).

property metadata
class dgp.datasets.base_dataset.SceneContainer(scene_path, directory=None, autolabeled_scenes=None, is_datums_synchronized=False, use_diskcache=True, skip_missing_data=False)

Bases: object

Object-oriented container for assembling datasets from collections of scenes. Each scene is fully described within a sub-directory with an associated scene.json file.

This class also provides functionality for reinjecting autolabeled scenes into other scenes.

SCENE_CACHE = <diskcache.core.Cache object>
property annotation_index

Build 2D boolean DataArray for annotations. Rows correspond to the datum_idx_in_scene and columns correspond to requested annotation types.

For example:

    +----+-------------------+-------------------+-----+
    |    | "bounding_box_2d" | "bounding_box_3d" | ... |
    +----+-------------------+-------------------+-----+
    |  0 | False             | True              |     |
    |  1 | True              | True              |     |
    |  2 | False             | False             |     |
    | .. | ..                | ..                |     |
    +----+-------------------+-------------------+-----+

scene_annotation_index: xr.DataArray

Boolean index of annotations for this scene
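To make the boolean index concrete, the sketch below mirrors the example table with a plain dictionary and queries it. This is only an illustration: the real property returns an xr.DataArray, and the helper names datums_with_all and datums_with_any are hypothetical.

```python
# Rows are datum_idx_in_scene, columns are annotation types
# (values copied from the example table above).
annotation_index = {
    0: {"bounding_box_2d": False, "bounding_box_3d": True},
    1: {"bounding_box_2d": True, "bounding_box_3d": True},
    2: {"bounding_box_2d": False, "bounding_box_3d": False},
}


def datums_with_all(index, requested):
    """Datum indices that carry every requested annotation type."""
    return [i for i, row in sorted(index.items()) if all(row[k] for k in requested)]


def datums_with_any(index, requested):
    """Datum indices that carry at least one requested annotation type."""
    return [i for i, row in sorted(index.items()) if any(row[k] for k in requested)]


requested = ("bounding_box_2d", "bounding_box_3d")
print(datums_with_all(annotation_index, requested))  # [1]
print(datums_with_any(annotation_index, requested))  # [0, 1]
```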

property autolabels

Associate autolabels to datums. Iterate through datums, and if that datum has a corresponding autolabel, add it to the autolabel object. Example resulting autolabel map is:

    {
        <datum_hash>: {
            <autolabel_key>: <autolabeled_annotation>
            …
        }
    }

cache_dir = '/home/runner/.dgp/cache/dgp_diskcache_21323'
cache_suffix = '21323'
property calibration_files

Returns the calibration index for a scene.

calibration_table: dict

Maps (calibration_key, datum_key) -> (p_WS, Camera)

For example: (p_WS, Camera) = self.calibration_table[(calibration_key, datum_name)]

check_datum_file(datum_idx_in_scene)

Checks if datum file exists.

datum_idx_in_scene: int

Index of datum in this scene

bool: True if datum file exists

TypeError

Raised if the referenced datum has an unsupported type.

check_files()

Checks if scene and calibration files exist.

bool: True if scene and calibration files exist

property data

Returns the scene data.

property datum_index

Build a multidimensional DataArray to represent a scene. Rows correspond to samples, and columns correspond to datums. The value at each location is the datum_idx_in_scene, which can be used to directly fetch the desired datum given a sample index and datum name.

For example:

    +----+-------------+-------------+---------+-----+
    |    | "camera_01" | "camera_02" | "lidar" | ... |
    +----+-------------+-------------+---------+-----+
    |  0 |           0 |           1 |       2 |     |
    |  1 |           9 |          10 |      11 |     |
    |  2 |          18 |          19 |      20 |     |
    | .. |          .. |          .. |         |     |
    +----+-------------+-------------+---------+-----+

scene_datum_index: xr.DataArray

2D index describing samples and datums
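The lookup described above (sample index and datum name to datum_idx_in_scene) can be sketched with nested lists. This is an illustration only: the real property is an xr.DataArray, and lookup_datum_idx is a hypothetical helper.

```python
# Column labels and values copied from the example table above.
datum_names = ["camera_01", "camera_02", "lidar"]
datum_index = [
    [0, 1, 2],     # sample 0
    [9, 10, 11],   # sample 1
    [18, 19, 20],  # sample 2
]


def lookup_datum_idx(sample_idx_in_scene, datum_name):
    """Return datum_idx_in_scene for a given sample row and datum column."""
    column = datum_names.index(datum_name)
    return datum_index[sample_idx_in_scene][column]


print(lookup_datum_idx(1, "lidar"))  # 11
```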

property datum_names

Gets the list of datum names available within a scene.

get_autolabels(sample_idx_in_scene, datum_name)

Get autolabels associated with a datum if available

sample_idx_in_scene: int

Index of the sample within the scene.

datum_name: str

Name of the datum within sample

autolabels: dict

Map of <autolabel_model>/<annotation_key> : <annotation_path>. Returns empty dictionary if no autolabels exist for that datum.

get_datum(sample_idx_in_scene, datum_name)

Get datum given its sample_idx_in_scene and the datum name.

sample_idx_in_scene: int

Index of the sample within the scene.

datum_name: str

Name of the datum within sample

datum: Datum

Datum at sample_idx_in_scene and datum_name for the scene.

get_datum_type(datum_name)

Get datum type based on the datum name

datum_name: str

The name of the datum to find a type for.

get_sample(sample_idx_in_scene)

Get sample given its sample_idx_in_scene.

NOTE: Some samples may be removed during indexing. These samples will NOT be returned by this function. An unmodified list of samples can be accessed via the samples property on each SceneContainer.

sample_idx_in_scene: int

Index of the sample within the scene.

sample: Sample

Sample indexed at sample_idx_in_scene for the scene.

property metadata_index

Helper for building metadata index.

TODO: Need to verify that the hashes are unique, and these lru-cached properties are consistent across disk-cached reads.

property ontology_files

Returns the ontology files for a scene.

ontology_files: dict

Maps annotation_key -> filename

For example: filename = scene.ontology_files[‘bounding_box_2d’]

random_str = '21323'
property samples

Returns the scene samples.

property scene

Returns scene.

  • If self.use_diskcache is True: returns the cached _scene if available, otherwise loads the scene and caches it.

  • If self.use_diskcache is False: returns _scene in memory if the instance has attribute _scene, otherwise loads the scene and saves it in memory. NOTE: Setting use_diskcache to False may exhaust memory if there is a large number of scenes.
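The two caching modes described above can be sketched with a toy class. This is not dgp code: LazyScene, its loader argument, and the plain dictionary standing in for the on-disk cache are all assumptions made for illustration.

```python
class LazyScene:
    """Toy stand-in for the caching behaviour described above (not dgp code)."""

    # Plain dict standing in for a shared on-disk cache such as diskcache.Cache.
    SHARED_CACHE = {}

    def __init__(self, scene_path, loader, use_diskcache=True):
        self.scene_path = scene_path
        self.loader = loader  # callable that reads and parses the scene file
        self.use_diskcache = use_diskcache

    @property
    def scene(self):
        if self.use_diskcache:
            # Check the shared cache first; load and store on a miss.
            if self.scene_path not in self.SHARED_CACHE:
                self.SHARED_CACHE[self.scene_path] = self.loader(self.scene_path)
            return self.SHARED_CACHE[self.scene_path]
        # Otherwise keep the parsed scene on the instance itself.
        if not hasattr(self, "_scene"):
            self._scene = self.loader(self.scene_path)
        return self._scene


calls = []


def fake_loader(path):
    """Record each load so repeated access can be checked."""
    calls.append(path)
    return {"path": path}


container = LazyScene("scene_000/scene.json", fake_loader, use_diskcache=False)
container.scene
container.scene
print(len(calls))  # 1: the second access reuses the in-memory copy
```

In the in-memory mode every container holds its own parsed scene, which is why memory use grows with the number of scenes.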

select_datums(datum_names, requested_annotations=None, requested_autolabels=None)

Select a set of datums by name to be used in the scene.

datum_names: list

List of datum names to be used for instance of dataset

requested_annotations: tuple, optional

Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each should match the name of the directory containing those annotations, relative to the dataset root. Default: None.

requested_autolabels: tuple[str], optional

Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>” Default: None.

ValueError

Raised if datum_names is not a list or tuple or if it is a sequence with no elements.
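The documented ValueError condition amounts to a simple precondition. A minimal sketch, assuming nothing beyond the description above (check_datum_names is a hypothetical helper, not the dgp implementation):

```python
def check_datum_names(datum_names):
    """Mirror the documented ValueError condition (hypothetical helper, not dgp code)."""
    if not isinstance(datum_names, (list, tuple)) or len(datum_names) == 0:
        raise ValueError("datum_names must be a non-empty list or tuple")
    return list(datum_names)


print(check_datum_names(("camera_01", "lidar")))
```

Note that a bare string such as "lidar" is rejected under this check, since it is neither a list nor a tuple.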

dgp.datasets.frame_dataset module

Dataset for handling frame-level (unordered) data for unsupervised, self-supervised and supervised tasks. This dataset is compliant with the TRI-ML Dataset Governance Policy (DGP).

Please refer to dgp/proto/dataset.proto for the exact specifications of our dgp.

class dgp.datasets.frame_dataset.FrameScene(scene_json, datum_names=None, requested_annotations=None, requested_autolabels=None, only_annotated_datums=False, use_diskcache=True, skip_missing_data=False)

Bases: _FrameDataset

Main entry-point for single-modality dataset using a single scene JSON as input.

NOTE: This class can be used to introspect a single scene given a scene directory with its associated scene JSON.

scene_json: str

Full path to the scene json.

datum_names: list, default: None

Select datums for which to build index (see self.select_datums(datum_names)). NOTE: All selected datums must be of the same datum type!

requested_annotations: tuple, default: None

Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each should match the name of the directory containing those annotations, relative to the dataset root.

requested_autolabels: tuple[str], default: None

Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”

only_annotated_datums: bool, default: False

If True, only datums with annotations matching the requested annotation types are returned.

use_diskcache: bool, default: True

If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there is a large number of scenes.

skip_missing_data: bool, default: False

If True, check for missing files and skip during datum index building.

class dgp.datasets.frame_dataset.FrameSceneDataset(scene_dataset_json, split='train', datum_names=None, requested_annotations=None, requested_autolabels=None, only_annotated_datums=False, use_diskcache=True, skip_missing_data=False)

Bases: _FrameDataset

Main entry-point for single-modality dataset. Used for tasks with unordered data, e.g. 2D detection.

scene_dataset_json: str

Full path to the scene dataset json holding collections of paths to scene json.

split: str, default: ‘train’

Split of dataset to read (“train” | “val” | “test” | “train_overfit”).

datum_names: list, default: None

Select datums for which to build index (see self.select_datums(datum_names)). NOTE: All selected datums must be of the same datum type!

requested_annotations: tuple, default: None

Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each should match the name of the directory containing those annotations, relative to the dataset root.

requested_autolabels: tuple[str], default: None

Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”

only_annotated_datums: bool, default: False

If True, only datums with annotations matching the requested annotation types are returned.

use_diskcache: bool, default: True

If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there is a large number of scenes.

skip_missing_data: bool, default: False

If True, check for missing files and skip during datum index building.
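A possible way to call this class is sketched below, using only parameters from the signature above. The JSON path and the datum name camera_01 are placeholders, and the helper names are hypothetical. The dgp import is done lazily inside the loader so the argument-building part can be run without dgp installed.

```python
def build_frame_dataset_kwargs(scene_dataset_json, split="train"):
    """Collect keyword arguments using only parameters listed in the signature above."""
    return {
        "scene_dataset_json": scene_dataset_json,
        "split": split,
        "datum_names": ["camera_01"],  # hypothetical datum name
        "requested_annotations": ("bounding_box_2d",),
        "only_annotated_datums": True,
    }


def load_frame_dataset(scene_dataset_json, split="train"):
    """Construct the dataset; requires dgp to be installed."""
    # Imported lazily so the rest of this sketch runs without dgp.
    from dgp.datasets.frame_dataset import FrameSceneDataset
    return FrameSceneDataset(**build_frame_dataset_kwargs(scene_dataset_json, split))


kwargs = build_frame_dataset_kwargs("/path/to/scene_dataset.json")  # placeholder path
print(sorted(kwargs))
```

All selected datum names are cameras here, in line with the note that selected datums must share one datum type.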

dgp.datasets.pd_dataset module

class dgp.datasets.pd_dataset.ParallelDomainScene(scene_json, datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, generate_depth_from_datum=None, only_annotated_datums=False, use_virtual_camera_datums=True, skip_missing_data=False, accumulation_context=None, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None)

Bases: _ParallelDomainDataset

Refer to SynchronizedScene for parameters.

class dgp.datasets.pd_dataset.ParallelDomainSceneDataset(scene_dataset_json, split='train', datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, generate_depth_from_datum=None, only_annotated_datums=False, use_virtual_camera_datums=True, skip_missing_data=False, accumulation_context=None, dataset_root=None, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None)

Bases: _ParallelDomainDataset

Refer to SynchronizedSceneDataset for parameters.

dgp.datasets.synchronized_dataset module

Dataset for handling synchronized multi-modal samples for unsupervised, self-supervised and supervised tasks. This dataset is compliant with the TRI-ML Dataset Governance Policy (DGP).

Please refer to dgp/proto/dataset.proto for the exact specifications of our dgp.

class dgp.datasets.synchronized_dataset.SynchronizedScene(scene_json, datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, accumulation_context=None, generate_depth_from_datum=None, only_annotated_datums=False, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None, ignore_raw_datum=None)

Bases: _SynchronizedDataset

Main entry-point for multi-modal dataset with sample-level synchronization using a single scene JSON as input.

Note: This class can be used to introspect a single scene given a scene directory with its associated scene JSON.

scene_json: str

Full path to the scene json.

datum_names: list, default: None

Select list of datum names for synchronization (see self.select_datums(datum_names)).

requested_annotations: tuple, default: None

Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each should match the name of the directory containing those annotations, relative to the dataset root.

requested_autolabels: tuple[str], default: None

Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”

backward_context: int, default: 0

Backward context in frames [T-backward, …, T-1]

forward_context: int, default: 0

Forward context in frames [T+1, …, T+forward]

accumulation_context: dict, default None

Dictionary of datum names containing a tuple of (backward_context, forward_context) for sensor accumulation. For example, accumulation_context={‘lidar’: (3, 1)} accumulates lidar points over the past three time steps and one forward step. Only valid for lidar and radar datums.

generate_depth_from_datum: str, default: None

Datum name of the point cloud. If not None, a depth map is generated for the camera using this point cloud.

only_annotated_datums: bool, default: False

If True, only datums with annotations matching the requested annotation types are returned.

transform_accumulated_box_points: bool, default: False

Flag to use cuboid pose and instance id to warp points when using lidar accumulation.

use_diskcache: bool, default: True

If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there is a large number of scenes.

autolabel_root: str, default: None

Path to autolabels if not stored inside scene root. Note this must still respect the scene structure, i.e., autolabel_root = ‘/some-autolabels’ means the autolabel scene.json is found at /some-autolabels/<scene-dir>/autolabels/my-model/scene.json.

ignore_raw_datum: Optional[list[str]], default: None

Optionally pass a list of datum types to skip loading their raw data (but still load their annotations). For example, ignore_raw_datum=[‘image’] will skip loading the image rgb data. The rgb key will be set to None. This is useful when only annotations or extrinsics are needed. Allowed values are any combination of ‘image’, ‘point_cloud’, and ‘radar_point_cloud’.

Refer to _SynchronizedDataset for remaining parameters.
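The backward_context and forward_context parameters above describe a window of frames around a center sample T. The index arithmetic can be sketched as follows; both helpers are hypothetical, and the boundary handling in valid_centers (dropping centers whose window would leave the scene) is an assumption rather than something the text above confirms.

```python
def context_indices(t, backward_context=0, forward_context=0):
    """Return sample indices [T-backward, ..., T-1, T, T+1, ..., T+forward]."""
    return list(range(t - backward_context, t + forward_context + 1))


def valid_centers(num_samples, backward_context=0, forward_context=0):
    """Centers T whose full window fits inside a scene of num_samples samples.

    Assumed boundary handling for illustration only.
    """
    return list(range(backward_context, num_samples - forward_context))


print(context_indices(5, backward_context=3, forward_context=1))  # [2, 3, 4, 5, 6]
print(valid_centers(6, backward_context=3, forward_context=1))    # [3, 4]
```

The same (backward, forward) pair is what each accumulation_context entry carries, for example (3, 1) for three past steps and one forward step.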

class dgp.datasets.synchronized_dataset.SynchronizedSceneDataset(scene_dataset_json, split='train', datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, accumulation_context=None, generate_depth_from_datum=None, only_annotated_datums=False, skip_missing_data=False, dataset_root=None, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None, ignore_raw_datum=None)

Bases: _SynchronizedDataset

Main entry-point for multi-modal dataset with sample-level synchronization using scene directories as input.

Note: This class is primarily used for self-supervised learning tasks where the default mode of operation is learning from a collection of scene directories.

scene_dataset_json: str

Full path to the scene dataset json holding collections of paths to scene json.

split: str, default: ‘train’

Split of dataset to read (“train” | “val” | “test” | “train_overfit”).

datum_names: list, default: None

Select list of datum names for synchronization (see self.select_datums(datum_names)).

requested_annotations: tuple, default: None

Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each should match the name of the directory containing those annotations, relative to the dataset root.

requested_autolabels: tuple[str], default: None

Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”

backward_context: int, default: 0

Backward context in frames [T-backward, …, T-1]

forward_context: int, default: 0

Forward context in frames [T+1, …, T+forward]

accumulation_context: dict, default None

Dictionary of datum names containing a tuple of (backward_context, forward_context) for sensor accumulation. For example, accumulation_context={‘lidar’: (3, 1)} accumulates lidar points over the past three time steps and one forward step. Only valid for lidar and radar datums.

generate_depth_from_datum: str, default: None

Datum name of the point cloud. If not None, a depth map is generated for the camera using this point cloud.

only_annotated_datums: bool, default: False

If True, only datums with annotations matching the requested annotation types are returned.

skip_missing_data: bool, default: False

If True, check for missing files and skip during datum index building.

dataset_root: str

Optional path to dataset root folder. Useful if dataset scene json is not in the same directory as the rest of the data.

transform_accumulated_box_points: bool, default: False

Flag to use cuboid pose and instance id to warp points when using lidar accumulation.

use_diskcache: bool, default: True

If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there is a large number of scenes.

autolabel_root: str, default: None

Path to autolabels if not stored inside scene root. Note this must still respect the scene structure, i.e., autolabel_root = ‘/some-autolabels’ means the autolabel scene.json is found at /some-autolabels/<scene-dir>/autolabels/my-model/scene.json.

ignore_raw_datum: Optional[list[str]], default: None

Optionally pass a list of datum types to skip loading their raw data (but still load their annotations). For example, ignore_raw_datum=[‘image’] will skip loading the image rgb data. The rgb key will be set to None. This is useful when only annotations or extrinsics are needed. Allowed values are any combination of ‘image’, ‘point_cloud’, and ‘radar_point_cloud’.

Refer to _SynchronizedDataset for remaining parameters.
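The autolabel_root description above implies a fixed directory layout for autolabel scenes. A short sketch of building that path is given here; autolabel_scene_json is a hypothetical helper, and the scene directory name scene_000 is a placeholder.

```python
import posixpath


def autolabel_scene_json(autolabel_root, scene_dir, model):
    """Build <autolabel_root>/<scene-dir>/autolabels/<model>/scene.json.

    Hypothetical helper for illustration; not part of dgp.
    """
    return posixpath.join(autolabel_root, scene_dir, "autolabels", model, "scene.json")


print(autolabel_scene_json("/some-autolabels", "scene_000", "my-model"))
# /some-autolabels/scene_000/autolabels/my-model/scene.json
```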

Module contents