dgp.datasets package¶
dgp.datasets.base_dataset module¶
Base dataset class compliant with the TRI-ML Dataset Governance Policy (DGP), which standardizes TRI’s data formats.
Please refer to dgp/proto/dataset.proto for the exact specifications of our DGP and to dgp/proto/annotations.proto for the expected structure for annotations.
- class dgp.datasets.base_dataset.BaseDataset(dataset_metadata, scenes, datum_names, requested_annotations=None, requested_autolabels=None, split=None, autolabel_root=None, ignore_raw_datum=None)¶
Bases:
object
A base class representing a Dataset. Provides utilities for parsing and slicing DGP format datasets.
- dataset_metadata: DatasetMetadata
Dataset metadata object that encapsulates dataset-level metadata for both operating modes (scene or JSON).
- scenes: list[SceneContainer]
List of SceneContainer objects to be included in the dataset.
- datum_names: list
List of datum names (str) to be considered in the dataset.
- requested_annotations: tuple[str], default: None
Tuple of desired annotation keys, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Each key should match the name of the directory that contains those annotations under the dataset root.
- requested_autolabels: tuple[str], default: None
Tuple of annotation keys similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<autolabel_model>/<annotation_key>”.
- split: str, default: None
Split of dataset to read (“train” | “val” | “test” | “train_overfit”). If the split is None, the split type is not known and the dataset can be used for unsupervised / self-supervised learning.
- autolabel_root: str, default: None
Optional path to autolabel root directory.
- ignore_raw_datum: Optional[list[str]], default: None
Optionally pass a list of datum types to skip loading their raw data (but still load their annotations). For example, ignore_raw_datum=[‘image’] will skip loading the image rgb data. The rgb key will be set to None. This is useful when only annotations or extrinsics are needed. Allowed values are any combination of ‘image’, ‘point_cloud’, and ‘radar_point_cloud’.
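The two string formats described above (autolabel keys of the form “<autolabel_model>/<annotation_key>” and the allowed ignore_raw_datum values) can be checked before a dataset is constructed. The following is a minimal sketch; `split_autolabel_key` and `check_ignore_raw_datum` are hypothetical helpers written for illustration and are not part of dgp.

```python
# Hypothetical helpers: they mirror the documented formats, not dgp's own code.
ALLOWED_RAW_DATUM_TYPES = {"image", "point_cloud", "radar_point_cloud"}


def split_autolabel_key(key):
    """Split '<autolabel_model>/<annotation_key>' into (model, annotation_key)."""
    model, sep, annotation_key = key.partition("/")
    if not sep or not model or not annotation_key:
        raise ValueError(f"expected '<autolabel_model>/<annotation_key>', got {key!r}")
    return model, annotation_key


def check_ignore_raw_datum(ignore_raw_datum):
    """Return a list copy if every entry is an allowed datum type, else raise."""
    if ignore_raw_datum is None:
        return []
    unknown = sorted(set(ignore_raw_datum) - ALLOWED_RAW_DATUM_TYPES)
    if unknown:
        raise ValueError(f"unsupported datum types: {unknown}")
    return list(ignore_raw_datum)


print(split_autolabel_key("model_a/bounding_box_3d"))  # ('model_a', 'bounding_box_3d')
print(check_ignore_raw_datum(["image"]))               # ['image']
```

Validating these values up front is only a convenience; how dgp itself reacts to malformed keys is not covered here.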
- static get_annotations(datum)¶
- datum: Datum
Datum of type image, point cloud, etc.
- annotations: annotations_pb2
Annotation proto object corresponding to the datum.
- get_autolabels_for_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Get autolabels associated with a datum if available
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
- autolabels: dict
Map of <autolabel_model>/<annotation_key> : <annotation_path>. Returns empty dictionary if no autolabels exist for that datum.
- get_camera_calibration(calibration_key, datum_name)¶
Get camera calibration given its calibration key and datum name.
- calibration_key: str
Calibration key.
- datum_name: str
Datum name whose calibration is requested.
- camera: Camera
Calibrated camera with extrinsics/intrinsics set.
- get_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Get datum given its scene index, sample_idx_in_scene, and datum_name
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
- datum: Datum
Datum indexed at scene_idx, sample_idx_in_scene with the given datum_name.
- get_datum_pose(datum)¶
Get the ego-pose associated with datum
- datum: Datum
Datum of type image, point cloud, etc.
- datum_pose: Pose
Pose object of datum’s ego pose
- TypeError
Raised if datum type is unsupported.
- get_file_meta_from_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Get the sample file info from file datum.
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
data: OrderedDict
- “timestamp”: int
Timestamp of the image in microseconds.
- “datum_name”: str
Sensor name from which the data was collected
- “filename”: str
File name associated with the file datum.
- annotations: dict
Map from annotation key to annotation file for datum
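Because the “timestamp” field is an integer number of microseconds, it can be converted to a datetime without going through floating point. The sketch below uses a made-up record shaped like the documented return value, and it assumes the timestamp counts from the Unix epoch, which this page does not state.

```python
from collections import OrderedDict
from datetime import datetime, timedelta, timezone

# Made-up record with the documented keys; the values are illustrative only.
file_meta = OrderedDict(
    timestamp=1_600_000_000_250_000,
    datum_name="camera_01",
    filename="rgb/000.png",
)

# Assumption: microseconds are counted from the Unix epoch (not confirmed here).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_datetime(timestamp_us):
    """Convert integer microseconds to an aware UTC datetime using exact arithmetic."""
    return EPOCH + timedelta(microseconds=timestamp_us)


stamp = timestamp_to_datetime(file_meta["timestamp"])
print(stamp.isoformat())
```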
- get_image_from_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Get the sample image data from image datum.
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
data: OrderedDict
- “timestamp”: int
Timestamp of the image in microseconds.
- “datum_name”: str
Sensor name from which the data was collected
- “rgb”: PIL.Image (mode=RGB)
Image in RGB format.
- “intrinsics”: np.ndarray
Camera intrinsics if available.
- “extrinsics”: Pose
Camera extrinsics with respect to the vehicle frame, if available.
- “pose”: Pose
Pose of sensor with respect to the world/global/local frame (reference frame that is initialized at start-time). (i.e. this provides the ego-pose in pose_WC).
- annotations: dict
Map from annotation key to annotation file for datum
- get_point_cloud_from_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Get the sample lidar data from point cloud datum.
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
data: OrderedDict
- “timestamp”: int
Timestamp of the lidar in microseconds.
- “datum_name”: str
Sensor name from which the data was collected
- “extrinsics”: Pose
Sensor extrinsics with respect to the vehicle frame.
- “point_cloud”: np.ndarray (N x 3)
Point cloud in the local/world (L) frame returning X, Y and Z coordinates. The local frame is consistent across multiple timesteps in a scene.
- “extra_channels”: np.ndarray (N x M)
Remaining channels from point_cloud (e.g. lidar intensity I or pixel colors RGB).
- “pose”: Pose
Pose of sensor with respect to the world/global/local frame (reference frame that is initialized at start-time). (i.e. this provides the ego-pose in pose_WS where S refers to the point cloud sensor (S)).
- annotations: dict
Map from annotation key to annotation file for datum
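The relationship between the N x 3 “point_cloud” array and the N x M “extra_channels” array can be pictured as a column split of the raw sensor rows. This is a minimal pure-Python sketch with made-up values; the library returns np.ndarray objects, and `split_channels` is a hypothetical helper.

```python
# Made-up lidar returns: each row is [X, Y, Z, intensity].
raw_points = [
    [1.0, 2.0, 0.5, 0.9],
    [4.0, -1.0, 0.2, 0.3],
]


def split_channels(rows, num_coords=3):
    """Split rows into N x 3 coordinates and N x M remaining channels."""
    point_cloud = [row[:num_coords] for row in rows]
    extra_channels = [row[num_coords:] for row in rows]
    return point_cloud, extra_channels


point_cloud, extra_channels = split_channels(raw_points)
print(point_cloud)     # [[1.0, 2.0, 0.5], [4.0, -1.0, 0.2]]
print(extra_channels)  # [[0.9], [0.3]]
```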
- get_radar_point_cloud_from_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Get the sample radar data from radar point cloud datum.
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
data: OrderedDict
- “timestamp”: int
Timestamp of the radar point cloud in microseconds.
- “datum_name”: str
Sensor name from which the data was collected
- “extrinsics”: Pose
Sensor extrinsics with respect to the vehicle frame.
- “point_cloud”: np.ndarray (N x 3)
Point cloud in the local/world (L) frame returning X, Y and Z coordinates. The local frame is consistent across multiple timesteps in a scene.
- “velocity”: np.ndarray(N x 3)
Velocity vectors in sensor frame.
- “covariance”: np.ndarray(N x 3 x 3)
Covariance matrix of point positions in sensor frame.
- “extra_channels”: np.ndarray (N x M)
Remaining channels from radar, e.g. rcs_dbm, probability, sensor_id.
- “pose”: Pose
Pose of sensor with respect to the world/global/local frame (reference frame that is initialized at start-time). (i.e. this provides the ego-pose in pose_WS where S refers to the point cloud sensor (S)).
- annotations: dict
Map from annotation key to annotation file for datum
- get_sample(scene_idx, sample_idx_in_scene)¶
Get sample given its scene index and sample_idx_in_scene.
NOTE: Some samples may be removed during indexing. These samples will NOT be returned by this function. An unmodified list of samples can be accessed via the samples property on each SceneContainer.
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- sample: Sample
Sample indexed at scene_idx and sample_idx_in_scene.
- get_scene_metadata(scene_idx)¶
Get scene-level metadata for the scene index.
- scene_idx: int
Index of scene.
- scene_metadata: OrderedDict
Additional scene-level metadata for the dataset item at index. Note: This is used for traceability and sampling purposes.
- get_sensor_extrinsics(calibration_key, datum_name)¶
Get sensor extrinsics given its calibration key and datum name.
- calibration_key: str
Calibration key.
- datum_name: str
Datum name whose calibration is requested.
- p_WS: Pose
Extrinsics of sensor (S) with respect to the world (W)
- property image_mean¶
- property image_stddev¶
- list_datum_names_available_in_all_scenes()¶
Gets the set union of available datum names across all scenes. We assume that all samples in a scene have the same datums available.
- available_datum_names: list
DatumId.name which are available across all scenes.
- load_annotations(scene_idx, sample_idx_in_scene, datum_name)¶
Get annotations for a specified datum
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
- annotations: dict
Dictionary mapping annotation key to Annotation object for given annotation type.
- Exception
Raised if we cannot load an annotation type due to not finding an ontology for a requested annotation.
- load_datum(scene_idx, sample_idx_in_scene, datum_name)¶
Load a datum given a sample and a datum name
- scene_idx: int
Index of the scene.
- sample_idx_in_scene: int
Index of the sample within the scene at scene_idx.
- datum_name: str
Name of the datum within sample
- datum: parsed datum type
For different datums, we return different types. For image types, we return a PIL.Image. For point cloud types, we return a numpy float64 array.
- TypeError
Raised if the datum type is unsupported.
- Exception
Raised if datum’s filename is not supported
- property metadata_index¶
Builds an index of metadata items keyed by scene index and sample index.
- metadata_index: dict
Dictionary of metadata for tuple key (scene_idx, sample_idx_in_scene) returning a dictionary of additional metadata information for the requested sample.
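A dictionary keyed by (scene_idx, sample_idx_in_scene) tuples lends itself to simple filtering for sampling purposes. The sketch below uses invented metadata fields (“weather”, “scene_name”); only the key shape comes from the description above, and `samples_matching` is a hypothetical helper.

```python
# Invented metadata values; only the tuple key shape follows the documentation.
metadata_index = {
    (0, 0): {"scene_name": "scene_000", "weather": "sunny"},
    (0, 1): {"scene_name": "scene_000", "weather": "sunny"},
    (1, 0): {"scene_name": "scene_001", "weather": "rain"},
}


def samples_matching(index, **criteria):
    """Return sorted (scene_idx, sample_idx_in_scene) keys whose metadata match."""
    return sorted(
        key
        for key, meta in index.items()
        if all(meta.get(name) == value for name, value in criteria.items())
    )


print(samples_matching(metadata_index, weather="sunny"))  # [(0, 0), (0, 1)]
```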
- class dgp.datasets.base_dataset.DatasetMetadata(scenes, directory, ontology_table=None)¶
Bases:
object
A wrapper Dataset metadata class to support two entrypoints for datasets (reading from dataset.json OR from a scene_dataset.json). Aggregates statistics and ontology_table when constructing a DatasetMetadata object for SceneDataset.
- scenes: list[SceneContainer]
List of SceneContainer objects to be included in the dataset.
- directory: str
Directory of dataset.
- ontology_table: dict, default: None
A dictionary mapping annotation key(s) to Ontology(s), e.g.: {
“bounding_box_2d”: BoundingBoxOntology[<ontology_sha>],
“autolabel_model_1/bounding_box_2d”: BoundingBoxOntology[<ontology_sha>],
“semantic_segmentation_2d”: SemanticSegmentationOntology[<ontology_sha>]
}
- classmethod from_scene_containers(scene_containers, requested_annotations=None, requested_autolabels=None, autolabel_root=None)¶
Load DatasetMetadata from Scene Dataset JSON.
- scene_containers: list of SceneContainer
List of SceneContainer objects.
- requested_annotations: List(str)
List of annotations, such as [‘bounding_box_3d’, ‘bounding_box_2d’]
- requested_autolabels: List(str)
List of autolabels, such as [‘model_a/bounding_box_3d’, ‘model_a/bounding_box_2d’]
- autolabel_root: str, optional
Optional path to autolabel root directory. Default: None.
- Exception
Raised if an ontology in a scene has no corresponding implementation yet.
- static get_dataset_splits(dataset_json)¶
Get a list of splits in the dataset.json.
- dataset_json: str
Full path to the dataset json holding dataset metadata, ontology, and image and annotation paths.
- dataset_splits: list of str
List of dataset splits (train | val | test | train_overfit).
- property metadata¶
- class dgp.datasets.base_dataset.SceneContainer(scene_path, directory=None, autolabeled_scenes=None, is_datums_synchronized=False, use_diskcache=True, skip_missing_data=False)¶
Bases:
object
Object-oriented container for assembling datasets from collections of scenes. Each scene is fully described within a sub-directory with an associated scene.json file.
This class also provides functionality for reinjecting autolabeled scenes into other scenes.
- SCENE_CACHE = <diskcache.core.Cache object>¶
- property annotation_index¶
Build 2D boolean DataArray for annotations. Rows correspond to the datum_idx_in_scene and columns correspond to requested annotation types.
For example:
+----+-------------------+-------------------+-----+
|    | "bounding_box_2d" | "bounding_box_3d" | ... |
+----+-------------------+-------------------+-----+
| 0  | False             | True              |     |
| 1  | True              | True              |     |
| 2  | False             | False             |     |
| .. | ..                | ..                |     |
+----+-------------------+-------------------+-----+
- scene_annotation_index: xr.DataArray
Boolean index of annotations for this scene.
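A boolean index of this shape can be used to pick out the datums that carry requested annotations. The sketch below models the example rows with plain dictionaries instead of an xr.DataArray. Whether the library requires any or all of the requested types to be present is not stated on this page, so the hypothetical `datums_with` helper exposes both.

```python
# Rows are datum_idx_in_scene, columns are annotation types (values from the example table).
annotation_index = {
    0: {"bounding_box_2d": False, "bounding_box_3d": True},
    1: {"bounding_box_2d": True, "bounding_box_3d": True},
    2: {"bounding_box_2d": False, "bounding_box_3d": False},
}


def datums_with(index, requested, require_all=False):
    """Return datum indices having any (or all) of the requested annotation types."""
    combine = all if require_all else any
    return [idx for idx, row in index.items() if combine(row[key] for key in requested)]


requested = ("bounding_box_2d", "bounding_box_3d")
print(datums_with(annotation_index, requested))                    # [0, 1]
print(datums_with(annotation_index, requested, require_all=True))  # [1]
```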
- property autolabels¶
Associate autolabels to datums. Iterate through datums, and if that datum has a corresponding autolabel, add it to the autolabel object. Example resulting autolabel map is: {
<datum_hash>: {
<autolabel_key>: <autolabeled_annotation> …
}
}
- cache_dir = '/home/runner/.dgp/cache/dgp_diskcache_21323'¶
- cache_suffix = '21323'¶
- property calibration_files¶
Returns the calibration index for a scene.
- calibration_table: dict
Maps (calibration_key, datum_key) -> (p_WS, Camera)
For example: (p_WS, Camera) = self.calibration_table[(calibration_key, datum_name)]
- check_datum_file(datum_idx_in_scene)¶
Checks if datum file exists.
- datum_idx_in_scene: int
Index of datum in this scene.
- bool
True if datum file exists.
- TypeError
Raised if the referenced datum has an unsupported type.
- check_files()¶
Checks if scene and calibration files exist.
- bool
True if scene and calibration files exist.
- property data¶
Returns the scene data.
- property datum_index¶
Build a multidimensional DataArray to represent a scene. Rows correspond to samples, and columns correspond to datums. The value at each location is the datum_idx_in_scene, which can be used to directly fetch the desired datum given a sample index and datum name.
For example:
+----+-------------+-------------+---------+-----+
|    | "camera_01" | "camera_02" | "lidar" | ... |
+----+-------------+-------------+---------+-----+
| 0  | 0           | 1           | 2       |     |
| 1  | 9           | 10          | 11      |     |
| 2  | 18          | 19          | 20      |     |
| .. | ..          | ..          |         |     |
+----+-------------+-------------+---------+-----+
- scene_datum_index: xr.DataArray
2D index describing samples and datums
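The lookup that this index enables, from a sample index and a datum name to a datum_idx_in_scene, can be sketched with nested lists in place of an xr.DataArray. The values are the ones from the example table; `lookup_datum_idx` is a hypothetical helper.

```python
# Column labels and rows taken from the example table (rows are samples).
datum_names = ["camera_01", "camera_02", "lidar"]
datum_index_rows = [
    [0, 1, 2],
    [9, 10, 11],
    [18, 19, 20],
]


def lookup_datum_idx(sample_idx_in_scene, datum_name):
    """Return datum_idx_in_scene for the given sample row and datum column."""
    column = datum_names.index(datum_name)  # raises ValueError for unknown names
    return datum_index_rows[sample_idx_in_scene][column]


print(lookup_datum_idx(1, "lidar"))      # 11
print(lookup_datum_idx(2, "camera_01"))  # 18
```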
- property datum_names¶
Gets the list of datum names available within a scene.
- get_autolabels(sample_idx_in_scene, datum_name)¶
Get autolabels associated with a datum if available
- sample_idx_in_scene: int
Index of the sample within the scene.
- datum_name: str
Name of the datum within sample
- autolabels: dict
Map of <autolabel_model>/<annotation_key> : <annotation_path>. Returns empty dictionary if no autolabels exist for that datum.
- get_datum(sample_idx_in_scene, datum_name)¶
Get datum given its sample_idx_in_scene and the datum name.
- sample_idx_in_scene: int
Index of the sample within the scene.
- datum_name: str
Name of the datum within sample
- datum: Datum
Datum at sample_idx_in_scene and datum_name for the scene.
- get_datum_type(datum_name)¶
Get datum type based on the datum name
- datum_name: str
The name of the datum to find a type for.
- get_sample(sample_idx_in_scene)¶
Get sample given its sample_idx_in_scene.
NOTE: Some samples may be removed during indexing. These samples will NOT be returned by this function. An unmodified list of samples can be accessed via the samples property on each SceneContainer.
- sample_idx_in_scene: int
Index of the sample within the scene.
- sample: Sample
Sample indexed at sample_idx_in_scene for the scene.
- property metadata_index¶
Helper for building metadata index.
TODO: Need to verify that the hashes are unique, and these lru-cached properties are consistent across disk-cached reads.
- property ontology_files¶
Returns the ontology files for a scene.
- ontology_files: dict
Maps annotation_key -> filename
For example: filename = scene.ontology_files[‘bounding_box_2d’]
- random_str = '21323'¶
- property samples¶
Returns the scene samples.
- property scene¶
Returns scene.
- If self.use_diskcache is True: returns the cached _scene if available, otherwise loads the scene and caches it.
- If self.use_diskcache is False: returns _scene in memory if the instance has attribute _scene, otherwise loads the scene and saves it in memory.
NOTE: Setting use_diskcache to False may exhaust memory if there are a large number of scenes.
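The two caching branches described for the scene property can be sketched as follows. This is an illustration only: `LazyScene` is a hypothetical class, a plain dictionary stands in for the on-disk cache, and the loader is a placeholder that counts its calls.

```python
class LazyScene:
    """Hypothetical stand-in for the caching behaviour described for `scene`."""

    _DISK_CACHE = {}  # plain dict standing in for an on-disk cache

    def __init__(self, scene_path, use_diskcache=True):
        self.scene_path = scene_path
        self.use_diskcache = use_diskcache
        self.load_count = 0

    def _load(self):
        # Placeholder for parsing scene.json; counts calls so caching is visible.
        self.load_count += 1
        return {"path": self.scene_path}

    @property
    def scene(self):
        if self.use_diskcache:
            # Shared cache branch: load once per path, then reuse.
            if self.scene_path not in self._DISK_CACHE:
                self._DISK_CACHE[self.scene_path] = self._load()
            return self._DISK_CACHE[self.scene_path]
        # In-memory branch: keep the loaded scene on the instance.
        if not hasattr(self, "_scene"):
            self._scene = self._load()
        return self._scene


in_memory = LazyScene("scene_000/scene.json", use_diskcache=False)
in_memory.scene
in_memory.scene
print(in_memory.load_count)  # 1
```

In the in-memory branch every instance holds its own copy of the scene, which is why memory use grows with the number of scenes.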
- select_datums(datum_names, requested_annotations=None, requested_autolabels=None)¶
Select a set of datums by name to be used in the scene.
- datum_names: list
List of datum names to be used for instance of dataset
- requested_annotations: tuple, optional
Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Should be equivalent to the directory containing the annotations under the dataset root. Default: None.
- requested_autolabels: tuple[str], optional
Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”. Default: None.
- ValueError
Raised if datum_names is not a list or tuple or if it is a sequence with no elements.
dgp.datasets.frame_dataset module¶
Dataset for handling frame-level (unordered) data for unsupervised, self-supervised and supervised tasks. This dataset is compliant with the TRI-ML Dataset Governance Policy (DGP).
Please refer to dgp/proto/dataset.proto for the exact specifications of the DGP.
- class dgp.datasets.frame_dataset.FrameScene(scene_json, datum_names=None, requested_annotations=None, requested_autolabels=None, only_annotated_datums=False, use_diskcache=True, skip_missing_data=False)¶
Bases:
_FrameDataset
Main entry-point for single-modality dataset using a single scene JSON as input.
NOTE: This class can be used to introspect a single scene given a scene directory with its associated scene JSON.
- scene_json: str
Full path to the scene json.
- datum_names: list, default: None
Select datums for which to build index (see self.select_datums(datum_names)). NOTE: All selected datums must be of the same datum type!
- requested_annotations: tuple, default: None
Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Should be equivalent to the directory containing the annotations under the dataset root.
- requested_autolabels: tuple[str], default: None
Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”.
- only_annotated_datums: bool, default: False
If True, only datums with annotations matching the requested annotation types are returned.
- use_diskcache: bool, default: True
If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there are a large number of scenes.
- skip_missing_data: bool, default: False
If True, check for missing files and skip during datum index building.
- class dgp.datasets.frame_dataset.FrameSceneDataset(scene_dataset_json, split='train', datum_names=None, requested_annotations=None, requested_autolabels=None, only_annotated_datums=False, use_diskcache=True, skip_missing_data=False)¶
Bases:
_FrameDataset
Main entry-point for single-modality dataset. Used for tasks with unordered data, i.e. 2D detection.
- scene_dataset_json: str
Full path to the scene dataset json holding collections of paths to scene json.
- split: str, default: ‘train’
Split of dataset to read (“train” | “val” | “test” | “train_overfit”).
- datum_names: list, default: None
Select datums for which to build index (see self.select_datums(datum_names)). NOTE: All selected datums must be of the same datum type!
- requested_annotations: tuple, default: None
Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Should be equivalent to the directory containing the annotations under the dataset root.
- requested_autolabels: tuple[str], default: None
Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”.
- only_annotated_datums: bool, default: False
If True, only datums with annotations matching the requested annotation types are returned.
- use_diskcache: bool, default: True
If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there are a large number of scenes.
- skip_missing_data: bool, default: False
If True, check for missing files and skip during datum index building.
dgp.datasets.pd_dataset module¶
- class dgp.datasets.pd_dataset.ParallelDomainScene(scene_json, datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, generate_depth_from_datum=None, only_annotated_datums=False, use_virtual_camera_datums=True, skip_missing_data=False, accumulation_context=None, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None)¶
Bases:
_ParallelDomainDataset
Refer to SynchronizedScene for parameters.
- class dgp.datasets.pd_dataset.ParallelDomainSceneDataset(scene_dataset_json, split='train', datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, generate_depth_from_datum=None, only_annotated_datums=False, use_virtual_camera_datums=True, skip_missing_data=False, accumulation_context=None, dataset_root=None, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None)¶
Bases:
_ParallelDomainDataset
Refer to SynchronizedSceneDataset for parameters.
dgp.datasets.synchronized_dataset module¶
Dataset for handling synchronized multi-modal samples for unsupervised, self-supervised and supervised tasks. This dataset is compliant with the TRI-ML Dataset Governance Policy (DGP).
Please refer to dgp/proto/dataset.proto for the exact specifications of the DGP.
- class dgp.datasets.synchronized_dataset.SynchronizedScene(scene_json, datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, accumulation_context=None, generate_depth_from_datum=None, only_annotated_datums=False, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None, ignore_raw_datum=None)¶
Bases:
_SynchronizedDataset
Main entry-point for multi-modal dataset with sample-level synchronization using a single scene JSON as input.
Note: This class can be used to introspect a single scene given a scene directory with its associated scene JSON.
- scene_json: str
Full path to the scene json.
- datum_names: list, default: None
Select list of datum names for synchronization (see self.select_datums(datum_names)).
- requested_annotations: tuple, default: None
Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Should be equivalent to the directory containing the annotations under the dataset root.
- requested_autolabels: tuple[str], default: None
Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”.
- backward_context: int, default: 0
Backward context in frames [T-backward, …, T-1]
- forward_context: int, default: 0
Forward context in frames [T+1, …, T+forward]
- accumulation_context: dict, default None
Dictionary of datum names containing a tuple of (backward_context, forward_context) for sensor accumulation. For example, accumulation_context={‘lidar’: (3, 1)} accumulates lidar points over the past three time steps and one forward step. Only valid for lidar and radar datums.
- generate_depth_from_datum: str, default: None
Datum name of the point cloud. If not None, the depth map will be generated for the camera using the desired point cloud.
- only_annotated_datums: bool, default: False
If True, only datums with annotations matching the requested annotation types are returned.
- transform_accumulated_box_points: bool, default: False
Flag to use cuboid pose and instance id to warp points when using lidar accumulation.
- use_diskcache: bool, default: True
If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there are a large number of scenes.
- autolabel_root: str, default: None
Path to autolabels if not stored inside scene root. Note this must still respect the scene structure, i.e., autolabel_root = ‘/some-autolabels’ means the autolabel scene.json is found at /some-autolabels/<scene-dir>/autolabels/my-model/scene.json.
- ignore_raw_datum: Optional[list[str]], default: None
Optionally pass a list of datum types to skip loading their raw data (but still load their annotations). For example, ignore_raw_datum=[‘image’] will skip loading the image rgb data. The rgb key will be set to None. This is useful when only annotations or extrinsics are needed. Allowed values are any combination of ‘image’, ‘point_cloud’, and ‘radar_point_cloud’.
Refer to _SynchronizedDataset for remaining parameters.
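The frame windows implied by backward_context and forward_context, and by the per-datum tuples in accumulation_context, can be written out explicitly. The sketch below is an illustration rather than the library's implementation. `context_indices` and `valid_targets` are hypothetical helpers, and the second one assumes that samples without a full context window are dropped, which this page does not confirm.

```python
def context_indices(t, backward=0, forward=0):
    """Sample indices [T-backward, ..., T-1, T, T+1, ..., T+forward]."""
    return list(range(t - backward, t + forward + 1))


def valid_targets(num_samples, backward=0, forward=0):
    """Target indices T whose whole window fits inside the scene (assumed policy)."""
    return list(range(backward, num_samples - forward))


print(context_indices(5, backward=2, forward=1))  # [3, 4, 5, 6]
print(valid_targets(5, backward=1, forward=1))    # [1, 2, 3]

# The same window arithmetic applied to an accumulation tuple for one datum.
accumulation_context = {"lidar": (3, 1)}
print(context_indices(5, *accumulation_context["lidar"]))  # [2, 3, 4, 5, 6]
```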
- class dgp.datasets.synchronized_dataset.SynchronizedSceneDataset(scene_dataset_json, split='train', datum_names=None, requested_annotations=None, requested_autolabels=None, backward_context=0, forward_context=0, accumulation_context=None, generate_depth_from_datum=None, only_annotated_datums=False, skip_missing_data=False, dataset_root=None, transform_accumulated_box_points=False, use_diskcache=True, autolabel_root=None, ignore_raw_datum=None)¶
Bases:
_SynchronizedDataset
Main entry-point for multi-modal dataset with sample-level synchronization using scene directories as input.
Note: This class is primarily used for self-supervised learning tasks where the default mode of operation is learning from a collection of scene directories.
- scene_dataset_json: str
Full path to the scene dataset json holding collections of paths to scene json.
- split: str, default: ‘train’
Split of dataset to read (“train” | “val” | “test” | “train_overfit”).
- datum_names: list, default: None
Select list of datum names for synchronization (see self.select_datums(datum_names)).
- requested_annotations: tuple, default: None
Tuple of annotation types, e.g. (‘bounding_box_2d’, ‘bounding_box_3d’). Should be equivalent to the directory containing the annotations under the dataset root.
- requested_autolabels: tuple[str], default: None
Tuple of annotation types similar to requested_annotations, but associated with a particular autolabeling model. Expected format is “<model_id>/<annotation_type>”.
- backward_context: int, default: 0
Backward context in frames [T-backward, …, T-1]
- forward_context: int, default: 0
Forward context in frames [T+1, …, T+forward]
- accumulation_context: dict, default None
Dictionary of datum names containing a tuple of (backward_context, forward_context) for sensor accumulation. For example, accumulation_context={‘lidar’: (3, 1)} accumulates lidar points over the past three time steps and one forward step. Only valid for lidar and radar datums.
- generate_depth_from_datum: str, default: None
Datum name of the point cloud. If not None, the depth map will be generated for the camera using the desired point cloud.
- only_annotated_datums: bool, default: False
If True, only datums with annotations matching the requested annotation types are returned.
- skip_missing_data: bool, default: False
If True, check for missing files and skip during datum index building.
- dataset_root: str, default: None
Optional path to dataset root folder. Useful if dataset scene json is not in the same directory as the rest of the data.
- transform_accumulated_box_points: bool, default: False
Flag to use cuboid pose and instance id to warp points when using lidar accumulation.
- use_diskcache: bool, default: True
If True, cache ScenePb2 object using diskcache. If False, save the object in memory. NOTE: Setting use_diskcache to False may exhaust memory if there are a large number of scenes.
- autolabel_root: str, default: None
Path to autolabels if not stored inside scene root. Note this must still respect the scene structure, i.e., autolabel_root = ‘/some-autolabels’ means the autolabel scene.json is found at /some-autolabels/<scene-dir>/autolabels/my-model/scene.json.
- ignore_raw_datum: Optional[list[str]], default: None
Optionally pass a list of datum types to skip loading their raw data (but still load their annotations). For example, ignore_raw_datum=[‘image’] will skip loading the image rgb data. The rgb key will be set to None. This is useful when only annotations or extrinsics are needed. Allowed values are any combination of ‘image’, ‘point_cloud’, and ‘radar_point_cloud’.
Refer to _SynchronizedDataset for remaining parameters.
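A construction call using the keyword names from the signature above might look like the following sketch. The JSON path and the datum names are placeholders, and the particular combination of options is an assumption chosen for illustration. The import is deferred inside `build_dataset` so that the argument dictionary can be inspected without dgp installed.

```python
def synchronized_dataset_kwargs(scene_dataset_json):
    """Collect keyword arguments named in the documented signature."""
    return {
        "scene_dataset_json": scene_dataset_json,
        "split": "train",
        "datum_names": ["camera_01", "lidar"],  # placeholder datum names
        "requested_annotations": ("bounding_box_3d",),
        "backward_context": 1,
        "forward_context": 1,
        "ignore_raw_datum": ["image"],  # keep annotations, skip loading rgb
    }


def build_dataset(scene_dataset_json):
    """Construct the dataset; requires dgp and a real scene dataset JSON."""
    # Deferred import: only needed when the dataset is actually built.
    from dgp.datasets.synchronized_dataset import SynchronizedSceneDataset

    return SynchronizedSceneDataset(**synchronized_dataset_kwargs(scene_dataset_json))


# Placeholder path; replace with a real scene dataset JSON before calling build_dataset.
kwargs = synchronized_dataset_kwargs("/path/to/scene_dataset.json")
print(sorted(kwargs))
```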