MultiViewPhotometricLoss

class packnet_sfm.losses.multiview_photometric_loss.MultiViewPhotometricLoss(num_scales=4, ssim_loss_weight=0.85, occ_reg_weight=0.1, smooth_loss_weight=0.1, C1=0.0001, C2=0.0009, photometric_reduce_op='mean', disp_norm=True, clip_loss=0.5, progressive_scaling=0.0, padding_mode='zeros', automask_loss=False, **kwargs)[source]

Bases: packnet_sfm.losses.loss_base.LossBase

Self-supervised multiview photometric loss. It takes two images, a depth map and a pose transformation, produces a reconstruction of one image from the perspective of the other, and calculates the difference between them.

Parameters
  • num_scales (int) – Number of inverse depth map scales to consider

  • ssim_loss_weight (float) – Weight for the SSIM loss

  • occ_reg_weight (float) – Weight for the occlusion regularization loss

  • smooth_loss_weight (float) – Weight for the smoothness loss

  • C1,C2 (float) – SSIM parameters

  • photometric_reduce_op (str) – Method to reduce the photometric loss

  • disp_norm (bool) – True if inverse depth maps are normalized (divided by their mean) before use

  • clip_loss (float) – Threshold for photometric loss clipping

  • progressive_scaling (float) – Training percentage for progressive scaling (0.0 to disable)

  • padding_mode (str) – Padding mode for view synthesis

  • automask_loss (bool) – True if automasking is enabled for the photometric loss

  • kwargs (dict) – Extra parameters
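As an illustration of one of these options, `clip_loss` can be read as an outlier clamp on the per-pixel photometric error. The mean + clip_loss · std threshold below is a common formulation sketched here for illustration, not a quote of the library's code:

```python
import torch

def clip_photometric_loss(loss, clip_loss=0.5):
    """Clamp outlier per-pixel photometric errors (sketch; assumes the
    usual mean + clip_loss * std threshold)."""
    if clip_loss > 0.0:
        mean, std = loss.mean(), loss.std()
        # Errors far above the batch statistics are capped, so occlusions
        # and moving objects do not dominate the gradient
        loss = torch.clamp(loss, max=float(mean + clip_loss * std))
    return loss
```

With `clip_loss=0.0` the error is returned unchanged, matching the "0.0 to disable" convention used elsewhere in this class.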

SSIM(x, y, kernel_size=3)[source]

Calculates the SSIM (Structural SIMilarity) loss

Parameters
  • x,y (torch.Tensor [B,3,H,W]) – Input images

  • kernel_size (int) – Convolution kernel size

Returns

ssim – SSIM loss

Return type

torch.Tensor [1]

calc_photometric_loss(t_est, images)[source]

Calculates the photometric loss (L1 + SSIM)

Parameters
  • t_est (list of torch.Tensor [B,3,H,W]) – List of warped reference images in multiple scales

  • images (list of torch.Tensor [B,3,H,W]) – List of original images in multiple scales

Returns

photometric_loss – Photometric loss

Return type

torch.Tensor [1]
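The L1 + SSIM combination used by self-supervised depth methods typically weights a clamped SSIM term against a per-pixel L1 term via `ssim_loss_weight`. The following is a self-contained sketch of that formulation, not the library's exact code:

```python
import torch
import torch.nn.functional as F

def ssim_map(x, y, C1=1e-4, C2=9e-4):
    # Local statistics via 3x3 average pooling (sketch)
    pool = lambda t: F.avg_pool2d(t, 3, stride=1, padding=1)
    mu_x, mu_y = pool(x), pool(y)
    s_x = pool(x * x) - mu_x ** 2
    s_y = pool(y * y) - mu_y ** 2
    s_xy = pool(x * y) - mu_x * mu_y
    return ((2 * mu_x * mu_y + C1) * (2 * s_xy + C2) /
            ((mu_x ** 2 + mu_y ** 2 + C1) * (s_x + s_y + C2)))

def calc_photometric_loss(t_est, images, ssim_loss_weight=0.85):
    """Per-scale, per-pixel photometric error: weighted SSIM + L1 (sketch)."""
    losses = []
    for est, img in zip(t_est, images):
        l1 = (est - img).abs().mean(1, keepdim=True)
        # SSIM similarity is turned into a distance in [0, 1]
        ssim = torch.clamp((1.0 - ssim_map(est, img)) / 2.0, 0.0, 1.0)
        ssim = ssim.mean(1, keepdim=True)
        losses.append(ssim_loss_weight * ssim + (1.0 - ssim_loss_weight) * l1)
    return losses
```

Identical inputs give a zero error map at every scale, which is a quick sanity check for the weighting.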

calc_smoothness_loss(inv_depths, images)[source]

Calculates the smoothness loss for inverse depth maps.

Parameters
  • inv_depths (list of torch.Tensor [B,1,H,W]) – Predicted inverse depth maps for all scales

  • images (list of torch.Tensor [B,3,H,W]) – Original images for all scales

Returns

smoothness_loss – Smoothness loss

Return type

torch.Tensor [1]
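A sketch of the usual edge-aware smoothness term this method computes: inverse-depth gradients are penalized, down-weighted where the image itself has strong gradients. The mean normalization mirrors the `disp_norm` option; the per-scale 1/2^i weighting is an assumption borrowed from common multi-scale formulations:

```python
import torch

def smoothness_loss(inv_depths, images):
    """Edge-aware first-order smoothness over all scales (sketch)."""
    total = 0.0
    for i, (d, img) in enumerate(zip(inv_depths, images)):
        # Mean-normalize inverse depth (cf. the disp_norm option)
        d = d / (d.mean(dim=(2, 3), keepdim=True) + 1e-7)
        # First-order gradients of inverse depth and image
        ddx = (d[..., :, 1:] - d[..., :, :-1]).abs()
        ddy = (d[..., 1:, :] - d[..., :-1, :]).abs()
        idx = (img[..., :, 1:] - img[..., :, :-1]).abs().mean(1, keepdim=True)
        idy = (img[..., 1:, :] - img[..., :-1, :]).abs().mean(1, keepdim=True)
        # Down-weight depth gradients where the image has edges,
        # with lower weight for coarser scales (assumed 1 / 2**i)
        total = total + ((ddx * torch.exp(-idx)).mean() +
                         (ddy * torch.exp(-idy)).mean()) / 2 ** i
    return total
```

A constant inverse depth map incurs zero smoothness loss regardless of the image content.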

forward(image, context, inv_depths, K, ref_K, poses, return_logs=False, progress=0.0)[source]

Calculates training photometric loss.

Parameters
  • image (torch.Tensor [B,3,H,W]) – Original image

  • context (list of torch.Tensor [B,3,H,W]) – Context containing a list of reference images

  • inv_depths (list of torch.Tensor [B,1,H,W]) – Predicted inverse depth maps for the original image, in all scales

  • K (torch.Tensor [B,3,3]) – Original camera intrinsics

  • ref_K (torch.Tensor [B,3,3]) – Reference camera intrinsics

  • poses (list of Pose) – Camera transformation between original and context

  • return_logs (bool) – True if logs are saved for visualization

  • progress (float) – Training percentage

Returns

losses_and_metrics – Output dictionary

Return type

dict

property logs

Returns class logs.

reduce_photometric_loss(photometric_losses)[source]

Combines the photometric losses from all context images

Parameters

photometric_losses (list of torch.Tensor [B,3,H,W]) – Pixel-wise photometric losses from the entire context

Returns

photometric_loss – Reduced photometric loss

Return type

torch.Tensor [1]
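A sketch of how `photometric_reduce_op` might combine the per-context error maps (an independent illustration, not the library's code): `'min'` keeps the smallest error over the context per pixel, the minimum-reprojection idea from Monodepth2, while `'mean'` averages them.

```python
import torch

def reduce_photometric_loss(photometric_losses, photometric_reduce_op='mean'):
    """Reduce per-context, per-pixel photometric maps to a scalar (sketch)."""
    stacked = torch.stack(photometric_losses, dim=1)  # [B, n_ctx, ...]
    if photometric_reduce_op == 'min':
        # Per pixel, keep the context image that explains it best
        per_pixel = stacked.min(dim=1).values
    elif photometric_reduce_op == 'mean':
        per_pixel = stacked.mean(dim=1)
    else:
        raise ValueError(f'Unknown reduce op: {photometric_reduce_op}')
    return per_pixel.mean()
```

The `'min'` reduction is more robust to occlusions: a pixel occluded in one context image usually remains visible in another.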

warp_ref_image(inv_depths, ref_image, K, ref_K, pose)[source]

Warps a reference image to produce a reconstruction of the original one.

Parameters
  • inv_depths (torch.Tensor [B,1,H,W]) – Inverse depth map of the original image

  • ref_image (torch.Tensor [B,3,H,W]) – Reference RGB image

  • K (torch.Tensor [B,3,3]) – Original camera intrinsics

  • ref_K (torch.Tensor [B,3,3]) – Reference camera intrinsics

  • pose (Pose) – Original -> Reference camera transformation

Returns

ref_warped – Warped reference image (reconstructing the original one)

Return type

torch.Tensor [B,3,H,W]
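The view synthesis behind this method can be sketched as the standard inverse-warping pipeline: backproject pixels with the predicted depth, move them into the reference frame, project with the reference intrinsics, and bilinearly sample. Below, `T` is assumed to be a [B,4,4] original-to-reference matrix; the library wraps this in its own Pose class, and its exact implementation may differ:

```python
import torch
import torch.nn.functional as F

def warp_ref_image(inv_depth, ref_image, K, ref_K, T):
    """Inverse-warp ref_image into the original camera's frame (sketch)."""
    B, _, H, W = inv_depth.shape
    depth = 1.0 / inv_depth.clamp(min=1e-6)

    # Homogeneous pixel grid: [B, 3, H*W]
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype),
        torch.arange(W, dtype=depth.dtype), indexing='ij')
    ones = torch.ones_like(xs)
    pix = torch.stack([xs, ys, ones], 0).view(1, 3, -1).expand(B, -1, -1)

    # Backproject to 3D points, then move them to the reference frame
    cam = torch.linalg.inv(K) @ pix * depth.view(B, 1, -1)
    cam_h = torch.cat([cam, torch.ones(B, 1, H * W, dtype=cam.dtype)], 1)
    ref_cam = (T @ cam_h)[:, :3]

    # Project with the reference intrinsics, normalize to [-1, 1]
    proj = ref_K @ ref_cam
    xy = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
    xn = 2 * xy[:, 0] / (W - 1) - 1
    yn = 2 * xy[:, 1] / (H - 1) - 1
    grid = torch.stack([xn, yn], dim=-1).view(B, H, W, 2)

    # Bilinear sampling; padding_mode matches the class option
    return F.grid_sample(ref_image, grid, padding_mode='zeros',
                         align_corners=True)
```

With an identity pose and identical intrinsics, the warp reduces to sampling each pixel at its own location, so the output reproduces the reference image.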

packnet_sfm.losses.multiview_photometric_loss.SSIM(x, y, C1=0.0001, C2=0.0009, kernel_size=3, stride=1)[source]

Structural SIMilarity (SSIM) distance between two images.

Parameters
  • x,y (torch.Tensor [B,3,H,W]) – Input images

  • C1,C2 (float) – SSIM parameters

  • kernel_size,stride (int) – Convolutional parameters

Returns

ssim – SSIM distance

Return type

torch.Tensor [1]
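A self-contained sketch of this function's signature, using the common average-pooling formulation of SSIM (an independent reimplementation for illustration, not the library's code):

```python
import torch
import torch.nn.functional as F

def ssim(x, y, C1=1e-4, C2=9e-4, kernel_size=3, stride=1):
    """Structural similarity map between two images (sketch)."""
    pad = kernel_size // 2
    # Reflection-pad so a stride-1 call preserves the input resolution
    x = F.pad(x, (pad, pad, pad, pad), mode='reflect')
    y = F.pad(y, (pad, pad, pad, pad), mode='reflect')
    pool = lambda t: F.avg_pool2d(t, kernel_size, stride)

    # Local means, variances, and covariance via average pooling
    mu_x, mu_y = pool(x), pool(y)
    sigma_x = pool(x * x) - mu_x ** 2
    sigma_y = pool(y * y) - mu_y ** 2
    sigma_xy = pool(x * y) - mu_x * mu_y

    num = (2 * mu_x * mu_y + C1) * (2 * sigma_xy + C2)
    den = (mu_x ** 2 + mu_y ** 2 + C1) * (sigma_x + sigma_y + C2)
    # A similarity in (0, 1]; a loss typically uses clamp((1 - ssim) / 2, 0, 1)
    return num / den
```

Identical images yield a similarity of 1 at every pixel, and at stride 1 the output keeps the input's spatial dimensions.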