Coding for camera models and coordinate systems
In this section, we apply what we have learned so far to build a concrete camera model and convert between different coordinate systems, using a code example written in Python and PyTorch3D:
- First, we are going to use the mesh defined in the cube.obj file. Basically, the mesh is a box (a cube flattened along the z axis, measuring 100 x 100 x 10 units):

    mtllib ./cube.mtl
    o cube
    # Vertex list
    v -50 -50 20
    v -50 -50 10
    v -50 50 10
    v -50 50 20
    v 50 -50 20
    v 50 -50 10
    v 50 50 10
    v 50 50 20
    # Point/Line/Face list
    usemtl Door
    f 1 2 3
    f 6 5 8
    f 7 3 2
    f 4 8 5
    f 8 4 3
    f 6 2 1
    f 1 3 4
    f 6 8 7
    f 7 2 6
    f 4 5 1
    f 8 3 7
    f 6 1 5
    # End of file

The example code snippet is camera.py, which can be downloaded from the book’s GitHub repository.
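To make the structure of cube.obj concrete, the following is a minimal sketch of how its vertex and face records could be read without any library. The helper names (parse_obj, bounding_box_extent) and the OBJ_TEXT constant are hypothetical and not part of camera.py; the parser only handles the "v" and "f" records that appear in this file.

```python
# Minimal OBJ reader for illustration only; it handles just the "v" and "f"
# records used by cube.obj (no normals, texture coordinates, or materials).
OBJ_TEXT = """\
mtllib ./cube.mtl
o cube
# Vertex list
v -50 -50 20
v -50 -50 10
v -50 50 10
v -50 50 20
v 50 -50 20
v 50 -50 10
v 50 50 10
v 50 50 20
# Point/Line/Face list
usemtl Door
f 1 2 3
f 6 5 8
f 7 3 2
f 4 8 5
f 8 4 3
f 6 2 1
f 1 3 4
f 6 8 7
f 7 2 6
f 4 5 1
f 8 3 7
f 6 1 5
# End of file
"""


def parse_obj(text):
    """Return (vertices, faces); faces hold zero-based vertex indices."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if parts[0] == "v":
            vertices.append(tuple(float(p) for p in parts[1:4]))
        elif parts[0] == "f":
            # OBJ indices are one-based; "i/j/k" forms keep only the vertex index.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:4]))
    return vertices, faces


def bounding_box_extent(vertices):
    """Return the (x, y, z) side lengths of the axis-aligned bounding box."""
    return tuple(max(axis) - min(axis) for axis in zip(*vertices))


vertices, faces = parse_obj(OBJ_TEXT)
print(len(vertices), "vertices,", len(faces), "triangles")
print("extent:", bounding_box_extent(vertices))
```

The file therefore describes 8 vertices joined into 12 triangles (two per side of the box), which is what load_obj and Open3D read in the steps that follow.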
- Let us import all the modules that we need:

    import open3d
    import torch
    import pytorch3d
    from pytorch3d.io import load_obj
    from scipy.spatial.transform import Rotation
    from pytorch3d.renderer.cameras import PerspectiveCameras

- We can load and visualize the mesh by using Open3D’s draw_geometries function:

    # Load the mesh and visualize it with Open3D
    mesh_file = "cube.obj"
    print('visualizing the mesh using open3D')
    mesh = open3d.io.read_triangle_mesh(mesh_file)
    open3d.visualization.draw_geometries([mesh],
                                         mesh_show_wireframe=True,
                                         mesh_show_back_face=True)

- We define a camera variable as a PyTorch3D PerspectiveCameras object. The camera here is actually mini-batched. For example, the rotation matrix, R, is a PyTorch tensor with a shape of [8, 3, 3], which defines eight cameras, each with its own rotation matrix. The same holds for all the other camera parameters, such as image sizes, focal lengths, and principal points:

    # Define a mini-batch of 8 cameras
    image_size = torch.ones(8, 2)
    image_size[:, 0] = image_size[:, 0] * 1024
    image_size[:, 1] = image_size[:, 1] * 512
    image_size = image_size.cuda()

    focal_length = torch.ones(8, 2)
    focal_length[:, 0] = focal_length[:, 0] * 1200
    focal_length[:, 1] = focal_length[:, 1] * 300
    focal_length = focal_length.cuda()

    principal_point = torch.ones(8, 2)
    principal_point[:, 0] = principal_point[:, 0] * 512
    principal_point[:, 1] = principal_point[:, 1] * 256
    principal_point = principal_point.cuda()

    # One rotation per camera, built from Euler angles given in degrees
    R = Rotation.from_euler('zyx', [[n * 5, n, n] for n in range(-4, 4, 1)],
                            degrees=True).as_matrix()
    # SciPy returns float64; cast to float32 to match the other tensors
    R = torch.from_numpy(R).float().cuda()

    # One translation per camera, shifted along the x axis
    T = [[n, 0, 0] for n in range(-4, 4, 1)]
    T = torch.FloatTensor(T).cuda()

    camera = PerspectiveCameras(focal_length=focal_length,
                                principal_point=principal_point,
                                in_ndc=False,
                                image_size=image_size,
                                R=R,
                                T=T,
                                device='cuda')

- Once we have defined the camera variable, we can call the get_world_to_view_transform class member method to obtain a Transform3d object, world_to_view_transform. We can then use the transform_points member method to convert from world coordinates to camera view coordinates. Similarly, we can use the get_full_projection_transform member method to obtain a Transform3d object that converts from world coordinates to screen coordinates:

    world_to_view_transform = camera.get_world_to_view_transform()
    world_to_screen_transform = camera.get_full_projection_transform()

    # Load the mesh using PyTorch3D
    vertices, faces, aux = load_obj(mesh_file)
    vertices = vertices.cuda()

    world_to_view_vertices = world_to_view_transform.transform_points(vertices)
    world_to_screen_vertices = world_to_screen_transform.transform_points(vertices)
    print('world_to_view_vertices = ', world_to_view_vertices)
    print('world_to_screen_vertices = ', world_to_screen_vertices)

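The two conversions above can also be sketched by hand in NumPy, which makes the mini-batch shapes explicit. This is only an illustration under stated assumptions, not PyTorch3D's implementation: it assumes the row-vector convention X_view = X_world R + T, a textbook pinhole projection x = fx * X / Z + px, and that the hypothetical helper euler_zyx_extrinsic mirrors SciPy's lowercase 'zyx' (extrinsic) Euler convention. The library's own output may follow different axis or sign conventions, so the numbers here should not be expected to match it exactly.

```python
import numpy as np


def euler_zyx_extrinsic(angles_deg):
    """Rotation matrix for extrinsic rotations about z, then y, then x."""
    a, b, c = np.radians(angles_deg)
    rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
    ry = np.array([[np.cos(b), 0.0, np.sin(b)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(b), 0.0, np.cos(b)]])
    rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(c), -np.sin(c)],
                   [0.0, np.sin(c), np.cos(c)]])
    # Extrinsic z-y-x: the z rotation is applied first, so it sits rightmost.
    return rx @ ry @ rz


# Same parameter values as the mini-batch of 8 cameras in the text.
ns = np.arange(-4, 4)                                            # n = -4, ..., 3
R = np.stack([euler_zyx_extrinsic((5 * n, n, n)) for n in ns])   # (8, 3, 3)
T = np.stack([ns, np.zeros(8), np.zeros(8)], axis=1)             # (8, 3)
focal = np.array([1200.0, 300.0])                                # (fx, fy)
principal = np.array([512.0, 256.0])                             # (px, py)

# The eight vertices listed in cube.obj.
verts = np.array([[-50, -50, 20], [-50, -50, 10], [-50, 50, 10], [-50, 50, 20],
                  [50, -50, 20], [50, -50, 10], [50, 50, 10], [50, 50, 20]],
                 dtype=float)                                    # (8, 3)

# World -> view for every camera at once: X_view = X_world @ R + T.
view = np.einsum("vj,bjk->bvk", verts, R) + T[:, None, :]        # (8, 8, 3)

# View -> screen (x and y only): scale by focal length over depth, then shift.
depth = view[..., 2:3]
screen_xy = focal * view[..., :2] / depth + principal            # (8, 8, 2)

print("view shape:", view.shape)
print("screen shape:", screen_xy.shape)
print("camera n=0, first vertex:", screen_xy[4, 0])
```

The first axis of view and screen_xy indexes the camera and the second indexes the vertex. The camera at index 4 corresponds to n = 0, so its rotation is the identity and its translation is zero, which makes it a convenient case for checking the arithmetic by hand.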
This example shows the basic usage of PyTorch3D cameras and how they let us switch between coordinate systems with only a few method calls.