Overview of 3D MNIST Dataset

The 3D MNIST dataset is available in HDF5 file format. It contains 3D point clouds, i.e., sets of (x, y, z) coordinates, generated from a portion of the original 2D MNIST dataset (around 5,000 images). The point clouds have zero mean and a maximum dimension range of 1. Each HDF5 group contains:

“points” dataset: x, y, z coordinates of each 3D point in the point cloud.

“normals” dataset: nx, ny, nz components of the unit normal associated with each point.

“img” dataset: the original MNIST image.

“label” attribute: the original MNIST label.
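To make the "zero mean, maximum dimension range of 1" property concrete, here is a minimal NumPy sketch of one way such a normalization could be done. The function name normalize_cloud is hypothetical, and reading "maximum dimension range" as the largest per-axis extent is an assumption; the dataset's own preprocessing may differ.

```python
import numpy as np

def normalize_cloud(points):
    """Center a point cloud and scale it so its largest axis extent is 1.

    Hypothetical helper: one plausible reading of the normalization
    described in the text, not the dataset's actual preprocessing code.
    """
    points = np.asarray(points, dtype=float)
    # Subtract the centroid so every coordinate has zero mean.
    centered = points - points.mean(axis=0)
    # Extent (max - min) along each of the x, y, z axes.
    ranges = centered.max(axis=0) - centered.min(axis=0)
    scale = ranges.max()
    # A degenerate cloud (all points identical) cannot be scaled.
    if scale == 0:
        return centered
    return centered / scale

# Synthetic cloud for illustration only (not taken from the dataset).
rng = np.random.default_rng(0)
cloud = rng.normal(loc=5.0, scale=[1.0, 2.0, 3.0], size=(500, 3))
normalized = normalize_cloud(cloud)
```

Dividing all three axes by the same factor keeps the shape's proportions intact, which is why only the largest extent ends up exactly equal to 1.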

In addition to the train and test point clouds, the dataset also contains full_dataset_vectors.h5, which stores 4096-D vectors obtained from voxelization of all the 3D point clouds and their randomly rotated copies with noise. full_dataset_vectors.h5 is split into 4 datasets:

>>> import h5py as h5
>>> with h5.File("full_dataset_vectors.h5", "r") as f:
...     X_train = f["X_train"][:]  # shape: (10000, 4096)
...     y_train = f["y_train"][:]  # shape: (10000,)
...     X_test = f["X_test"][:]    # shape: (2000, 4096)
...     y_test = f["y_test"][:]    # shape: (2000,)
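Since 16 × 16 × 16 = 4096, each of these flat vectors can be folded back into a cubic grid. The sketch below uses a synthetic stand-in vector rather than real data, and the axis order produced by reshape is an assumption: the order in which the dataset flattens its voxels is not stated here.

```python
import numpy as np

# Stand-in for a single row of X_train; real rows come from
# full_dataset_vectors.h5.
vector = np.arange(4096, dtype=np.float32)

# Fold the flat 4096-D vector into a 16 x 16 x 16 grid.
# NumPy's default (C) order is assumed; the dataset may use another order.
grid = vector.reshape(16, 16, 16)

# The same operation applied to a whole batch of vectors at once.
batch = np.stack([vector, vector])
grids = batch.reshape(-1, 16, 16, 16)
```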

Here is an example to read a digit and store its group content in a tuple.

>>> import h5py as h5
>>> with h5.File('train_point_clouds.h5', 'r') as f:
...     # Reading digit at zeroth index
...     a = f["0"]
...     # Storing group contents of digit a
...     digit = (a["img"][:], a["points"][:], a.attrs["label"])
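The same pattern extends to several digits at once. The sketch below collects the first n groups into a list of (img, points, label) tuples. The helper name load_digits is hypothetical, and it assumes the groups are keyed by consecutive integer strings ("0", "1", ...), as the f["0"] lookup above suggests.

```python
def load_digits(h5_file, n=15):
    """Collect (img, points, label) tuples for the first n groups.

    Hypothetical helper. `h5_file` is an open h5py.File (or any mapping
    with the same group layout); groups are assumed to be keyed by
    "0", "1", ..., which is an assumption based on the f["0"] lookup.
    """
    digits = []
    for i in range(n):
        group = h5_file[str(i)]
        digits.append((group["img"][:], group["points"][:], group.attrs["label"]))
    return digits

# Usage sketch (requires h5py and the dataset file on disk):
# import h5py
# with h5py.File("train_point_clouds.h5", "r") as f:
#     digits = load_digits(f, n=15)
```

Reading everything inside the with block matters: the [:] slices copy the data into memory, so the tuples remain usable after the file is closed.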

Let’s first visualize the contents stored in the tuple with matplotlib. The following code plots the first 15 images from the original 2D MNIST with their corresponding labels:

>>> import matplotlib.pyplot as plt
>>> # Plot some examples from original 2D-MNIST
>>> # digits: a list of (img, points, label) tuples, one per digit
>>> fig, axs = plt.subplots(3, 5, figsize=(12, 12), facecolor='w', edgecolor='k')
>>> fig.subplots_adjust(hspace=.5, wspace=.2)
>>> for ax, d in zip(axs.ravel(), digits):
...     ax.imshow(d[0][:])
...     ax.set_title("Digit: " + str(d[2]))

First 15 example images along with their labels from the 2D MNIST.

Before visualizing the 3D point clouds, let’s first discuss voxelization. Voxelization is the process of converting a geometric object from its continuous geometric representation into a set of voxels that best approximates it. The process mimics the scan conversion that rasterizes 2D geometric objects, and is also referred to as 3D scan conversion.

Voxel Grid with a single voxel shaded. Source: [5]

We use this process to fit an axis-aligned bounding box, called the Voxel Grid, around the point clouds and subdivide the grid into segments, assigning each point in the point cloud to one of the sub-boxes (known as voxels, analogous to pixels). We split the grid into 16 segments along each axis, resulting in a total of 16 × 16 × 16 = 4096 voxels, equivalent to the 4th level of an Octree.

>>> # digit[1] holds the (x, y, z) points of the tuple built earlier
>>> voxel_grid = VoxelGrid(digit[1], x_y_z=[16, 16, 16])

The code above generates a voxel grid of 16 × 16 × 16 = 4096 voxels. Its structure attribute is a 2D array in which each row corresponds to a point in the original point cloud and the columns give the index of the voxel containing that point along [x_axis, y_axis, z_axis], followed by its global voxel index.

>>> voxel_grid.structure[0]
array([ 5, 3, 7, 477])
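To illustrate how such a structure array could be produced, here is a minimal NumPy sketch of voxel assignment. It is not the VoxelGrid helper's implementation: the function name voxelize is hypothetical, the bounding box is fitted tightly per axis, and the global index uses an assumed ordering in which x varies fastest.

```python
import numpy as np

def voxelize(points, n=16):
    """Assign each point to a voxel of an n x n x n axis-aligned grid.

    Hypothetical sketch, not the dataset's VoxelGrid class. Returns an
    array with one row per point: [x_index, y_index, z_index, global].
    """
    points = np.asarray(points, dtype=float)
    mins = points.min(axis=0)
    maxs = points.max(axis=0)
    # Avoid division by zero when the cloud is flat along an axis.
    extent = np.where(maxs > mins, maxs - mins, 1.0)
    # Map coordinates to [0, n] and take the integer segment number.
    idx = np.floor((points - mins) / extent * n).astype(int)
    # Points lying exactly on the upper face belong to the last voxel.
    idx = np.clip(idx, 0, n - 1)
    # Assumed global ordering: x varies fastest, then y, then z.
    flat = idx[:, 0] + n * idx[:, 1] + n * n * idx[:, 2]
    return np.column_stack([idx, flat])

# Three synthetic points, not taken from the dataset.
demo = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]])
structure = voxelize(demo, n=16)
# Rows: [0, 0, 0, 0], [8, 8, 8, 2184], [15, 15, 15, 4095]
```

Under this assumed ordering, the per-axis indices (5, 3, 7) give 5 + 16·3 + 256·7 = 1845 on a 16-per-axis grid and 5 + 8·3 + 64·7 = 477 on an 8-per-axis grid, so the helper's actual ordering or grid size may differ from this sketch.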

The histogram shown below visualizes the number of points present within each voxel. From the plot, it can be seen that there are a lot of empty voxels. This is due to the use of a cuboid bounding box, which ensures that the Voxel Grid divides the cloud in a similar way even when the point clouds are oriented in different directions.



>>> # Get the count of points within each voxel.
>>> plt.title("DIGIT: " + str(digits[0][-1]))
>>> plt.xlabel("VOXEL")
>>> plt.ylabel("POINTS INSIDE THE VOXEL")
>>> count_plot(voxels[0].structure[:,-1])  # count_plot is a helper function defined elsewhere

Number of Points within each Voxel for a sample of digit 5.
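The counts behind a histogram like this can be sketched with NumPy alone. The block below is not the count_plot helper, whose definition is not shown here; it uses randomly generated voxel indices as a stand-in for the last column of the structure array.

```python
import numpy as np

N_VOXELS = 16 * 16 * 16  # 4096 voxels in the grid

# Synthetic stand-in for voxel_grid.structure[:, -1]: one global voxel
# index per point. Real indices would come from the voxel grid.
rng = np.random.default_rng(0)
global_ids = rng.integers(0, N_VOXELS, size=1000)

# counts[v] is the number of points that fall inside voxel v.
counts = np.bincount(global_ids, minlength=N_VOXELS)

# Fraction of voxels that contain no points at all.
empty_fraction = np.mean(counts == 0)
```

Passing minlength keeps the result aligned with the full grid, so empty voxels at the end of the index range are still counted.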

We can visualize the Voxel Grid using the built-in helper function plot() defined in the file plot3D.py provided along with the dataset. This function displays the sliced spatial views of the Voxel Grid around the z-axis.

>>> # Visualizing the Voxel Grid sliced around the z-axis.
>>> voxels[0].plot()
>>> plt.show()

Spatial View of the Voxel Grid of a sample of digit 5 sliced around the z-axis.

Now we visualize the 3D point clouds using CloudCompare, an open-source 3D point cloud software. We save the Voxel Grid structure as scalar fields of the original point clouds and render 3D images with CloudCompare.

You can visualize the point clouds with any 3D software of your choice that supports point clouds and scalar fields!
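One simple way to hand a point cloud and a per-point scalar to such software is a plain-text file with one point per line. The sketch below writes x, y, z and a fourth column holding a scalar value such as the global voxel index. The function name save_with_scalar_field and the file layout are assumptions for illustration; when opening an ASCII file, the extra column typically has to be assigned as a scalar field in the import dialog.

```python
import numpy as np

def save_with_scalar_field(path, points, scalar):
    """Write 'x y z scalar' rows to a whitespace-separated text file.

    Hypothetical helper: the layout is an assumption, not a format
    prescribed by the dataset or by any particular viewer.
    """
    points = np.asarray(points, dtype=float)
    scalar = np.asarray(scalar, dtype=float).reshape(-1, 1)
    if points.ndim != 2 or points.shape[1] != 3:
        raise ValueError("points must have shape (n, 3)")
    if points.shape[0] != scalar.shape[0]:
        raise ValueError("need exactly one scalar value per point")
    # Stack the scalar as a fourth column next to the coordinates.
    data = np.hstack([points, scalar])
    np.savetxt(path, data, fmt="%.6f", delimiter=" ")

# Usage sketch, reusing the names from the text:
# save_with_scalar_field("digit_0.txt", digit[1], voxel_grid.structure[:, -1])
```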