Engineers Unlock Remarkable 3D Vision from Ordinary Digital Camera Technology

This story is a press release written by Optica.

Modern digital cameras are equipped with an impressive array of functions – from autofocus and image stabilization to panoramas and high-definition video. Recently a team of engineers from Duke University has unlocked a previously unrecognized 3D imaging capability of modern cameras by simply repurposing its existing components.

This new capability was successfully demonstrated in a proof-of-concept laboratory experiment using a small deformable mirror —a reflective surface that can direct and focus light. The research demonstrates how the equivalent technology in modern digital cameras, the image stabilization and focus modules, could be harnessed to achieve the same results without additional hardware.  

The purpose of the experiment was to extract depth-of-field information from a “single shot” image — rather than traditional 3D imaging techniques that require multiple images– without suffering any trade-offs in image quality. When integrated into commercial cameras and other optical technologies, this visualization technique could improve core functions, like image stabilization, and increase the speed of autofocus, which would enhance the quality of photographs.

“Real scenes are in three dimensions and they’re normally captured by taking multiple images focused at various distances,” said Patrick Llull, Duke Imaging and Spectroscopy Program (DISP), Duke University. “A variety of single-shot approaches to improve the speed and quality of 3D image capture has been proposed over the past decades. Each approach, however, suffers from permanent degradations in 2D image quality and/or hardware complexity.”

The research team, led by David Brady, a professor at Duke, was able to overcome these hurdles, developing an adaptive system that may accurately extract 3D data while maintaining the ability to capture a full-resolution 2D image without a dramatic system change, such as switching out a lens.

Brady and his team present their findings in Optica, the high-impact, Open Access journal from The Optical Society.  

A New Path to the Third Dimension
Humans are able to see in three dimensions by a process known as parallax, in which the information received by each eye is slightly offset from the other. The brain is able to interpret and process these slightly divergent signals, recognizing how the apparent displacement as seen by each eye relates to different distances. This allows humans to perceive depth.

Traditional 3D imaging relies on virtually the same principle in which images and scenes are recorded with two slightly off-set lenses. When projected or processed, the original 3D appearance is restored. This recording process, however, requires twice the data as a 2D image, making 3D photography and video more bulky, expensive, and data intensive.

“We want to achieve the same results with the equipment people already have in their handheld cameras with no major hardware modifications,” noted Llull.

Stabilization to Recover Information at Depth
Modern digital cameras, especially those with video capabilities, are frequently equipped with modules that take the jitter out of recordings. They do this by measuring the inertia or motion of the camera and compensate by rapidly moving the lens – making multiple adjustments per second -- in the module. This same hardware can also change the image capture process, recording additional information about the scene. With proper software and processing, this additional information can unlock the otherwise hidden third dimension.

The first step, according to the researchers, is to enable the camera to record 3D information. This is achieved by programming the camera to perform three functions simultaneously:  sweeping through the focus range with the sensor, collecting light over a set period of time in a process called integration, and activating the stabilization module.

As the optical stabilization is engaged, it wobbles the lens to move the image relative to a fixed point. This, in conjunction with a focal sweep of the sensor, integrates that information into a single measurement in a way that preserves image details while granting each focus position a different optical response. The images that would have otherwise been acquired at various focal settings are directly encoded into this measurement based on where they reside in the depth of field.

For the paper, the researchers used a comparatively long exposure time to compensate for the set-up of the equipment. To emulate the workings of a camera, a beam splitter was necessary to control the deformable lens: This extra step sacrifices about 75 percent of the light received. “When translated to a fully integrated camera without a beamsplitter, this light loss will not be an issue and much faster exposure times will be possible,” noted Llull.  

The researchers then process a single exposure taken with this camera and obtain a data-rich product known as a data cube, which is essentially a computer file that includes both the all-focused 2D image as well as an extra element known a depth map. This depth map data, in effect, describes the focus position of each pixel of the image. Since this information is already encoded into the single measurement, it’s possible to construct a depth map for the entire scene.  

The final step is to process the image and depth map with a commercial 3D graphics engine, similar to those that render 3D scenes in video games and computer-generated imagery used in Hollywood movies.  The resulting image can be used to determine the optimal focal setting for subsequent full-resolution 2D shots, as an autofocus algorithm does, but from only one image.  Additionally, synthetic refocusing may be used on the resulting 3D imagery to display the scene as viewed at different depths by a human. 

Though only performed in laboratory settings with surrogate technologies, the researchers believe the techniques they employed could be applied to basic consumer products. The result would be a more efficient autofocusing process, as well as the added third dimension to traditional photography.

“We have found a new path to extract 3D information from an otherwise 2D process. The benefits of this are dual functionality of tomographic imaging and full resolution 2D capture with little modification to existing systems,” concluded Llull.

Paper: "Image translation for single-shot focal tomography," Patrick Llull et al., Optica, 2, 9, 822 (2015)