Shaping data

from Duke Engineering: Leading Research 2013

It doesn’t tick, but biology keeps excellent time. Mathematician and computer science/ECE professor John Harer and his collaborators want to unwind the variety of biological clocks found in cells, looking closely at their timepieces to see how they work individually and how they work together.

He’s also interested in understanding the rhythms of the heart and how anomalies show up as complex shapes in EKGs. His instrument for both problems is topological data analysis, a mathematical tool that uses geometric algorithms to scrutinize a dataset and draw out the shapes inherent in the data.

Harer explains that at a basic level, topological analyses convert the two-dimensional peaks and troughs of an EKG into a multi-dimensional dataset, transforming periodic patterns into circular shapes. Anomalies, in turn, show up as other interesting shapes. The added dimensions of the topological analysis ultimately let scientists perceive patterns in the data they wouldn’t normally see, Harer says.
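
One common way to turn a one-dimensional trace into a multi-dimensional dataset is a time-delay (sliding-window) embedding. The article does not say which construction Harer uses, so the sketch below is only an illustration under that assumption. The function name `delay_embed`, the sine wave standing in for a regular heartbeat, and the chosen lag are all hypothetical.

```python
import math

def delay_embed(signal, dim=2, lag=1):
    """Map a 1-D signal to points (x[t], x[t+lag], ..., x[t+(dim-1)*lag])."""
    n = len(signal) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("signal too short for this dim and lag")
    return [tuple(signal[t + k * lag] for k in range(dim)) for t in range(n)]

# Synthetic periodic signal standing in for a regular rhythm (not real EKG data).
period = 40
signal = [math.sin(2 * math.pi * t / period) for t in range(200)]

# A lag of a quarter period pairs each sine value with the matching cosine,
# so the embedded points all lie on a circle.
points = delay_embed(signal, dim=2, lag=period // 4)
radii = [math.hypot(x, y) for x, y in points]
print(min(radii), max(radii))
```

Every embedded point sits at essentially the same distance from the origin, which is the "periodic pattern becomes a circle" effect in its simplest form. A real EKG is far less tidy than a sine wave, so its loop would be bumpier, and an irregular beat would pull the points off the loop.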

Topological data analysis is based on topology, an older area of mathematics that scientists have only recently started using to look at large, complex datasets. The technique takes measurements (ten health indicators for each of a thousand people, for example) and plots them as points in a ten-dimensional space, one point per person, to get a data cloud. Harer can then search the cloud for spheres, curvature and other shapes to identify the factors, say age and weight, that may heavily influence blood pressure, cholesterol and the other included health measures.
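
To make the data cloud concrete, the sketch below builds a synthetic one: a thousand made-up "people," each a point with ten coordinates. It then asks the simplest shape question there is, namely how many separate clumps the cloud falls into when points closer than a chosen scale are linked. This is an illustration only, not Harer's method. The data are random numbers, and `make_group`, `count_components` and the scale values are hypothetical choices.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the synthetic cloud is reproducible

def make_group(n, offset, dims=10):
    # Each person is one point: a tuple of `dims` made-up indicator values.
    return [tuple(offset + rng.random() for _ in range(dims)) for _ in range(n)]

# Two artificial sub-populations, 500 people each, placed far apart on purpose.
cloud = make_group(500, 0.0) + make_group(500, 10.0)

def count_components(points, scale):
    """Link points no farther apart than `scale`; count the resulting clumps."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= scale:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

clumps = count_components(cloud, scale=5.0)
print(clumps)
```

At a scale of 5.0 the two planted groups stay separate, and at a much larger scale they merge into one. Watching which features persist as the scale changes is the basic idea behind looking for loops, spheres and other higher-dimensional shapes, though those need more machinery than this sketch includes.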

That’s a hypothetical example, Harer says. He explains that the technique has worked well so far for finding subtle relationships between indicators, such as links between EKG and EEG patterns. It may also help him and other scientists look for characteristic structures in gas plumes, which could signal chemical warfare; track behavioral groups, which could indicate terrorist activity; and identify signature clock genes within a variety of cells.