Henry Pfister: Inferring Answers through Graphical Models

Constructing graphical models and inferring how variables relate to one another can help researchers analyze large and small data sets—and test devices before they’re even built

By Ken Kingery

Henry Pfister recently joined both the electrical and computer engineering department in Duke’s Pratt School of Engineering and the interdisciplinary Information Initiative at Duke (iiD). An expert in extracting information from both mountains and molehills of data, he will tackle informatics challenges in realms such as personalized health care, compressed sensing and wireless communications.

In today’s digitized world, many companies and research enterprises are compiling statistics at an exponential rate. In some fields, however, the sample size is too small to find trends and patterns in the data. In either case, it is Pfister’s job to create graphical models and algorithms to infer answers to the questions being asked of the data.

“Our goal is to abstract away all the details of the problem,” said Pfister, who joins the Duke community from Texas A&M University. “If you can understand the kernel of the fundamental problem, then the implications can help you answer many questions based on that kernel.”

Suppose, for example, doctors want to know whether a trial drug helps people live longer. A large clinical study will easily provide the general answer to that question, but what if they want to drill down to smaller categories? Looking at a subset of people with specific genetic profiles might yield a tiny sample size.

To get around this problem, Pfister builds graphical models to explore how certain variables correlate with one another. By finding overlaps and inferring their effects on each other, researchers can develop better algorithms to analyze smaller sample sizes and increase their confidence in the answers.

Graphical models are also used to build error-correcting codes for data storage applications. Building a hard drive controller microchip is expensive, and developers can’t test it until it’s built. Nor can they test it virtually, as it would take today’s fastest computers years of simulation to observe a single read error. But by understanding error-correcting codes based on graphical models, theorists can accurately approximate the system’s error rate and ensure that it is too low to notice during regular use.
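
A rough back-of-the-envelope sketch shows why brute-force simulation is impractical. The figures here are illustrative assumptions, not numbers from Pfister’s work: a target of one read error per 10^15 bits and a simulator that processes one million bits per second. The function name `years_to_first_error` is likewise hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_first_error(bit_error_rate, simulated_bits_per_second):
    """Expected years of simulation before a single error is observed.

    If each bit fails independently with probability `bit_error_rate`,
    the expected number of bits until the first failure is 1 / rate
    (the mean of a geometric distribution).
    """
    expected_bits = 1.0 / bit_error_rate
    seconds = expected_bits / simulated_bits_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Assumed, illustrative values: 1e-15 errors per bit, 1e6 bits per second.
    years = years_to_first_error(1e-15, 1e6)
    print(f"Expected wait for one error: about {years:.0f} years")
```

Under these assumed figures the wait runs to decades, which is why an analytical approximation of the error rate is so much more useful than simulating until an error appears.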

Pfister’s work can be a boon to many research projects, and Duke’s engineering school is teeming with prospective collaborators.

“I believe that many problems being studied today are very similar across disciplines. Some people want to build a better cell phone while others want to design a better experiment, but the math problems they end up solving become very similar,” said Pfister. “Multidisciplinary centers, like the iiD, where people can find these common elements and work on them together, are very valuable. A big reason I came to Duke is the opportunity to work with a lot of exceptional researchers on these sorts of problems.

“And also the spring is beautiful in North Carolina.”