Information Theorists Settle 70-Year-Old Communications Question

By Ken Kingery

11/22/21 Pratt School of Engineering 5 min read

Colleagues call Duke mathematical proof ‘stunning achievement’ that could have broad implications

A dry-erase board with red and blue equations written all over it — Information Theorists Settle 70-Year-Old Communications Question

Since their creation nearly 70 years ago, Reed-Muller codes have been used to help transmit data in wireless communication, particularly in deep-space applications. With many codes for computer scientists to choose from, their popularity stems from their ability to correct errors caused by noisy transmissions without taking up too much of the channel’s capacity.

Proving capacity for different codes on various channels has been a pursuit of information theorists since 1948, when MIT’s Claude Shannon first introduced the idea of channel capacity. In a paper that essentially created the field of information theory, Shannon showed that any communications channel could be characterized by its bandwidth and noise. Given a channel with a particular bandwidth and noise characteristic, he demonstrated how to calculate the maximum rate at which data could be transmitted essentially without error. That rate is what is now known as the Shannon capacity.

A man in front of a white board with colorful equations on it Now, information theorists have proven that Reed-Muller codes achieve capacity — the maximum rate of information transmission — on a large set of channels known as binary memoryless symmetric (BMS) channels. The proof has been pursued by many electrical engineers, computer scientists and mathematicians over the years for various communication channels, with the simplest case of the binary erasure channel having been resolved just four years ago.

The newly released mathematical proof could eventually help communication systems become more efficient and use less power. But the more important contribution is to the mathematical study of inference problems in high dimension, because the techniques used in the proof could help researchers attack a wide variety of other problems.

The results were posted online to the arXiv on October 27, a non-peer-reviewed repository of scientific papers commonly used by researchers to receive feedback from peers before publication. The paper is currently under review in IEEE Transactions on Information Theory.

“Many of us in the coding theory and theoretical computer science community have spent a considerable effort trying to make headway on this classic and important problem. But despite all our attempts we have found progress to be elusive. It is a stunning achievement to see it now resolved. And the proof is one for the book.”

Rüdiger Urbanke

“It’s been recognized for some time that Reed-Muller codes are likely to achieve capacity on symmetric channels, but nobody had any good idea of how you might be able to prove it,” said Henry Pfister, professor of electrical and computer engineering and mathematics at Duke.

“It’s a difficult problem to solve because there’s no obvious handle to grab hold of when working through the proof,” added Galen Reeves, professor of electrical and computer engineering and statistics at Duke. “But if you can find a place to start, then a path starts to become clear.”

A man sitting at a desk

That handle became available in 2016 when Pfister and his coauthors, including graduate student Santhosh Kumar, proved Reed-Muller codes achieve capacity on the binary erasure channel—the simplest model of a communication channel. The insights gleaned from this proof sparked discussions between Pfister and Reeves about how to tackle a proof for the much wider and more complex binary memoryless symmetric channels. The resulting ideas were promising enough for the pair to secure a multiyear, $500,000 grant from the National Science Foundation.

“Ever since Shannon’s original work, which used a so-called ‘random coding’ approach, coding theorists have been trying to prove that capacity can be achieved with constructive codes, for which deterministic encoding and decoding procedures are available,” explained Daniel Costello, the Bettex Chair Professor Emeritus in the University of Notre Dame’s Department of Electrical Engineering, who was not involved with the work. “This is important because random codes cannot be efficiently implemented in practice.”

“People have long suspected that Reed-Miller codes might be capacity-achieving if decoded by more sophisticated ‘modern’ decoding methods,” added Dave Forney, a longstanding leader in the field and key figure in the development of the high-speed modem, who was not involved with the work. “Reeves and Pfister have now proved that this is true for a very broad class of channels, which include all of the most interesting standard channels.”

“Since modern coding theory and its analysis techniques are closely related to the methods and analysis techniques of deep learning, this paper may possibly have an impact beyond the fields of information theory and error-correction coding.”

dave forney

For this proof, the pair had to develop new mathematical tools as they worked their way toward the result. While Pfister had an initial insight that started the process, they mostly worked together to formulate the proof for the Gaussian channel, which is the standard BMS channel used to model wireless communications. Then, Reeves developed the final mathematical framework that worked for all BMS channels and pushed them over the finish line.

“Many of us in the coding theory and theoretical computer science community have spent a considerable effort trying to make headway on this classic and important problem,” said Rüdiger Urbanke, information theorist and professor at the École Polytechnique Fédérale de Lausanne in Switzerland. “But despite all our attempts we have found progress to be elusive. It is a stunning achievement to see it now resolved. And the proof is one for the book.”

The relatively short and elegant proof presented by Pfister and Reeves could open new opportunities for low-power, efficient decoders for communications devices. But according to the researchers, the mathematical approaches used in the ‘proof for the book’ present a path toward solving other challenges, and not just those in information theory.

“The methods that Reeves and Pfister use in their proof are likely to be of independent interest,” said Forney, adjunct professor emeritus of the Laboratory for Information and Decision Systems at the Massachusetts Institute of Technology. “Since modern coding theory and its analysis techniques are closely related to the methods and analysis techniques of deep learning, this paper may possibly have an impact beyond the fields of information theory and error-correction coding.”