In science and technology, efforts have long been made to improve the accuracy of measurements and to increase the resolution of images. An accompanying goal is to reduce the uncertainty in the estimates that can be made, and the inferences that can be drawn, from the data (visual or otherwise) that have been collected. Yet uncertainty can never be fully eliminated. And since we have to live with it, there is much to be gained by quantifying that uncertainty as precisely as possible.
In other words, we want to know how uncertain our uncertainty really is.
This question is addressed in a new study led by Swami Sankaranarayanan, a postdoc at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), and his co-authors: Anastasios Angelopoulos and Stephen Bates of the University of California, Berkeley; Yaniv Romano of the Technion-Israel Institute of Technology; and Phillip Isola, an associate professor of electrical engineering and computer science at MIT. These researchers not only succeeded in obtaining accurate measures of uncertainty, they also found a way to display uncertainty in a manner that ordinary people can understand.
Their paper, presented in December at the Neural Information Processing Systems Conference in New Orleans, relates to computer vision, a field of artificial intelligence that involves training computers to glean information from digital images. The study focuses on images that are partially smudged or corrupted (due to missing pixels, for instance), and on methods, computer algorithms in particular, that are designed to uncover the portion of the signal that is marred or otherwise concealed. An algorithm of this sort, Sankaranarayanan explains, "takes a blurry image as the input and gives you a clean image as the output," a process that typically occurs in a couple of steps.
First comes an encoder, a neural network specifically trained by the researchers to deblur fuzzy images. The encoder takes a distorted image and, from it, creates an abstract (or "latent") representation of a clean image, consisting of a list of numbers, in a form that a computer can understand but that would not make sense to most humans. The next step is a decoder, of which there are a couple of types, and which is usually also a neural network. Sankaranarayanan and his colleagues worked with a kind of decoder called a "generative" model. In particular, they used an off-the-shelf version called StyleGAN, which takes the numbers from the encoded representation (of a cat, for instance) as its input and then constructs a complete, cleaned-up image (of that particular cat). The entire process, encompassing both the encoding and decoding stages, thus yields a crisp picture from an otherwise muddled rendering.
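The encode-then-decode pipeline described above can be sketched in miniature. The following is only a structural toy, not the paper's actual networks: the real encoder and decoder are trained neural networks (the decoder being StyleGAN), whereas here both are stand-in random linear maps, and the "image" is just a flat vector with some entries zeroed out to mimic missing pixels. The names `encode`, `decode`, and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8    # length of the abstract "latent" list of numbers (hypothetical)
IMG_DIM = 64      # flattened toy "image" size (hypothetical)

# Stand-ins for the trained networks: random linear maps in place of
# the paper's learned encoder and StyleGAN decoder.
W_enc = rng.normal(size=(LATENT_DIM, IMG_DIM)) / np.sqrt(IMG_DIM)
W_dec = rng.normal(size=(IMG_DIM, LATENT_DIM)) / np.sqrt(LATENT_DIM)

def encode(corrupted_img):
    """Map a corrupted image to a latent code: a short list of numbers."""
    return W_enc @ corrupted_img

def decode(latent):
    """Generate a full image from the latent code (StyleGAN's role in the paper)."""
    return W_dec @ latent

# A clean toy image with roughly 30% of its "pixels" missing (zeroed out).
clean = rng.normal(size=IMG_DIM)
mask = rng.random(IMG_DIM) > 0.3
corrupted = clean * mask

# The full pipeline: corrupted input in, complete image out.
restored = decode(encode(corrupted))
print(restored.shape)  # → (64,)
```

The point of the sketch is the shape of the computation: the decoder always emits a complete image, every pixel filled in, regardless of how much of the input was missing.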
But how much confidence can one have in the accuracy of that final image? And, as addressed in the December 2022 paper, what is the best way to represent uncertainty in images? The standard approach is to create a "saliency map," which assigns a probability value, somewhere between 0 and 1, to indicate the model's confidence in the correctness of each pixel, taken one at a time. This strategy has a drawback, according to Sankaranarayanan, "because the prediction is performed independently for each pixel. But meaningful objects occur within groups of pixels, not within individual pixels," he adds, which is why he and his colleagues proposed an entirely different way of assessing uncertainty.
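The per-pixel saliency map that the article contrasts against can be illustrated in a few lines. This is a generic sketch, not code from the paper: the confidence scores here are made-up numbers squashed into (0, 1) with a sigmoid, standing in for whatever per-pixel scores a real model would produce.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-pixel scores for an 8x8 restored patch. A "saliency
# map" squashes each score into (0, 1) to express per-pixel confidence.
scores = rng.normal(size=(8, 8))
saliency = 1.0 / (1.0 + np.exp(-scores))  # sigmoid: each entry in (0, 1)

# The drawback Sankaranarayanan describes: each entry is computed
# independently, so a coherent object spanning many pixels gets no
# single, jointly meaningful confidence value.
print(saliency.shape)  # → (8, 8)
```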
Their approach centers on “semantic properties” of images — groups of pixels that have meaning when put together, such as to form a human face, dog or other recognizable things. The goal, Sankaranarayanan maintains, is “to estimate the uncertainty in a way that correlates to groupings of pixels that humans can easily interpret.”
While standard methods may produce a single image that constitutes a "best guess" as to what the real picture should be, the uncertainty in that representation is often hard to discern. The new paper argues that, to be of use in the real world, uncertainty should be presented in a way that holds meaning for people who are not experts in machine learning. Rather than generating a single image, the authors' method generates a range of images, each of which might be correct. Moreover, it can set precise bounds on that range, or interval, and provide a probabilistic guarantee that the true depiction lies somewhere within it. The range can be made narrower if the user is comfortable with, say, 90 percent certitude, and narrower still if more risk is acceptable.
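The trade-off in that last sentence, a tighter interval in exchange for a weaker coverage guarantee, is the hallmark of conformal-style calibration. The sketch below is a generic illustration under assumed inputs, not the paper's algorithm: the "residuals" are simulated errors for a single hypothetical semantic feature, and the quantile rule is a standard split-conformal recipe applied to them.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration residuals: |true value - predicted value| of
# one semantic feature, measured on 1,000 held-out images (simulated here).
residuals = np.abs(rng.normal(size=1000))

def interval_halfwidth(alpha):
    """Split-conformal half-width: the adjusted (1 - alpha) empirical
    quantile of the residuals, giving roughly (1 - alpha) coverage."""
    n = len(residuals)
    level = np.ceil((n + 1) * (1 - alpha)) / n   # finite-sample correction
    return np.quantile(residuals, min(level, 1.0))

w90 = interval_halfwidth(0.10)  # user accepts 90 percent certainty
w99 = interval_halfwidth(0.01)  # user demands 99 percent certainty

# Stronger guarantee -> wider interval; more acceptable risk -> narrower.
print(w99 > w90)  # → True
```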
The authors argue that their paper presents the first algorithm, designed specifically for generative models, that can establish uncertainty intervals tied to meaningful (semantically interpretable) features of an image, with "formal statistical guarantees." While that is an important milestone, Sankaranarayanan sees it as just one step toward "the ultimate goal. So far, we have been able to do this for simple things, like recovering images of human faces or animals, but we hope to extend this approach into more critical domains, such as medical imaging, where our 'statistical guarantees' could be especially important."
Suppose the film or radiograph of a chest X-ray is blurry, he adds, "and you want to reconstruct the image. If you are given a range of images, you want to know that the true image is contained within that range, so you are not missing anything critical," information that might reveal whether or not a patient has lung cancer or pneumonia. In fact, Sankaranarayanan and his colleagues have already begun working with radiologists to see whether their algorithm for predicting pneumonia could be useful in a clinical setting.
Their work may also have relevance to the field of law enforcement, he says. "The picture from a surveillance camera can be blurry, and you want to enhance it. Models for doing that already exist, but it is not easy to gauge the uncertainty. And you don't want to make a mistake in a life-or-death situation." The tools that he and his colleagues are developing could help identify a guilty person and help exonerate an innocent one as well.
Much of what we do, and much of what happens in the world around us, is shrouded in uncertainty, Sankaranarayanan notes. Getting a firmer grip on that uncertainty can therefore help us in countless ways. For one thing, it can tell us more about exactly what it is we do not know.
Angelopoulos was supported by the National Science Foundation. Bates was supported by the Data Science Institute Foundation and the Simons Institute. Romano was supported by the Israel Science Foundation and by a Career Advancement Fellowship from the Technion-Israel Institute of Technology. Sankaranarayanan and Isola's research on this project was sponsored by the U.S. Air Force Research Laboratory and the U.S. Air Force Artificial Intelligence Accelerator under Cooperative Agreement No. FA8750-19-2-1000. The MIT SuperCloud and the Lincoln Laboratory Supercomputing Center also provided computing resources that contributed to the results reported in this work.