Reliable deep learning in dynamic environments
After several years as a jazz saxophonist, Zachary Chase Lipton became fascinated with using machine learning to solve problems, particularly in healthcare.
“Subtle differences in how images are captured or represented can bring an otherwise superhuman system down to unacceptably poor levels of accuracy,” says Lipton, now an assistant professor of Machine Learning and Operations Research, and head of the Approximately Correct Machine Intelligence Lab at Carnegie Mellon University. He notes that claims of deep learning’s superhuman performance on medical tasks are typically predicated on evaluations in which the training and test data sets are exchangeable (statistically identical), differentiated only by the random partitioning of a larger population from which both are sampled. However, this assumption is often violated in practice, sometimes with disastrous effects.
At SPIE Medical Imaging 2023, Lipton will discuss the fundamental difficulties of these problems. He will also discuss the vast landscape of principled methods and corresponding assumptions for developing robust and adaptive predictive models, as well as heuristic methods that have shown some promise on benchmarks but lack theory to guide their application.
What are some of your responsibilities as assistant professor of Machine Learning and Operations Research at Carnegie Mellon University?
My biggest responsibility is running my lab. Coordinating the efforts of about a dozen PhD students and several MA students on a diverse research agenda, keeping everybody fed, remaining on top of the literature, and coming up with new creative ideas is by far the most difficult part of my job.
Another key responsibility has been to teach courses in the Machine Learning Department. I've had the good fortune to get to design several exciting PhD-level courses and to teach – and learn from – a vibrant group of students.
Your lab is called the Approximately Correct Machine Intelligence Lab. Is “approximate” as close as we can get with machine intelligence? Do you think this will change?
In some sense, all statistical inference is approximate. If you want to estimate the probability of a coin coming up "heads" merely by observing the outcomes of a sequence of tosses, you will never know the exact probability. But your estimate will converge to the true probability as the sample size grows. To learn from data is, in some sense, necessarily an approximate enterprise. But the lab name packs another meaning: We're not just referring to this classical statistics / learning theory notion of "approximately correct."
We're also interested in the ways that machine learning models are often trained on tasks that are themselves crude approximations of what we really want out of ML systems. Often, we train predictive models but are really hoping to guide decisions. Or we train on data from one distribution but are really hoping that the resulting system will perform well in other environments. Or we train on some simplified objective but actually care about a number of other desiderata that don't make it into our formal modeling process. My lab looks for these gaps between the problems we are actually solving and the more ambitious problems we want to believe we are solving.
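As a rough illustration of the coin-flip point above, here is a minimal Python sketch (the "true" probability of 0.7 is an arbitrary choice for this example) showing the empirical estimate converging toward the true value as the number of tosses grows:

```python
import numpy as np

# Coin-flip sketch: the true probability is never observed directly,
# but the empirical estimate converges as the number of tosses grows.
rng = np.random.default_rng(0)
true_p = 0.7  # hypothetical "true" probability of heads, chosen for illustration

for n in [10, 100, 1_000, 10_000, 100_000]:
    tosses = rng.random(n) < true_p        # simulate n tosses
    estimate = tosses.mean()               # observed fraction of heads
    print(f"n={n:>6}  estimate={estimate:.4f}  error={abs(estimate - true_p):.4f}")
```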
What are some of your current projects that you’re most excited about?
I'm excited about many projects. These include developing sound engineering principles for combining data from different time periods and environments to make accurate predictions in a new environment of interest; causal inference; and generating high-quality factual summaries, especially in the context of medical documentation.
Your talk is titled “Reliable deep learning in dynamic environments.” What are some examples of the “dynamic environments” you’re working with?
Here, I'm just referring to "change." Whenever you train a model on data from some hospitals and then deploy it in a new hospital, or train on historical data and then deploy your models into a changing world, you face changes in the statistics of the data. This violates the core assumptions upon which any claims about an ML model's accuracy are contingent. So then what can we do? Absent any further assumptions or commitments, the answer is we are simply out of luck. But under clear assumptions or with some structural knowledge of the problem, there's hope.
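A minimal scikit-learn sketch of that failure mode follows. The setup, a model that leans on a feature whose relationship to the label flips in a new environment, is an illustrative assumption rather than an example from the talk, but it shows how a model that looks excellent on an exchangeable test set can collapse once the data-generating environment changes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative sketch of distribution shift: a model trained in one environment
# relies on a feature whose relationship to the label flips in a new
# environment, so accuracy collapses despite a strong in-distribution score.
rng = np.random.default_rng(0)

def make_data(n, flip_spurious=False):
    y = rng.integers(0, 2, size=n)
    stable = y + rng.normal(scale=1.0, size=n)      # noisy but stable signal
    spur_label = 1 - y if flip_spurious else y      # environment-dependent signal
    spurious = spur_label + rng.normal(scale=0.3, size=n)
    return np.column_stack([stable, spurious]), y

X_train, y_train = make_data(5_000)                   # source environment
X_same, y_same = make_data(5_000)                     # exchangeable test set
X_new, y_new = make_data(5_000, flip_spurious=True)   # changed environment

model = LogisticRegression().fit(X_train, y_train)
print("accuracy, same distribution:", model.score(X_same, y_same))
print("accuracy, new environment:  ", model.score(X_new, y_new))
```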
What do you see as the future of deep learning in AI? What would you like to see?
Deep learning is the statistical workhorse for working with high-dimensional data and will be for the foreseeable future. On the one hand, we'll continue to improve deep learning models, and to be surprised by some of their predictive capabilities. On the other hand, I think we'll continue to see progress in developing wider frameworks around these models that exploit their predictive capabilities but put them in the service of more ambitious ends, e.g., guiding decisions in the real world.
What would you like attendees to learn from your talk?
Just how serious problems of reliability are, and just how dangerously not-seriously they're currently being taken. I'd also like attendees to learn that some of the quackery that has been offered as somehow addressing these problems, in particular various methods for generating "saliency maps" that are supposed to engender trust, does nothing at all to address them.
Do you think AI can ever learn to play the sax?
I think I saw a video of a robot playing John Coltrane's solo on Giant Steps on YouTube over 15 years ago, so why not? But robots can also dunk, and yet we're not lining up to attend the RoboNBA. More seriously, there's a big difference between "playing the sax" in the sense of making a sound and engaging in an actual creative enterprise.