The new edition of the international conference Neural Information Processing Systems, one of the most important in the area of machine learning, will begin at the end of November in New Orleans, USA. Four papers co-authored by researchers Cristóbal Guzmán and Pablo Barceló will be presented at the meeting.
The 36th edition of the Neural Information Processing Systems conference (NeurIPS 2022), internationally recognized as one of the most important events in the field of machine learning, will be held from November 28 to December 9. The conference has run for more than 30 years and each year brings together more than 10,000 attendees from both academia and industry.
The conference, which this year takes place in New Orleans, United States, is known for its size and competitiveness. The call for papers draws nearly 10,000 submissions from researchers in fields related to machine learning, such as neuroscience, statistics, optimization, computer vision and natural language processing, as well as the life, natural and social sciences, among other disciplines. Of these submissions, only about 25% are selected for presentation during the two-week event.
For the 2022 edition, four papers were selected for the main conference: three co-authored by Cristóbal Guzmán, Cenia researcher and assistant professor at the Institute of Mathematical and Computational Engineering (IMC UC), and one co-authored by Pablo Barceló, Cenia principal investigator and director of IMC UC. In addition, a paper by Pablo Messina, Cenia associate, UC Engineering PhD student and iHealth Millennium Institute researcher, was accepted at a workshop held during the conference. All three researchers will attend the conference in person.
On this occasion, Cristóbal Guzmán will co-organize one of the many workshops held during NeurIPS, entitled Optimization for Machine Learning (OPT 2022). The workshop will bring together optimization experts to share their perspectives and advances in the field and to promote interaction with machine learning specialists. In fact, Guzmán explains, this workshop is one of the oldest held within NeurIPS. “On average, about 100 people come together. Among the people who work in machine learning, there are several who work a bit far from the center of everything that is happening, so they feel at home in this workshop and that is why they participate,” says the researcher.
Optimization problems in machine learning
Cristóbal Guzmán is a Civil Mathematical Engineer and holds a PhD in Algorithms, Combinatorics and Optimization. The three papers he co-authored that were accepted at NeurIPS were released this year: the first in February, the second at the end of March and the third in May. All of them stem from his recent research, which focuses on understanding iterative methods for solving machine learning problems.
The first, entitled “Differentially Private Generalized Linear Models Revisited,” deals with differential privacy in generalized linear models and was written in collaboration with researchers Raman Arora (Johns Hopkins U., USA), Raef Bassily (Ohio State U., USA), Michael Menart (Ohio State U., USA) and Enayat Ullah (Johns Hopkins U., USA). In the words of Cristóbal Guzmán, this line of work addresses how to solve optimization problems that appear in machine learning while focusing on “protecting the identity and information of users who provide their data to train these models.” The researcher notes that the medical and financial sectors are two of the fields where protecting data privacy is essential today.
The academic explains that guaranteeing privacy generally requires adding noise to the algorithms used, which degrades the quality of the solutions. “In this paper we work with a specific model, in which it is possible to at least prevent the amount of error introduced from growing along with the number of attributes in the model. We believe that mitigating this effect of dimensionality in a problem is something interesting,” says Guzmán.
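To make the role of noise more concrete, the sketch below shows one common way of perturbing an averaged gradient with Gaussian noise. It is a generic illustration and not the algorithm from the paper: the function names, the clipping threshold `max_norm` and the noise level `noise_std` are all assumptions, and the noise is not calibrated to any formal privacy guarantee. The last function shows why dimensionality matters in this generic setting: when independent noise is added to every coordinate, the expected squared size of the noise grows linearly with the number of attributes.

```python
import math
import random


def clip(vec, max_norm):
    """Rescale vec so that its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm:
        return list(vec)
    return [x * max_norm / norm for x in vec]


def noisy_mean_gradient(grads, max_norm, noise_std, rng):
    """Clip each per-user gradient, average them, then add Gaussian noise.

    noise_std is a hypothetical parameter chosen by hand; a real private
    algorithm would derive it from the desired privacy level.
    """
    clipped = [clip(g, max_norm) for g in grads]
    dim = len(clipped[0])
    mean = [sum(g[j] for g in clipped) / len(clipped) for j in range(dim)]
    return [m + rng.gauss(0.0, noise_std) for m in mean]


def expected_noise_sq_norm(dim, noise_std):
    """Expected squared norm of the added noise: one variance term per attribute."""
    return dim * noise_std ** 2
```

With the same `noise_std`, a model with 100 attributes receives ten times more expected squared noise than a model with 10 attributes, which is the kind of dimension dependence the quote refers to.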
The second paper co-authored by Guzmán is entitled “Between Stochastic and Adversarial Online Convex Optimization: Improved Regret Bounds via Smoothness.” It was written with a research team from the University of Amsterdam, Netherlands, composed of Sarah Sachs, Hédi Hadiji and Tim van Erven. The aim of the paper is to bring together two existing theories on how optimization problems are solved when data arrives sequentially. “The online model of optimization is about just that. You can imagine that in some way you are making predictions, but you don’t have the dataset immediately available, but rather you observe how users arrive at the system, and in that sense, you try to adjust your decisions according to the sequence that is produced,” explains Guzmán.
In this area, the researcher adds, there is the stochastic model, which assumes that “all users arriving at a system follow the same distribution. That is the classic statistical model. And then there’s the other one which is the adversarial model, where you believe that the sequence may be produced by someone malicious, so to speak, and they’re trying to make my algorithm fail.” The researcher adds that neither of these models is considered truly realistic: “Both are idealizations of certain characteristics that we seek to incorporate. So our motivation is rather to understand whether one can incorporate or interpolate these scenarios, between stochastic and adversarial situations.”
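The online setting described above can be sketched with a standard textbook method, online gradient descent, on a toy one-dimensional problem. This is an illustration under stated assumptions, not the method analyzed in the paper: the squared loss, the interval [-1, 1], the step size and the function names are all hypothetical choices. Performance is measured by regret, the difference between the loss accumulated by the algorithm and the loss of the best single decision chosen in hindsight.

```python
def online_gradient_descent(targets, step=0.5, lo=-1.0, hi=1.0):
    """Play a point, observe the target, then take a projected gradient step.

    The loss at round t is (x - z_t)**2, revealed only after x is played.
    """
    x = 0.0
    plays = []
    for t, z in enumerate(targets, start=1):
        plays.append(x)                    # decision made before seeing z
        grad = 2.0 * (x - z)               # gradient of the revealed loss
        x = x - (step / t ** 0.5) * grad   # decaying step size
        x = min(hi, max(lo, x))            # project back onto [lo, hi]
    return plays


def regret(plays, targets, lo=-1.0, hi=1.0):
    """Algorithm's total loss minus that of the best fixed point in hindsight."""
    best = min(hi, max(lo, sum(targets) / len(targets)))
    alg_loss = sum((x - z) ** 2 for x, z in zip(plays, targets))
    best_loss = sum((best - z) ** 2 for z in targets)
    return alg_loss - best_loss
```

A stochastic sequence could be simulated by drawing each target from one fixed distribution, and an adversarial-looking one by alternating between -1 and 1; comparing the regret on both gives a rough feel for the two extremes that the paper seeks to interpolate between.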
The third paper is entitled “A Stochastic Halpern Iteration with Variance Reduction for Stochastic Monotone Inclusion Problems” and was written in collaboration with a research team from the University of Wisconsin-Madison (USA) consisting of Xufeng Cai, Chaobing Song and Jelena Diakonikolas. “I work in optimization and the connection with machine learning and artificial intelligence is that the models that one needs to solve in order to train these systems are precisely optimization problems,” says Guzmán. However, he adds, today it is not enough to train a model; one must also account for the fact that those who provide their data are people with individual interests.
“There are people or institutions that act strategically and whose decisions affect the results of the rest, and that’s happening more and more in machine learning. Someone, for example, could give me corrupted data to make my models perform poorly,” he adds. In this context, understanding algorithms that can compute equilibrium conditions in the face of noisy data is a challenging and important problem, and this paper addresses a specific class of such problems where convergence results are guaranteed.
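As a rough illustration of the kind of iteration named in the paper's title, the sketch below implements the classical, deterministic Halpern iteration on a toy problem. It is not the stochastic, variance-reduced method of the paper: there is no noise here, and the operator, the weights 1/(k+2) and the function names are assumptions chosen for simplicity. The operator is a 90-degree rotation of the plane, a simple stand-in for equilibrium dynamics whose only fixed point is the origin. Repeatedly applying the rotation just cycles forever, whereas Halpern's scheme, which pulls each step back toward the starting point with a shrinking weight, approaches the fixed point.

```python
import math


def rotate90(p):
    """Rotate a point of the plane by 90 degrees; the only fixed point is (0, 0)."""
    x, y = p
    return (-y, x)


def halpern(T, x0, iters):
    """Classical Halpern iteration: x_{k+1} = lam_k * x0 + (1 - lam_k) * T(x_k)."""
    x = x0
    for k in range(iters):
        lam = 1.0 / (k + 2)   # weight on the anchor x0, shrinking over time
        tx = T(x)
        x = tuple(lam * a + (1.0 - lam) * b for a, b in zip(x0, tx))
    return x


def residual(T, x):
    """Distance between x and T(x); it is zero exactly at a fixed point."""
    return math.dist(x, T(x))
```

Starting from (1.0, 0.0), plain repeated rotation returns to the starting point every four steps and its residual never decreases, while the residual of the Halpern iterate shrinks as the number of iterations grows.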
The dynamics associated with equilibrium problems could have implications for the operation of artificial intelligence systems linked to more sensitive and controversial issues such as those in the political arena: “In fact, my interest going forward is to explore more along the lines of what happens with these models that not only learn in a, let’s say, unconscious environment, but it actually reacts to how I observe it and a sort of feedback occurs, where people react to the predictions I make about them.”
Interpretability, an increasingly relevant area in artificial intelligence
As the development of artificial intelligence advances, so does the demand for transparency in machine learning models. This is largely because many companies and public institutions have joined the digital transformation and adopted these technologies, so the decisions made by these models carry increasing weight and it matters more and more that they do not harm users.
The selected paper in this field is co-authored by Pablo Barceló, who specializes in database theory and logic in computer science. Entitled “On Computing Probabilistic Explanations for Decision Trees,” the study deals with interpretability in artificial intelligence and was written together with researchers Marcelo Arenas (director of the Millennium Institute Fundamentals of Data, professor of the Department of Computer Science of the UC School of Engineering and academic in joint charge of IMC UC), Miguel Romero (Cenia researcher and professor of the Faculty of Engineering and Sciences of Universidad Adolfo Ibáñez) and Bernardo Subercaseaux (Computer Science Engineering student of Universidad de Chile).
Knowing why a machine learning model makes a particular decision among several possible options is often quite complex and can affect people’s lives. Barceló illustrates the impact of this phenomenon with the following example: “If a bank does not grant a loan, it should, for legal reasons, be obliged to explain to the client why they did not give it, and should at least be able to demonstrate that it is not using certain protected information, such as the client’s gender, for example. That information shouldn’t be used for that, whereas machine learning models are going to use all the data at hand to make the decisions they deem optimal, unless they are designed to be more robust.” The researcher then explains the objective of his study: “What we are trying to investigate is which are the characteristics, within all the information that the model uses to make a decision, that mostly influence the decision it is making about some entity, object, person, etc.”
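One way to make the question of which characteristics most influence a decision concrete is sketched below. It is a toy illustration under stated assumptions, not the algorithm from the paper: the three binary features, the tiny decision tree and the function names are hypothetical. The idea is to hold some features of a given case fixed, let the remaining ones vary freely, and measure how often the model still reaches the same decision. A small set of features that keeps this fraction high can be read as a probabilistic explanation of the decision.

```python
from itertools import product


def classify(x):
    """Toy decision tree over three binary features (hypothetical loan example).

    x = (income, debt, history); returns 1 to grant the loan, 0 to deny it.
    """
    income, debt, history = x
    if income == 1:
        return 1 if debt == 0 else history
    return 0


def agreement_probability(x, fixed):
    """Fraction of inputs that agree with x on the `fixed` feature indices
    and receive the same classification as x (brute-force enumeration)."""
    free = [i for i in range(len(x)) if i not in fixed]
    target = classify(x)
    hits = 0
    for values in product([0, 1], repeat=len(free)):
        y = list(x)
        for i, v in zip(free, values):
            y[i] = v
        hits += classify(tuple(y)) == target
    return hits / 2 ** len(free)
```

For the case (1, 0, 1), fixing income and debt (indices 0 and 1) already determines the decision, while fixing income alone preserves it in three of the four remaining combinations. Brute-force enumeration like this only works for a handful of features.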
Barceló adds that not all artificial intelligence models work alike: some, such as decision trees or models based on linear classification, are more transparent or easier to interpret than others, such as deep neural networks, which tend to be much more opaque. Given these two options, one or the other is sometimes recommended depending on the type of decision the artificial intelligence will have to make and whom it will affect. However, the aspect studied by the researcher has to do specifically with trying to interpret the opaque models, which, in turn, offer many benefits in terms of prediction: “We have to make sure that somehow we have better methods of explanation, better approximations to those explanations. And that’s what we’ve been trying to do, to get into the decisions of these more complex models.”
Medical imaging meets NeurIPS
The last paper, entitled “Two-stage Conditional Chest X-ray Radiology Report Generation” and authored by PhD student Pablo Messina, was selected for the poster session of the MedNeurIPS 2022 workshop, a satellite event that has taken place within the conference since 2017 and will be held this year on December 2. The workshop brings together researchers in medical image computing and machine learning to discuss key challenges in the area and opportunities for collaboration.
Pablo Messina is a PhD student at the UC School of Engineering and his thesis project is co-directed by Cenia’s director, Alvaro Soto, and Cenia researcher, Denis Parra. Messina specializes in automatic generation of radiological reports from medical images, and is currently involved in the iHealth Millennium Institute, together with Denis Parra, who will also attend the conference representing both projects.
The selected paper proposes a simple but novel model based on neural networks, which receives as input a chest X-ray (an image, in either frontal or lateral view) and generates as output a radiological report (a natural language text). “The novel aspect of the paper is that we propose to generate the report in parts, guiding the model to talk about specific topics, and for this, in addition to the image, the model receives a ‘topic vector’. For example: the vector can represent the topic ‘edema’, ‘cardiomegaly’, ‘bones’, ‘lungs’, etc. Then we teach the model to only talk about bones when the subject is ‘bones’, or to only talk about lungs when the subject is ‘lungs’. This way, we put together the final report by concatenating the texts that the model generates for different topics,” explains the researcher.
“Although there is still much room for improvement, by approaching the problem in this way we obtained better results than those obtained by other papers in the literature, according to an evaluation tool called CheXpert labeler, developed by the Stanford ML group, which allows us to estimate the clinical quality of the report generated by the model,” adds Pablo Messina.
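The idea of building a report topic by topic can be sketched schematically as follows. This is only an outline of the control flow Messina describes, not the authors' model: the topic list, the one-hot encoding of the topic vector and all function names are assumptions, and `stub_model` is a placeholder that returns canned text where a trained neural network would generate a real description.

```python
# Hypothetical topic list; the real system may use different topics.
TOPICS = ["lungs", "bones", "cardiomegaly", "edema"]


def one_hot(topic):
    """Encode a topic as a vector with a single 1.0 (an assumed encoding)."""
    return [1.0 if t == topic else 0.0 for t in TOPICS]


def stub_model(image, topic_vector):
    """Placeholder for a trained network: emits canned text for the active topic."""
    topic = TOPICS[topic_vector.index(1.0)]
    return f"[{topic}] findings for image {image!r}."


def generate_report(image, topics, model=stub_model):
    """Query the model once per topic and concatenate the resulting texts."""
    sections = [model(image, one_hot(t)) for t in topics]
    return " ".join(sections)
```

Calling `generate_report("xray_001", ["lungs", "bones"])` produces one sentence per requested topic, joined in the order the topics were given.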
Currently, the number of radiologists available is not sufficient to meet the worldwide demand for medical imaging examinations. Therefore, developing artificial intelligence models that can automate the process of generating radiological reports could become a solution that would have a positive impact on the health of people who need to undergo radiological examinations.
“Our paper can serve as inspiration for future research, and can also open up the possibility of generating synergies between radiologists and AI models. Personally, I appreciate that our work was well received by the reviewers, who decided to accept it. I take it as a confirmation that the proposed idea is on the right track and is promising,” he concludes.