Perfiles - CENIA

Leopoldo Bertossi

Especialidad: IA explicable, causalidad, representación del conocimiento, modelos gráficos probabilísticos, ontologías y gráficos de conocimiento, aprendizaje automático, ciencia de datos y gestión de datos.

Email: bertossi@scs.carleton.ca

Leopoldo es profesor en SKEMA Business School Canadá, Montreal. Profesor emérito de ciencias de la computación en la Universidad de Carleton, Canadá. Becario senior de la Universidad Adolfo Ibáñez, Chile. Doctorado en matemáticas de la Pontificia Universidad Católica de Chile.

PUBLICACIONES

Publisher: Dagstuhl Reports, Link >

Abstract

The presentation turns around the subject of explainable AI. More specifically, we deal with attribution numerical scores that are assigned to features values of an entity under classification, to identify and rank their importance for the obtained classification label. We concentrate on the popular SHAP score [2] that can be applied with black-box and open models. We show that, in contrast to its general #P-hardness, it can be computed in polynomial time for classifiers that are based on decomposable and deterministic Boolean decision circuits. This class of classifiers includes decision trees and ordered binary decision diagrams. This result was established in [1]. The presentation illustrates how the proof heavily relies on the connection to SAT-related computational problems.

RL2 2022

Ir a la publicación

Publisher: arXiv, Link>

ABSTRACT

We describe how answer-set programs can be used to declaratively specify counterfactual interventions on entities under classification, and reason about them. In particular, they can be used to define and compute responsibility scores as attribution-based explanations for outcomes from classification models. The approach allows for the inclusion of domain knowledge and supports query answering. A detailed example with a naive-Bayes classifier is presented.

RL2 2022

Ir a la publicación

Publisher: Theory and Practice of Logic Programming, Link>

ABSTRACT

We propose answer-set programs that specify and compute counterfactual interventions on entities that are input on a classification model. In relation to the outcome of the model, the resulting counterfactual entities serve as a basis for the definition and computation of causality-based explanation scores for the feature values in the entity under classification, namely responsibility scores. The approach and the programs can be applied with black-box models, and also with models that can be specified as logic programs, such as rule-based classifiers. The main focus of this study is on the specification and computation of best counterfactual entities, that is, those that lead to maximum responsibility scores. From them one can read off the explanations as maximum responsibility feature values in the original entity. We also extend the programs to bring into the picture semantic or domain knowledge. We show how the approach could be extended by means of probabilistic methods, and how the underlying probability distributions could be modified through the use of constraints. Several examples of programs written in the syntax of the DLV ASP-solver, and run with it, are shown.

RL2 2022

Ir a la publicación

Publisher: arXiv, Link>

ABSTRACT

Weakly-Sticky(WS) Datalog+/- is an expressive member of the family of Datalog+/- program classes that is defined on the basis of the conditions of stickiness and weak-acyclicity. Conjunctive query answering (QA) over the WS programs has been investigated, and its tractability in data complexity has been established. However, the design and implementation of practical QA algorithms and their optimizations have been open. In order to fill this gap, we first study Sticky and WS programs from the point of view of the behavior of the chase procedure. We extend the stickiness property of the chase to that of generalized stickiness of the chase (GSCh) modulo an oracle that selects (and provides) the predicate positions where finitely values appear during the chase. Stickiness modulo a selection function S that provides only a subset of those positions defines sch(S), a semantic subclass of GSCh. Program classes with selection functions include Sticky and WS, and another syntactic class that we introduce and characterize, namely JWS, of jointly-weakly-sticky programs, which contains WS. The selection functions for these last three classes are computable, and no external, possibly non-computable oracle is needed. We propose a bottom-up QA algorithm for programs in the class sch(S), for a general selection function S. As a particular case, we obtain a polynomial-time QA algorithm for JWS and weakly-sticky programs. Unlike WS, JWS turns out to be closed under magic-sets query optimization. As a consequence, both the generic polynomial-time QA algorithm and its magic-set optimization can be particularized and applied to WS.

RL2 2022

Ir a la publicación

Publisher: Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Link>

ABSTRACT

Database, Knowledgebase and Software systems, or their logical specifications, may become inconsistent in the sense of containing contradictory pieces of information. Since these types of technology are at some level based on classical logic, there is the major problem that in classical logic, any formula is implied by a contradiction. This therefore raises the need to circumvent this fundamental property of classical logic whilst supporting as much as possible of classical logic for these technologies. To address this, several new logics, with new formalisms, semantics and/or deductive systems, that can accommodate classical inconsistencies without becoming trivial, have been proposed. These logics are starting to be used in databases, knowledgebases and software specifications. In addition, we need strategies for analysing inconsistent information. This need has in part driven the approach of argumentation systems which compare pros and cons for potential conclusions from conflicting information. Also important are strategies for isolating inconsistency and for taking appropriate actions, including resolution actions. This calls for uncertainty reasoning and meta-level reasoning. Furthermore, the cognitive activities involved in reasoning with inconsistent information need to be directly related to the kind of inconsistency. So, in general, we see the need for inconsistency tolerance giving rise to a range of technologies for inconsistency management. We are now at an exciting stage in this direction. Rich foundations are being established, and a number of interesting and complementary application areas are being explored in decision-support …

RL2 2022

Ir a la publicación

Publisher: arXiv, Link>

ABSTRACT

In Machine Learning, the SHAP-score is a version of the Shapley value that is used to explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is an intractable problem, we prove a strong positive result stating that the SHAP-score can be computed in polynomial time over deterministic and decomposable Boolean circuits. Such circuits are studied in the field of Knowledge Compilation and generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees and Ordered Binary Decision Diagrams (OBDDs). We also establish the computational limits of the SHAP-score by observing that computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider. It also implies that computing SHAP-scores is intractable as well over the class of propositional formulas in DNF. Based on this negative result, we look for the existence of fully-polynomial randomized approximation schemes (FPRAS) for computing SHAP-scores over such class. In contrast to the model counting problem for DNF formulas, which admits an FPRAS, we prove that no such FPRAS exists for the computation of SHAP-scores. Surprisingly, this negative result holds even for the class of monotone formulas in DNF. These techniques can be further extended to prove another strong negative result: Under widely believed complexity assumptions, there is no polynomial-time algorithm that checks, given a monotone DNF formula φ and features x,y, whether the SHAP-score of x in φ is smaller than the SHAP-score of y in φ.

RL2 2022

Ir a la publicación

Publisher: ACM SIGMOD Record, Link>

ABSTRACT

Database tuples can be seen as players in the game of jointly realizing the answer to a query. Some tuples may contribute more than others to the outcome, which can be a binary value in the case of a Boolean query, a number for a numerical aggregate query, and so on. To quantify the contributions of tuples, we use the Shapley value that was introduced in cooperative game theory and has found applications in a plethora of domains. Specifically, the Shapley value of an individual tuple quantifies its contribution to the query. We investigate the applicability of the Shapley value in this setting, as well as the computational aspects of its calculation in terms of complexity, algorithms, and approximation.

RL2 2022

Ir a la publicación

Publisher: arXiv, Link>

ABSTRACT

There are some recent approaches and results about the use of answer-set programming for specifying counterfactual interventions on entities under classification, and reasoning about them. These approaches are flexible and modular in that they allow the seamless addition of domain knowledge. Reasoning is enabled by query answering from the answer-set program. The programs can be used to specify and compute responsibility-based numerical scores as attributive explanations for classification results.

RL1 2022

Ir a la publicación

Publisher: arXiv, Link>

ABSTRACT

We describe some recent approaches to score-based explanations for query answers in databases and outcomes from classification models in machine learning. The focus is on work done by the author and collaborators. Special emphasis is placed on declarative approaches based on answer-set programming to the use of counterfactual reasoning for score specification and computation. Several examples that illustrate the flexibility of these methods are shown.

RL2 2022

Ir a la publicación

Publisher: arXiv, Link>

ABSTRACT

Consistent answers to a query from a possibly inconsistent database are answers that are simultaneously retrieved from every possible repair of the database. Repairs are consistent instances that minimally differ from the original inconsistent instance. It has been shown before that database repairs can be specified as the stable models of a disjunctive logic program. In this paper we show how to use the repair programs to transform the problem of consistent query answering into a problem of reasoning w.r.t. a theory written in second-order predicate logic. It also investigated how a first-order theory can be obtained instead by applying second-order quantifier elimination techniques.

RL2 2022

Ir a la publicación

info@cenia.cl

Edificio de Innovación UC, Piso 2
Vicuña Mackenna 4860
Macul, Chile

Leopoldo Bertossi

PUBLICACIONES

3 Overview of Talks 3.1 SHAP Explanations with Booleans Circuit Classifiers

Answer-Set Programs for Reasoning about Counterfactual Interventions and Responsibility Scores for C

Declarative Approaches to Counterfactual Explanations for Classification

Extending Sticky-Datalog+/- via Finite-Position Selection Functions: Tractability, Algorithms, and O

Inconsistency Tolerance (Dagstuhl Seminar 03241)

On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-A

Query Games in Databases

Reasoning about Counterfactuals and Explanations: Problems, Results and Directions

Score-Based Explanations in Data Management and Machine Learning: An Answer-Set Programming Approach

Second-Order Specifications and Quantifier Elimination for Consistent Query Answering in Databases