From September 2022, I will join the Institute for Language, Cognition and Communication (ILCC) at the School of Informatics, University of Edinburgh!
And there is more! I have funding for multiple PhD students: if you are interested in working with me, make sure to apply either to the UKRI CDT in Natural Language Processing or to the ILCC 3-year PhD program.
In general, I care about anything that can help Deep Learning models become more data-efficient, statistically robust, and explainable. As Artificial Intelligence and Machine Learning systems become more pervasive in areas like critical infrastructure, education, and healthcare, there is an increasing need for AI-based systems that we can trust. For example, the European Union is working on a new set of regulations that will require AI-based systems used in high-risk areas to provide high-quality explanations to their users and to achieve high levels of robustness and accuracy, among other things. This will automatically exclude the vast majority of the Deep Learning systems that we love and work with on a daily basis.
My research focuses on filling this gap by developing Deep Learning systems that can produce faithful explanations, learn from fewer examples (e.g. thanks to stronger inductive biases), and work even on out-of-distribution samples (such as adversarial inputs).
If you want to know a bit more about my research in these directions so far, here are some pointers. Let me know if any of these clicks with you, and feel free to reach out!
Bridging Neural and Symbolic Computation
One way I am trying to address some of the limitations of modern Deep Learning models is by designing hybrid approaches that inherit the strengths of both neural and symbolic systems.
For example, let’s consider the problem of answering complex symbolic queries over (potentially very large) Knowledge Graphs. In our paper Complex Query Answering with Neural Link Predictors, presented at ICLR 2021, we proposed a hybrid approach that reduces the query answering task to an optimisation problem whose structure follows the compositional logical structure of the query. Using orders of magnitude less training data, our approach obtains significant improvements over the purely neural state-of-the-art models in this space, while also being able to produce faithful explanations for its users. This paper received an Outstanding Paper Award at ICLR 2021.
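The core idea can be sketched in a few lines. Below is a toy, self-contained illustration (random embeddings, a DistMult-style scorer, and hypothetical entity/relation indices — none of this is the paper’s actual code): a two-hop query is answered by scoring each atom with the link predictor, relaxing the conjunction with the product t-norm, and taking the best-scoring binding for the existential variable, which doubles as an explanation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings for entities and relations (hypothetical sizes).
n_entities, n_relations, dim = 5, 2, 8
E = rng.normal(size=(n_entities, dim))
R = rng.normal(size=(n_relations, dim))

def score(s, p, o):
    """DistMult-style atom score in [0, 1] -- a stand-in for a trained link predictor."""
    return 1.0 / (1.0 + np.exp(-np.sum(E[s] * R[p] * E[o])))

def answer_path_query(anchor, p1, p2):
    """Answer ?V . exists X : p1(anchor, X) AND p2(X, V).

    The conjunction is relaxed with the product t-norm, and the existential
    variable X is handled by keeping the best-scoring intermediate entity,
    so every answer comes with an explicit witness for X (the explanation)."""
    best = {}
    for v in range(n_entities):
        candidates = [(score(anchor, p1, x) * score(x, p2, v), x)
                      for x in range(n_entities)]
        best[v] = max(candidates)  # (t-norm score, witness entity for X)
    return sorted(best.items(), key=lambda kv: -kv[1][0])

ranking = answer_path_query(anchor=0, p1=0, p2=1)
top_answer, (top_score, witness) = ranking[0]
print(top_answer, witness, round(top_score, 3))
```

Because the pretrained link predictor is reused as-is, no extra training data is needed for complex queries — only the (cheap) search or optimisation at query time.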
Or consider the problem of deductive reasoning – i.e. deriving logical conclusions from a set of premises. Previous research shows that even BERT-based models do not generalise properly when asked to perform reasoning tasks that differ from those observed during training – e.g. because they require composing multiple reasoning patterns that were never observed together at training time. We proposed several approaches for solving this problem by designing neural models whose behaviour mimics that of logical deductive reasoners. Our approaches enable neural models to perform multi-hop reasoning over multiple documents (ACL 2019) and to learn logic rules from graph-structured data (ICML 2020 and AAAI 2020).
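To make “mimicking a deductive reasoner” concrete, here is a minimal backward-chaining sketch. In the actual models the matching of facts against rule premises is done softly in embedding space; in this illustration (toy facts, confidences, and names all invented for the example) matching is exact and `min` plays the role of a soft AND over premises.

```python
from itertools import product

# Toy knowledge base: facts with confidence scores (as a neural link predictor
# might produce) and one First-Order rule. All names are illustrative.
facts = {("parent", "ann", "bob"): 0.9, ("parent", "bob", "carl"): 0.8}
rules = [(("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")])]
constants = {c for fact in facts for c in fact[1:]}

def unify(pattern, ground, subst):
    """Match a pattern (uppercase terms are variables) against a ground atom."""
    subst = dict(subst)
    for p, g in zip(pattern, ground):
        if p[0].isupper():
            if subst.get(p, g) != g:
                return None
            subst[p] = g
        elif p != g:
            return None
    return subst

def prove(goal, depth=2):
    """Best confidence for a ground goal; min acts as a soft AND over premises."""
    best = facts.get(goal, 0.0)
    if depth == 0:
        return best
    for head, body in rules:
        subst = unify(head, goal, {})
        if subst is None:
            continue
        free = sorted({t for atom in body for t in atom
                       if t[0].isupper() and t not in subst})
        for binding in product(constants, repeat=len(free)):
            s = {**subst, **dict(zip(free, binding))}
            conf = min(prove(tuple(s.get(t, t) for t in atom), depth - 1)
                       for atom in body)
            best = max(best, conf)
    return best

print(prove(("grandparent", "ann", "carl")))  # chains two facts through the rule
```

The multi-hop structure is explicit here: the proof of `grandparent(ann, carl)` composes two `parent` facts, exactly the kind of composition that purely pattern-matching models struggle to generalise to.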
More recently, we wondered whether it is possible to incorporate black-box algorithmic components, like Dijkstra’s shortest-path algorithm or an ILP solver, into a neural model. In our paper Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions, presented at NeurIPS 2021, we developed a very general (and extremely simple!) method for back-propagating through a wide variety of algorithmic components, effectively allowing neural models to use them as off-the-shelf components. See our presentation of this paper, as well as Yannic Kilcher’s explanation.
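The gist of the method fits in a few lines. Below is a deterministic toy sketch in the spirit of I-MLE (the paper also involves perturbation-based sampling, omitted here; the top-k “solver”, step sizes, and target are my own illustrative choices): the discrete solver is called twice — once on the current input and once on an input nudged by the downstream gradient — and the difference of the two solutions serves as the gradient estimate.

```python
import numpy as np

def topk_map(theta, k):
    """Black-box combinatorial 'solver': the MAP state of a top-k distribution."""
    z = np.zeros_like(theta)
    z[np.argsort(-theta)[:k]] = 1.0
    return z

def imle_grad(theta, k, grad_z, lam=1.0):
    """I-MLE-style estimate of d(loss)/d(theta): the difference between the
    solver's output at theta and at a target-perturbed input theta'."""
    z = topk_map(theta, k)
    z_prime = topk_map(theta - lam * grad_z, k)
    return (z - z_prime) / lam

# Toy usage: learn scores theta so the solver selects the first two items.
theta = np.array([0.1, 0.2, 0.9, 0.8, 0.3])
target = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
for _ in range(50):
    z = topk_map(theta, k=2)
    grad_z = 2.0 * (z - target)  # gradient of ||z - target||^2 w.r.t. z
    theta -= 0.5 * imle_grad(theta, k=2, grad_z=grad_z)
```

Note that only forward calls to the solver are ever made — which is exactly why any black-box combinatorial algorithm can be dropped in.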
Incorporating Constraints in Neural Models
Sometimes we would like a neural model to comply with a given set of constraints. For example, when our model predicts that $X$ is a parent of $Y$ and $Y$ is a parent of $Z$, we would also like it to predict that $X$ is a grandparent of $Z$. Constraints are key for developing statistically robust models – think, for example, of adversarial perturbations in computer vision. In that case, the model is essentially violating a single constraint: given an image $X$, if $Y$ is a semantically invariant perturbation of $X$, the model should produce the same output for both $X$ and $Y$.
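Such a rule can be turned into a differentiable penalty. A minimal sketch, using the Gödel t-norm (`min`) for the conjunction — one common choice among several, and a formulation I am using here purely for illustration: the violation is how much the truth of the rule body exceeds the truth of its head.

```python
def rule_violation(p_parent_xy, p_parent_yz, p_grandparent_xz):
    """Degree to which parent(X,Y) AND parent(Y,Z) => grandparent(X,Z) is
    violated: body truth (Goedel t-norm: min) minus head truth, clipped at 0."""
    body = min(p_parent_xy, p_parent_yz)
    return max(0.0, body - p_grandparent_xz)

# If the model is confident in both parent facts but not in the grandparent
# fact, the rule is violated and the penalty is positive.
penalty = rule_violation(0.9, 0.8, 0.1)
print(penalty)
```

Adding such penalties to the training loss nudges the model’s predictions towards logical consistency without changing its architecture.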
In our paper Adversarial Sets for Regularising Neural Link Predictors, presented at UAI 2017, we presented the first method for incorporating arbitrary constraints, encoded as First-Order Logic rules, into a wide class of neural models. The idea is simple and general: at each training step, an adversary finds the inputs on which the model maximally violates a given constraint, and the model is then required to reduce the degree of such violations. We also show that, for a wide class of models and constraint types, the problem of finding where the model maximally violates a constraint admits efficient, globally optimal solutions. This is pretty amazing, since (1) it makes the training procedure extremely efficient, adding very little overhead, and (2) if the search process does not return any significant violation of a constraint, it means that the model will never violate that constraint, for any possible input it may encounter. This provides a way of producing safety guarantees for a large class of neural models, which are very desirable in high-risk settings.
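To see why the inner adversarial search can be both efficient and globally optimal, here is a toy sketch (my own illustrative setup, not the paper’s construction): for a bilinear link predictor $s(x, y) = x^\top M y$ and a symmetry constraint $s(x, y) = s(y, x)$, the worst-case violation over unit-norm embeddings is $\max_{x, y} x^\top (M - M^\top) y$ — a singular-vector problem, solvable exactly by power iteration.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))  # bilinear relation matrix of a toy link predictor
A = M - M.T                  # violation of symmetry: score(x, y) - score(y, x) = x^T A y

def max_violation(A, iters=1000):
    """Adversary: unit-norm embeddings (x, y) maximising x^T A y.
    Power iteration converges to the top singular pair of A, i.e. the
    globally worst violation -- no gradient-based inner loop needed."""
    y = rng.normal(size=A.shape[0])
    y /= np.linalg.norm(y)
    for _ in range(iters):
        x = A @ y
        x /= np.linalg.norm(x)
        y = A.T @ x
        y /= np.linalg.norm(y)
    return x, y, float(x @ A @ y)

x_adv, y_adv, violation = max_violation(A)
print(round(violation, 4))
```

If `violation` comes out (numerically) zero, no input can ever make the model break the symmetry constraint — which is the kind of guarantee mentioned above.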
We explored further applications of these ideas in several settings. For example, in Adversarially Regularising Neural NLI Models to Integrate Logical Background Knowledge (CoNLL 2018), we show that some common-sense reasoning patterns can also be represented as constraints, and incorporating these in neural Natural Language Inference (NLI) models yields improvements on both in-distribution and out-of-distribution data. In Gone At Last: Removing the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training (EMNLP 2020), we show that we can use ensembles of adversaries for de-biasing neural NLI models. In Undersensitivity in Neural Reading Comprehension (Findings of EMNLP 2020), we found that neural Question Answering (QA) models can often ignore semantically meaningful variations in the input questions, and proposed a related training process for correcting such behaviour. In Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations (ACL 2020), we identified that models for producing natural language explanations often violate self-consistency constraints, and can produce mutually inconsistent explanations.