<p>neuralnoise.com – Homepage of Dr Pasquale Minervini, Researcher at the University of Edinburgh, School of Informatics<br/>
<a href="http://www.neuralnoise.com/">http://www.neuralnoise.com/</a> – last updated: Mon, 12 Dec 2022 10:29:16 +0100</p>
<h2 id="looking-for-postdocs">Looking for Postdocs!</h2>
<p>We have an opening for a 2-year postdoc – <a href="https://elxw.fa.em3.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1001/job/5583">more details are available here</a> – on a project titled <a href="https://web.inf.ed.ac.uk/eliai/projects/gradient-based-learning-of-complex-latent-structur">Gradient-based Learning of Complex Latent Structures</a>, with me as the Principal Investigator (PI), and <a href="http://nolovedeeplearning.com/">Antonio Vergari</a> (<a href="https://web.inf.ed.ac.uk/anc">IANC</a>) and <a href="https://ducdauge.github.io/">Edoardo Ponti</a> (<a href="https://web.inf.ed.ac.uk/ilcc">ILCC</a>) as co-PIs. The position is entirely funded by the <a href="https://web.inf.ed.ac.uk/eliai">Edinburgh Laboratory for Integrated Artificial Intelligence</a> (ELIAI) – if you want to know more, feel free to reach out!</p>
<p>You can apply <a href="https://elxw.fa.em3.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1001/job/5583">at this link</a>.</p>
<h3 id="project-description">Project description</h3>
<p>Imposing structural constraints on the latent representations learned by deep neural models has several applications, which can improve their explainability, their robustness, and their ability to generalise to out-of-domain distributions. For example, we can learn more explainable models by making them selectively decide which parts of the input to consider; and we can improve their generalisation properties by learning representations suitable for reasoning tasks, such as deductive reasoning and planning, and comply with any desired constraints. For instance, the intermediate structure can represent a relational graph between objects in the world; the relationships between multiple sub-questions in a complex question; or computation graphs which can be executed to produce a prediction.</p>
<p>In this project, we aim to investigate how we can derive better methods for back-propagating through mixed continuous-discrete complex latent structures, and how we can leverage them for learning more explainable, data-efficient, and robust deep neural models. The reason why discrete latent representations are not widely adopted by deep neural models is that they tend to not interact well with gradient-based optimisation methods, but this started to change recently (e.g., see <a href="https://arxiv.org/abs/2106.01798">Niepert et al., 2021</a>; <a href="https://arxiv.org/abs/2209.04862">Minervini et al. 2022</a>), enabling a wide range of applications and use cases.</p>
<p>Related papers:</p>
<ul>
<li>Niepert, Minervini, and Franceschi - <a href="https://arxiv.org/abs/2106.01798">Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions</a>. NeurIPS 2021</li>
<li>Minervini, Franceschi, and Niepert - <a href="https://arxiv.org/abs/2209.04862">Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models</a>. AAAI 2023</li>
<li>Ahmed, Teso, Chang, Van den Broeck, Vergari - <a href="https://arxiv.org/abs/2206.00426">Semantic Probabilistic Layers for Neuro-Symbolic Learning</a>. NeurIPS 2022</li>
</ul>
<h3 id="position">Position</h3>
<p>The post holder will work on projects involving the design and application of deep learning models with discrete latent structures for improving their explainability, generalisation, and robustness properties. They will be part of the new <a href="https://web.inf.ed.ac.uk/eliai">Edinburgh Laboratory for Integrated Artificial Intelligence</a> and the <a href="https://edinburghnlp.inf.ed.ac.uk/">Edinburgh NLP Group</a>, a world-leading research group in Natural Language Processing.</p>
<p>The School of Informatics is one of the largest research centres in Computer Science in Europe, and it has been <a href="https://www.ed.ac.uk/informatics/news-events/stories/2022/informatics-ref2021-results-global-reach-genuine-i">ranked #1 in the UK</a> in terms of research power by a large margin. The Edinburgh NLP Group is consistently ranked among the <a href="https://csrankings.org/#/index?nlp&world">world’s leading research groups</a> in Natural Language Processing. We are offering an exciting opportunity to work in an interdisciplinary, collaborative, friendly, and supportive environment, integrating different sub-fields of Computer Science and Artificial Intelligence.</p>
<p>Published: Tue, 01 Nov 2022 – <a href="http://www.neuralnoise.com/2022/postdoc/">http://www.neuralnoise.com/2022/postdoc/</a><br/>
Tags: machine learning, natural language processing, knowledge graphs, neuro-symbolic reasoning, academia, edinburgh, postdoc</p>
<h2 id="phd-projects">PhD Projects</h2>
<p>As mentioned <a href="/2021/research-interests/">here</a>, in September 2022 I joined the <a href="https://www.research.ed.ac.uk/en/organisations/institute-of-language-cognition-and-computation">Institute for Language, Cognition and Communication</a> (ILCC) at the <a href="https://www.ed.ac.uk/informatics">School of Informatics</a>, <a href="https://www.ed.ac.uk/">University of Edinburgh</a>, one of the <a href="https://csrankings.org/#/fromyear/2016/toyear/2022/index?nlp&world">world’s best schools in NLP and related areas</a>, as a faculty member in NLP! If you are interested in working with me, I have funding for multiple PhD students: make sure to apply either to the <a href="https://web.inf.ed.ac.uk/cdt/natural-language-processing">UKRI CDT in Natural Language Processing</a> or to the <a href="http://www.ilcc.inf.ed.ac.uk/study/possible-phd-topics-in-ilcc">ILCC 3-year PhD program</a>!</p>
<p>Some more details on the <a href="https://web.inf.ed.ac.uk/ilcc/study-with-us/studentships/linguistics-speech-technology-cognitive-science">ILCC PhD program</a> – there are <a href="https://web.inf.ed.ac.uk/ilcc/study-with-us/studentships/linguistics-speech-technology-cognitive-science">two deadlines for applying</a>: the first round is on 25th November 2022, and the second round is on 27th January 2023. I strongly recommend that non-UK applicants submit their applications in the first round, to maximise their chances of funding.</p>
<p>Regarding the <a href="https://web.inf.ed.ac.uk/cdt/natural-language-processing">NLP CDT program</a> – there are <a href="https://web.inf.ed.ac.uk/cdt/natural-language-processing/apply">also two deadlines for applying</a>: the first round is on 25th November 2022, and the second round is on 27th January 2023. Likewise, I strongly recommend that non-UK applicants submit their applications in the first round, to maximise their chances of funding.</p>
<p>If you are interested in working with me, you can apply via the ILCC PhD program’s and the NLP CDT program’s application portals. You will be asked to submit a research proposal: this is mostly used for assessing candidate PhD students and for matching them with potential faculty supervisors, and you can decide to work on different problems during your PhD. If you would like some feedback on your research proposal, get in touch!</p>
<p>Below is a (non-exhaustive but fairly up-to-date) list of PhD topics we may decide to work on – this list is also available on the <a href="https://web.inf.ed.ac.uk/ilcc/study-with-us/possible-phd-topics-ilcc/language-processing-computational-linguistics">Possible PhD topics in ILCC</a> page. An older list of possible research topics is also available <a href="/2021/research-interests/">at this link</a> – and feel free to propose new project topics that interest you: I’m always happy to explore new directions!</p>
<h4 id="open-domain-complex-question-answering-at-scale">Open-Domain Complex Question Answering at Scale</h4>
<p>Open-Domain Question Answering (ODQA) is a task where a system needs to generate the answer to a given general-domain question, and the evidence is not given as input to the system. A core limitation of modern ODQA models (and, more generally, of all models for solving <a href="https://aclanthology.org/2021.naacl-main.200/">knowledge-intensive tasks</a>) is that they remain limited to answering simple, factoid questions, where the answer to the question is explicit in a single piece of evidence. In contrast, complex questions involve aggregating information from multiple documents, requiring some form of logical reasoning and sequential, multi-hop processing in order to generate the answer. Projects in this area involve proposing new ODQA models for answering complex questions – for example, by taking inspiration from models for answering complex queries in Knowledge Graphs (<a href="https://arxiv.org/abs/2011.03459">Arakelyan et al., 2021</a>; <a href="https://www.ijcai.org/proceedings/2022/741">Minervini et al., 2022a</a>) and Neural Theorem Provers (<a href="https://arxiv.org/abs/2007.06477">Minervini et al., 2020a</a>; <a href="https://arxiv.org/abs/1912.10824">Minervini et al., 2020b</a>) – and proposing methods by which neural ODQA models can learn to search in massive text corpora, such as the entire Web.</p>
<h4 id="neuro-symbolic-and-hybrid-discrete-continuous-natural-language-processing-models">Neuro-Symbolic and Hybrid Discrete-Continuous Natural Language Processing Models</h4>
<p>Incorporating discrete components, such as discrete decision steps and symbolic reasoning algorithms, in neural models can significantly improve their interpretability, data efficiency, and predictive properties — for example, see (<a href="https://arxiv.org/abs/2106.01798">Niepert et al., 2021</a>; <a href="https://arxiv.org/abs/2209.04862">Minervini et al., 2022b</a>; <a href="https://www.ijcai.org/proceedings/2022/741">Minervini et al., 2020a</a>; <a href="https://arxiv.org/abs/1912.10824">Minervini et al., 2020b</a>). However, approaches in this space rely either on ad-hoc continuous relaxations (e.g., <a href="https://www.ijcai.org/proceedings/2022/741">Minervini et al., 2020a</a>, <a href="https://arxiv.org/abs/1912.10824">Minervini et al., 2020b</a>) or on gradient estimation techniques that require some assumptions on the distributions of the discrete variables (<a href="https://arxiv.org/abs/2106.01798">Niepert et al., 2021</a>; <a href="https://arxiv.org/abs/2209.04862">Minervini et al., 2022b</a>). Projects in this area involve devising neuro-symbolic approaches for solving NLP tasks that require some degree of reasoning and compositionality, and identifying gradient estimation techniques (for back-propagating through discrete decision steps) that are data-efficient, hyperparameter-free, and accurate, while requiring fewer assumptions on the distribution of the discrete variables.</p>
<h4 id="learning-from-graph-structured-data">Learning from Graph-Structured Data</h4>
<p>Graph-structured data is everywhere – e.g. consider Knowledge Graphs, social networks, protein and drug interaction networks, and molecular profiles. In this project, we aim to improve models for learning from graph-structured data and their evaluation protocols. Projects in this area involve incorporating invariances and constraints in graph machine learning models (e.g., see <a href="https://arxiv.org/abs/1707.07596">Minervini et al., 2017</a>), proposing methods of transferring knowledge between graph representations, automatically identifying functional inductive biases for learning from graphs from a given domain (such as Knowledge Graphs – for example, see <a href="https://arxiv.org/abs/2207.09980">our NeurIPS 2022 paper on incorporating the inductive biases used by factorisation-based models into GNNs</a>) and proposing techniques for explaining the output of black-box graph machine learning methods (such as graph embeddings).</p>
<p>Published: Sat, 01 Oct 2022 – <a href="http://www.neuralnoise.com/2022/phd-projects/">http://www.neuralnoise.com/2022/phd-projects/</a><br/>
Tags: machine learning, natural language processing, knowledge graphs, neuro-symbolic reasoning, academia, edinburgh, phd</p>
<h2 id="call-for-phd-students">Call for PhD Students</h2>
<p>From September 2022, I will join the <a href="https://www.research.ed.ac.uk/en/organisations/institute-of-language-cognition-and-computation">Institute for Language, Cognition and Communication</a> (ILCC) at the <a href="https://www.ed.ac.uk/informatics">School of Informatics</a>, <a href="https://www.ed.ac.uk/">University of Edinburgh</a>!</p>
<p>And there is more! I have <strong>funding for multiple PhD students</strong>: if you are interested in working with me, make sure to apply either to the <a href="https://web.inf.ed.ac.uk/cdt/natural-language-processing">UKRI CDT in Natural Language Processing</a> or to the <a href="http://www.ilcc.inf.ed.ac.uk/study/possible-phd-topics-in-ilcc">ILCC 3-year PhD program</a>.</p>
<p>In general, I care about anything that can help Deep Learning models become more <em>data-efficient</em>, <em>statistically robust</em>, and <em>explainable</em>. As Artificial Intelligence and Machine Learning systems become more pervasive in areas like critical infrastructure, education, and healthcare, there is an increasing need for AI-based systems that we can <strong>trust</strong>.
For example, the European Union is working on a <a href="https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai">new set of regulations</a> that will require AI-based systems used in high-risk areas to produce high-quality explanations for their users and to meet high standards of robustness and accuracy, among other things.
These requirements would automatically exclude the vast majority of the Deep Learning systems that we love and work with on a daily basis.</p>
<p>My research focuses on <em>filling this gap</em>: developing Deep Learning systems that can produce <em>faithful explanations</em>, that can learn from fewer examples (e.g. thanks to stronger inductive biases), and that can work even on out-of-distribution samples (such as adversarial inputs).</p>
<p>You may want to know a bit more about my research so far in these directions – here are some pointers. Let me know if any of these resonates with you, and feel free to reach out!</p>
<h3 id="bridging-neural-and-symbolic-computation">Bridging Neural and Symbolic Computation</h3>
<p>One way I am trying to address some of the limitations of modern Deep Learning models is by designing <em>hybrid</em> approaches that inherit the strengths of both neural and symbolic systems.</p>
<p>For example, let’s consider the problem of answering complex symbolic queries over (potentially very large) Knowledge Graphs. In our paper <a href="https://arxiv.org/abs/2011.03459">Complex Query Answering with Neural Link Predictors</a>, presented at <a href="https://iclr.cc/Conferences/2021">ICLR 2021</a>, we presented a hybrid approach where the query answering task is reduced to an optimisation problem whose structure follows the compositional logical structure of the query. Using orders of magnitude less training data, our approach obtains significant improvements over the purely neural state-of-the-art models developed in this space, while also being able to produce faithful explanations for its users. This paper obtained an <a href="https://iclr-conf.medium.com/announcing-iclr-2021-outstanding-paper-awards-9ae0514734ab">Outstanding Paper Award</a> at <a href="https://iclr.cc/Conferences/2021">ICLR 2021</a>.</p>
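To give a flavour of this reduction, here is a toy sketch in NumPy. The score matrices and entity names are made up, and the Gödel t-norm (min) for conjunction is just one possible choice – this is an illustration of the general idea, not the exact procedure from the paper.

```python
import numpy as np

# Made-up neural link predictor scores in [0, 1]:
# S_r[i, j] ~ plausibility of the edge r(entity_i, entity_j).
S_r1 = np.array([[0.9, 0.1, 0.2],
                 [0.0, 0.8, 0.3],
                 [0.1, 0.2, 0.7]])
S_r2 = np.array([[0.1, 0.9, 0.0],
                 [0.2, 0.1, 0.8],
                 [0.9, 0.3, 0.1]])

def answer_path_query(S1, S2, anchor):
    """Score each candidate answer Z of the query
    ∃Y. r1(anchor, Y) ∧ r2(Y, Z), using the Gödel t-norm (min)
    for the conjunction and max for the existential quantifier."""
    # conj[y, z] = min(score of r1(anchor, y), score of r2(y, z))
    conj = np.minimum(S1[anchor][:, None], S2)
    # Optimise over the existential variable Y for each answer Z.
    return conj.max(axis=0)

scores = answer_path_query(S_r1, S_r2, anchor=0)
best_answer = int(np.argmax(scores))
```

Because the query score is built compositionally from individual link prediction scores, the intermediate variable assignments (which `Y` maximised the score) can be read off directly, which is what enables the faithful explanations mentioned above.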
<p>Or, for example, let’s consider the problem of <em>deductive reasoning</em> – i.e. deriving logical conclusions. <a href="https://arxiv.org/abs/1908.06177">Previous research</a> shows that even BERT-based models do not generalise properly when required to perform reasoning tasks that differ from those observed during training – e.g. because they require composing multiple reasoning patterns that were never observed together at training time. We proposed several approaches for solving this problem, by designing neural models whose behaviour mimics that of logical deductive reasoners. Our approaches <a href="https://arxiv.org/abs/1906.06187">enable neural models to perform multi-hop reasoning over multiple documents</a> (<a href="https://acl2019.org/EN/index.xhtml.html">ACL 2019</a>), and <a href="">learn logic rules from graph-structured data</a> (<a href="https://icml.cc/Conferences/2020">ICML 2020</a> and <a href="https://aaai.org/Conferences/AAAI-20/">AAAI 2020</a>).</p>
<p>More recently, we were wondering whether it could be possible to incorporate black-box algorithmic components – like <a href="https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm">Dijkstra’s shortest path algorithm</a> or any <a href="https://en.wikipedia.org/wiki/Integer_programming">ILP solver</a> – in a neural model. In our paper <a href="https://arxiv.org/abs/2106.01798">Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions</a>, presented at <a href="https://nips.cc/Conferences/2021/">NeurIPS 2021</a>, we developed a very general (and extremely simple!) method for back-propagating through a massive variety of algorithmic components, effectively allowing neural models to use them as off-the-shelf components. See <a href="https://www.youtube.com/watch?v=hb2b0K2PTxI">our presentation</a> of this paper, as well as <a href="https://www.youtube.com/watch?v=W2UT8NjUqrk">Yannic Kilcher’s explanation</a>.</p>
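The core perturb-and-MAP idea behind this line of work can be sketched in a few lines. The snippet below is a deliberately simplified, illustrative variant for a one-hot argmax "solver" with made-up numbers – not the exact estimator, noise distribution, or hyperparameters from the paper: the gradient with respect to the parameters is approximated by the difference between the MAP state under perturbed parameters and the MAP state of a target distribution whose parameters are nudged against the downstream gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot_argmax(theta):
    """A 'black-box' combinatorial solver: here, just argmax."""
    z = np.zeros_like(theta)
    z[np.argmax(theta)] = 1.0
    return z

def perturbed_map_gradient(theta, dloss_dz, lam=10.0, n_samples=100):
    """I-MLE-style gradient estimate for dLoss/dtheta: compare the MAP
    state of the (perturbed) model with the MAP state of a target
    distribution shifted against the downstream gradient dloss_dz."""
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        noise = rng.gumbel(size=theta.shape)
        z = one_hot_argmax(theta + noise)
        z_target = one_hot_argmax(theta + noise - lam * dloss_dz)
        grad += (z - z_target) / lam
    return grad / n_samples

# Downstream loss increases with z[0] and decreases with z[2], so gradient
# descent on theta should shift probability mass towards index 2.
theta = np.zeros(3)
dloss_dz = np.array([1.0, 0.0, -1.0])
grad = perturbed_map_gradient(theta, dloss_dz)
```

Taking a descent step `theta -= lr * grad` increases `theta[2]` (its gradient component is negative), which is the behaviour we want from a useful surrogate gradient even though argmax itself has zero gradient almost everywhere.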
<h3 id="incorporating-constraints-in-neural-models">Incorporating Constraints in Neural Models</h3>
<p>Other times, we would like a neural model to <em>comply</em> with a given set of constraints. For example, when our model predicts that <em>$X$ is a parent of $Y$</em> and <em>$Y$ is a parent of $Z$</em>, we would also like it to predict that <em>$X$ is a grandparent of $Z$</em>. Constraints are key for developing statistically robust models – for example, think of <em>adversarial perturbations</em> in computer vision. In the case of adversarial perturbations, the model is essentially violating a single constraint, i.e. <em>given an image $X$, if $Y$ is a semantically-invariant perturbation of $X$, the model should produce the same output for both $X$ and $Y$</em>.</p>
<p>In our paper <a href="http://auai.org/uai2017/proceedings/papers/306.pdf">Adversarial Sets for Regularising Neural Link Predictors</a>, presented at UAI 2017, we presented the first method for incorporating arbitrary constraints encoded in the form of First-Order Logic rules in a wide class of neural models. Our idea is very simple and general: during training, at each step, we can define an <em>adversary</em> that finds on which inputs the model maximally violates a given constraint, and then require the model to reduce the degree of such violations. We also show that, for a wide class of models and constraint types, we can have <em>efficient and globally-optimal</em> solutions to the problem of finding where the model maximally violates a constraint. This is pretty amazing, since (1) it makes the training procedure extremely efficient, adding very little overhead, and (2) if the search process does not return any significant violation of a constraint, it means that <em>the model will never violate that constraint, for every possible input it may encounter</em>. This provides a way of producing some kind of <em>safety guarantees</em> for a large set of neural models, which are very desirable in a lot of high-risk settings.</p>
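To give a concrete feel for such a closed-form solution, here is a toy sketch. It assumes a DistMult-style bilinear scorer and unit-norm adversarial entity embeddings – a simplification for illustration, not the paper's exact setting: for an implication rule, the worst-case violation of the bilinear form can be read off from a single coordinate, with no iterative search needed.

```python
import numpy as np

def max_implication_violation(w_r, w_rp):
    """Worst-case violation of the rule r(X, Y) => r'(X, Y) for a
    DistMult-style scorer s(x, r, y) = sum_i x_i * w_r[i] * y_i,
    searched over unit-norm adversarial embeddings x and y.
    The bilinear form x^T diag(w_r - w_rp) y is maximised in closed
    form by the coordinate with the largest |w_r[i] - w_rp[i]|."""
    d = w_r - w_rp
    i = int(np.argmax(np.abs(d)))
    x = np.zeros_like(d)
    x[i] = 1.0
    y = np.zeros_like(d)
    y[i] = 1.0 if d[i] >= 0 else -1.0
    violation = float(x @ np.diag(d) @ y)  # equals |d[i]|
    return max(0.0, violation), x, y

# Made-up relation weights: the rule is violated by at most 2.0 here.
v, x_adv, y_adv = max_implication_violation(np.array([1.0, 2.0, 3.0]),
                                            np.array([2.0, 2.0, 1.0]))
```

If the returned violation is zero, no unit-norm input pair can make the model break the rule, which is the flavour of safety guarantee described above.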
<p>We explored further applications of these ideas in several settings. For example, in <a href="https://arxiv.org/abs/1808.08609">Adversarially Regularising Neural NLI Models to Integrate Logical Background Knowledge</a> (<a href="https://www.conll.org/2018">CoNLL 2018</a>), we show that some common-sense reasoning patterns can also be represented as constraints, and incorporating these in neural Natural Language Inference (NLI) models yields improvements both on in-distribution and out-of-distribution data. In <a href="https://arxiv.org/abs/2004.07790">Gone At Last: Removing the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training</a> (<a href="https://2020.emnlp.org/">EMNLP 2020</a>), we show that we can use <em>ensembles of adversaries</em> for de-biasing neural NLI models. In <a href="https://arxiv.org/abs/2003.04808">Undersensitivity in Neural Reading Comprehension</a> (<a href="https://2020.emnlp.org/">Findings of EMNLP 2020</a>), we found that neural Question Answering (QA) models can often ignore semantically meaningful variations in the input questions, and proposed a related training process for correcting such behaviour. In <a href="https://arxiv.org/abs/1910.03065">Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations</a> (<a href="https://acl2020.org/">ACL 2020</a>), we identified that models for producing natural language explanations often violate self-consistency constraints, and can produce mutually inconsistent explanations.</p>
<p>Published: Fri, 01 Oct 2021 – <a href="http://www.neuralnoise.com/2021/research-interests/">http://www.neuralnoise.com/2021/research-interests/</a><br/>
Tags: machine learning, natural language processing, academia, edinburgh, phd</p>
<h2 id="gaussian-fields-and-label-propagation">Some notes on Gaussian Fields and Label Propagation</h2>
<p>On several occasions, we find ourselves in need of <em>propagating</em> information among the nodes of an undirected graph.</p>
<p>For instance, consider graph-based Semi-Supervised Learning (SSL): here, labeled and unlabeled examples are represented by an undirected graph, referred to as the <em>similarity graph</em>.</p>
<p>The task consists in finding a <em>label assignment</em> to all examples, such that:</p>
<ol>
<li>The final labeling is consistent with training data (e.g. positive training examples are still classified as positive at the end of the learning process), and</li>
<li>Similar examples are assigned similar labels: this is referred to as the <em>semi-supervised smoothness assumption</em>.</li>
</ol>
<p>Similarly, in networked data such as social networks, we might assume that related entities (such as <em>friends</em>) are associated with similar attributes (such as political and religious views, musical tastes, and so on): in social network analysis, this phenomenon is commonly referred to as <em>homophily</em> (love of the same).</p>
<p>In both cases, propagating information from a limited set of nodes in a graph to all nodes provides a method for predicting the attributes of such nodes, when this information is missing.</p>
<p>In the following, we introduce a really clever method for efficiently propagating information about nodes in undirected graphs, known as the <em>Gaussian Fields</em> method.</p>
<h3 id="propagation-as-a-cost-minimization-problem">Propagation as a Cost Minimization Problem</h3>
<p>We now cast the propagation problem as a binary classification task.
Let $X = \{ x_{1}, x_{2}, \ldots, x_{n} \}$ be a set of $n$ instances, of which only $l$ are labeled: $X^{+}$ denotes the positive examples, and $X^{-}$ the negative examples.</p>
<p>Similarity relations between instances can be represented by means of an undirected similarity graph having adjacency matrix $\mathbf{W} \in \mathbb{R}^{n \times n}$: if two instances are connected in the similarity graph, it means that they are considered <em>similar</em>, and should be assigned the same label.
Specifically, $\mathbf{W}_{ij} > 0$ iff the instances $x_{i}, x_{j} \in X$ are connected by an edge in the similarity graph, and $\mathbf{W}_{ij} = 0$ otherwise.</p>
<p>Let $y_{i} \in \{ \pm 1 \}$ be the label assigned to the $i$-th instance $x_{i} \in X$.
We can encode our assumption that <em>similar instances should be assigned similar labels</em> by defining a quadratic cost function over labeling functions in the form $f : X \mapsto \{ \pm 1 \}$:</p>
\[E(f) = \frac{1}{2} \sum_{x_{i} \in X} \sum_{x_{j} \in X} \mathbf{W}_{ij} \left[ f(x_{i}) - f(x_{j}) \right]^{2}.\]
<p>Given an input labeling function $f$, the cost function $E(\cdot)$ associates, with each pair of instances $x_{i}, x_{j} \in X$, a non-negative cost $\mathbf{W}_{ij} \left[ f(x_{i}) - f(x_{j}) \right]^{2}$: this quantity is $0$ when $\mathbf{W}_{ij} = 0$ (i.e. $x_{i}$ and $x_{j}$ are not linked in the similarity graph), or when $f(x_{i}) = f(x_{j})$ (i.e. they are assigned the same label).</p>
<p>For such a reason, the cost function $E(\cdot)$ favors labeling functions that are more likely to assign the same labels to instances that are linked by an edge in the similarity graph.</p>
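As a quick sanity check, the pairwise cost can also be computed via the Laplacian quadratic form $\mathbf{f}^{T} (\mathbf{D} - \mathbf{W}) \mathbf{f}$, where $\mathbf{D}$ is the diagonal degree matrix. A minimal NumPy sketch on a made-up toy graph:

```python
import numpy as np

# Toy similarity graph over 4 instances (symmetric adjacency matrix).
W = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)

def cost(W, f):
    """E(f) = 1/2 * sum_ij W_ij * (f_i - f_j)^2."""
    diff = f[:, None] - f[None, :]
    return 0.5 * np.sum(W * diff ** 2)

f = np.array([1.0, -1.0, 1.0, -1.0])
D = np.diag(W.sum(axis=1))
L = D - W  # graph Laplacian

# The pairwise cost coincides with the Laplacian quadratic form f^T L f.
e_pairwise = cost(W, f)
e_laplacian = f @ L @ f
```

Here the two linked pairs with mismatched labels, $(x_{0}, x_{1})$ and $(x_{1}, x_{3})$... only edge $(0, 1)$ has mismatched labels, contributing $\frac{1}{2} \cdot 2 \cdot (1 - (-1))^{2} = 4$ to the cost.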
<p>Now, the problem of finding a labeling function that is both consistent with the training labels and assigns similar labels to similar instances can be cast as a <em>cost minimization problem</em>. Let a labeling function $f$ be represented by a vector $\mathbf{f} \in \mathbb{R}^{n}$, let $L \subset X$ denote the labeled instances, and let $\mathbf{y}_{i} \in \{ \pm 1 \}$ denote the label of the $i$-th instance $x_{i}$.
The optimization problem can be defined as follows:</p>
\[\begin{aligned}
& \underset{\mathbf{f} \in \{ \pm 1 \}^{n}}{\text{minimize}}
& & E(\mathbf{f}) \\
& \text{subject to}
& & \forall x_{i} \in L: \; \mathbf{f}_{i} = \mathbf{y}_{i}.
\end{aligned}\]
<p>The constraint $\forall x_{i} \in L : \mathbf{f}_{i} = \mathbf{y}_{i}$ fixes the label of each labeled example $x_{i} \in L$ to $\mathbf{f}_{i} = +1$ if the instance has a positive label, and to $\mathbf{f}_{i} = -1$ if it has a negative label, so as to achieve consistency with the training labels.</p>
<p>However, constraining labeling functions $f$ to only take discrete values has two main drawbacks:</p>
<ul>
<li>Each function $f$ can only provide <em>hard</em> classifications, without yielding any measure of confidence in the provided classification.</li>
<li>The cost term $E(\cdot)$ can be hard to optimize in a multi-label classification setting.</li>
</ul>
<p>To overcome these limitations, Zhu et al. propose a <em>continuous relaxation</em> of the previous optimization problem:</p>
\[\begin{aligned}
& \underset{\mathbf{f} \in \mathbb{R}^{n}}{\text{minimize}}
& & E(\mathbf{f}) + \epsilon \sum_{x_{i} \in X} \mathbf{f}_{i}^{2} \\
& \text{subject to}
& & \forall x_{i} \in L: \; \mathbf{f}_{i} = \mathbf{y}_{i},
\end{aligned}\]
<p>where the term $\sum_{x_{i} \in X} \mathbf{f}_{i}^{2} = \mathbf{f}^{T} \mathbf{f}$ is an $L_{2}$ regularizer over $\mathbf{f}$, weighted by a parameter $\epsilon > 0$, which ensures that the optimization problem has a unique global solution.</p>
<p>The parameter $\epsilon$ can be interpreted as the <em>decay</em> of the propagation process: as the distance from a labeled instance within the similarity graph increases, the confidence in the classification (as measured by the continuous label) gets closer to zero.</p>
<p>This optimization problem has a unique, global solution that can be calculated in closed form. Specifically, the optimal (relaxed) discriminant function $f : X \mapsto \mathbb{R}$ is given by $\mathbf{\hat{f}} = \left[ \mathbf{\hat{f}}_{L}, \mathbf{\hat{f}}_{U} \right]^{T}$, where $\mathbf{\hat{f}}_{L} = \mathbf{y}_{L}$ (i.e. labels for the labeled examples in $L$ coincide with the training labels), while $\mathbf{\hat{f}}_{U}$ is given by:</p>
\[\mathbf{\hat{f}}_{U} = (\mathbf{L}_{UU} + \epsilon \mathbf{I})^{-1} \mathbf{W}_{UL} \mathbf{\hat{f}}_{L},\]
<p>where $\mathbf{L} = \mathbf{D} - \mathbf{W}$ is the <em>graph Laplacian</em> of the similarity graph with adjacency matrix $\mathbf{W}$, and $\mathbf{D}$ is a diagonal matrix such that $\mathbf{D}_{ii} = \sum_{j} \mathbf{W}_{ij}$.</p>
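The closed-form solution is straightforward to implement; here is a minimal NumPy sketch (the chain graph and labels in the usage example are made up for illustration):

```python
import numpy as np

def propagate(W, labeled_idx, y_labeled, eps=1e-2):
    """Gaussian Fields label propagation via the closed-form solution
    f_U = (L_UU + eps * I)^{-1} W_UL f_L, with L = D - W."""
    n = W.shape[0]
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    U = np.array([i for i in range(n) if i not in set(labeled_idx)])
    L_UU = L[np.ix_(U, U)]
    W_UL = W[np.ix_(U, labeled_idx)]
    f_U = np.linalg.solve(L_UU + eps * np.eye(len(U)), W_UL @ y_labeled)
    f = np.zeros(n)
    f[labeled_idx] = y_labeled  # labeled nodes keep their training labels
    f[U] = f_U
    return f

# Chain graph 0-1-2-3-4; node 0 labeled +1, node 4 labeled -1.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0

f = propagate(W, labeled_idx=[0, 4], y_labeled=np.array([1.0, -1.0]))
```

On this chain, the continuous labels interpolate between the two endpoints: the node closest to the positive seed gets a positive score, the one closest to the negative seed a negative score, and the midpoint node ends up with a score of (numerically) zero, reflecting maximal uncertainty.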
<p>Published: Sun, 01 Jan 2017 – <a href="http://www.neuralnoise.com/2017/gaussian-fields/">http://www.neuralnoise.com/2017/gaussian-fields/</a><br/>
Tags: machine learning, semi-supervised learning</p>