Gaussian processes (GPs) are natural generalisations of multivariate
Gaussian random variables to infinite (countable or continuous) index
sets. GPs have been applied in a large number of fields to a diverse
range of ends, and many deep theoretical analyses of their
properties are available. This paper gives an introduction to Gaussian
processes at a fairly elementary level, with special emphasis on
characteristics relevant in machine learning. It draws explicit
connections to branches such as spline smoothing models and support
vector machines in which similar ideas have been investigated.

Gaussian process models are routinely used to solve hard machine
learning problems. They are attractive because of their flexible
non-parametric nature and computational simplicity. Treated within a
Bayesian framework, they yield very powerful statistical methods
which offer valid estimates of uncertainties in our
predictions and generic model selection procedures cast as nonlinear
optimization problems. Their main drawback of heavy computational
scaling has recently been alleviated by the introduction of generic
sparse approximations [13, 78, 31]. The mathematical literature
on GPs is large and often uses deep concepts which are not required
to fully understand most machine learning applications. In this
tutorial paper, we aim to present characteristics of GPs relevant to
machine learning and to point out precise connections to other
"kernel machines" popular in the community. Our focus is on a simple
presentation, but references to more detailed sources are provided.