Maintained by
G.Brown
The MLO Research Culture
We promote an active seminar culture - see the seminars page.
We also maintain a Resources for Research (R-4-R) page.
Our Research Projects
The following is a partial list of past/present MLO group research
projects. For more information, for example if you are wishing to apply to study
for a PhD in these areas, please
contact the specified project lead.
Constrained MultiObjective Optimisation
Information Theoretic Feature Selection
Ensemble Methods and Diversity
Diversity Measures and Online Learning
Boosting and Products of Experts
Speech Analysis with Deep Learning
Semi-Supervised Boosting
Cluster Ensembles for Temporal Data
Guaranteed Approximation and Convergence in Multiobjective Optimization
Dynamical Systems Analysis of Non-Stationary Learning
Machine Learning for Adaptive Multi-Core Machines
Computational Modelling of Biochemical Networks
Constrained MultiObjective Optimisation
MLO Contact: Joshua Knowles and Richard Allmendinger
Our research is on optimization problems in which candidate solutions are evaluated by
conducting physical or biochemical experiments. Such
optimization processes may be subject to resourcing issues, as any
experiment may require resources in order to be conducted. The primary issue we study arises in scenarios where
resources required to conduct certain experiments are not consistently
available throughout the optimization process - we model the dynamic
availability of resources using what we call ephemeral resource
constraints. The second resourcing issue is related to optimizing subject
to changes of variables; here, as an example, consider the optimization
of combinations of drugs drawn from a non-stationary library. The final
resourcing issue is related to optimization in lethal environments. The aim
here is to evolve a population of hardware entities, such as autonomous robot
capsules, nano-machines, or drone planes, which can be accidentally
destroyed if the wrong software (the EA solution) is uploaded to them. Hence,
the population itself is at risk of shrinking if the search is too aggressive.
Our objective is to understand how these resourcing issues affect
evolutionary search, and to develop effective and efficient search
strategies for dealing with them.
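To make the idea of an ephemeral resource constraint concrete, the following is a minimal sketch, not the group's actual method. It assumes a toy OneMax fitness function, a hypothetical constraint (`is_feasible`) under which solutions with the first bit set cannot be evaluated during a fixed time window, and a simple hypothetical repair strategy that clears that bit.

```python
import random


def onemax(bits):
    """Toy fitness: number of ones (stands in for a physical experiment)."""
    return sum(bits)


def is_feasible(bits, t, window=(20, 60)):
    """Hypothetical ephemeral constraint: while t lies inside the window,
    solutions with bits[0] == 1 need a resource that is unavailable."""
    start, end = window
    return not (start <= t < end and bits[0] == 1)


def repair(bits):
    """Hypothetical repair strategy: drop the unavailable component."""
    fixed = list(bits)
    fixed[0] = 0
    return fixed


def run(n_bits=20, steps=200, seed=0):
    """(1+1)-style evolutionary search that repairs unevaluable children."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n_bits)]
    best = onemax(parent)
    repaired = 0
    for t in range(steps):
        # Flip each bit independently with probability 1/n_bits.
        child = [b ^ (rng.random() < 1.0 / n_bits) for b in parent]
        if not is_feasible(child, t):
            child = repair(child)
            repaired += 1
        fitness = onemax(child)
        if fitness >= best:
            parent, best = child, fitness
    return parent, best, repaired


if __name__ == "__main__":
    solution, fitness, repaired = run()
    print(fitness, repaired)
```

Other strategies (for example waiting until the resource returns, or penalizing infeasible solutions) could be swapped in for `repair` to compare how each affects the search.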
Information Theoretic Feature Selection
MLO contact: Gavin Brown and Adam Pocock
Our current research is on developing a novel theoretical foundation for mutual information based feature selection. We begin by defining a
discriminative model, and aim to maximise the joint likelihood of this model. We derive an information theoretic feature selection term from this
model, which, when minimised, maximises the model likelihood. We show that when using a flat prior over the features, this feature selection term is
exactly that optimised by a group of Markov Blanket discovery algorithms, and is approximated by a large group of mutual information based filters. In
recent work we use this likelihood perspective to investigate the properties of mutual information based filters, and understand the probabilistic
assumptions they impose on the data. We thus provide a unifying perspective, bringing two decades of literature into a common framework. Our
probabilistic interpretation of feature selection leads to several natural extensions for cost sensitivity and incorporating domain knowledge. We are
now looking at using informative priors to guide feature selection by modifying these filter criteria to include domain knowledge. This gives a family
of selection criteria which can include domain knowledge about the size of the feature set, or the relationships between the features and the class
label.
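As a point of reference, the simplest member of the family of mutual information based filters ranks each feature by its empirical mutual information with the class label. The sketch below assumes discrete features and labels; the function names and the toy data are illustrative only, and it does not implement the likelihood-based criterion described above.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def rank_features(columns, labels):
    """Rank feature names by decreasing mutual information with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    columns = {"copy": [0, 0, 1, 1], "noise": [0, 1, 0, 1]}
    print(rank_features(columns, labels))
```

Filters that also account for redundancy between features add conditional or joint terms to this score; an informative prior would add a further term per feature.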
Ensemble Methods and Diversity
MLO Contact: Gavin Brown
Ensemble Systems are groups of predictors treated as a 'committee' to obtain better generalisation
than any single predictor, and have emerged as one of the most powerful pattern recognition
techniques of the last decade. The success of such methods rests on the committee members exhibiting some kind of
'diversity'. Our work has analyzed diversity at a fundamental level, with new observations on how
it can be formulated and exploited - the primary contribution was a new understanding of the Negative Correlation
learning algorithm, showing it is capable of explicitly managing
diversity.
G. Brown, J. Wyatt, P.Tino, Managing Diversity in Regression Ensembles
[PDF]
JMLR vol 6 (2005).
Diversity Creation Methods: A Survey and Categorisation, Brown, Wyatt, Harris, Yao.
[PDF]
Journal of Information Fusion, vol 6, 2005
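One standard way to see why diversity matters in regression ensembles is the ambiguity decomposition: for a simple averaging ensemble, the squared error of the ensemble equals the average squared error of the members minus the average squared spread of the members around the ensemble output. The sketch below checks this identity numerically for a single target value; the function name and the example numbers are illustrative assumptions.

```python
def ambiguity_decomposition(predictions, target):
    """Return (ensemble_error, average_member_error, ambiguity) for one target.

    The ensemble output is the unweighted mean of the member predictions.
    """
    m = len(predictions)
    ensemble = sum(predictions) / m
    ensemble_error = (ensemble - target) ** 2
    average_error = sum((f - target) ** 2 for f in predictions) / m
    ambiguity = sum((f - ensemble) ** 2 for f in predictions) / m
    return ensemble_error, average_error, ambiguity


if __name__ == "__main__":
    ens, avg, amb = ambiguity_decomposition([1.0, 2.0, 4.0], target=3.0)
    # The identity ens == avg - amb holds up to floating-point rounding.
    print(ens, avg - amb)
```

Because the ambiguity term is never negative, the ensemble error can never exceed the average member error; more spread among members helps only as long as it does not raise their individual errors by more.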
Ensemble Diversity in Non-Stationary Environments
MLO Contact: Gavin Brown
and Richard Stapenhurst
The 'diversity' of an ensemble
quantifies how different the individual learners within it are. It is
widely accepted that a good ensemble must have some diversity, but that too
much diversity will be detrimental to performance. We are
examining the relationship
between diversity and the distributions of voting margins,
which seems to suggest a far more straightforward interpretation and
application of diversity than is common in the literature. Another facet
of our research involves non-stationary learning, where we wish to model
some process that changes over time, and the application of diversity to
this problem. Ensembles have been shown to perform well on non-stationary
problems, but generally the techniques for adapting to new concepts are
somewhat heuristic, or require tuned parameters.
We have shown that the diversity of an ensemble determines its ability to adapt; future work
focuses on exploiting this observation to produce state-of-the-art
techniques.
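For readers unfamiliar with voting margins, the sketch below uses one common definition for two-class majority voting: the fraction of members voting for the correct label minus the fraction voting against it. The function name and example votes are illustrative, and the project may use a different definition.

```python
def voting_margin(votes, true_label):
    """Margin in [-1, 1] for a two-class voting ensemble on one example.

    Positive values mean the majority vote is correct; values near zero
    mean the members disagree strongly on this example.
    """
    correct = sum(1 for v in votes if v == true_label)
    incorrect = len(votes) - correct
    return (correct - incorrect) / len(votes)


if __name__ == "__main__":
    print(voting_margin([1, 1, 0], true_label=1))
    print(voting_margin([1, 1, 1], true_label=1))
```

Under this definition, an ensemble whose members all agree has margins of exactly +1 or -1 on every example, so the shape of the margin distribution reflects how much the members disagree.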
Boosting as a Product of Experts
MLO Contact: Nara Edakunni and Gavin Brown
Our research focuses on developing a probabilistic model for boosting
and framing the learning updates as a form of incremental model adaptation by
adding new experts to the ensemble. A probabilistic framework for boosting
provides a number of advantages including a simple and well motivated model of the data. Furthermore, it makes the
modeling assumptions made in boosting
explicit and allows us to seamlessly apply boosting across different problem
settings by varying the probabilistic model of the constituent experts.
A probabilistic model of boosting also enables us to use a plethora of inference
techniques like likelihood maximization and Bayesian inference to learn the
parameters of the model.
In a recent paper, we have shown that boosting corresponds to a Product of Experts model, which
is a normalized product of probabilities, with the component
probabilities being contributed by the experts in the ensemble. The ensemble of
experts is expanded at each iteration by adding a new expert such that the
likelihood of the observed data, as predicted by the ensemble, does not decrease
with the addition of an expert. We show that such a condition of non-decreasing likelihood at each iteration naturally
leads to a constraint on the parameters
of the expert similar to the famous weak learning criterion in boosting. For a specific parametrization of the expert
probabilities we can also show that incremental learning in PoE reduces to a variant of the AdaBoost algorithm.
N. Edakunni, G. Brown, and T. Kovacs, "Boosting as a Product of Experts",
Uncertainty in Artificial Intelligence, 2011
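The combination rule of a Product of Experts can be sketched in a few lines: multiply the class probabilities given by each expert and renormalize. This is only the generic PoE combination for a single input, under the assumption that each expert returns a dictionary from labels to probabilities; it does not include the incremental learning procedure or the parametrization from the paper.

```python
def product_of_experts(expert_probs):
    """Combine per-expert class probabilities into a normalized product.

    expert_probs: list of dicts, each mapping every label to that expert's
    probability for one fixed input.
    """
    labels = list(expert_probs[0].keys())
    unnormalized = {y: 1.0 for y in labels}
    for probs in expert_probs:
        for y in labels:
            unnormalized[y] *= probs[y]
    z = sum(unnormalized.values())  # normalizing constant
    return {y: v / z for y, v in unnormalized.items()}


if __name__ == "__main__":
    experts = [{1: 0.6, -1: 0.4}, {1: 0.7, -1: 0.3}]
    print(product_of_experts(experts))
```

Note that an expert assigning equal probability to every label leaves the combined distribution unchanged, which gives some intuition for why a new expert must be better than chance to raise the likelihood.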
Speech Information Component Analysis with Deep Learning
MLO Contact: Ke Chen
It is well known that speech conveys several kinds of information mixed together:
predominantly linguistic information, along with non-verbal speaker-specific and emotional
information components. For human communication, all the information components in speech turn out to
be very useful, and each is used selectively for a different task. For example, a listener can often recognize
a speaker regardless of what is spoken (speaker recognition), and can just as effortlessly
understand exactly what is spoken by different speakers (speech recognition). In general, however,
there is no effective way to automatically extract an information component of interest from speech,
so the same representation has to be used in different speech information tasks. The
interference of different yet entangled speech information components in most existing acoustic
representations hinders a speech or speaker recognition system from achieving better performance.
Recent studies in machine learning reveal that learning deep architectures provides a new way for
tackling complex AI problems. In our work, we have proposed a novel deep neural architecture for learning
intrinsic speaker-specific characteristics. To train it, we proposed multi-objective loss functions
for learning speaker-specific characteristics, with regularization that normalizes interference from
non-speaker related information and avoids information loss. We have demonstrated that the resultant
speaker-specific representation is insensitive to text/languages spoken and environmental mismatches
and hence outperforms MFCCs and other state-of-the-art techniques in speaker recognition. In our
ongoing work, we are developing novel yet biologically inspired deep architectures for speech information
component analysis towards extracting different task-specific information components of interest from speech
and applying them to various speech information processing tasks.
Selected Publications
- Chen K. & Salman A., Learning speaker-specific characteristics with a deep neural
architecture.
IEEE Transactions on Neural Networks and Learning Systems, vol. 23, 2012. (to appear)
[PDF]
- Chen K. & Salman A., Extracting speaker-specific information with a regularized Siamese deep network.
Advances in Neural Information Processing Systems 25 (NIPS'11), MIT Press, 2011.
[PDF]
Semi-Supervised Boosting Learning
MLO Contact: Ke Chen
Semi-supervised learning concerns the problem of learning in the presence of labeled and unlabeled
data. Several boosting algorithms have been extended to semi-supervised learning with various
strategies. However, none of them takes all three semi-supervised assumptions, i.e., smoothness,
cluster and manifold assumptions, together into account during boosting learning. In this work, we
proposed a novel cost functional consisting of the margin cost on labeled data and the regularization
penalty on unlabeled data based on three fundamental semi-supervised assumptions. Thus, minimizing
our proposed cost functional with a greedy yet stage-wise functional optimization procedure leads to
a generic boosting framework for semi-supervised learning. In extensive experiments we demonstrated
that our algorithm yields favorable results for benchmark and real world classification tasks in
comparison to state-of-the-art semi-supervised learning algorithms including newly developed boosting
algorithms.
In our ongoing studies, we work on formal analysis of our proposed semi-supervised boosting framework
and exploiting other useful information sources for semi-supervised learning.
Selected Publications
Chen K. & Wang S., Semi-supervised Learning via Regularized Boosting Working on Multiple
Semi-supervised
Assumptions.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(1): 129-143, 2011.
[PDF]
Cluster Ensembles for Temporal Data
MLO Contact: Ke Chen
As an emerging area in machine learning, clustering ensemble approaches have recently been studied
from different perspectives. The basic idea behind a clustering ensemble is to combine multiple
partitions of the same data set to produce a consensus partition expected to be superior to any of
the given input partitions. Both empirical and theoretical studies suggest that clustering ensembles
provide an alternative technique to overcome weaknesses underlying individual clustering algorithms.
In our work, we propose a weighted clustering ensemble approach guided by clustering validation
criteria to reconcile initial partitions into candidate consensus partitions from different
perspectives, and then into a final partition. In addition, our approach tends to capture the intrinsic structure
of a data set, e.g., the number of clusters. As our weighted cluster ensemble algorithm can combine
any input partitions to generate a clustering ensemble, we also investigate its limitation by formal
analysis and empirical studies. On the other hand, temporal data clustering provides underpinning
techniques for discovering the intrinsic structure and condensing information over temporal data but
the representation-based temporal data clustering methodology is subject to a fundamental weakness:
information loss. The joint use of different representations under our proposed weighted clustering
ensemble framework effectively overcomes this fundamental weakness by exploiting various information
sources underlying temporal data. Our approach has been applied in benchmark time series, motion
trajectory and time-series data stream clustering tasks. In our ongoing studies, we work on formal
analysis in justifying the effectiveness of clustering ensemble and developing novel yet
theoretically justifiable clustering ensemble approaches.
Selected Publications
Yang Y. & Chen K., Temporal data clustering via weighted clustering ensemble with different
representations.
IEEE Transactions on Knowledge and Data Engineering 23(2): 307-320, 2011.
[PDF]
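A common building block for clustering ensembles is the co-association matrix, which records how often each pair of points is placed in the same cluster across the input partitions. The sketch below is a generic weighted version under stated assumptions: the weights are simply passed in (in the work above they would come from clustering validation criteria), and the consensus step, which links points whose weighted co-association exceeds a hypothetical threshold, is a simple stand-in rather than the published algorithm.

```python
def weighted_coassociation(partitions, weights):
    """Weighted fraction of partitions that put points i and j together."""
    n = len(partitions[0])
    total = float(sum(weights))
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


def consensus_partition(matrix, threshold=0.5):
    """Connected components of the graph linking pairs above the threshold."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
    m = weighted_coassociation(partitions, weights=[1.0, 1.0, 1.0])
    print(consensus_partition(m))
```

Because the number of connected components is not fixed in advance, this kind of consensus step can recover the number of clusters from the data rather than requiring it as an input.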
Guaranteed Approximation and Convergence in Multiobjective Optimization
MLO Contact: Joshua Knowles
Approximation algorithms, which are methods that deliver solutions
guaranteed in the worst case to be no more than a fixed amount epsilon away from
optimal, are well-known in single-objective optimization. In
multiobjective optimization, the concept of approximation must be extended in two
ways: to cover vector fitness values; and, to cover sets. Thanks to
work by Yannakakis and Papadimitriou[1], and by Laumanns et al[2], we have the
notions of an epsilon-approximation set and an epsilon-Pareto approximation
set, which are alternative types of approximation to a Pareto optimal
set. In ongoing work (that dates back to some of my PhD[3] studies in 1999
onwards), I am investigating what types of algorithm give guaranteed
approximation to a Pareto front, and with what type of convergence. I
am particularly interested in the case where the epsilon value is not selected
a priori by the user, but is adapted during optimization to
give the closest possible approximation. In recent work with
López-Ibáñez and Laumanns[4], we were able to characterize both theoretically and
empirically the approximation and convergence properties of several of
the most common archiving algorithms used in multiobjective
optimization. Two of the currently best methods are hypervolume-based archiving[3,5], and
archiving based on a hierarchical, adaptive grid[6].
[1] C. H. Papadimitriou, M. Yannakakis. The complexity of tradeoffs,
and optimal access of web sources. FOCS, 2000.
[2] M. Laumanns, L. Thiele, K. Deb, E. Zitzler. Combining convergence and diversity in evolutionary multiobjective optimization.
Evolutionary Computation, 10(3): 263-282, 2002.
[3] J. Knowles. Local-search and hybrid evolutionary algorithms for
Pareto optimization. PhD thesis, University of Reading, 2002.
[4] M. López-Ibáñez, J. Knowles, M. Laumanns. On
sequential online archiving of objective vectors.
Evolutionary Multi-Criterion Optimization, LNCS 6576: 46-60, 2011.
[5] J. Knowles, D. Corne. Properties of an adaptive archiving
algorithm for storing nondominated vectors.
IEEE Transactions on Evolutionary Computation, 7(2): 100-116, 2003.
[6] M. Laumanns, R. Zenklusen. Stochastic convergence of random search
methods to fixed size Pareto front approximations. European Journal of Operational Research
213(2): 414-421, 2011.
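The notion of an epsilon-approximation set can be illustrated with a short sketch. It assumes minimization of all objectives and the additive form of epsilon-dominance (the literature also uses a multiplicative form); the function names and the example points are illustrative only.

```python
def epsilon_dominates(a, b, eps):
    """True if a is within eps of being at least as good as b in every objective."""
    return all(ai - eps <= bi for ai, bi in zip(a, b))


def is_epsilon_approximation(archive, front, eps):
    """True if every point of the front is eps-dominated by some archive point."""
    return all(any(epsilon_dominates(a, b, eps) for a in archive) for b in front)


def smallest_epsilon(archive, front):
    """Smallest eps for which the archive is an eps-approximation of the front."""
    return max(
        min(max(ai - bi for ai, bi in zip(a, b)) for a in archive)
        for b in front
    )


if __name__ == "__main__":
    front = [(1, 4), (2, 2), (4, 1)]
    archive = [(1, 4), (4, 1)]  # a bounded archive that misses the knee point
    print(smallest_epsilon(archive, front))
```

An archiving algorithm that adapts epsilon during the run is, in effect, trying to keep `smallest_epsilon` as low as possible subject to a bound on the archive size.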
Dynamical Systems Analysis of Non-Stationary Learning
MLO Contact: Jon Shapiro and Joe
Mellor
The process in which a learning agent receives the data from which to
learn one example at a time is called online learning.
Online learning is most useful in two main contexts: in
scenarios where the dataset from which to learn is so vast that
attempting to consider all data points in the dataset becomes
intractable, and in situations where the agent receives data as a stream
in real time and so must learn at the same time as responding to the
environment.
In both situations, the distribution from which the training data
comes can change over time; this is called a non-stationary
environment.
The changing environment and the learning algorithm can be viewed as
dynamical systems -- our research is pursuing this in the context of
Iterated Function Systems. The overall hypothesis of the project is
that objects of study in the theory of dynamical systems can be used to
analyse online learning algorithms in non-stationary environments. This
could lead to a better understanding of convergence properties of
algorithms in certain environments and potentially allow for better
design of these online algorithms.
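As a small illustration of viewing an online learner as a dynamical system, consider a learner that tracks the mean of a stream with a constant learning rate. Each observation applies an affine map to the current estimate, and that map is a contraction; when the observations are drawn at random from a finite set, the learner behaves like an iterated function system. The sketch below is an assumed toy example (the names, learning rate, and abrupt-drift stream are hypothetical), not the model studied in the project.

```python
def online_update(estimate, x, rate=0.1):
    """One online step: an affine contraction of the estimate towards x."""
    return (1.0 - rate) * estimate + rate * x


def track(stream, rate=0.1, start=0.0):
    """Run the online learner over a stream and return all estimates."""
    estimates = []
    w = start
    for x in stream:
        w = online_update(w, x, rate)
        estimates.append(w)
    return estimates


if __name__ == "__main__":
    # Non-stationary environment: the target jumps from 0 to 1 halfway through.
    stream = [0.0] * 100 + [1.0] * 100
    estimates = track(stream)
    print(estimates[99], estimates[-1])
```

The contraction factor `1 - rate` governs both how quickly the learner forgets the old concept and how much it fluctuates under noise, which is the kind of trade-off a dynamical systems analysis can make precise.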
Machine Learning for Adaptive Multi-Core Machines
MLO Contact: Gavin Brown - (or visit
the project website)
The computer industry is undergoing the "multi-core" revolution. When you buy a PC off
the shelf these days, it is inevitably "dual-core" or "quad-core". This idea of more and more CPU "cores" executing in parallel is expected to
continue to the hundreds and thousands. The problem of coordinating these cores is challenging and unsolved. The iTLS project
applies Machine Learning technologies to address this problem.
Computational Modelling of Biochemical Networks
MLO Contact: Pedro Mendes
Computational modeling and simulation of biochemical networks is at the core of systems biology and
this includes many types of analyses that can aid understanding of how these systems work. COPASI is
a generic software package for modeling and simulation of biochemical networks which provides many of
these analyses in convenient ways that do not require the user to program or to have deep knowledge
of the numerical algorithms. COPASI is a flexible framework capable of: steady-state and time-course
simulations, stoichiometric analyses, parameter scanning, sensitivity analysis (including metabolic
control analysis), global optimization, parameter estimation, and stochastic simulation.
Mendes P, Hoops S, Sahle S, Gauges R, Dada J, Kummer U. Computational modeling of biochemical
networks using
COPASI. Methods in molecular biology (Clifton, N.J.). 2009; 500: 17-59.
[PDF]
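To show what a time-course simulation computes, the sketch below integrates a single irreversible mass-action reaction A -> B with a fixed-step Euler method. It is written in plain Python and does not use COPASI or its interfaces; the rate constant, step size, and function name are assumptions chosen for illustration, and real tools use far more robust integrators.

```python
def simulate_decay(a0=1.0, k=0.5, dt=0.001, t_end=2.0):
    """Euler time course for A -> B with rate law v = k * [A].

    Returns a list of (time, [A], [B]) tuples.
    """
    steps = int(round(t_end / dt))
    a, b = a0, 0.0
    trajectory = [(0.0, a, b)]
    for n in range(1, steps + 1):
        flux = k * a * dt  # amount converted during this step
        a -= flux
        b += flux
        trajectory.append((n * dt, a, b))
    return trajectory


if __name__ == "__main__":
    t, a, b = simulate_decay()[-1]
    # Analytically [A](t) = a0 * exp(-k * t); the total a + b is conserved.
    print(t, a, b)
```

Steady-state analysis, parameter scanning, and parameter estimation can all be thought of as repeated or inverted versions of this basic computation.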