Modern stochastic modelling techniques are capable of building very complex probabilistic structures, to represent a great variety of practical problems. The key to both the construction and analysis of such models is the concept of conditional independence, whereby each variable is related locally (conditionally) to only a few other variables. Therefore, although the model exhibits great complexity globally, it has relatively simple local structure. Conditional independence plays a key role, allowing models to be expressed by graphical representations, in which nodes represent random variables of interest, and links indicate local conditional independence relations. The essential points of this approach are: (a) complex models are built up using modular components, and (b) the graph structure provides the key to global analysis using local computations or simulations.
Research into these Highly structured models has developed in several fields; this Network focuses on three of the most important.
Image analysis
Statistical science is making three distinct but interdependent contributions to the processing and analysis of digital images. Firstly, there is the use of probabilistic models for true images, in which features and their inter-relationships can be modelled economically using spatial stochastic processes, typically with simple local conditional structure. Such models arise from an interaction of ideas from statistical mechanics and spatial statistics.
The second contribution is to the modelling of the degradation that occurs between the true image and that actually recorded, and again is usually modelled parsimoniously using simple local stochastic structures. The combination of these two model components, through Bayes’ theorem, produces powerful tools for the analysis of noisy images.
The third contribution is the creation and implementation of stochastic algorithms, including stochastic relaxation and simulated annealing, for the realisation of these image analytic tools. Anticipated future developments in statistical image analysis include the development of higher-level and hierarchical image models, which will lead, in combination with the other model components, to algorithms for tasks such as object recognition and probabilistic diagnosis.
Expert systems and graphical models
Probabilistic reasoning in expert systems uses a directed graphical model to represent qualitative conditional independence assumptions, which allow the specification of a full joint distribution based on simple local components. Techniques for propagating the effects of observed evidence around the network exploit the graphical structure. One area of current research concerns the problem of when the network gets so tightly connected that the graphical algorithms become inefficient - the need then is for simulation techniques similar to those in the other two strands of this proposal. Another important issue, which effectively bridges the gap between artificial intelligence and statistics, is how to use data to help build new, or modify existing, networks. This shifts attention away from the type of probability calculations which are required for processing individual cases, and towards Bayesian statistical reasoning, providing a strong link with the third strand below.
Graphical modelling also arises naturally as a data-analytic tool, generally using maximum likelihood rather than Bayesian inference.
Bayesian statistics and applications
Bayesian hierarchical models build complex structures through simple conditional independences of groups of parameters given further stages of hyperparameters. Bayesian modelling now frequently works with large and complex representations of problems in many application areas. Recent developments in computing the resulting statistical inferences again make use of the conditional probability structures.
Conditional independence structures are also finding application in areas such as genetics, where the local structures are the links in a complex pedigree, and spatial epidemiology, where they represent geographical proximity or transportation links. Anticipated research in this area will include new initiatives in statistical applications, as well as new computational developments. Important practical questions arise over both realism and robustness of complex models. Computational questions concern the convergence characteristics of the stochastic algorithms, and techniques and methodologies for applying them, including efficient sampling algorithms.