Data-driven uncertainty quantification for high-dimensional engineering problems

Abstract

In the context of complex industrial systems and civil infrastructures, accounting for uncertainties during the design process has received much attention in recent decades. Although significant progress has been made in modelling such systems, there are always discrepancies between ideal in-silico designed systems and real-world manufactured ones.

Starting from a realistic computational model that reproduces the behaviour of an engineering system, uncertainty quantification aims at modelling the various sources of uncertainty (including natural variability and lack of knowledge) affecting its input parameters, as well as propagating these uncertainties to the response quantities of interest (e.g. performance indicators). Due to the high fidelity and associated computational cost of such models, the use of Monte Carlo methods for uncertainty quantification is often not a viable solution. To overcome this limitation, the use of surrogate models has become well established. A surrogate model is an analytical function that provides an accurate approximation of a computational model. It is built from a limited number of runs of the simulator at selected values of the input parameters, using an appropriate learning algorithm.
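As a minimal sketch of the surrogate idea described above, the following Python snippet fits a polynomial to a handful of runs of a toy function and checks its accuracy on fresh points. The function expensive_model, the design size, and the polynomial degree are illustrative assumptions, not the models or settings used in the thesis.

```python
import numpy as np


def expensive_model(x):
    """Hypothetical stand-in for a costly simulator with one scalar input."""
    return x * np.sin(x)


# Experimental design: a limited number of simulator runs at selected inputs.
x_train = np.linspace(0.0, 3.0, 15)
y_train = expensive_model(x_train)

# Learning step: least-squares fit of a degree-6 polynomial surrogate.
surrogate = np.polynomial.Polynomial.fit(x_train, y_train, deg=6)

# Validation on new inputs. This is cheap here only because the toy model is
# cheap; with a real simulator the validation runs would also be limited.
rng = np.random.default_rng(0)
x_valid = rng.uniform(0.0, 3.0, size=1000)
y_valid = expensive_model(x_valid)
relative_error = np.mean((surrogate(x_valid) - y_valid) ** 2) / np.var(y_valid)
```

Once the validation error is acceptably small, the surrogate can be evaluated many times at negligible cost, which is what makes sampling-based uncertainty propagation affordable.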

In this thesis, the focus is on the application of modern uncertainty quantification techniques in the presence of a large number of system parameters, up to several thousand. As the dimensionality of the input space increases, the performance of surrogate modelling methods decreases, an issue known as the curse of dimensionality. Furthermore, we approach the problem from a purely data-driven perspective, i.e. the entire analysis needs to be conducted based only on a limited number of observations and few to no assumptions about the inner workings of the system. This scenario has high practical relevance, e.g. when complex workflows involving various software packages are used to simulate a system, or in real-world applications for which only measurements of the input parameters and model responses are available. However, such data-driven approaches introduce additional challenges related to the (unknown) stochastic properties of the input space. To quantify those, one typically resorts to well-known inference techniques (discussed in Chapter 2), but such methodologies also suffer from the curse of dimensionality.

To enable data-driven uncertainty quantification in high-dimensional input spaces, we propose a combination of machine learning techniques for data compression and state-of-the-art surrogate modelling techniques introduced by the uncertainty quantification community. The first fundamental ingredient, dimensionality reduction, is discussed in Chapter 3. Through a literature review on the rather broad topic of dimensionality reduction, we highlight the strengths and weaknesses of various techniques as well as their areas of application.

The second fundamental ingredient, surrogate modelling, is discussed in Chapter 4. Beyond a general formulation, the focus is placed on two state-of-the-art techniques, namely Kriging and polynomial chaos expansions, that are used throughout this thesis.

A novel methodology for enabling surrogate modelling in high-dimensional spaces is introduced in Chapter 5. The proposed algorithm couples the input compression and surrogate modelling steps in such a way that the resulting performance of the surrogate is optimal. Furthermore, we demonstrate its consistently superior performance on several benchmark applications (in terms of the predictive accuracy of the surrogate), compared to traditional approaches that treat dimensionality reduction and surrogate modelling as two disjoint steps.
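The idea of coupling the two steps can be illustrated with a deliberately simplified sketch: the compression is tuned by the surrogate's validation error rather than by how well the inputs are reconstructed. This is not the algorithm of Chapter 5. PCA and a quadratic least-squares surrogate are swapped in for illustration, and the synthetic data, the helper names (features, validation_error) and all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (assumption): 200 observations of a 50-dimensional input
# that varies along 3 latent directions, plus a small amount of noise.
n, dim, latent_dim = 200, 50, 3
Z = rng.normal(size=(n, latent_dim))
A = rng.normal(size=(latent_dim, dim))
X = Z @ A + 0.01 * rng.normal(size=(n, dim))
y = Z[:, 0] + 0.5 * Z[:, 1] ** 2 - Z[:, 2]

train, valid = np.arange(150), np.arange(150, n)

# Compression step: PCA directions from the SVD of the centred training inputs.
x_mean = X[train].mean(axis=0)
_, _, Vt = np.linalg.svd(X[train] - x_mean, full_matrices=False)


def features(X_in, d):
    """Project onto the first d principal directions, then build
    constant, linear and all second-order terms."""
    W = (X_in - x_mean) @ Vt[:d].T
    rows, cols = np.triu_indices(d)
    quadratic = W[:, rows] * W[:, cols]
    return np.hstack([np.ones((len(W), 1)), W, quadratic])


def validation_error(d):
    """Relative mean-squared error of the surrogate built on d components."""
    coef, *_ = np.linalg.lstsq(features(X[train], d), y[train], rcond=None)
    residual = features(X[valid], d) @ coef - y[valid]
    return np.mean(residual ** 2) / np.var(y[valid])


# Coupling: the reduced dimension is chosen by surrogate accuracy.
errors = {d: validation_error(d) for d in range(1, 9)}
best_d = min(errors, key=errors.get)
```

In a disjoint approach, the reduced dimension would instead be fixed beforehand, for example from the explained variance of the inputs alone, without looking at the model response.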

In Chapter 6, we propose a workflow for data-driven uncertainty quantification in high-dimensional spaces, which is the ultimate goal of the thesis. The proposed workflow capitalises on the findings of the previous chapters. Having access to a compressed space of manageable size and a surrogate, we show how one can eventually calculate statistical properties of the quantities of interest, such as their moments, quantiles and even their full probability distribution function. After applying this methodology to benchmark applications, we demonstrate that this workflow can lead to improved estimates of the uncertainty of the quantities of interest, especially in the extreme value regions.
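The last step of such a workflow can be sketched as follows, assuming a surrogate on a compressed space is already available. The two-dimensional surrogate function and the standard normal distribution of the compressed inputs are hypothetical placeholders. In a data-driven setting, that distribution would have to be inferred from the observations.

```python
import numpy as np

rng = np.random.default_rng(2)


def surrogate(w):
    """Hypothetical cheap surrogate acting on a 2-D compressed input."""
    return w[:, 0] + 0.5 * w[:, 1] ** 2


# Sample the compressed inputs (assumed standard normal for illustration).
w = rng.normal(size=(200_000, 2))
q = surrogate(w)

# Moments of the quantity of interest.
mean, variance = q.mean(), q.var(ddof=1)

# Quantiles, including one in the upper tail.
quantiles = np.quantile(q, [0.05, 0.5, 0.95, 0.999])

# Histogram-based estimate of the probability density.
density, edges = np.histogram(q, bins=100, density=True)
```

Estimates in the tails, such as the 0.999 quantile, are the most sensitive to errors in both the surrogate and the inferred input distribution.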

Finally, in Chapter 7 we show how the methods presented in this thesis can be applied to a realistic engineering application related to the structural health monitoring of wind turbines. The goal is to estimate the fatigue accumulation and peak loads, as well as their uncertainty, on various components of a wind turbine, given the inflow wind speed over 10-minute time intervals. We do so by processing a limited number of observations that are generated by specialised software.

This manuscript introduces new techniques that enable uncertainty quantification in a wide class of problems for which it was previously not possible. This has strong practical implications, considering the numerous relevant problems in, e.g., structural health monitoring, earthquake engineering, weather forecasting, hydrogeology and control engineering, where the input space is high-dimensional (e.g. time series or image inputs). The new methodology and our findings are summarised in Chapter 8, along with suggestions for future research on this topic.

Keywords

Data-driven, uncertainty quantification, machine learning, surrogate modelling, high dimensionality, dimensionality reduction.

BibTeX cite

@PHDTHESIS{LataniotisThesis,
author = {Lataniotis, C.},
title = {Data-driven uncertainty quantification for high-dimensional engineering problems},
school = {ETH Z\"urich, Z\"urich, Switzerland},
year = {2019}
}