Abstract:
The research objective of the dissertation is to develop methods to address the curse of
dimensionality in the field of approximate dynamic programming, to enhance the scalability
of these methods to large-scale problems.
Several problems, including those faced in day to day life involve sequential decision
making in the presence of uncertainty. These problems can often be modeled as Markov
decision processes using the Bellman’s optimality equation. Attempts to solve even reasonably complex problems through stochastic dynamic programming are faced with the curse
of modeling and the curse of dimensionality. The curse of modeling has been addressed in
the literature through the introduction of reinforcement learning strategies, a strand of approximate dynamic programming (ADP). In spite of considerable research efforts, curse of
dimensionality which affects the scalability of ADP for large scale applications still remains
a challenge. In this research, a value function approximation method based on the theory
of diffusion wavelets is investigated to address the scalability of ADP methods.
The first contribution of this dissertation is an advancement of the state-of-the-art in the
field of stochastic dynamic programming methods that are solved using ADP approaches.
An important intellectual merit is the innovatively designed diffusion wavelet based value
function approximation method which is integrated with ADP to address the curse of dimensionality. The innovation lies in this integration that exploits the structure of the problem
to achieve computational feasibility.
The ADP method with diffusion wavelet based value function approximation is tested on
the problem of taxi-out time estimation of aircrafts (time duration between gate-pushback
and wheels-off) to establish a proof of concept for the research objective. The second
contribution of this dissertation is the modeling of the taxi-out time estimation of flights
as a stochastic dynamic programming problem with the capability to provide sequential
predictions in real-time as the system evolves. The model aims to accurately predict the
taxi-out time of a flight at least fifteen minutes before its scheduled gate pushback time.
As a case study for Detroit International Airport, results indicate that there is a 6 % to
12 % increase in the percentage of flights predicted accurately (with a root mean square
error of two minutes) using ADP when compared with a regression model for taxi-out time
predictions.
The outcomes of this dissertation research provide a generic methodology for sequential
decision making under uncertainty in large scale applications by uniting concepts from signal
processing, statistics, stochastic processes, and artificial intelligence, which may provide
solutions for future automated decision making in large scale complex applications in other
engineering domains.