Decision Theory

  • decision variable $a \in A$, where $A$ is called the domain of the decision variable, $\text{dom}(a) = A$
  • outcome $y \sim P(y \mid a)$
  • utility function $U: y \mapsto U(y) \in \mathbb{R}$

1. Principle of Maximal Expected Utility

$a^{*} = \underset{a}{\text{argmax}}\; \mathbb{E}_{P(y \mid a)}\{U(y)\} = \underset{a}{\text{argmax}} \sum_{y} P(y \mid a) \cdot U(y)$
  • $\max_x f(x)$: searches for the maximal value of $f$
  • $\underset{x}{\text{argmax}}\, f(x)$: searches for the $x$ which leads to the maximal value of $f$
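As a minimal sketch of this principle, assuming a discrete decision space with tabulated $P(y \mid a)$ and $U(y)$ (all names and numbers below are hypothetical, chosen only to make the computation concrete):

```python
import numpy as np

A = ["stay", "move"]                       # discrete decision space A (hypothetical)
P = {                                      # P[a][i] = P(y_i | a), each row sums to 1
    "stay": np.array([0.1, 0.7, 0.2]),
    "move": np.array([0.3, 0.2, 0.5]),
}
U = np.array([-10.0, 1.0, 5.0])            # U[i] = U(y_i)

def expected_utility(a):
    # E_{P(y|a)}{U(y)} = sum_y P(y|a) * U(y)
    return float(P[a] @ U)

# argmax over the finite decision space: max with a key function
a_star = max(A, key=expected_utility)
print(a_star, expected_utility(a_star))    # -> stay 0.7
```

Note that `max` here returns the maximising element itself, i.e. it plays the role of argmax, while `expected_utility(a_star)` is the corresponding max value.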

Anything can be called "optimal": for any behaviour there is some cost function against which it is optimised. Our definition of "optimal" therefore carries no normative judgement.

Background reading: Toussaint, Ritter & Brock: The Optimization Route to Robotics – and Alternatives. Künstliche Intelligenz, 2015

Theorem: If preferences are "consistent" (orderable, transitive, continuous, substitutable, monotone, decomposable) $\Rightarrow$

  • There exists a utility function $U$ such that the agent's preferences $y \preceq y^{\prime}$ are consistent with $U(y) \leq U(y^{\prime})$
  • The utility of a probabilistic outcome is the expected utility $\mathbb{E}\{U(y)\} = \sum_y P(y)\, U(y)$

For every agent with consistent preferences, there exists a utility function that represents those preferences.
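For illustration (numbers hypothetical): a lottery with $P(y_1) = 0.3$, $P(y_2) = 0.7$ and utilities $U(y_1) = 10$, $U(y_2) = 0$ has expected utility

$\mathbb{E}\{U(y)\} = 0.3 \cdot 10 + 0.7 \cdot 0 = 3$

so a consistent agent values this lottery exactly as it would value a sure outcome of utility 3.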

2. Decision Networks

[Figure: graph of a decision network]
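A decision network augments a Bayesian network with decision and utility nodes; evaluating it means computing, for each possible observation, the decision that maximises posterior expected utility. A minimal sketch under an assumed structure (one chance node W, one observation O with an information link to the decision, a tabulated utility U(W, a); all names and numbers are hypothetical):

```python
# Hypothetical decision network: chance node W (weather), observation O
# (forecast) informing the decision a, and a utility node U(W, a).
P_W = {"rain": 0.4, "sun": 0.6}                           # P(W)
P_O = {("rain", "wet"): 0.8, ("rain", "dry"): 0.2,        # P(O | W)
       ("sun", "wet"): 0.1, ("sun", "dry"): 0.9}
U = {("rain", "umbrella"): 5, ("rain", "none"): -10,      # utility node
     ("sun", "umbrella"): 2, ("sun", "none"): 10}

def posterior(o):
    # Bayes' rule: P(W | O=o) is proportional to P(O=o | W) * P(W)
    unnorm = {w: P_O[(w, o)] * P_W[w] for w in P_W}
    z = sum(unnorm.values())
    return {w: p / z for w, p in unnorm.items()}

def best_decision(o):
    # Choose the action maximising posterior expected utility
    post = posterior(o)
    eu = {a: sum(post[w] * U[(w, a)] for w in post)
          for a in ("umbrella", "none")}
    return max(eu, key=eu.get), eu

for o in ("wet", "dry"):
    print(o, best_decision(o))   # the optimal decision depends on the forecast
```

The observation-dependent argmax is what distinguishes a decision network policy from the unconditional MEU choice above.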
