Decision Theory

  • decision variable $a \in A$, where $A$ is called the domain of the decision variable, $\text{dom}(a) = A$
  • outcome $y \sim P(y \mid a)$
  • utility function $U: y \mapsto U(y) \in \mathbb{R}$

1. Principle of Maximal Expected Utility

$a^{*} = \underset{a}{\text{argmax}}\; \mathbb{E}_{P(y \mid a)}\{U(y)\} = \underset{a}{\text{argmax}} \sum_{y} P(y \mid a) \cdot U(y)$
  • $\max_x f(x)$: searches for the maximal value of $f$
  • $\underset{x}{\text{argmax}}\, f(x)$: searches for the $x$ which leads to the maximal value of $f$
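As a minimal sketch of this principle, assuming a discrete decision space with tabulated $P(y \mid a)$ and $U(y)$ (all names and numbers below are hypothetical, chosen only to make the computation concrete):

```python
import numpy as np

A = ["stay", "move"]                       # discrete decision space A (hypothetical)
P = {                                      # P[a][i] = P(y_i | a), each row sums to 1
    "stay": np.array([0.1, 0.7, 0.2]),
    "move": np.array([0.3, 0.2, 0.5]),
}
U = np.array([-10.0, 1.0, 5.0])            # U[i] = U(y_i)

def expected_utility(a):
    # E_{P(y|a)}{U(y)} = sum_y P(y|a) * U(y)
    return float(P[a] @ U)

# argmax over the finite decision space: max with a key function
a_star = max(A, key=expected_utility)
print(a_star, expected_utility(a_star))    # -> stay 0.7
```

Note that `max` here returns the maximising element itself, i.e. it plays the role of argmax, while `expected_utility(a_star)` is the corresponding max value.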

Anything can be called "optimal": for any behaviour there is some cost function against which it is optimised. Our definition of "optimal" therefore carries no normative judgement.

Background reading: Toussaint, Ritter & Brock: The Optimization Route to Robotics – and Alternatives. Künstliche Intelligenz, 2015

Theorem: If preferences are "consistent" (orderable, transitive, continuous, substitutable, monotone, decomposable) $\Rightarrow$

  • There exists a utility function $U$ such that the agent's preferences $y \preceq y^{\prime}$ are consistent with $U(y) \leq U(y^{\prime})$
  • The utility of a probabilistic outcome is the expected utility $\mathbb{E}\{U(y)\} = \sum_y P(y)\, U(y)$

For every agent with consistent preferences, there exists a utility function that represents those preferences.
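For illustration (numbers hypothetical): a lottery with $P(y_1) = 0.3$, $P(y_2) = 0.7$ and utilities $U(y_1) = 10$, $U(y_2) = 0$ has expected utility

$\mathbb{E}\{U(y)\} = 0.3 \cdot 10 + 0.7 \cdot 0 = 3$

so a consistent agent values this lottery exactly as it would value a sure outcome of utility 3.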

2. Decision Networks

[Figure: graph of a decision network]
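A decision network augments a Bayesian network with decision and utility nodes; evaluating it means computing, for each possible observation, the decision that maximises posterior expected utility. A minimal sketch under an assumed structure (one chance node W, one observation O with an information link to the decision, a tabulated utility U(W, a); all names and numbers are hypothetical):

```python
# Hypothetical decision network: chance node W (weather), observation O
# (forecast) informing the decision a, and a utility node U(W, a).
P_W = {"rain": 0.4, "sun": 0.6}                           # P(W)
P_O = {("rain", "wet"): 0.8, ("rain", "dry"): 0.2,        # P(O | W)
       ("sun", "wet"): 0.1, ("sun", "dry"): 0.9}
U = {("rain", "umbrella"): 5, ("rain", "none"): -10,      # utility node
     ("sun", "umbrella"): 2, ("sun", "none"): 10}

def posterior(o):
    # Bayes' rule: P(W | O=o) is proportional to P(O=o | W) * P(W)
    unnorm = {w: P_O[(w, o)] * P_W[w] for w in P_W}
    z = sum(unnorm.values())
    return {w: p / z for w, p in unnorm.items()}

def best_decision(o):
    # Choose the action maximising posterior expected utility
    post = posterior(o)
    eu = {a: sum(post[w] * U[(w, a)] for w in post)
          for a in ("umbrella", "none")}
    return max(eu, key=eu.get), eu

for o in ("wet", "dry"):
    print(o, best_decision(o))   # the optimal decision depends on the forecast
```

The observation-dependent argmax is what distinguishes a decision network policy from the unconditional MEU choice above.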
