Forcing infinite memory for lim inf total payoff objectives in countable MDPs

on Wed, 16:30 for 30min

We look at an example of a countable MDP with integer-valued transition rewards. Every infinite run induces a sequence of total payoffs (the sequence of sums of all rewards seen so far).The objective is to maximise the probability that the lim inf is non negative. We present a counterexample to show that infinite memory is required in order to satisfy the lim inf objective for epsilon-optimal strategies. Furthermore, this holds even if the step-counter is implicit in the state and the MDP is finitely branching.

Overview Program