MachineLearning/mdp library
Markov Decision Process (MDP) utilities - generic
A compact, engineering-oriented MDP helper that provides:
- generic value iteration and policy iteration solvers for discrete MDPs
- support for arbitrary state/action indices, with transitions and rewards represented as dense arrays (List)
- convergence criteria and iteration limits
Contract:
- Input: number of states, number of actions, transition probabilities P_a(s, s'), rewards R_a(s, s'), discount factor gamma.
- Output: value function and greedy policy.
- Errors: throws ArgumentError on shape mismatches.