MachineLearning/ppo library

Proximal Policy Optimization (PPO) - lightweight stub

A compact PPO-style agent that demonstrates the API and the high-level mechanics: an actor/critic pair, the clipped surrogate objective, and update epochs. The implementation is intentionally simplified for clarity and testing. It does not aim for production RL performance, but it is useful as a starting point for experiments and unit tests.
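
The clipped surrogate objective mentioned above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not this library's code: the function name clipped_surrogate, its parameters, and the default clip_eps=0.2 are assumptions chosen for the example.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration; not part of this library's API.
    Each argument is a sequence of floats, one entry per sampled action.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds 1 + clip_eps. With a negative advantage, the unclipped term is kept whenever it is the lower of the two, so moving further toward a bad action is still penalized in full. An optimizer would typically minimize the negative of this value over several epochs on the same batch.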

Classes

PPO