MachineLearning/ppo library
Proximal Policy Optimization (PPO) - lightweight stub
A compact PPO-style agent that demonstrates the API and the high-level mechanics: an actor/critic pair, the clipped surrogate objective, and multiple update epochs per batch. The implementation is intentionally simplified for clarity and testing. It does not aim for production RL performance, but it is useful as a starting point for experiments and unit tests.
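To make the clipped surrogate objective concrete, here is a minimal sketch in plain Python. It is not this library's code: the function name `clipped_surrogate`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration. It computes the standard PPO quantity, the mean over samples of min(r * A, clip(r, 1 - eps, 1 + eps) * A), where r is the probability ratio between the new and old policy and A is the advantage.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a value to maximize).

    Hypothetical helper for illustration; not part of this library's API.
    new_logp / old_logp: per-sample log-probabilities of the taken actions
    under the current and the data-collecting policy.
    advantages: per-sample advantage estimates.
    """
    if not advantages:
        raise ValueError("need at least one sample")
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s), from log-probs.
        ratio = math.exp(nlp - olp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two candidate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Identical policies give ratio 1, so the objective is the mean advantage.
same_policy = clipped_surrogate([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
```

Taking the minimum means a positive advantage stops rewarding the policy once the ratio exceeds 1 + eps, while a negative advantage is still penalized in full when the ratio grows. A training loop would typically maximize this value (or minimize its negative) over several epochs on the same batch.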