Evals and environments for computer use and software engineering work
Building RL environments for the next generation of capabilities in frontier models