OpenAI unveils specs for desired AI model behavior
In a bid to “deepen the public conversation about how AI models should behave,” AI company OpenAI has introduced Model Spec, a document that shares the company’s approach to shaping desired model behavior.
Model Spec, now in a first draft, was introduced May 8. In addition to laying out how OpenAI shapes model behavior, the document explains how the company evaluates trade-offs when conflicts arise. The approach includes objectives, rules, and default behaviors that will guide OpenAI’s researchers and AI trainers who work on reinforcement learning from human feedback (RLHF). The company will also explore how much its models can learn directly from the Model Spec.