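[新智元 digest] OpenAI has just launched o1-pro, the most expensive API in its history. Its input and output prices are staggeringly high, up to a thousand times those of DeepSeek-R1. An OpenAI researcher jokingly called it the Rolls-Royce of large models.
Even so, this practice still made o1 and o3 look stronger on FrontierMath than in other complex reasoning domains they were not optimized for. That said, the gap is probably not as large as with certain cases that, on MMLU, use ...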
measured by the MMLU benchmark, reach 90.8%, outperforming many industry-leading models. The o3-mini focuses on ...
It scored 87.2 points on the Massive Multitask Language Understanding (MMLU) Pro benchmark, a test that gauges a model’s knowledge. That bested DeepSeek-R1’s 84 points but trailed the 89.3 ...