OpenAI has quietly made a big move: its newly released o3 and o4-mini models reportedly take a qualitative leap in "seeing." The core of the upgrade is that the AI no longer just recognizes images but begins to "think with images." That may sound like science fiction, but it seems to mark a new stage in the development of visual AI.
Earlier AI systems were at best image recognizers: they could tell you what was in a picture. What sets o3 and o4-mini apart is that they can bring images into their own "chain of thought." Simply put, they can understand the meaning behind an image, analyze charts and read sketches the way a person would, and even reason more deeply from what they see.
For example, if you gave an earlier AI a chart, it might only tell you how many lines it contains and what data each line represents. Now the model can analyze the relationships in that data and tell you what the chart actually shows. The potential applications are broad, such as helping doctors analyze medical images or helping engineers understand complex engineering drawings. 📈📊🤔
Beyond better visual understanding, the newly released models are also more autonomous. Like an intelligent assistant, they can call various tools on their own to solve a problem. If they need information from the web, they open a webpage themselves; if they need to analyze data, they run Python code.
Reportedly, to solve a particularly complex problem, these models can even make hundreds of tool calls in a row. This autonomous agent capability should greatly expand the range of things AI can be used for.
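As a rough illustration of the tool-calling behaviour described above, here is a minimal Python sketch of an agent loop. It is not OpenAI's implementation: the `scripted_model` policy, the `ToolCall` type, the two toy tools, and the `max_calls` cap are all assumptions made up for this example, and a real system would replace `scripted_model` with an actual model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ToolCall:
    name: str      # which tool the model wants to run
    argument: str  # the single string argument passed to that tool


def web_search(query: str) -> str:
    # Hypothetical stand-in: a real agent would query a search engine here.
    return f"search results for {query!r}"


def run_python(expression: str) -> str:
    # Hypothetical stand-in for a sandboxed Python tool; it only sums "a+b+..." integers.
    return str(sum(int(part) for part in expression.split("+")))


# Registry mapping tool names to the functions that implement them.
TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "run_python": run_python,
}


def scripted_model(history: List[str]) -> Optional[ToolCall]:
    """Toy policy standing in for the model: pick the next tool call, or None when done."""
    tool_steps = sum(1 for entry in history if entry.startswith("tool:"))
    if tool_steps == 0:
        return ToolCall("web_search", "o3 release")
    if tool_steps == 1:
        return ToolCall("run_python", "2+3")
    return None


def run_agent(
    task: str,
    model: Callable[[List[str]], Optional[ToolCall]],
    max_calls: int = 200,
) -> List[str]:
    """Ask the model for a tool call, run it, feed the result back, and repeat."""
    history = [f"task: {task}"]
    for _ in range(max_calls):  # hard cap so a confused model cannot loop forever
        call = model(history)
        if call is None:  # the model decided it has enough information
            break
        result = TOOLS[call.name](call.argument)
        history.append(f"tool:{call.name} -> {result}")
    return history


if __name__ == "__main__":
    for entry in run_agent("demo task", scripted_model):
        print(entry)
```

The point of the sketch is the loop itself: each tool result is appended to the history the model sees on the next step, which is what lets a long chain of calls build on earlier results.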
ChatGPT Plus, Pro, and Team users can already try the new models. OpenAI also plans to release o3-pro within a few weeks, with more complete tool support. 🛠️
OpenAI is also reportedly considering acquiring Windsurf, an AI-assisted coding tool. If the deal goes through, it would further strengthen OpenAI's position in programming.
Meanwhile, some medical experts are already exclaiming that o3's intelligence is "close to genius" and that it can give professional answers comparable to those of top specialists. AI may really be developing faster than we imagined. 🤯
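Author: topwind
Article link:
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!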