The "computer use" race is heating up: which model is strongest? This evaluation platform is worth a look

Enterprise Development · 2024-11-01 22:43:15

Original · ully · AI工程化 (AI Engineering) · October 29, 2024, 11:53 · Beijing

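With the release of Anthropic's latest model, products in the "computer use" space have multiplied almost overnight. If you want to try one hands-on, which is the most convenient to use while still representing the best experience available today? I recommend open-interpreter. It is arguably the pioneer of this field, and I have covered it several times before (see "Open Interpreter gets an update, with flashier capabilities now live!"). It currently has as many as 54k stars, and as this space draws wider attention it is likely to be one of the projects that benefits most.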

More models will enter this field in the future. How, then, can we judge which model is better suited to "computer use" scenarios?

Here is a tool built specifically to evaluate performance in this area: OSWorld (https://os-world.github.io/).

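OSWorld was introduced in the paper "OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments". It is a real computer environment designed for multimodal agents, and it supports evaluating open-ended tasks on multiple operating systems (such as Ubuntu, Windows, and macOS). Its core goal is to provide a reproducible, scalable platform for comprehensively testing how multimodal agents perform on real-world tasks.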

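It covers 369 real-world computer tasks, each with a detailed initial-state setup and a custom evaluation script. The latest Claude model is on its leaderboard and, unsurprisingly, takes first place with a clear lead over second-place OpenAI.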
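To make the "initial-state setup plus custom evaluation script" idea concrete, here is a minimal Python sketch of how such a benchmark could be structured. It is not OSWorld's actual API: the `Task` class, `run_benchmark`, `toy_agent`, and the dictionary used as a stand-in for a machine's file system are all hypothetical names invented for illustration. A real environment would drive a full virtual machine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for machine state: file name -> file contents.
State = Dict[str, str]


@dataclass
class Task:
    task_id: str
    instruction: str
    setup: Callable[[], State]         # builds the task's initial state
    evaluate: Callable[[State], bool]  # task-specific check on the final state


def run_benchmark(tasks: List[Task], agent: Callable[[str, State], State]) -> float:
    """Return the fraction of tasks whose final state passes its own evaluator."""
    if not tasks:
        return 0.0
    passed = 0
    for task in tasks:
        state = task.setup()                          # fresh initial state per task
        final_state = agent(task.instruction, state)  # agent acts on the state
        if task.evaluate(final_state):
            passed += 1
    return passed / len(tasks)


# Two toy tasks, each with its own setup and evaluator.
tasks = [
    Task(
        "rename-file",
        "Rename notes.txt to todo.txt",
        setup=lambda: {"notes.txt": "buy milk"},
        evaluate=lambda s: "todo.txt" in s and "notes.txt" not in s,
    ),
    Task(
        "create-file",
        "Create an empty file named report.md",
        setup=lambda: {},
        evaluate=lambda s: s.get("report.md") == "",
    ),
]


def toy_agent(instruction: str, state: State) -> State:
    """Hypothetical agent that only knows how to rename notes.txt."""
    new_state = dict(state)
    if "Rename" in instruction and "notes.txt" in new_state:
        new_state["todo.txt"] = new_state.pop("notes.txt")
    return new_state


success_rate = run_benchmark(tasks, toy_agent)
print(f"success rate: {success_rate:.0%}")  # prints "success rate: 50%"
```

Because each task carries its own evaluator, success is judged on the resulting state rather than on the exact sequence of actions, which is what lets different models be compared on the same task set.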

References

[1] Example: https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo

Reposted from blog.csdn.net/sinat_37574187/article/details/143331910
