Anthropic's Official Deep Dive into Prompt Engineering: How to Become an Excellent Prompt Engineer, and Which Pitfalls to Avoid

On September 7, Anthropic's expert team released an in-depth discussion of prompt engineering. The speakers included:

  • Amanda Askell, a research scientist who leads Anthropic's fine-tuning team, known for her expertise in ethics, philosophy, and AI safety.

  • Alex, lead of Anthropic's Developer Relations team

  • David Hershey, of Anthropic's customer-facing team

  • Zach Whitten, a prompt engineer at Anthropic

The original discussion is long and dense with technical detail, so the core points are distilled below for easier reading.

What Is Prompt Engineering?

1. Prompt engineering is about guiding the model to complete tasks and bringing out the most of its capabilities.

Quote: “I guess I feel like prompt engineering is trying to get the model to do things, trying to bring the most out of the model, trying to work with the model to get things done that you wouldn’t have been able to do otherwise.”

2. The core of prompt engineering is clear communication: talk to the model the way you would talk to a person.

Quote: “So a lot of it is just clear communicating. I think at heart, talking to a model is a lot like talking to a person and getting in there and understanding the psychology of the model.”

3. Prompt engineering involves trial and error, repeatedly adjusting the prompt to get the best result.

Quote: “I think the engineering part comes from the trial and error. So one really nice thing about talking to a model that’s not like talking to a person is you have this restart button, this giant go back to square zero where you just start from the beginning.”

4. Prompt engineering requires systems thinking, for example about data sources, latency, and system integration.

Quote: “You have to like think about where data comes from, what data you have access to. So like if you’re doing RAG or something, like what can I actually use and do and pass to a model? You have to like think about tradeoffs in latency and how much data you’re providing and things like that.”
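The RAG tradeoff described above can be sketched as a small prompt assembler: more retrieved passages mean better coverage but more tokens, latency, and cost. The function, budget numbers, and sample passages below are illustrative assumptions, not anything from the talk.

```python
# Minimal sketch: pack retrieved passages into a prompt under a context budget.
# The character budget is a hypothetical stand-in for a real token budget.

def build_rag_prompt(question: str, retrieved_docs: list[str], budget_chars: int = 2000) -> str:
    """Pack as many retrieved passages as fit the budget, most relevant first."""
    parts: list[str] = []
    used = 0
    for doc in retrieved_docs:  # assumed already sorted by relevance
        if used + len(doc) > budget_chars:
            break  # dropping a passage trades answer coverage for latency/cost
        parts.append(doc)
        used += len(doc)
    context = "\n\n".join(parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "When was the library founded?",
    # second "document" is deliberately oversized, so it gets cut by the budget
    ["The library was founded in 1901.", "It moved to Main St. in 1950." * 200],
)
```

Raising `budget_chars` would admit the second passage at the cost of a much larger prompt; that knob is exactly the latency-versus-coverage tradeoff the quote points at.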

5. Prompt engineering is like programming a model in natural language, but with the emphasis on clear description and instruction rather than abstraction.

Quote: “I mean, I kind of think of prompts as like the way that you program models a little bit. …but if you think about it a little bit, it’s like programming a model.”

How to Become a Good Prompt Engineer?

1. Communicate clearly: be able to state things plainly, understand tasks, and describe concepts well.

Quote: “Yeah, good question. I think it’s a mix of, like Zach said, sort of like clear communication. So the ability to just like clearly state things, like clearly understand tasks, think about and describe concepts really well.”

2. Be willing to iterate constantly, analyzing the model's mistakes and adjusting the prompt.

Quote: “People think that you’re writing one thing and you’re kind of done. And then I’ll be like, to get a semi-decent prompt, when I sit down with the model, in a 15 minute span, I’ll be sending hundreds of prompts to the model. It’s just back and forth, back and forth, back and forth. And so I think it’s this willingness to iterate and to look and think what is it that was misinterpreted here, if anything, and then fix that thing.”

3. Anticipate where a prompt might go wrong, and test it on edge cases.

Quote: “I think also thinking about ways in which your prompt might go wrong. So if you have a prompt that you’re going to be applying to say 400 cases, it’s really easy to think about the typical case that it’s going to be applied to, to see that it gets the right solution in that case and then to move on. I think this is a very classic mistake that people made. What you actually want to do is find the cases where it’s unusual.”
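The advice above amounts to building a small test suite for a prompt: deliberately include empty, mixed, and off-distribution inputs next to the typical one. A minimal sketch, where `classify` is a toy stand-in for a real model call and the template and cases are invented:

```python
# Sketch: exercise a prompt template on unusual cases, not just the typical one.

PROMPT_TEMPLATE = "Classify the sentiment of this review as positive or negative:\n{review}"

def classify(prompt: str) -> str:
    # Hypothetical stub standing in for an API call; a toy keyword heuristic here.
    return "positive" if "good" in prompt.lower() else "negative"

# name -> (input, expected label); the non-typical rows are the valuable ones
cases = {
    "typical":    ("The food was good.", "positive"),
    "empty":      ("", "negative"),                       # what *should* empty input do?
    "mixed":      ("Good food, terrible service.", "positive"),
    "non_review": ("What time do you open?", "negative"),  # not a review at all
}

failures = {name for name, (review, expected) in cases.items()
            if classify(PROMPT_TEMPLATE.format(review=review)) != expected}
```

The point is less the harness than the case list: "empty" and "non_review" force you to decide what the prompt should even do there, which is exactly the undefined behavior the typical case hides.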

4. Read model outputs closely to understand how it is thinking.

Quote: “Like in a machine learning context, you’re supposed to look at the data. It’s almost a cliche, look at your data. And I feel like the equivalent for prompting is look at the model outputs, just reading a lot of outputs and reading them closely.”

5. Step outside your own understanding of the task and convey to the model everything it needs to know.

Quote: “On the theory of mind piece, one thing I would say is it’s so hard to write instructions down for a task. It’s so hard to untangle in your own brain all of the stuff that you know that Claude does not know and write it down.”

6. Treat prompts like code: pay attention to precision, version control, and experiment tracking.

Quote: “Um, but that said, like you are compiling the set of instructions and things like that into outcomes a lot of times. And so precision and, and like a lot of the things you think about with programming about like version control and managing what it looked like back then when you had this experiment and, and like tracking your experiments and stuff like that, that’s all, um, you know, just equally important to code.”
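One way to act on "version control your prompts" is to give every prompt variant a content-addressed version, so an experiment result can always be traced back to the exact text that produced it. The registry below is an invented sketch, not a real tool:

```python
# Sketch: track prompt variants by content hash, like commits for code.

import hashlib
import time

registry: dict[str, dict] = {}

def register_prompt(name: str, text: str, note: str = "") -> str:
    """Store a prompt variant and return a short content-derived version id."""
    version = hashlib.sha256(text.encode()).hexdigest()[:8]
    registry[f"{name}@{version}"] = {
        "text": text,
        "note": note,          # why this variant exists
        "created": time.time(),
    }
    return version

v1 = register_prompt("summarizer", "Summarize the article in three bullet points.")
v2 = register_prompt("summarizer",
                     "Summarize the article in three bullet points, "
                     "citing a source line for each.",
                     note="add citations")
```

In practice the same effect is often achieved by simply keeping prompts in files under git; the hash trick just makes "which prompt produced this output?" answerable from logs.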

7. Experiment boldly; probe the boundaries of what the model can do.

Quote: “I would say like trying to get the model to do something you don’t think you can do. Like any, the time I’ve learned the most from prompting is like when I’m probing the boundaries of what I think a model is capable of.”

8. Use the model's own abilities, for example by having it explain its mistakes or critique your prompt.

Quote: “I was going to say one of the first things I do with my initial prompt is like, I’ll give it the prompt and then I’ll be like, I don’t want you to follow these instructions. I just want you to tell me the ways in which they’re unclear or any ambiguities or anything you don’t understand.”
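The technique in that quote is easy to mechanize: wrap your draft instructions in a request to critique rather than follow them. The wrapper text below is a paraphrase of the idea, not a quote from the talk:

```python
# Sketch: ask the model to find ambiguities in instructions before running them.

def critique_request(instructions: str) -> str:
    """Wrap draft instructions in a 'critique, don't follow' meta-request."""
    return (
        "Do not follow the instructions below. Instead, tell me where they are "
        "unclear, ambiguous, or missing information you would need to do the task.\n\n"
        f"<instructions>\n{instructions}\n</instructions>"
    )

msg = critique_request("Extract all dates from the document and normalize them.")
# `msg` would then be sent to the model as a normal message.
```

A likely critique of this example's instructions: normalize to which format, and what counts as a date? That is exactly the kind of gap the technique surfaces before you run the prompt on real data.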

Common Misconceptions About Prompt Engineering

1. Treating the prompt box like a simple search box and just typing keywords.

Quote: “I think a lot of people still haven’t quite wrapped their heads around what they’re really doing when they’re prompting. Like a lot of people see a text box and they think it’s like a Google search box.”

2. Over-relying on role-play or metaphors to trick the model instead of clearly describing the task itself.

Quote: “I think as models are more capable and understand more about the world, I guess I just don’t see it as necessary to lie to them. I mean, I also don’t like lying to the models just because, you know, I don’t like lying generally.”

3. Believing some perfect prompt will solve everything, rather than iterating and adjusting.

Quote: “It’s a bit of a double-edged sword though, because I feel like there’s like a little bit of prompting where there’s always like this mythical better prompt that’s going to solve my thing on the horizon.”

4. Obsessing over grammar and punctuation instead of clear expression and logic.

Quote: “I usually try to do that because I find it fun, I guess. I don’t think you necessarily need to. I don’t think it hurts.”

5. Underestimating the model's capabilities and thinking you need to simplify the task or hide complexity.

Quote: “Like, I feel like in the past, like, I would somewhat intentionally hide complexity from a model where I thought, like, it might get confused or lost or, like, hide, like, it just couldn’t handle the whole thing.”

The Future of Prompt Engineering

1. Prompt engineering will not disappear, but it will keep evolving as models improve, always chasing peak performance.

Quote: “On the question of where prompt engineering is going, I think this is a very hard question. On the one hand, I’m like, maybe it’s the case that as long as you will want the top. Like, what are we doing when we prompt engineer? It’s like what you said. I’m like, I’m not prompt engineering for anything that is easy for the model. I’m doing it because I want to interact with a model that’s like extremely good. And I want to always be finding the kind of like top 1%, top 0.1% of performance and all the things that models can barely do.”

2. Prompt engineering will lean more on meta-prompting, for example using the model to help generate prompts or explain its reasoning.

Quote: “Yeah, I’m definitely working a lot with meta prompts now. And that’s probably where I spend most of my time is finding prompts that get the model to generate the kinds of outputs or queries or whatever that I want.”
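A meta-prompt, as described above, is a prompt whose output is itself a prompt. A minimal sketch of what such a template might look like; the wording and sample task are invented for illustration:

```python
# Sketch: a meta-prompt that asks the model to write a task prompt for you.

def metaprompt(task: str, examples: list[str]) -> str:
    """Build a request for the model to draft a prompt for `task`."""
    shots = "\n".join(f"- {e}" for e in examples)
    return (
        f"Write a detailed prompt that would make a language model {task}.\n"
        "It should specify the output format, handle edge cases, and include examples.\n"
        f"Sample inputs the prompt must handle:\n{shots}"
    )

mp = metaprompt(
    "extract invoice totals as JSON",
    ["Invoice #12: total $40.00", "TOTAL DUE ... 1.299,00 EUR"],
)
```

Feeding representative inputs into the meta-prompt (including awkward ones, like the European number format here) pushes the generated prompt to cover edge cases you might forget to write by hand.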

3. Prompt engineering will focus more on drawing information out of the user, for example through guided interaction or by having the model "interview" the user.

Quote: “Anecdotally, I’ve started having Claude interview me a lot more. That is like the specific way that I try to elicit information, because again I find the hardest thing to be actually pulling the right set of information out of my brain and putting that into a prompt. That’s like the hard part to me. And so specifically asking Claude to like interview me and then turning that into a prompt is a thing that I have turned to a handful of times.”
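The "interview me" pattern above is just a particular framing of the first message you send. A sketch of what that framing might look like; the phrasing and sample task are invented:

```python
# Sketch: ask the model to interview you before any task prompt is written.

INTERVIEW = (
    "I want to write a prompt for the task described below, but I haven't "
    "written down everything I know about it. Interview me: ask one question "
    "at a time about goals, inputs, edge cases, and output format. When you "
    "have enough information, draft the prompt.\n\n"
    "Task: {task}"
)

msg = INTERVIEW.format(task="triage incoming support emails by urgency")
```

The "one question at a time" constraint matters: it turns the exchange into a dialogue that surfaces the context stuck in your head, rather than a single checklist you would skim past.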

4. Ultimately, prompt engineering will become the skill of clearly "externalizing" human thinking for the model, for example by defining new concepts.

Quote: “So like often my style of prompting, like there’s various things that I do, but a common thing that’s very like a thing that philosophers will do is I’ll define new concepts. So because my thought is like you have to put into words what you want. And sometimes what I want is fairly like nuanced, like the what is a good chart or like usually, you know, like I don’t know, like how is it that you when should you grade something as being correct or not? And so there are some cases where I will just like invent a concept and then be like, here’s what I mean by the concept.”
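The "define a new concept" style described above can be sketched concretely: name a nuanced criterion, define it explicitly in the prompt, and then use the name in the instruction. The concept, definition, and grading task below are invented for illustration:

```python
# Sketch: define a bespoke concept in the prompt instead of hoping the model
# infers the nuance. "well-grounded" is an invented term with an explicit definition.

CONCEPT = (
    "Call an answer 'well-grounded' if every factual claim in it is directly "
    "supported by a quoted passage from the source text."
)

def grading_prompt(answer: str, source: str) -> str:
    """Build a grading prompt that uses the defined concept by name."""
    return (
        f"{CONCEPT}\n\n"
        f"<source>\n{source}\n</source>\n\n"
        f"<answer>\n{answer}\n</answer>\n\n"
        "Is the answer well-grounded? Reply yes or no, then explain."
    )

p = grading_prompt(
    "The bridge opened in 1937.",
    "The Golden Gate Bridge opened to traffic in 1937.",
)
```

Once defined, the concept becomes reusable vocabulary: later instructions can say "reject answers that are not well-grounded" without restating the whole criterion.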

Reference: https://www.youtube.com/watch?v=T9aRN5JkmL8

Reposted from blog.csdn.net/w605283073/article/details/147003900