模块 #2:深入解析使用 AI 构建的五个阶段
我的理解
本讲提出了用 AI 构建的五阶段成长框架,从“黑盒阶段”(只给目标、不提供任何上下文)一路进化到“共同创造者”阶段。核心洞见在于:现有工作流程是为人与人协作设计的,而 AI 遵循完全不同的规则,这种错位才是体验不稳定的根本原因。以 WBR(周度业务回顾)为贯穿案例,演示了黑盒状态下 AI 如何因缺乏 schema 而“幻觉”出根本不存在的数据库表——这不是 AI 的过错,而是我们未能提供足够上下文。AI 被训练得“乐于助人”,训练过程奖励看起来完整的回答,因此在缺乏事实时它会给出自信的猜测而非坦承不知。解决方案始终在我们这侧:像给学生提供考试材料一样,把必要事实注入 prompt。随机试错是最慢的学习路径,结构化的五阶段路径可以帮助我们系统性地缩小与 AI 协作的差距。
相关链接
- Ch02-L02 研究 LLM内部机制 记忆知识上下文 — LLM 内部机制是理解五阶段框架的基础理论层
- Ch02-L07 案例研究 周度业务回顾 问题分析 — WBR 案例是本讲所预告的完整端到端展开
- Ch01-L06 技术洞察 3 从用户到构建者 — 从用户到构建者的主题与五阶段框架直接相承
- Ch04-L11 技术洞察 像管理者一样思考 — 进阶阶段中像管理者一样分配 AI 任务的思维方式
原文
Lesson 8 of 68 视频 - 模块 #2:深入解析使用 AI 构建的五个阶段 / Video - Module #2: Deep dive into the five stages of building with AI 好,我们开始吧。Okay, we can start. Hide transcript Transcript 00:00 好,我们开始吧。Okay, we can start. 00:02 模块 2,yen。Module 2 yen. 00:03 全都是你的了。It’s all yours. 00:05 太好了,谢谢。Great, thank you. 00:09 现在还在解决一些屏幕共享的问题。Currently still dealing with some screen sharing issues. 00:13 好了,我想大家都应该能看到我的屏幕了。All right, I think everyone should be able to see my screen. 00:16 欢迎回到第一模块,And welcome back in module one, 00:18 YZ 刚才提到,AI YZ just said AI is a very big change 00:20 将彻底改变我们的工作方式。in how we work. 00:23 我们常用‘范式转变’这个词来描述它。And we use the term like paradigm shift. 00:27 那么日常生活中这种感觉是怎样的?And what does that feel like day to day? 00:29 感觉很不稳,It feels unstable, 00:31 仿佛脚下的地面都在晃动。like the ground is moving under you. 00:34 你可以想象一下 And you can think about the shift 00:35 从仅有图书馆 from only having the libraries 00:37 转变到拥有互联网的过程。to having the Internet. 00:39 旧规则不再管用了。And old rules stop working. 00:41 新的技能变得对成功至关重要。New skills become important for success. 00:44 最困难的不仅是 And the hardest part is not only 00:45 养成新习惯。learning new habits. 00:47 你还需要放下一些旧习惯。You also need to let go of some old habits. 00:49 你现在正在路上。You are now in your way. 00:51 在本节中,我们将为你提供一条清晰的路径。And in this session, we give you a clear path on that. 00:54 我们将介绍一个简单框架 We’ll show a simple framework 00:56 帮助你从传统工作流程 that will help you move from old workflows 00:58 平稳过渡到适应 AI 时代的新流程。to the new ones that fit the AI era. 01:02 不过,在讨论解决方案之前,But before we talk about fixes, 01:03 我们先要直面并命名这些痛点。we should name the pain. 01:05 这种变化带来了真正的压力 And this change creates real stress 01:08 和问题给所有试图用 AI 进行构建的人。and real problems for everyone who tries to build with AI. 01:13 当我们开始使用 AI 时,And when we start working with AI, 01:15 体验常常磕磕绊绊。the experience is often bumpy. 01:17 它可能一次表现得很好 It may work great once 01:19 ,但下一次在同一个任务上就失败了。and then fails the next time on the same task. 01:23 而且每天都有可能冒出新工具 And a new tool may show up every day 01:26 我们被新闻头条吸引 and we are drawn in the headlines 01:28 却始终不确定该关注哪个。and are never sure which one we need to follow. 01:32 而且它能出色处理难题 And it also handles a hard task well 01:34 却在更简单的问题上栽跟头。and then trips on an easier one. 01:37 这些是不匹配的表现症状。These are symptoms of a mismatch. 01:40 这是因为我们现有的工作流程 This is because our existing workflows 01:43 是为人与人协作而设计的。were built for human to human collaboration. 01:45 但 AI 遵循着不同的规则。But AI follows different rules. 01:47 这个环节的全部内容就是 And this session is all about 01:49 更新我们的工作方式 updating how we work 01:51 让这两方面能够完美契合。so those two sides would fit together. 01:55 那么,我们该如何缩小这个差距 And how do we close this gap 01:58 呢?在工作方式发生巨大变革时 during a big change in how we work? 02:00 随机试错是学习的一种缓慢方法。Random trial and error is a slow way to learn. 02:03 这就好比没有飞行课本,就在驾驶舱里 It’s like pressing cockpit buttons with no lesson plan, 02:09 边按按钮边学飞。just learning how to fly right there in the cockpit. 02:11 你也许不知道某件事为什么会成功。You may not know why something worked. 02:14 这就是我们制定清晰课程的原因。And that’s the reason why we prepare a clear curriculum. 02:17 那就是一个结构化的学习路径,也就是课程。A structured path is that curriculum. 02:21 所以更具体地说,So more specifically, 02:23 我们将学习过程分为五个阶段。we will divide into the learning stage, into five stages. 02:25 在这个展示中,In this presentation, 02:27 在这个讲座中,in this lecture, 02:28 它不是一套技巧列表。it is not a list of tricks. 02:31 这是你作为构建者的成长之路。It is a growth path for you as a builder. 02:34 并且每个阶段都会建立在上一个阶段的基础上。And each stage would add on the last. 02:37 总的来说,你将获得更多与 AI 的协作 Overall, you would gain more teamwork with AI 02:40 以及更大的控制力。and more control. 02:41 你的角色将从基础用户 Your role would move from the basic user 02:43 转变为真正的系统设计师。towards some Real system design. 02:47 所以今天我们将完整走一遍整个路径。So today we’ll walk the full path. 02:49 从阶段 1 的 AI 作为黑盒子,Stage 1 AI as a black box, 02:52 一路到阶段 5,all the way up to stage five, 02:54 AI 作为共同创造者。AI as a co creator. 02:56 到最后,你就会知道 And by the end you will know 02:58 该如何进行调整 what to adjust each time 02:59 每次生成式 AI 给出不良结果时。the AI gives a bad result. 03:02 与其只是通过猜测和反复试错 Instead of just guessing and trial and error 03:05 来比较使用 AI 的不同方法,to compare different ways of using AI, 03:07 我们确实需要一些共享的示例或阶段。we do need some shared example or stage. 03:11 对我们而言,那就是每周的商业审查会议 For us that is the weekly Business Review 03:14 或者简称 wbr。or wbr. 03:16 对很多人来说,这可能是个不错的选择。And for many people probably it’s good. 03:21 虽然我相信很多人 Although I believe for many people 03:23 已经熟悉 wbr 这个概念了。they are already familiar with the concept of wbr. 03:26 但对不熟悉这个概念的人来说,But for those who are not familiar with it, 03:29 它指的是每周一次,it means every week, 03:31 尤其是像亚马逊这样的科技公司,and especially tech companies like Amazon, 03:34 很多员工,特别是高层领导,a lot of people, especially senior leaders, 03:37 会聚在一起审视关键业绩指标。would sit together and review the top line metrics. 03:41 而且在那一刻,And during that moment, 03:44 这是少数能 it’s one of the few opportunities 03:47 与 M1 或 M2 高层领导并肩站立、for a junior or a senior like data analyst 03:55 展示研究成果的机会之一。to just stand with M1 or M2 leadership 04:02 所以一方面,这意味着 So on one hand this means 04:04 WBR 是一个展示影响力的绝佳场合。the WBR is a strong place to show the impact. 04:07 但另一方面 But on the other hand 04:09 这也意味着巨大的压力 it also means it gives a lot of pressure 04:12 因为准备工作稍有不慎就容易出错。because it’s so easy to mess up the prep work and make mistakes. 04:18 这就促使我们对所有工作都要反复 And that makes us to double and triple check 04:21 双重、三重检查。everything we do. 04:23 这会造成很大差距 This causes a lot of gap 04:24 我们即将进行的演示 and the demo we will do very soon 04:26 将清晰展示这一差距。will show that gap. 04:30 为了提供更多背景,And to give more context, 04:32 在典型的 WBR 场景下,in a typical WBR scenario, 04:34 数据分析师通常会编写一些 SQL 查询,usually the data analyst would put some SQL queries, 04:38 使用一些 SQL 查询来提取数据 use some SQL queries to pull data 04:42 从生产数据库中。from some production database. 04:43 在这里,我们将使用 SQLite。And here we’ll use SQLite. 04:45 我们使用哪种数据库并不重要,It doesn’t really matter which kind of database we use, 04:48 主要为了简单起见。we use that mainly for simplicity. 04:51 然后通常人们会将它复制粘贴 And then usually people would then copy paste it 04:54 到 Google 表格或 Excel 中 into either a Google spreadsheet or an Excel 04:57 以便获得一些数字数据。so that we have some numbers. 05:00 它会施展一些 Excel 技巧,It will do some Excel trick, 05:03 生成一些结果,generate some realization, 05:04 然后分析师会对这些结果进行分析,and then the analyst would do some analysis on it 05:07 并发送给整个团队,and send out to the entire team 05:09 以展示这个过程多么繁琐且容易出错。just to show how tedious and how error prone this is. 05:13 我已经准备好了这个电子表格 I prepared this spreadsheet 05:16 你们可以随意复制一份。and feel free to also make a copy of it. 05:20 幻灯片组里就有它的链接 You can get the link to it 05:22 你们可以直接获取。right here from the slide deck. 05:24 讲座结束后,我们会把幻灯片 We will also send the slide deck 05:27 连同录音一起发给大家。along with the recording after this lecture. 05:30 首先,And to begin with, 05:30 如我们刚才所提,as we just mentioned, 05:32 我们可能有一个数据库。we may have a database. 05:34 数据库有一些表,The database has some tables, 05:37 但数据库的具体格式或表结构并不重要。but it’s not really important that what kind of format or table the database has. 05:43 反正我们已经有了一个某种格式的数据库 We have a database anyway in some format 05:48 通常人们会 and usually what people do here is 05:49 使用现有的 SQL 查询。they have an existing SQL query. 05:52 再说一遍,Again, it’s not important what exactly is 05:53 SQL 查询的具体内容并不重要。the SQL query. 05:56 我们有一个 SQL 查询 We have a SQL query 05:58 人们通常会直接运行它 and then people would just execute a SQL query 06:00 来获取一些数据。and get some numbers. 06:02 这些就是 This would be the Numbers 06:03 分析师将进行分析的数字。the analyst would perform the analytics on. 06:08 然后他们会全选,And then what they will do is they will select all, 06:10 并复制所有数据,copy all of the data, 06:12 返回 Google spreadsheet 中,go back into the Google spreadsheet 06:17 直接粘贴这些数字。and just paste the numbers. 06:20 之后,电子表格会更新 After that the spreadsheet would update 06:22 可视化图表,分析师就会说 the visualization and the analyst would say 06:25 ‘这真的很奇怪’。that’s really weird. 06:27 为什么这条曲线形状这么奇怪?Why is this curve in this weird shape? 06:30 然后他们得回去 Then they would need to go back 06:32 调试一番,发现原来,and debug and find out, oh, 06:34 是笨蛋我忘了粘贴到正确位置。this is because stupid me forgot to paste in the correct place. 06:38 我们本该保留表头 We should have leave this header 06:40 ,从第二行开始粘贴 and paste from the second row 06:42 ,而不是从第一行。instead of from first row. 06:44 所以我们得重做这项工作,So then we will need to redo the work, 06:46 从这里粘贴过来,paste it from here, 06:47 再仔细检查一下 Verizon。double check the Verizon. 06:50 哦,现在运行正常了 Oh it works now 06:51 看起来挺好的。it looks good. 06:53 然后,我们需要在它基础上进行一些分析 And then we would need to perform some analysis 06:57 ,可能截个图,on top of that and probably do a screenshot, 06:59 通过邮件发出去。send it out through email. 07:02 这样我们可能会听到经理办公室传来尖叫声 Then we will get probably some screaming from the manager’s office 07:08 因为我们忘了把日期从黑色星期五当周改成下一周 because we forgot to change this date from the week of Black Friday to the next week 07:15 导致时间范围与标题不一致。and we will see an inconsistency from the time range with the title. 07:20 这基本上意味着我们出了个错,And this basically means we made a mistake, 07:24 可能需要在发送前修正它,we probably will need to fix it before we send it out 07:27 最好能在发送前就发现它。and even better notice it before we send it out. 07:31 所以这就是我们刚才提到的那个差距。So this is the gap we just mentioned. 07:35 从这里可以看到,And as we can see here, 07:37 虽然 WBR 的呈现时刻至关重要,although for the WBR that moment of presentation really matters, 07:40 但准备过程却包含许多琐碎且充满压力的步骤。the prep has a lot of small and stressful steps. 07:46 我得运行 SQL,I run SQL, 07:47 手动复制粘贴,I copy and paste by hand 07:48 每一步都要集中注意力,而且很容易出错。and each step takes focus and can go wrong. 07:51 一个小错误就可能损害业务叙事 And one small error can hurt the business story 07:54 ,也会让领导们在观看时失去对我的信任。and also hurt the trust on me when the leaders are watching. 07:59 总的来说,我想说明的是 So overall what I want to explain here is 08:01 这个工作流程让我们陷入琐碎的重复劳动,the workflow pushes us towards some careful repetitive work, 08:05 而不是真正获得洞见。not insight. 08:07 我所花费的精力 And the energy I spend 08:09 都用来避免犯错,goes into not making mistakes, 08:11 而不是进行分析。not doing an analysis. 08:14 它们投入到数据的准备工作中。They go into the preparation of the data. 08:18 人们可能会把我当成一台谨小慎微的机器,And people may judge me like a careful machine, 08:21 而不是一个犀利的分析师来评判。not like a sharp analyst. 08:24 如果我们退一步看,And if we step back a bit, 08:26 这意味着每份工作实际上包含两种类型的工作。this means every job has actually two kinds of work. 08:30 一种是机械性工作,One is the mechanical work, 08:32 比如这里写 SQL、复制粘贴,like the SQL writing here or the copy pasting here, 08:35 或者发现标题这里有差异,or noticing there is a discrepancy on the title here, 08:38 这些都是机械性的。it’s all mechanical. 08:40 还有另一种工作 And there are also another kind of work 08:42 是真正的判断性工作,are on the real judgment, 08:44 比如我能进行什么样的分析,like what kind of analysis I do, 08:47 以及我能为业务带来什么样的洞见。what kind of insights I can bring to the business. 08:51 不幸的是,现在 And the unfortunate fact is right now 08:53 我们大部分时间都花在了机械性工作上,most of our time actually goes into the mechanical side, 08:58 比如领导对我的工作评价如何,like how well my work was perceived by leadership 09:02 主要取决于我检查标题是否正确检查了多少次。is how many times I double check whether the title is correct most of the time. 09:09 我们想要颠倒这个比例。And we want to flip that ratio. 09:11 这就是我们将用 AI 来解决的问题 And that is the problem we will address with AI 09:15 你可能会问:AI 如何帮助 and You may ask how can AI help 09:18 这个缓慢而重复的过程?with this slow and repetitive process? 09:21 大多数人从我们所说的第一阶段开始,And most people start at what we call the stage one, 09:24 也就是黑箱阶段。the black box stage. 09:27 你可以这样来想。Think of it like this way. 09:28 你雇用了一个新同事,可能名叫 ChatGPT,You hire a new colleague probably named ChatGPT, 09:32 让他坐下说:好的,给我做一个周报。sit them down and say okay, make me a weekly report 09:34 你要知道,我们不提供数据库访问权限,and know that we give no database access, 09:37 不提供样本报告,也没有关于哪些指标重要的任何上下文。we give no sample report, no context about what metrics matter. 09:42 它们会像 ChatGPT 一样全力以赴 They will try their best like ChatGPT does 09:45 但结果仍然只是一个猜测。but the result will be a guess. 09:47 如今,大多数人就是这样使用 AI 的。This is how most people use AI today. 09:49 更糟糕的是,And when to make things even worse, 09:51 当猜测出错时,when a guess is wrong, 09:52 我们无法诊断原因,we have no way to diagnose why 09:54 人们往往分享随机的提示技巧,而不是清晰的方法。people share random prompt tricks instead of a clear method. 10:00 你可能会听到这样的建议:哦,You may hear tips like oh, 10:02 给 AI 许诺一些小费钱,promise AI some money like tip 10:04 它就会更努力工作。and it will try harder. 10:06 那些只是未经验证的小窍门,And those are unverified tricks, 10:07 并非真正的工程方法。not engineering. 10:09 所以在这个阶段,So overall with this stage, 10:11 总体上我们可能会失控,we may lose control, 10:13 无法以稳定的方式进行训练或改进。we cannot train or improve on a steady way. 10:16 下一个演示将清楚地展示它的实际样子。And the next demo will show exactly what this looks like. 10:22 那么,我们来运行一个简单的任务 So let’s run a simple task 10:25 用黑盒方式吧。with the blackbox style. 10:26 这里的任务是 The task here is 10:27 我们就用一个简单的步骤吧。let’s just use a very simple step. 10:30 只需写一个 SQL 查询来获取我们的每周数据。Just write a SQL query for our weekly numbers. 10:33 我们的猜测非常简单。And our guess is simple. 10:35 如果我们只给模型一个目标,却不提供 If we give a goal but no facts 10:38 数据库的相关事实,模型就必须猜测 about a database, the model must guess 10:40 而且它应该失败。and it should fail. 10:42 在演示中,不仅要观察它失败了,And in the demo, not only watch that it fails, 10:45 还要看清它是怎么失败的。but how it fails. 10:47 注意它的语气,因为有时语气听起来很肯定 Notice the tone because sometimes it can sound sure 10:50 但内容却是错误的。while the content is wrong. 10:53 顺便说一句,我们这里已经准备好了一些对话 And here we already cooked some conversation 10:58 示例。by the way. 10:59 在这里,你可以看到 Here you will be able to see 11:00 两个不同的链接。two different links. 11:01 一个是 Deep seq 的,One is for Deep seq, 11:03 一个是 GPT 的,one is for GPT, 11:04 它们都在幻灯片中提供。they’re all available on the slide deck. 11:06 为了节省时间,And here for the sake of time, 11:07 这里我们只回顾 Deep seq。we’ll only review the deep seq. 11:10 不过,GPT 的对话几乎一模一样。But the GPT conversation is nearly exactly the same. 11:13 所以,我们使用的提示内容是。So the prompt we used was. 11:14 你是一位资深经验丰富的分析师。You are an experienced senior analyst. 11:16 我有一个电商数据库 I have an E Commerce database 11:18 ,想计算主要指标 and I want to calculate top line metrics 11:19 两个日期之间的,并编写 SQL 查询。between two dates and write a SQL query. 11:23 然后它提供了一些 SQL 查询 Then it gives us some SQL query 11:26 ,甚至还附上了示例输出结果。and even gives us some example output. 11:30 不过,如果我们复制这个 SQL 查询,However, if we copy this SQL query, 11:33 到我们的数据库中运行它,go to our database and run it, 11:35 就会看到一些错误信息。we will see some error messages. 11:37 基本上,它表示不存在这样的 orders 表,Basically it says no such table orders 11:41 这意味着它所谓的幻觉了,which means it so called hallucinates 11:43 或者说它虚构了一个名为 orders 的表。or it makes up a table called orders. 11:46 它根本就不存在。It just doesn’t exist. 11:48 不过,要记住,这不是他们的错,However, remember that it’s not really their fault, 11:50 而是我们的责任。it’s our fault. 11:52 我们实际上没有给它提供任何上下文 We didn’t really give it any context 11:54 或表名相关的信息。or any information about the table name. 11:57 但这通常就是人们会做 But this is usually what people would do 12:00 的事,使用 AI 时。when they use AI. 12:02 他们通常就是把这里的错误信息 They would just copy paste this error message 12:05 复制粘贴过去,然后让 AI 帮忙修复。from here to here and ask AI to fix it. 12:08 而且如你所知 And as you may already know 12:09 如果没有具体信息 without any actual information 12:11 AI 的能力是非常有限的。what the AI can do is very limited. 12:14 它无法修复这个问题 It cannot fix this problem 12:17 没有非常具体的表名。without any very specific table names. 12:20 它可能会想,哦,交易如何?It may guess like oh, how about transactions? 12:23 这听起来是个合理的表名,That sounds like a plausible table name, 12:26 但实际上 but it will not work 12:28 行不通,因为那完全是碰运气。because it’s pure luck. 12:32 它也会猜测是关于购买的 It also guesses about purchases 12:34 但其实我们用的是另一个名称。but actually we use a different name. 12:37 所以从这个演示中我们能学到什么?So what we can learn from the demo here 12:41 人们称其为 hallucination,但究竟发生了什么?is people call this hallucination, but what is actually happening? 12:48 我想打个比方,就好像 What I want to do an analogy on is like 12:50 想想一个学生在考试时 think about a student on a test 12:52 试卷问:哦,请写一个查询 and the exam asks oh, write a single query 12:54 每周收入的单条查询语句。for weekly revenue. 12:57 但学生从未见过数据库的架构。But the student never saw the database schema. 12:59 因此,他们将面临两个选择。So they will have two choices. 13:02 一种是给出空白答案,得零分。One is give an empty answer and get zero points. 13:05 另一个选择是给出错误的答案 And another option is give a wrong answer 13:07 ,但要自信满满,这样很可能还能拿到部分分数。but sound confident so that probably I will get some partial credit. 13:12 这正是 That’s exactly the reason 13:12 学生会填入内容 why the student would fill something in 13:14 以及 AI 会做同样事情的原因。and why the AI does the same thing. 13:17 因为这个模型也面临同样的压力。Because the model faces the same pressure. 13:20 它被训练得很有帮助。It is trained to be helpful. 13:22 正如你可能已经注意到的,Like you probably already see, 13:24 你是一个乐于助人的助手。you are a helpful assistant. 13:26 这就是训练的目标。That’s the goal for the training. 13:28 训练过程会奖励那些 And the training process rewards the answers 13:30 看起来完整的回答。that looks complete. 13:33 当事实缺失时,And when the facts are missing, 13:34 一个自信的猜测会比 a confident guess would score better than 13:36 简单地说‘哦,我不知道’得分更高。oh, I don’t know that simple answer. 13:39 而这个模型会遵循那种激励机制。And the model follows that incentive. 13:42 那么,该怎么解决这个问题呢?So how to fix that? 13:43 解决方案在我们这边。The fix is on our side. 13:45 我们没给学生提供考试材料 We didn’t give the student the exam materials 13:48 所以现在给他们材料,看看会有什么变化。so let’s give them the materials and see what changes. 13:53 左侧这里 And here on the left you see 13:53 显示的是那个 field demo 的 prompt。the prompt from that field demo. 13:57 它要求不带架构的 SQL。It asks for SQL with no schema. 14:00 右边我们添加了数据库架构。And on the right we add the database schema. 14:03 同样的目标,同样的学生,Same goal, same student, 14:05 但现在他们有了 but now they have what they need 14:06 正确回答所需的一切。to answer correctly. 14:08 在这里,我们还能看到 In here we can also see 14:10 两个对话。two conversations. 14:12 我们现在切换到这个。We’ll switch to this one. 14:14 所以我们一开始还是用了错误的提示 So we still use the wrong prompts to begin with 14:18 让我们来看看。let’s see. 14:20 但在那之后 But then after that 14:22 我们把它放到 we give it, 14:24 最下面,我觉得是这样。I think it’s probably all the way below. 14:35 我们给它这个数据架构,We give it this data schema, 14:38 例如有哪些表、哪些列以及它们的数据类型。like what kind of tables and what kind of columns data type we have. 14:46 因此,基于这个,So that based on this, 14:48 AI 就能生成一个 SQL 查询,the AI is able to get a SQL query 14:52 我们只需复制粘贴到数据库中,and we can just copy paste into our database 14:56 却出现了错误。and wrong. 14:58 这次它终于能给出正确的结果了。And this time it is able to give us the correct result. 15:04 但我要说的是 But what I want to mention is 15:06 我们刚才所做的其实有一个专有名词。what we just did actually has a name. 15:12 我们称其为上下文注入。We call it a context injection. 15:14 它的意思是 And it just means 15:15 只要你提供正确的事实,when you supply the right facts, 15:18 你就变成了上下文提供者。you become a context provider. 15:21 而这个角色支撑着后续的一切。And that role supports everything that comes next. 15:25 另外有一点我要提一下,And one thing I want to mention is 15:27 这与大家可能熟悉的 this is pretty different from something people might be familiar with, 15:29 prompt engineering 有很大不同。which is called prompt engineering. 15:34 因为提示工程和上下文,Because the prompt engineering and the context, 15:37 如果我们称之为上下文工程或上下文注入,if we call it context engineering or context injection, 15:41 它们是关于完全不同的东西。they are about totally different things. 15:44 提示工程关乎‘如何’。The prompt engineering is about the how. 15:47 就像处理事实一样,你可以通过调整措辞来获得更好的回答 Like with the facts, you tune the wording to get a better answer 15:52 而且你会发现很多小技巧,比如假装自己是著名作家 and you may see many small tricks like say you are a famous writer 15:57 、提供假提示、说“一步步思考” or you offer a fake tip or you say think step by step 16:00 或“深呼吸一下”,这些就能神奇地带来更好的结果。or you say take a deep breath and you will magically get better result. 16:06 我们承认这些技巧有时确实有用 And we admit that those sometimes help 16:09 因为它们并非魔法。because they are not magic. 16:11 原因是 And the reason is 16:12 当你声称自己是我们最畅销的作者时。when you say you are our best selling author. 16:15 这是一种薄弱的方式来描述你想要的质量,It’s a weak way to say what quality you want, 16:19 但问题是这些提示很容易失效,but the problem is that these lines would break easily 16:23 因为它们高度依赖具体的模型版本,did depend on the exact model version 16:26 今天有效的提示在下次更新后可能就失效了。and one line that may work today may fail after another update 16:31 我们不应该在这种脆弱的技巧上构建严肃的工作流程。and we should not build a serious workflow on fragile tricks like this. 16:37 另一方面,上下文工程关注的是“什么”而非“如何” On the other side, context engineering is about the what instead of the how 16:42 ,因为结果质量通常不太依赖巧妙的措辞 because the result quality usually depends less on the clever phrasing 16:46 ,而更多取决于你提供的事实内容。and more on what kind of facts you supply. 16:50 你可以把它想象成 And think about it as the gap 16:52 幸运短语与清晰项目说明之间的差距。between a lucky phrase and a clear project briefing. 16:57 因此,简报可以包含模型所需的内容 So the brief can hold what the model needs 17:01 例如 API 文档、风格指南、用户画像 like API docs like style guides, a user profile 17:05 硬约束以及清晰的事实 or hard constraints and clear facts 17:07 这些能让任务变得简单得多 make this task much easier 17:10 输出也更稳定。and the output also steadier. 17:13 你也可以用通俗易懂的语言来表述成功标准 You can also put the success criteria 17:17 ,例如,写作要精确且具说服力。in plain language like the writing should be precise and convincing. 17:20 这还要稳固得多 This is even more steady or solid 17:22 比‘哦,你是我们最畅销的作者’ compared with oh you are our best selling author. 17:28 后者指的是提示工程 The latter is the prompt engineering 17:29 ,而前者指的是上下文工程。and the former is the context engineering. 17:31 我希望这个例子能清楚说明一切 I hope this example explains all right 17:34 ,这样我们就能为一个任务提供良好的上下文 after we can give good context for one task 17:38 。你经常想把更多工作交给它 you often want to hand off more work 17:41 ,这是第二阶段:AI 作为实习生 and this is stage two AI as an intern 17:44 ,这里的工作就像一场聊天。and here the work looks like a chat. 17:47 你分配任务,You assign a task, 17:48 模型生成初稿,the model returns the first draft, 17:50 你进行修改。you give fixes. 17:52 这就好比指导一个 It’s like guiding a junior hire 17:53 很努力但需要严密监督的新人员工。who tries hard but needs close supervision. 17:57 一开始这可能觉得很高效 At first this may feel productive 17:58 我们终于开始把任务委托出去了。we’re finally delegating. 18:01 但正如我们将看到的 But as we will see 18:03 这种聊天密集风格有一个隐藏代价 this chat heavy style has a hidden cost 18:06 因为我们过度进行逐步手把手指导 because we’re doing too much step by step handholding 18:10 这种纠错会逐渐损害质量 and this correction can slowly hurt quality 18:12 下一个演示将说明原因。and the next demo would show why. 18:16 所以这里,我们让 AI 实习生来编写完整的 Python 脚本 So here we would ask the AI intern to write the full Python script 18:21 或 WBR 的 SQL 查询 or the SQL query for the WBR 18:24 ,我会按常规方式管理:and I will manage it like in the common way, 18:25 从一条消息开始,他们大概会做出一些回应。I start with a message and they will probably do something. 18:29 我会进行许多小幅修正 I will do many small fixes 18:30 就像和一个新手进行漫长的聊天一样。like a long chat with a junior. 18:33 其实就是在这里发生的 It actually happens here 18:35 我们使用同一个对话。we use the same conversation. 18:38 我们还是会进行后续对话 We still do follow up conversations 18:40 ,所以我们会问它‘哦,那个能行吗?’ so we’ll ask it oh can that. 18:44 看起来不错,That looks good, 18:44 但你能改成按天显示吗?but can you make it per day? 18:46 因为我们观察到的一点是 Because one thing we notice is 18:49 我们有总收入和总订单数。we have total revenue, total orders. 18:52 哦,对了 Oh by the way 18:53 这是背景 this is the background 18:54 这是一家电商公司。is this is an E commerce company. 18:56 考虑一下亚马逊。Think about Amazon. 18:57 我们有一个数据库 We have a database 18:59 关于营收订单或访客的。about revenue orders or visitors. 19:02 我们将在其基础上进行一些简单的分析。We will do some simple analysis on top of it. 19:05 这是背景介绍。This is the background. 19:06 在这里,我们希望绘制一张图表 And here we really want to plot a chart 19:09 ,来展示不同数值如何随时间变化。showing how the different numbers changes over time. 19:13 这就是 And that’s the reason 19:14 为什么我们试图每天进行它的原因。why we try to make it per day. 19:16 然后它努力尝试 Then it tried hard 19:17 ,为我们生成了一个不同的 SQL 查询。gives us a different SQL query. 19:20 我们将它存入我们的数据库中。We put it into our database. 19:23 我们执行它,复制粘贴结果 We execute, copy, paste the result 19:25 然后告诉它我接下来还想做什么。and tell it what I further want to do. 19:28 我想进行漏斗分析。I want to do a funnel analysis. 19:30 这就像我们分析有多少访客访问网站,It’s like we try to see how many visitors go to the website, 19:34 有多少人进入购物车结账,how many of them check out, go out to the carts 19:37 以及有多少人最终完成购买,and how many of them check out and finish the purchase process 19:40 这就是漏斗分析。is the funnel analysis. 19:43 然后它更新了查询 Then it updates its query 19:45 我们进一步发现它要长得多。and we further it’s a lot longer. 19:50 我们进一步运行它,发现了一些错误 We further execute it and then found some errors 19:55 让它再修正一次,最终它就做对了。asked it to correct again and finally it is able to get it right. 20:01 然后我让它把所有内容进一步 And I ask it to further visualize everything 20:03 可视化成 HTML。into an HTML. 20:04 这是 HTML 代码。Here’s the HTML. 20:06 初看之下,似乎还不错。It looks okay at the first glance. 20:08 它会提供一些数字数据,包括图表 It would give you some numbers with some charts 20:13 、访客参与度和漏斗分析。and visitor engagement and funnel analysis. 20:18 不过,其他内容 However, if something else 20:20 我们将在下一张幻灯片中展示。we will show in the next slide. 20:24 我们还有另一种 We have another alternative 20:25 实现相同功能的方法。of doing the same thing. 20:27 瞧瞧 And look at this 20:27 这就是 AI 给出的 is what the AI gives 20:28 使用我们的新方法论时,with our new methodology. 20:31 我们基本上第一部分 We have basically the first section 20:34 已经涵盖了传统方法中的所有内容。basically did everything we had in the old way. 20:40 但我们不仅能看到收入的变化,But we can see not only how the revenue changes, 20:43 还能看到流量的变化,we can see how the traffic changes, 20:45 以及漏斗分析。we can see the funnel analysis. 20:47 此外,In addition to that, 20:48 它无需提示就知道 it also knows without prompting 20:50 如何进行渠道性能分析。how to do channel performance. 20:52 它通过雷达图展示了 It shows this radar realization about whether 20:54 电子邮件和直接渠道 email or direct will give us better performance 20:56 在电子商务上的性能表现哪个更好。on the E commerce. 21:00 它还会分析 It also tries to analyze 21:02 哪些类型的客户最为重要。what kind of customer matter the most. 21:06 尝试提供客户洞察 Try to give customer insights 21:07 以及日常矩阵指标的概览。and also daily matrix metrics overview. 21:11 所以这里的区别天差地别 So the difference here is day and night 21:13 常规方法是将它当作实习生处理 for the usual way of treating it as an intern 21:16 并在整个聊天过程中进行繁重的修正。and do very heavy correction along the chat. 21:20 它能生成还算可以但有些漂移的作品 It gives us okay but drifted work 21:23 ,但采用了全新的方式 but with a new way 21:25 ——顺便提一句,这种方式甚至 which by the way is even easier 21:28 比逐一修正所有内容还要简单。compared with correcting everything. 21:32 它提供了更加详细 It gives a lot more detailed 21:34 、全面且富有洞察力的可视化效果。and comprehensive and even more insightful visualization. 21:39 另一个我想强调的区别 Another difference I want to put here 21:42 是,它在过程中会犯下许多错误。is it also makes a lot of mistakes along the way. 21:46 我们虽然花费了更多时间,We spend more time on it, 21:48 但并不一定能获得更好或更优秀的结果。but not necessarily getting a good a better result. 21:51 这个演示实际上展示的是。This is the demo actually shows. 21:55 换句话说,In other words, 21:56 这就像一个小修复的螺旋过程。there’s like a spiral of small fixes. 21:58 每次修复可能解决一个问题 Each fix might fix some one issue 22:01 却会在下次对话中引入新问题。but also add another issue in the next chat. 22:04 这正是我们通常会遇到的结果。This is what we usually get. 22:08 那个杂乱的对话线程其实引发了一个问题 And that noisy thread actually rises the question 22:11 :为什么?why? 22:12 但是这个模型至少还记得那些 Corrections 吗?But does the model at least remember the Corrections. 22:16 让我澄清一下大家 And let me clear a common misconception 22:18 对那个 LLM 的一个常见误解。about that LLM. 22:22 我通常喜欢用一个比喻:One analogy I wanted, I usually like to make 22:25 想象你和一个 is imagine you work with a colleague 22:26 没有长期记忆的同事共事。who has no long term memory. 22:30 所以每天早上你进去时 So every morning you walk in 22:31 他们就把昨天的一切都忘光了。and they have forgotten everything from yesterday. 22:34 唯一的继续方式就是把 The only way to continue is to hand them 22:36 之前所有对话的书面摘要递给他们。a written summary of all the previous conversations. 22:40 这正是左边 ChatGPT 或 Deep Seq 的情况:That’s exactly what happens here for either ChatGPT or Deep Seq on the left, 22:45 如果我说‘哦,我是 John’,if I say oh, I am John, 22:48 下一个回复就会用到我的名字。the next reply uses my name. 22:51 感觉像是模型在记住,It feels like a model remembers, 22:53 但右边才是实际情况。but on the right is actually the reality. 22:56 每次你发送消息,Every time you send a message, 22:58 这个应用(ChatGPT)就会将 the app, the ChatGPT would paste 23:01 迄今为止的所有历史对话全部 the full chat, so far, all the historical conversations 23:03 粘贴到上下文窗口里。into the context window. 23:06 顺便一提,And by the way, 23:07 这里的上下文窗口指的是 the context window here just means 23:10 模型能看到的内容。what the model sees. 23:12 而且模型每轮都会重新新鲜读取它。And the model reads it fresh every turn. 23:15 而且没有任何大脑或数据库 And there’s no brain or database 23:17 只存储今天的工作内容。storing just today’s work. 23:20 我要特别强调,这一点至关重要。I want to emphasize that this matters a lot, 23:22 因为如果你给那位同事一个 20 页的乱七八糟摘要,because if you hand that colleague a 20 page messy summary 23:27 里面满是划掉的笔记、道歉和过时的草稿,full of crossed out notes, apologies, outdated drafts, 23:31 他们很难从中找出真正的指示。they will struggle to find the actual instructions. 23:34 而这正是长而杂乱的聊天线程 And that’s exactly what a long noisy chat thread 23:37 对模型的影响。does to the model. 23:40 那么回到我们那位无记忆的同事。So back to our memoryless colleague. 23:44 每轮对话,你都要 Every turn you hand them 23:45 把迄今为止所有内容的完整书面摘要交给它们。the full written summary of everything so far. 23:48 来看看这个摘要的内容。And look at what the summary contains. 23:50 演示结束后,我们看到了原始请求 After the demo, we have our original request 23:53 、他们的错误代码、我的修正、and their wrong code, my correction, 23:55 他们的道歉、他们第二次尝试(又有不同 bug)、their apology, their second attempt with a different bug, 24:00 以及我的第二次修正。my second correction. 24:02 因此,真正的规格被隐藏 So the actual spec is buried 24:04 在多层错误和修复之中。under layers of mistakes and fixes. 24:06 左边那个就是聊天螺旋。That’s the chat spiral on the left. 24:09 你就像管理实习生那样来管理这个模型。And you manage the model like an intern. 24:11 你一轮一轮地修复它。You fix it turn by turn. 24:13 而且每次 And each turn 24:13 应用都会把整个杂乱的对话线程 the app resends the whole noisy thread 24:16 重新发给 AI。to the AI. 24:18 然后这位同事,即 AI,And then the colleague, which is the AI, 24:20 会阅读这一堆信息,试图 reads this pile and tries to figure out 24:21 弄明白你真正想要的是什么。what do you actually want. 24:24 真正的指令相当分散,The real instructions are quite scattered around, 24:26 散布在大概五条不同的消息中。probably five different messages. 24:29 而那些旧的坏代码仍然存在,And the old bad code is still there, 24:31 继续争夺注意力。competing for attention. 24:33 右侧正好展示了确切的修复方案。And the right side shows exactly the fix. 24:36 与其给他们那个乱糟糟的一堆,Instead of handing them that messy pile, 24:39 不如写一句或一页干净的内容,you write one clean sentence or one clean page 24:42 包含他们所需的一切。with everything they need. 24:45 这就是修复脏上下文的方法。That’s how you fix the dirty context. 24:48 所以这个修复只是个演示而已。So the fix is just a demo. 24:53 我们修复的方法是:How we do the fix, 24:55 将所有更正、约束和要求 we just take all the corrections, constraints and requirements 24:58 整合成一条清晰的指令,and merge them into one clear instruction 25:00 然后只发送这条单一指令,而非整个对话历史。and then send that single instruction instead of a whole conversation history. 25:06 但我们仍需提到的是 But one thing we still need to mention is 25:07 我们该如何做到这一点?how do we do that? 25:09 很多工具支持编辑上一条消息 Many tools will allow us to edit the previous message 25:12 ,而非新增一条消息。instead of adding a new one. 25:15 ChatGPT、Claude 和 Like both ChatGPT, Claude and also 25:18 我们这里使用的平台都支持编辑功能。the platform we use here allow editing. 25:21 所以要把上下文窗口当作你自己的文档 So treat the context window as a document you own 25:26 并像整理交给同事的简报一样保持整洁。and keep it tidy the way you Would clean up a brief 25:33 所以这里我们就这样做。So here we’ll do it this way. 25:35 我有一个电商数据库,I have an E commerce database, 25:37 想先计算主要指标。want to calculate top line metrics before. 25:39 所以这部分完全一样。So this part is exactly the same. 25:41 然后我们说明我们的需求。And then we state what we want. 25:43 写一个 SQL 查询,生成全面的每日指标 Write a SQL producing comprehensive per day metrics 25:46 并包含漏斗分析所需的日期。and also date needed for funnel analysis. 25:48 记住,我们要求你每天 Remember that we ask you to do per day, 25:50 进行漏斗分析。we ask you to do funnel analysis. 25:52 我们只需将它整理成一个简洁的句子 We just put it together as a clean sentence 25:56 把所有原始材料复制粘贴到一个句子中,and we copy paste all the raw materials in one sentence, 26:01 SQL 表的各种结构 all the different structures of the SQL table 26:04 它就能一键搞定。and it will do in one shot. 26:06 它会进行大量思考,It performs a lot of thinking, 26:09 这是 Deep seq 的典型做法,which is typical for Deep seq, 26:13 然后据此生成一个非常全面的 SQL。and then gives us a very comprehensive SQL based on that. 26:17 我们运行它 We execute it 26:18 然后将它粘贴到 AI 中 and also paste it to the AI 26:22 让它生成一个可视化图表。and ask it to generate a visualization. 26:26 就这样。That’s it. 26:27 它就能为我们生成所有这些 It will be able to generate all this 26:31 详尽且富有洞见的可视化图表。detailed and insightful visualization for us. 26:35 特别注意,And especially note that 26:36 我们的提示词完全没有改动。we didn’t change our prompt. 26:38 我们提供给它完全相同的信息 We give it exactly the same kind of information 26:41 只是为了让它更井井有条。but just to make it more organized. 26:44 而且差异非常大。And the difference is a lot. 26:47 因此,我们将这一技能称为上下文整理。So we call this skill context curation. 26:50 所以,我们不是盲目提供上下文,So instead of blindly provide a context, 26:54 而是有条理地精心挑选上下文,we curate the context in an organized way 26:57 让你来决定模型真正看到什么。and you shape what the model actually sees. 27:01 这就是区别所在。That’s the difference. 27:03 好吧,手动精选是有效的 All right, so manual curation works 27:06 但它仍然需要人工操作。but it’s still human work. 27:08 这就引出了下一层次的关键问题。That leads to the key question for the next level. 27:11 AI 能否收集并主张自己的上下文?Can the AI gather and claim its own context? 27:14 在我看来,这条界线区分了用户和构建者。For me, that line splits a user from a builder. 27:19 用户很可能想要一个更好的技巧 A user probably wants a better trick 27:21 来得到某个答案。for one answer. 27:22 而一个构建者想要将一个技巧自动化。And a builder wants to automate a trick. 27:24 这就好比从单纯开车 It is like moving from riding the car 27:26 转向掀开引擎盖,自己真正打造一辆车。to opening the hood so you can really build your car. 27:35 因此,这将我们带到了第三阶段 So this makes us to the stage three 27:37 AI 作为队友。AI as a teammate. 27:39 该模型能够自主规划并采取行动 The model can plan and act on its own 27:42 而非仅仅等待我们的下一步指令。beyond waiting for our next instruction. 27:46 所以这就好比我们让一位同事 So it’s like we ask a colleague 27:49 去搞定数据库架构。to figure out a database schema. 27:53 我们不会对它说,嘿,We do not tell it, hey, 27:55 你得先用这个语句 you need to first use this statement 27:57 列出那个表 to list that table 27:59 然后再去探索那个表,and then explore that table 28:01 找出列的列表。for another list of columns. 28:04 我们就直接问它。We just ask it. 28:05 我们只需问那位同事 We just ask the colleague 28:07 他们就会走到终端前,and they will walk over to the terminal, 28:09 运行几个命令,they will run a few commands, 28:10 查看输出,look at the output, 28:11 调整策略或重试 adjust their approach or fails 28:13 后,带着文档回来。and come back with a document. 28:15 这就是代理式 AI 的意思。That’s what agentic AI means. 28:18 模型不再只是简单地在聊天中回应请求,Instead of just the response a request in the chat, 28:21 而是会规划自己的步骤。the model plans its own steps. 28:24 这个模型会利用工具搜集信息 The model uses tools to gather information 28:26 并根据搜集到的内容进行调整。and adjusts based on what it finds. 28:29 所有这些都要求模型 And all of that requires the model 28:31 能够访问真实的工作环境。to have access to a real working environment. 28:34 这就是 Cursor,它是我们本次演示中使用的 And this is Cursor, the AI first IDE 28:39 首个 AI IDE。we use in this demo. 28:41 不过你可以随意使用其他工具,比如 Codex 或 Claude Code,But feel free to use other tools like Codex or Claude Code, 28:46 它们基本上没什么区别。they are pretty much the same. 28:48 我们这里主要用 Cursor,是因为 Here we use Cursor mainly because 28:50 Codex 和 Claude Code 很多时候都是 CLI a lot of times Codex and Claude Code are CLI tools, 28:54 命令行工具。command line tools. 28:54 而且 Cursor 具有图形用户界面(GUI)。And Cursor has a GUI graphical user interface. 28:57 它可能用起来更简单,It might be easier to use, 28:59 但就像我们之前聊天中提到的,but as we mentioned in the chat before, 29:01 Claude Code 也有图形界面。Claude Code also has a gui. 29:04 Codex 在 Mac 上也有自己的应用,Codex also has its own app on Mac, 29:08 因此你可以随意在这个工作空间中使用其他工具。so feel free to use other tools here based on this workspace. 29:12 核心理念是感知-行动循环。The core idea is a perception action loop. 29:15 如果使用 plainchat And if we use plainchat 29:18 模型就被限制只能处理文本输入和文本输出。it limits the model to the text input and text output. 29:21 不过在 IDE 里,模型能够看到 But in the IDE the model can see 29:25 这里显示的文件树 the file tree which is shown here 29:26 和源代码。and the source code. 29:27 它能访问我们磁盘上的所有文件 It has access to all the files 29:31 并且基于此,And based on that 29:32 它能够 it would be able to act on what it sees 29:33 通过编辑并运行这里的命令,对所见内容进行操作。by editing and running commands like here. 29:39 那个循环使得工作能够从对话 And that loop is how work moves past conversation 29:41 阶段过渡到执行阶段。into execution. 29:44 我们接下来就会看到这一点。And as we will see. 29:47 在这里,我们用一个例子 And here we will use an example 29:51 来展示我们的能力。to demo what we can do. 29:53 我们这里使用和之前同样的目标。Here we use the same goal as before. 29:56 请同事学习数据库的 schema。Ask the colleague to learn about the database schema. 30:01 在这里,我们不粘贴它的 schema And here we do not paste the schema for it 30:04 只需直接让它去做即可。we just ask it do it. 30:06 我们切换到 Cursor 吧。Let’s switch to Cursor. 30:10 这里就是数据库 And here is the database 30:13 我们直接问它就好。and we can just ask it. 30:16 对了,By the way, 30:17 我们需要把它切换到 agent 模式,we need to switch this to agent, 30:20 这样它就能看到并做出反应了。which means it will be able to see and react. 30:23 此外,Cursor 还支持语音输入功能 And Cursor also has this voice input 30:27 能让打字速度变得飞快。which makes typing really fast. 30:31 现在查看电商数据数据库 DB Now take a look at the ecommerce data DB 30:34 ,并撰写一份描述其架构的文档。and write a document describing the schema of the database. 30:41 就这样。That’s it. 30:42 然后光标就会尝试执行它自己的任务。Then the cursor would try to do its own job. 30:47 例如,它会使用某些命令 For example, it uses some command 30:49 来尝试找到数据库的位置。try to find where’s the database. 30:51 它使用某些 SQLite 命令 It uses some SQLite command 30:54 来导出表列表。to dump the list of tables. 30:56 注意 And note that all of this 30:57 我们并没有明确要求它做这些事情。we didn’t really ask to do specifically. 31:01 我们只是让你导出数据库的 schema We just ask you to dump the schema 31:05 并写一份文档。of the database and write a document. 31:07 这就是它生成的文档。So here’s the document it wrote. 31:09 如果我们按 F1 功能键 If we use F1, the function key 31:11 并点击打开预览,and clicked open preview, 31:13 它就能将 Markdown 预览 it is able to render the markdown preview 31:16 渲染成一个漂亮的格式。as a nice form here. 31:19 这就是电商数据库的架构图。And this is the E commerce database schema. 31:21 它首先列出所有表 It first lists all the tables 31:23 ,对于每个表列出其列 and for each table list the columns 31:26 ,同时指出数据库中不存在的内容 and also what is not in the database 31:29 以及快速参考资料。and the quick references. 31:32 所以这基本上就是同事 So it’s pretty much like what a colleague would really right 31:36 面对这项任务时会写的,但时间短得多。when they face this task, but with a lot shorter amount of time. 31:46 好,如果我们把这种方法和内部方法对比一下,All right, so if we compare this approach with the internal approach, 31:51 会发现一个不同点。we notice one difference. 31:54 也就是说,我们从逐步微观管理的老板 That is we move from a step by step boss 31:57 转变为结果负责人 micromanaging boss to an outcome owner 31:59 ,不需要列出每一个细微的操作步骤。and we don’t need to list every single click. 32:04 我们只需观察它规划并运行命令 We just watch it to plan and run commands 32:07 检查数据库、写入文件 inspect the db, write the file 32:09 它会自动收集自己的上下文。it will collect its own context. 32:15 我还是要强调,I still want to emphasize that 32:17 那个文档看似简单,实际上 that document looks simple, but actually 32:19 它展示了我们的第三项技能。that shows our third skill. 32:23 而且每个团队其实都有那么一位资深人士 And every team actually has that one senior person 32:27 对系统了如指掌,who knows everything about the system, 32:28 知道表之间如何连接,how, how the tables connect, 32:30 哪些列名具有误导性,以及各种边缘情况。which column names are misleading, what the edge cases are. 32:34 但这一切都只存在于他们的脑海里。But all of it lives in their head. 32:36 如果他们走了,If they leave, 32:38 团队就会丢失这一个月积累的知识。the team would lose the month of accumulated knowledge. 32:41 但如果我们将其制作成文档,But if we make it a document, 32:44 AI 队友就永远不会丢失全部内容。the AI teammate can never lose all of it. 32:49 这恰恰揭示了 AI 的一个局限性 And this makes this actually reveals one limit of AI 32:53 :如果某些信息不在文档中,that if something doesn’t exist on the document, 32:57 AI 就无法获取,the AI has no access to it 32:59 也完全不知道那些未被记录的内容。and the AI would know nothing, which is not written down. 33:05 这正是外部化如此重要的原因。That’s exactly why externalization matters. 33:08 就像我们需要把知识从人们的头脑 Like we need to move knowledge out of people’s head 33:11 和聊天记录中提取出来,放到文件中一样。and chat threads and we put them into files. 33:16 我们刚才编写的数据库 schema MD The database schema MD we just wrote 33:19 就是这样。is exactly that. 33:20 这是一项部落知识 It’s a piece of kind of tribal knowledge 33:22 被转化为稳定文档 turned into a stable document 33:24 任何人和 AI 代理都能打开并复用。that any person or any AI agent can open and reuse. 33:31 我们来试着把这个想法具体化 Let’s try to make this idea more concrete 33:34 比如,当新队友入职时 that for example, when we have a new teammate onboarding 33:36 新员工只需阅读它就能快速上手。a new hire can just read it and ramp up. 33:41 而且如果有另一个代理,And if we have another agent, 33:43 下一个代理就可以直接从该文件入手,the next agent would then start from that file 33:46 而不必从头重新发现那些表格。instead of rediscovering tables from zero. 33:49 而且,如果我们遇到一些边缘情况或 And if we have some edge cases and trouble knowledge 33:52 某些误导性列的棘手知识,也可以将其记录到那个文档中,about some misleading columns, we can also put that into that document 33:56 这样 AI 代理就不会再掉入同一个陷阱了。so that the agent won’t fall into the same trap again. 34:01 而且即使是你自己,And even you yourself, 34:02 六个月后,six months later, 34:03 读起来依然能回想起 you can still read it and remember 34:04 整个系统的架构是如何契合的。how the system fits together. 34:08 而且我想强调 And I want to emphasize that 34:09 这些共享的书面事实 this shared written facts 34:12 能够传播并不断累积。they can spread and compound. 34:15 当模型生成自己的参考文件时,When the model writes its own reference files, 34:17 你就开始为整个组织构建知识库。you begins to grow a knowledge base for the entire organization. 34:21 小文件能提升所有人的速度。And small files can lift everyone’s speed. 34:27 对不对?Right? 34:28 接下来,我们将开始 And next we will begin 34:30 演示这些复合物是如何运作的。show how these compounds. 34:32 我们现在开始处理这个文件,We will just begin working on the file 34:33 也就是我们刚刚创建的那份。we just made. 34:36 现在我们已经有了这个文件。Now we already have this file. 34:38 我们将尝试让另一个代理 We will try to ask another agent 34:40 直接使用这个文件,而无需重新发现数据架构,to use this file without rediscovering the data schema, 34:44 就能生成一份完整的每周业务回顾报告 but build a full weekly business review 34:47 包含所有关键指标、趋势和图表 with all the key metrics, trends and charts 34:50 该时间范围内。for this time range. 34:53 在这里,我们来创建一个新的代理 And here we’ll create another new agent 34:57 还记得那个没有记忆的同事比喻吗?which remember that memoryless colleague metaphor? 35:02 这个代理无法访问所有这些历史对话,This agent won’t have access to all of these historical conversations, 35:06 但我们会让它访问这个数据库 schema md but we give it access to this database schema md 35:11 文件,就像这里我们提到这个文件 like here we use to mention this file 35:15 一样,以确保 AI 了解它。to make sure the AI is aware of that. 35:18 然后我们对 AI 说,哦,And then we ask the AI oh, 35:19 看起来语音输入功能不见了。it looks like the Voice input was gone. 35:22 如果我用这个的话。If I use this. 35:22 那我就这么做了。So I’ll do that. 35:24 最后,我们来看一下数据库的 schema md 文件 At last take a look at our database schema md 35:29 试着搞清楚数据库里都存了些什么。trying to understand what the database holds. 35:32 我们来做一个每周业务回顾吧 And let’s do a weekly business review 35:35 比如绘制每日指标 like plot daily metrics 35:37 思考领导层关心的 and think about what the leadership would care about 35:39 每周业务顶线指标 our weekly business top line metrics 35:41 然后写一份 Markdown 报告。and write a markdown report. 35:45 你可以进行可视化处理 You can do visualization 35:48 并将它们嵌入到 Markdown 中 and embed them in the markdown 35:53 同样,AI 也会进行规划。and then similarly the AI would do their planning. 35:58 他们确实会阅读这份文档。They will actually read this document. 36:04 哦,有件事我忘了提。Oh, one thing I forgot to mention is. 36:07 抱歉,我忘记提数据了。Sorry, I forgot to mention the data. 36:10 我们关心的日期范围 The range, the date range we’re interested in 36:12 是 2025 年 12 月 1 日至 12 月 7 日。is from December 1st to December 7th, 2025. 36:24 好的。All right. 36:25 然后,它会创建一个 Python 虚拟环境 And then it creates some Python virtual environment 36:30 安装必要的依赖,并编写查询 install dependencies and write some queries 36:32 来聚合弱矩阵。to aggregate the weak matrix. 36:34 然后使用 Matplotlib 进行一些 Python 可视化 And then do some Python visualization based on Matplotlib 36:41 ,并进行周环比比较,所有这些都自动完成。and do week over week comparison everything on its own. 36:46 我们并没有 We don’t, 36:46 真正要求它去做这些。we didn’t really ask to do. 36:50 它会生成一些 Python 代码。And it writes some Python code. 36:55 我们注意到,And one thing we notice is 36:56 它会直接开始分析,it just begins the analysis right away 36:59 而不需要进行另一轮的探索。without another round of discovery. 37:03 这恰恰就是 And that’s exactly the value 37:04 这个数据库 schema md 的价值所在。of this database schema md. 37:08 然后运行 WBR 报告生成脚本。And run WBR report generator script. 37:16 它生成了这个包含不同图形的图像 Which creates this figure of different figures 37:20 最后输出这个 MD 文件 and it finishes with this MD file 37:22 我们可以再次打开预览,again we can go into the open preview, 37:25 看起来你发现了一个 bug looks like you found a bug 37:29 然后重新生成这些图形。and then regenerate the figures. 37:32 不过我觉得这对我们的用途来说没问题。But I think that’s okay for our purposes. 37:34 它包含一个执行摘要、It has an executive summary, 37:37 顶线指标、周总计,it has a top line metrics, weekly total 37:40 然后按不同日期划分显示不同构成,and then divided by different day show the different composition 37:45 甚至按类别和客户细分显示收入。and even revenue by category, by customer segment. 37:50 Deep SEQ 到底做了什么,Exactly what Deep SEQ did, 37:52 但本地用 MD 格式实现。but in MD format locally. 37:56 这就是显性知识堆叠的方式。So this is how the explicit knowledge stacks. 38:03 好了,三个阶段已完成,All right, so three stages done, 38:05 三个构建者角色已获得。three builder roles earned. 38:07 目前为止的情况是这样的。Here’s the picture so far. 38:09 首先,作为 Context provider,First, Context provider, 38:12 你要添加基本事实,让模型不再胡乱猜测。you add the basic facts so the model stops guessing. 38:14 其次,Context curator,And second, Context curator, 38:17 你能将杂乱的聊天记录转化为清晰的规范。you turn a messy chat into a clean spec. 38:20 第三点,这就是我们刚才所做的。Third, this is what we just did. 38:23 上下文架构师。Context architect. 38:25 你需要设计文件和工具 You design files and tools 38:27 ,让代理能够在实际项目中运行。so agent can work inside a real project. 38:31 所以你能看出焦点发生了移动。So you can see that the focus moves. 38:33 你会花费更少的精力 You spend less energy 38:34 去追求模型的一个完美回复。on one perfect reply from the model. 38:37 你会花更多精力在这样一个设置上 You spend more on a setup 38:39 即模型能够独立找到好的答案。where the model can find good answers alone. 38:42 你从单纯按按钮的操作者 You move from the button pusher 38:44 转变为系统思考者。towards the system thinker. 38:47 出现了一个新问题。There’s a new problem. 38:49 要是任务很复杂呢?What if the task is complicated? 38:51 要是它规模巨大怎么办?What if it’s huge? 38:52 如果一个环境无法容纳所有事实,该怎么办?What if one environment cannot hold all the facts? 38:56 这就是我们接下来要处理的。That is what we tackle next. 38:59 有时目标是复合的。Sometimes the goals are compound. 39:02 研究 10 家公司就相当于 10 份工作。Like researching 10 companies is 10 jobs. 39:06 每个子任务 With each sub job 39:07 都有其自身的事实和范围。it has its own facts and scope. 39:10 而且一个环境无法同时容纳所有这些。And one environment cannot hold all of them at once. 39:13 现在角色将转向 And now the role would shift 39:15 从带领一名队友到指导整个项目。from leading one teammate to directing a project. 39:22 所以这个问题我可能会晚点再回答。So I will answer the question probably later. 39:26 最终我们将回答这个问题 We’ll have answer the question 39:28 一次性直接。in one shot in the end. 39:29 不过欢迎随时在聊天框里多提问题。But feel free to type in more questions in the chat. 39:33 在这里,我们首先来展示 And here we will first show that 39:36 如果将整个项目视为单一提示,if we treat the whole project like a single prompt, 39:39 通常会失败 we will usually fail 39:40 解决方案在于需要重新设计工作流程。and the solution lies and we need to redesign the workflow. 39:47 好了,我们来看看,And here, let’s see, 39:48 这里我们 we just take the 10 company research job 39:49 故意用错误的方法来处理那 10 个公司研究任务。on purpose with the wrong method. 39:53 我们采用第三阶段的习惯 We use the stage three habit 39:56 即设定一个明确目标,进行一次移交 like one clear goal, one handoff 40:00 并将整个项目整合到一个综合提示中。and put the whole project into one comprehensive prompt. 40:04 在这里,我们同时使用了 ChatGPT 和 Manus。And here we use both ChatGPT and Manus. 40:09 在 ChatGPT 里,我们先让它列出美国市场 10 家电商公司 In ChatGPT we first ask it to give 10 e-commerce companies 40:15 ,然后调查它们 in the US market and then research their performance 40:17 在黑色星期五期间的表现。during Black Friday season. 40:20 我觉得我用了 GPT Pro I think I used the GPT Pro 40:23 ,但不太确定。but I’m not super sure. 40:25 在这里,它能给我们 And here it’s able to give us 40:27 提供一大堆结果。a bunch of results. 40:29 它看起来很长。It looks long. 40:30 不过,如果我们仔细查看结果,However, if we take a closer look at the result, 40:34 就会发现这是 GPT 为每家公司报告的内容。we will see that this is what GPT reports for each company. 40:40 基本上每家公司就一两句 One basically one sentence or two sentences 40:43 非常简短。very short for each company. 40:46 同样,对于 Manus,它也提供了一些报告,And similarly for Manus it gives us some report, 40:52 但每个公司的报告都非常简短。but for each company is also very short. 40:57 这基本上完全败坏了本意。And this basically defeats the purpose. 41:00 我们真正希望能生成一份详尽的报告 We really want a deep report 41:02 为每家公司。for each company. 41:03 如果你真正试一试,And if you really try it out, 41:04 就会发现极其困难 you will find it’s extremely hard 41:06 强迫 LLM 生成一份很长的报告。to force the LLM to give you a very long report. 41:15 顺便一提,And by the way, 41:16 所有这些内容都可以在幻灯片中找到。all of this can also be found in the slide deck. 41:22 原因在于,And the reason is 41:24 你可以这样来理解它。you can think of it like this way. 41:28 你可以把桌子想象成一个固定大小 You have a desk and the desk is like a fixed size 41:31 的空间,你需要同时展开 10 家公司的研究资料。and you need to spread out research on 10 companies at once. 41:36 这几乎就和 LLM 面临的情况一模一样。It’s pretty much like what the LLM is facing. 41:38 它面对着一个固定尺寸的桌子,It faces a fixed size desk, 41:41 但发现桌子太小,无法容纳全部内容。but then it found the desk too small for all of it to fit. 41:45 因此,每家公司只能分到一小块地方。So each company will get a tiny corner. 41:48 你能看到标题 You can see the headers 41:49 ,但无法查看详细信息。but cannot read the details. 41:52 那个桌子就是我们刚才提到的 And that desk is the context window 41:54 上下文窗口。we just mentioned. 41:54 这就是 And that’s the reason why 41:55 窗口填满后,when the window fills up, 41:57 模型会采取捷径的原因。the model would take shortcuts. 41:59 它可能在前几家公司描述得很详细 It may start detailed on the first few companies 42:02 但到了第七家公司就变得模糊起来。and get vague by company 7. 42:05 它可能会给每家公司相同的空间 It may give every company equal space 42:07 结果每家公司都只得到同样浅薄的描述。and every company gets equally shallow. 42:11 这份报告,就像我们看到的,The report, just as we saw, 42:14 看起来很完整,但实际上毫无用处。would look complete but actually not useful. 42:18 那该怎么解决呢?And how to solve that? 42:20 当桌子容纳不下 10 家公司时,When the desk is too small for 10 companies, 42:22 我们可以给每家公司配一张自己的桌子。we can get each company its own desk. 42:26 所以我们可以像这样来做。So we could do it like this. 42:28 首先,我们将大任务拆分成小任务。First we break the big task. 42:30 10 家公司被拆分成 10 个独立的任务。10 companies become 10 separate tasks. 42:33 然后我们逐一单独运行它们。And then we run each one alone. 42:35 每家公司都有独立的崭新上下文窗口 Each company gets its own fresh context window 42:38 让模型能专注深入处理 so the model can do deep work 42:40 而不受其他九份报告的空间竞争干扰。without nine other reports competing for space. 42:43 然后第三步是进行最终整合 And then the third is a final pass 42:46 将这 10 个可靠的部分合并成一份完整的报告 would combine the 10 solid pieces into one report 42:50 这种模式称为‘上下文分而治之’。and this pattern is context divide and conquer. 42:54 所以桌子的尺寸实际上没变,So the desk size actually has not changed, 42:57 但现在每个任务都能舒适地 but each task now fits comfortably 42:59 完成,因为它们各自有独立的 because they all have its own task, 43:02 上下文窗口。they all have separate context windows. 43:03 这个设计既尊重了窗口的实际限制 And this design respects the real limits of the window 43:06 又充分利用了模型的强大能力。and still uses the model strength. 43:10 现在,我们来运行 split 和 merge 的想法 And now let’s run the split and merge idea 43:12 在专为它设计的工具中吧。in the tool built for it. 43:14 我们这里使用 Manus,Here we use Manus, 43:16 但我相信 Codex 和 Claude Code 等上下文也支持它。but I believe contexts like Codex and Claude Code also support it. 43:21 再给它列出 10 家美国电商公司 Again give it 10 e-commerce companies in the US 43:23 ,但排除 Amazon,因为在这个例子中,我们就是 Amazon。except Amazon, because in this example we are Amazon. 43:28 我们清楚地知道内部究竟发生了什么。We know exactly what happens inside. 43:30 提示的唯一不同之处 And the only difference in the prompt 43:32 就是三个词。is three words. 43:33 使用网页研究。Use web research. 43:35 ‘Web research’是 Manus 中的一个术语 Web research is a term in Manus 43:39 用来编码我们刚才描述的内容。which encodes what we just described. 43:41 不过,Codex 和 Claude Code 之间非常相似。But it’s very similar between Codex and Claude Code. 43:48 我们还基本上 We also basically 43:49 构建了我们自己的版本 we build our own version 43:51 基于 GitHub 上的那个版本 based that on GitHub. 43:53 你也可以点开这个链接 You can also click into this link 43:55 ,了解 Manus 网页研究背后的原理 to see the mechanics behind the Manus web research 43:59 ,以及如何自己搭建一个版本。and how to build your own version. 44:03 在这里,如果我们使用这个,And here if we use this, 44:05 Manus 会 what Manus would do is 44:07 将其分成 10 个不同的表格或上下文窗口。they would divide it into 10 different tables or context windows. 44:11 对于这个表格子任务,And for this table subtask, 44:14 它会搜索 Walmart,it knows to search for Walmart 44:18 进行大量搜索,and do a lot of search 44:20 最后完成最终的写作任务。and finally perform the final writing task. 44:25 最终,另一个代理会 In the end another agent will 44:27 或主代理会将报告整合起来,or the main agent will put together the report 44:30 并生成这样的形式。and make it like this. 44:33 我们来看看,If we take a look at it, 44:34 就会发现,嘿,we will see that hey, 44:36 这是 eBay 的报告 this is the report for ebay 44:38 比之前详细多了。and it’s much more detailed than before. 44:40 我们用了四个不同的部分 We have four different sections 44:42 而不是两个句子。instead of two sentences. 44:43 同样,对于苹果、Target And similarly for Apple, for Target, 44:46 大家的情况都相同。everyone is the same. 44:49 所以我们基本上用一个很短的提示语 So it’s basically we use a very short prompt 44:55 通过遵守那个限制来绕过它。to get rid of that limit by respecting that limit. 45:04 好的,有一件事我想说明 All right, and there’s one thing I want to explain 45:06 ,那个拆分与合并的设置就是一个简单的多代理系统。that that split and merge, that setup is a simple multi agent system. 45:10 在我们继续深入之前,And before we go further, 45:12 我想先澄清一个常见的误解。I wanted to fix another common misconception. 45:15 许多提示在讨论多代理时,Many prompts, when they talk about the multi agents, 45:18 会要求模型扮演某些职位,they would ask the model to act as like some job title, 45:21 比如两个职位:一个是 qa,一个是 dev。like two job titles, one is qa, one is dev. 45:25 他们让它们进行辩论。And they ask them to debate. 45:27 感觉很自然,但实际上 It feels natural, but it actually 45:29 却误导了这个工具。misleads the tool. 45:31 就好比你雇了个 Because like imagine you hire a chef 45:33 精通所有菜系的厨师 who was training every single cuisine 45:36 却只准他做意大利菜。and you tell them you are only allowed to cook Italian. 45:40 这类似于人为的限制 This is like artificial constraint 45:42 并非真正有效的任务委托。not really an effective delegation. 45:44 这正是 And that’s exactly what happens 45:45 我们让 AI 扮演开发者或 QA 专家角色时所发生的情况。when we ask the AI to perform like a dev or like a QA expert. 45:52 这是因为模型已经接受了涵盖所有角色的广泛训练 This is because the model already has broad training 45:56 ,而我们却被要求做相反的事。across all the roles and we’re asked to do the opposite. 45:59 人们之所以这么做,And the reason people do that, 46:01 是因为人类寿命太短,humans do that because our lifetime is too short 46:03 不可能让每个人都掩盖所有事情。and it’s not possible for everyone to mask everything. 46:07 但这个模型并没有这样的限制。But the model has no such constraint. 46:09 将其锁定在单一桑拿中并非真正的杠杆。Locking it into one sauna is not leverage. 46:13 这就像一个人为施加的限制。It’s like an artificial constraint. 46:15 因此,多代理系统的真正价值在于 So the real value of multi agent is 46:18 将这些上下文拆分开来,为每个代理 split those contexts, giving each agent 46:20 提供一个干净、专注的工作空间。a clean focused workspace. 46:24 好吧,如果角色扮演不是核心理念,All right, if play acting roles is not a core idea, 46:28 那核心是什么?what is that? 46:29 关键在于,The useful part is 46:30 我们将上下文 we split the context 46:31 进行了分割,这符合硬件和上下文窗口的实际运作机制。that matches how the hardware and context window actually work. 46:36 对于我们分析的 10 家公司,我们会运行多个分析师流程 For the 10 companies we run several analyst passes 46:42 :一个子代理 and agent would only see the sub agent 46:43 只看到公司 A,另一个子代理只看到公司 B will only see company A, another sub agent would only see company B 46:47 ,等等。这样就没有单一线程 and and then there is no single thread 46:50 同时持有所有 10 家公司信息,从而避免竞争。that hold all 10 names at once and avoid that competition. 46:56 获胜的关键在于设计 The win is a workflow design 46:57 一个适应窗口大小的工作流程。that fits the window size. 46:59 这并不是一个虚假的公司会议 Not really a fake company meeting 47:02 让多个代理互相争吵。with the multi agents arguing with each other. 47:05 再总结一下。So one more summary. 47:07 我们为你的构建者成长之旅 We add a fourth big step 47:09 增添了第四个重要步骤。to your growth as a builder. 47:10 你现在通过了一个 You now go past one agent 47:12 工作空间中的一个代理。in one workspace. 47:14 你现在成为系统设计师。You become system designer. 47:15 你运行多个专注代理 You run several focused agents 47:17 来应对单个代理无法处理的规模。to handle scale that one agent cannot handle. 47:22 还有最后一个阶段尚未完成。And one final stage still remains. 47:24 迄今为止,每个任务都只有一个明确的目标。So far, every task still had one clear goal. 47:27 编写这个查询 Write this query 47:28 构建这个报告 build this report 47:29 研究这些公司 research these companies 47:30 但有时候,目标本身还在逐步形成。But sometimes the goal itself is still taking shape. 47:33 这就是共同创作。That’s co creation. 47:35 所以可以把它想象成和一位思维敏捷的同事一起头脑风暴。So think of brainstorming with a sharp colleague. 47:40 你可能会说,哎呀,You may say that oh, 47:41 我们的 WBR 数据表明留存率有所下降。our WBR numbers show a dip in retention. 47:44 我们该怎么应对呢?What could we do about it? 47:45 然后你的同事提出五个想法 And then your colleague would throw out five ideas 47:47 ,你说:‘哦,and then you say oh, 47:49 第三个不错,the third one is interesting, 47:50 但我们上季度试过类似的东西 but we tried something similar last quarter 47:52 ,因为 X 和 Y 原因失败了。’ and it failed for reason X and Y. 47:55 然后他们会完善这个想法 And then they will refine the idea 47:57 通过反复的反馈。based on a feedback that back and forth. 47:58 这就是共同创作。That is co creation. 48:00 你提供判断力和专业知识 You bring judgment, you bring domain knowledge 48:02 模型则提供广度和速度。and the model brings breadth and speed. 48:05 这个目标会随着你的深入而逐渐清晰。That goal sharpens as you go. 48:08 所以我们最后再看一次 WBR 示例。So we go back to the WBR example one last time. 48:12 现在这份报告已经准备就绪。Now the report is ready. 48:14 我们已经看到了那个报告 We already see that report 48:15 看起来很不错。that looks nice. 48:16 然后我们想把它 And then we want to put it 48:19 呈现在 CEO 面前。in front of the CEO. 48:22 我们尝试与 AI 协作。And we try to work with AI. 48:24 我们怎样把它转化为 How do we turn it into something 48:26 CEO 真正愿意阅读和使用的东西?a CEO would actually read and use? 48:29 所以这个应用不在于明确的执行步骤,So this app is less about a clear execution, 48:32 而更多在于判断力和故事性。it’s more about the judgment and story. 48:34 所以这里,我让 AI 和我一起思考:So here I will just ask the AI to think with me together, 48:39 这个 WBR 中最令人惊讶的一行是什么?like what is the most surprising line in this wbr 48:44 从中提出三个强有力的举措,and suggest three strong initiatives from that 48:47 并起草一页备忘录,向 CEO 推销这个想法。and draft a one page memo to the CEO that sells the idea. 48:59 接着,它会重新阅读 WBR And then it rereads the WBR 49:02 提取具体数字和故事钩子 for concrete numbers and the narrative hooks 49:05 并进行验证。and verifies. 49:07 这个故事真的管用,对不对?This story actually works, Right? 49:14 然后根据分析 And then based on some analysis 49:16 它实际生成了几条 SQL 查询 it actually wrote some SQL queries 49:19 来进行双重验证。to double check. 49:21 而且根据分析 And based on analysis 49:23 它撰写了最终报告。it wrote this final report. 49:26 令人惊讶的是,这行 The surprising line is 49:28 听起来像是直接的 LED 业务。sounds like a direct LED business. 49:31 不过,email 在每位访客收入上都优于 direct However, email outperforms direct on both revenue per visitor 49:36 和其他一些指标 and some other metrics 49:37 ,而且有三个强有力的举措。and there are some three strong initiatives. 49:40 所以,这可能是联合战略工作 So this is a joint strategic work 49:44 专门为这个 demo 而做的。probably for this demo per se. 49:47 很难验证它是否有效 It’s hard to verify whether that works or not 49:49 因为这需要真正站在 CEO 面前 because that needs to really stand in front of CEO 49:53 进行演示并检验结果。and do the presentation and probe the result. 49:55 但我想强调的是 But what I want to emphasize is 49:57 这种方法确实是存在的。there is indeed this way. 50:00 我们完全有可能 It is possible that we could have 50:02 与 AI 进行生动而富有创意的互动交流。a lively and creative back and forth with AI. 50:07 因此,这个模型就像一个强劲的引擎 So the model acts like a strong motor 50:10 能驱动广泛的事实和快速的想法。for wide facts and fast ideas. 50:13 我的作用就是引导它。And my role is to steer it. 50:16 我加入领域知识,I add domain sense, 50:17 设定优先级,I add priorities, 50:18 并添加过滤条件。I add filters. 50:19 例如,如果我不喜欢这个想法,For example, if I don’t like this idea, 50:22 我可能会告诉你,保留第二个倡议,看起来不错。I may tell you hold the second initiative looks good. 50:27 试着多扩展一些。Try to expand more. 50:28 第一个不合理,因为 abc The first one doesn’t make sense because abc 50:31 通过这样做,我的洞见就融入了这个上下文 so by doing this my insight becomes part of this context 50:35 从而更好的上下文能生成更好的草稿。and better context yields better drafts. 50:39 因此,在这个阶段 So in this stage 50:41 关键上下文已超越文件和数据库的范畴。the key context goes beyond the files and database. 50:44 这会不会是我的判断力、我的品味 Is it maybe my judgment, my taste 50:46 ,甚至是我对那位我认识的 CEO 的观察?and even my observation to the CEO I know? 50:49 哦,CEO 们 Oh the CEO have the 50:50 不喜欢‘革命’这个词,they don’t like the word 50:54 我也不知道。I don’t know revolution. 50:56 试试用其他一些词语 Try to use some other words 50:58 试着表述得更柔和,从而改变 try to formulate softer and that shifts 51:01 什么被视为有价值的东西。in in what counts as valuable. 51:04 就像之前,它能更快地编写代码 Like before it is writing code faster 51:09 现在更像是进行判断。and now it’s more like judgment. 51:16 所以,这里有一个简单的框架 So here’s a simple frame 51:18 来处理那个价值问题。for for that value question. 51:21 我们多年来被训练 We have been trained for years 51:22 成最快的桨手——现在我们需要 to be the fastest rower—we need to be 51:26 成为最快的程序员。the quickest coder. 51:28 我们需要成为最谨慎的分析师。We need to be the most careful analyst. 51:31 但如果你和 AI 都来写 Python 代码,But if you and the AI both write Python, 51:33 你得花一个小时。you take an hour. 51:34 AI 需要 30 秒。The AI takes 30 seconds. 51:36 比拼速度注定会失败。Competing on speed is a losing game. 51:39 但另一种视角是把它 But another perspective is to treat it 51:41 当作船长一样对待。like a captain. 51:43 船长不会比发动机转得更快。The captain doesn’t roll faster than the engine. 51:45 船长决定目的地,The captain decides the destination, 51:47 了解条件,reads the conditions 51:49 并掌控结果。and owns the outcome. 51:50 所以就像我们上一个演示中展示的那样,So pretty much like what we just showed in our last demo, 51:53 你的价值在于知道该构建什么以及原因,your value is knowing what to build and why 51:56 而不是你打字的速度有多快。not like how fast you can type it. 51:59 这就是转向方向盘 And this is the shift steer 52:02 使用引擎,让引擎来控制速度。and use the engine and let it handle the speed. 52:08 所以,这个表格把‘桨手 vs 船长’的故事 So this table turns the Rower vs Captain story 52:11 转化成了更贴近日常工作的细节。into more daily work detail. 52:13 左侧是划桨手风格。On the left side we have the rower style. 52:15 我们必须注重细致的实践操作。We need to focus on careful hands on work. 52:18 我们必须投入大量时间 We need to spend a lot of time 52:19 在重复性任务上。on repeated tasks. 52:20 我们发布单个静态文件 We ship one static file 52:22 ,例如一份报告 like a single report 52:23 ,这种模式很容易自动化。and that pattern is Easy to automate. 52:26 右侧是队长风格。And on the right side we have the captain style. 52:29 我们专注于方向与系统设计 We focus on direction and system design 52:31 时间主要用于设定目标、提供上下文以及判断模型。and the time goes into goals, context and judgment model. 52:35 这个模式契合 And that pattern fits 52:36 一种更长远的、以 AI 参与循环的职业生涯。a longer career with AI in the loop. 52:39 有了心态阶段和方法原则,With both the mindset stages and the method principles, 52:42 我们现在就能在一边展现全貌。we can now pull the full picture on one side. 52:48 所以我们从最简单的步骤开始。So we started with the simplest move. 52:50 我们向模型提供它所需的各种事实 We give the model the facts it needs 52:54 ,然后从中学会保持上下文干净 and then from there we learn to keep the context clean 52:58 ,将知识写入文件 and we write knowledge into files 53:00 ,并将大型任务分配给不同的代理。and we split big jobs across the different agents. 53:05 因此,这些上下文技能 So each of these context skills 53:07 每掌握一个,就能开启协作的下一个阶段 unlock the next stage of the collaboration 53:10 而这种协作遵循一种模式。that is a pattern. 53:12 与 AI 协作方式的每一步提升,Every step up in how you work with AI 53:14 都建立在具体上下文技能 rests on a concrete context skill 53:16 之上,你本周可以练习。you can practice this week. 53:20 所以,如果你本周只有时间做三件事,So if you need, if you have time to do only three things this week, 53:24 我的建议是:第一步,my suggestion is step one, 53:27 记录下那些重复性的工作。write down the repeat work. 53:31 下次再做相同任务时,Next time you do the same task again, 53:33 花几分钟把步骤 spend a few minutes, save the steps 53:34 保存到一个简短的 Markdown 文件中。in a short markdown file. 53:38 这是上下文 This is the context 53:39 模型能够重复利用的。the model can reuse. 53:40 第二个方法是 And the second, 53:41 在它失控螺旋之前,及时合并线程。merge the thread before it spirals. 53:45 试试看“三轮规则”吧。Like try a three turn rule. 53:47 如果对话线程超过三个轮次,If a thread goes past three turns, 53:50 就暂停、编辑,并将其合并成一条消息。pause, edit, make it into one message. 53:54 第三,给定一个目标。And third, give a goal. 53:56 不止是步骤,Not only steps, 53:57 我们要在 AI 优先的编辑器中打开一个项目,we open a project in an AI first Editor, 54:00 不要用 ChatGPT。do not use ChatGPT. 54:02 先陈述一个高层次的目标,State a high level outcome, 54:04 然后观察它,阅读文件 watch it, read the files 54:06 并收集相关上下文。and gather the context. 54:08 像这样的小习惯,能把你从单纯的用户 Small habits like this would nudge you from users 54:10 轻轻推向构建者,这也与我们今天所走的路径 towards builders and they line up with the path 54:14 完全契合。we worked today. 54:16 谢谢你抽出时间。And thank you for your time. 54:17 集中注意力。Focus. 54:17 我知道我们已经迟了大约 15 分钟。I know we are like 15 minutes past. 54:20 现在开放提问环节。Let me open the floor for questions. 54:22 先在 Zoom 聊天区回答问题。First answer the questions on the zoom chat. 54:27 如果数据非常脏,会发生什么?What will happen when data is very dirty? 54:29 我从一开始就意识到了这一点。I’m even aware of it to begin with. 54:31 直接走开吧。Just go away. 54:32 向数据代理提供数据。Feed data agent. 54:33 代理能判断数据不干净 Will the agent be able to tell that data is not clean 54:35 并进行清理吗?and clean? 54:36 这是一个非常好的问题。Now that’s a very good question. 54:38 我觉得我会做两件事。I think I would do two things. 54:42 首先,我会与之合作 The first is I will work with the agent 54:44 把它当作我的思考伙伴和共同创造者。like it’s my thinking partner, it’s my co creator. 54:48 我会先向它说明我的总体目标 I will first state my overall goal to it 54:51 ,然后让它找出差距在哪里。and then ask it to identify what’s the gap. 54:54 希望它能识别出,嘿,Hopefully it will be able to identify hey, 54:56 数据清洁度就是那个差距所在。the cleanness of the data is the gap. 54:59 然后我会与它 And I will then work with it 55:00 合作,制定一个清理数据的计划。to figure out a plan to clean data. 55:04 更重要的是,我将说明清洗数据的标准 More importantly, I will define what is the criteria 55:08 是什么,这些内容将在下节课中详细讲解。of cleaning the data which will be introduced in our next lecture in detail. 55:14 通过这种方式,我们为其提供足够的上下文,By this we give it enough context, 55:18 和整体大局,we give it a good big picture 55:19 让它明白:哦,so it knows oh, 55:20 我这么做的目的是为了获得最终分析。the reason why I’m doing this is to get a final analysis. 55:23 这几乎就像人类管理同事一样。It’s pretty much like managing a human managing a colleague. 55:29 下一个问题:The next question 55:30 你如何确信生成式 AI 的输出实际上是正确的?how do you be confidence that AI output is actually correct? 55:35 这是一个很好的问题。That’s a great question. 55:37 我们下节课会详细讲解,We will also go into a lot more detail in the next lecture, 55:40 简而言之,它基本上和人类差不多。and the short answer is it’s pretty much like human. 55:44 管理直属下属时 When you manage a direct report 55:46 你不是双重检查,而是交叉检查。you don’t double check, you cross check. 55:49 我的意思是,经理不会 By that I mean a manager won’t 55:51 逐行重写代码,rewrite every line of code 55:53 也不会对每一行代码都进行审查。or do code review on every single line. 55:56 这会很快让经理 This will quickly make the manager 55:58 成为整个团队的瓶颈。a bottleneck of the entire team. 56:00 经理要做的是进行交叉核查。What a manager would do is cross check. 56:03 从这个来源取这个数字 Take this number from this source 56:04 从那个来源取那个数字。that number from that source. 56:07 它们是否彼此匹配?Do they match each other? 56:09 它们合理吗?Do they make sense? 56:11 它们与过去的经验相符吗?Do they match the pass experience? 56:13 这就是经理的核心能力所在。And this is the core competency of a manager. 56:17 迅速从杂乱的语境中找出问题 Quickly identify the problems from messy context 56:21 并评估员工的工作质量。and assess people’s work quality. 56:27 我看到 Maggie 举手了 And I see Maggie has the hand 56:30 不过在轮到你之前,你麦克风还静音着呢。before we go to Maggie, you’re on mute. 56:37 课结束时有人问我,I get asked by the end okay, 56:38 ‘好吧,那个可以。’ that works over. 56:40 谢谢大家。Thank you. 56:41 而且 Jen 从一开始的问题是,And Jen’s question from the beginning 56:43 接近最新的情况似乎是 approach the latest seems like 56:44 我们让 AI 处理了更多工作步骤,we handle off more work steps AI 56:46 而我们却不知道它是如何完成这些工作的。without knowing how I finish the work. 56:50 是的,我觉得这与之前的问题相关 Yes, I think that’s relevant with the previous question 56:56 我们总是需要定义成功标准 like we need to it’s always our job to define what is the success criteria 57:01 并审查生成式 AI 产生的思路,不用太多。and review the thought that AI generated Not too much. 57:10 所以我觉得你本质上问的是和 Shane 相同的问题 So I think you’re asking essentially the same question 57:17 只是表述方式不同。as what Shane asked but in a different way. 57:21 所以我还是建议进行交叉验证。So my suggestion is still the cross check that. 57:29 因此,在 AI 时代,有两个素质或能力 So two qualities or two competencies especially matter 57:33 尤为关键。in the AI era. 57:35 第一个是想想一位 NBA 教练。One is think about a NBA coach. 57:39 教练不一定比球员打篮球打得更好 A coach not necessarily play basketball better than the player 57:44 但教练依然能评估球员的表现 but the coach is still be able to evaluate the players work 57:48 并提供有效的指导建议。and give effective coaching suggestions. 57:52 这是一个非常宝贵的技能 That’s a very valuable skill 57:55 ,我们需要去学习它。and we need to learn about it. 57:58 在人力资源管理中,经理管理 In human management it’s not uncommon for a manager to manage 58:01 技术能力比自己更强的下属是 a direct report who is even stronger in technical sense 58:04 很常见的事,我们得习惯这种状况。and we need to get used to that. 58:09 这在人类世界里可能很少见 It’s probably rare in human world 58:11 但在 AI 世界中却极为普遍。but it’s very very common in AI world. 58:14 我无法写出比向我汇报的 AI 更好的代码。I cannot write better code than the AI who reports to me. 58:18 所以有两个品质。So two qualities. 58:19 一是如何管理比我更强的团队成员 One is how to manage the team members stronger than me 58:24 ,二是如何快速发现问题并评估工作质量,而无需双重检查。and B how to quickly catch the problems and assess the work quality without double checking. 58:32 这是我们特别需要学习的内容 That’s something that we especially need to learn 58:35 我们下节课会详细讨论。and we’ll talk more about it in the next lecture. 58:40 你可以自行决定是否作答。You can decide whether to answer this. 58:41 Shane 在问题中提供了更多上下文。Shane added more context in his question. 58:44 比如,他开发了一个专属的 AI 技能 For example, he built a dedicated AI skill 58:47 但这其实给他造成了压力。but that actually creates stress for him. 58:50 如果你读一下他的问题,If you read his question, 58:53 目前的做法是使用 AI 进行交叉验证。the current approach is AI cross-verification. 58:57 你可以利用 AI It is possible that you use AI 58:59 进行交叉验证。to cross-verify. 59:01 其实人类也会用那个。Actually humans use that too. 59:03 想想公司里,Think about in companies they may, like, 59:04 比如腾讯或 Meta 之类的。I don’t know, Tencent or Meta. 59:08 他们可能会派两个团队同时研究同一个课题 They may get two teams working on the same topic 59:11 其中一个团队获胜,另一个则失败。and one team would win, the other would lose. 59:14 所以这基本上和 So it’s pretty much the same 59:15 我们之前做的一模一样。like what we did. 59:17 但在 AI 时代,But in the AI era, 59:19 我们可以轻松组建 10 个 AI 团队 it’s feasible for us to spin up 10 AI teams 59:22 进行竞争,and then compete, 59:24 这对人类来说却是不可能的。but it’s not possible for the humans. 59:27 所以我觉得你做的还行 So I think what you did is okay 59:28 但我们需要有策略地去做这件事。but we need to do that strategically. 59:31 例如,For example, 59:32 为 AI 提供一种可靠的方法 giving AI a reliable way 59:35 来检查一致性。to check for the consistency. 59:39 不过我同意,这可能会带来很大的额外负担。But I agree that it may add a lot of overhead. 59:44 原因在于 And the reason is 59:45 你还在进行双重检查。you’re still double checking. 59:47 我们要做的是进行交叉验证。What we need to do is cross checking. 59:51 我们下节课再详细讨论。We’ll talk about more details in next session. 59:54 但是还有一个问题 But there is another question 59:56 :如何避免 AI slap of how to avoid AI slap 59:58 ,并更有效地利用 AI。and use AI more effectively. 01:00:01 我觉得很多垃圾内容 I think a lot of slop 01:00:03 是我们自己产生的。comes from us. 01:00:05 我们对待 AI 结果的态度很马虎,We are sloppy in how we treat AI’s results, 01:00:09 提供上下文的方式很马虎,we are sloppy in how we provide context, 01:00:12 交叉验证也做得粗糙。and we are sloppy in cross validation. 01:00:14 我们只是让 AI 进行交叉验证,We just ask AI to cross validate, 01:00:18 但实际上需要我们自己努力 but we actually need to put in the work 01:00:21 提供正确的上下文,比如,to provide the correct context and for example, 01:00:24 调整边际条件,来 change the marginal condition so that to, 01:00:26 真正进行交叉验证。to kind of cross check. 01:00:29 如果 AI 能处理那些情况 If AI can handle those situations 01:00:31 ,让我们建立信心,so that we can build confidence, 01:00:33 我们也不能什么都全交给 AI 去做。we cannot just ask AI to do everything. 01:00:38 更多细节稍后详述。More details coming up. 01:00:40 但是我想提醒大家 But I want to kind of raise the awareness 01:00:42 ,我们不能简单地把一切都交给 AI。that we shouldn’t just hand off to AI. 01:00:48 这是协作。It’s collaboration. 01:00:54 接下来,我来回答这两个问题。And then I’ll answer the following two questions. 01:00:56 Zoom 上还有最后两个问题。Two last questions on the zoom. 01:00:59 首先是如何挑选模型。The first is how to choose models. 01:01:02 我再提一下之前我举过的那个比喻。I want to mention again an analogy I did. 01:01:07 很多时候,AI 就像人类一样。A lot of times AI is like humans. 01:01:10 当我们询问如何挑选模型时,When we ask how to choose models, 01:01:11 就好像我有一支人类团队一样。it’s like, I have a team of humans. 01:01:14 它们有些擅长写作,Some of them are good at writing, 01:01:16 有些擅长编码的方法。some of them are good at coding how. 01:01:18 其中有些比 And some of them are smarter than, 01:01:18 其他的一些更聪明。than others. 01:01:20 我该如何选择让哪个 How do I choose which one 01:01:21 模型来处理哪个任务。to work on which. 01:01:23 它们完全一样。It’s exactly the same. 01:01:25 所以你要建立信任,So you build trust, 01:01:27 建立对人们的理解。you build understanding of people. 01:01:29 你知道,嘿,You know, hey, 01:01:30 这个比另一个 this one is better than another one 01:01:31 在编码上更好。in coding. 01:01:33 所以我可能会用同一个人 So probably I would use the, that that person 01:01:35 为生成式 AI 做完全相同编码的任务。for coding exactly the same thing for AI. 01:01:39 用它,基于它构建,感受一下。Use it, build with it, get some sense. 01:01:42 我的建议是,如果直接回答问题 And my suggestion, if I directly answer a question 01:01:45 那最好是针对编程的。is for coding. 01:01:46 我觉得几乎所有 AI 模型都能表现得很好 I think nearly every AI model can do pretty well 01:01:52 随便挑一个就行。and just randomly pick one up. 01:01:54 很可能。Probably. 01:01:55 除了 Gemini 之外。Except the Gemini. 01:01:57 如果你用 Google 搜索,If you Google probably, 01:01:59 大概就知道我说的是什么了。you know what I’m talking about. 01:02:02 而且我听说有时 And I heard that sometimes 01:02:03 Google 有自己的 Gemini 内部版本。Google has its own internal version of Gemini. 01:02:06 它功能很强大,It’s quite powerful, 01:02:07 但我们无法使用它。but we don’t have access to it. 01:02:09 另外,Gemini 在写作方面非常出色。And for writing, Gemini is very good. 01:02:13 Claude Opus 4.6 表现非常出色。Claude Opus 4.6 is very good. 01:02:15 4.7 简直是噩梦。4.7 is nightmare. 01:02:16 至于深度思考,Claude 表现得很出色,And for deep thinking, Claude is very good, 01:02:21 尤其是 Opus,especially opus 01:02:23 而 GPT 则是全能选手,and GPT is all rounder, 01:02:26 不过它的写作有点怪味。but it has a weird smell in the writing. 01:02:30 基本上就是这样了。So that’s pretty much it. 01:02:32 而且使用 Claude 来构建 SOP And it’s more easy to build SOP 01:02:34 或处理重复性工作会更容易。with Claude or repeated work. 01:02:36 这是一个非常好的问题。That’s a very good question. 01:02:38 我想指出,I want to mention that 01:02:40 我们并不想把 SOP 交给 AI。SOP is not something we want to give to the AI. 01:02:44 我们可能以后再讨论这个吧。Probably we can talk about it later. 01:02:46 我可以给你发一个 I can send a link 01:02:48 我关于这个的博客文章链接。to my blog post on it. 01:02:52 通常,我们在使用生成式 AI Usually when we work with AI 01:02:54 或与人合作时,or working with people, 01:02:56 他们并不真正想要标准操作流程。they don’t really want standard operating procedures. 01:02:59 他们想要的是成功标准。What they want is the success criteria. 01:03:02 什么决定了成功,What defines success, 01:03:03 什么决定了完成?what defines done? 01:03:05 那么,他们如何判断 And if how can they tell 01:03:06 自己是否达到了目标呢?whether they reach the goal? 01:03:08 否则,And if not, 01:03:09 他们如何判断差距在哪里,how can they tell where’s the gap 01:03:11 以及该往哪个方向努力?and what direction to go? 01:03:14 相比 SOP,这更重要 That’s more important comparing the SOP 01:03:16 因为生成式 AI 足够智能,能够自己制定计划。because AIs are smart enough, are smart enough to figure out plans by their own. 01:03:22 有时我们并非该领域的专业专家。Sometimes we’re not the domain subject matter experts. 01:03:27 学习什么呢,有什么建议?What is the suggestion to learn? 01:03:28 嗯,这是个好问题。Yeah, that’s a good question. 01:03:32 对于这个问题,我实在没有 For that I don’t really have 01:03:33 一个很好的答案。a very good answer. 01:03:34 我认为差距在于 I guess the gap is 01:03:36 我们评估这项工作的标准。how we evaluate the work. 01:03:40 那是最关键的一点。That’s the most important thing. 01:03:41 我们做不到。We can’t. 01:03:42 我们不知道怎么做没关系,It’s okay for us not know the know how how to do it, 01:03:45 但知道什么构成了优秀却非常重要。but it’s very important for us to know what defines good. 01:03:49 因此,我建议学习新领域时 So my suggestion on learning for a new field 01:03:53 从如何评估质量入手。is learn from how to assess the quality. 01:03:58 目标的不同 And that difference in the goal 01:03:59 会带来截然不同的学习路径。will bring a very different learning path. 01:04:04 只要我们知道如何辨别作品的好坏,And as long as we know how to tell whether the work is good, 01:04:07 就能非常有效地运用 AI。we can very effectively use AI. 01:04:11 但要明白,传统学习方法 But know that the traditional way of learning 01:04:12 其实是完全相反的另一面。is really the other side. 01:04:15 它们会教你很多该做什么,They will tell you a lot about what to do, 01:04:17 但几乎不告诉你如何评估。but tell you very little about how to evaluate. 01:04:22 好吧,Maggie,该你了。All right, Maggie, I think it’s your turn. 01:04:24 好的,我有一个基础的实践问题 Okay, I have a basic practical question 01:04:28 ,它可能与其他模块很相关 and it may be really relevant to other module 01:04:31 ,但我就是想了解 but I just want to learn 01:04:33 还有其他什么额外想法吗?Any other additional thoughts? 01:04:37 而且在实际应用中,有时我收到临时请求,And for in practice and sometimes when I get an ad hoc request 01:04:43 会与 AI 工具进行多轮对话,and I discuss with AI tools and we have like multiple round of conversation 01:04:48 到最后我们制定出一些计划 and by the end we have some plan 01:04:52 但我自己都忘了是怎么一步步推导出这个最终计划的。but even myself I have forgot and how we came to the final plan. 01:05:01 所以我想问 And so my question is 01:05:02 在实际操作中,有什么好的方法 in practice, what is a good practice 01:05:05 来结束对话?to close the conversation? 01:05:08 这是一个很好的问题。That’s a good question. 01:05:09 我可以在这里演示 And I can show you here 01:05:11 当我们编写这份文档时 that when we write this document 01:05:14 它也是一份活文档。is also a living document. 01:05:17 通常我都会 And usually what I do is 01:05:18 保留文档的两个版本。I keep two versions of the document. 01:05:21 一是讨论过程 One is the discussion process 01:05:22 ,比如我们刚才讨论的内容以及得出的结论。like what we discussed, what conclusion we had. 01:05:27 它可能是一个逐日的历史记录或变更日志 It may be a day by day history or change log 01:05:31 ,另一个则是最终状态,例如,and another is the final state like for example, 01:05:34 对于这个文档,我们可能会有这份报告。for this document we might have this report. 01:05:39 哦,这是最终报告 Oh, this is the final report 01:05:41 我们可能会提交给 CEO 的。that we might submit to the CEO. 01:05:43 我们可能还会维护另一份文档 We might maintain another document 01:05:45 用来记录所有的思考过程。documenting all the thought processes. 01:05:48 所以区别在于:And so that the difference is: 01:05:50 不要用 ChatGPT,do not use ChatGPT 01:05:52 而是用 Cursor 或 Claude Code,but use either Cursor or Claude Code 01:05:56 这样就能使用一个很强大的工具 files。so that you get access to a very powerful tool named files. 01:06:00 所有内容都会保存在磁盘上,Everything will be on the disk, 01:06:01 你不会丢失它们。you won’t lose it. 01:06:03 第二个要点是为文件维护变更日志 And the second is maintain a changelog for the file 01:06:06 尤其是关键决策。especially on the key decisions. 01:06:09 这也类似于你和同事 It’s also like when you discuss with a colleague 01:06:10 讨论维护 wiki 的情况。maintaining a wiki. 01:06:16 嗯,好,谢谢,这很有帮助。Yeah, okay, thank you, that’s helpful. 01:06:19 谢谢。Thank you. 01:06:20 还有其他问题吗?Any other questions? 01:06:22 我也趁这个时间发一下链接 Let me also use this time to send out the link 01:06:26 其他人可以提问题。while other people can ask questions. 01:06:30 阿尔弗雷德。Alfred. 01:06:32 你好。Hi. 01:06:33 抱歉。Sorry. 01:06:34 所以我认识你。So I know you. 01:06:35 我在看那个课 I look at the course 01:06:36 你们在 Cursor 上讲的。you guys are teaching on Cursor. 01:06:39 没错。Right. 01:06:39 但我现在正试着学 Claude Code But I’m trying to learn Claude Code now 01:06:42 ,而且不想在 Cursor 上多花钱。like and I don’t want to pay extra on Cursor. 01:06:45 我能不能直接在 Cursor 和 Claude Code 里试着学 Can I just try to learn it in Cursor, Claude Code 01:06:49 ,然后就用那种方式来做?and then do it that way? 01:06:51 你有什么建议吗?Any suggestion that you. 01:06:52 因为你在命令行或 Cursor 中运行 Claude Code 时 Because you talk about Claude Code when you run it 01:06:57 会提到它,on the command line or in Cursor, 01:06:58 这是因为它们在后台有一些提示工程机制。because they have some prompt plumbing behind whatever. 01:07:01 没错。Right. 01:07:02 这样它就会显得更有学识了。Then it may make it feel more knowledgeable. 01:07:06 没错。Right. 01:07:06 那么,我们怎么自己来调整它呢?So how can we adjust it by myself. 01:07:09 不好意思,我没听清最后一个问题。Sorry, I didn’t get the last question. 01:07:11 调整成什么?Adjusted to what? 01:07:12 那么,我们如何调整我的 Claude Code So like how can we adjust my Claude Code 01:07:15 ,让它变得像 Cursor 一样强大呢?to become more powerful like Cursor? 01:07:18 我懂了。I see. 01:07:23 这个有点难回答,It’s a bit hard to answer, 01:07:25 但我先来介绍一下如何使用 Claude Code。but let me first introduce how to use Claude Code. 01:07:28 所以有一个 a。So there is a. 01:07:30 等等,等等。Wait, wait. 01:07:31 抱歉,我觉得这里有个误会。Sorry, I think there is a misunderstanding. 01:07:33 Alfred 的问题是:Alfred’s question is 01:07:35 Cursor 更强大,Cursor is more powerful 01:07:36 Claude Code 落后了。and Claude Code is behind. 01:07:38 我不认为那是我们的原话。I don’t think that’s what we said. 01:07:40 我们说过,它们几乎相当 What we said was they are almost on par 01:07:43 在能力和功能方面,in terms of capability and features, 01:07:45 而且我们教授的所有方法——我们特别强调的—— and all the methodology we teach—we emphasize— 01:07:50 都是不依赖特定工具的。is tool-agnostic. 01:07:53 你可以使用它在 Cursor You can use it in Cursor 01:07:54 、Claude Code 或 Codex 上。or Claude Code or Codex. 01:07:56 几乎没什么区别。There is almost no difference. 01:07:59 嗯,我来澄清一下。Yeah, let me clarify this. 01:08:01 我觉得这个问题可能是因为 I think the question might stem from 01:08:03 我贴在这里的那个链接引起的。the link I pasted here. 01:08:05 它表示,Claude Code It says Claude Code is weaker than Cursor 01:08:08 在某些特定情况下不如 Cursor 强大。on some certain circumstances. 01:08:12 比如。For example. 01:08:12 哦,我们只在乎 Oh, we only care about 01:08:14 它没有我也能运行多久。how long it will work without me. 01:08:18 因此,在这个特定方面 So for this very specific aspect 01:08:19 在这个特定例子上,比 Cursor 差。on this very specific example is worse than Cursor. 01:08:23 不过总体而言,我的感受是 But overall my perception is 01:08:26 作为几乎所有 AI 工具的重度使用者 I’m a heavy user on nearly all of the AI tools 01:08:30 它们基本上都差不多。is they are pretty much the same. 01:08:33 尽管 Claude Code 很受欢迎 Although Claude Code is very popular, 01:08:36 但我并不觉得它有 I don’t really see Claude Code has anything very unique 01:08:37 Cursor 或其他地方找不到的独特之处。that you cannot get from Cursor or everywhere else. 01:08:41 这就是关键点。That’s the point. 01:08:44 所以,Alfred,我们问题的答案就是:So the answer to our question, Alfred, 01:08:46 继续使用 Claude Code,keep using Claude Code 01:08:49 这个课程就能提供你所需的一切。and you get everything you need from this course. 01:08:53 因为,你知道,像我这样的新手,Because, you know, like a newbie like me, 01:08:55 我没法切换很多平台,I can’t switch a lot of platform 01:08:57 因为在一个平台上构建好后,because when you build in one platform, 01:08:59 我不知道怎么在其他平台上构建类似的东西。I don’t know how to build like a. 01:09:02 我不知道怎么用 MD 来切换它。I don’t know MD to switch it. 01:09:04 我对这个领域还是新手。I still new to this field. 01:09:06 那么。So. 01:09:07 嗯,对不起那个。Yeah, sorry for the. 01:09:09 没关系。No worries. 01:09:10 这是一个很好的问题,我很高兴有人提出这个问题 That’s a very good question and I’m glad they asked this 01:09:13 因为这恰恰解释了为什么我们没有介绍 because that’s exactly the reason why if you pay attention to what we didn’t introduce, 01:09:16 任何特定平台的特性。如果你留意一下我们没提到的内容,就会明白这一点。we didn’t introduce any platform specific feature. 01:09:22 Cursor 的规则。Cursor rules. 01:09:23 我们并没有说要引入 Cursor 规则。We didn’t say introduce Cursor rules. 01:09:25 Claude 拥有它专属的 Claude 技能或斜杠命令。Claude has its own Claude skill or slash command. 01:09:29 我们从未介绍过它。We never introduced it. 01:09:30 所以我们所介绍的所有内容都是空文件 So everything we introduced are empty files 01:09:32 而且所有工具都能读取 MD 文件。and all the tools can read MD files. 01:09:35 那才是关键所在。That’s the thing that matters. 01:09:39 谢谢。Thank you. 01:09:40 然后你来谈谈变更日志。And then you talk about the change log. 01:09:42 我们该如何做呢?How can we. 01:09:43 我们只是提示它们 Are we just prompting them 01:09:45 它们就会帮我做电锯吗?and then they will help me to do a chainsaw. 01:09:49 是的,完全正确。Yes, exactly. 01:09:50 你只需告诉它来文档化这个。You just tell it document this. 01:09:54 哦,好的。Oh, okay. 01:09:54 懂了,懂了。Got it, Got it. 01:09:55 好的,谢谢你。Okay, thank you. 01:09:56 没问题。No problem. 01:09:57 而且在无头模式下,非聊天方式 And headless, not chat applies 01:09:59 适用于我们使用 Claude Code 的情况。when we use Claude Code. 01:10:01 我现在就演示给你看,I can show you right here, 01:10:04 有个小技巧:我说 ‘hi’,there is a trick that say I say hi 01:10:08 然后你双击 Escape 键,and then you can double click the escape key 01:10:12 它就会显示 ‘rewind’。and it says rewind. 01:10:16 你只需进入 CLI You can just go to the CLI 01:10:17 ,运行 continue、restore、summarize,或者 and run continue, restore, summarize from here, 01:10:22 其他任何命令即可。or anything. 01:10:23 只需双击 esc 逃脱键即可。Just press the esc the escape key twice. 01:10:27 就这样。That’s it. 01:10:31 我还发现你将 Claude I also noticed you aliased Claude 01:10:33 设置为 Opus 4.6 的别名。to Opus 4.6. 01:10:36 好。All right. 01:10:37 是的。Yes. 01:10:38 Effort Max 模型用 4.6 Effort Max model is 4.6 01:10:41 因为我超级讨厌 4.7。because I really hate 4.7. 01:10:43 我觉得写作简直是场噩梦。I think the writing is a nightmare. 01:10:48 它依然能生成任何文件。It can still make any files. 01:10:50 是的,谢谢你,Regan。Yes, thank you, Regan. 01:10:52 这很有帮助。That’s helpful. 01:10:54 好吧,还有其他问题吗?All right, any other questions? 01:11:01 好吧,如果没问题,All right, if no questions, 01:11:03 我们就到此结束。we’ll end here. 01:11:04 谢谢大家容忍我 And thank you people for tolerating me 01:11:06 熬过了那个时刻。for getting past the. 01:11:09 6 点标记 The 6 mark 01:11:10 下午 6 点标记。6pm mark. 01:11:11 希望大家喜欢这个讲座 And hope you enjoy this lecture 01:11:15 我们社区里见 and see you around in both the community 01:11:17 也下周再见 and also next next week. 01:11:19 谢谢。Thank you. 01:11:20 来参加我们的办公时间吧。Come to our office hour. 01:11:21 非常棒。It’s very good. 01:11:22 嗯。Yeah. 01:11:23 下周六的下周六是可选的办公小时 Next next Saturday is the optional office hour 01:11:26 但我鼓励大家一定要来。but I encourage everyone to come. 01:11:29 我们会在那里展示一些作业相关的内容 We will show something there about the homework 01:11:32 然后周日就是讲座了。and then Sunday is the lecture. 01:11:35 谢谢大家。Thank you. 01:11:36 回头见。See you around. 01:11:37 谢谢。Thank you. 01:11:38 太感谢了。Thanks so much. Module 2_ The AI Builder’s Journ… 1.3 MB
手动方式中使用的 Google Spreadsheets。
自动化方式中使用的 SQLite 数据库。
最后更新:2025 年 11 月 23 日
English Original Module 2_ The AI Builder’s Journ… 1.3 MB
Google Spreadsheets used in the manual way.
SQLite database used in the automated way.
Last updated: Nov 23rd, 2025