OCT 23 Optional: Live coding workshop HW 1+2 w/ Isaac Flath THU 10/233:30 AM—5:00 AM (GMT+5:30) OPTIONAL Recording
Notes
Recording
Optional: Live coding workshop HW 1+2 w/ Isaac Flath
Oct 23, 20253:30 AM - 5:00 AM GMT+5:30
Audio Transcript
Chat Messages
Hamel Husain
Hello?00:00:22
Isaac Flath
Hey, how's it going?00:00:24
Hamel Husain
Very good.00:00:26
Background looks good. Yeah, there's no… There's no, like, blur or…00:00:30
A ghosting going on at all between you and your background?00:00:36
Isaac Flath
Yeah, the green screen helps a lot.00:00:40
Hamel Husain
But it used to, like, ghost a little, like, a bit more than this, so you did something to make it better.00:00:43
Isaac Flath
Yeah, well, I used to have a cloth green screen that had, like, shadows in it, because it wasn't tight, but now, like, if we look at the green screen… and then I, like, adjusted… I did a lot of fiddling with the lighting.00:00:49
Hamel Husain
It's like, if I look at, like…00:01:01
Isaac Flath
Let's see here…00:01:05
Like, now it's… I still have a little bit of this shadow, which I want to figure out how to get rid of, but it's pretty smooth.00:01:07
And then there's a setting in,00:01:16
The background effects that you can check for if you have a green screen, which helps as well, so…00:01:21
Do you have, like, one of those portable green screens now that you, like…00:01:28
Hamel Husain
Unfold or something like that?00:01:33
Isaac Flath
Yeah, yeah, it's, I mean, it's pretty big, though. It's a projector.00:01:36
Hamel Husain
screen, sort of?00:01:39
Isaac Flath
Yeah, pretty much. I think it's, like, 6 or 7 feet long, though. It's, like, it's pretty big. Wide this way.00:01:40
Hamel Husain
I have one of those, too, in the closet.00:01:47
It takes up, like, the whole room, though.00:01:50
Isaac Flath
Yeah, it takes up a lot of space.00:01:53
I might, I don't know. If I get my own…00:01:58
office someday. Maybe I'll do something different, but… until then, this looks a lot better.00:02:03
It's a lot better than people seeing that my desk is right next to the fridge and the dishwasher, crammed between the… I'm, like, between the kitchen and the cupboards, and, like, the pantry, so I'm, like, crammed in there.00:02:10
Hamel Husain
Oh, okay, I never knew that. I never knew, honestly.00:02:21
Isaac Flath
Oh, that's good.00:02:24
Hamel Husain
I thought, I always thought you had your own room.00:02:24
This whole time.00:02:27
Isaac Flath
No?00:02:29
Nope, I am,00:02:32
I have to, like, strategically, like… I'm like, okay, I got a meeting at 1, so, like, we can't run the dishwasher at that time, because the dishwasher's literally, like, right there. It gets loud.00:02:34
So… We'll see.00:02:45
We'll see if anybody shows up.00:03:06
Hamel Husain
Yeah, there's a few people in the waiting room right now. People don't usually come in till on time.00:03:09
Isaac Flath
Oh, gotcha, okay.00:03:14
Hamel Husain
I just joined early, cause…00:03:16
Isaac Flath
I can't see…00:03:18
Hamel Husain
I was already here, so…00:03:18
Isaac Flath
Yeah.00:03:21
whenever I do calls, I'm like, I hope either a lot of people come, or absolutely nobody. You know, because I can't, like, cancel it if there's two people.00:03:25
But, should be good.00:03:35
The people in the waiting room, do you know, are they product managers, or coders, or both, or… not sure?00:03:39
Hamel Husain
I can look them up, I can try to look them up, let me see.00:03:44
Isaac Flath
Guess there's not a lot of coding on Homework 1 and 2 anyway.00:03:51
Hamel Husain
We can just let them in. There's only 3 people, so I can just let them in.00:03:53
One's a founder…00:03:57
Isaac Flath
Okay.00:04:00
Hamel Husain
One is a… another founder, okay.00:04:00
We can, like, do a poll.00:04:10
On the fly, let's see if I can…00:04:13
I think there's a way for me to do this.00:04:16
Have you ever done a poll?00:04:21
host tools…00:04:25
Isaac Flath
Are you doing a Discord poll, or…00:04:28
Hamel Husain
No, I was gonna do a Zoom poll.00:04:30
Isaac Flath
Oh, okay. Makes sense.00:04:34
Hamel Husain
If I can. Polls and quizzes, create.00:04:35
Okay.00:04:40
Isaac Flath
I created a, channel for this, so…00:04:50
I don't know how… I'm pretty bad at watching the Discord while I'm talking, so feel free to interrupt me anytime, of course.00:04:54
Hamel Husain
Okay, I'm gonna let everyone in, I'm gonna disable wait room.00:05:21
I'm gonna let everyone in.00:05:27
I'm gonna save…00:05:31
Okay, I put a poll… Gabriella, do you see a poll somewhere in Zoom showing up?00:05:43
Like a voting poll?00:05:49
Gabriela de Queiroz
Yes, I do. C.00:05:51
Hamel Husain
You do see it?00:05:53
Gabriela de Queiroz
Yes.00:05:54
Hamel Husain
Okay, cool. It's the first time I ever tried doing this.00:05:55
Gabriela de Queiroz
Pretty cool.00:05:58
Hamel Husain
Nice to see you here, Gabriella, by the way.00:06:02
Gabriela de Queiroz
See you here.00:06:05
Excited for this today.00:06:06
Hamel Husain
Yeah, I am too.00:06:08
Cool. This is cool. Okay, so we got…00:06:16
I'll share the poll results in a bit. So the polls are streaming in.00:06:20
Did you get the poll, Isaac, as well?00:06:26
Isaac Flath
Yeah, I did.00:06:28
Hamel Husain
Okay.00:06:29
Isaac Flath
I did.00:06:29
Hamel Husain
Yeah, so we have, like, the poll show… is kind of…00:06:30
There's 26 people answered. It's pretty… it's a little bit uniform-ish distribution right now. That's interesting.00:06:35
Isaac Flath
Oh, okay. There's a little bit of a spike on developer, engineer, and product manager, but…00:06:43
I don't know what I am these days anymore. Yeah, I feel that. I was actually like, what should I put?00:06:50
Hamel Husain
Yeah, I don't know what I am, as well. Yeah.00:06:55
Isaac Flath
I was like, I guess if you don't know, maybe that makes you a founder? Is that what a founder is? You don't know what you're doing?00:06:59
Okay.00:07:06
Gabriela de Queiroz
Exactly, that was me, you know, I'm doing consultancy, and my founder is my company, like, am I the founder of the company? I don't know.00:07:07
Isaac Flath
Alright, should we get started?00:07:20
Hamel Husain
Yeah, we can kick it off.00:07:21
44% is product manager, so…00:07:23
Isaac Flath
No grip.00:07:27
Hamel Husain
So we have a 13% founder.00:07:27
28% engineer, product manager is about 40%.00:07:30
Other is 15%.00:07:35
Awesome. So, a lot of different skill sets.00:07:38
Isaac Flath
Yeah, so my hope is, to cover the homeworks without really assuming knowledge. So,00:07:41
evals, I think. You need a little bit of product information and knowledge and a little bit of coding, stuff as well. Ai can help with both.00:07:50
And, so some of the product stuff might be… like…00:07:59
A little bit obvious for the product managers, and the coding stuff might be harder, and for people who have a coding background, the coding stuff might seem a little bit trivial, but maybe thinking about some of the product considerations as we go over the course of these homeworks might be new or interesting.00:08:04
And, seeing how it applies, and how you use AI to kind of help with these things and to learn, I think is helpful. So…00:08:23
I'm gonna go ahead and share my screen, and then we will jump into it. I believe there's a, yeah, there's a new Discord, channel for this, and I think there's the Zoom chat. So, feel free to ask questions whenever, and Hamil will interrupt me anytime, or…00:08:31
Hamel Husain
Yeah, the Discord channel is called October 20… October-22-homework-review.00:08:48
You can find it under the homeworks section.00:08:57
So, I encourage you to ask questions there, just in case. I mean, you could always, like, put messages in Zoom, but, like, Zoom gets lost, and then no one can see your question. You might not even see the answer to your question, so it's a lot better to put it in Discord.00:09:01
Isaac Flath
Cool. So,00:09:17
To start with, what I'm gonna be using is, I'm gonna be using something called VS Code and GitHub Desktop. These are the easiest to install. If you've already got tools that you like, use those. I put in the Discord a link to this, code.visualstudio.com. Since I'm on Mac, it says download for Mac. You can install it there.00:09:20
And GitHub Desktop is something that you can go from, the link provided and download that application, as well.00:09:39
That lets you clone this Git repository, so after the session, feel free to give that a try to install the application and clone this repository, install VS Code, and then we can… that's what I'll be using. But there's a million different AI tools.00:09:49
you can try a new one every day, and you will not run out of AI tools to try. So, I just picked this one. Don't read into it too much. I thought this would… had good… had the clear install instructions, that's why I picked it.00:10:05
So, okay.00:10:18
So what I've got here is I have the repository cloned. We're going to be looking at homework 1.00:10:21
So, inside here, this is the directory, this is the folder structure inside of VS Code. You can go into Homeworks and Homework 1, and there is this README, which has lots of information.00:10:28
This is… the Markdown file.00:10:42
And… We can… Read it a little bit nicer with a preview.00:10:48
So…00:10:56
Hamel Husain
And you can get to that preview, you could do Command-Shift-P, and then there's something… that's the command palette that Isaac was using, which, like, you know, shows you all the different things. And then I think you went to Markdown Preview. You just started typing Markdown.00:10:57
Isaac Flath
Exactly. If it's not there for you, there's over here. I don't even remember which ones these are. There's extensions here.00:11:13
And you can search for anything, like, if you need something, you don't have a markdown thing, you can search and you can see there's a lot of…00:11:21
things for Markdown.00:11:30
And what I'd recommend is just find one that has, I don't know, a few million installs and use that.00:11:32
So… Okay, so,00:11:39
This is Homework 1. Homework 1 is all about, kind of, getting started and thinking about the prompting and thinking about things. And so, step one is going to be to write an effective system prompt, and we need to put that in a place where the actual application can access it, so we will see that.00:11:45
open a file and edit the system prompt, so we'll look at that. And there's all kinds of stuff that we want to consider. What's the bot's role and objective? How does it respond? How much freedom does it give? Output formatting? All of this. Some of this is important for, like.00:12:03
We want it to give, like, an accurate recipe. Some of it is more, how much freedom, for example, is largely tied into, like, who's our target audience.00:12:23
If we're making this for professional chefs who are interested in, kind of expanding what they do, and brainstorming, giving ideas, you might want this to kind of pitch off-the-wall ideas. If you're making it for…00:12:33
beginner chefs trying to cook at home to save money, and they've never really cooked before, you probably don't want a lot of creativity. You want them to make something quick and easy, etc. And so, all of these questions often tie into, like, well, who's your audience? What do you want the product to do?00:12:49
You know, clear output formatting,00:13:06
Ties into user experience, and also what,00:13:10
what kind of things you're looking for. And then we'll expand and diversify the dataset. But let's get started with this first one, write an executive system prompt.00:13:15
So, this says open backendutils.py, so this is the backend directory.00:13:24
We can open utils.py, and we see this is the system prompt.00:13:29
So I find this a little bit hard to read.00:13:35
And so we're gonna use Copilot, which is an AI agent built into VS Code, and what we can do is we can ask it to make things a little bit more readable. What we were just looking at was a Markdown file, and we previewed it in this nice format.00:13:37
I would like to be able to, like, read it a little bit nicer than… than this, and so we're gonna go ahead and ask the agent to do that.00:13:55
Please move… The system prompt… into a Markdown file, And then import that.00:14:03
for the system.00:14:15
Prompt.00:14:18
variable.00:14:19
This is the system prompt variable. You can see it's equal to…00:14:21
You see there's, like, parentheses, and then a bunch of strings, bunch of text. I'm just gonna move that to a Markdown file so we can work with it a little bit easier.00:14:27
And that's something that… Shouldn't… should not be a problem for the agent to do.00:14:36
Alright, perfect. We can keep the changes, keep the changes, and what we can see is we can see the code and learn a little bit from it, if we haven't.00:14:53
Where it's gonna look at the, you know, it's gonna look at the current file, where is that located, the path of the current file.00:15:02
and looked in systemprompt.md, it just created that. And system prompt is going to be that file.readText.strep. There's a lot of, like, syntax in there. I think once this is created, like, read it, make sure it makes sense. You don't necessarily always have to…00:15:10
like, know all the syntax up front. But I can see what it's doing, I can see that it's putting it in a… it's getting a file path for System PromptMD, I can see the file, I can see that it's reading text, and it's doing some sort of stripping.00:15:28
From… from the edges.00:15:41
And so now I can… I can edit my system prompt here, where, I think it's a little bit more comfortable.00:15:44
Alright.00:15:52
So, you are an expert chef recommending delicious and useful recipes.00:15:54
Present only one recipe at a time. Okay, so there's a few things here.00:15:59
Recommending delicious and useful recipes. So, useful is a little bit vague, so let's… let's do this. We can add this system prompt, and we can say, I don't know.00:16:11
Ask questions.00:16:21
to help me… Think about how to improve the prompt.00:16:22
And, what might need to be considered to make it… Respond… any good.00:16:29
Way for a product.00:16:40
Things… Like, who's the audience?00:16:43
And so we can often ask AI to help ask us questions. Now, I could have also said, go to systemprompt.md and write a good prompt for me, and it would have spit something out, but there's a whole lot of stuff that00:16:49
is problematic about that. Again, we just talked about… we talked earlier about, recipes. Like, if we say useful recipes, it's like, useful to who?00:17:01
You know, useful to someone who's got 15 minutes to spare, useful for someone who's cooking for…00:17:11
You know, they're 5 kids, useful for someone cooking in a restaurant, as a chef.00:17:16
It's kind of unclear.00:17:22
Good. So we can kind of go through, and we can start to think. I think at this stage, and yeah, feel free to jump at any time as well, Hamel, I think at this stage, it's like, get something reasonable, but don't go too far with this. You know, you don't want to end up with, like.00:17:28
3,000 lines of a specification, because not all of it would be needed, but you needed to have, like, something basic. Something basic, so…00:17:42
I might say.00:17:53
So maybe this line is okay. Who is your primary audience?00:17:57
Hamel Husain
I like it a lot, I like this process a lot, by the way.00:18:01
Isaac Flath
Oh, thank you.00:18:04
Hamel Husain
It's a good way to think through things.00:18:04
Isaac Flath
Okay, so who is my primary audience? Are they beginner cooks, experienced cooks, busy parents, health-conscious users with dietary restrictions? Those are some options. Maybe there's some others. But let's say…00:18:08
My primary… Audience are very… busy, new… professionals…00:18:22
Hamel Husain
I got a question for you.00:18:36
Isaac Flath
That are… yes.00:18:37
Hamel Husain
So some people would, like, answer these questions in the chat, and then say, like, you know, like, maybe answer them in the chat in a bulleted list.00:18:38
And then tell the AI to, like, update the prompt. Do you like writing it here, or do you… what's your preference?00:18:46
Isaac Flath
Oh…00:18:53
I don't really have a huge preference, I think either way, I mean, either way, I'm gonna write this, I'm gonna write it poorly, and I'm gonna edit it, or I can put it in the chat. Let's do it in the chat, why not? Use AI for it. And then we can see that we might want to edit it afterwards, probably.00:18:54
Hamel Husain
Yeah.00:19:09
Isaac Flath
The point is, the main thing that I find that's really bad when people put stuff in the chat with AI is if they put stuff in the chat with AI, and then it writes the prompt, and then they never read it, because they assume, like, oh, it wrote what I wanted because I told it, like, you still have to read the prompt.00:19:12
So, I like using voice transcription,00:19:31
A lot, especially in here, just to make things easier, so,00:19:34
I'll go ahead and do that.00:19:39
My primary audience are beginner cooks. They need step-by-step guidance. It's gotta be easy to get ingredients, because they're just going to, like.00:19:42
their local grocery store, or, like, a Giant Foods or something. And it's gotta be…00:19:55
fairly quick to make. You know, beginners don't want to spend 2 hours cooking a meal, things that are less than an hour to cook.00:20:02
The main use case is, just getting started being a little bit healthier, learning to cook, for their day-to-day.00:20:11
They need… For level of detail, it should have.00:20:22
prep and cooking times. It should mention what equipment is needed. It can have a little bit about nutritional information, but that's not the focus, so don't put much in there.00:20:28
And it should just immediately respond.00:20:39
With a recipe, it does not need to, proactively ask about allergies and restrictions, but if a00:20:44
Recipe has some sort of…00:20:52
Common allergy, it should call that out at the top. You know, for example, if it has shellfish, or, if it has nuts, or things that are common allergies, it should, call that out at the top for users.00:20:56
And I can just, like, this is kind of a common thing that I'll do.00:21:12
And just transcribe.00:21:17
What tone fits your brand and users?00:21:19
I'd like it to respond and have an output format that's, you know, fairly casual. It needs to be precise, but…00:21:23
to make it easy to follow, but it's, like, focusing on beginners, so it's clarity and simplicity is the most important thing. But I don't want it to sound stuffy.00:21:33
And, yeah, so I see, how should it handle uncertainty, like, I don't really know.00:21:48
To be honest. I don't know what uncertainty it's talking about, so I'm not gonna guess.00:21:53
I'll just…00:21:58
keep that in mind, as that's something that might be off, and when we go through the open coding later, we can see, like, if we see uncertain things, then we can make the prompt better, so I'm not gonna worry about that, because I don't know…00:22:00
I don't really know the answer to that.00:22:13
What about recipe variety? Not gonna worry about that as well.00:22:15
Yeah, so let's do this.00:22:21
update the… System prompt. With this info.00:22:26
And so, yeah, I might just use this as a starting point.00:22:33
And I think the key thing is, is that…00:22:37
you just need to think a little bit about how this is gonna be used to, like, give it at least a chance. If you just say you're an expert, please return recipes, then it really…00:22:43
It really doesn't even have a chance at working.00:22:54
So let's see, you're a friendly cooking guide. So it changed it to Expert Chef to Friendly Cooking Guides. That was probably in response to the tone. Helping beginner cooks learn to make healthy, everyday meals.00:22:57
Approachable, clear, and easy to follow.00:23:08
Okay, so I'm gonna say, they do not, I'm just gonna add, have… Anything beyond… Normal, basic.00:23:17
Cooking equipment?00:23:28
This might even be doing… going a little bit too… too early, but, you know, it's… for some reason, it's a worry in my mind, so I'm at it.00:23:32
Simple recipes that take less than an hour, readily available at standard grocery stores, clear step-by-step, present only one complete recipe at a time.00:23:41
Never ask follow-up questions.00:23:51
Okay, the format is the recipe name, Allergen Alert. Okay, I told it to put it at the top.00:23:55
Serving size, default… you know, I think I'm gonna do 4 servings, I'm gonna assume this is for…00:24:04
I don't know, small family?00:24:09
Prep and cooking time… Equipment list…00:24:12
You know, I think I said here… I set up here, do not have anything beyond… so I'm just gonna assume that we don't need to list equipment needed, because it's always gonna recommend… I want it to always recommend basic equipment,00:24:21
So I'm not gonna do, like, an equipment list.00:24:36
So that should, like, make it maybe a little tighter for users.00:24:40
Brief note on nutritional benefits.00:24:45
Yeah, I like this. I mean, they're trying to be healthier, but, you know, they're not trying to optimize on it, so keep their goal top of mind so they feel good about it. I like that.00:24:53
Sounds good. Mix up your recommendations.00:25:07
Like…00:25:11
Here, this is where, like, it helps, I know we haven't talked about how the actual chatbot works in this, but, this is where it helps to know, at a high level, even if you don't know the code, how the app works. This app isn't tracking00:25:12
recommendations or recipes. So when it says don't suggest the same recipes repeatedly, it has no way to actually not suggest… like, it doesn't know what recipes it suggested before.00:25:28
I guess if it's, like, a continued chat, if they, like, keep asking the same question over and over.00:25:42
But I don't see them doing… like, I don't see someone saying, like, hey, can you give me a recipe for salmon? And then following up with me, like, oh, give me the same… give me a different salmon recipe. I don't… I don't think it would just give the same one back.00:25:48
So I'm gonna do that, and I'm just gonna move this recipe… this… if the user doesn't specify ingredients…00:26:05
Assume they only have…00:26:12
Oh, and it already says this here. Use ingredients readily available at a standard grocery store. So I'm just gonna delete that.00:26:15
Cool. And this is, like, an okay start. You know, you can spend as much time as you'd like on it.00:26:23
Hamel Husain
I like that you read every single word of the prompt. That's really important. A lot of people don't do that, especially when you use AI. People just…00:26:29
Say, oh, AI wrote it, but you don't want to do that.00:26:37
Isaac Flath
Yeah, and there were small things that I didn't like that I changed along the way, yeah.00:26:41
Hamel Husain
Yeah, I mean, if you left that repeated stuff in, I think it could have confused things quite a bit.00:26:49
Isaac Flath
Yeah.00:26:56
Okay.00:26:58
Okay, so we've got a starting system prompt. This is our best guesses at a product. If you have more information, you can add more, but, you know, in general, I look at this and I think it's reasonable to think that, like, it at least has a chance. And I can see where the gaps are, and then I can iterate from there.00:27:00
If you don't have anything, or it's just, like, a single line prompt, then it's like, of course everything will be wrong, because you didn't give it any information. So, that's kind of the goal here, is, like, it's a starting point before we do evals.00:27:18
Cool.00:27:30
So let's go back to the homework.00:27:33
And here's some things, it says, you know, response rules,00:27:36
What should it never do? We included some stuff there. How much freedom…00:27:42
We talked a little bit about the creativity levels, talked a little bit about the output formatting was in there. So those are just kind of things to think about.00:27:47
And you can see, I could go back again and ask it to ask me more questions than I could think about, and you can kind of iterate through that process as much as you'd like.00:27:57
Are there any questions, or should I kind of keep going?00:28:07
Hamel Husain
Don't see any questions anywhere,00:28:12
If someone has a question, please ask it in Discord, or… Now, or even…00:28:15
Yeah, or even, like, just by voice. There's not that many people here, so feel free to ask a question.00:28:22
If you have any.00:28:28
Pardeep
Is it better to ask questions here on Zoom, or should we just ask questions on Discord, so we have a record of these were the questions?00:28:29
Hamel Husain
Discord's probably better, just so we can all see it, and you can get to it later, easily.00:28:36
Pardeep
Yep, fair enough.00:28:42
Isaac Flath
But you can also just unmute and, like, voice ask, too. Yeah.00:28:43
Hamel Husain
You can do that.00:28:46
Isaac Flath
I'm happy to just chat with people.00:28:47
Alright, so, step 2 says, expand and diversify the query dataset. So, open this file.00:28:55
Add at least 10 new diverse queries to this file. Your queries should test various aspects, including things like…00:29:02
Cuisines, recipes… okay, so the idea here is you're trying to get, like, a representative sample of what users might ask.00:29:11
If you already have a built app, that you might be able to just, like, go look at some history or logs and, see what people have actually asked.00:29:21
If not, you… again, you think about, like, what would a user… what… what might a user ask? And a user that, like, we want to, like, optimize for.00:29:32
So… Let's go into this data.00:29:43
Samplequeries.csv. Okay, so we see a start here.00:29:48
So, this is, you know, another place where it's very easy. I could just ask AI to, like, generate a bunch.00:29:53
But I think at least for the first, this is asking for, like, 10. For the first ones, I think it's really beneficial just to, like.00:30:02
Try and think of 5 or 6 or 10, actual user queries that you might ask.00:30:10
Just because it's a good thinking exercise, you know, if I can do a lot from it, especially as you try and think of, like, what are all the variants.00:30:17
You can see AI is trying to, auto-suggests some stuff.00:30:24
And I want to think about these in terms of… the,00:30:31
In terms of the audience that…00:30:38
might be doing it. Like, I don't want to ask about how do I make a, French silk pie from scratch, because…00:30:41
that's not what a beginner trying to be healthy is trying to cook, you know? That's not the goal of this app. You know, maybe that is something you want to support for a different audience, but that's… that's a little bit different. So…00:30:50
We might,00:31:02
We might say, I have chicken or rice, what can I cook? That looks good, give me a dessert recipe with chocolate. I'm gonna go ahead and…00:31:05
I'm just gonna keep… keep this one.00:31:12
And we're gonna go here. So, let's think of others.00:31:15
What?00:31:19
Give me a breakfast… That's not eggs.00:31:26
Maybe someone's been eating eggs every time, and they just want something other than eggs.00:31:34
I can say… Okay, so I did a breakfast one. What would something be for lunch?00:31:38
Recipe for… Healthy pasta dish.00:31:48
That might be more like a dinner or something. Or a lunch.00:31:59
Maybe I can say, what can I make Make with broccoli.00:32:04
CAI is trying to help quite a bit here.00:32:22
And so, keep going. So, like, if you get stuck and you're like, I don't know…00:32:24
I don't know what to do. My advice is, to try and think of… keep thinking, because if you can't even think of example queries that your users might ask, then how are you going to build a product to help your users?00:32:32
So, this is, like, an important step, and I think if people get stuck.00:32:46
They very often can just create them, but again, it's like, if you don't have a good, clear vision in your mind of, like, how your product's gonna be used, how this product's gonna be used.00:32:50
it's gonna be really hard to… to do any kind of, like, meaningful evals. And so spending a lot of time here. If you really get stuck, and you… you need it, you can… I would recommend doing something like,00:33:00
Based on the target audience, What are some…00:33:14
Types of things users might want to cook.00:33:22
And, kind of, rather than trying to get it to generate the synthetic thing.00:33:29
Try and find ways that you still have to think to create these sample queries.00:33:34
And so I might see, like, okay, some protein stuff. Okay, protein, maybe that's a theme I can think of.00:33:39
I'd like a high-protein lunch.00:33:46
Oh, snack. I like snack. Something like that.00:33:53
Okay, some sort of… some sort of grain bowl. People like grain bowls.00:33:57
I'm looking to start using quinoa for a… A veggie bowl.00:34:05
That is something that people love to do when they're starting, to eat healthier.00:34:19
And so, again, you could just ask them to generate, but I think it's important to be able to00:34:24
be able to come up with ideas of, like, how your product's gonna be used.00:34:33
Fast 15-minute meal… No time. Maybe we do it based on time.00:34:44
And you can kind of keep iterating and keep thinking of how this is going to be used.00:34:49
I don't know.00:34:55
Bake… let's see, this chat says 22 in this chat. Does this a…00:35:00
Something I need to look at.00:35:06
Cool.00:35:15
So, if you look back at the homework, it gave, various ideas. So we had specific cuisines, we talked about, some of those. There's dietary restrictions, available ingredients, we've done a ton of those. There's things like meal types, cooking time constraints, we did some skill levels.00:35:18
I don't need to worry about skill levels, because I'm assuming everyone is beginner-friendly. And here I can decide00:35:36
What if somebody comes in and asks for an advanced recipe?00:35:43
Because our app is supposed to serve beginners. Is that something we don't want to optimize to? Is that something we want to support, and optimize for? If you're just starting out, you probably just want to focus on your core audience.00:35:48
What happens when those beginners start becoming more comfortable? What do you do then?00:36:02
vague or ambiguous queries. I can go in and say… Give me… more… types of… Queries other than these?00:36:09
And I think the core thing is, like, use the AI to, like, help you brainstorm and think, rather than…00:36:24
Try and automate away too quickly.00:36:31
So, we can say, okay, vague or ambiguous, queries. Beans.00:36:35
Skill levels. Let's see here. Cooking time constraints. I think we had a 15-minute meal. Let's do a 45-minute meal. I have…00:36:42
I have about 45 minutes.00:36:54
And… Ground pork.00:36:58
And, let's see, what else? We have available ingredients, we've done that, we've done meal types, we've done cooking time constraints, let's say,00:37:08
I… I… I'm having… let's do a friends over for dinner.00:37:20
And want to impress them.00:37:28
Well, let's just say I'm having friends over for dinner.00:37:31
There will be 8 of us.00:37:33
So, how's it gonna respond to that, knowing that that's kind of an atypical request? And so, yeah, I think,00:37:37
Just kind of think it through. See a question?00:37:45
Should sample queries include queries that aren't well-formed? Like, I don't know, I just want breakfast that has jam, or we don't handle the LLM to handle this. You don't… and I don't trust the LLM to handle anything. So, yeah, if you think that's a realistic user query, and they're probably gonna ask that, I would definitely include that,00:37:48
As a query.00:38:07
Dumb.00:38:13
conclude that.00:38:13
The only things you really don't want to include are things that you think are unrealistic. So, like, I wouldn't want to say, and this happens a lot if you use AI too much to create these, and you don't review. You might see something like, I am looking…00:38:19
for a recipe…00:38:35
for my…00:38:42
Hamel Husain
Someone asked, hey, why aren't we using tuples? Tuples are in homework, too.00:38:42
So, just in case anyone's watching, it might be confusing, because you already, you know, gone through the course so much. You might have forgot about Homework 1 at this point, but at this point, we're just warming you up to the idea. And that's what we did in Homework 1.00:38:49
Isaac Flath
Yeah, we will be doing tuples soon, for sure. So, like, this is a query that, like, I would often see something like… something odd like this that AI's created, and you probably wouldn't want to do this, where it's, like, a breakfast potluck with meatloaf. Is that really something that a user is gonna ask, or is that just something…00:39:05
kind of weird. And so, he wouldn't want something…00:39:23
Yeah, you wouldn't want something that a user's not actually gonna ask, or is so off the wall that it's not… it's, like, unreasonable for you to, like, prioritize fixing that.00:39:29
Alright, cool. Any questions on this so far, or should we keep, keep rolling?00:39:42
Hamel Husain
I don't see any questions.00:39:50
Jon Pedley
It's interesting that in the unrealistic query thing. I mean, there's… every time there's, you open a newspaper, it seems like, well, that's not an… well, that is an anachronism. But, every time you open a news story, there seems to be some, you know, example of how somebody managed to get, you know, some AI to…00:39:51
to, you know, tell people to kill themselves, or something like that. So, it seems to me you do sort of have to think about00:40:11
things that you wouldn't expect people to use your product for to protect yourself.00:40:19
Just to make sure that, you know, you won't get… Into the paper.00:40:26
Isaac Flath
Yeah, so, yeah, figuring out these kind of ways that people might, intentionally… so I guess, I guess there's two things. There's accidental misuse of your product, like an expert chef coming in and being disappointed that all it's recommending is basic stuff when he's trying to00:40:33
You know?00:40:52
cook for a wedding. Like, there's, like, accidental misuse that you could protect against. And then there's, like, intentional misuse. The intentional misuse…00:40:53
is gonna be, you know, something that you kind of start to look into the, start to put into the prompt about, safety. I think there was something in there, and then eventually there's, you know, ways that you could try to identify if somebody is,00:41:03
Attempting.00:41:19
to do something malicious.00:41:21
So I think that's… I don't know what your thoughts are, Hamel, but I guess I kind of put that in a different category. There's, like, a…00:41:24
Hamel Husain
Yeah.00:41:30
Isaac Flath
users that are, like, honestly trying to use the product, and, like, how do we make that good? And then there's, like, the people trying to hack, break, destroy, whatever.00:41:31
Hamel Husain
Yeah. I mean, it's like, you gotta be really thoughtful, and, like, really understand, like, what matters, so…00:41:39
If you have Recipe Bot, And it's going, you know…00:41:48
Can you get it to say bad words?00:41:53
are, like… you know, other offensive things to you? Probably.00:41:57
But is that… You know, someone…00:42:03
if they're doing that, like, you know, do you really want to protect against that? It's not really clear, right? Whereas if you're in, like, an airline.00:42:08
And if you're going to make a statement about, like, the price of a seat or something, then maybe, like, you're gonna do that incorrectly, then that might be more problematic?00:42:16
You know, if you're, like, in healthcare, or if you're, like, a mental health coach.00:42:25
And you're telling people…00:42:32
like, go kill yourself, that seems very problematic, because, like, you kind of have, like… so…00:42:33
A lot of times people get carried away with this, like, oh, can you jailbreak?00:42:40
Yeah, you can totally jail… there's no, like, 100% defense against…00:42:45
steering and LLM to say whatever. It's just a matter of, like, do you care? Like, should you care? And, you just have to be honest, like…00:42:49
Yeah, sometimes it's not worth focusing on that, because if someone's gonna go through all that trouble,00:42:59
You know… Then, they might.00:43:05
Jon Pedley
Yeah, that makes sense.00:43:08
Isaac Flath
Every time any coding agent or OpenAI or Claude changes their system prompt, it's like hours later, someone has jailbroken it and shared the whole system prompt and shared all kinds of ways to jailbreak it. Like, it's like, it's kind of like, oh, it's…00:43:10
It's annoying, but yeah.00:43:25
Okay.00:43:30
So next step, run the bulk test and evaluate. After you've updated this, run the bulk dataset. It says run this.00:43:31
I can say, run this… You can run it yourself if you'd like,00:43:42
If you don't know, how to, you can…00:43:49
Again, leverage AI to help you with things. Alright, no module named.00:43:53
So here… I didn't have the dependencies installed. It's gonna try and install them for me.00:43:59
I actually thought it was just gonna work, so this was even a better demo than I expected.00:44:09
Now, if you're… grabbed a random repo off the internet, you probably don't want to just, like, install whatever requirements it tells you to install.00:44:17
If it's a repo that he trusts, or something that your team built, or something.00:44:27
You could probably be okay with it.00:44:31
Alright, saved 11 results into this, results 2025…00:44:34
Okay, so we can see here…00:44:40
Great.00:44:48
We can see all of our, this is the ID.00:44:51
This was the query.00:44:55
here is the response. So we've got this in a big file. It's not the easiest to read, but we can see we've got everything there, so that's perfect.00:44:57
Onwards to Homework 2.00:45:08
Alright, homework 2.00:45:12
Okay, we identify key dimensions.00:45:17
We identify 3 to 4 related to our functionality, and so this is where we're getting into unique combination of tuples. So,00:45:20
A tuple is just a set of three things, or however many things you want.00:45:31
And so, we want to think about this prompt. I'm going to go ahead and start a new chat, so I'm gonna hit this plus to clear it.00:45:37
And, we can ask…00:45:44
Identify 3 to 4 dimensions relevant to your recipe box, and… okay, so here is again where you want to think about00:45:51
What is the user of this app gonna use?00:45:58
And what dimensions do they care about? We've already defined our,00:46:01
Our users as being kind of beginner chefs, and so that kind of, narrows it quite a bit.00:46:07
And so, if they're beginner chefs that are just starting their journey, seems like time to prepare would be an important one. Like…00:46:15
Okay, so we'd say, I have a few key dimensions, I want you to… Create unique examples of…00:46:24
unique combinations of.00:46:37
So let's say time to prepare.00:46:43
We already put in the prompt that it should be, like, less than an hour, so let's just say,00:46:47
15… 30, 45, 60 minutes.00:46:52
Great.00:47:01
So, what's another?00:47:02
maybe we want to do dietary restriction? That's… that's one that kind of comes into the,00:47:05
The issue is that, like, if somebody says no shellfish,00:47:11
How important is it that you give them a recipe with no shellfish? Is that a… is that a safety risk? So that might be something you'd think about, that safety side of, or the jailbreak side of.00:47:17
Time to prepare.00:47:27
I like meal type.00:47:31
So let's say breakfast.00:47:33
Lunch.00:47:35
Dinner or snack.00:47:37
Okay.00:47:41
Hamel Husain
Someone's asking, Isaac, like, Bonnie's asking.00:47:43
Where… where would someone non-technical do this?00:47:49
And I think they're saying, what I infer is, like, hey, like.00:47:53
Do you think this IDE approach you're showing is approachable for a non-technical audience? Do you… do you…00:47:57
Do you, recommend using, like, the co-pilot chat? Would you use Claude Code in the terminal? Do you mind just giving a little bit of…00:48:07
What your opinion is of, like… Where you should start.00:48:15
Isaac Flath
Yeah.00:48:19
Yeah, so, like, this that I'm doing here, that I'm putting in the chat, I could definitely do this in ChatGPT online, I could do this in there, I could have…00:48:20
you know, copy and pasted from the code into there and done these things. I think that this IDEVS code, it is overwhelming, because there's, like, here's all this file, here's all these venues, there's stuff here, there's this chat.00:48:29
There's this terminal, so it is overwhelming at first. I think it's worth it to… to spend some time.00:48:45
And just get used to this IDE.00:48:52
Because there's a whole lot of stuff that you might want to do with code. So, for example, this let me kind of work with the stuff here and understand the system prompt, find where it was with the agent.00:48:55
put it in a file. So, I think if you're willing to put some time into learning this,00:49:09
I think… I think it's time well spent.00:49:16
You can certainly use Cloud Code. The reason I like VS Code is because it does have a lot of stuff, but you learn this one tool, and you get everything. Like…00:49:19
you get the agent, you can look at your text files and your prompts and your data. I can look at my CSV in the same place.00:49:31
I can look at the code, all in the same place.00:49:40
I can look at my files. And so I'm a fan of this. I do realize that it's, like, overwhelming and a lot at first, but the one nice thing is, like, if you learn this, you don't have to learn00:49:45
you know, anything else. If you do clawed code, you can do a lot, but then eventually, when you have to look at a piece of code or markdown file, you might have to learn a new tool for that. Or if you want to,00:49:57
So you end up with, kind of, a few tools as you get more and more and more into things. At least, that's my… like, if you want to do a chat, then you have to jump to, somewhere else often.00:50:14
So I like this because it kind of gives everything in one interface, but yeah, I do realize that it's…00:50:25
Hamel Husain
Yeah, I agree with you. I think if you're, like, starting out, I think it's probably easier to use the Copilot chat than Cloud Code.00:50:33
Because… you can, like, control what you're putting in the chat more? Like, you can control the context.00:50:40
It can be, like, easier.00:50:46
I think. My opinion.00:50:48
So, I think this is good.00:50:51
Isaac Flath
Yeah, I've…00:50:55
But yeah, I've worked with a lot of people who have never coded before, and getting them started in VS Code, and it's always like, there is so much, like, what is this, and what is this? So it's…00:50:56
Yeah, it, it, it is… I get there's a lot.00:51:07
Katya May
Could I ask a question, Isaac? So, I'm completely new, but… so, and I took the course last time.00:51:13
And then went away, realized I need to learn how to code, and I'm back again. So, and I tried out VS Code, and somehow ended up in Cursor. If you were comparing for a new person, and I like Cursor, Cursor versus VS Code, because they're very, you know…00:51:22
You know, one fork or the other. I like cursor. Any reason that you would say VS Code instead of just cursor? Because you have to learn the same things.00:51:36
Isaac Flath
No, Curse was fine. I picked VS Code because,00:51:45
Because it's just a little bit more popular in general, but yeah, I think Cursed is just totally fine. It's equivalent, in my opinion.00:51:51
Katya May
Yeah, I like it.00:51:58
Isaac Flath
Cool. And there's actually a lot of other tools I could have picked. I could have picked Windsurf, I could have picked…00:52:02
Like, Zed. Zed might be a little bit harder, but, yeah.00:52:08
If you don't know what to pick, and you're gonna spend a lot of time figuring it out, just pick VS Code, or Cursor, you know?00:52:14
So you don't… Yeah.00:52:20
Alright.00:52:23
Okay, so I've got time to prepare, I've got meal type.00:52:26
You know, I want to think, about…00:52:29
you know, something else, if I look at my… Homework 2 directory.00:52:36
I think, cuisine type? Sounds good.00:52:52
So maybe we want Italian, American… Chinese, maybe, I don't know, Greek.00:52:58
So, a few types of food. And so, alright, so that looks good, so I've got that. Generate unique accommodations, write a prompt to generate 15 or 20 of these, okay?00:53:12
Please generate… 15 or 20 unique combinations.00:53:24
Of these values.00:53:33
Nitish Kackar
Can I ask a quick question on dimensions?00:53:37
How would we model something that's not very, like, categorical? For example, like.00:53:40
query that says, replace certain ingredient, or I don't want… abuse.00:53:47
Oil, or something like that.00:53:54
I would be at a dimension for that.00:53:57
Isaac Flath
Like, for ingredient type?00:54:00
Is that what you're…00:54:02
Nitish Kackar
Yeah, like, queries that people want to substitute some things in the recipe.00:54:04
That's not, like, super straightforward, but, like, users may request.00:54:10
Isaac Flath
Okay, yeah, let's try it. Substitutions… So, for example…00:54:15
Sub pasta, or… what are some other substitutions that you can think of?00:54:25
Nitish Kackar
Like, no meat, or… Yeah, less oil.00:54:32
Katya May
Tofu instead of meat.00:54:40
Isaac Flath
Okay.00:54:46
So we'll say they should all be realistic, queries? Human?00:54:48
a beginner?00:55:00
Chef?00:55:03
may ask?00:55:07
To get a recipe.00:55:08
Let's just give this a shot. Use common substitutions The ones listed… Are, some examples.00:55:11
But if there are other really common ones, Include those as well.00:55:26
So this is one way to do it. The other way I might do it is I might say, here are some substitutions, like, I might do this in two-step, here are some substitutions, like, no meat, tofu instead of meat, like, here are some substitutions, give me a list of, like, 10 other possible substitutions that people do commonly, and then I can, like, use that00:55:35
brainstorming list.00:55:53
to create some categories.00:55:55
The goal here isn't necessarily to create every possible variation that a user might do, because that would be…00:55:59
you know, thousands of queries, and if you come up with every possible combination of questions, you can just hard-code the bot.00:56:05
But, at least give you a starting point. Once you get, like.00:56:13
Once you get the process started, then ideally you can start grabbing, like, actual user questions over time, but,00:56:18
Yeah, I would just kind of use AI to try and brainstorm. I'm gonna take this, because I'm a little bit less confident that these substitutions, like, I think it's probably gonna give me some weird substitutions.00:56:25
So, I'm just gonna ask for a few more examples.00:56:36
And see if, because I think some of them will probably be bad.00:56:41
Francisco?00:56:46
Francesco Lanciana
Yeah, so on this, dimensions, like, I think in the book it said something around.00:56:48
they're supposed to be, like, where you're expected to fail. Is that kind of how you've thought about it here as well, or has it just been, like.00:56:55
you know, common categories that you… you think about. Like, was… yeah, was there anything specifically about these that was like, yeah, these… these are where it'll fuck up?00:57:06
Isaac Flath
So I think, it's a little bit of a combination of, like, I wanna… I wanna well represent what users are actually gonna ask, but then…00:57:16
like, I don't want something so simple that, of course, it's gonna be… like, I don't need to test something that I know is gonna succeed. At this stage, I don't really know what's gonna succeed or fail,00:57:26
So let's see here…00:57:38
Okay, so here it's doing something weird, so we'll talk about that. So, yeah, I mean…00:57:40
Hamel Husain
You want to come up with a hypothesis where possible, like, when… you know, ideally, when you're generating static data, you… the whole goal is to trigger failure modes.00:57:45
And so, to the extent where you can…00:57:54
Put some of your own hypotheses in there?00:57:58
is good. Sometimes you don't have any idea.00:58:01
You know, and this is the homework, we didn't… we didn't even do error analysis yet, so it's like an iterative process, like, maybe you do some dumb synthetic data generation, and you're like, oh, like… and you learn… you keep learning, and then you keep iterating.00:58:04
Francesco Lanciana
Yeah, okay, gotcha.00:58:20
Isaac Flath
Yeah, and I think it's, like, you don't want to, like, go so far about, like, oh, I bet… I bet… I bet this'll screw up the bot. Like, you… you want some that are realistic, but, like, the goal isn't to, like.00:58:26
Like, they still have to be realistic, you can't just, like, just try and… Trip things up.00:58:37
Francesco Lanciana
Thanks.00:58:45
Jon Pedley
I did a lot better than I got with ChatGPT. I got things like…00:58:50
Can you and none recipe for tonight?00:58:55
for example.00:59:02
Isaac Flath
Yeah, so it's like, it's interesting, so definitely…00:59:04
So I wasn't very clear, so it didn't really give me… it just jumped to the recipes instead of these unique combinations.00:59:10
Hamel Husain
Is it tuples? Yeah. Yeah.00:59:16
Isaac Flath
So I'm gonna go ahead and hit Restore.00:59:20
Checkpoint.00:59:23
and say, I want them… to be tuples, so I can quickly… Evaluate… before generating The user queries.00:59:26
I don't know, maybe I could have… Like, it's not…00:59:43
completely formulaic, like, maybe I could have just evaluated them from there, but, you know, I think this is, a really nice thing to have, to have it broken up.00:59:46
For example… I might say, for example, 30 minute… breakfast… American…00:59:56
No meat, no meat, I don't know, something like that.01:00:20
No eggs. Go with that.01:00:24
Hamel Husain
You want your meat, okay.01:00:28
I'll do that bacon.01:00:32
Good.01:00:33
Isaac Flath
No, no bacon, okay.01:00:34
Give me, 25 more examples like that.01:00:36
in that format.01:00:44
So a lot of times, if you're… if it gets to you… if AI gives you the wrong thing, it's because you didn't… like, hear it now, it's giving me a much better list.01:00:46
So, okay, so here, write a second prompt for the LM to take 5 to 7 of the generated tuples and create a natural query. Okay, so what I can do here is I could just tell it to use those, but I actually, again, want to look at everything. I want to see…01:00:59
If any of these Seem like ones that are completely unrealistic.01:01:19
Like a breakfast, dairy-free breakfast, Italian lunch with vegetarian.01:01:25
And so, I look through all of these and think, do these seem reasonable, or are there any that are…01:01:38
like, contradictory… Like, use chicken instead of beef. Like, that seems really odd to me.01:01:45
Like, why would someone say, hey, I want an Italian dinner, but use chicken instead of beef? Like, wouldn't they just say use chicken? That doesn't seem like a substitution.01:01:53
That people would commonly put in. So, I probably would, remove that one.01:02:02
It says to take 5 to 7 of the generated tuples, so I just need to pick 5 or 7 of these.01:02:09
Let's see, so let's see,01:02:18
There is a good one. Use tofu instead of meat. That was one of the ones that came up. I think a no-dairy one would be a really great thing, so we've got a lunch and a dinner.01:02:21
Here's a soil, healthier one.01:02:37
Alright, so I probably want, like, a 15-minute one,01:02:43
Great.01:02:50
Probably want at least one, one breakfast. Here's a vegan… vegan breakfast?01:02:51
It'd be interesting.01:02:59
Seems… Okay, so…01:03:00
Yeah, so we've got 5, so that's the… that's kind of the thing, is you kind of read and you pick them, and you think, like, oh, actually, here.01:03:07
No substitutions needed, I'm… That sounds, like, interesting.01:03:14
Throw one of those in. So think of things that might be, realistic if it's, like.01:03:18
A 15-minute dinner using chicken? That doesn't… Seems like a really fast…01:03:27
If you have to cook the meat, that seems really fast, but…01:03:35
You know, so think through, like, what's actually realistic, because some of these combinations are probably not going to be realistic, and then pick those.01:03:38
And again, we can make a prompt.01:03:46
So I'm actually going to copy this, and because I only wanted to generate it for these, I can… I'll just create a new conversation, then I don't have to worry about it, taking the previous, context, because the context isn't there.01:03:48
Okay.01:04:02
Take these combinations of generated tuples, Then create user queries.01:04:04
And again, I want to say a little bit about what I want. I want them…01:04:15
Realistic, as a beginning chef, might enter them.01:04:22
That means it might… varying… Amounts of ambiguity.01:04:31
Grammatical correctness, And casual speech.01:04:42
So, yeah, let's start there.01:04:50
Oh, go ahead, Francesco.01:04:54
Francesco Lanciana
Yeah, I just asked in the chat, but I'll just ask here again. Would you have, like, optional tuples as well? Because in this case, it seems weird to have a substitution, like, dimension.01:04:56
On everything, so it's like, that's not gonna be that often,01:05:09
That it's gonna be happening, so could you have, like, 3 and 4, like, dimensions? Like, switch between them, or…01:05:14
Isaac Flath
Sure, yeah, absolutely. Yeah, you can do anything.01:05:20
Hamel Husain
you want. Like, so don't, like, the rules are, like, what… this is just an idea, this tuple thing.01:05:24
Francesco Lanciana
Hmm.01:05:29
Hamel Husain
You should generalize it.01:05:30
Isaac Flath
So where it makes sense…01:05:33
Hamel Husain
for your application. In this case, yeah, probably, like, it's probably not a dimension that you always want to be there. You know, you probably want, like.01:05:34
Okay, substitution, yes or no?01:05:44
Or something. Maybe you don't even want it. You have to pick the things that are, like, really important for your application. You know, like, do you really want to explore this dimension? Every dimension you add has a cost, also.01:05:46
Francesco Lanciana
Hmm.01:05:59
Hamel Husain
So you have to kind of see… you kind of, like, try… it's really important to see if you can try to base it on real user patterns, if you can. If you don't have users, then that's tricky, but try your best.01:06:00
Francesco Lanciana
Cool. Thanks.01:06:15
Christopher Bradford
So, I understood the question to be something more like.01:06:18
can this approach with tuples support varying numbers of tuples, right? So could we have some that have 3 values and some that have 4 values, that sort of thing, in the same operation, or do we have to keep those separate?01:06:22
Hamel Husain
No, you can have varying tuples, it doesn't really matter. You're just trying to make sure you give…01:06:40
The AI, some… Factors that it can consider when it's… when it's, like, generating its synthetic data.01:06:45
How you supply that is totally up to you.01:06:56
You know, these dimensions… we just, like…01:07:00
the whole idea behind tuples is that if you just ask an LLM to just generate queries, it's gonna give you pretty homogeneous things. So if you, like, start with these tuples, it'll guide it towards varying on the dimensions that you care about.01:07:03
And so that can be anything. That can be, like, can vary, doesn't…01:07:18
have to be the same number of tuples, it can be whatever tuples you want. It's just, like, it's just a…01:07:23
Basically, it's a prompting technique.01:07:28
And if you really wanted to take it further, like, you know, sometimes it's better to generate one example at a time, if you've seen that, like, they're too homogeneous. You want to make sure, like, it's not too homogeneous.01:07:32
Well, we found, like, this approach…01:07:46
kind of helps you steer the LLM to give you more diverse queries, and like, in a way that you care about.01:07:48
Isaac Flath
Yeah, there was actually a, this guy named Carpathy, he's a big AI guy, did an interview recently, and one of the things he said, he was talking about how models don't have a lot of diversity. If you go to ChatGPT, he says, and you ask it to tell you a joke, and you do that odd times, it's got, like, 3 jokes, you know, so…01:07:57
Trying to give it to give you, yeah, that diversity is… Tricky, otherwise.01:08:18
So, I'm gonna breeze over this next step, because it's not, you know, super involved. So, I said, run my chatbot in this repo, so if you don't know how to code.01:08:28
it ran it. It did see that I had some environment variables, and so, if I didn't,01:08:37
I would be asking it how to do this, and then it opens it. I do my chatbot. There's lots of ways that you can automate running and saving these.01:08:46
My advice is the first time you use it, and often, like, often, is, like, actually use the product that you're building yourself, like, launch it and do it, like, the dumb way, at least sometimes.01:08:55
if we're talking like this, like, oh, there's… I have 6 queries.01:09:06
You can just do it manually, copy and paste it, get a feel. You'll probably find, if it's the first time anyone's used your app, you'll find that, like, I don't know, dumb stuff like, wow, when I do this, I can't see the send button, because, like, it overlaps, and I don't have a good footer, or…01:09:11
Stuff like that, so…01:09:27
If you've never used your own app, and actually put realistic queries in, then, like.01:09:31
Put them in and do it the slow way, and then afterwards, you can, like.01:09:38
find ways to, like, automatically run and get results and… and all that.01:09:43
After the first round, at least that's my opinion.01:09:49
So you do that a bunch of times, you end up with some queries,01:09:54
This is a one provided. I'm gonna go ahead and skip and jump to this so that we have time for the rest, which is, open coding.01:09:59
So how do we do open coding? Well, there's a couple things you could do.01:10:07
You could read through this.01:10:13
And say, what you got here?01:10:17
do this, take some notes in, like, a Google Sheet, or an Excel sheet, or something like that, that's totally fine. Let's go ahead and try and make something, very quickly.01:10:19
So, I'm gonna say… Let's go ahead and do this. I'm gonna add some information here, and say…01:10:31
I'm just gonna transcribe.01:10:43
Create a single file, self-contained HTML file that allows me to…01:10:46
view the results in this CSV file. I want to be able to see the query, what the response was, read it, and then have somewhere that I can take, notes for open coding.01:10:53
And so, I want to be able to go through them all. Keep in mind, this is a CSV file, but there's…01:11:06
like, lines on here that, records on here that span quite a few lines, and just do this in a simple, self-contained HTML file that I can use.01:11:14
And so, there's a little bit in there, right? Like, I knew to ask it for a HTML file, we'll see if it even works.01:11:26
And so, there's a little bit of information that you kind of need to know. You know this is a CSV file. I…01:11:35
have done this a lot of times, so I kind of had the intuition that this… the fact that this spans multiple lines would trip it up, so I gave it a little bit about that, and then we can see if we can create something01:11:44
very simple to just view this a little bit nicer. There's nothing particularly bad about doing this once, where I'm scrolling over.01:11:56
But it gets very annoying, so making a small annotation thing for yourself is helpful.01:12:05
Alright, let's see if it works.01:12:16
I would bet it does not work on this one, we will see.01:12:19
Alright, so let's check our CSV file. Is it this one?01:12:29
No records found.01:12:36
We can iterate back and forth here, and we'll say…01:12:39
Let me go ahead and keep it.01:12:45
I tried importing.01:12:48
Showed no results found.01:12:54
So we'll see if we can get something simple or not.01:12:58
But the idea is that you kind of want something that you can, I don't know, take some notes in.01:13:04
Alleges it fixed it.01:13:26
Alright, great, I'm pretty sure it's having the exact problem that I told it it was gonna have.01:13:30
I see the first query, But the response says… No.01:13:35
Response?01:13:44
I like to carry it at times, you know.01:13:45
This is, I'm not gonna get an app that is, like, deployed and shared that's, like.01:13:48
production grid.01:13:56
Typing is something you can work with as a starting point.01:14:02
really quickly.01:14:07
Alright, let's take a look there.01:14:12
Maybe the multi-line CSV is not working right.01:14:19
Hamel Husain
That's how you know this is real, because you're struggling with AI for this.01:14:26
Isaac Flath
Yeah, well, I didn't want to, like, I don't know…01:14:32
Hamel Husain
Yeah, yeah.01:14:34
Isaac Flath
I thought about preparing the file in advance, so that I could just be like, oh, and here's this thing I vibe-coded, but…01:14:35
I don't know, I felt like that would be cheating.01:14:42
So…01:14:46
That's the wrong file.01:14:56
Okay, there we go.01:15:01
So we've got this. Is it the most beautiful thing ever? No, but it's a lot better than I had.01:15:03
So, let's do some open coding. Not gonna do a ton, because we don't…01:15:09
Hamel Husain
Nice.01:15:14
Like…01:15:15
Isaac Flath
I don't know, I feel like it's pretty annoying okay.01:15:16
So what you got for 30-minute eggs and cheese dinner? So…01:15:22
Here's a quick and delicious egg and cheese sandwich that you can make in about 30 minutes, perfect for a satisfying dinner.01:15:27
or snack with simple ingredients. I don't know why it's saying snack when…01:15:35
It's also dinner, I guess that's fine.01:15:44
Seems a little overly enthusiastic, but that's okay. So, like, I'm thinking about…01:15:47
Not just is the recipe correct or not.01:15:53
But also, like, is this the vibe that I want my, users to see.01:15:57
I guess I might say, like.01:16:03
Maybe a bit too promotional slash over the top.01:16:06
Focus more on… Clarity… And to the point?01:16:13
So, that might be one thing I put,01:16:24
So, it said eggs and cheese, we've got… okay, we got eggs and cheese, that's great.01:16:27
1 tablespoon of butter or oil.01:16:32
So you might… this would be a little bit of a judgment call here, is, is saying…01:16:36
butter or oil, okay. If it's… if it's for an expert chef, clearly that's fine, they can make their own decisions. If it's for a beginner chef, do you just want to say, use this? Use oil or use butter?01:16:44
salt and pepper to taste.01:16:56
Do you want to be more specific there?01:17:00
So I might say… Pink, one, ingredient.01:17:02
That was the worst spelling I could have done. Pick one ingredient, not… Two options.01:17:11
I'm gonna say this optional is okay, because it's very clear that it's optional. Optional, a slice of ham, bacon, or cooked sausage for extra flavor? Okay, great.01:17:21
Prepare your ingredients.01:17:31
If you're adding meat, cook the ham, bacon, or sausage set aside.01:17:33
Okay, so I feel like it'd be nice if this, if the step is optional, also propindant.01:17:38
with optional. Like, I know it says optional here, but this step is only done01:17:51
if you're using the optional… so I feel like this should be labeled as optional as well.01:17:57
Keep in a toaster or skillet.01:18:07
In the same skillet, melt the butter, or heat some oil. So again, I don't… I don't like the…01:18:14
I like to pick one ingredient. Like, these are all, like.01:18:21
kind of clarity questions, but also kind of, like, product questions. Like, do you want to give lots of options, or do you want it just to be super simple?01:18:25
Depends on your audience and what you're trying to do.01:18:33
Being careful not to break the yolks,01:18:36
Is that a big deal, if the yokes break?01:18:45
If it is, you know, is this a beginner recipe? If it's, like, That finicky? I don't know.01:18:48
Don't do things… like… not breaking… Yokes.01:19:02
if… Because that's finicky.01:19:11
like, I don't know, I don't know about you guys, but whenever I break eggs, I… I'm… like, 25% of them I break the yolks on, so…01:19:16
Either it's important, and maybe it's a bad recipe for total beginners.01:19:24
Or it's not important, and we should remove it.01:19:28
So they're not worried if they do.01:19:32
Cook the eggs sunny-side up or over-easy.01:19:35
Okay, great.01:19:45
Place a slice of cheese over each egg at the last minute of cooking so it melts slightly good.01:19:47
Assemble a sandwich, etc.01:19:52
Would you like a variation? Maybe with some veggies? If we remember in my prompt, I said don't ask questions,01:19:55
So I'm gonna say, don't ask.01:20:03
Questions. Now, this was pre-generated, based with the assignment, not the ones, using the prompt that I actually did, so that's…01:20:05
part of why. But, you know, I would go through, and that's kind of the level of detail. There's a video in the, in the repo.01:20:13
where Hamel and I went through this for, like, I think it was, like, a half hour, and we did it on,01:20:21
I don't know, like, 7 or 8 of them, so I'm not gonna do that here again, but that's the idea. It's like, some of it is, is it wrong?01:20:27
Those are… those are errors. Is it,01:20:36
just not the right taste, like, anything that's negatively impacting. Now,01:20:40
you don't want to get too tied up with it. I tend to, like, read through it once, especially here. I read through it once, and I just think about it, and I just take notes, and then I move on. If I really stood here for the next 30 minutes, could I come up with more? Sure, but I already got plenty for this. I already got the most egregious things, so, good enough, I'll move on.01:20:47
if I was reading through, and time after time after time, on first read, I didn't have anything for the notes, I might say, like, oh, okay, let me… let me go back, and now I'm gonna be even pickier and think a little bit more, because my criteria, like, I can make my product even better. But this is kind of the… the deal.01:21:07
Anything to add there, Hamel?01:21:25
Hamel Husain
No, I think that's pretty good.01:21:28
Hopefully the export thing works, let's see.01:21:30
Isaac Flath
So I got here, got my export, there's my open coding notes. So, yeah.01:21:36
Cool.01:21:41
And this is also something, like, if you're, like, a product manager who doesn't do coding, like, if you do this, you can get started, you don't have to wait for the development team, and if you do this a bunch.01:21:50
like, if… I feel like if somebody came to me and was like, I have this viewer that I made, and it's helpful, this is what it does and what it looks like, these asterisks are weird, I don't like that, and I wish I saw the system prompt, otherwise this is what I want. Like, that's a very clear, like, ask of a developer.01:22:01
And you actually know what it looks like. You can… you can see the things you like and don't like, so I think this is great, and it gets you started immediately. And you can see I didn't do anything, like, super crazy on the AI coding side.01:22:19
Cool.01:22:33
So you do that a bunch,01:22:35
Once you do that, we do,01:22:39
Go back to the homework assignment, we do axial coding, I think it's called.01:22:42
Cool, so we went through the bot, and you generated it, you did some manually, maybe you automated after you did some manually.01:22:50
We did open coding, where we're reviewing the traces, and we're just kind of identifying themes, patterns. Once you do 10, 15, 20, you'll probably already see the patterns. You'll see that in the,01:22:58
And the video, the video is here.01:23:10
I believe.01:23:15
Open an axial coding walkthrough.01:23:17
So, if you go to this link here… You'll see that… Hamel and I walk through.01:23:20
for 35 minutes, and we're just doing this together the whole time. So, check that out if you want more.01:23:30
Hamel Husain
Out.01:23:37
Isaac Flath
Let's see here…01:23:45
And then, next step, axial coding and taxonomy. So once you have a bunch of these, you want to create, like.01:23:46
a category.01:23:55
And so that's… that's these. And this is gonna be different, you know, every single time. But this is the idea, like, what is it? Missing service-side information. And so maybe I say… let's open up my…01:23:57
Actual… thing here.01:24:10
So maybe I say,01:24:18
So maybe I look and I say, like, okay, what is the failure mode here? The failure mode is that it didn't… didn't pick out the allergen, or whatever the case may be. And you really try and narrow it down to, like, what's the most important thing in each one? At least for the starting point.01:24:24
So… Let me open my results viewer again.01:24:42
Hamel Husain
And you can have multiple axial codes, like, per example. For simplicity, we didn't show that, you know, but, like, you can. We just want to make it to where, like, you understand the process.01:24:51
On the first go-around.01:25:03
Isaac Flath
So we will say no title, no title.01:25:08
I don't know, whatever the case may be, but…01:25:13
Yeah, pick the category of the errors, and then, something that you can aggregate to, make sure you understand it. That's kind of the point here. I know I'm going to go in quick, because we're running out of time here.01:25:17
Make sure there's enough that…01:25:30
Not just you understand it, but then, like, someone else on your team, whether it's, you know, a colleague, or…01:25:33
a developer or anyone else can understand what this failure mode is. Missing server side… serving side information fails to specify the number of servings. This is something… I think Shreya probably made this, it's very clear. And then if you can have, like, examples, and you'll see this bot response.01:25:41
is not, like, the full recipe response, it's just enough to make it very clear what is this failure mode.01:26:01
You know, provides a recipe with ingredients like 2 large eggs, 1 cup, spinach, without specifying how many people this serves.01:26:10
And so the idea here is that you're kind of building up,01:26:16
Like, a library of, like, what are the things you need to fix?01:26:20
In your app.01:26:25
Overcomplicated, simple recipes. So what is that? Overcomplicated, simple recipes. Like, to me, I read this, and I'm like, I don't know what that is.01:26:27
Bot provides a recipe with too many ingredients or steps for what should be a simple dish. Quick egg, spinach, and cheese recipe, please.01:26:36
Garlic powder, red pepper flakes, multiple preparation steps that could be simplified. Okay, so now I get an idea of01:26:44
Of what this is, and that's…01:26:51
pretty much what you want to build. But…01:26:54
what the failure modes are come from this open coding. Once you do a bunch, then you start to create these categories, and you try and make up what are these categories, like, oh, it seems like…01:26:58
It seems like it's always overcomplicating simple recipes, and then you start building out that dictionary based on that.01:27:08
Are there any questions?01:27:19
Got about 8 minutes left, I'm happy to answer questions, I'm sure Hamel is too.01:27:21
Jon Pedley
I love that, I think that… I mean, these actual things, there's no… they are… they're subjective, they're based on your understanding of your users. Many… some people might not think that red pepper flakes and garlic powder are really overcomplicating, but you know who your… you have to really know who your product is for and be able to make that call.01:27:33
Rather, so, you know, you still need…01:27:52
Product thinking and using your brain rather than,01:27:57
You know, just letting machines do it for you.01:28:01
Isaac Flath
Yeah, absolutely.01:28:07
Francisco?01:28:08
Francesco Lanciana
Yeah. Was what you were showing us just then, like, an example of a rubric, for the failure mode? Like, you would have gone, like, that was in, like, chapter 4 of the book, or is this kind of pre-rubric? Because it was kind of going through how to, like, tell if it is or not.01:28:10
Isaac Flath
Yeah, so, let me, beer.01:28:28
Hamel Husain
Yeah, I mean, we just wanted to make sure you… this is, like, example failure modes. I just wanted to make sure we… you knew what we were talking about in these examples, because, like, we just put the failure mode, then be like, what the hell?01:28:32
what the hell failure mode is this? I don't understand what you're talking about. So we just, like…01:28:45
Francesco Lanciana
Yeah. You know.01:28:49
Hamel Husain
Anticipated that, and just wrote in a lot of detail.01:28:50
Isaac Flath
Yeah, I mean, and it's helpful for teammates, and it's helpful otherwise, because, like, over… yeah, like, like, saying there, it's like, overcomplicated, simple recipes. Like, when I read this, I might think, if something says garlic powder and red pepper flakes.01:28:54
I would say, is that overcomplicated? Like, maybe I would say no, and my colleague would say, yes, that's overcomplicated. And this kind of is one benefit, kind of getting ahead of ourselves, I think, of making, like, LM as a judge, is like, what is the rubric? Like, what is…01:29:08
Like, if we don't know if this is overcomplicated or not, and where that boundary is, it's gonna be really hard to evaluate, it's gonna be really hard to make it consistent,01:29:27
So, yeah, I think being really clear… It's just helpful.01:29:38
And these might end up being examples in your prompt anyway, maybe.01:29:43
Francesco Lanciana
Yeah. Would you say you'd go further than that with a rubric, though? Like, this would be not enough for a rubric for each value mode?01:29:48
Or would it be, like, at, you know… But this looked kinda good.01:29:55
Isaac Flath
I mean, I would start… I would start here. I wouldn't… I wouldn't necessarily go through…01:30:01
you know, every thing, and say, like, is it yes or no for every single one of these? I would try and kind of categorize, like, which of… for this particular trace, which of these does it fall into? I think this is enough detail for now, and then,01:30:07
I try not to… go too far, too early.01:30:25
I would say it's enough, unless there's confusion as to what something means. If it isn't, if there is confusion, then I would give you more detailed. The more specific you are, the better.01:30:31
But if, for example, if you add…01:30:44
Two or three lines to the prompt about missing dietary restriction information, and then it, like, almost completely solves the problem.01:30:48
then I don't want to waste time, like, really, really making this super clear, because I already solved my problem.01:30:57
So I would say, like, start with something that just seems reasonable, maybe is clear, but not super clear. See if some of the simple fixes fix it. If you still can't fix it, then maybe you need to get more and more specific, so that you can be more and more clear in your prompt, or in your product, or in other places.01:31:04
But… yeah, I would…01:31:23
Start with a simple explanation, see if that fixes it, and then just keep getting more complicated, only if you have to, would be my advice.01:31:25
Francesco Lanciana
Yeah, makes sense. Thank you.01:31:35
Oz Yilmaz
I have a question in terms of when you start the open coding, when you're building a new product. We were con… like, I was explaining the course to my teammates recently, and we're going to change and build something new, and we were like, okay, do we start open coding immediately? Or, like, because we have, like, this baseline prompt that works.01:31:37
that will… we know will fail, with many cases, like, how do you… like, the initial way before me taking the course was we wipe-coded, literally. We put an example, and we're like, okay, this fails, let's add this to the prompt, yadda yadda.01:31:57
But, would you say I start open coding immediately at the start, and then refine the prompt, and do that iterative cycle from the start? Or is there a point where you're like, okay, the prompt is good enough, now I start open coding? How do you go about that?01:32:11
Hamel Husain
I think it's good to vibe check your app first a little bit. Like, if you're, like, really early, you don't need to jump into any evals. Because, like, you kind of try and figure out what you even want, like, do you even want a recipe bot? Or, like, you're just exploring, you know?01:32:26
Your own, like, yourself?01:32:42
And you're kind of doing open coding in your head?01:32:44
Open coding is kind of like, oh, like, you're just writing down your reactions, in a way, right? So… Exactly.01:32:47
And sometimes you don't even need to… like, it's like that feedback loop can be shorter, just like, oh, look at a… look at a response, react, and fix it, and iterate. And you're like, oh, I don't want a recipe bot anymore, I want a different kind of recipe bot. This is, like, doesn't feel right. So that, yeah, yeah, like, get through all that, and then do…01:32:52
Or, do evals after.01:33:10
Oz Yilmaz
Thank you.01:33:13
And I'm glad you white-coded the CSV, because I went through it yesterday, and I was like, I'm doing something wrong, clearly, this is taking too many steps.01:33:14
Isaac Flath
So thank you, Isaac, for that.01:33:21
Yeah, absolutely. No problem.01:33:24
Hamel Husain
Any more questions?01:33:34
Katya May
Alright, well, I'll jump in, with two. The first one is a quick one. So Isaac, your…01:33:35
If you can show your screen again, how did you get that…01:33:43
small, tiny preview. I think it's called Minimap, maybe?01:33:48
of the long… like, I'll… the long CSV file.01:33:51
This year? Yeah, that one.01:33:59
Hamel Husain
Yeah.01:34:04
Isaac Flath
It just came with VS Code.01:34:04
Katya May
Do you find that helpful? Because I think I used to have that, and I turned it off.01:34:06
Hamel Husain
It looks cool, but, like, I was like…01:34:12
Isaac Flath
Oh, yeah.01:34:15
Hamel Husain
point, it's like, I don't know.01:34:15
Isaac Flath
No, it's not helpful to me at all. So yeah, if you go to… if you hit the Command-Shift-P to bring this up, just like I did to find the markdown, then I just searched for minimap, and there's a toggle mini-map, and I can turn it off or on.01:34:16
Katya May
Okay, cool, alright. I think, yeah, I probably wouldn't always want it, but sometimes my… because I have, like… I had to build a multi… instead of the bulk tester part of our recipe bot, like, for my app, I had to build a multi-turn…01:34:31
conversation bulk tester.01:34:47
It took me days to build… like, I'm not technical, or I'm becoming technical, I guess. So it's really long, and so that will actually help me jump when I know, oh, it's…01:34:52
Conversation 20, and it's way at the bottom.01:35:02
Cool, alright. And then my second question was, back when, I think it was John mentioned about,01:35:07
jailbreaking and etc, and yes, you know, we kind of roll our eyes, but as a newbie, new person, I had thought about… and I guess I just want to check if this is…01:35:14
re, suggested, or good practice, separating, I guess, my evals, and just for now, focusing on evaluating, is my chatbot01:35:28
doing what I want it to do in terms of the conversation it's having with the users, and then separately, later on, when I'm more comfortable with that, evaluating, you know, how well is my chatbot resistant to jailbreak attempts, or,01:35:40
prompt injection attempts. Comments?01:35:58
Suggestions?01:36:02
Hamel Husain
Yeah, I mean, you shouldn't.01:36:04
Isaac Flath
I think it makes sense.01:36:05
Hamel Husain
focus on, yeah, you should focus on, like, does your app do what you want? Because, like, no one cares about jailbreaking anything that doesn't work.01:36:05
Katya May
Yeah, that's true!01:36:12
Wonder!01:36:15
Hamel Husain
That's totally fine.01:36:15
The jailbreaking thing, I would just want to say, people get carried away by this…01:36:17
Don't get carried away by jailbreak defense, because…01:36:21
I mean, yeah, it… like, you have to really think, like, is it harmful to your business? Like, is it really harmful? And be honest with yourself. You can get…01:36:25
Well, Michael…01:36:35
Katya May
Okay, mine will eventually be, like, mental health arena, so…01:36:36
Hamel Husain
Okay, then it, then it died.01:36:39
Katya May
Yeah.01:36:41
Hamel Husain
Interesting. It's not so much jailbreaking that maybe it's like…01:36:42
Can it go in a dark direction?01:36:47
Katya May
Unintentionally.01:36:50
Hamel Husain
You know, so…01:36:51
There's a… there's, like, a line there. Like, how…01:36:53
Is there a way that someone can accidentally steer it in the wrong direction?01:36:57
Katya May
Okay.01:37:02
Alright. Add that to my ever-growing list, thank you.01:37:03
And thanks for having this session. I was one of those in the last cohort that was like.01:37:09
I know I need this information, this knowledge, but I've realized I now need to learn how to code, so… But I'm back, and I'm not crying.01:37:14
Well, that's great.01:37:25
Isaac Flath
Whoa.01:37:25
Hamel Husain
Thank you.01:37:26
Isaac Flath
This is… I'm gonna plug myself now that you said that, so thank you for the setup. So, yeah, so there's 3 more of these next week, covering the other homework, so we'll keep going through them.01:37:26
And I do teach an AI coding course, so if anyone's interested, you should check that out.01:37:36
Francisco?01:37:43
Hamel Husain
Check that out. So you're not plugging it hard enough, so let me just interject.01:37:44
So, I've been working with Isaac for, like, a decade. He's really good at what he does. He's, like, one of the best people I ever worked with.01:37:48
And, one thing that he's teaching is, like, he's teaching people how to do…01:37:57
how to be effective with AI in coding.01:38:03
And he's gone really deep on this subject. I worked with Isaac previously in several capacities. One was at an AI coding lab.01:38:06
that we both worked at, called Answer AI.01:38:16
And we also worked on open source together, we worked on developer tools together, so we worked on this project called NBDev. So we've both worked a lot on01:38:19
like, developer tools, and Isaac is really good.01:38:29
And he teaches… he's, like, a really good teacher, so if you are interested in learning, becoming better at AI-assisted coding, check out his course.01:38:33
put a link… we'll send an email as well to everyone, but I can find the link, or maybe, Isaac, you can find the link and put it…01:38:46
Isaac Flath
Yeah, yeah, yeah, yeah, I'll do that. I'll share a link. And yeah, I think, next cohort, we're actually gonna have for, like, every single week, we'll have, like.01:38:56
3 different phases, and so they'll be like, here's how you get started with this concept, here's the intermediate, and, like, here's, like.01:39:05
you know, really advanced, like, how do you contribute to, like, open source library kind of thing? And so it'll be… it'll be broken out. There'll be content for,01:39:13
You know, for all levels that are doing it. Now, if you don't want to do any coding, then…01:39:22
It's probably not the right… the right course, but if that's something that you want to learn, then,01:39:27
Yeah, I would love to have people.01:39:32
Francisco?01:39:34
Francesco Lanciana
Yeah, two quick ones. One, how did you do the, voice recording, like, voice-to-text in VS Code? Is that… because I don't see, like, the button to actually enable that, I've just never seen anyone do that.01:39:37
Isaac Flath
Yeah, so there's… there's a lot of, voice transcription tools, and, that one was Mac Whisper. So, last week, I was using Monologue, the week before, I used Whisper Flow, so there's a lot of them, that work, that work great.01:39:49
And what these tools do is they often allow you to put, like, a shortcut key, and so this one01:40:05
for my Mac Whisper, I have it set to my right ALT key, so if I press and hold my right ALT key.01:40:13
then the voice transcription starts. And then I transcribe, and when I release, it pastes it in wherever it is, whether it's in Google Doc, or in VS Code, or in Discord. So I can voice transcribe it, anywhere.01:40:19
So mine… the one I was using was, Mac Whisper, but I've used Monologue, that works okay. I've used Whisper Flow, I've used Super Whisper.01:40:32
Francesco Lanciana
Yeah, okay. They like…01:40:42
Isaac Flath
They like… they like the Whisper brand, I guess, but…01:40:43
Hamel Husain
I use, I use WhisperFlow a lot, because I also use it on my phone.01:40:45
Oh.01:40:50
Francesco Lanciana
Okay.01:40:51
Hamel Husain
So…01:40:51
Francesco Lanciana
Yeah.01:40:52
Hamel Husain
It can be flaky sometimes, but it's good.01:40:54
Isaac Flath
I would really recommend, getting one, because it's…01:40:56
AI can often take what you write and then, like, reorganize it nicely, and so when you're trying to think about, like.01:41:00
what might a user want, or you're trying to think about, like, what do I want out of this thing I'm vibe coding, or coding better, or…01:41:09
Whether you're talking to a coding agent, or… It's,01:41:16
it's really nice just to turn on the transcription and just be able to talk, and sometimes I'll talk for 3 or 4 or 5 minutes, just trying to organize my… like, get what I'm thinking out of my head, and then I can sometimes ask, whether it's Copilot or Gemini or Claude or ChatGBT, like.01:41:22
can you organize this for me? Because I've just got, like, a big block of transcribed text. And I think it just saves a lot of time, too.01:41:39
Talking's faster than typing, for almost everybody.01:41:47
Hamel Husain
I would say it's a tremendous amount of time.01:41:50
I get really annoyed when I have to, like, type stuff.01:41:52
Francesco Lanciana
Like, 2 months now.01:41:55
Isaac Flath
It's funny, because a lot of times when I'm, like, presenting, I often type a lot more when I present, and I don't know why, because when I type, I'm also telling people what I'm typing, and I'm like, I should just transcribe, this makes no sense.01:41:58
Francesco Lanciana
So… Yeah.01:42:10
Isaac Flath
I transcribe a lot.01:42:12
Francesco Lanciana
One more question is just on the,01:42:14
Yeah, a quick one on the… like, how long do you typically spend now on the open coding, axial coding, like, now that you've gotten used to it, you know, you've got the feel for it, like, is that stage pretty quick, or…01:42:17
Does it still take, you know, the better part of a day, or, like, does it super depend?01:42:32
Yeah.01:42:38
Hamel Husain
Yeah, I mean, the whole eval, I would say it's, like, 75% of the importance and the value of the whole process.01:42:39
So I do spend a lot of time on it,01:42:47
And I, like, try to find more data to annotate, look at more data, do all kinds of analysis on it.01:42:52
of, like… Oh, okay, you know, I go a little bit beyond, like.01:42:58
a lot of data analytics stuff. Like, I compute a lot of statistics over the traces, see if I can find patterns, stuff like that.01:43:04
It just tends to be, like, you can find a lot of low-hanging fruit.01:43:11
And just fix your application.01:43:16
And it's like, when you get to the steady state, then, like, the Z valves become more important.01:43:19
But a lot of applications nowadays are in that beginning phase, so… I end up spending a lot of time doing01:43:24
Air analysis.01:43:30
That's just me.01:43:32
Isaac Flath
Yeah, it's the same for me. I think it's like… Just do…01:43:34
you know, I end up finding stuff doing open coding, like, oh, wow, every time it asks for, you know, a diet that doesn't do well, or like, wow, it really doesn't know what a vegan diet is, or…01:43:39
doesn't know what a keto diet is, or I find, like.01:43:49
You know, all this… every time… it feels like every time I do short queries, it just doesn't seem to feel right. Like, and if you do enough of it, you start to get these intuitions about, like, oh, I bet this query probably isn't going to do well in my product.01:43:53
And then you can go try it. And so, I find it's like…01:44:04
Yeah, it's… I find it's… it's…01:44:10
really helpful, and if you do a bunch of it, like, eventually it gets to a point where you're like, every time it goes, you're like, oh yeah, of course, it's like… like, you type in the query, and you're like, it's probably gonna miss the title on this one, you know? And… that's really helpful, and you can just fix a lot, and you can just iterate super quickly, and so…01:44:12
Yeah, I think just generally being a critical user of your own product is…01:44:32
Just really, really, really helpful, so… I spend most of my…01:44:38
most of my time, at least when I'm trying to improve the AI generation aspect.01:44:42
Most of my time is probably open coding.01:44:47
Francesco Lanciana
Yeah. Thanks.01:44:51
Isaac Flath
If I really know really precisely and exactly what the problem is and what causes it.01:44:54
Then, like, what triggers the problem to… the thing to go wrong, then usually fixing it doesn't take that long anyway.01:44:59
So…01:45:08
Francesco Lanciana
Nope.01:45:12
Hamel Husain
Alright, it's probably a good time to wrap things up. Thanks everybody for coming, and we will see you again soon.01:45:15
Thank you.01:45:22
Isaac Flath
Thank you. Bye.01:45:23
Katya May
Thank you so much. Bye.01:45:24
Get live hands-on guidance on applying the AI eval course concepts to the homework and get your questions answered. This is perfect for coders who want to learn more about better AI workflows, and for PMs interested in understanding the details and how AI can help you do AI Evals effectively. We'll use AI agents and models as a partner to speed up our work and deepen our understanding. You will learn to: Apply AI Evals concepts from the course to real datasets Use AI tools to explore and learn new concepts in a practical way Learn how to use AI effectively to complete tasks quicker This is a series for everyone, from developers to product managers. The only requirement is a desire to learn by doing.
[
Home
](/parlance-labs/evals/2025-3/home)[
Community
](/parlance-labs/evals/2025-3)