Contrast
< Back to Blog
Original link:

https://www.youtube.com/watch?v=08qXj9w-CG4

2024-01-09 05:14:12

LangChain v0.1.0 Launch - Agents

video content Image generated by Wilowrid

Agents and tool use in general are one of the key concepts of lang chain .

And we want to make it really easy for people to build these types of agentic workflows .

So one of the big things that we focused on with 0.1 is making agents in lane chain easier to understand , easier to customize and more reliable in general .

And so I wanna walk through what length , what agents are at a high level .

And then also some of the improvements that we've made .

So at a high level , agents involve taking a language model and asking it to reason about which actions to take and then taking those actions .

And oftentimes that is in a loop that repeats until the language model decides that it's done that it doesn't want to take any uh any actions .

And so there are a few different things here that are important to understand .

And , and so one of the things that we've added is a conceptual page for agents that goes over all of these , it goes over the schema things .

video content Image generated by Wilowrid

So we have a concept of an agent action and an agent finish and intermediate steps , agent action is basically uh when the LLM decides to take an action , an agent finish is when it decides to respond .

And then intermediate steps are basically the action and the observation from that action that have happened previously .

And so we explain that in some detail here , we then cover what an agent is .

An agent basically takes in some inputs as well as any intermediate steps that have already happened and decides what to do next .

We then have a concept of tools and tool kits .

And these are basically representations of actions that a language model can take along with the function for actually taking those actions .

So it's basically the the schema definition for what should go into a function and the function itself , we can then combine an agent and tools with an agent executor , which is basically a loop that runs to call a language model .

video content Image generated by Wilowrid

Figure out what to do , take that , take the tool that it decides to use , call it , get an observation and then repeat until it decides that it's finished .

So we've added this conceptual page um with far more detail than what I just explained to cover all of that .

We've also added this page that covers the various agent types that we have in L chain .

So there are a lot of different agent types .

Um There , there are I think seven that we have uh documented here .

Some of them are good for local models .

Some of them are good for the newest open A I models .

Some of them support parallel function calling .

Um Some of them support conversational history really easily .

Some of them only support simple tools like that have a single input because they are generally prompting strategies that are useful for simpler models .

Other of them support more complex uh multi input tools .

Wilowrid Advertisement
video content Image generated by Wilowrid

And so we've added a table that documents all of this and also has some verbiage on when to use because a big question that we get is when to use certain agents in different scenarios .

And so we want to have more guidance around that here .

All of these tables or all of these agent types in this table also have their own page that covers uh how to invoke it , um how to use it , um how to create it .

And a big part of this is we've actually also pulled the prompts out from the agents themselves .

So the prompts are really , really important to guide the agent to what to do .

And so previously , they were obfuscated a little bit in the agent definition .

Now , what we do is we have the , the prompts live outside and then we have these creation functions that take in an LLM tools and prompts and then create this agent .

So I want to walk through what this looks like in a notebook a little bit .

video content Image generated by Wilowrid

Um So we're going to import some stuff that we know we're going to use , we're then going to load the prompt that we have from the hub .

This is a really simple prompt .

Um We're then going to load the language model that we want to use .

Um We're then going to load the tools that we want to use now that we have the prompt , the language model and the tools we can start to create our agent .

So we can pass those in , we can create , we can get back the agent object .

This right now doesn't execute any tools .

Um It just kind of like takes an input and and decides what to do .

Part of that input always needs to be this intermediate steps key .

This is used to track any progress that the agent has already made .

So we can pass in this input , we can get back a result .

Uh We can see that this result .

Um If we want to look at the full result , we can see that this is an agent action thing .

Um And it's saying , hey , you should call this tool um with this input .

But again , this isn't actually doing anything .

So if we call this again , this is stateless .

video content Image generated by Wilowrid

If we call this again , it will give us back the same thing because we haven't actually executed anything in order to execute it , we need to use the agent executor .

So here we pass in agent and we pass in tools .

Um And now when we evoke it it will , it will do things under the hood , um , and return a response .

Um , and so we get back , uh , this response .

If we go to Lanes Smith , we can see exactly what's going on under the hood .

So we go here , we can see the agent executor and we can see that there's basically three important calls that it makes .

First it makes a call to open A I .

Um , it has this tool that it's using because we're using the open A I function calling agent .

So it has this function definition here .

It's got the prompt here .

Again , as you can see , we're using a really simple prompt .

Um And then it's output this , this function call thing .

We can then see that it calls this Tavili thing , the Tavili search , which is , which is a search engine um and return some output .

Wilowrid Advertisement
video content Image generated by Wilowrid

And then we can see the final call to the LLM um where we have a more this is this is the input now the input is longer .

So we have kept track of these intermediate steps .

Um And we can see the output here if we want to see even more exactly what is going on .

We can , we can do this , we now have access to the prompt templates and the parsing .

So here if we go to the , the chat prompt template , um we can see that the input is we have a few different things .

So we have the input , uh which is the input key .

Um And this is the original question that we asked , but now we also have these intermediate steps .

So this is the tool call um as well as the observation um that we built up over time .

Um And , and , and uh are now using to basically tell the agent .

Hey , I've already done this , I've already looked it up .

This is where I got back .

Can I respond ?

Now , the last thing that I want to briefly cover is the streaming part of the agent .

Um So here , you know , when we invoked it , it actually took quite a while to stream back the responses .

video content Image generated by Wilowrid

So if we stream here , the things that we streaming aren't tokens , they're the steps from the agent .

Um And so we can get first uh the action that it takes .

Uh We then get back the observation from that action .

Um And then we get back here the final output so we can use this to stream the steps that are being taken and communicate that to end users so that they know that stuff is happening .

Agents um have always been a huge part of lang chain and more generally just tool calling in general .

And so we've put a lot of emphasis on making it really clear how to use agents in lang chain .

We've added this conceptual guide , we have added this table of uh uh uh different agent types and we have added a bunch of guides on how to build a custom agent .

This is maybe one of the most popular ones because it actually goes through what an agent built with L cell looks like .

video content Image generated by Wilowrid

And so , you know , it's more complicated than calling a single line , but you have way more control over the inputs , the formatting , the prompts of parsing that's being done .

So we've added a really comprehensive custom agent guide .

We've added a guide for streaming .

Um We've added a guide for building an agent that returns structured output .

We'll probably add this as an agent type by itself soon .

Um And then lots of functionality around the agent executor itself , including using it as an iterable handling , parsing errors .

Um and a lot of other important things .

Wilowrid Advertisement
Original video



Partnership

Attention YouTube vloggers and media companies!
Are you looking for a way to reach a wider audience and get more views on your videos?
Our innovative video to text transcribing service can help you do just that.
We provide accurate transcriptions of your videos along with visual content that will help you attract new viewers and keep them engaged. Plus, our data analytics and ad campaign tools can help you monetize your content and maximize your revenue.
Let's partner up and take your video content to the next level!
Contact us today to learn more.