
LANGCHAIN — How to Make GPT-4 Output Structured Data Using LangChain
The best way to predict the future is to invent it. — Alan Kay.
If you want GPT-4 to output structured data, LangChain’s output fixing parser can help. It wraps another output parser, and when the model’s output fails to parse, it sends the badly formatted output back to an LLM with a focused prompt asking for a corrected version. Below is a step-by-step guide on how to set this up with LangChain.
First, install the required dependencies by running the following commands:
yarn add langchain
yarn add zod
Next, you can use the following code snippet to implement the solution:
import { z } from "zod";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
import { LLMChain } from "langchain/chains";
import {
  StructuredOutputParser,
  OutputFixingParser
} from "langchain/output_parsers";
const outputParser = StructuredOutputParser.fromZodSchema(
  z.array(
    z.object({
      fields: z.object({
        Name: z.string().describe("The name of the country"),
        Capital: z.string().describe("The country's capital")
      })
    })
  ).describe("An array of Airtable records, each representing a country")
);
const chatModel = new ChatOpenAI({
  modelName: "gpt-4", // Or gpt-3.5-turbo
  temperature: 0 // For best results with the output fixing parser
});
const outputFixingParser = OutputFixingParser.fromLLM(
  chatModel,
  outputParser
);
const prompt = new PromptTemplate({
  template: `Answer the user's question as best you can:\n{format_instructions}\n{query}`,
  inputVariables: ["query"],
  partialVariables: {
    format_instructions: outputFixingParser.getFormatInstructions()
  }
});
// For those unfamiliar with LangChain: LLMChain is a class that formats
// the prompt, calls the LLM, and passes the response to the output parser
const answerFormattingChain = new LLMChain({
  llm: chatModel,
  prompt: prompt,
  // For readability - otherwise the chain output will default to a property named "text"
  outputKey: "records",
  outputParser: outputFixingParser
});
const result = await answerFormattingChain.call({
  query: "List 5 countries."
});
console.log(JSON.stringify(result.records, null, 2));
In the code above, we first define a Zod schema to specify the structure of the expected output. Then we wrap the resulting structured output parser in an output fixing parser backed by the same chat model, and finally we call the LLM chain to obtain the structured output.
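To make the expected shape concrete, here is a minimal sketch of the TypeScript type that the schema above describes, along with a hand-written check that does roughly what the Zod validation does. The names CountryRecord and isCountryRecords are hypothetical and exist only for this illustration, and the sample value is made-up example data, not real model output.

```typescript
// Hypothetical names for illustration; not part of LangChain or Zod.
type CountryRecord = {
  fields: {
    Name: string; // The name of the country
    Capital: string; // The country's capital
  };
};

// Hand-written runtime check mirroring what the Zod schema validates:
// an array of objects, each with a "fields" object holding two strings.
function isCountryRecords(value: unknown): value is CountryRecord[] {
  if (!Array.isArray(value)) return false;
  return value.every((item) => {
    if (typeof item !== "object" || item === null) return false;
    const fields = (item as { fields?: unknown }).fields;
    if (typeof fields !== "object" || fields === null) return false;
    const { Name, Capital } = fields as { Name?: unknown; Capital?: unknown };
    return typeof Name === "string" && typeof Capital === "string";
  });
}

// A sample value that matches the schema (example data only).
const sample: unknown = JSON.parse(
  '[{"fields": {"Name": "France", "Capital": "Paris"}}]'
);
console.log(isCountryRecords(sample)); // prints true
```

In practice you would let Zod do this validation; the sketch is only meant to show what "matching the schema" means for the records array.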
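With this solution, badly formatted output from the LLM is sent back to the model for repair instead of breaking the next steps of your pipeline. The repair is itself an LLM call, so it can still fail, and in that case the parser throws an error that your code should handle.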
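If you are curious what the fixing step amounts to, the idea can be sketched as a parse, fix, and parse-again routine. This is a simplified illustration, not LangChain’s actual implementation: parseWithFix, Parser, Fixer, and fakeFixer are hypothetical names, and the stand-in fixer only removes a trailing comma, where the real parser would ask the LLM to correct the output.

```typescript
// All names here are hypothetical; this is not LangChain's implementation.
type Parser<T> = (text: string) => T;
type Fixer = (badOutput: string, error: string) => Promise<string>;

// Try the base parser first. If it throws, hand the bad output and the
// error message to the fixer, then parse the repaired text once more.
async function parseWithFix<T>(
  text: string,
  parse: Parser<T>,
  fix: Fixer
): Promise<T> {
  try {
    return parse(text);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    const repaired = await fix(text, message);
    return parse(repaired); // A second failure propagates to the caller.
  }
}

// Stand-in for the LLM call: it only strips a trailing comma before "]".
// A real fixer would send the bad output and the error to the model.
const fakeFixer: Fixer = async (badOutput) =>
  badOutput.replace(/,\s*\]/g, "]");

async function demo(): Promise<unknown> {
  // Invalid JSON: note the trailing comma before the closing bracket.
  const malformed = '[{"fields": {"Name": "France", "Capital": "Paris"}},]';
  return parseWithFix(malformed, (t) => JSON.parse(t) as unknown, fakeFixer);
}

demo().then((records) => console.log(JSON.stringify(records)));
```

The fixer is only called when the first parse throws, so well-formed input in this sketch goes straight through without a second call.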
By using LangChain’s output fixing parsers, you can make the most out of GPT-4’s capabilities while ensuring that the output is in a structured format, ready for further processing or storage.






