LANGCHAIN — Announcing LangSmith

Software and cathedrals are much the same — first we build them, then we pray. — Sam Redwine.

LangChain has announced LangSmith, a unified platform that helps developers debug, test, evaluate, and monitor LLM applications. LangSmith aims to bridge the gap between an LLM-powered prototype and a production application. In this tutorial, we’ll explore the key features and benefits of LangSmith, along with code snippets and examples demonstrating its functionality.

Debugging

LangSmith offers full visibility into model inputs and outputs at each step in the chain of events. This allows teams to experiment with new chains and prompt templates, identify the source of unexpected results, errors, or latency issues, and track token usage and performance. Below is an example of how to retrieve the inputs and outputs recorded for a single run with the LangSmith Python client:

# Example: inspect the inputs and outputs recorded for a traced run
# (assumes a recent langsmith SDK; method names can differ across versions)
from langsmith import Client

# Initialize LangSmith client
client = Client(api_key='your_api_key')

# Fetch a single traced run by its ID
run = client.read_run('your_run_id')

# Print the inputs and outputs recorded for that run
print(run.inputs)
print(run.outputs)
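
To make "visibility at each step" concrete, here is a minimal offline sketch of the idea in plain Python. It is not LangSmith's implementation: the `traced` decorator, the `TRACE` list, and the two stub steps are hypothetical names invented for this illustration, and `fake_llm` stands in for a real model call.

```python
import time
from functools import wraps

TRACE = []  # one record per traced step, in the order the steps finish


def traced(step_name):
    """Record inputs, output, and latency for each call to the wrapped step."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            output = fn(*args, **kwargs)
            TRACE.append({
                "step": step_name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": output,
                "latency_s": time.perf_counter() - start,
            })
            return output
        return wrapper
    return decorator


@traced("format_prompt")
def format_prompt(question):
    return f"Answer briefly: {question}"


@traced("fake_llm")
def fake_llm(prompt):
    # Stand-in for a model call so the sketch runs without network access
    return prompt.upper()


def chain(question):
    return fake_llm(format_prompt(question))


result = chain("what is tracing?")
for record in TRACE:
    print(record["step"], record["inputs"], "->", record["output"])
```

Each record holds what a step received, what it returned, and how long it took, which is the kind of information a tracing tool surfaces when you are hunting for the step that produced an unexpected result.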

Testing

Developers often need to understand how a change to a chain or prompt affects its outputs. LangSmith simplifies creating datasets, either from existing traces or from manually curated data, and lets you run chains and prompts over them. The following code snippet illustrates how to run a chain over a dataset using LangSmith:

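# Example: create a dataset and run a chain over it using LangSmith
# (assumes a recent langsmith SDK; method names can differ across versions)
from langsmith import Client
from langsmith.evaluation import evaluate

# Initialize LangSmith client
client = Client(api_key='your_api_key')

# Create a dataset from manually curated data
dataset = client.create_dataset(dataset_name='your_dataset_name')
client.create_example(inputs={'question': 'your_question'}, outputs={'answer': 'your_reference_answer'}, dataset_id=dataset.id)

def run_chain(inputs: dict) -> dict:
    # Stub target: replace with a call to your own chain or prompt
    return {'answer': 'your_chain_output'}

# Run the chain over every example in the dataset and record the results
results = evaluate(run_chain, data='your_dataset_name', client=client)
print(results)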
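
The underlying workflow, running two versions of a chain over the same examples and seeing which outputs changed, can be sketched without any service at all. The snippet below is only an illustration under that assumption: `chain_v1`, `chain_v2`, and `run_over_dataset` are hypothetical stand-ins, not LangSmith functions.

```python
# A tiny hand-curated dataset: each example pairs an input with a reference output
dataset = [
    {"input": "  Hello World ", "expected": "hello world"},
    {"input": "LangSmith", "expected": "langsmith"},
]


def chain_v1(text):
    # Hypothetical baseline chain: lowercases the input
    return text.lower()


def chain_v2(text):
    # Hypothetical revised chain: also strips surrounding whitespace
    return text.strip().lower()


def run_over_dataset(chain, examples):
    """Run one chain over every example, keeping outputs next to references."""
    return [
        {"input": ex["input"], "expected": ex["expected"], "output": chain(ex["input"])}
        for ex in examples
    ]


results = {
    name: run_over_dataset(chain, dataset)
    for name, chain in {"v1": chain_v1, "v2": chain_v2}.items()
}

# List the inputs whose output differs between the two versions
changed = [
    a["input"]
    for a, b in zip(results["v1"], results["v2"])
    if a["output"] != b["output"]
]
print(changed)
```

Keeping the dataset fixed while only the chain varies is what makes the comparison meaningful: any difference in output can be attributed to the change you made.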

Evaluating

LangSmith integrates with open-source evaluation modules and supports two kinds of evaluators: heuristic evaluators, which apply fixed rules, and LLM-assisted evaluators, which ask a model to grade the output. The code snippet below demonstrates how to perform LLM-assisted evaluation using LangSmith:

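# Example: LLM-assisted evaluation over a dataset using LangSmith
from langsmith import Client
from langsmith.evaluation import LangChainStringEvaluator, evaluate

# Initialize LangSmith client
client = Client(api_key='your_api_key')

# 'qa' wraps LangChain's LLM-graded correctness evaluator; it needs the
# langchain package and credentials for the grading model
qa_evaluator = LangChainStringEvaluator('qa')

def run_chain(inputs: dict) -> dict:
    # Stub target: replace with a call to your own chain or prompt
    return {'answer': 'your_chain_output'}

# Grade each output against the reference answers stored in the dataset
evaluation_results = evaluate(run_chain, data='your_dataset_name', evaluators=[qa_evaluator], client=client)
print(evaluation_results)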
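
The difference between the two evaluator types is easier to see side by side. The following is a rough sketch, not the LangSmith or LangChain evaluator API: `exact_match`, `make_llm_evaluator`, and `fake_judge` are hypothetical names, and `fake_judge` replaces a real model call with a simple containment check so the example runs offline.

```python
from typing import Callable


def exact_match(output: str, reference: str) -> float:
    """Heuristic evaluator: 1.0 when the strings match after normalization."""
    return float(output.strip().lower() == reference.strip().lower())


def make_llm_evaluator(judge: Callable[[str], str]) -> Callable[[str, str], float]:
    """LLM-assisted evaluator: ask a judge for a CORRECT/INCORRECT verdict."""
    def evaluator(output: str, reference: str) -> float:
        prompt = (
            "Reference answer: " + reference + "\n"
            "Submitted answer: " + output + "\n"
            "Reply with CORRECT or INCORRECT."
        )
        verdict = judge(prompt).strip().upper()
        return float(verdict.startswith("CORRECT"))
    return evaluator


def fake_judge(prompt: str) -> str:
    # Offline stand-in for a model: accepts answers that contain the reference
    lines = prompt.split("\n")
    reference = lines[0].split(": ", 1)[1]
    submitted = lines[1].split(": ", 1)[1]
    return "CORRECT" if reference.lower() in submitted.lower() else "INCORRECT"


pairs = [
    ("Paris", "Paris"),
    ("The capital is Paris.", "Paris"),
    ("Lyon", "Paris"),
]
llm_eval = make_llm_evaluator(fake_judge)
scores = [(exact_match(out, ref), llm_eval(out, ref)) for out, ref in pairs]
print(scores)
```

The second pair shows why a graded evaluator can be worth its cost: a correct but differently worded answer fails the exact-match rule while the judge accepts it.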

Monitoring

LangSmith enables developers to track system-level performance such as latency and cost, track model and chain performance, debug issues, and understand how users interact with their applications. Below is an example of summarizing system-level performance from recorded runs using LangSmith:

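# Example: summarize latency and token usage from recent traced runs
# (assumes a recent langsmith SDK; method names can differ across versions)
from langsmith import Client

# Initialize LangSmith client
client = Client(api_key='your_api_key')

# Fetch recent top-level runs for a project
runs = list(client.list_runs(project_name='your_project_name', is_root=True, limit=100))

# Average latency in seconds and total tokens (a proxy for cost) over finished runs
finished = [r for r in runs if r.end_time is not None]
latency = sum((r.end_time - r.start_time).total_seconds() for r in finished) / max(len(finished), 1)
total_tokens = sum(r.total_tokens or 0 for r in finished)
print(latency, total_tokens)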
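
Once run records are available, the monitoring metrics themselves are simple aggregates. The sketch below assumes a made-up record shape and a made-up flat token price (`PRICE_PER_1K_TOKENS`); neither comes from LangSmith, and real pricing depends on the model you use.

```python
from statistics import mean

# Hypothetical run records, shaped like what a tracing tool might export
runs = [
    {"latency_s": 0.8, "tokens": 120, "error": None},
    {"latency_s": 1.2, "tokens": 200, "error": None},
    {"latency_s": 3.5, "tokens": 0, "error": "timeout"},
    {"latency_s": 0.9, "tokens": 150, "error": None},
]

PRICE_PER_1K_TOKENS = 0.002  # assumed flat price, for illustration only


def summarize(records):
    """Aggregate per-run records into system-level metrics."""
    latencies = [r["latency_s"] for r in records]
    total_tokens = sum(r["tokens"] for r in records)
    return {
        "mean_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        "error_rate": sum(r["error"] is not None for r in records) / len(records),
        "total_tokens": total_tokens,
        "estimated_cost": total_tokens / 1000 * PRICE_PER_1K_TOKENS,
    }


summary = summarize(runs)
for name, value in summary.items():
    print(name, value)
```

Watching the maximum latency and error rate alongside the mean helps here, because a single slow or failed run (the timeout above) barely moves the average but is exactly what users notice.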

By bringing debugging, testing, evaluation, and monitoring together in a single integrated platform, LangSmith helps developers streamline the process of taking LLM application prototypes to production.
