
LANGCHAIN — What Is the State of AI in 2024?
In the age of technology, ignorance is a choice. — Donny Miller
In 2023, the AI landscape witnessed a surge in interest in Generative AI, particularly following the emergence of ChatGPT. This has led to a myriad of questions about the incorporation of GenAI into products, the best reference architectures to follow, the most suitable models for specific use cases, the technology stack to use, and the testing of LLM applications.
LangChain is well placed to show how teams are actually building with LLMs, drawing on anonymized metadata from LangSmith, its cloud platform. This article covers what people are building, the usage of LangChain Expression Language (LCEL), the most used LLM providers, open-source model providers, vectorstores, and embeddings, and the top advanced retrieval strategies. It also discusses how developers test LLM applications and which metrics they attach to those tests.
What are People Building?
LangSmith has been a valuable tool for understanding what people are building. Although LangSmith integrates seamlessly with LangChain, it is also widely used outside the LangChain ecosystem. Retrieval has emerged as a dominant way to combine data with LLMs, and LangChain offers integrations with 60+ vectorstores. Agents, which let the LLM decide what to do next, also account for a significant share of complex queries, although reliability and performance remain concerns.
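To make the retrieval pattern concrete, here is a minimal sketch in plain Python. It is an illustration under loud assumptions, not LangChain or LangSmith code: the `embed`, `cosine`, and `retrieve` helpers are hypothetical names, and the "embedding" is a toy word count standing in for a real embedding model and vectorstore.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding" (assumption): word counts instead of a learned model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


documents = [
    "chroma is a local vectorstore",
    "agents let the model decide which tool to call",
    "pinecone is a hosted vectorstore",
]
question = "which vectorstore is hosted"
top = retrieve(question, documents, k=1)

# The retrieved text is placed into the prompt that would be sent to an LLM.
prompt = f"Answer using this context: {top[0]}\nQuestion: {question}"
```

In a real application, the word counts would be replaced by an embedding provider and the sorted list by a vectorstore query, but the shape stays the same: embed, rank, then insert the best matches into the prompt.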
LCEL Usage
LangChain Expression Language (LCEL) has gained traction as an easy way to compose components, which makes it well suited to building complex, customized chains. Usage of LCEL has increased rapidly over the past few months as more features were added and the documentation improved.
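The composition idea behind LCEL can be sketched without installing anything. LCEL chains components with the `|` operator and runs them with `invoke`; the `Step` class below is a hypothetical stand-in that mimics that pattern in plain Python, and the "model" is a fake function rather than a real LLM call. It is an illustration of the idea, not LangChain's implementation.

```python
from typing import Any, Callable


class Step:
    """Hypothetical composable component (not a LangChain class)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Toy components: a prompt template, a fake "model", and an output parser.
prompt = Step(lambda topic: f"Tell me a fact about {topic}.")
fake_model = Step(lambda text: {"content": text.upper()})
parser = Step(lambda message: message["content"])

# Components are piped together into one chain and invoked as a unit.
chain = prompt | fake_model | parser
result = chain.invoke("vectorstores")
```

Because every step exposes the same `invoke` interface, any step can be swapped or extended without touching the rest of the chain, which is the property that makes this style convenient for customized pipelines.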
Most Used LLM Providers
OpenAI has emerged as the leading LLM provider of 2023, with Azure OpenAI following closely. Other hosting services and open-source model providers such as Hugging Face, Fireworks AI, and Ollama have also gained significant traction.
Most Used OSS Model Providers
Developers primarily run open-source models locally, and providers that offer API access to these models are also widely used. Fireworks AI, Replicate, Together, and Anyscale are among the top providers offering API access to OSS models.
Most Used Vectorstores
Local vectorstores such as Chroma, FAISS, Qdrant, and DocArray have been extensively used, along with hosted offerings like Pinecone and Weaviate. Additionally, databases with integrated vector functionality, including Postgres (PGVector) and Supabase, have also seen significant usage.
Most Used Embeddings
OpenAI and Hugging Face have been the most widely used providers for calculating embeddings for pieces of text. Vertex AI has also gained popularity as a hosted provider, along with other providers like Cohere and Amazon Bedrock.
Top Advanced Retrieval Strategies
Developers have been relying on advanced retrieval strategies such as self-query, hybrid search, contextual compression, multi-query, and time-weighted vectorstore retrieval to improve the retrieval process.
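As one illustration of hybrid search, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). This is an assumption about how the two result lists might be combined, not a description of how any particular vectorstore or LangChain retriever does it; the document IDs and the `reciprocal_rank_fusion` helper are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each document earns 1 / (k + rank) from every ranking it appears in;
    # k dampens the influence of any single top position.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a keyword search and a vector search.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_c", "doc_a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
```

Here `doc_b` comes out first because it ranks well in both lists, even though it is not the top keyword hit. Rewarding agreement between the lexical and semantic signals is the point of combining them.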
How Are People Testing?
LangSmith has emerged as one of the best ways to evaluate and test LLM applications, with most users formulating metrics to evaluate their apps. Common evaluators include LLM evaluators and custom evaluators, which reflects how application-specific evaluation tends to be.
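A custom evaluator is often just a function that compares an output to a reference and returns a named score. The sketch below assumes that shape. The function names, the `key`/`score` result format, and the example data are all illustrative assumptions, and the code does not call the LangSmith SDK.

```python
from typing import Callable

# An evaluator takes (prediction, reference) and returns a named score.
Evaluator = Callable[[str, str], dict]


def exact_match(prediction: str, reference: str) -> dict:
    # 1.0 when the answers match, ignoring case and surrounding whitespace.
    matched = prediction.strip().lower() == reference.strip().lower()
    return {"key": "exact_match", "score": float(matched)}


def contains_reference(prediction: str, reference: str) -> dict:
    # 1.0 when the reference answer appears anywhere in the prediction.
    found = reference.lower() in prediction.lower()
    return {"key": "contains_reference", "score": float(found)}


def run_evaluators(
    examples: list[tuple[str, str]], evaluators: list[Evaluator]
) -> dict[str, float]:
    # Collect every evaluator's scores, then average them per metric.
    collected: dict[str, list[float]] = {}
    for prediction, reference in examples:
        for evaluator in evaluators:
            result = evaluator(prediction, reference)
            collected.setdefault(result["key"], []).append(result["score"])
    return {key: sum(values) / len(values) for key, values in collected.items()}


# Hypothetical (prediction, reference) pairs from a test run.
examples = [
    ("Paris", "paris"),
    ("The capital is Berlin.", "Berlin"),
    ("I am not sure.", "Madrid"),
]
summary = run_evaluators(examples, [exact_match, contains_reference])
```

The two metrics disagree on the second example, a small illustration of why one test run may need several kinds of feedback rather than a single score.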
What Are People Testing?
Developers are primarily concerned with evaluating the correctness of their applications, and test runs often have several types of feedback attached. This suggests that a single evaluation metric is hard to find and that diverse evaluation techniques are needed.
Conclusion
As the first real year of LLM app development comes to a close, LangSmith has become the dominant way for teams to bring their applications from prototype to production, whether or not they are using LangChain. These usage statistics provide insights into what people are building, how they are building it, and how they are testing their applications.






