Laxfed Paulacy

LANGCHAIN: What Is Dataherald?

First, solve the problem. Then, write the code. — John Johnson.

Dataherald is an open-source natural-language-to-SQL (NL-to-SQL) engine built on LangChain that focuses on accurate semantic translation. It targets a known weakness of modern large language models (LLMs): they tend to write procedural code better than SQL, in part because they lack metadata about the target database and struggle with complex queries. In this tutorial, we’ll explore how Dataherald works and the LangChain agents underneath it.

How Dataherald Works

Dataherald offers an open-source NL-to-SQL engine, with an option for a hosted API. It allows users to add business context, create training data, and fine-tune LLMs to their schema. The core of the product consists of two LangChain agents that perform the NL-to-SQL translation.

RAG Agent

The RAG (retrieval-augmented generation) agent is used when developers lack a substantial set of sample question-SQL pairs for fine-tuning or training the LLM. It connects to the database, extracts the information needed for SQL generation, and relies on tools including a schema-linking tool, a SQL execution tool, and a few-shot sample retriever.

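# Illustrative sketch of connecting to the database and extracting schema information.
# The module, class, and method names below may not match the actual Dataherald
# package; check the Dataherald documentation for the real interface.
from dataherald.rag_agent import RAGAgent

rag_agent = RAGAgent()
rag_agent.connect_to_database("your_database_credentials")  # placeholder credentials
table_schema = rag_agent.extract_table_schema()
# Other essential information extraction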
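To make the RAG agent's inputs more concrete, here is a minimal, self-contained sketch of two of the steps described above: reading table schemas from a database and retrieving the stored question-SQL pairs closest to a new question. It is not Dataherald's implementation. It assumes an in-memory SQLite database, uses simple word overlap as a stand-in for embedding similarity, and the function names (`extract_table_schema`, `token_overlap`, `retrieve_few_shot`) are hypothetical.

```python
import sqlite3


def extract_table_schema(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each user table to its column names using SQLite's catalog."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in columns]
    return schema


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (a stand-in for embeddings)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def retrieve_few_shot(
    question: str, samples: list[tuple[str, str]], k: int = 2
) -> list[tuple[str, str]]:
    """Return the k stored (question, sql) pairs most similar to the question."""
    ranked = sorted(
        samples, key=lambda pair: token_overlap(question, pair[0]), reverse=True
    )
    return ranked[:k]


# Hypothetical demo data: two tables and three sample question-SQL pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

samples = [
    ("how many orders are there", "SELECT COUNT(*) FROM orders"),
    ("list all customer names", "SELECT name FROM customers"),
    ("what is the total revenue from orders", "SELECT SUM(total) FROM orders"),
]

schema = extract_table_schema(conn)
examples = retrieve_few_shot("how many orders are there in total", samples, k=1)
```

The schema and the retrieved examples would then be placed in the prompt sent to the LLM, which is the sense in which the agent is retrieval-augmented.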

Agent with LLM-as-a-Tool

Once more than 10 golden SQL examples (verified question-SQL pairs) exist per table, the more advanced agent becomes an option. It involves fine-tuning a model and giving the agent access to that LLM as a tool. This agent also executes the generated SQL queries against the database to validate their correctness and retrieve necessary information.
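The switch between the two agents can be sketched as a simple check on how many golden SQL examples exist for each table. This is only one reading of the rule of thumb above (every table must exceed the threshold), and the function name, the return labels, and the input format (one table name per golden SQL example) are assumptions made for illustration.

```python
from collections import Counter

# "More than 10 golden SQL per table", as stated in the text.
GOLDEN_SQL_THRESHOLD = 10


def choose_agent(golden_sql_tables: list[str], schema_tables: list[str]) -> str:
    """Pick 'fine_tuned' only when every table has more than the threshold
    number of golden SQL examples; otherwise fall back to 'rag'.

    golden_sql_tables holds one table name per golden SQL example.
    """
    counts = Counter(golden_sql_tables)
    enough = all(counts[table] > GOLDEN_SQL_THRESHOLD for table in schema_tables)
    if schema_tables and enough:
        return "fine_tuned"
    return "rag"


# Hypothetical usage: 11 examples for "orders", none for "customers".
agent_for_orders_only = choose_agent(["orders"] * 11, ["orders"])
agent_for_both_tables = choose_agent(["orders"] * 11, ["orders", "customers"])
```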

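# Illustrative sketch of using the advanced agent with LLM-as-a-tool.
# The module, class, and method names below may not match the actual Dataherald
# package; check the Dataherald documentation for the real interface.
from dataherald.llm_agent import AdvancedAgent

advanced_agent = AdvancedAgent()
advanced_agent.fine_tune_model("your_dataset")  # placeholder dataset identifier
result = advanced_agent.execute_query("generated_sql")  # placeholder SQL string
# Other operations with the advanced agent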
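The idea of executing generated SQL to validate it can be illustrated without any Dataherald code. The sketch below assumes an in-memory SQLite database and a hypothetical helper, `validate_sql`, that runs a candidate query and returns either the rows or the database's error message.

```python
import sqlite3


def validate_sql(conn: sqlite3.Connection, sql: str):
    """Run a candidate query and return (ok, rows_or_error_message)."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        # The error text can be handed back to the model for another attempt.
        return False, str(exc)
    return True, rows


# Hypothetical demo table with two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(10.0,), (15.5,)])

# A valid query returns rows; a query naming a missing column returns an error.
ok, rows = validate_sql(conn, "SELECT COUNT(*), SUM(total) FROM orders")
bad_ok, error = validate_sql(conn, "SELECT amount FROM orders")
```

Because the SQL comes from a model, running it over a read-only connection or with a restricted database user is a sensible safeguard in a real deployment.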

Conclusion

Dataherald helps developers and data teams translate natural language queries into SQL efficiently, at companies of any size. It supports conversational interfaces and self-service data access. Planned developments include LangChain integration, broader support for open-source LLMs, and the ability for agents to ask follow-up questions.

If you’re struggling with NL-to-SQL translations and want to streamline the process, consider exploring Dataherald.

In this tutorial, we’ve explored how Dataherald, an open-source NL-to-SQL engine built on LangChain, works. We looked at the RAG agent and the advanced agent with LLM-as-a-tool, with code snippets for connecting to a database, extracting essential information, and executing SQL queries. Dataherald offers a promising approach to translating natural language queries into SQL accurately for a wide range of businesses.
