Fabio Chiusano



Two minutes NLP — Quick tips to make your semantic search projects painless

Semantic search, embeddings, symmetric vs asymmetric search, and embeddings storage

Photo by Kelly Sikkema on Unsplash

What is semantic search?

Semantic search is a search technique that goes beyond matching keywords: it tries to determine the intent and contextual meaning of the words a person uses in a query.

The idea behind semantic search

The idea behind semantic search is to embed all entries in your corpus, which can be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found.
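The lookup step described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the three-dimensional vectors are hypothetical stand-ins for what an embedding model would produce, and `cosine_similarity` and `nearest_entries` are helper names made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_entries(query_vec, corpus_vecs, top_k=2):
    """Return (index, score) pairs for the top_k corpus vectors closest to the query."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical corpus and hand-written embeddings (assumption: a real model
# would compute these vectors, usually with hundreds of dimensions).
corpus = [
    "A film about space travel",
    "A recipe for tomato soup",
    "A documentary on rockets",
]
corpus_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.8, 0.3, 0.1],
]
# Pretend embedding of the query "movies about astronauts".
query_vec = [1.0, 0.2, 0.0]

for index, score in nearest_entries(query_vec, corpus_vecs):
    print(f"{score:.3f}  {corpus[index]}")
```

With these toy vectors, the two space-related entries rank above the soup recipe. A brute-force scan like this is fine for small corpora; larger ones usually call for a dedicated vector index.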

Embeddings model tips

Consider Sentence-BERT (https://www.sbert.net/index.html) to compute the embeddings of your documents. It makes it easy to fine-tune sentence embeddings for your specific task.

Symmetric vs asymmetric search

Whether your search is symmetric or asymmetric is a critical distinction, because it determines which model you should choose.

  • For symmetric semantic search, your query and the entries in your corpus are of about the same length and have the same amount of content.
  • For asymmetric semantic search, you usually have a short query (like a question or some keywords) and you want to find a longer paragraph answering the query.
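One way to make this choice explicit in code is to map each search type to a model name before embedding anything. The sketch below assumes the `sentence-transformers` package (the library behind Sentence-BERT) is installed. The two model names are example picks for illustration, not recommendations from this article, and `pick_model_name` and `search` are hypothetical helpers.

```python
# Example model names per search type. These are assumptions for illustration;
# check the Sentence-BERT documentation for models that fit your data.
MODEL_BY_SEARCH_TYPE = {
    "symmetric": "all-MiniLM-L6-v2",
    "asymmetric": "multi-qa-MiniLM-L6-cos-v1",
}


def pick_model_name(search_type):
    """Map 'symmetric' or 'asymmetric' to an example model name."""
    try:
        return MODEL_BY_SEARCH_TYPE[search_type]
    except KeyError:
        raise ValueError(f"unknown search type: {search_type!r}") from None


def search(query, corpus, search_type, top_k=3):
    """Embed the corpus and the query with the chosen model and return the best matches."""
    # Imported here so the heavy dependency is only loaded when a search runs.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer(pick_model_name(search_type))
    corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
    query_embedding = model.encode(query, convert_to_tensor=True)
    # semantic_search returns one list of hits per query; each hit is a dict
    # with the keys "corpus_id" and "score".
    hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=top_k)[0]
    return [(corpus[hit["corpus_id"]], hit["score"]) for hit in hits]
```

Calling `search("How do plants make food?", paragraphs, "asymmetric")` on a list of longer paragraphs would download the model on first use and return `(text, score)` pairs. In a real project you would load the model and embed the corpus once, then reuse both across queries.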

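Where to store the document embeddings

  • ElasticSearch: https://www.elastic.co/blog/text-similarity-search-with-vectors-in-elasticsearch
  • FAISS (Facebook AI Similarity Search): https://ai.facebook.com/tools/faiss/
  • Annoy: https://github.com/spotify/annoy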

Sample project with code

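The article "Semantic Search with S-BERT is all you need" (https://readmedium.com/semantic-search-with-s-bert-is-all-you-need-951bc710e160) is a project example in which the author builds a search bar that lets users search for movies from the Wikipedia Movie Plots dataset using semantic search.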


