Two minutes NLP — Quick tips to make your semantic search projects painless
Semantic search, embeddings, symmetric vs asymmetric search, and embeddings storage
What is semantic search?
Semantic search is a search technique that goes beyond matching keywords: it tries to determine the intent and contextual meaning behind the words a person uses in a query.
The idea behind semantic search
The idea behind semantic search is to embed all entries in your corpus, which can be sentences, paragraphs, or documents, into a vector space. At search time, the query is embedded into the same vector space and the closest embeddings from your corpus are found.
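As a minimal sketch of the idea above, the following snippet ranks corpus entries by cosine similarity to a query embedding. The three-dimensional vectors are hand-made placeholders, not real model output, and the helper names `cosine_similarity` and `search` are hypothetical. In a real project the vectors would come from an embedding model, and cosine similarity is only one common choice of closeness measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(query_embedding, corpus_embeddings, top_k=3):
    """Return (corpus_index, score) pairs for the top_k closest entries."""
    scored = [
        (idx, cosine_similarity(query_embedding, emb))
        for idx, emb in enumerate(corpus_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Assumption: toy vectors standing in for embeddings produced by a model.
corpus = [
    "A cat sleeps on the sofa",
    "Stock markets fell today",
    "A kitten naps on the couch",
]
corpus_embeddings = [
    [0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9],
    [0.8, 0.2, 0.1],
]
query_embedding = [1.0, 0.0, 0.0]  # placeholder for an embedded query

hits = search(query_embedding, corpus_embeddings, top_k=2)
for idx, score in hits:
    print(f"{score:.3f}  {corpus[idx]}")
```

With these placeholder vectors, the two cat-related sentences rank above the unrelated one. This brute-force scan compares the query against every entry, which is fine for small corpora but may need an approximate nearest-neighbor index as the corpus grows.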
Embeddings model tips
Consider using Sentence-BERT to compute embeddings of your documents. It makes it easy to fine-tune sentence embeddings for your specific task.
Symmetric vs asymmetric search
Whether your search is symmetric or asymmetric is a critical distinction, because it determines which model is the right choice.
- For symmetric semantic search, your query and the entries in your corpus are of about the same length and have the same amount of content.
- For asymmetric semantic search, you usually have a short query (like a question or some keywords) and you want to find a longer paragraph answering the query.
Where to store the document embeddings
Sample project with code
In this example project, the author builds a search bar that lets users find movies from the Wikipedia Movie Plots dataset using semantic search.

