encoder and sorting results with the score:</p><div id="8043"><pre>'Title': 'Armed Response'
'Title': 'The Cape Canaveral Monsters'
'Title': 'Chappie'
'Title': 'Galactic Armored Fleet Majestic Prince: Genetic Awakening'
'Title': 'Small Soldiers'</pre></div><h2 id="be0c">BERT-score (Bert Token Embeddings+Token IDF weights)</h2><div id="7e08" class="link-block">
<a href="https://readmedium.com/multi-objective-ranking-in-large-scale-e-commerce-recommender-systems-9bab88bc00a8">
<div>
<div>
<h2>Multi-objective Ranking in Large-Scale E-commerce Recommender Systems</h2>
<div><h3>Motivation</h3></div>
<div><p>medium.com</p></div>
</div>
<div>
<div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/0*g8VzJ_PaMQXtL7Id)"></div>
</div>
</div>
</a>
</div><p id="a3d3">BERTScore leverages the pre-trained contextual embeddings from BERT and matches words in candidate and reference sentences by cosine similarity. It has been shown to correlate with human judgment on sentence-level and system-level evaluation. Moreover, BERTScore computes precision, recall, and F1 measure, which can be useful for evaluating different language generation tasks.</p><figure id="6228"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*yKuNUB2K2YkCXjXNszHGuA.png"><figcaption>Bert score</figcaption></figure><p id="e04d">Taking the query as the reference and each retrieved plot as a candidate, we compute the BERTScore F1 for every result.</p><div id="e0ac"><pre>ranked_results_bert = <span class="hljs-selector-attr">[]</span>
<span class="hljs-keyword">for</span> cand <span class="hljs-keyword">in</span> results:
    # score is bert_score.score; it returns precision, recall and F1 tensors
    P, R, F1 = <span class="hljs-built_in">score</span>(<span class="hljs-selector-attr">[cand[<span class="hljs-string">'Plot'</span>]</span>], ref, lang=<span class="hljs-string">'en'</span>)
    ranked_results_bert<span class="hljs-selector-class">.append</span>({<span class="hljs-string">'Title'</span>: cand<span class="hljs-selector-attr">[<span class="hljs-string">'Title'</span>]</span>, <span class="hljs-string">'Score'</span>: F1<span class="hljs-selector-class">.numpy</span>()<span class="hljs-selector-attr">[0]</span>})</pre></div><div id="a1bf"><pre><span class="hljs-symbol">Results:</span></pre></div><div id="4e86"><pre>'Title': 'Armed Response'
'Title': 'Chappie'
'Title': 'Small Soldiers'
'Title': 'Galactic Armored Fleet Majestic Prince: Genetic Awakening'
'Title': 'The Cape Canaveral Monsters'</pre></div><figure id="c93f"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*xMpdmz4w2bZLlZ-b51mLuw.png"><figcaption>Re-ranking Summary</figcaption></figure><p id="ef13">The diagram below roughly summarizes what we have covered in this series. What we have not discussed is taking user behavior into account when showing results, or making new recommendations based on past searches.</p><figure id="a6e3"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*ltHuuV_Dyok-E_4CE42N3g.png"><figcaption>credit: Eugene Ryan</figcaption></figure><h1 id="520c">Personalization in Search using Embeddings</h1><figure id="5de9"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*JseeLP_GtFYN-xfvzHYYNQ.png"><figcaption>Capturing User Interactions in Personalized Search</figcaption></figure><p id="be98">User behavior can be inferred from a user's own past item interactions and searches, or from users with similar interests; these are the well-known content-based and collaborative approaches to recommendation. They are helpful not only for recommendations but also for ranking results.</p><p id="c793"><i>Before we jump to any formulation, note that recommendations also depend on the recency and frequency of a user's interactions. The genres of movies that a user browsed or viewed recently reflect their current taste and area of interest, so we may want to give recent activity more weight when recommending items, while still taking frequency into account</i>.</p><p id="4f50">Let’s take movies as the example. <b>User A</b> has watched or interacted with <b>3 genres</b> of movies: action, drama, and romance.</p><p id="0b20">The data below shows the week-wise interaction history with each genre of movie. 
It clearly shows that the user's taste has shifted towards action movies, while their affinity for romance movies has declined.</p><figure id="d8af"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*5zLMbbVwAR57zRNPYx8wwA.png"><figcaption></figcaption></figure><figure id="da93"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*wpx8crf5Db46lsKUSeABlg.png"><figcaption>10-week User history</figcaption></figure><p id="2ff8">We use an exponential recency-weighted average to aggregate the user's browsing history. Below is the <b>Recommended For You (RFY) model</b> formula:</p><figure id="ce94"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*APtWEchr1_vwuWmH.png"><figcaption>RFY Weightage Formula</figcaption></figure><p id="2d61">This aligns with our assumption that the most recent browsing data contributes most to the prediction of the next action.
<b><i>x(i) is our item vector and w(i) is the weight assigned based on its recency.</i></b></p><p id="b226"><b>For alpha=0.5, a higher value of “<i>i</i>” (that is, a more recent item) is assigned a higher weight.
</b>I modified this slightly by applying a softmax across all the item weights before computing the weighted vector.</p><div id="1113"><pre>*************** <span class="hljs-keyword">method</span>-1 ****************
>> <span class="hljs-title function_">softmax</span><span class="hljs-params">(np.asarray([weight(10)</span>, <span class="hljs-title function_">weight</span><span class="hljs-params">(8)</span>, <span class="hljs-title function_">weight</span><span class="hljs-params">(0)</span>]))
<span class="hljs-title function_">array</span><span class="hljs-params">([0.43589772, 0.29958783, 0.26451446])</span></pre></div><p id="ae63">Method 1 uses only the most recent week's interaction count to compute the weight for each genre; it does not take the preceding 9-week window into account. So let us consider a measure that scores the latest observation against the user's complete watch-history distribution, i.e. the <b>z-score.</b></p><blockquote id="4f64"><p>action_movie_zscore = stats.zscore(user_watch_hist['action_movie'])[-1]
drama_movie_zscore = stats.zscore(user_watch_hist['drama_movie'])[-1]
romance_movie_zscore = stats.zscore(user_watch_hist['romance_movie'])[-1]</p></blockquote><div id="ce04"><pre>*************** <span class="hljs-keyword">method</span>-2 ****************
>> <span class="hljs-title function_">x</span> = <span class="hljs-title function_">np</span>.<span class="hljs-title function_">asarray</span><span class="hljs-params">([action_movie_zscore, drama_movie_zscore, romance_movie_zscore])</span>
>> <span class="hljs-title function_">softmax</span><span class="hljs-params">(x)</span>
<span class="hljs-title function_">array</span><span class="hljs-params">([0.56467654, 0.38601828, 0.04930517])</span></pre></div><p id="2be9">The z-score is easy to calculate and maintain, and it is a powerful metric that reflects how far the current observation is from the mean.
Comparing the softmax outputs of method-1 and method-2, it is clear that the latter reflects the user's behavior much more closely.
An added benefit of the <b>z-score based softmax method</b> is how it handles genres newly or recently added to the catalog. Suppose you have a new genre: you assign it a z-score of 0.</p><div id="f250"><pre>>> x = np<span class="hljs-selector-class">.asarray</span>(<span class="hljs-selector-attr">[action_movie_zscore, drama_movie_zscore, romance_movie_zscore, 0]</span>)
>> <span class="hljs-built_in">softmax</span>(x)
<span class="hljs-function"><span class="hljs-title">array</span><span class="hljs-params">([<span class="hljs-number">0.49878136</span>, <span class="hljs-number">0.34097171</span>, <span class="hljs-number">0.04355148</span>, <span class="hljs-number">0.11669544</span>])</span></span></pre></div><p id="9d3e">You can see from the weights that the new genre (about 0.117) receives more weight than romance movies (about 0.044), so the final recommendations gain diversity and freshness.</p><div id="1c2c"><pre><span class="hljs-keyword">from</span> sentence_transformers <span class="hljs-keyword">import</span> SentenceTransformer, util</pre></div><div id="bd83"><pre><span class="hljs-comment">#Compute embeddings of retrieved candidate movie plots</span>
<span class="hljs-attr">embeddings</span> = model.encode(candidate_plots)</pre></div><div id="d85d"><pre><span class="hljs-comment">#Compute cosine-similarities for each plot with user vector</span>
<span class="hljs-attr">cosine_scores</span> = util.pytorch_cos_sim(user_encoded_vector, embeddings)</pre></div><div id="709f"><pre><span class="hljs-comment">#Find the pairs with the highest cosine similarity scores</span>
<span class="hljs-attr">titles</span> = [x['Title'] for x in results]</pre></div><div id="05ca"><pre><span class="hljs-attr">ranked_user_behaviour</span> = [{'Title': x, 'Score': y} for x, y in zip(titles, cosine_scores.numpy()[<span class="hljs-number">0</span>])]
<span class="hljs-attr">ranked_user_behaviour</span> = sorted(ranked_user_behaviour, key=lambda x: x['Score'], reverse=<span class="hljs-literal">True</span>)</pre></div><figure id="bd23"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*7T7yNFTOCp5HJFTrenuVBw.png"><figcaption>Comparison Table</figcaption></figure><p id="6f40">We use the user behavior (<b>user_encoded_vector</b>) to re-rank the output shown to <b>User A</b>.</p><p id="8e8e">When the user returns to the home page, we use the same <b>user_encoded_vector</b> to fetch nearest-neighbor movies and recommend them. The recommendations will contain a lot of action movies, some drama, and a bit of romance too.</p><div id="2e69"><pre>#<span class="hljs-keyword">code</span></pre></div><div id="437a"><pre><span class="hljs-attr">t</span>=time.time()
<span class="hljs-attr">query_vector</span> = user_encoded_vector
<span class="hljs-attr">top_k</span> = index.search(query_vector, <span class="hljs-number">20</span>)
<span class="hljs-attr">top_k_ids</span> = top_k[<span class="hljs-number">1</span>].tolist()[<span class="hljs-number">0</span>]
<span class="hljs-attr">top_k_ids</span> = list(np.unique(top_k_ids))
<span class="hljs-section">[fetch_movie_info(idx) for idx in top_k_ids]</span></pre></div><div id="ece8"><pre><span class="hljs-meta">#output</span></pre></div><div id="8d35"><pre>>>>> Recommendation Results <span class="hljs-built_in">in</span> Total <span class="hljs-keyword">Time</span>: <span class="hljs-number">0.03316307067871094</span></pre></div><div id="b2dd"><pre>[{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Key</span> Witness'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'The</span> Good Mother'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Fire</span> on the Amazon'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Bang</span>'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'How</span> to Make a Monster'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Hammers</span> Over the Anvil'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'The</span> Nursemaid Who Disappeared'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Third</span> Person'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'You</span> Were Never Really Here'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Shadow</span> Dancing'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Remembrance</span>'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Small</span> Town Murder Songs'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Eadweard</span>'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Caught</span> in the Web'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Bounty</span> Hunters'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Fulltime</span> Killer'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Amar</span>'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'SMS</span>'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'JAKQ</span> Dengeki Tai'},
{<span class="hljs-symbol">'Title</span><span class="hljs-symbol">':</span> <span class="hljs-symbol">'Gekijō-ban</span> Tiger & Bunny -The Beginning'}]</pre></div><p id="fcfe">Code:</p><div id="965a" class="link-block">
<a href="https://github.com/99sbr/semantic-search-with-sbert/blob/main/search-ranking.ipynb">
<div>
<div>
<h2>99sbr/semantic-search-with-sbert</h2>
<div><h3>Build Semantic Search with S-BERT and Fine-tune your model in unsupervised way - 99sbr/semantic-search-with-sbert</h3></div>
<div><p>github.com</p></div>
</div>
<div>
<div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/0*laQTFoZfZGzf1e63)"></div>
</div>
</div>
</a>
</div><h2 id="3c32">What Next?</h2><ul><li>BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer</li><li>Training PRM (Personalized Re-ranking Model) using User Data</li><li>Tackling the Cold Start Problem by enhancing user geo and meta features</li><li>Graph: Learning from a user or item’s neighbors</li><li>Item2Item Similarity</li><li>RFM based Customer segmentation for targeted recommendation and branding.</li></ul></article></body>
Search, Rank, and Recommendations
Easy way to re-rank search results and personalized recommendations.
This post continues my previous post, “Semantic Search with S-BERT is all you need”.
That post left some important questions open:
methods for re-ranking results
quality of results, i.e. how well they align with the user query
how to incorporate user behavior when recommending results
Get Started: Overview
Search Ranking and Recommendations are fundamental problems of crucial interest to major Internet companies, including web search engines, content publishing websites, and marketplaces. However, despite sharing some common characteristics, a one-size-fits-all solution does not exist in this space. Given the large differences in the content that needs to be ranked, personalized, and recommended, each marketplace has a somewhat unique challenge.
Search and recommendations have a lot in common. They help users learn about new products, and need to retrieve and rank millions of products in a very short time (<150ms). They’re trained on similar data, have content and behavioral-based approaches, and optimize for engagement (e.g., click-through rate) and revenue (e.g., conversion, gross merchandise value).
Nonetheless, search differs in one key aspect — it has the user’s query as additional input. (Think of search as recommendations with the query as extra context.) This is a boon and a bane. It’s a boon because the query provides more context to help us help users find what they want; it’s a bane because users expect results to be in line with their query.
One common architecture for search and recommendation systems consists of the following components:
candidate generation
scoring
re-ranking
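As a rough illustration of how these three stages fit together, here is a minimal Python sketch. All names in it (generate_candidates, score_candidates, rerank, the toy catalog and its made-up plots) are hypothetical, and plain Jaccard token overlap stands in for the embedding retrieval and model-based scoring used in this series.

```python
from typing import Callable, Dict, List, Set

Doc = Dict[str, str]


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenizer; a stand-in for real text processing."""
    return set(text.lower().split())


def jaccard(query: str, text: str) -> float:
    """Toy relevance score: token overlap divided by token union."""
    q, t = tokenize(query), tokenize(text)
    union = q | t
    return len(q & t) / len(union) if union else 0.0


def generate_candidates(query: str, catalog: List[Doc], limit: int = 100) -> List[Doc]:
    """Stage 1: cheap, recall-oriented filter (keep docs sharing any query token)."""
    q = tokenize(query)
    return [doc for doc in catalog if q & tokenize(doc["Plot"])][:limit]


def score_candidates(query: str, candidates: List[Doc],
                     scorer: Callable[[str, str], float]) -> List[dict]:
    """Stage 2: attach a relevance score to every candidate."""
    return [{"Title": doc["Title"], "Score": scorer(query, doc["Plot"])}
            for doc in candidates]


def rerank(scored: List[dict], top_k: int = 5) -> List[dict]:
    """Stage 3: order by score, highest first, and keep the top_k."""
    return sorted(scored, key=lambda item: item["Score"], reverse=True)[:top_k]


# Made-up catalog, for illustration only.
catalog = [
    {"Title": "A", "Plot": "toy soldiers fight a robot army"},
    {"Title": "B", "Plot": "a robot learns to feel"},
    {"Title": "C", "Plot": "two friends open a bakery"},
]
query = "robot soldiers fight"
candidates = generate_candidates(query, catalog)
ranked = rerank(score_candidates(query, candidates, jaccard))
print([item["Title"] for item in ranked])
```

Keeping the stages as separate functions means the cheap candidate filter can stay fixed while the scorer is swapped for something more expensive, which is the idea behind the re-ranking step discussed next.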
In the previous post, we understood how to create a concept search by better understanding the data and fine-tuning our model to generate better candidates for a user search query.
However, the retrieval system might retrieve documents that are not that relevant for the search query. Hence, in a second stage, we use a re-ranker based on a cross-encoder that scores the relevancy of all candidates for the given search query.
Cross-Encoders
A Cross-Encoder performs full (cross) self-attention over a given input and label candidate, and tends to attain much higher accuracy than its bi-encoder counterpart. Cross-Encoders can be used whenever we have a pre-defined set of sentence pairs we want to score.
A re-ranker based on a Cross-Encoder can substantially improve the final results for the user. The query and a possible document are passed simultaneously to the transformer network, which then outputs a single score between 0 and 1 indicating how relevant the document is for the given query.
Since we are dealing with movie plots, I passed the user query together with each retrieved plot to the cross-encoder.
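The pairing step can be sketched as follows. This is a minimal sketch, not the notebook's code: rerank_with_pair_scorer and toy_predict are hypothetical names, and toy_predict is a token-overlap stand-in so the example runs without downloading a model. It is not a real cross-encoder.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Any callable that maps a list of (query, text) pairs to one score per pair.
PairScorer = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank_with_pair_scorer(query: str, results: List[Dict[str, str]],
                            predict: PairScorer,
                            text_key: str = "Plot") -> List[dict]:
    """Score every (query, plot) pair jointly and sort by descending score."""
    pairs = [(query, r[text_key]) for r in results]
    scores = predict(pairs)
    ranked = [{"Title": r["Title"], "Score": float(s)}
              for r, s in zip(results, scores)]
    return sorted(ranked, key=lambda r: r["Score"], reverse=True)


def toy_predict(pairs: List[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer (assumption): fraction of query tokens found in the text."""
    scores = []
    for q, text in pairs:
        q_tokens = q.lower().split()
        t_tokens = set(text.lower().split())
        hits = sum(tok in t_tokens for tok in q_tokens)
        scores.append(hits / max(len(q_tokens), 1))
    return scores


# Made-up retrieved results, for illustration only.
results = [
    {"Title": "A", "Plot": "a quiet family drama"},
    {"Title": "B", "Plot": "robot soldiers attack the town"},
]
ranked = rerank_with_pair_scorer("robot soldiers", results, toy_predict)
print(ranked)
```

With the sentence-transformers library, the predict method of a loaded CrossEncoder accepts the same list of (query, text) pairs, so it could be passed as predict in place of the toy scorer; which checkpoint to load is left open here.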