avatarJIN

Summary

The provided content outlines the system architecture of Yelp's Proximate Places Search, detailing its functional and non-functional requirements, database schema, system APIs, and design concepts, including the use of SQL-based storage, grid systems, data partitioning, replication, caching, load balancing, ranking, and recommendation algorithms.

Abstract

Yelp's Proximate Places Search System is designed to handle a vast number of user interactions, including ratings, reviews, and searches for local businesses. The system architecture supports real-time search experiences with high availability and low latency, capable of managing up to 500 million places and handling 250k requests per second. It employs a sophisticated database schema to store user profiles, point of interest data, reviews, and business categories. The architecture incorporates advanced features such as wait-time prediction, dynamic grid sizing, and a recommendation engine that leverages collaborative filtering and social network data. The system ensures fault tolerance and scalability through techniques like sharding, replication, and the use of a content delivery network (CDN) for caching. Yelp's engineering blog and various GitHub repositories provide insights into the system's design and implementation, emphasizing the importance of a robust infrastructure for supporting the platform's extensive user base.

Opinions

  • The author suggests that Yelp's system design is crucial for connecting users with local businesses effectively.
  • The use of quadtrees and dynamically adjusting grid sizes is presented as an efficient method for managing geographically distributed data.
  • The recommendation to use a master-slave configuration for replication indicates a preference for a tried-and-tested approach to handling read and write traffic.
  • The mention of using Memcache and the least recently used (LRU) policy for caching implies a focus on optimizing system performance.
  • The choice of round-robin load balancing suggests a straightforward strategy for distributing client requests, although the author notes the need for a more intelligent load balancer to handle overloaded servers.
  • The author emphasizes the importance of the Yelp recommendation algorithm, which includes sentiment analysis and a hybrid recommender system, in providing personalized user experiences.
  • The reference to Yelp's transition from a monolithic architecture to a service-oriented architecture (SOA) reflects a trend towards microservices for improved scalability and maintainability.
  • The author encourages further study and exploration of Yelp's data pipelines and architecture, indicating a belief in the value of shared knowledge and community contributions.
  • The author provides links to Patreon, Ko-fi, and Buy Me A Coffee, suggesting a desire for reader support and acknowledgment of their work.
  • The author's inclusion of a referral link for Medium membership implies an endorsement of the platform and a willingness to engage with the reader community beyond the scope of the article.

Yelp/Proximate Places Search System Architecture

Please clap and share if you like this article.

Yelp is the largest review website in the United States. Users can give their comments for hotels, restaurants, shopping malls, tourism, and other fields from all over the world. Users can rate those businesses, submit their reviews, share their experiences on the Yelp website. The Yelp concept is similar to foursquare in Europe. The Yelp app allows 142 million users to search for local businesses and services and rate them. In the end, the Yelp rating review increase local sales up to 5–9%, which equates to roughly $8,000 dollars annually. It is successful because it streamlines production lines, making consumers directly with demand.

Functional Requirements

  1. The system can store information about different places so that users can rate and review based on the other user's feedback
  2. Users can edit the basic information of the places or add new places with a brief description or delete the old places
  3. Users can give ratings (1 to 5 stars) and reviews (text and images) to places
  4. The system will detect users nearest location (longitude/latitude), users can search all the proximate places within a given radius for any locations
  5. Users can create an account, log in and logout from the account.

New Features

  1. Wait-time Prediction

Non-Functional Requirements

  1. The system must be highly available with the real-time search experience
  2. The system must have low latency
  3. The system must allow read-heavy and support a high number of search requests, up to 200 search requests per second
  4. The system can be scalable, up to 40 million places.

Scale Estimation

Total number of places in the system = 200 million

Total requests per second = 100k

Scalability =20% growth per year

Target Goals

  1. Manage 500 million scale
  2. Able to handle 250 k requests per second

Database Schema

https://paulx-cn.github.io/blog/6th_Blog/
https://paulx-cn.github.io/blog/6th_Blog/
  • User Profile Data: User ID, Name, Email, Mobile Number, sex
  • POI (point of interest)/ the relationship Profile: Place ID, Name, Category, GPS location, Photo link, description, place review, and rating
  • Review: Review ID, USer ID, Place ID, Rating, Photos, review text
  • Business Category: Business attribute

System API

  1. Search API
  2. Add places/Business API
  3. Add Reviews API
  4. Business Match API
  5. Event Lookup API
  6. Event Search API
  7. Featured Event API
  8. Phone Search API
  9. Transaction Search API

AI-autogenerated Review

System Design Concepts

  1. SQL based Storage
  • Places, reviews, user details are the main metadata that must be stored in SQL DB. Location (latitude and longitude) can be indexed for search optimization. Nearby places search will be seen in real-time.
  • Each place will be stored in a separate entry, identified by LocationID.
  • When the nearby places search is used, the given location (X, Y) within a radius “D”, we can query like where latitude between X-D and X+D and longitude between Y-D and Y+D.
  • Each index will return a huge list of places, so the query process won’t be efficient.

2. Grid

Fixed Grid Size

  • divide the whole maps into grids and group them into a grid ID.
  • Each grid ID will store all the places residing within a certain range of latitude and longitude
  • Based on a given location, we can find all the nearby grids and then query these grids to find nearby places
  • Each grid size must be equal so that the distance of each grid size is standard
  • Can run slow because some places are not uniformly distributed among grids
  • For the thickly dense area with a lot of places and sparsely populated with few places, the same grid size cannot give the approximately same number of places

Using Dynamically Adjusting Grid Size/QuadTree

https://en.wikipedia.org/wiki/Quadtree#/media/File:Point_quadtree.svg

Heuristic Rule

  • Don’t put more than 500 places in a grid, so that users can search fast
  • When a grid reaches its limit, break the grids into 4 smaller grids of equal size and distribute places fairly among them
  • Small grids for the thickly dense area with a lot of places and large grids for sparsely populated with few places
  • What if the area keeps growing new places in a grid, so it will break the 500 places limit in a grid
  • Also, It will reach the memory limit of the server

3. Data Partitioning

Sharding Based On regions

  • Further category the places into regions (like Zip codes) such that all places belonging to a region will be grouped together
  • Places and user’s locations will be tagged by their respective region. While querying for or storing the nearby places, the region server will be working
  • What if a region has a lot of queries on the server? it will make the performance slow
  • Over time, some regions can store many places but some don’t. It will cause memory issue while regions are growing.
  • We must repartition our data by using consistent hashing

Sharding based on LocationID

  • The hash function will be used to map each LocationID to a server for the places
  • Calculate the hash of each LocationID to find a suitable server to store the place

4. Replication and Fault Tolerance

  • Replication is for the read purpose. So that we have replicas of the QuadTree server, we can distribute read traffic.
  • Implement master-slave configuration, where replicas will serve only read traffic, and all write traffic will go to the master, then be copied to slaves
  • What if the QuadTree server dies? We can have a secondary replica to take control after the failover
  • What if the primary and secondary QuadTree server dies at the same time? We can allocate a new server and rebuild the same QuadTree by the brute-force solution which will update the whole DB and filter LocationIDs using our hash function.
  • HashMap, where “key” would be the QuadTree server number, and the “value” would be a HashSet containing all the places being kept on that QuadTree Server (key = locationID)
  • Write operation = map the place ID onto one of the partitioned databases using a hashing function
  • Photos will be uploaded into external cloud DB such as Amazon S3
  • Path link is stored in the DB
  • Read Operation = Pass on the user’s query to the geo-hashing server

5. Cache

  • Install a cache in front of DB
  • Memcache will be used
  • The least recently used (LRU) will be used

6. Load Balancing (LB)

  • add load balancing at clients and application servers
  • The Round Robin approach will be adopted
  • If a server is dead, the lb will take over and stop sending any traffic to the server
  • But Round Robin LB will not stop sending new requests to the server if a server is overloaded
  • A more intelligent LB will be needed

7. Ranking

  • Add popularity or relevance features
  • Keep track of the overall popularity of each place
  • How many stars do users give averagely

8. Yelp Recommendation Algorithm

  • Sentiment Analysis model using LSTM and GloVe
  • Recommendation Engine: a real-time recommendation generator (Databricks + Apache Spark + AWS)
  1. Content-Based

2. Collaborative Filtering

3. Social Network

4. Location-Based

https://www.researchgate.net/publication/344768108_Sentiment_Analysis_based_on_GloVe_and_LSTM-GRU

The High-level System Architecture Diagram

Wait-Time System Architecture

https://engineeringblog.yelp.com/2019/12/architecting-wait-time-estimations.html

The Detailed System Architecture

http://juliachencoding.blogspot.com/2021/04/system-design-design-yelp.html

Many doubts may come out to your mind after reading this article.

  1. How can we solve the problem of grid imbalance?
  2. How can we solve the system problem to achieve high availability and fault tolerance?

Further Study

References

If you’ve found any of my articles helpful or useful then please consider throwing a coffee my way to help support my work or give me patronage😊, by using

Patreon

Ko-fi.com

buymeacoffee

Last but not least, if you are not a Medium Member yet and plan to become one, I kindly ask you to do so using the following link. I will receive a portion of your membership fee at no additional cost to you.

Yelp
Review
Location
System Architecture
System Design Interview
Recommended from ReadMedium