Uber System Design
This article provides a look into the system design of popular ride-share apps and what databases are used for their implementation.
Join the DZone community and get the full member experience.
Join For FreeThe popular implementations of the ride-hailing service are the following:
- Uber
- Lyft
- Curb
- Grab
Requirements
- The rider can see all the available nearby drivers
- The driver can accept a trip requested by the rider
- The current location of the rider and driver should be continuously published on the trip confirmation
Data Storage
Database Schema
- The primary entities are the riders, the drivers, the vehicles, and the trips tables
- The relationship between the drivers and the vehicles table is 1-to-many
- The relationship between the drivers and trips table is 1-to-many
- The relationship between the riders and trips table is 1-to-many
- The trips table is a join table to represent the relationship between the riders and the drivers
Type of Data Store
- The wide-column data store (LSM tree-based) such as Apache Cassandra is used to persist the time-series location data of the client (driver and rider)
- The cache server such as Redis is used to store the current location of the driver and the rider for quick lookups
- Message queue such as Apache Kafka is used to handle the heavy traffic
- A relational database such as Postgres stores the metadata of the users
High-Level Design
- The DNS redirects the requests from the client (rider and driver) to nearby data centers
- The client (rider and driver) updates the data stores with Geohash of their real-time location
- WebSocket is used for real-time bidirectional communication between the rider and the driver
- Consistent hashing is used to partition the data stores geographically
Write Path
- The client (driver) creates a WebSocket connection on the load balancer to publish the current location (latitude, longitude) of the driver in real-time
- The load balancer uses the round-robin algorithm to delegate the client’s connection to a server with free capacity in the nearby data center
- The Geohash of the driver location is persisted on the message queue to handle the heavy traffic
- The Geohash of the driver location is stored on the wide-column data store for durability
- The Geohash is stored on the point location cache to provide real-time location updates
- The client (rider) creates a WebSocket connection on the load balancer to publish the current location (latitude, longitude) of the rider in real-time
- The load balancer uses the round-robin algorithm to delegate the client’s connection to a server with free capacity in the nearby data center
- The Geohash of the rider location is persisted on the message queue to handle the heavy traffic
- The analytics service (MapReduce based) queries the wide-column data store to generate offline analytics on the trip data
- The controller service prevents hotspots by auto-repartitioning the stateful services
- The point location cache is denormalized by the generation of multi-character Geohash to improve the read performance (provides zoom functionality)
- The server holding the rider’s WebSocket connection queries the point location cache to identify the available nearby drivers
- As a naive approach, the euclidean distance can be used to find the nearest vehicles within a Geohash
- Sharding of the services can be implemented on multiple levels, such as the city level, geo sharding for further granularity, and the product level (capacity of the vehicle)
- The hotspots are handled through replication and further partitioning of the stateful services by the driver ID
- The wide-column data store is optimized for writes, while the cache server is optimized for reads
- The wide-column data store is replicated across multiple data centers for durability
- The LRU policy is used to evict the cache server
Read Path
Driver Accepting a Trip Request
- The client (driver) creates a WebSocket connection on the load balancer to receive updates on trip requests in real-time
- The load balancer uses the round-robin algorithm to delegate the client’s connection to a server with free capacity in the nearby data center
- The server holding the driver’s WebSocket connection must acquire a distributed lock to handle concurrency issues when accepting trip requests from concurrent unique riders
- The server holding the driver’s WebSocket connection invokes the trip service to confirm the trip
- The trip service queries the pub-sub server to create a one-to-one communication channel between the driver and the rider
- The server publishes the location updates by the driver on the pub-sub server
- The pub-sub server persists the driver’s location data on the trip data store for durability
- The pub-sub server delegates the location updates by the driver to the server that is holding the rider’s WebSocket connection using the publish-subscribe pattern
- The server delegates the driver’s location update to the load balancer that holds the rider’s WebSocket connection
- The driver’s location updates are published to the rider
- The state of the trip is cached on the client (rider and driver) for a fallback to another data center
- Chaos Engineering can be used for resiliency testing
- Services gossip protocol the state for high availability
Data store
Database
Relational database
Data storage
Systems design
Published at DZone with permission of N K. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments