┌─────────────┐
│    Rider    │
│  Requests   │
│    Ride     │
└──────┬──────┘
       │ POST /request-ride
       │ {lat: 40.7, lng: -74.0}
       ▼
┌─────────────────────────────────────┐
│ Server                              │
│                                     │
│ SELECT * FROM drivers               │
│ WHERE status = 'available'          │
│                                     │
│ for each driver:                    │
│   distance = haversine(             │
│     rider.lat, rider.lng,           │
│     driver.lat, driver.lng          │
│   )                                 │
│                                     │
│ Pick closest driver                 │
│ (O(n) scan, n = 10,000)             │
└─────────────────────────────────────┘

Problem: Loops through ALL drivers in city
Time: 800ms for 10K drivers
Linear scan of all drivers—simple but slow
Your campus rideshare app is a hit—students love splitting rides from the library at 2 AM. You built the simplest thing: when someone requests a ride, loop through all available drivers, calculate distance to each one, pick the closest. Works great with 20 drivers. Then the city opens up. Now you have 10,000 drivers online. A ride request takes 800ms just to find a driver. Students wait, drivers idle, and your database CPU chart looks like a cardiac arrest.
Your matching algorithm does a full table scan: SELECT * FROM drivers WHERE status = 'available', then loops through every driver calculating haversine distance (the great-circle distance between two lat/lng points). For 10,000 drivers, this is 10,000 distance calculations per request. At 100 requests/minute, that's 10,000 × 100 = 1 million calculations/minute. Database CPU hits 90%, matching latency climbs to 800ms, and drivers start going offline out of frustration.
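To make the cost concrete, here is a minimal Python sketch of the linear-scan matcher described above. The `Driver` class, the function names, and the in-memory driver list are assumptions for illustration only (the real app reads drivers from a database table), and the Earth radius is the usual mean-radius approximation.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


@dataclass
class Driver:
    # Hypothetical stand-in for one row of the drivers table.
    driver_id: str
    lat: float
    lng: float
    status: str = "available"


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def find_closest_driver(rider_lat, rider_lng, drivers):
    """Naive O(n) match: one distance calculation per available driver."""
    available = [d for d in drivers if d.status == "available"]
    if not available:
        return None
    return min(
        available,
        key=lambda d: haversine_km(rider_lat, rider_lng, d.lat, d.lng),
    )


# Usage: three made-up drivers near the rider from the diagram.
drivers = [
    Driver("d1", 40.71, -74.00),
    Driver("d2", 40.80, -73.95),
    Driver("d3", 40.70, -74.01, status="busy"),  # nearest, but unavailable
]
closest = find_closest_driver(40.7, -74.0, drivers)
print(closest.driver_id)  # d1
```

Every request touches every available driver, so the work grows linearly with the size of the fleet. That is fine at 20 drivers and painful at 10,000.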