When you use a third-party API, you are usually required to stay within a defined rate limit. This lets API providers serve many users at a reasonable cost. Adhering to the rate limit is easy if a single server makes all the API calls. However, if API calls are made from multiple servers, adhering to the rate limit is a difficult problem. We'll talk about our approach to solving this problem.
We leverage Plivo's voice API to achieve high calls per second (CPS) in CallHub's voice broadcasting software. We have many servers, each running multiple Celery worker processes that make Plivo API calls. All the Celery workers feed from the same Redis queue to maintain consistency.
When a customer schedules a voice broadcasting campaign, the system takes about a minute's worth of phone numbers from the customer's phonebook. It then uses the Celery workers to schedule those calls second by second.
Celery-based per-second scheduling is not accurate, because Celery only guarantees that a task starts after the specified time interval has elapsed. It does not guarantee that the task will be executed at exactly the specified time; the task can run at any time after the specified interval. Because of this, our system sometimes exceeded the CPS limit and at other times scheduled fewer calls than the CPS limit allowed.
Our approach to this problem is to store the actual number of calls made in every second. Whenever a new call is about to be made, we check whether it would exceed the CPS limit for that second. If it would, we reschedule that call to the next minute.
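The idea above can be modeled in a few lines before any shared storage is involved. The following is only a single-process sketch for illustration: the class name `PerSecondCounter` and its method `try_acquire` are hypothetical, and an in-memory dictionary stands in for the shared store that multiple servers would need.

```python
from collections import defaultdict


class PerSecondCounter:
    """In-memory model of the per-second call count (single process only)."""

    def __init__(self, cps_limit):
        self.cps_limit = cps_limit
        self.counts = defaultdict(int)  # epoch second -> calls placed so far

    def try_acquire(self, now):
        """Count the call and return True if the second containing `now` has room."""
        second = int(now)
        if self.counts[second] >= self.cps_limit:
            return False  # over the limit: the caller should reschedule
        self.counts[second] += 1
        return True


# With a limit of 2 CPS, the third call in second 100 is refused,
# while a call in second 101 is accepted again.
counter = PerSecondCounter(cps_limit=2)
results = [
    counter.try_acquire(100.1),
    counter.try_acquire(100.5),
    counter.try_acquire(100.9),
    counter.try_acquire(101.0),
]
```

This version is not safe across processes or servers, because the check and the increment are separate steps on local state. That is the gap the atomic check in shared storage has to close.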
To implement this solution, we required the following:
1. A CPS check that reads the current call count for the second and increments it if the limit has not been reached. This check must be atomic. The per-second data can be removed once that second has passed. Both of these were done with a Lua script in Redis. You could also implement this using optimistic locking, as described in the Redis transactions documentation. Here is our Lua script to do this:
2. The Celery task that made the calls was configured to have infinite retries. Then, whenever a call would exceed the CPS limit, we generated a random number between 60 and 120 and rescheduled the pending call to run that many seconds later. This way, all calls that are scheduled over the CPS limit will eventually get called at the very end of the voice broadcasting campaign. Here is the code to do that:
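The retry behavior in item 2 might look roughly like the sketch below. It is not CallHub's actual code: `retry_delay`, `register_call_task`, `allow_call` and `place_call` are hypothetical names, and the Celery app is passed in so the snippet does not assume any particular project layout. It relies only on Celery's documented `bind=True`, `max_retries=None` (retry forever) and `self.retry(countdown=...)`.

```python
import random

RETRY_MIN_SECONDS = 60
RETRY_MAX_SECONDS = 120


def retry_delay(rng=random):
    """Pick a random delay so deferred calls spread out instead of retrying together."""
    return rng.randint(RETRY_MIN_SECONDS, RETRY_MAX_SECONDS)


def register_call_task(app, allow_call, place_call):
    """Register a call task on the given Celery app (hypothetical helper).

    allow_call: callable returning True if the current second is under the CPS limit.
    place_call: callable that performs the actual API call for a phone number.
    """

    @app.task(bind=True, max_retries=None)  # max_retries=None means retry indefinitely
    def make_call(self, phone_number):
        if not allow_call():
            # Over the limit: re-queue this call 60 to 120 seconds from now.
            raise self.retry(countdown=retry_delay())
        place_call(phone_number)

    return make_call
```

Randomizing the delay matters here: if every rejected call were retried after a fixed 60 seconds, the same burst would collide again one minute later.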
Here is the complete code that implements both of the above in Python:
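As a stand-in illustration of the atomic check from item 1, the sketch below embeds a small Lua script in Python. It should be read as an assumption about how such a check could be written, not as the script CallHub runs: the key format `cps:<epoch second>`, the function names, and the 5-second expiry are all made up for the example. The `client` argument is assumed to be a redis-py style client exposing `eval(script, numkeys, *keys_and_args)`.

```python
import time

# Redis runs a Lua script atomically, so the read and the increment below
# cannot interleave with the same script running for another worker.
CPS_CHECK_LUA = """
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
if current >= tonumber(ARGV[1]) then
    return 0
end
redis.call('INCR', KEYS[1])
redis.call('EXPIRE', KEYS[1], ARGV[2])
return 1
"""


def cps_key(now, prefix="cps"):
    """Build the per-second counter key (hypothetical naming scheme)."""
    return f"{prefix}:{int(now)}"


def allow_call(client, cps_limit, now=None, ttl_seconds=5):
    """Return True if a call may be placed in the current second.

    The key expires after ttl_seconds, so counters for past seconds
    are cleaned up by Redis without a separate job.
    """
    now = time.time() if now is None else now
    allowed = client.eval(CPS_CHECK_LUA, 1, cps_key(now), cps_limit, ttl_seconds)
    return allowed == 1
```

A worker would call `allow_call(client, cps_limit)` just before placing a call and reschedule the task when it returns `False`. This assumes the servers' clocks are reasonably in sync, since each worker derives the key from its own local time.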
We tested this approach with 20 CallHub voice broadcasting campaigns running concurrently, each with 100,000 phone numbers, across 4 Celery servers. The calls did not exceed the maximum CPS even once. When many calls were being scheduled concurrently, CPU load on the Redis server was high (about 50% CPU usage). This is due to the atomic nature of the Lua rate check: Redis executes each script to completion before serving other commands.