Cron Jobs vs. Event-Driven Architecture


A cron job is a task that runs on a fixed schedule and typically processes a batch of records each time it runs. That alone isn't the worst, but it can be inefficient and put unnecessary load on your database.

Problem 1 Extra DB Load:

Let's say you have a cron job that runs every hour. It queries your database to see if there are any new records to process, then gets to work processing them. Those interactions with the database might seem trivial, but once you hit a certain scale, every additional database interaction becomes significant. The scheduled queries run whether or not there are records to process, and they add up like death by a thousand paper cuts.
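To make the wasted work concrete, here is a minimal sketch of an hourly polling job. It assumes an in-memory SQLite database and a hypothetical `orders` table with a `processed` flag; neither comes from a real system, they are only there so the example runs.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    # Hypothetical schema: one table with a "processed" flag.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, processed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn

def cron_tick(conn: sqlite3.Connection) -> int:
    """One scheduled run. It always queries, even when nothing is pending."""
    rows = conn.execute("SELECT id FROM orders WHERE processed = 0").fetchall()
    for (order_id,) in rows:
        conn.execute("UPDATE orders SET processed = 1 WHERE id = ?", (order_id,))
    conn.commit()
    return len(rows)

conn = make_db()
conn.execute("INSERT INTO orders DEFAULT VALUES")  # a single new order
conn.commit()

# Simulate one day of hourly runs; each run reports how many orders it found.
results = [cron_tick(conn) for _ in range(24)]
empty_runs = results.count(0)  # runs that hit the database for nothing
```

With one order per day, every run after the first still pays for a query and finds nothing. Multiply that by every cron in your system and the paper cuts start to show.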

Problem 2 Bad Timing:

Chances are your crons run at fixed intervals, and those intervals do NOT take into account how many other higher-priority tasks need to hit your databases at that moment. The task just says, “It is x o'clock, and I need to run.” This often leads to a spike in load on your databases and an embarrassing increase in latency for your end users.

Solution: Enter Event-Driven Architecture:

Allow me to introduce you to the magic that is “Event-Driven Architecture.” With this approach, when your RESTful server receives an interaction such as a new order, it does the bare minimum processing required to create the order record in the database and then puts an event record in a queue of some type. The event record should contain most of the information an event worker will need to process the event later.

Eventually, a worker consuming from the queue will pull the event record and begin to process it. Since the event record body already holds most of the information about the order, the worker will NOT need to do much, if any, extra querying of the database. This avoids extra database work that can be slow and/or expensive.
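Here is a rough sketch of that flow. It uses Python's in-memory `queue.Queue` as a stand-in for a real queue service and a plain dict as a stand-in for the primary database. The function names, the event shape, and the receipt step are all made up for illustration.

```python
import json
import queue

event_queue: "queue.Queue[str]" = queue.Queue()  # stand-in for a real queue service
orders_table: dict = {}                          # stand-in for the primary database

def create_order(order_id: int, customer_email: str, total_cents: int) -> None:
    """Request handler: do the minimum write, then enqueue a self-contained event."""
    order = {
        "order_id": order_id,
        "customer_email": customer_email,
        "total_cents": total_cents,
    }
    orders_table[order_id] = order  # the one required database write
    # The event body carries the order details so the worker does not
    # have to look them up again later.
    event_queue.put(json.dumps({"type": "order_created", "order": order}))

def process_next_event() -> str:
    """Worker: handle one event using only what is in the message body."""
    event = json.loads(event_queue.get_nowait())
    order = event["order"]
    # Hypothetical follow-up work; note there is no database read here.
    return f"Receipt for order {order['order_id']} sent to {order['customer_email']}"
```

Calling `create_order(1, "a@example.com", 1999)` followed by `process_next_event()` handles the order end to end with a single database write and zero extra reads.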

Spreading Out The Load:

Spreading out the load matters much less if you pass enough information in the event message body that the worker does NOT need to query the database, but let's say, for argument's sake, your worker still needs to query something. Event-driven processing naturally spreads out the load on the database because records don't pile up waiting for your cron to run. The messages are queued up instantly, so unless a huge number of users place an order at the exact same time, the load will be spread out. And even if they do, you can cap the number of workers so they take their time consuming the queue instead of all hitting the database in parallel.
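One way to picture the worker cap is a fixed-size pool draining a burst. This is only a sketch using threads and an in-memory queue; `drain`, `handle_order`, and the 5 ms sleep standing in for a database query are all assumptions for the demo.

```python
import queue
import threading
import time

def drain(events: queue.Queue, handler, max_workers: int = 2) -> None:
    """Process everything in `events` using at most `max_workers` threads."""
    def worker() -> None:
        while True:
            try:
                item = events.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            handler(item)

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Demo: a burst of 20 orders arrives at once.
lock = threading.Lock()
stats = {"active": 0, "peak": 0, "done": 0}

def handle_order(event: dict) -> None:
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.005)  # pretend to query the database
    with lock:
        stats["active"] -= 1
        stats["done"] += 1

burst: queue.Queue = queue.Queue()
for order_id in range(20):
    burst.put({"order_id": order_id})

drain(burst, handle_order, max_workers=2)
```

All 20 events get processed, but `stats["peak"]` never goes above the cap, so the database never sees more than two of these workers at once no matter how big the burst is.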

You can also use more advanced tactics, such as rules that pause workers when database CPU usage or website latency is too high.
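A sketch of what such a rule might look like: the worker checks a load reading before taking each event and backs off when it is too high. The `get_db_cpu_percent` callable and the 70% threshold are assumptions; in practice the reading would come from whatever monitoring you already have.

```python
import queue
from typing import Any, Callable

CPU_LIMIT_PERCENT = 70.0  # assumed threshold, tune for your own system

def process_with_backpressure(
    events: queue.Queue,
    handler: Callable[[Any], None],
    get_db_cpu_percent: Callable[[], float],
    max_events: int,
) -> int:
    """Handle up to `max_events`, stopping early if the database looks busy."""
    handled = 0
    while handled < max_events:
        if get_db_cpu_percent() > CPU_LIMIT_PERCENT:
            break  # leave the remaining events in the queue for later
        try:
            item = events.get_nowait()
        except queue.Empty:
            break
        handler(item)
        handled += 1
    return handled
```

A real worker would sleep and try again rather than exit, and the same check works for website latency: swap the CPU reading for a latency reading and pick a different limit.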

Warning:

Don’t just create an events table in your primary database. If you do, the workers polling for events put load on that same primary database, which kind of defeats the purpose of this whole thing.

Batching Events:

There is nothing that says a message in a queue needs to be processed by a worker immediately. When you are building your infrastructure, think very carefully about what needs to be processed within seconds, minutes, hours, or even days. There are a lot of things you can queue up and wait to process until a time that would have a lower impact on your database and, therefore, your customer experience.
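As a rough illustration, you could route events to different queues by urgency and only drain the non-urgent queue during a quiet window. The event types, the queue names, and the 01:00 to 05:00 window below are invented for the example.

```python
from datetime import datetime, time

# Assumed quiet window for the database; pick yours from real traffic data.
OFF_PEAK_START = time(1, 0)
OFF_PEAK_END = time(5, 0)

# Hypothetical event types that must be handled within seconds.
URGENT_TYPES = {"order_created", "password_reset"}

def route(event: dict) -> str:
    """Pick a queue name based on how soon the event must be handled."""
    return "realtime" if event["type"] in URGENT_TYPES else "off_peak"

def is_off_peak(now: datetime) -> bool:
    return OFF_PEAK_START <= now.time() < OFF_PEAK_END

def should_drain(queue_name: str, now: datetime) -> bool:
    """Realtime workers always run; off-peak workers wait for the window."""
    return queue_name == "realtime" or is_off_peak(now)
```

The point of the split is that a monthly report event can sit in the `off_peak` queue all afternoon without anyone noticing, while order events keep flowing through `realtime`.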

AWS SQS:

Pick the right tool for the job. AWS SQS is crazy fast and a great queue management tool, especially if you only have one consumer per queue, meaning each queued event only needs to be consumed and processed by one worker. It is a great starting point for Event-Driven Architecture.
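For a sense of what this looks like in code, here is a small sketch against the boto3 SQS client. The functions take the client as a parameter (you would create it with `boto3.client("sqs")`), and the queue URL, event shape, and `handler` are placeholders rather than anything from a real project.

```python
import json

def publish_order_event(sqs, queue_url: str, order: dict) -> None:
    """Producer side: put a self-contained event on the queue."""
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"type": "order_created", "order": order}),
    )

def poll_once(sqs, queue_url: str, handler) -> int:
    """Worker side: fetch up to 10 messages, handle each, then delete it."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 per call
        WaitTimeSeconds=20,      # long polling, so idle workers make fewer calls
    )
    messages = response.get("Messages", [])  # key is absent when the queue is empty
    for message in messages:
        handler(json.loads(message["Body"]))
        # Delete only after the handler succeeds; otherwise the message
        # becomes visible again and is retried.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return len(messages)
```

Passing the client in keeps the functions easy to exercise with a fake client in tests, and a real worker would simply call `poll_once` in a loop.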

AWS Kinesis:

This is a bit advanced (even for the advanced section), but if you have embraced a microservice architecture and/or have multiple workers that may each need to consume the same event, this is a great service. Kinesis is a data stream that lets multiple consumers read the same events independently, each at its own pace. Note: There is a limit on consumers if you use the Enhanced Fan-out functionality, but I have not really worked with that.
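The difference from a queue is easier to see in a toy model. The sketch below is NOT the Kinesis API; it is a plain in-memory illustration, with made-up `Stream` and `Consumer` classes, of how each consumer keeps its own position in a shared stream.

```python
class Stream:
    """Append-only log of records; reading never removes anything."""
    def __init__(self) -> None:
        self.records: list = []

    def put(self, record: dict) -> None:
        self.records.append(record)

class Consumer:
    """Each consumer tracks its own position, independent of the others."""
    def __init__(self, name: str, stream: Stream) -> None:
        self.name = name
        self.stream = stream
        self.position = 0

    def read_new(self) -> list:
        new_records = self.stream.records[self.position:]
        self.position = len(self.stream.records)
        return new_records

stream = Stream()
billing = Consumer("billing", stream)
emails = Consumer("emails", stream)

stream.put({"type": "order_created", "order_id": 1})
stream.put({"type": "order_created", "order_id": 2})

billing_seen = billing.read_new()  # both records
emails_seen = emails.read_new()    # the same two records, read independently
```

With a queue, once `billing` consumed and deleted a message, `emails` would never see it. With a stream, adding a third consumer later does not require changing the producer at all.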

I generally start my projects with Kinesis because of the flexibility it gives me in my designs.

Bonus Data Warehouse:

As a bonus, AWS Kinesis can pretty easily pump these events into S3 for consumption by a few of the tools mentioned earlier in the Data Warehouse section.