apps/www/_blog/2021-03-05-postgres-as-a-cron-server.mdx
A Supabase user asked recently if they can trigger a webhook periodically. We haven't yet released Functions yet, so we checked whether it's possible with Postgres.
It is. Here's how.
A "cron job" is a script1 that runs periodically at fixed times, dates, or intervals. Traditionally you'd set it up on a Linux server. An example might be an hourly script that downloads emails to your computer.
These days, cron jobs are set up on a remote servers and in the cloud to run internet-related tasks. Like checking an endpoint every hour, or scraping a website every day.
Postgres has "extensions" which allow you to, well, extend the database with "non-core" features. Extensions essentially turn Postgres into an application server.
The team at Citus created pg_cron to run periodic jobs within your Postgres database.
If you're using a cloud-hosted Postgres database, make sure that pg_cron is installed first. The easiest way to do this is to run this command:
select name, comment, default_version, installed_version
from pg_available_extensions
where name = 'pg_cron';
If it returns a result then the extension is supported and you can turn it on by running:
create extension if not exists pg_cron;
If you're using Supabase you can also enable it in the Dashboard.
If you're planning to use a non-superuser role to schedule jobs, ensure that they are granted access to the cron schema and its underlying objects beforehand.
grant usage on schema cron to {{DB user}};
grant all privileges on all tables in schema cron to {{DB user}};
Failure to do so would result in jobs by these roles to not run at all.
The Supabase customer wanted to call external endpoints every day. How would we do this? Another extension of course. This time we're going to use pgsql-http by @pramsey. Using the same technique, we can enable the extension (if it exists in your cloud provider).
create extension if not exists http;
This extension can now be used for sending GET, POST, PATCH, and DELETE requests.
For example, this function would get all the people in Star Wars (using the Star Wars API):
select content::json->'results'
from http_get('https://swapi.dev/api/people');
Now the fun stuff. For this example we're going to call webhook.site every minute with the payload { "hello": "world" }.
Here's the code (with comments --like this).
select
cron.schedule(
'webhook-every-minute', -- name of the cron job
'* * * * *', -- every minute
$$
select status
from
http_post(
'https://webhook.site/223c8a43-725b-4cbd-b1fe-d0da73353a6b', -- webhook URL, replace the ID(223c8..) with your own
'{"hello": "world"}', -- payload
'application/json'
)
$$
);
Now when we see that the payload is sent every minute, exactly on the minute.
And that's it! We've built a cron webhook. Breaking down the code example above we have 2 key parts:
This is the part that sends the data to the website:
select status
from
http_post(
'https://webhook.site/223c8a43-725b-4cbd-b1fe-d0da73353a6b', -- webhook URL
'{"hello": "world"}', -- payload
'application/json'
);
The HTTP function is wrapped with the CRON scheduler:
select
cron.schedule(
'cron-name', -- name of the cron job
'* * * * *', -- every minute
$$
-- Put your code between two dollar signs so that you can create full statements.
-- Alternatively, you can write you code in a Postgres Function and call it here.
$$
);
The second parameter uses cron syntax:
┌───────────── min (0 - 59)
│ ┌────────────── hour (0 - 23)
│ │ ┌─────────────── day of month (1 - 31)
│ │ │ ┌──────────────── month (1 - 12)
│ │ │ │ ┌───────────────── day of week (0 - 6) (0 to 6 are Sunday to
│ │ │ │ │ Saturday, or use names; 7 is also Sunday)
* * * * *
If you're unfamiliar with the cron syntax, useful shortcuts can be found on crontab.guru
* * * * * # every minute
*/5 * * * * # every 5th minute
0 * * * * # every hour
0 0 * * * # every day
To see a list of all your cron jobs, run:
select * from cron.job;
And if you need to see the results of each cron iterations, you can find them in cron.job_run_details:
select * from cron.job_run_details;
To stop a running cron job, you can run:
select cron.unschedule('webhook-every-minute'); -- pass the name of the cron job
There are plenty use-cases for this. For example:
selects all your signups yesterday, then sends them to your favorite transactional email service. Schedule it every day to run at midnight.See a detailed list in the pg_cron README.
You might have noticed this notice the warning at the bottom of the http readme:
"What happens if the web page takes a long time to return?" Your SQL call will just wait there until it does. Make sure your web service fails fast.
Luckily pg_cron implements Background Workers:
Care is taken that these extra processes do not interfere with other postmaster tasks: only one such process is started on each ServerLoop iteration. This means a large number of them could be waiting to be started up and postmaster is still able to quickly service external connection requests.
This means that even if your endpoint takes a long time to return, it's not going to be blocking your core Postgres functions. Either way, you should probably only call endpoints that will return a response quickly, or set the http extension to fail fast (http.timeout_msec = 300).
If you're familiar with C, you could also help @pramsey to implement async functions: https://github.com/pramsey/pgsql-http/issues/105
There are plenty of ways to run cron jobs these days. You can trigger them from your local machine. You can install them on a VPS. You can schedule Serverless functions. You can use a paid service. You can use GitHub Actions.
Is Postgres the best place to put your cron jobs? ¯\_(ツ)_/¯. Postgres databases are free on Supabase and since it takes only one minute to get started, why not make your next cron server a Postgres database?
Not necessarily a script. The cron is really a scheduler which triggers a job (of some sort, usually a bash script). ↩