DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Generate Random Test Data in PostgreSQL
  • Implementing Sharding in PostgreSQL: A Comprehensive Guide
  • Database Connection Pooling at Scale: PgBouncer + Multi-Tenant Postgres (10K Concurrent Connections)
  • Scaling PostgreSQL Reads: Implementing Read-Your-Write Consistency Using WAL-Based Replica Routing

Trending

  • Feature Flag Debt: Performance Impact in Enterprise Applications
  • Beyond Manual Annotation: Engineering Self-Correcting Pseudo-Labeling Pipelines
  • Build a GitHub Slack Bot With AWS Bedrock and MCP, Part 2
  • OpenAPI From Code With Spring and Java: A Recipe for Your CI
  1. DZone
  2. Data Engineering
  3. Databases
  4. PostgreSQL: Simple C Extension Development for a Novice User (and Performance Advantages)

PostgreSQL: Simple C Extension Development for a Novice User (and Performance Advantages)

Optimize the performance of your queries with native C functions.

By 
Jobin Augustine user avatar
Jobin Augustine
·
Aug. 07, 19 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
7.5K Views

Join the DZone community and get the full member experience.

Join For Free

One of the great features of PostgreSQL is its extendability. My colleague and senior PostgreSQL developer Ibar has blogged about developing an extension with much broader capabilities including callback functionality. In this blog post, I am trying to address a complete novice user who has never tried but wants to develop a simple function with business logic. Towards the end of the blog post, I want to show how lightweight the function is by doing simple benchmarking, which is repeatable and should act as a strong justification for why end-users should do this kind of development.

Generally, PostgreSQL and extension developers work on a PostgreSQL source build. For a novice user, that may not be required. Instead, dev/devel packages provided for the Linux distro would be sufficient. Assuming that you have installed PostgreSQL already, the following steps can get you the additional development libraries required.

On Ubuntu/Debian

$ sudo apt install postgresql-server-dev-11


On RHEL/CentOS

sudo yum install postgresql11-devel


The next step is to add a PostgreSQL binary path to your environment to ensure that pg_config is there in the path. In my Ubuntu laptop, this is how I did it:

export PATH=/usr/lib/postgresql/11/bin:$PATH


The above-mentioned paths may vary according to the environment.

Please make sure that the pg_config is executing without specifying the path:

$ pg_config


PostgreSQL installation provides a build infrastructure for extensions, called PGXS, so that simple extension modules can be built simply against an already-installed server. It automates common build rules for simple server extension modules.

$ pg_config --pgxs
/usr/lib/postgresql/11/lib/pgxs/src/makefiles/pgxs.mk


Now, let’s create a directory for development. I am going to develop a simple extension, addme, with a function, addme, to add two numbers.

$ mkdir addme


Now, we need to create a Makefile, which builds the extension. Luckily, we can use all PGXS macros.

MODULES = addme
EXTENSION = addme
DATA = addme--0.0.1.sql
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)


MODULE specifies the shared object without file extension and EXTENSION specifies the name of the extension name. DATA defines the installation script. The reason for –0.0.1 specifying in the name is that I should match the version we specify in the control file.

Now, we need a control file addme.control with the following content:

comment = 'Simple number add function'
default_version = '0.0.1'
relocatable = true
module_pathname = '$libdir/addme'


And we can prepare our function in C which will add 2 integers:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(addme);

Datum
addme(PG_FUNCTION_ARGS)
{
int32 arg1 = PG_GETARG_INT32(0);
int32 arg2 = PG_GETARG_INT32(1);

PG_RETURN_INT32(arg1 + arg2);
}


At this stage, we have only 3 files in the directory.

$ ls
addme.c addme.control Makefile


Now we can make the file:

$ make


For installing the extension, we need a SQL file with create function. This SQL file name should be the same as the one we specified in DATA parameter in the Makefile, which is addme–0.0.1.sql.

Add the following content into this file:

CREATE OR REPLACE FUNCTION
addme(int,int) RETURNS int AS 'MODULE_PATHNAME','addme'
LANGUAGE C STRICT;


And install the extension:

$ sudo make install


Now, we can proceed to create the extension and test it:

postgres=# create extension addme;
CREATE EXTENSION
postgres=# select addme(2,3);
addme
-------
5
(1 row)


Just like any function, we can use it in queries against multiple tuples.

postgres=# select 7||'+'||g||'='||addme(7,g) from generate_series(1,10) as g;
?column?
----------
7+1=8
7+2=9
7+3=10
7+4=11
7+5=12
7+6=13
7+7=14
7+8=15
7+9=16
7+10=17
(10 rows)


Performance Benchmarking

Now it is important to understand the performance characteristics calling a C function in extension. For comparison, we have two  options like:
1. ‘+’ operator provided by SQL like

select 1+2;


2. PLpgSQL function as below

CREATE FUNCTION addmepl(a integer, b integer)
 RETURNS integer
as $$ 
BEGIN
  return a+b;
END;
$$ LANGUAGE plpgsql;


For this test/benchmark, I am going to call the function one million times!

SQL + operator

time psql -c "select floor(random() * (100-1+1) + 1)::int+g from generate_series(1,1000000) as g" > out.txt


C function call

$ time psql -c "select addme(floor(random() * (100-1+1) + 1)::int,g) from generate_series(1,1000000) as g" > out.txt


PL function call

$ time psql -c "select addmepl(floor(random() * (100-1+1) + 1)::int,g) from generate_series(1,1000000) as g" > out.txt


I have performed the tests 6 times for each case and tabulated below.

Test Run
Performance by language

Performance by language


Performance by language

Average runtime


As we can see, the performance of Built-in "+" operator and the custom C function in the extension take the least time with almost the same performance. But the PLpgSQL function call is slow and it shows considerable overhead. Hope this justifies why those functions, which are heavily used, need to be written as a native C extension.

PostgreSQL

Published at DZone with permission of Jobin Augustine. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Generate Random Test Data in PostgreSQL
  • Implementing Sharding in PostgreSQL: A Comprehensive Guide
  • Database Connection Pooling at Scale: PgBouncer + Multi-Tenant Postgres (10K Concurrent Connections)
  • Scaling PostgreSQL Reads: Implementing Read-Your-Write Consistency Using WAL-Based Replica Routing

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook