PostgreSQL, PL/Perl, and CVE-2016-1238 (Part 1)
PostgreSQL, PL/Perl, and CVE-2016-1238 (Part 1)
Danger may be lurking in the user-defined functions implemented in your database. Read on to find out how.
Join the DZone community and get the full member experience.Join For Free
Built by the engineers behind Netezza and the technology behind Amazon Redshift, AnzoGraph™ is a native, Massively Parallel Processing (MPP) distributed Graph OLAP (GOLAP) database that executes queries more than 100x faster than other vendors.
This post and its follow-up are about the dangers in writing user-defined functions in untrusted languages, but also specifically about how to avoid CVE-2016-1238-related problems when writing PL/PerlU functions. The fix is simple and straightforward, and it is important for it to be in PL/PerlU stored procedures and user-defined functions for reasons I will get into. We'll go over actual exploits, and the severity of being able to inject arbitrary Perl code into the running database backend is a good reason to be disciplined and careful about the use of this language.
It is worth saying at the outset that I have been impressed by how well sensible design choices in PostgreSQL generally mitigate problems like this. In essence, you have to be working in an environment where a significant number of security measures have been bypassed, either intentionally or not. This speaks volumes on the security of PostgreSQL's design because it is highly unlikely that this particular attack vector was an explicit concern. In other words, these decisions make many attacks even against untrusted languages far more difficult than they would be otherwise.
The potential consequences are severe enough, however, that secure coding is particularly important in this environment, even with the help of the secure design. And in any language, it is easy to code yourself into corners where you aren't aware you are introducing security problems until they bite you.
The current implementation, we will see, already has a fair bit of real depth of defense behind it. PostgreSQL is not, by default, vulnerable to the problems in the CVE noted in the title. However, with a little recklessness, it is possible to open the door to the possibility of real problems, and it is possible for these problems to be hidden from the code reviewer by error rather than actual malice. Given the seriousness of what can happen if you can run arbitrary code in the PostgreSQL back-end, my view is that all applicable means should be employed to prevent problems.
PL/PerlU can be used in vulnerable ways, but PL/Perl (without the U) is, by design, safe. Even with PL/PerlU it is worth noting that multiple safety measures have to be bypassed before vulnerability becomes a concern. This is not about any vulnerabilities in PostgreSQL, but what vulnerabilities can be added through carelessly writing stored procedures or user-defined functions.
There are two lessons I would like people to take away from this. The first is how much care has been taken with regard to PostgreSQL regarding security in design. The second is how easily one can accidentally code oneself into a corner. PL/PerlU often is the right solution for many problem domains, and it can be used safely, but some very basic rules need to be followed to stay out of trouble.
Extending SQL in PostgreSQL Using Arbitrary Languages
PostgreSQL has a language handler system that allows user-defined functions and stored procedures to be written in many different languages. Python, TCL, C, and Perl come supported out of the box. Perl and TCL come in trusted and untrusted variants (see below), while Python and C are always untrusted.
There are large numbers of external language handlers as well, and it is possible to write one's own. Consequently, PostgreSQL allows, effectively, SQL to be extended by plugins written in any language. The focus here will be on untrusted Perl, or PL/perlU. Untrusted languages have certain inherent risks in their usage, and these become important, as here, when there is concern about specific attack vectors.
Trusted vs. Untrusted Languages, and What PL/Perl and PL/PerlU Can Do
PostgreSQL allows languages to be marked as 'trusted' or 'untrusted' by the RDBMS. Trusted languages are made available for anyone to write functions for while untrusted languages are restricted to database superusers. Trusted languages are certified by the developers not to interact with file handles — not to engage in any other activity other than manipulating data in the database.
There is no way to 'use' or 'require' a Perl module in PL/Perl (trusted) and, therefore, it is largely irrelevant for our discussion. However, PL/PerlU can access anything else on the system, and it does so with the same permissions as the database manager itself. Untrusted languages, like PL/PerlU make it possible to extend SQL in arbitrary ways by injecting (with the database superuser's permission!) code into SQL queries written in whatever language one wants. Untrusted languages make PostgreSQL one of the most programmable relational databases in the world, but they also complicate database security in important ways, which are well beyond the responsibility of PostgreSQL as a project.
Like test harnesses, untrusted languages are insecure by design (i.e. they allow arbitrary code injection into the database backend) and this issue is as heavily mitigated as possible, making PostgreSQL one of the most security-aware databases in the market.
To define a function in an untrusted language, one must be a database superuser, so the PostgeSQL community places primary trust in the database administration team not to do anything stupid, which is generally a good policy. This article is largely in order to help that policy be effective.
Paths, Permissions, and CVE-2016-1238
The CVE referenced in the title of this article and section is one which allows attackers to inject code into a running Perl routine by placing it in the current working directory of a Perl process. If an optional dependency of a dependency is placed in the current working directory, it may be included in the current Perl interpreter without the understanding of the user. This is a problem primarily because Perl programs are sufficiently complex that people rarely understand the full dependency tree of what they are running.
If the current working directory ends up being one which includes user-writeable data of arbitrary forms, the problem exists. If not, then one is safe.
The problem, however, is that changing directories changes the include path. You can demonstrate this by doing as follows:
In ./test.pl, create a file that contains:
use test; use test2; ./test.pm is: chdir test; \ test/test2.pm is: warn 'haxored again';
What happens is that the use test statement includes ./test.pm which changes directory, so when you use test2, you are including from test/test2.pm. This means that if any period during this cycle of the Perl interpreter's life you are in a world-writable directory, and including or requring files, you are asking for trouble.
Usual Cases for PL/PerlU
The usual use case of PL/PerlU is where you need a CPAN module to process or handle information. For example, you may want to return JSON using JSON.pm, or you may want to write to a (non-transactional log).
For example, the most recent PL/PerlU function I wrote used basic regular expressions to parse a public document record stored in a database to extract information relating to patents awarded on protein sequences. It was trivial but was easier to use JSON than to write JSON serialization inline (and yes, performed well enough given the data sizes operate on).
Are Usual Cases Safe? Deceptively So!
Here are a few pl/perlU functions which show the relevant environment that PL/PerlU functions run in by default:
postgres=# create or replace function show_me_inc() returns setof text language plperlu as $$ return \@INC; $$; CREATE FUNCTION postgres=# select show_me_inc(); show_me_inc ------------------------------ /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 . (7 rows) postgres=# create or replace function get_cwd() returns text language plperlu as postgres-# $$ postgres$# use Cwd; postgres$# return getcwd(); postgres$# $$; CREATE FUNCTION postgres=# select get_cwd(); get_cwd --------------------- /var/lib/pgsql/data (1 row)
Wow. I did not expect such a sensible, secure implementation. PostgreSQL usually refuses to start if non-superusers of the system have write access to the data directory. So this is, by default, a very secure configuration, right up until the first chdir() operation.
Now, in the normal use case of a user-defined function using PL/PerlU, you are going to have no problems. The reason is that most of the time, if you are doing sane things, you are going to want to write immutable functions that have no side effects and maybe use helpers, like JSON.pm, to format data. Whether or not there are vulnerabilities exploitable via JSON.pm, they cannot be exploited in this manner.
However, sometimes people do the wrong things in the database and here, particularly, trouble can occur. Stay tuned for the conclusion of this series, where we'll delve into the potential pitfalls of dealing with Pl/PerlU.
Published at DZone with permission of Chris Travers , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.