And now there is pg_xenophile too, the seventh spin-off extension of the robust, efficient PostgreSQL backend for our scalable MQTT hosting platform: FlashMQ.
pg_xenophile is also available via PGXN: https://pgxn.org/dist/pg_xenophile/
Smokes your problems, coughs fresh air.
And now there is pg_xenophile too, the seventh spin-off extension of the robust, efficient PostgreSQL backend for our scalable MQTT hosting platform: FlashMQ.
pg_xenophile is also available via PGXN: https://pgxn.org/dist/pg_xenophile/
Yet another PostgreSQL extension that emerged from the FlashMQ PostgreSQL backend is: pg_safer_settings. pg_safer_settings offers some functions and a mechanisms to make working with settings in PostgreSQL a bit safer. Its defining features are:
My latest split-off release from our super-stable FlashMQ.com PostgreSQL backend is version 1.0.0 the pg_utility_trigger_functions. It’s a very simple bundle of a bunch of trigger functions that I find useful to carry around with me, and it’s sufficiently-tested for me to bring it out immediately under a 1.0.0 stamp.
Another release from the long list of PostgreSQL extensions that I’m splitting off from the database schemas that make up our rock-solid MQTT hosting platform issss: pg_role_fkey_trigger_functions. That extension name is quite the mouthful. From its readme (again courtesy of pg_readme):
The pg_role_fkey_trigger_functions PostgreSQL extension offers a bunch of trigger functions to help establish and/or maintain referential integrity for columns that reference PostgreSQL ROLE NAMEs.
Many extensions are yet to come, because I’ve found that dependencies in themselves aren’t bad, as long as they are used side by side, and not needlessly stacked on top of each other. That they are no longer 100% internal to the migrations belonging to the app doesn’t worry my, because:
Another part of my FlashMQ.com spin-off PostgreSQL extension series is pg_rowalesce. From its README.md (generated with pg_readme):
The pg_rowalesce PostgreSQL extensions its defining feature is the rowalesce() function. rowalesce() is like coalesce(), but for rows and other composite types. From its arbitrary number of argument rows, for each field/column, rowalesce() takes the value from the first row for which that particular field/column has a not null value.
pg_mockable is one of the spin-off PostgreSQL extensions that I’ve created for the backend of our high-performance MQTT hosting platform: FlashMQ. It’s basic premise is that you (through one of the helper function that the extension provides) create a mockable wrapper function around a function that you want to mock. That wrapper function will be a one-statement SQL function, so that in many cases it will be inlineable.
Then, when during testing you need to mock (for instance now() and its entire family), you can call the mockable.mock() function, which will replace that wrapper function to return the second argument fed to mockable.mock(), until mock.unmock() is called to restore the original wrapper function.
I released pg_readme a couple of days ago. From the (auto-generated, inception-style, by pg_readme) README.md:
The pg_readme PostgreSQL extension provides functions to generate a README.md document for a database extension or schema, based on COMMENT objects found in the pg_description system catalog.
This is a split-off from the PostgreSQL database structure that I’ve developed for the backend of our managed MQTT hosting platform. There will be many more of split-off PostgreSQL extensions to follow. Although I like my home-grown migration framework (soon to also be released), pushing reusable parts of schemas into separate extensions has a couple of advantages:
The flashmq.com project has afforded me an opportunity to finally play with PostgREST—a technology that, for many years, I had been itching to use, even before I ever heard of PostgREST; the itch started back in the day when Wiebe and I were working with Ruby on Rails and I tried to hack ActiveRecord to propagate PostgreSQL its permissions into the controller layer, wishing to leave permission checking to the RDBMS rather than redoing everything in the controller.
One habit that changed between 2007 (when I was still working with ActiveRecord) and now (with my most recent torture being by Django’s horrid ORM) is that I no longer rely on PostgreSQL its search_path for anything. Instead, I tend to use qualified names for pretty much anything. But, because one should never stop doubting ones own choices, recently I did some digging and testing to see if there really are no valid, safe use cases for the use of unqualified names in PostgreSQL…
Before delving into the use cases, I want to ruminate on the definitional knowledge: firstly, about Postgres schemas in general, and secondly, about the search_path.
SET search_path TO DEFAULT; SELECT current_setting('search_path'), current_schemas(false);
This will give us the search_path setting side-by-side with the effective search path that is derived from it:
SET current_setting | current_schemas -----------------+----------- "$user", public | {public}
SET search_path TO DEFAULT; CREATE TEMPORARY TABLE a (id INT); SELECT current_setting('search_path'), current_schemas(true);
SET CREATE TABLE current_setting | current_schemas -----------------+------------------------------- "$user", public | {pg_temp_3,pg_catalog,public} (1 row)
CREATE SCHEMA need_to_know_basis; CREATE ROLE no_need_to_know; GRANT USAGE ON SCHEMA public TO no_need_to_know; SET ROLE TO no_need_to_know; SET search_path TO need_to_know_basis, "$user", public; SELECT current_setting('search_path'), current_schemas(false);
CREATE SCHEMA CREATE ROLE GRANT SET SET current_setting | current_schemas -------------------------------------+----------------- need_to_know_basis, "$user", public | {public} (1 row)
DROP ROLE IF EXISTS bondjames; DROP SCHEMA IF EXISTS bondjames; CREATE ROLE bondjames; CREATE SCHEMA AUTHORIZATION bondjames; SET ROLE TO bondjames; SET search_path TO DEFAULT; SELECT current_setting('search_path'), current_schemas(false);
CREATE ROLE CREATE SCHEMA SET SET current_setting | current_schemas -----------------+----------------- "$user", public | {bondjames} (1 row)
SET search_path TO DEFAULT; SELECT current_setting('search_path'), current_schemas(true);
SET current_setting | current_schemas -----------------+--------------------- "$user", public | {pg_catalog,public} (1 row)
“This [was] needed to allow a security-definer function to set a truly secure value of search_path. Without it, an unprivileged SQL user can use temporary objects to execute code with the privileges of the security-definer function (CVE-2007-2138). See CREATE FUNCTION for more information.”
SET search_path TO DEFAULT; SELECT current_setting('search_path'); SHOW search_path;
SET current_setting ----------------- "$user", public (1 row) search_path ----------------- "$user", public (1 row)
Like many settings in PostgreSQL, search_path can be set on many levels:
search_path = '"$user", public'
ALTER ROLE { role_specification | ALL } [ IN DATABASE database_name ] SET search_path { TO | = } { value | DEFAULT } ALTER ROLE { role_specification | ALL } [ IN DATABASE database_name ] SET search_path FROM CURRENT ALTER ROLE { role_specification | ALL } [ IN DATABASE database_name ] RESET search_path
ALTER DATABASE name SET search_path { TO | = } { value | DEFAULT }; ALTER DATABASE name SET search_path FROM CURRENT; ALTER DATABASE name RESET search_path;
ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] SET search_path { TO | = } { value | DEFAULT }; ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] SET search_path FROM CURRENT; ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] RESET search_path; ALTER PROCEDURE name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] SET search_path { TO | = } { value | DEFAULT }; ALTER PROCEDURE name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] SET search_path FROM CURRENT; ALTER PROCEDURE name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] RESET search_path;
Importantly, all these settings set defaults. They do not stop the user (or a VOLATILE function/procedure) from overriding that default by issuing:
SET search_path TO evil_schema, pg_catalog;
Or, if the user doesn’t have their own evil_schema:
select set_config('search_path', 'pg_temp, pg_catalog');
In the following example, SUPERUSER wortel innocuously uses lower() in a SECURITY DEFINER function, naively assuming that it’s implicitly referring to pg_catalog.lower():
CREATE ROLE wortel WITH SUPERUSER; SET ROLE wortel; CREATE SCHEMA myauth; GRANT USAGE ON SCHEMA myauth TO PUBLIC; CREATE FUNCTION myauth.validate_credentials(email$ text, password$ text) RETURNS BOOLEAN SECURITY DEFINER LANGUAGE PLPGSQL AS $$ DECLARE normalized_email text; BEGIN normalized_email := lower(email$); -- Something interesting happens here… RETURN TRUE; END; $$; GRANT EXECUTE ON FUNCTION myauth.validate_credentials TO PUBLIC;
wortel creates a new user piet and gives piet his own schema, assuming that he can only do damage to his own little garden:
CREATE ROLE piet; CREATE SCHEMA AUTHORIZATION piet;
However, piet turns out to be wearing his black hat today:
SET ROLE piet; SET search_path TO DEFAULT; SELECT current_setting('search_path'), current_schemas(true); SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password'); CREATE FUNCTION piet.lower(text) RETURNS TEXT LANGUAGE plpgsql AS $$ BEGIN RAISE NOTICE 'I can now do nasty stuff as %!', current_user; RETURN pg_catalog.lower($1); END; $$; SET search_path TO piet, pg_catalog; -- Reverse order; SELECT current_setting('search_path'), current_schemas(true); SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
SET SET current_setting | current_schemas -----------------+------------------- "$user", public | {pg_catalog,piet} (1 row) validate_credentials ---------------------- t (1 row) CREATE FUNCTION SET current_setting | current_schemas ------------------+------------------- piet, pg_catalog | {piet,pg_catalog} (1 row) NOTICE: I can now do nasty stuff as superuser! validate_credentials ---------------------- t (1 row)
This can be mitigated by attaching a specific search_path to the SECURITY DEFINER routine:
RESET ROLE; ALTER FUNCTION myauth.validate_credentials SET search_path = pg_catalog; SET ROLE piet; SET search_path TO piet, pg_catalog; -- Reverse order; SELECT current_setting('search_path'), current_schemas(true); SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
RESET ALTER FUNCTION SET SET current_setting | current_schemas ------------------+------------------- piet, pg_catalog | {piet,pg_catalog} (1 row) validate_credentials ---------------------- t (1 row)
Before the changes made in response to CVE-2007-2138, piet would have been able to mask pg_catalog.lower with his own pg_temp.lower() even if he did not muck around with the search_path. After all, by default, pg_temp is prepended to the very front of the effective search path:
CREATE ROLE wortel WITH SUPERUSER; SET ROLE wortel; CREATE SCHEMA myauth; GRANT USAGE ON SCHEMA myauth TO PUBLIC; CREATE FUNCTION myauth.validate_credentials(email$ text, password$ text) RETURNS BOOLEAN SECURITY DEFINER LANGUAGE PLPGSQL AS $$ DECLARE normalized_email text; BEGIN normalized_email := lower(email$); -- Something interesting happens here… RAISE NOTICE 'My effective search path is: %', current_schemas(true); RETURN true; END; $$; GRANT EXECUTE ON FUNCTION myauth.validate_credentials TO PUBLIC; DROP SCHEMA piet CASCADE; DROP ROLE piet; CREATE ROLE piet; SET ROLE piet; SET search_path TO DEFAULT; SELECT current_setting('search_path'), current_schemas(true); SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password'); CREATE FUNCTION pg_temp.lower(text) RETURNS TEXT LANGUAGE plpgsql AS $$ BEGIN RAISE NOTICE 'I can now do nasty stuff as %!', current_user; RETURN pg_catalog.lower($1); END; $$; SELECT current_setting('search_path'), current_schemas(true); SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password'); RESET ROLE; ALTER FUNCTION myauth.validate_credentials SET search_path = pg_catalog; SET ROLE piet; SELECT current_setting('search_path'), current_schemas(true); SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
CREATE ROLE SET CREATE SCHEMA GRANT CREATE FUNCTION GRANT ERROR: schema "piet" does not exist DROP ROLE CREATE ROLE SET SET current_setting | current_schemas -----------------+----------------- "$user", public | {pg_catalog} (1 row) NOTICE: My effective search path is: {pg_catalog,public} validate_credentials ---------------------- t (1 row) CREATE FUNCTION current_setting | current_schemas -----------------+------------------------ "$user", public | {pg_temp_3,pg_catalog} (1 row) NOTICE: My effective search path is: {pg_temp_3,pg_catalog,public} validate_credentials ---------------------- t (1 row) RESET ALTER FUNCTION SET current_setting | current_schemas -----------------+------------------------ "$user", public | {pg_temp_3,pg_catalog} (1 row) NOTICE: My effective search path is: {pg_temp_3,pg_catalog} validate_credentials ---------------------- t (1 row)
As you can see, now this approach doesn’t work anymore, even though the effective search path includes pg_temp_3 (logically so because it is omitted in the search_path that is SET on the routine level). That is because part of the fix for CVE-2007-2138 was that unqualified routine and function names are never searched for in the temporary schema name. (The other part of the fix was that pg_temp can be listed explicitly in the search_path to also give it lower precedence for relation names.)
Before PostgreSQL 15, there was another way in which piet would have been able to mask pg_catalog.lower(): by creating the function in public, which was, until Pg 15, fair game for all roles to CREATE FUNCTIONs in.
Pro tip: you can use CREATE { FUNCTION | ROUTINE } … SET search_path FROM CURRENT to take the search_path from the migration context.
So, now we know how to control the search_path of tables so that we don’t have to write out every operator within each function in a form like:
SELECT 3 OPERATOR(pg_catalog.+) 4;
How about the DEFAULT expressions for columns in table definitions? And, similarly, what about GENERATED ALWAYS AS expressions?
If a table is wrapped inside a view for security, can evil_user mask, for instance, the now() function with evil_user.now() and have evil_user.now() be executed as the view’s definer?
No:
“Functions called in the view are treated the same as if they had been called directly from the query using the view. Therefore, the user of a view must have permissions to call all functions used by the view. Functions in the view are executed with the privileges of the user executing the query or the function owner, depending on whether the functions are defined as SECURITY INVOKER or SECURITY DEFINER. Thus, for example, calling CURRENT_USER directly in a view will always return the invoking user, not the view owner. This is not affected by the view’s security_invoker setting, and so a view with security_invoker set to false is not equivalent to a SECURITY DEFINER function and those concepts should not be confused.”
But is the same true for when the table is wrapped by one of these MS Certified Suit-style wrapper functions? This would depend on when the unqualified object names are resolved into schema-qualified names—during table definition or during INSERT or UPDATE. So when does this happen? Well, that is a bit difficult to spot:
BEGIN TRANSACTION; CREATE ROLE wortel WITH SUPERUSER; SET ROLE wortel; CREATE SCHEMA protected; CREATE SEQUENCE protected.thingy_id_seq; CREATE TABLE protected.thingy ( id BIGINT PRIMARY KEY DEFAULT nextval('protected.thingy_id_seq') ,id2 UUID UNIQUE DEFAULT gen_random_uuid() ,inserted_by_application NAME DEFAULT current_setting('application_name') ,inserted_at TIMESTAMPTZ DEFAULT now() ,description TEXT ); SELECT pg_get_expr(adbin, 'protected.thingy'::regclass::oid) FROM pg_catalog.pg_attrdef WHERE adrelid = 'protected.thingy'::regclass::oid; CREATE SCHEMA exposed; CREATE FUNCTION exposed.insert_thingy(TEXT) RETURNS BIGINT SECURITY DEFINER LANGUAGE PLPGSQL AS $$ DECLARE _new_id BIGINT; BEGIN RAISE NOTICE 'current_schemas(true) = %', current_schemas(true); INSERT INTO protected.thingy (description) VALUES ($1) RETURNING id INTO _new_id; RETURN _new_id; END; $$; CREATE ROLE evil_user; CREATE SCHEMA AUTHORIZATION evil_user; GRANT USAGE ON SCHEMA exposed TO evil_user; GRANT EXECUTE ON FUNCTION exposed.insert_thingy TO evil_user; SET ROLE evil_user; CREATE FUNCTION evil_user.nextval(regclass) RETURNS BIGINT LANGUAGE PLPGSQL AS $$ BEGIN RAISE NOTICE 'I am now running as: %', current_user; RETURN 666; END; $$; CREATE FUNCTION evil_user.now() RETURNS TIMESTAMPTZ LANGUAGE PLPGSQL AS $$ BEGIN RAISE NOTICE 'I am now running as: %', current_user; RETURN pg_catalog.now(); END; $$; SET search_path TO evil_user, pg_catalog; -- Reverse path SELECT exposed.insert_thingy('This is a thing'); RESET ROLE; SHOW search_path; SELECT pg_get_expr(adbin, 'protected.thingy'::regclass::oid) FROM pg_catalog.pg_attrdef WHERE adrelid = 'protected.thingy'::regclass::oid;
BEGIN CREATE ROLE SET CREATE SCHEMA CREATE SEQUENCE CREATE TABLE pg_get_expr ---------------------------------------------- nextval('protected.thingy_id_seq'::regclass) gen_random_uuid() current_setting('application_name'::text) now() (4 rows) CREATE SCHEMA CREATE FUNCTION CREATE ROLE CREATE SCHEMA GRANT GRANT SET CREATE FUNCTION CREATE FUNCTION SET NOTICE: current_schemas(true) = {evil_user,pg_catalog} insert_thingy --------------- 1 (1 row) RESET search_path ----------------------- evil_user, pg_catalog (1 row) pg_get_expr --------------------------------------------------------- pg_catalog.nextval('protected.thingy_id_seq'::regclass) gen_random_uuid() current_setting('application_name'::text) pg_catalog.now() (4 rows)
As you can see, pg_get_expr() has only disambiguated the functions that evil_user has tried to mask. Conclusion: yes, schema-binding happens when the DDL (CREATE or ALTER TABLE) is executed.
No need to worry about security holes here, as long as you are aware of the search_path with which you are running your migrations. (You are using some sort of migration script, are you not? The only valid answer in the negative is if instead you’re managing your PostgreSQL schemas via extension upgrade scripts.)
Despite the deep-dive in all the caveats above, the search_path is not all bad. We can now be very precise about the context in which it’s reasonably safe to rely on the search_path:
Do note that even without escalated privileges, the unsuspected masking of a function by a malicious function could still result in, for instance, prices for a certain user always amounting to zero, or the random password reset string being not so random. Therefore, here are some context in which you should never rely on the search_path:
That all sounds terribly complicated. That begs the question: are there, in fact, use cases which would justify us to not just blanket schema-qualify ever-y-thing?
If you consider that operators are schema-qualified objects too, you will most likely want to at least allow yourself and your team to forego the typing of 3 + 4 instead of 3 OPERATOR(pg_catalog.+) 4 in the body of your routines. Admittedly, this is not a very strong argument in favor of the search_path, but there is no harm in it as long as your define your routines with SET search_path.
When I define a poymorphic routine with anyelement, anyarray or anyanything pseudo-types in its signature, I want the routine to be able to apply functions and operators to its arguments regardless of whether these (overloaded) functions and operators already existed in a schema that I was aware of at the time of CREATE-ing the routine.
Suppose I have a polymorphic function called sum_price_lines(anyarray):
BEGIN TRANSACTION; CREATE FUNCTION pricing.sum_price_lines(anyarray) RETURNS anyelement LANGUAGE plpgsql AS $$ BEGIN RETURN ( SELECT SUM(net_price(line)) FROM unnest($1) AS line ); END; $$; CREATE FUNCTION pricing.net_price(int) RETURNS int LANGUAGE SQL RETURN $1; SET search_path TO pricing; SELECT pricing.sum_price_lines('{1,2,3,4}'::int[])
sum_price_lines() calls net_price on each individual member of its anyarray argument. With the supplied function, this doesn’t do anything useful, but the creator of the invoicing schema can for instance use this to sum rows of the invoicing.invoice_line table type, simply by creating an overloaded invoicing.net_price(invoicing.invoice_line[]) function and making sure that it can be found in the search_path.
And indeed overloading (also of operators) is a very important use case of the search_path. Being able to define +, -, and * for anything, from order lines, to invoice lines, to quotation lines, and to users even is a valuable tool, that we wouldn’t want to be taken away from us entirely by a lacking understanding of search_path security.
I got re-interested in relying on the search_path rather than on schema-qualified names while I was working on an extension to make mocking the pg_catalog.now() function (or other functions) easy and performant. Because table defaults are schema-bound during CREATE TABLE time, I couldn’t rely on the search_path to mask pg_catalog.now() with mockable.now(). Instead, I use mockable.now() directly, which, when the mocker is not active, is just a thin wrapper around pg_catalog.now(), which the query planner should be able to inline completely without trouble.
But there are other use cases in which schema masking can be very useful. For instance, if you want to offer a specific extension schema to your … extension. Maybe you want to offer a table with factory defaults and rather than have the user TRUNCATE the table (and make upgrades and comparisons to the factory-default data a headache), you allow them (or some function/producedure, possible triggered from a view) to create a table of their own. Let me know if you know or can come up with more good examples.
I wish to express my gratitude towards the ever-solid developers of PostgreSQL, who have kept improving PostgreSQL at a steady, sustainable pace in the 15 years since I last seriously used it in 2007. All the features that I wished were there in 2007 have since been implemented, including, very recently, in PostgreSQL, WITH (security_invoker) for views. A guilty pleasure from my last long-term stint at a wage slave working with Django’s ORM was to every now and then glare at the PostgreSQL feature matrix and fantasize about being able to really unleash this beast without being hindered by the lowest common denominator (MySQL) that ORMs tend to cater to.
And thanks to the excellent PostgREST, I can now do what I dreamt of for years. And yes, for years, I’ve dreamt precisely of using PostgREST rather than all the ridiculously inefficient, verbose, and mismatched layers around SQL that things like the django-rest-framework forced upon me. Indeed, as François-Guillaume Ribreau testified, “it feels like cheating!”
The motivation for deep-diving into PostgreSQL again, after quite a few years wherein the power of PostgreSQL was buffered by Django in my day job, came from an excellent MQTT broker written by my buddy Wiebe. Me and another friend—Jeroen—where so impressed by the speed and robustness of this MQTT server software that we decided to build a managed MQTT hosting business around it. And we’ve been building all this goodness around the FlashMQ FOSS core with a very solid PostgreSQL core, a PostgREST API and a Vue.JS/Nuxt front + back-end. Some daemons hover around the PostgreSQL core and connect directly to it (through dedicated schemas, to confirm to my Application-Exclusive Schema Scoping (AESS) practice), so that it all resembles a very light-weight micro-services architecture.
A live demo of FlashMQ’s performance compared to other MQTT servers:
Download at www.flashmq.org and get started right away.
Recently, I was digging in the possibilities of using PostgreSQL’s set_config() and current_setting() functions when I stumbled upon a Stack Overflow question detailing pretty much the research I was doing for myself. My answer was rather elaborate, which is why I’m linking it here to find it back with more ease than through the SO interface.
© 2024 BigSmoke
Theme by Anders Noren — Up ↑
Recent Comments