Keep the syntax
CREATE RULE rule_name AS ON event
TO object [WHERE rule_qualification]
DO [INSTEAD] [action | (actions) | NOTHING];
in mind.
In the following, update rules means rules that are defined
ON INSERT, UPDATE or DELETE.
Update rules get applied by the rule system when the result
relation and the command type of a parse tree are equal to the
object and event given in the CREATE RULE command.
For update rules, the rule system creates a list of parse trees.
Initially the parse tree list is empty.
There can be zero (NOTHING keyword), one or multiple actions.
To simplify, we look at a rule with one action. This rule
can have a qualification or not and it can be INSTEAD or not.
What is a rule qualification? It is a restriction that tells
when the actions of the rule should be done and when not. This
qualification can only reference the NEW and/or OLD pseudo relations
which are basically the relation given as object (but with a
special meaning).
So we have four cases that produce the following parse trees for
a one-action rule.
No qualification and not INSTEAD:
No qualification but INSTEAD:
Qualification given and not INSTEAD:
Qualification given and INSTEAD:
The parse tree from the rule action where the rule
qualification and the original parse tree's
qualification have been added.
The original parse tree where the negated rule
qualification has been added.
Finally, if the rule is not INSTEAD, the unchanged original parse tree is
added to the list. Since only qualified INSTEAD rules already add the
original parse tree, we end up with either one or two output parse trees
for a rule with one action.
For ON INSERT rules, the original query (if not suppressed by INSTEAD)
is done before any actions added by rules. This allows the actions to
see the inserted row(s). But for ON UPDATE and ON
DELETE rules, the original query is done after the actions added by rules.
This ensures that the actions can see the to-be-updated or to-be-deleted
rows; otherwise, the actions might do nothing because they find no rows
matching their qualifications.
The parse trees generated from rule actions are thrown into the
rewrite system again and maybe more rules get applied resulting
in more or less parse trees.
So the parse trees in the rule actions must have either another command type
or another result relation. Otherwise this recursive process will end up in a loop.
There is a compiled-in recursion limit of currently 100 iterations.
If after 100 iterations there are still update rules to apply the
rule system assumes a loop over multiple rule definitions and reports
an error.
The parse trees found in the actions of the pg_rewrite
system catalog are only templates. Since they can reference the
range-table entries for NEW and OLD, some substitutions have to be made
before they can be used. For any reference to NEW, the target list of
the original query is searched for a corresponding entry. If found,
that entry's expression replaces the reference. Otherwise
NEW means the same as OLD (for an UPDATE) or is replaced by NULL
(for an INSERT). Any reference to OLD is replaced by a
reference to the range-table entry which is the result relation.
After we are done applying update rules, we apply view rules to the
produced parse tree(s). Views cannot insert new update actions so
there is no need to apply update rules to the output of view rewriting.
We want to trace changes to the sl_avail column in the
shoelace_data relation. So we setup a
log table and a rule that conditionally writes a log entry when
an UPDATE is performed on shoelace_data.
CREATE TABLE shoelace_log (
sl_name char(10), -- shoelace changed
sl_avail integer, -- new available value
log_who text, -- who did it
log_when timestamp -- when
);
CREATE RULE log_shoelace AS ON UPDATE TO shoelace_data
WHERE NEW.sl_avail != OLD.sl_avail
DO INSERT INTO shoelace_log VALUES (
NEW.sl_name,
NEW.sl_avail,
current_user,
current_timestamp
);
Now Al does
al_bundy=> UPDATE shoelace_data SET sl_avail = 6
al_bundy-> WHERE sl_name = 'sl7';
and we look at the log table.
al_bundy=> SELECT * FROM shoelace_log;
sl_name |sl_avail|log_who|log_when
----------+--------+-------+--------------------------------
sl7 | 6|Al |Tue Oct 20 16:14:45 1998 MET DST
(1 row)
That's what we expected. What happened in the background is the following.
The parser created the parse tree (this time the parts of the original
parse tree are highlighted because the base of operations is the
rule action for update rules).
UPDATE shoelace_data SET sl_avail = 6
FROM shoelace_data shoelace_data
WHERE bpchareq(shoelace_data.sl_name, 'sl7');
There is a rule log_shoelace that is ON UPDATE with the rule
qualification expression
int4ne(NEW.sl_avail, OLD.sl_avail)
and one action
INSERT INTO shoelace_log VALUES(
*NEW*.sl_name, *NEW*.sl_avail,
current_user, current_timestamp)
FROM shoelace_data *NEW*, shoelace_data *OLD*;
This is a little strange-looking since you can't normally write
INSERT ... VALUES ... FROM. The FROM clause here is just to indicate
that there are range-table entries in the parse tree for *NEW* and *OLD*.
These are needed so that they can be referenced by variables in the
INSERT command's query tree.
The rule is a qualified non-INSTEAD rule, so the rule system
has to return two parse trees: the modified rule action and the original
parse tree. In the first step the range table of the original query is
incorporated into the rule's action parse tree. This results in
INSERT INTO shoelace_log VALUES(
*NEW*.sl_name, *NEW*.sl_avail,
current_user, current_timestamp)
FROM shoelace_data *NEW*, shoelace_data *OLD*,
shoelace_data shoelace_data;
In step 2 the rule qualification is added to it, so the result set
is restricted to rows where sl_avail changes.
INSERT INTO shoelace_log VALUES(
*NEW*.sl_name, *NEW*.sl_avail,
current_user, current_timestamp)
FROM shoelace_data *NEW*, shoelace_data *OLD*,
shoelace_data shoelace_data
WHERE int4ne(*NEW*.sl_avail, *OLD*.sl_avail);
This is even stranger-looking, since INSERT ... VALUES doesn't have
a WHERE clause either, but the planner and executor will have no
difficulty with it. They need to support this same functionality
anyway for INSERT ... SELECT.
In step 3 the original parse tree's qualification is added,
restricting the result set further to only the rows touched
by the original parse tree.
INSERT INTO shoelace_log VALUES(
*NEW*.sl_name, *NEW*.sl_avail,
current_user, current_timestamp)
FROM shoelace_data *NEW*, shoelace_data *OLD*,
shoelace_data shoelace_data
WHERE int4ne(*NEW*.sl_avail, *OLD*.sl_avail)
AND bpchareq(shoelace_data.sl_name, 'sl7');
Step 4 replaces NEW references by the target list entries from the
original parse tree or by the matching variable references
from the result relation.
INSERT INTO shoelace_log VALUES(
shoelace_data.sl_name, 6,
current_user, current_timestamp)
FROM shoelace_data *NEW*, shoelace_data *OLD*,
shoelace_data shoelace_data
WHERE int4ne(6, *OLD*.sl_avail)
AND bpchareq(shoelace_data.sl_name, 'sl7');
Step 5 changes OLD references into result relation references.
INSERT INTO shoelace_log VALUES(
shoelace_data.sl_name, 6,
current_user, current_timestamp)
FROM shoelace_data *NEW*, shoelace_data *OLD*,
shoelace_data shoelace_data
WHERE int4ne(6, shoelace_data.sl_avail)
AND bpchareq(shoelace_data.sl_name, 'sl7');
That's it. Since the rule is not INSTEAD, we also output the
original parse tree. In short, the output from the rule system
is a list of two parse trees that are the same as the statements:
INSERT INTO shoelace_log VALUES(
shoelace_data.sl_name, 6,
current_user, current_timestamp)
FROM shoelace_data
WHERE 6 != shoelace_data.sl_avail
AND shoelace_data.sl_name = 'sl7';
UPDATE shoelace_data SET sl_avail = 6
WHERE sl_name = 'sl7';
These are executed in this order and that is exactly what
the rule defines. The substitutions and the qualifications
added ensure that if the original query would be, say,
UPDATE shoelace_data SET sl_color = 'green'
WHERE sl_name = 'sl7';
no log entry would get written. This
time the original parse tree does not contain a target list
entry for sl_avail, so NEW.sl_avail will get replaced by
shoelace_data.sl_avail resulting in the extra query
INSERT INTO shoelace_log VALUES(
shoelace_data.sl_name, shoelace_data.sl_avail,
current_user, current_timestamp)
FROM shoelace_data
WHERE shoelace_data.sl_avail != shoelace_data.sl_avail
AND shoelace_data.sl_name = 'sl7';
and that qualification will never be true. It will also
work if the original query modifies multiple rows. So if Al
would issue the command
UPDATE shoelace_data SET sl_avail = 0
WHERE sl_color = 'black';
four rows in fact get updated (sl1, sl2, sl3 and sl4).
But sl3 already has sl_avail = 0. This time, the original
parse trees qualification is different and that results
in the extra parse tree
INSERT INTO shoelace_log SELECT
shoelace_data.sl_name, 0,
current_user, current_timestamp
FROM shoelace_data
WHERE 0 != shoelace_data.sl_avail
AND shoelace_data.sl_color = 'black';
This parse tree will surely insert three new log entries. And
that's absolutely correct.
Here we can see why it is important that the original parse tree is
executed last.
If the UPDATE would have been executed first, all the rows
are already set to zero, so the logging INSERT
would not find any row where 0 != shoelace_data.sl_avail.
A simple way to protect view relations from the mentioned
possibility that someone can try to INSERT, UPDATE and DELETE
on them is to let those parse trees get
thrown away. We create the rules
CREATE RULE shoe_ins_protect AS ON INSERT TO shoe
DO INSTEAD NOTHING;
CREATE RULE shoe_upd_protect AS ON UPDATE TO shoe
DO INSTEAD NOTHING;
CREATE RULE shoe_del_protect AS ON DELETE TO shoe
DO INSTEAD NOTHING;
If Al now tries to do any of these operations on the view
relation shoe, the rule system will
apply the rules. Since the rules have
no actions and are INSTEAD, the resulting list of
parse trees will be empty and the whole query will become
nothing because there is nothing left to be optimized or
executed after the rule system is done with it.
A more sophisticated way to use the rule system is to
create rules that rewrite the parse tree into one that
does the right operation on the real tables. To do that
on the shoelace view, we create
the following rules:
CREATE RULE shoelace_ins AS ON INSERT TO shoelace
DO INSTEAD
INSERT INTO shoelace_data VALUES (
NEW.sl_name,
NEW.sl_avail,
NEW.sl_color,
NEW.sl_len,
NEW.sl_unit);
CREATE RULE shoelace_upd AS ON UPDATE TO shoelace
DO INSTEAD
UPDATE shoelace_data SET
sl_name = NEW.sl_name,
sl_avail = NEW.sl_avail,
sl_color = NEW.sl_color,
sl_len = NEW.sl_len,
sl_unit = NEW.sl_unit
WHERE sl_name = OLD.sl_name;
CREATE RULE shoelace_del AS ON DELETE TO shoelace
DO INSTEAD
DELETE FROM shoelace_data
WHERE sl_name = OLD.sl_name;
Now there is a pack of shoelaces arriving in Al's shop and it has
a big part list. Al is not that good in calculating and so
we don't want him to manually update the shoelace view.
Instead we setup two little tables, one where he can
insert the items from the part list and one with a special
trick. The create commands for these are:
CREATE TABLE shoelace_arrive (
arr_name char(10),
arr_quant integer
);
CREATE TABLE shoelace_ok (
ok_name char(10),
ok_quant integer
);
CREATE RULE shoelace_ok_ins AS ON INSERT TO shoelace_ok
DO INSTEAD
UPDATE shoelace SET
sl_avail = sl_avail + NEW.ok_quant
WHERE sl_name = NEW.ok_name;
Now Al can sit down and do whatever until
al_bundy=> SELECT * FROM shoelace_arrive;
arr_name |arr_quant
----------+---------
sl3 | 10
sl6 | 20
sl8 | 20
(3 rows)
is exactly what's on the part list. We take a quick look
at the current data,
al_bundy=> SELECT * FROM shoelace;
sl_name |sl_avail|sl_color |sl_len|sl_unit |sl_len_cm
----------+--------+----------+------+--------+---------
sl1 | 5|black | 80|cm | 80
sl2 | 6|black | 100|cm | 100
sl7 | 6|brown | 60|cm | 60
sl3 | 0|black | 35|inch | 88.9
sl4 | 8|black | 40|inch | 101.6
sl8 | 1|brown | 40|inch | 101.6
sl5 | 4|brown | 1|m | 100
sl6 | 0|brown | 0.9|m | 90
(8 rows)
move the arrived shoelaces in
al_bundy=> INSERT INTO shoelace_ok SELECT * FROM shoelace_arrive;
and check the results
al_bundy=> SELECT * FROM shoelace ORDER BY sl_name;
sl_name |sl_avail|sl_color |sl_len|sl_unit |sl_len_cm
----------+--------+----------+------+--------+---------
sl1 | 5|black | 80|cm | 80
sl2 | 6|black | 100|cm | 100
sl7 | 6|brown | 60|cm | 60
sl4 | 8|black | 40|inch | 101.6
sl3 | 10|black | 35|inch | 88.9
sl8 | 21|brown | 40|inch | 101.6
sl5 | 4|brown | 1|m | 100
sl6 | 20|brown | 0.9|m | 90
(8 rows)
al_bundy=> SELECT * FROM shoelace_log;
sl_name |sl_avail|log_who|log_when
----------+--------+-------+--------------------------------
sl7 | 6|Al |Tue Oct 20 19:14:45 1998 MET DST
sl3 | 10|Al |Tue Oct 20 19:25:16 1998 MET DST
sl6 | 20|Al |Tue Oct 20 19:25:16 1998 MET DST
sl8 | 21|Al |Tue Oct 20 19:25:16 1998 MET DST
(4 rows)
It's a long way from the one INSERT ... SELECT to these
results. And its description will be the last in this
document (but not the last example :-). First there was the parser's output
INSERT INTO shoelace_ok SELECT
shoelace_arrive.arr_name, shoelace_arrive.arr_quant
FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok;
Now the first rule shoelace_ok_ins is applied and turns it
into
UPDATE shoelace SET
sl_avail = int4pl(shoelace.sl_avail, shoelace_arrive.arr_quant)
FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
shoelace_ok *OLD*, shoelace_ok *NEW*,
shoelace shoelace
WHERE bpchareq(shoelace.sl_name, showlace_arrive.arr_name);
and throws away the original INSERT on shoelace_ok.
This rewritten query is passed to the rule system again and
the second applied rule shoelace_upd produced
UPDATE shoelace_data SET
sl_name = shoelace.sl_name,
sl_avail = int4pl(shoelace.sl_avail, shoelace_arrive.arr_quant),
sl_color = shoelace.sl_color,
sl_len = shoelace.sl_len,
sl_unit = shoelace.sl_unit
FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
shoelace_ok *OLD*, shoelace_ok *NEW*,
shoelace shoelace, shoelace *OLD*,
shoelace *NEW*, shoelace_data showlace_data
WHERE bpchareq(shoelace.sl_name, showlace_arrive.arr_name)
AND bpchareq(shoelace_data.sl_name, shoelace.sl_name);
Again it's an INSTEAD rule and the previous parse tree is trashed.
Note that this query still uses the view shoelace.
But the rule system isn't finished with this loop so it continues
and applies the _RETURN rule on it and we get
UPDATE shoelace_data SET
sl_name = s.sl_name,
sl_avail = int4pl(s.sl_avail, shoelace_arrive.arr_quant),
sl_color = s.sl_color,
sl_len = s.sl_len,
sl_unit = s.sl_unit
FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
shoelace_ok *OLD*, shoelace_ok *NEW*,
shoelace shoelace, shoelace *OLD*,
shoelace *NEW*, shoelace_data showlace_data,
shoelace *OLD*, shoelace *NEW*,
shoelace_data s, unit u
WHERE bpchareq(s.sl_name, showlace_arrive.arr_name)
AND bpchareq(shoelace_data.sl_name, s.sl_name);
Again an update rule has been applied and so the wheel
turns on and we are in rewrite round 3. This time rule
log_shoelace gets applied, producing the extra
parse tree
INSERT INTO shoelace_log SELECT
s.sl_name,
int4pl(s.sl_avail, shoelace_arrive.arr_quant),
current_user,
current_timestamp
FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
shoelace_ok *OLD*, shoelace_ok *NEW*,
shoelace shoelace, shoelace *OLD*,
shoelace *NEW*, shoelace_data showlace_data,
shoelace *OLD*, shoelace *NEW*,
shoelace_data s, unit u,
shoelace_data *OLD*, shoelace_data *NEW*
shoelace_log shoelace_log
WHERE bpchareq(s.sl_name, showlace_arrive.arr_name)
AND bpchareq(shoelace_data.sl_name, s.sl_name);
AND int4ne(int4pl(s.sl_avail, shoelace_arrive.arr_quant), s.sl_avail);
After that the rule system runs out of rules and returns the
generated parse trees.
So we end up with two final parse trees that are equal to the
SQL statements
INSERT INTO shoelace_log SELECT
s.sl_name,
s.sl_avail + shoelace_arrive.arr_quant,
current_user,
current_timestamp
FROM shoelace_arrive shoelace_arrive, shoelace_data shoelace_data,
shoelace_data s
WHERE s.sl_name = shoelace_arrive.arr_name
AND shoelace_data.sl_name = s.sl_name
AND s.sl_avail + shoelace_arrive.arr_quant != s.sl_avail;
UPDATE shoelace_data SET
sl_avail = shoelace_data.sl_avail + shoelace_arrive.arr_quant
FROM shoelace_arrive shoelace_arrive,
shoelace_data shoelace_data,
shoelace_data s
WHERE s.sl_name = shoelace_arrive.sl_name
AND shoelace_data.sl_name = s.sl_name;
The result is that data coming from one relation inserted into another,
changed into updates on a third, changed into updating
a fourth plus logging that final update in a fifth
gets reduced into two queries.
There is a little detail that's a bit ugly. Looking at
the two queries turns out, that the shoelace_data
relation appears twice in the range table where it could definitely
be reduced to one. The planner does not handle it and so the
execution plan for the rule systems output of the INSERT will be
Nested Loop
-> Merge Join
-> Seq Scan
-> Sort
-> Seq Scan on s
-> Seq Scan
-> Sort
-> Seq Scan on shoelace_arrive
-> Seq Scan on shoelace_data
while omitting the extra range table entry would result in a
Merge Join
-> Seq Scan
-> Sort
-> Seq Scan on s
-> Seq Scan
-> Sort
-> Seq Scan on shoelace_arrive
that totally produces the same entries in the log relation.
Thus, the rule system caused one extra scan on the
shoelace_data relation that is
absolutely not necessary. And the same obsolete scan
is done once more in the UPDATE. But it was a really hard
job to make that all possible at all.
A final demonstration of the PostgreSQL
rule system and its power. There is a cute blonde that
sells shoelaces. And what Al could never realize, she's not
only cute, she's smart too - a little too smart. Thus, it
happens from time to time that Al orders shoelaces that
are absolutely not sellable. This time he ordered 1000 pairs
of magenta shoelaces and since another kind is currently not
available but he committed to buy some, he also prepared
his database for pink ones.
al_bundy=> INSERT INTO shoelace VALUES
al_bundy-> ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
al_bundy=> INSERT INTO shoelace VALUES
al_bundy-> ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
Since this happens often, we must lookup for shoelace entries,
that fit for absolutely no shoe sometimes. We could do that in
a complicated statement every time, or we can setup a view
for it. The view for this is
CREATE VIEW shoelace_obsolete AS
SELECT * FROM shoelace WHERE NOT EXISTS
(SELECT shoename FROM shoe WHERE slcolor = sl_color);
Its output is
al_bundy=> SELECT * FROM shoelace_obsolete;
sl_name |sl_avail|sl_color |sl_len|sl_unit |sl_len_cm
----------+--------+----------+------+--------+---------
sl9 | 0|pink | 35|inch | 88.9
sl10 | 1000|magenta | 40|inch | 101.6
For the 1000 magenta shoelaces we must debit Al before we can
throw 'em away, but that's another problem. The pink entry we delete.
To make it a little harder for PostgreSQL,
we don't delete it directly. Instead we create one more view
CREATE VIEW shoelace_candelete AS
SELECT * FROM shoelace_obsolete WHERE sl_avail = 0;
and do it this way:
DELETE FROM shoelace WHERE EXISTS
(SELECT * FROM shoelace_candelete
WHERE sl_name = shoelace.sl_name);
Voilą:
al_bundy=> SELECT * FROM shoelace;
sl_name |sl_avail|sl_color |sl_len|sl_unit |sl_len_cm
----------+--------+----------+------+--------+---------
sl1 | 5|black | 80|cm | 80
sl2 | 6|black | 100|cm | 100
sl7 | 6|brown | 60|cm | 60
sl4 | 8|black | 40|inch | 101.6
sl3 | 10|black | 35|inch | 88.9
sl8 | 21|brown | 40|inch | 101.6
sl10 | 1000|magenta | 40|inch | 101.6
sl5 | 4|brown | 1|m | 100
sl6 | 20|brown | 0.9|m | 90
(9 rows)
A DELETE on a view, with a subselect qualification that
in total uses 4 nesting/joined views, where one of them
itself has a subselect qualification containing a view
and where calculated view columns are used,
gets rewritten into
one single parse tree that deletes the requested data
from a real table.
I think there are only a few situations out in the real
world, where such a construct is necessary. But
it makes me feel comfortable that it works.