Postgres Ordering Data Type - A Failed Experiment
I am sharing this unsuccessful experiment to embrace the spirit of acknowledging and learning from our failed attempts. In a feature request discussion on HackerNews, a user named danielheath proposed the idea of a Postgres data type that would allow assigning a specific position to an item/row in a table. The expectation was that Postgres would automatically adjust the positions of other items in the list to accommodate the assigned position.
An example scenario presented was managing a playlist, where a user wants to move a song to a particular position, such as position 3. This feature would relieve the programmer from the burden of manually rearranging the list and allow assigning the desired position directly. When retrieving the results, the item should appear at the assigned position.
Although several solutions were proposed at the time, it was unclear if any of them met the exact requirements of the request. I pondered over possible implementations and eventually devised a method that I believed could achieve the desired outcome. Motivated by this, I decided to code the solution using plain SQL over the weekend, hoping that if successful, I could propose it as a feature for inclusion in the Postgres project.
Below, you can find the code for the solution I developed. The fundamental idea was to dynamically assign positions at runtime. When an update request arrived, I used the requested position along with the current timestamp to assign a value for sorting the results.
Initially, the solution seemed to work, but its limitations soon became
apparent. Towards the end, you can observe that the solution fails to set the
position of ‘Song B’ to 3 under certain conditions. The problem with this
approach, as well as any other potential solutions, is that we require a sorted
list of elements before we can place an item at a desired position. In an
attempt to address this, I partially implemented the set_song_position_v2()
function. However, even if this function were complete and handled all possible
edge cases, it would not fulfill the requested feature. A proper solution
necessitates working with a pre-sorted list, which contradicts the essence of
the feature request—to easily assign the desired position without additional
effort.
In simpler terms, this issue is analogous to trying to obtain the max() value of a column without maintaining an index containing all the rows.
/*
* The playlists table holds all the playlists, and the songs in those
* playlists. This table is denormalized, and it does not have any indexes, to
* keep the code simple and readable.
*
* The position_ts column records the timestamp when the operation occurred.
* This column is then used to resolve conflicts when the song_position of 2
* or more songs is the same.
*/
create table playlists(
playlist_name text,
song_name text,
song_position integer,
position_ts timestamptz default clock_timestamp()
);
> CREATE TABLE
create or replace function update_playlist_position_ts()
returns trigger as
$$
begin
new.position_ts = clock_timestamp();
return new;
end;
$$ language plpgsql;
> CREATE FUNCTION
create or replace trigger update_position_ts
before update of song_position on playlists
for each row
execute function update_playlist_position_ts();
> CREATE TRIGGER
create or replace view playlists_numbered as
select row_number() over (partition by playlist_name order by song_position asc, position_ts desc) as position,
playlist_name,
song_name
from (select playlist_name,
song_name,
song_position,
position_ts
from playlists);
> CREATE VIEW
-- Simpler view definition with similar (but not the same) results as the
-- playlists_numbered view.
create or replace view playlists_ordered as
select row_number() over () as position, *
from (select playlist_name,
song_name
from playlists
order by
playlist_name,
song_position asc,
position_ts desc);
> CREATE VIEW
-- Set the position of a song within a playlist
create or replace function set_song_position(playlist text, song text, new_position int)
returns void as
$$
-- Simply set the 'song_positon' to the desired value; the
-- trigger and the View will do the rest to actually show
-- this song at a position relative to other songs within
-- the playlist.
--
-- TODO: should it throw error if song is not already in the
-- playlist?
update playlists
set song_position = $3
where playlist_name = $1
and song_name = $2;
$$
language sql;
> CREATE FUNCTION
insert into playlists values
('Playlist A', 'Song A', 1),
('Playlist A', 'Song B', 2),
('Playlist A', 'Song C', 3),
('Playlist B', 'Song X', 1),
('Playlist B', 'Song Y', 2);
> INSERT 0 5
select * from playlists_numbered;
> position | playlist_name | song_name
> ----------+---------------+-----------
> 1 | Playlist A | Song A
> 2 | Playlist A | Song B
> 3 | Playlist A | Song C
> 1 | Playlist B | Song X
> 2 | Playlist B | Song Y
> (5 rows)
select set_song_position('Playlist A', 'Song C', 2);
> set_song_position
> -------------------
>
> (1 row)
select * from playlists_numbered
where playlist_name = 'Playlist A';
> position | playlist_name | song_name
> ----------+---------------+-----------
> 1 | Playlist A | Song A
> 2 | Playlist A | Song C
> 3 | Playlist A | Song B
> (3 rows)
select set_song_position('Playlist A', 'Song B', 1);
> set_song_position
> -------------------
>
> (1 row)
select * from playlists_numbered
where playlist_name = 'Playlist A';
> position | playlist_name | song_name
> ----------+---------------+-----------
> 1 | Playlist A | Song B
> 2 | Playlist A | Song A
> 3 | Playlist A | Song C
> (3 rows)
-- Setting a song position to a value more than the number of
-- songs will place the song last.
select set_song_position('Playlist A', 'Song A', 99);
> set_song_position
> -------------------
>
> (1 row)
select * from playlists_numbered
where playlist_name = 'Playlist A';
> position | playlist_name | song_name
> ----------+---------------+-----------
> 1 | Playlist A | Song B
> 2 | Playlist A | Song C
> 3 | Playlist A | Song A
> (3 rows)
-- Inserting a new song at an in-between position places the new
-- song before the song at position 99.
insert into playlists values('Playlist A', 'Song D', 4);
> INSERT 0 1
select * from playlists_numbered
where playlist_name = 'Playlist A';
> position | playlist_name | song_name
> ----------+---------------+-----------
> 1 | Playlist A | Song B
> 2 | Playlist A | Song C
> 3 | Playlist A | Song D
> 4 | Playlist A | Song A
> (4 rows)
-- This fails. We want Song B to be at position 3, but the
-- result shows that Song D shows up at position 3.
select set_song_position('Playlist A', 'Song B', 3);
> set_song_position
> -------------------
>
> (1 row)
select * from playlists_numbered
where playlist_name = 'Playlist A';
> position | playlist_name | song_name
> ----------+---------------+-----------
> 1 | Playlist A | Song C
> 2 | Playlist A | Song B
> 3 | Playlist A | Song D
> 4 | Playlist A | Song A
> (4 rows)
create or replace function set_song_position_v2(p_playlist text, p_song text, p_position int)
returns void as
$$
declare
pos int := 0;
begin
pos := (
select coalesce(min(song_position), p_position) as pos from (
select row_number() over (partition by playlist_name order by song_position asc, position_ts desc) as position,
playlist_name,
song_name,
song_position
from (select playlist_name,
song_name,
song_position,
position_ts
from playlists
where playlist_name = p_playlist
and song_name <> p_song -- exclude the song we're manipulating
))
where position >= p_position
);
raise notice 'Position is %', pos;
update playlists
set song_position = pos
where playlist_name = p_playlist
and song_name = p_song;
end;
$$ language plpgsql;
> CREATE FUNCTION
-- Clean up
drop table playlists cascade; -- cascade drops views of the table
psql:playlists.sql:133: NOTICE: drop cascades to 2 other objects
DETAIL: drop cascades to view playlists_numbered
drop cascades to view playlists_ordered
> DROP TABLE
drop function update_playlist_position_ts();
> DROP FUNCTION
drop function set_song_position(text, text, int);
> DROP FUNCTION
drop function set_song_position_v2(text, text, int);
> DROP FUNCTION