Building on your test data, but this works with arbitrary data. This works with any number of elements in the string.
Register a composite type made up of one text
and one integer
value once per database. I call it ai
:
CREATE TYPE ai AS (a text, i int);
The trick is to form an array of ai
from each value in the column.
regexp_matches()
with the pattern (\D*)(\d*)
and the g
option returns one row for every combination of letters and numbers. Plus one irrelevant dangling row with two empty strings '{"",""}'
Filtering or suppressing it would just add cost. Aggregate this into an array, after replacing empty strings (''
) with 0
in the integer
component (as ''
cannot be cast to integer
).
NULL
values sort first – or you have to special case them – or use the whole shebang in a STRICT
function like @Craig proposes.
Postgres 9.4 or later
SELECT data
FROM alnum
ORDER BY ARRAY(SELECT ROW(x[1], CASE x[2] WHEN '' THEN '0' ELSE x[2] END)::ai
FROM regexp_matches(data, '(\D*)(\d*)', 'g') x)
, data;
db<>fiddle here
Postgres 9.1 (original answer)
Tested with PostgreSQL 9.1.5, where regexp_replace()
had a slightly different behavior.
SELECT data
FROM (
SELECT ctid, data, regexp_matches(data, '(\D*)(\d*)', 'g') AS x
FROM alnum
) x
GROUP BY ctid, data -- ctid as stand-in for a missing pk
ORDER BY regexp_replace (left(data, 1), '[0-9]', '0')
, array_agg(ROW(x[1], CASE x[2] WHEN '' THEN '0' ELSE x[2] END)::ai)
, data -- for special case of trailing 0
Add regexp_replace (left(data, 1), '[1-9]', '0')
as first ORDER BY
item to take care of leading digits and empty strings.
If special characters like {}()"',
can occur, you’d have to escape those accordingly.
@Craig’s suggestion to use a ROW
expression takes care of that.
BTW, this won’t execute in sqlfiddle, but it does in my db cluster. JDBC is not up to it. sqlfiddle complains:
Method org.postgresql.jdbc3.Jdbc3Array.getArrayImpl(long,int,Map) is
not yet implemented.
This has since been fixed: http://sqlfiddle.com/#!17/fad6e/1