Support PG_UNICODE_FAST locale in the builtin collation provider.
The PG_UNICODE_FAST locale uses code point sort order (fast,
memcmp-based) combined with Unicode character semantics. The character
semantics are based on Unicode full case mapping.
Full case mapping can map a single codepoint to multiple codepoints,
such as "ร" uppercasing to "SS". Additionally, it handles
context-sensitive mappings like the "final sigma", and it uses
titlecase mappings such as "ว
" when titlecasing (rather than plain
uppercase mappings).
Importantly, the uppercasing of "ร" as "SS" is specifically mentioned
by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics
for case mapping and pattern matching, so if we changed it to use the
PG_UNICODE_FAST locale, it would offer better compliance with the
standard. For now, though, do not change the behavior of UCS_BASIC.
Discussion: https://postgr.es/m/
ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/
27bb0e52-801d-4f73-a0a4-
02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite