D1 feature request: Some option for Unicode collation (specifically non-ASCII case insensitivity) C
D1 feature request: Some option for Unicode collation (specifically non-ASCII case insensitivity)
Could we have the ICU extension enabled?, see https://sqlite.org/src/dir/ext/icu
Sadly, this could affect existing indexes (as discussed at the bottom of the link).
So, another potential option is to implement a custom collation sequence (via create_collation - https://www.sqlite.org/c3ref/create_collation.html ). For example, add a new 'UNICODE_NOCASE' collation that could then be used on columns or specific where clauses with
I'm currently adding duplicate columns with the data passed through Javascript's toLower(), but that's a PITA.
I understand it is ultimately an Sqlite quirk. However, Sqlite provides the above two options, but consumers of D1 have no control of those extensions/APIs which is why some assistance from D1 would be helpful.
Stanislav Trofimchuk wrote a really good background piece, with lots of discussion and example showing use of custom collation (from Python): https://shallowdepth.online/posts/2022/01/5-ways-to-implement-case-insensitive-search-in-sqlite-with-full-unicode-support/ and a great follow up post specifically on performance: https://shallowdepth.online/posts/2022/01/performance-of-application-defined-functions-and-collations-in-sqlite/
Could we have the ICU extension enabled?, see https://sqlite.org/src/dir/ext/icu
Sadly, this could affect existing indexes (as discussed at the bottom of the link).
So, another potential option is to implement a custom collation sequence (via create_collation - https://www.sqlite.org/c3ref/create_collation.html ). For example, add a new 'UNICODE_NOCASE' collation that could then be used on columns or specific where clauses with
... COLLATE UNICODE_NOCASE.I'm currently adding duplicate columns with the data passed through Javascript's toLower(), but that's a PITA.
I understand it is ultimately an Sqlite quirk. However, Sqlite provides the above two options, but consumers of D1 have no control of those extensions/APIs which is why some assistance from D1 would be helpful.
Stanislav Trofimchuk wrote a really good background piece, with lots of discussion and example showing use of custom collation (from Python): https://shallowdepth.online/posts/2022/01/5-ways-to-implement-case-insensitive-search-in-sqlite-with-full-unicode-support/ and a great follow up post specifically on performance: https://shallowdepth.online/posts/2022/01/performance-of-application-defined-functions-and-collations-in-sqlite/

