| 142fa60f | 21-Oct-2024 |
André Almeida <[email protected]> |
unicode: Recreate utf8_parse_version()
All filesystems that currently support UTF-8 casefold can fetch the UTF-8 version from the filesystem metadata stored on disk. They can get the data stored and
unicode: Recreate utf8_parse_version()
All filesystems that currently support UTF-8 casefold can fetch the UTF-8 version from the filesystem metadata stored on disk. They can get the data stored and directly match it to a integer, so they can skip the string parsing step, which motivated the removal of this function in the first place.
However, for tmpfs, the only way to tell the kernel which UTF-8 version we are about to use is via mount options, using a string. Re-introduce utf8_parse_version() to be used by tmpfs.
This version differs from the original by skipping the intermediate step of copying the version string to an auxiliary string before calling match_token(). This versions calls match_token() in the argument string. The paramenters are simpler now as well.
utf8_parse_version() was created by 9d53690f0d4 ("unicode: implement higher level API for string handling") and later removed by 49bd03cc7e9 ("unicode: pass a UNICODE_AGE() tripple to utf8_load").
Signed-off-by: André Almeida <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Theodore Ts'o <[email protected]> Reviewed-by: Gabriel Krisman Bertazi <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| 66715f00 | 12-Sep-2024 |
Gan Jie <[email protected]> |
unicode: change the reference of database file
Commit 2b3d04787012 ("unicode: Add utf8-data module") changed the database file from 'utf8data.h' to 'utf8data.c' to build separate module, but it seem
unicode: change the reference of database file
Commit 2b3d04787012 ("unicode: Add utf8-data module") changed the database file from 'utf8data.h' to 'utf8data.c' to build separate module, but it seems forgot to update README.utf8data , which may causes confusion. Update the README.utf8data and the default 'UTF8_NAME' in 'mkutf8data.c'.
Signed-off-by: Gan Jie <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|
| 156bb2c5 | 02-Sep-2024 |
André Almeida <[email protected]> |
unicode: Fix utf8_load() error path
utf8_load() requests the symbol "utf8_data_table" and then checks if the requested UTF-8 version is supported. If it's unsupported, it tries to put the data table
unicode: Fix utf8_load() error path
utf8_load() requests the symbol "utf8_data_table" and then checks if the requested UTF-8 version is supported. If it's unsupported, it tries to put the data table using symbol_put(). If an unsupported version is requested, symbol_put() fails like this:
kernel BUG at kernel/module/main.c:786! RIP: 0010:__symbol_put+0x93/0xb0 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die+0x2e/0x50 ? do_trap+0xca/0x110 ? do_error_trap+0x65/0x80 ? __symbol_put+0x93/0xb0 ? exc_invalid_op+0x51/0x70 ? __symbol_put+0x93/0xb0 ? asm_exc_invalid_op+0x1a/0x20 ? __pfx_cmp_name+0x10/0x10 ? __symbol_put+0x93/0xb0 ? __symbol_put+0x62/0xb0 utf8_load+0xf8/0x150
That happens because symbol_put() expects the unique string that identify the symbol, instead of a pointer to the loaded symbol. Fix that by using such string.
Fixes: 2b3d04787012 ("unicode: Add utf8-data module") Signed-off-by: André Almeida <[email protected]> Reviewed-by: Theodore Ts'o <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|
| 68318904 | 24-May-2024 |
Jeff Johnson <[email protected]> |
unicode: add MODULE_DESCRIPTION() macros
Currently 'make W=1' reports: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/unicode/utf8data.o WARNING: modpost: missing MODULE_DESCRIPTION() in fs/un
unicode: add MODULE_DESCRIPTION() macros
Currently 'make W=1' reports: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/unicode/utf8data.o WARNING: modpost: missing MODULE_DESCRIPTION() in fs/unicode/utf8-selftest.o
Add a MODULE_DESCRIPTION() to utf8-selftest.c and utf8data.c_shipped, and update mkutf8data.c to add a MODULE_DESCRIPTION() to any future generated utf8data file.
Signed-off-by: Jeff Johnson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|
| e2a58d2d | 15-Sep-2021 |
Christoph Hellwig <[email protected]> |
unicode: only export internal symbols for the selftests
The exported symbols in utf8-norm.c are not needed for normal file system consumers, so move them to conditional _GPL exports just for the sel
unicode: only export internal symbols for the selftests
The exported symbols in utf8-norm.c are not needed for normal file system consumers, so move them to conditional _GPL exports just for the selftest.
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|
| 6ca99ce7 | 15-Sep-2021 |
Christoph Hellwig <[email protected]> |
unicode: cache the normalization tables in struct unicode_map
Instead of repeatedly looking up the version add pointers to the NFD and NFD+CF tables to struct unicode_map, and pass a unicode_map plu
unicode: cache the normalization tables in struct unicode_map
Instead of repeatedly looking up the version add pointers to the NFD and NFD+CF tables to struct unicode_map, and pass a unicode_map plus index to the functions using the normalization tables.
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|
| fbc59d65 | 15-Sep-2021 |
Christoph Hellwig <[email protected]> |
unicode: move utf8cursor to utf8-selftest.c
Only used by the tests, so no need to keep it in the core.
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Gabriel Krisman Bertazi <krisman@
unicode: move utf8cursor to utf8-selftest.c
Only used by the tests, so no need to keep it in the core.
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|
| 9012d79c | 15-Sep-2021 |
Christoph Hellwig <[email protected]> |
unicode: simplify utf8len
Just use the utf8nlen implementation with a (size_t)-1 len argument, similar to utf8_lookup. Also move the function to utf8-selftest.c, as it isn't used anywhere else.
Si
unicode: simplify utf8len
Just use the utf8nlen implementation with a (size_t)-1 len argument, similar to utf8_lookup. Also move the function to utf8-selftest.c, as it isn't used anywhere else.
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
show more ...
|