History log of /linux-6.15/fs/unicode/utf8-core.c (Results 1 – 11 of 11)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5
# 142fa60f 21-Oct-2024 André Almeida <[email protected]>

unicode: Recreate utf8_parse_version()

All filesystems that currently support UTF-8 casefold can fetch the
UTF-8 version from the filesystem metadata stored on disk. They can get
the data stored and

unicode: Recreate utf8_parse_version()

All filesystems that currently support UTF-8 casefold can fetch the
UTF-8 version from the filesystem metadata stored on disk. They can get
the data stored and directly match it to a integer, so they can skip the
string parsing step, which motivated the removal of this function in the
first place.

However, for tmpfs, the only way to tell the kernel which UTF-8 version
we are about to use is via mount options, using a string. Re-introduce
utf8_parse_version() to be used by tmpfs.

This version differs from the original by skipping the intermediate step
of copying the version string to an auxiliary string before calling
match_token(). This versions calls match_token() in the argument string.
The paramenters are simpler now as well.

utf8_parse_version() was created by 9d53690f0d4 ("unicode: implement
higher level API for string handling") and later removed by 49bd03cc7e9
("unicode: pass a UNICODE_AGE() tripple to utf8_load").

Signed-off-by: André Almeida <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Theodore Ts'o <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7
# 156bb2c5 02-Sep-2024 André Almeida <[email protected]>

unicode: Fix utf8_load() error path

utf8_load() requests the symbol "utf8_data_table" and then checks if the
requested UTF-8 version is supported. If it's unsupported, it tries to
put the data table

unicode: Fix utf8_load() error path

utf8_load() requests the symbol "utf8_data_table" and then checks if the
requested UTF-8 version is supported. If it's unsupported, it tries to
put the data table using symbol_put(). If an unsupported version is
requested, symbol_put() fails like this:

kernel BUG at kernel/module/main.c:786!
RIP: 0010:__symbol_put+0x93/0xb0
Call Trace:
<TASK>
? __die_body.cold+0x19/0x27
? die+0x2e/0x50
? do_trap+0xca/0x110
? do_error_trap+0x65/0x80
? __symbol_put+0x93/0xb0
? exc_invalid_op+0x51/0x70
? __symbol_put+0x93/0xb0
? asm_exc_invalid_op+0x1a/0x20
? __pfx_cmp_name+0x10/0x10
? __symbol_put+0x93/0xb0
? __symbol_put+0x62/0xb0
utf8_load+0xf8/0x150

That happens because symbol_put() expects the unique string that
identify the symbol, instead of a pointer to the loaded symbol. Fix that
by using such string.

Fixes: 2b3d04787012 ("unicode: Add utf8-data module")
Signed-off-by: André Almeida <[email protected]>
Reviewed-by: Theodore Ts'o <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

show more ...


Revision tags: v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2
# 573858e8 07-Mar-2023 Nick Alcock <[email protected]>

unicode: remove MODULE_LICENSE in non-modules

Since commit 8b41fc4454e ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identi

unicode: remove MODULE_LICENSE in non-modules

Since commit 8b41fc4454e ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in the files in this commit, none of which can be built as
modules.

Signed-off-by: Nick Alcock <[email protected]>
Suggested-by: Luis Chamberlain <[email protected]>
Acked-by: Gabriel Krisman Bertazi <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Hitomi Hasegawa <[email protected]>
Cc: Gabriel Krisman Bertazi <[email protected]>
Cc: [email protected]
Signed-off-by: Luis Chamberlain <[email protected]>

show more ...


Revision tags: v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2
# 2b3d0478 15-Sep-2021 Christoph Hellwig <[email protected]>

unicode: Add utf8-data module

utf8data.h contains a large database table which is an auto-generated
decodification trie for the unicode normalization functions.

Allow building it into a separate mo

unicode: Add utf8-data module

utf8data.h contains a large database table which is an auto-generated
decodification trie for the unicode normalization functions.

Allow building it into a separate module.

Based on a patch from Shreeya Patel <[email protected]>.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

show more ...


# 6ca99ce7 15-Sep-2021 Christoph Hellwig <[email protected]>

unicode: cache the normalization tables in struct unicode_map

Instead of repeatedly looking up the version add pointers to the
NFD and NFD+CF tables to struct unicode_map, and pass a
unicode_map plu

unicode: cache the normalization tables in struct unicode_map

Instead of repeatedly looking up the version add pointers to the
NFD and NFD+CF tables to struct unicode_map, and pass a
unicode_map plus index to the functions using the normalization
tables.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

show more ...


# 49bd03cc 15-Sep-2021 Christoph Hellwig <[email protected]>

unicode: pass a UNICODE_AGE() tripple to utf8_load

Don't bother with pointless string parsing when the caller can just pass
the version in the format that the core expects. Also remove the
fallback

unicode: pass a UNICODE_AGE() tripple to utf8_load

Don't bother with pointless string parsing when the caller can just pass
the version in the format that the core expects. Also remove the
fallback to the latest version that none of the callers actually uses.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

show more ...


# a440943e 15-Sep-2021 Christoph Hellwig <[email protected]>

unicode: remove the charset field from struct unicode_map

It is hardcoded and only used for a f2fs sysfs file where it can be
hardcoded just as easily.

Signed-off-by: Christoph Hellwig <[email protected]>

unicode: remove the charset field from struct unicode_map

It is hardcoded and only used for a f2fs sysfs file where it can be
hardcoded just as easily.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

show more ...


Revision tags: v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5
# 3d7bfea8 08-Jul-2020 Daniel Rosenberg <[email protected]>

unicode: Add utf8_casefold_hash

This adds a case insensitive hash function to allow taking the hash
without needing to allocate a casefolded copy of the string.

The existing d_hash implementations

unicode: Add utf8_casefold_hash

This adds a case insensitive hash function to allow taking the hash
without needing to allocate a casefolded copy of the string.

The existing d_hash implementations for casefolding allocate memory
within rcu-walk, by avoiding it we can be more efficient and avoid
worrying about a failed allocation.

Signed-off-by: Daniel Rosenberg <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>

show more ...


Revision tags: v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8
# aa28b98d 06-Sep-2019 Colin Ian King <[email protected]>

unicode: make array 'token' static const, makes object smaller

Don't populate the array 'token' on the stack but instead make it
static const. Makes the object code smaller by 234 bytes.

Before:

unicode: make array 'token' static const, makes object smaller

Don't populate the array 'token' on the stack but instead make it
static const. Makes the object code smaller by 234 bytes.

Before:
text data bss dec hex filename
5371 272 0 5643 160b fs/unicode/utf8-core.o

After:
text data bss dec hex filename
5041 368 0 5409 1521 fs/unicode/utf8-core.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Theodore Ts'o <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

show more ...


Revision tags: v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6
# 3ae72562 20-Jun-2019 Gabriel Krisman Bertazi <[email protected]>

ext4: optimize case-insensitive lookups

Temporarily cache a casefolded version of the file name under lookup in
ext4_filename, to avoid repeatedly casefolding it. I got up to 30%
speedup on lookups

ext4: optimize case-insensitive lookups

Temporarily cache a casefolded version of the file name under lookup in
ext4_filename, to avoid repeatedly casefolding it. I got up to 30%
speedup on lookups of large directories (>100k entries), depending on
the length of the string under lookup.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>

show more ...


Revision tags: v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7
# 9d53690f 25-Apr-2019 Gabriel Krisman Bertazi <[email protected]>

unicode: implement higher level API for string handling

This patch integrates the utf8n patches with some higher level API to
perform UTF-8 string comparison, normalization and casefolding
operation

unicode: implement higher level API for string handling

This patch integrates the utf8n patches with some higher level API to
perform UTF-8 string comparison, normalization and casefolding
operations. Implemented is a variation of NFD, and casefold is
performed by doing full casefold on top of NFD. These algorithms are
based on the core implemented by Olaf Weber from SGI.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>

show more ...