|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
aee76cb5 |
| 23-Jul-2022 |
Corentin Jabot <[email protected]> |
[Clang] Add support for Unicode identifiers (UAX31) in C2x mode.
This implements N2836 Identifier Syntax using Unicode Standard Annex 31.
The feature was already implemented for C++, and the semant
[Clang] Add support for Unicode identifiers (UAX31) in C2x mode.
This implements N2836 Identifier Syntax using Unicode Standard Annex 31.
The feature was already implemented for C++, and the semantics are the same.
Unlike C++ there was, afaict, no decision to backport the feature in older languages mode, so C17 and earlier are not modified and the code point tables for these language modes are conserved.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130416
show more ...
|
| #
6882ca9a |
| 13-Jul-2022 |
Corentin Jabot <[email protected]> |
[Clang] Adjust extension warnings for delimited sequences
WG21 approved delimited escape sequences and named escape sequences. Adjust the extension warnings accordingly, and update the release notes
[Clang] Adjust extension warnings for delimited sequences
WG21 approved delimited escape sequences and named escape sequences. Adjust the extension warnings accordingly, and update the release notes.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D129664
show more ...
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
d4892a16 |
| 17-Jun-2022 |
Corentin Jabot <[email protected]> |
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places i
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive otherwise.
This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
show more ...
|
| #
a262f4db |
| 12-Jul-2022 |
Jonas Devlieghere <[email protected]> |
Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit cc309721d20c8e544ae7a10a66735ccf4981a11c because it breaks the following tests on GreenDragon:
TestDataFormatterOb
Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit cc309721d20c8e544ae7a10a66735ccf4981a11c because it breaks the following tests on GreenDragon:
TestDataFormatterObjCCF.py TestDataFormatterObjCExpr.py TestDataFormatterObjCKVO.py TestDataFormatterObjCNSBundle.py TestDataFormatterObjCNSData.py TestDataFormatterObjCNSError.py TestDataFormatterObjCNSNumber.py TestDataFormatterObjCNSURL.py TestDataFormatterObjCPlain.py TestDataFormatterObjNSException.py
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/
show more ...
|
| #
cc309721 |
| 17-Jun-2022 |
Corentin Jabot <[email protected]> |
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places i
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive otherwise.
This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
show more ...
|
| #
50416e54 |
| 09-Jul-2022 |
Corentin Jabot <[email protected]> |
Revert "[Clang] Add a warning on invalid UTF-8 in comments."
It is probable thart this change crashes on the powerpc bots.
This reverts commit 355532a1499aa9b13a89fb5b5caaba2344d57cd7.
|
| #
355532a1 |
| 17-Jun-2022 |
Corentin Jabot <[email protected]> |
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places i
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive otherwise.
This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
show more ...
|
| #
e9fe20da |
| 06-Jul-2022 |
Nico Weber <[email protected]> |
Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit 4174f0ca618b467571b43cff12cbe4c4239670f8.
Also revert follow-up "[Clang] Fix invalid utf-8 detection" This reverts c
Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit 4174f0ca618b467571b43cff12cbe4c4239670f8.
Also revert follow-up "[Clang] Fix invalid utf-8 detection" This reverts commit bf45e27a676d87944f1f13d5f0d0f39935fc4010.
The second commit broke tests, see comments on https://reviews.llvm.org/D129223, and it sounds like the first commit isn't valid without the second one. So reverting both for now.
show more ...
|
| #
4174f0ca |
| 17-Jun-2022 |
Corentin Jabot <[email protected]> |
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places i
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive otherwise.
This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
show more ...
|
| #
fb06dd3e |
| 06-Jul-2022 |
Corentin Jabot <[email protected]> |
Revert "[Clang] Add a warning on invalid UTF-8 in comments."
Reverting while I investigate build failures
This reverts commit e3dc56805f1029dd5959e4c69196a287961afb8d.
|
| #
e3dc5680 |
| 17-Jun-2022 |
Corentin Jabot <[email protected]> |
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places i
[Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments.
Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs.
The warning is off by default as its likely to be somewhat disruptive otherwise.
This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D128059
show more ...
|
| #
c68b8c84 |
| 28-Jun-2022 |
Argyrios Kyrtzidis <[email protected]> |
[Lex] Make sure to notify `MultipleIncludeOpt` for "read tokens" during fast dependency directive lexing
Otherwise a header may be erroneously marked as having a header macro guard and won't get re-
[Lex] Make sure to notify `MultipleIncludeOpt` for "read tokens" during fast dependency directive lexing
Otherwise a header may be erroneously marked as having a header macro guard and won't get re-included.
Differential Revision: https://reviews.llvm.org/D128772
show more ...
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
c92056d0 |
| 04-Apr-2022 |
Corentin Jabot <[email protected]> |
[Clang][C++23] P2071 Named universal character escapes
Implements [[ https://wg21.link/p2071r1 | P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch not wa
[Clang][C++23] P2071 Named universal character escapes
Implements [[ https://wg21.link/p2071r1 | P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch not warn in c++23 mode will be done later once this paper is plenary approved (in July).
We add
* A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` to a space efficient data structure that can be queried in `O(NameLength)` * A set of functions in `Unicode.h` to query that data, including
* A function to find an exact match of a given Unicode character name * A function to perform a loose (ignoring case, space, underscore, medial hyphen) matching * A function returning the best matching codepoint for a given string per edit distance
* Support of `\N{}` escape sequences in String and character Literals, with loose and typos diagnostics/fixits * Support of `\N{}` as UCN with loose matching diagnostics/fixits.
Loose matching is considered an error to match closely the semantics of P2071.
The generated data contributes to 280kB of data to the binaries.
`UnicodeData.txt` and `NameAliases.txt` are not committed to the repository in this patch, and regenerating the data is a manual process.
Reviewed By: tahonermann
Differential Revision: https://reviews.llvm.org/D123064
show more ...
|
| #
fad6e379 |
| 28-May-2022 |
Argyrios Kyrtzidis <[email protected]> |
[Lex] Fix crash during dependency scanning while skipping an unmatched `#if`
|
| #
b4c83a13 |
| 12-May-2022 |
Argyrios Kyrtzidis <[email protected]> |
[Tooling/DependencyScanning & Preprocessor] Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
This is a commit with the following changes:
[Tooling/DependencyScanning & Preprocessor] Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
This is a commit with the following changes:
* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality
Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.
* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.
* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`
This is bringing the following benefits:
* Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources. * Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time. * Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).
For normal preprocessing measurements show differences are below the noise level.
Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.
Differential Revision: https://reviews.llvm.org/D125486 Differential Revision: https://reviews.llvm.org/D125487 Differential Revision: https://reviews.llvm.org/D125488
show more ...
|
| #
e9a902c7 |
| 16-Apr-2022 |
Christopher Di Bella <[email protected]> |
Revert "Revert "Revert "[clang][pp] adds '#pragma include_instead'"""
> Includes regression test for problem noted by @hans. > is reverts commit 973de71. > > Differential Revision: https://reviews.l
Revert "Revert "Revert "[clang][pp] adds '#pragma include_instead'"""
> Includes regression test for problem noted by @hans. > is reverts commit 973de71. > > Differential Revision: https://reviews.llvm.org/D106898
Feature implemented as-is is fairly expensive and hasn't been used by libc++. A potential reimplementation is possible if libc++ become interested in this feature again.
Differential Revision: https://reviews.llvm.org/D123885
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
| #
33ec6530 |
| 08-Feb-2022 |
Timm Bäder <[email protected]> |
[clang][lexer] Allow u8 character literal prefixes in C2x
Implement N2418 for C2x.
Differential Revision: https://reviews.llvm.org/D119221
|
| #
d813116c |
| 01-Mar-2022 |
Dawid Jurczak <[email protected]> |
[NFC][Lexer] Remove getLangOpts function from Lexer
Given that there is only one external user of Lexer::getLangOpts we can remove getter entirely without much pain.
Differential Revision: https://
[NFC][Lexer] Remove getLangOpts function from Lexer
Given that there is only one external user of Lexer::getLangOpts we can remove getter entirely without much pain.
Differential Revision: https://reviews.llvm.org/D120404
show more ...
|
| #
a64d3c60 |
| 25-Feb-2022 |
Dawid Jurczak <[email protected]> |
[NFC][Lexer] Make Lexer::LangOpts const reference
This change can be seen as code cleanup but motivation is more performance related. While browsing perf reports captured during Linux build we can n
[NFC][Lexer] Make Lexer::LangOpts const reference
This change can be seen as code cleanup but motivation is more performance related. While browsing perf reports captured during Linux build we can notice unusual portion of instructions executed in std::vector<std::string> copy constructor like:
0.59% 0.58% clang-14 clang-14 [.] std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector
or even:
1.42% 0.26% clang clang-14 [.] clang::LangOptions::LangOptions | --1.16%--clang::LangOptions::LangOptions | --0.74%--std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector
After more digging we can see that relevant LangOptions std::vector members (*Files, ModuleFeatures and NoBuiltinFuncs) are constructed when Lexer::LangOpts field is initialized on list:
Lexer::Lexer(..., const LangOptions &langOpts, ...) : ..., LangOpts(langOpts),
Since LangOptions copy constructor is called by Lexer(..., const LangOptions &LangOpts,...) and local Lexer objects are created thousands times (in Lexer::getRawToken, Preprocessor::EnterSourceFile and more) during single module processing in frontend it makes std::vector copy constructors surprisingly hot.
Unfortunately even though in current Lexer implementation mentioned std::vector members are unused and most of time empty, no compiler is smart enough to optimize their std::vector copy constructors out (take a look at test assembly): https://godbolt.org/z/hdoxPfMYY even with LTO enabled. However there is simple way to fix this. Since Lexer doesn't access *Files, ModuleFeatures, NoBuiltinFuncs and any other LangOptions fields (but only LangOptionsBase) we can simply get rid of redundant copy constructor assembly by changing LangOpts type to more appropriate const LangOptions reference: https://godbolt.org/z/fP7de9176
Additionally we need to store LineComment outside LangOpts because it's written in SkipLineComment function. Also FormatTokenLexer need to be adjusted a bit to avoid lifetime issues related to passing local LangOpts reference to Lexer.
After this change I can see more than 1% speedup in some of my microbenchmarks when using Clang release binary built with LTO. For Linux build gains are not so significant but still nice at the level of -0.4%/-0.5% instructions drop.
Differential Revision: https://reviews.llvm.org/D120334
show more ...
|
| #
fbe38a78 |
| 22-Feb-2022 |
Dawid Jurczak <[email protected]> |
[NFC][Lexer] Make access to LangOpts more consistent
Before this change without any good reason Lexer::LangOpts is sometimes accessed by getter and another time read directly in Lexer functions. Sin
[NFC][Lexer] Make access to LangOpts more consistent
Before this change without any good reason Lexer::LangOpts is sometimes accessed by getter and another time read directly in Lexer functions. Since getLangOpts is a bit more verbose prefer direct access to LangOpts member when possible.
Differential Revision: https://reviews.llvm.org/D120333
show more ...
|
|
Revision tags: llvmorg-15-init |
|
| #
ff77071a |
| 28-Jan-2022 |
Kadir Cetinkaya <[email protected]> |
[clang][Lexer] Make raw and normal lexer behave the same for line comments
Normally there are heruistics in lexer to treat `//*` specially in language modes that don't have line comments (to emit `/
[clang][Lexer] Make raw and normal lexer behave the same for line comments
Normally there are heruistics in lexer to treat `//*` specially in language modes that don't have line comments (to emit `/`). Unfortunately this only applied to the first occurence of a line comment inside the file, as the subsequent line comments were treated as if language had support for them.
This unfortunately only holds in normal lexing mode, as in raw mode all occurences of line comments received this treatment, which created discrepancies when comparing expanded and spelled tokens.
The proper fix would be to just make sure we treat all the line comments with a subsequent `*` the same way, but it would imply breaking some code that's accepted by clang today. So instead we introduce the same bug into raw lexing mode.
Fixes https://github.com/clangd/clangd/issues/1003.
Differential Revision: https://reviews.llvm.org/D118471
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
298367ee |
| 29-Dec-2021 |
Kazu Hirata <[email protected]> |
[clang] Use nullptr instead of 0 or NULL (NFC)
Identified with modernize-use-nullptr.
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
197576c4 |
| 18-Nov-2021 |
Jan Svoboda <[email protected]> |
[clang][lex] Refactor check for the first file include
This patch refactors the code that checks whether a file has just been included for the first time.
The `HeaderSearch::FirstTimeLexingFile` fu
[clang][lex] Refactor check for the first file include
This patch refactors the code that checks whether a file has just been included for the first time.
The `HeaderSearch::FirstTimeLexingFile` function is removed and the information is threaded to the original call site from `HeaderSearch::ShouldEnterIncludeFile`. This will make it possible to avoid tracking the number of includes in a follow up patch.
Depends on D114092.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D114093
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
274adcb8 |
| 15-Sep-2021 |
Corentin Jabot <[email protected]> |
Implement delimited escape sequences.
\x{XXXX} \u{XXXX} and \o{OOOO} are accepted in all languages mode in characters and string literals.
This is a feature proposed for both C++ (P2290R1) and C (N
Implement delimited escape sequences.
\x{XXXX} \u{XXXX} and \o{OOOO} are accepted in all languages mode in characters and string literals.
This is a feature proposed for both C++ (P2290R1) and C (N2785). The papers have been seen by both committees but are not yet adopted into either standard. However, they do have support from both committees.
show more ...
|
| #
601102d2 |
| 14-Sep-2021 |
Corentin Jabot <[email protected]> |
Cleanup identifier parsing; NFC
Rename methods to clearly signal when they only deal with ASCII, simplify the parsing of identifier, and use start/continue instead of head/body for consistency with
Cleanup identifier parsing; NFC
Rename methods to clearly signal when they only deal with ASCII, simplify the parsing of identifier, and use start/continue instead of head/body for consistency with Unicode terminology.
show more ...
|