History log of /llvm-project-15.0.7/clang-tools-extra/pseudo/lib/grammar/LRTableBuild.cpp (Results 1 – 12 of 12)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# aba43035 23-Jul-2022 Dmitri Gribenko <[email protected]>

Use llvm::sort instead of std::sort where possible

llvm::sort is beneficial even when we use the iterator-based overload,
since it can optionally shuffle the elements (to detect
non-determinism). Ho

Use llvm::sort instead of std::sort where possible

llvm::sort is beneficial even when we use the iterator-based overload,
since it can optionally shuffle the elements (to detect
non-determinism). However llvm::sort is not usable everywhere, for
example, in compiler-rt.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D130406

show more ...


# 31211674 29-Jun-2022 Sam McCall <[email protected]>

[pseudo] Add error-recovery framework & brace-based recovery

The idea is:

- a parse failure is detected when all heads die when trying to shift the next token
- we can recover by choosing a nonterm

[pseudo] Add error-recovery framework & brace-based recovery

The idea is:

- a parse failure is detected when all heads die when trying to shift the next token
- we can recover by choosing a nonterminal we're partway through parsing, and
determining where it ends through nonlocal means (e.g. matching brackets)
- we can find candidates by walking up the stack from the (ex-)heads
- the token range is defined using heuristics attached to grammar rules
- the unparsed region is represented in the forest by an Opaque node

This patch has the core GLR functionality.
It does not allow recovery heuristics to be attached as extensions to
the grammar, but rather infers a brace-based heuristic.

Expected followups:

- make recovery heuristics grammar extensions (depends on D127448)
- add recovery to our grammar for bracketed constructs and sequence nodes
- change the structure of our augmented `_ := start` rules to eliminate some
special-cases in glrParse.
- (if I can work out how): avoid some spurious recovery cases described in comments

(Previously mistakenly committed as a0f4c10ae227a62c2a63611e64eba83f0ff0f577)

Differential Revision: https://reviews.llvm.org/D128486

show more ...


# 9fbf1107 04-Jul-2022 Sam McCall <[email protected]>

[pseudo] Eliminate LRTable::Action. NFC

The last remaining uses are in tests/test builders.
Replace with a builder struct.

Differential Revision: https://reviews.llvm.org/D129093


# b37dafd5 24-Jun-2022 Sam McCall <[email protected]>

[pseudo] Store shift and goto actions in a compact structure with faster lookup.

The actions table is very compact but the binary search to find the
correct action is relatively expensive.
A hashtab

[pseudo] Store shift and goto actions in a compact structure with faster lookup.

The actions table is very compact but the binary search to find the
correct action is relatively expensive.
A hashtable is faster but pretty large (64 bits per value, plus empty
slots, and lookup is constant time but not trivial due to collisions).

The structure in this patch uses 1.25 bits per entry (whether present or absent)
plus the size of the values, and lookup is trivial.

The Shift table is 119KB = 27KB values + 92KB keys.
The Goto table is 86KB = 30KB values + 57KB keys.
(Goto has a smaller keyspace as #nonterminals < #terminals, and more entries).

This patch improves glrParse speed by 28%: 4.69 => 5.99 MB/s
Overall the table grows by 60%: 142 => 228KB.

By comparison, DenseMap<unsigned, StateID> is "only" 16% faster (5.43 MB/s),
and results in a 285% larger table (547 KB) vs the baseline.

Differential Revision: https://reviews.llvm.org/D128485

show more ...


# 743971fa 28-Jun-2022 Sam McCall <[email protected]>

Revert "[pseudo] Add error-recovery framework & brace-based recovery"

This reverts commit a0f4c10ae227a62c2a63611e64eba83f0ff0f577.
This commit hadn't been reviewed yet, and was unintentionally incl

Revert "[pseudo] Add error-recovery framework & brace-based recovery"

This reverts commit a0f4c10ae227a62c2a63611e64eba83f0ff0f577.
This commit hadn't been reviewed yet, and was unintentionally included
on another branch.

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# a0f4c10a 08-Jun-2022 Sam McCall <[email protected]>

[pseudo] Add error-recovery framework & brace-based recovery

The idea is:
- a parse failure is detected when all heads die when trying to shift
the next token
- we can recover by choosing a non

[pseudo] Add error-recovery framework & brace-based recovery

The idea is:
- a parse failure is detected when all heads die when trying to shift
the next token
- we can recover by choosing a nonterminal we're partway through parsing,
and determining where it ends through nonlocal means (e.g. matching brackets)
- we can find candidates by walking up the stack from the (ex-)heads
- the token range is defined using heuristics attached to grammar rules
- the unparsed region is represented in the forest by an Opaque node

This patch has the core GLR functionality.
It does not allow recovery heuristics to be attached as extensions to
the grammar, but rather infers a brace-based heuristic.

Expected followups:
- make recovery heuristics grammar extensions (depends on D127448)
- add recover to our grammar for bracketed constructs and sequence nodes
- change the structure of our augmented `_ := start` rules to eliminate
some special-cases in glrParse.
- (if I can work out how): avoid some spurious recovery cases described
in comments
- grammar changes to eliminate the hard distinction between init-list
and designated-init-list shown in the recovery-init-list.cpp testcase

Differential Revision: https://reviews.llvm.org/D128486

show more ...


# 85eaecbe 23-Jun-2022 Sam McCall <[email protected]>

[pseudo] Check follow-sets instead of tying reduce actions to lookahead tokens.

Previously, the action table stores a reduce action for each lookahead
token it should allow. These tokens are the fol

[pseudo] Check follow-sets instead of tying reduce actions to lookahead tokens.

Previously, the action table stores a reduce action for each lookahead
token it should allow. These tokens are the followSet(action.rule.target).

In practice, the follow sets are large, so we spend a bunch of time binary
searching around all these essentially-duplicates to check whether our lookahead
token is there.
However the number of reduces for a given state is very small, so we're
much better off linear scanning over them and performing a fast check for each.

D128318 was an attempt at this, storing a bitmap for each reduce.
However it's even more compact just to use the follow sets directly, as
there are fewer nonterminals than (state, rule) pairs. It's also faster.

This specialized approach means unbundling Reduce from other actions in
LRTable, so it's no longer useful to support it in Action. I suspect
Action will soon go away, as we store each kind of action separately.

This improves glrParse speed by 42% (3.30 -> 4.69 MB/s).
It also reduces LR table size by 59% (343 -> 142kB).

Differential Revision: https://reviews.llvm.org/D128472

show more ...


# b70ee9d9 23-Jun-2022 Sam McCall <[email protected]>

Reland "[pseudo] Track heads as GSS nodes, rather than as "pending actions"."

This reverts commit 2c80b5319870b57fbdbb6c9cef9c86c26c65371d.

Fixes LRTable::buildForTest to create states that are ref

Reland "[pseudo] Track heads as GSS nodes, rather than as "pending actions"."

This reverts commit 2c80b5319870b57fbdbb6c9cef9c86c26c65371d.

Fixes LRTable::buildForTest to create states that are referenced but
have no actions.

show more ...


# c70aeaad 09-Jun-2022 Haojian Wu <[email protected]>

[pseudo] Move grammar-related headers to a separate dir, NFC.

We did that for .cpp, but forgot the headers.

Differential Revision: https://reviews.llvm.org/D127388


# 7a05942d 08-Jun-2022 Haojian Wu <[email protected]>

[pseudo] Remove the explicit Accept actions.

As pointed out in the previous review section, having a dedicated accept
action doesn't seem to be necessary. This patch implements the the same behavior

[pseudo] Remove the explicit Accept actions.

As pointed out in the previous review section, having a dedicated accept
action doesn't seem to be necessary. This patch implements the the same behavior
without accept acction, which will save some code complexity.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D125677

show more ...


# 93bcff8a 03-Jun-2022 Sam McCall <[email protected]>

[pseudo] Invert rows/columns of LRTable storage for speedup. NFC

There are more states than symbols.
This means first partioning the action list by state leaves us with a smaller
range to binary sea

[pseudo] Invert rows/columns of LRTable storage for speedup. NFC

There are more states than symbols.
This means first partioning the action list by state leaves us with a smaller
range to binary search over. This improves find() a lot and glrParse() by 7%.
The tradeoff is storing more smaller ranges increases the size of the offsets
array, overall grammar memory is +1% (337->340KB).

Before:
glrParse 188795975 ns 188778003 ns 77 bytes_per_second=1.98068M/s
After:
glrParse 175936203 ns 175916873 ns 81 bytes_per_second=2.12548M/s

Differential Revision: https://reviews.llvm.org/D127006

show more ...


Revision tags: llvmorg-14.0.4
# cd2292ef 24-May-2022 Haojian Wu <[email protected]>

[pseudo] A basic implementation of compiling cxx grammar at build time.

The main idea is to compile the cxx grammar at build time, and construct
the core pieces (Grammar, LRTable) of the pseudoparse

[pseudo] A basic implementation of compiling cxx grammar at build time.

The main idea is to compile the cxx grammar at build time, and construct
the core pieces (Grammar, LRTable) of the pseudoparse based on the compiled
data sources.

This is a tiny implementation, which is good for start:

- defines how the public API should look like;
- integrates the cxx grammar compilation workflow with the cmake system.
- onlynonterminal symbols of the C++ grammar are compiled, anything
else are still doing the real compilation work at runtime, we can opt-in more
bits in the future;
- splits the monolithic clangPsuedo library for better layering;

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D125667

show more ...