This document describes how to implement style lint rules.
Before you begin, familiarize yourself with:
- Join the developer's mailing list: [email protected] (join)
- Use the GitHub issue tracker to discuss and comment.
Whose style guide serves as a reference for lint rules? Everyone and anyone's. Every team may set its own guidelines on what constitutes correct style. The style linter hosts an ever-growing library of rules, but you decide which rules or configurations best suit your project.
- Identify error-prone constructs, including those that may lead to production bugs
- Have few exceptions, low risk of false positives, and low frequency of lint waivers
- Reduce the number of choices for ways to express a concept
- Rules that enforce SV-LRM conformance should be implemented in actual compilers, but sometimes can be enforced in style linters as well.
The major classes of text analyses available today are:
- LineLintRule analyzes text one line at a time. Examples:
- TokenStreamLintRule scans one token at a time. Examples:
- SyntaxTreeLintRule analyzes the syntax trees, examining tree nodes and leaves. The vast majority of SystemVerilog lint rules fall under this category. The Analysis Tools section describes various syntax tree analysis tools.
- TextStructureLintRule analyzes an entire TextStructureView in any manner. This is the most flexible analyzer that can access all of the structured views of the analyzed text. It is best suited for rules that:
  - need access to more than one of the following forms: lines, tokens, syntax-tree
  - can be implemented efficiently without having to traverse any of the aforementioned forms.

  Examples: module_filename_rule, line_length_rule, and posix_eof_rule.
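To make the one-line-at-a-time model concrete, here is a minimal, self-contained sketch of a line-based rule. The names `LineFinding`, `NoTabsSketchRule`, and `LintLines` are hypothetical and exist only for this illustration; they are not part of the real LineLintRule interface, which should be consulted in the source tree.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical finding type for this sketch (not the real LintViolation).
struct LineFinding {
  std::size_t line;    // 0-based line index
  std::size_t column;  // 0-based byte offset within the line
};

// Hypothetical line-based rule: flags the first tab character on each line.
class NoTabsSketchRule {
 public:
  // Called once per line, in order, mirroring the one-line-at-a-time model.
  void HandleLine(const std::string& line) {
    const std::size_t pos = line.find('\t');
    if (pos != std::string::npos) {
      findings_.push_back(LineFinding{line_index_, pos});
    }
    ++line_index_;
  }

  const std::vector<LineFinding>& findings() const { return findings_; }

 private:
  std::size_t line_index_ = 0;
  std::vector<LineFinding> findings_;
};

// Feeds the text to the rule one line at a time and returns its findings.
std::vector<LineFinding> LintLines(const std::string& text) {
  NoTabsSketchRule rule;
  std::istringstream stream(text);
  std::string line;
  while (std::getline(stream, line)) rule.HandleLine(line);
  return rule.findings();
}
```

A rule of this shape needs no tokens or syntax tree, which is why the line-based class is the cheapest choice when it suffices.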
For complete links to examples of each of the above lint rule classes, click on the class definition and navigate to "Extended By" inside the "Cross References" panel in the code search viewer.
This section describes various tools and libraries for analyzing and querying syntax trees.
Use verible-verilog-syntax --printtree to examine the syntax structure of examples of code of interest.
TreeContextVisitor is a syntax tree visitor that maintains a stack of ancestor nodes as it traverses nodes and leaves. Querying this stack lets you determine the context at any node.
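The idea of a context-tracking visitor can be sketched on a toy tree. Everything below (`ToyNode`, `ContextTrackingVisitor`, `AlwaysInGenerateCounter`, and the string tags) is a hypothetical stand-in chosen for illustration; the real TreeContextVisitor operates on the CST node types and has its own interface.

```cpp
#include <string>
#include <vector>

// Toy tree node for this sketch; the real syntax tree types differ.
struct ToyNode {
  std::string tag;
  std::vector<ToyNode> children;
};

// Visitor that keeps the chain of ancestors of the node being visited.
class ContextTrackingVisitor {
 public:
  virtual ~ContextTrackingVisitor() = default;

  void Visit(const ToyNode& node) {
    HandleNode(node);           // context_ holds ancestors only at this point
    context_.push_back(&node);  // node becomes an ancestor of its children
    for (const ToyNode& child : node.children) Visit(child);
    context_.pop_back();        // restore the stack on the way back up
  }

 protected:
  virtual void HandleNode(const ToyNode& node) = 0;

  // True if any ancestor of the current node carries the given tag.
  bool IsInside(const std::string& tag) const {
    for (const ToyNode* ancestor : context_) {
      if (ancestor->tag == tag) return true;
    }
    return false;
  }

 private:
  std::vector<const ToyNode*> context_;  // outermost ancestor first
};

// Example use: counts "always" nodes that appear inside a "generate" node.
class AlwaysInGenerateCounter : public ContextTrackingVisitor {
 public:
  int count = 0;

 protected:
  void HandleNode(const ToyNode& node) override {
    if (node.tag == "always" && IsInside("generate")) ++count;
  }
};
```

Because the stack is maintained by the base class, each rule only has to ask questions about the context rather than track it.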
The SV concrete syntax tree (CST) is described here. The CST library contains a number of useful GetXFromY-type accessor functions. These functions offer the most direct way of extracting information from syntax tree nodes. Accessor functions are useful when you've already narrowed down your search to one or a few specific types (enums) of syntax tree nodes.
Pros:
- Fast, because no search is involved.
Cons:
- You may have to write some new CST accessor functions if what you need doesn't already exist.
SearchSyntaxTree is a generic search function for identifying all syntax tree nodes that satisfy a given predicate. Searching with this function yields TreeSearchMatch objects that point to syntax tree nodes/leaves and include the context in which the node matched.
Pros:
- Good for finding nodes when you don't know (or don't care) where in a subtree they could appear.
- More resilient (but not immune) to CST restructuring.
Cons:
- Slower than direct access because a search always visits every subnode and leaf.
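As a rough illustration of predicate-based searching, the sketch below walks a toy tree and collects every matching node together with its ancestor chain. `ToyNode`, `ToyMatch`, and `SearchToyTree` are hypothetical names for this example only; they are not the signatures of SearchSyntaxTree or TreeSearchMatch.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy tree node for this sketch; the real syntax tree types differ.
struct ToyNode {
  std::string tag;
  std::vector<ToyNode> children;
};

// Hypothetical analogue of a search match: the node plus its context.
struct ToyMatch {
  const ToyNode* node;
  std::vector<const ToyNode*> context;  // ancestors, outermost first
};

using Predicate = std::function<bool(const ToyNode&)>;

// Pre-order traversal that visits every node, recording matches as it goes.
void SearchImpl(const ToyNode& node, const Predicate& pred,
                std::vector<const ToyNode*>& context,
                std::vector<ToyMatch>& out) {
  if (pred(node)) out.push_back(ToyMatch{&node, context});
  context.push_back(&node);
  for (const ToyNode& child : node.children) {
    SearchImpl(child, pred, context, out);
  }
  context.pop_back();
}

// Returns all nodes under (and including) root that satisfy the predicate.
std::vector<ToyMatch> SearchToyTree(const ToyNode& root,
                                    const Predicate& pred) {
  std::vector<const ToyNode*> context;
  std::vector<ToyMatch> matches;
  SearchImpl(root, pred, context, matches);
  return matches;
}
```

The traversal never stops early, which mirrors the cost noted above: the whole subtree is visited regardless of how many nodes match.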
The CST Matcher library provides a convenient way to create matcher objects that describe certain syntactic patterns.
The Syntax Tree Matcher library uses some principles from Clang's ASTMatcher Library.
Pros:
- Expressive and composable.
- More resilient to CST positional substructure changes such as changing child ranks.
Cons:
- Can be expensive due to searching.
For every SystemVerilog CST node enum, we produce a corresponding node-matcher in verilog_matchers.h that finds that node type. For example, NodekFunctionDeclaration matches nodes tagged kFunctionDeclaration. These are defined using TagMatchBuilder.
Path matchers are a shorthand for expressing a match on an ancestral chain of node types. For example, MakePathMatcher({X, Y, Z}), where X, Y, and Z are CST tags for specific node types, creates a matcher that will find nodes of type X that directly contain a Y child that directly contains a Z child.
verilog_matchers.h contains several examples.
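The direct-child chain that a path matcher describes can be sketched on a toy tree as follows. `ToyNode`, `MatchesPathFrom`, and `FindPathMatches` are hypothetical helpers written for this illustration and use plain string tags; they are not how MakePathMatcher is implemented.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy tree node for this sketch; the real syntax tree types differ.
struct ToyNode {
  std::string tag;
  std::vector<ToyNode> children;
};

// True if node has tag path[index] and some chain of direct children
// carries the remaining tags path[index + 1], path[index + 2], ...
bool MatchesPathFrom(const ToyNode& node, const std::vector<std::string>& path,
                     std::size_t index) {
  if (index >= path.size() || node.tag != path[index]) return false;
  if (index + 1 == path.size()) return true;  // whole path consumed
  for (const ToyNode& child : node.children) {
    if (MatchesPathFrom(child, path, index + 1)) return true;
  }
  return false;
}

// Collects every node in the tree at which the path matches.
void FindPathMatches(const ToyNode& node, const std::vector<std::string>& path,
                     std::vector<const ToyNode*>& out) {
  if (MatchesPathFrom(node, path, 0)) out.push_back(&node);
  for (const ToyNode& child : node.children) {
    FindPathMatches(child, path, out);
  }
}
```

Note that an X node containing Z directly, without an intervening Y, does not match the path {X, Y, Z}.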
Every matcher object can accept inner matchers that can refine matching conditions and narrow search results. Composing matchers looks like OuterMatcher(InnerMatcher(...), ...), which would return a positive match on a node that matches OuterMatcher, whose subtree also satisfies InnerMatcher.
Matcher operators are functions described in core_matchers.h.
Summary:
- AllOf(...) matches positively if all of its inner matchers match positively.
- AnyOf(...) matches positively if any of its inner matchers match positively.
- Unless(...) matches positively if its inner matcher does not match.
TagMatchBuilder combines its inner matchers with AllOf by default, so you can write NodekFoo(InnerMatcher1(), InnerMatcher2()) instead of the equivalent NodekFoo(AllOf(InnerMatcher1(), InnerMatcher2())).
The order of the inner matchers to the above functions is inconsequential to the match result; they are fully commutative.
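The semantics of the three operators, including their insensitivity to argument order, can be sketched with plain predicates. In this sketch `Matcher` is simply a `std::function`, and `HasTag` and `HasChildTagged` are hypothetical inner matchers invented for the example; the real operators in core_matchers.h work on matcher objects with a different interface.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy tree node for this sketch; the real syntax tree types differ.
struct ToyNode {
  std::string tag;
  std::vector<ToyNode> children;
};

// In this sketch a matcher is just a predicate on a node.
using Matcher = std::function<bool(const ToyNode&)>;

// Hypothetical leaf matchers used to demonstrate composition.
Matcher HasTag(std::string tag) {
  return [tag](const ToyNode& node) { return node.tag == tag; };
}

Matcher HasChildTagged(std::string tag) {
  return [tag](const ToyNode& node) {
    for (const ToyNode& child : node.children) {
      if (child.tag == tag) return true;
    }
    return false;
  };
}

// Matches if every inner matcher matches.
Matcher AllOf(std::vector<Matcher> inner) {
  return [inner](const ToyNode& node) {
    for (const Matcher& m : inner) {
      if (!m(node)) return false;
    }
    return true;
  };
}

// Matches if at least one inner matcher matches.
Matcher AnyOf(std::vector<Matcher> inner) {
  return [inner](const ToyNode& node) {
    for (const Matcher& m : inner) {
      if (m(node)) return true;
    }
    return false;
  };
}

// Matches if the inner matcher does not match.
Matcher Unless(Matcher inner) {
  return [inner](const ToyNode& node) { return !inner(node); };
}
```

Swapping the arguments of `AllOf` or `AnyOf` changes only the evaluation order, never the result, as long as the inner matchers have no side effects.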
Many matchers, called BindableMatchers, support binding to user-provided names. This lets you save interesting subtree positions found during the match and retrieve them from a BoundSymbolManager.
Example using .Bind().
When you've determined that the code being analyzed matches a pattern of interest, record a LintViolation object.
Narrow down the location of the offending substring of text as much as possible so that users can see precisely what is wrong. You can select a whole block of text, a syntax node subtree, a single token, or even a substring within a token.
Include a diagnostic message that describes the problem, citing a passage from a style guide. Recommend a corrective action where you can.
Each lint rule that analyzes code produces a LintRuleStatus that contains a set of LintViolations.
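The advice to narrow the reported location can be illustrated with a small sketch that flags only the trailing whitespace of a line rather than the whole line. `ToyViolation`, `ToyStatus`, and `FindTrailingSpaces` are hypothetical types and functions for this example; the real LintViolation and LintRuleStatus carry different fields and should be looked up in the source.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical violation record: a byte range in the text plus a message.
struct ToyViolation {
  std::size_t offset;  // start of the offending substring
  std::size_t length;  // size of the offending substring
  std::string message;

  // Orders violations by position so the set iterates in text order.
  bool operator<(const ToyViolation& other) const {
    return offset < other.offset;
  }
};

// Hypothetical status object holding the set of violations for one rule.
struct ToyStatus {
  std::set<ToyViolation> violations;
};

// Reports only the trailing spaces and tabs on each line, not the full line.
ToyStatus FindTrailingSpaces(const std::string& text) {
  ToyStatus status;
  std::size_t line_start = 0;
  while (line_start < text.size()) {
    std::size_t line_end = text.find('\n', line_start);
    if (line_end == std::string::npos) line_end = text.size();
    // Walk backwards over trailing whitespace to find where it begins.
    std::size_t last = line_end;
    while (last > line_start &&
           (text[last - 1] == ' ' || text[last - 1] == '\t')) {
      --last;
    }
    if (last < line_end) {
      status.violations.insert(
          ToyViolation{last, line_end - last, "Remove trailing whitespace."});
    }
    line_start = line_end + 1;
  }
  return status;
}
```

The message also names the corrective action, in line with the recommendation above.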
A typical set of lint rule tests follows this template:

    TEST(LintRuleNameTest, Various) {
      const std::initializer_list<LintTestCase> kTestCases = {
          ...
      };
      RunLintTestCases<VerilogAnalyzer, LintRuleName>(kTestCases);
    }
Each LintTestCase is built from a mix of plain string literals and tagged string literals that mark up where findings are expected.

    {"uninteresting text ",
     {kSymbolType, "interesting text"},  // expect a finding over this substring
     "; uninteresting text"},
The test driver converts this into input code to analyze and a set of expected findings. A test will fail if the actual findings do not match the expected ones exactly (down to their locations).
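The conversion described above can be pictured with a short sketch that concatenates fragments into code and records an expected finding for each tagged fragment. The types `Fragment`, `ExpectedFinding`, and `ExpandedCase`, the sentinel `kNoFinding`, and the function `Expand` are all hypothetical; the real LintTestCase machinery is structured differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sentinel tag meaning "plain text, no finding expected" (an assumption
// made for this sketch only).
constexpr int kNoFinding = -1;

// Hypothetical test fragment: a tag and the text it covers.
struct Fragment {
  int tag;
  std::string text;
};

// Where a finding is expected in the assembled code.
struct ExpectedFinding {
  int tag;
  std::size_t offset;
  std::size_t length;
};

// Result of expanding one test case.
struct ExpandedCase {
  std::string code;                       // input passed to the analyzer
  std::vector<ExpectedFinding> expected;  // findings the rule must report
};

// Joins the fragments into code and notes the range of each tagged one.
ExpandedCase Expand(const std::vector<Fragment>& fragments) {
  ExpandedCase result;
  for (const Fragment& fragment : fragments) {
    if (fragment.tag != kNoFinding) {
      result.expected.push_back(ExpectedFinding{
          fragment.tag, result.code.size(), fragment.text.size()});
    }
    result.code += fragment.text;
  }
  return result;
}
```

A case made only of plain fragments expands to an empty expected set, which is how a negative test is expressed in this sketch.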
Make sure to include negative tests that expect no lint violations.