Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability to use regex rules #133

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions docs/rules.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ rules:
# word_boundary_start: false
# word_boundary_end: false
# include_note: false
# regex_terms: false
```

A set of default rules is provided in [`pkg/rule/default.yaml`]({{config.repo_url}}blob/main/pkg/rule/default.yaml).
Expand Down Expand Up @@ -60,6 +61,14 @@ You can configure options for each rule. Add an `options` key to your rule defin
* If `false`, the rule note will not be included in the output message
* If `not set`, `include_note` in your `woke` config file (ie `.woke.yml`) regulates if the note should be included in the output message (default: `false`).

### `regex_terms`

:octicons-milestone-24: Default: `false`

* If `true`, terms will be evaluated as regular expressions
* If `false`, terms will be treated as plain-text values
* **NOTE** this is an advanced feature. Rules will be skipped if they do not compile. Only use non-capturing groups in patterns. Look-around assertions are not supported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a fatal error should happen if a supplied regex can't compile. Otherwise the user may never know that their rule isn't being used.

Copy link
Contributor Author

@cognitivegears cognitivegears Aug 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - I added a Error log message, so when it runs if a rule won't compile, an error is generated but it continues, it looks like this:

2021-08-13T09:03:27-05:00 ERR Unable to compile regular expression, disabling rule error="error parsing regexp: missing closing ): `(?i)(black(\\s*list)`" Rule=rulename

I thought about letting it do a fatal error instead, though I chose this approach in anticipation of the remote ruleset feature - my thought was that if there is a rule that doesn't work in a remote ruleset that the user can't necessarily directly change, it could continue on after printing the error. That said, I'm totally happy to go either way with this - if you want it a fatal error that's easy to change, I can just change it back to a MustCompile instead, or doing a panic after logging the error so that they still get that debug info from that log. What do you prefer here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thought was that if there is a rule that doesn't work in a remote ruleset that the user can't necessarily directly change

I think we should tackle this when it becomes a problem or concern.


## Disabling Default Rules

You can disable default rules by providing a rule in your `woke` config file (ie `.woke.yml`), with no terms or alternatives.
Expand Down
1 change: 1 addition & 0 deletions docs/usage.md
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,7 @@ Outputs the results as a series of [`json`](https://www.json.org/json-en.html) f
"WordBoundary": <optionbool>,
"WordBoundaryStart": <optionbool>,
"WordBoundaryEnd": <optionbool>,
"RegexTerms": <optionbool>,
"IncludeNote": <optionbool>
}
},
Expand Down
4 changes: 2 additions & 2 deletions pkg/printer/json_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ func TestJSON_Print_JSON(t *testing.T) {
res := generateFileResult()
p := NewJSON(buf)
assert.NoError(t, p.Print(res))
expected := "{\"Filename\":\"foo.txt\",\"Results\":[{\"Rule\":{\"Name\":\"whitelist\",\"Terms\":[\"whitelist\",\"white-list\",\"whitelisted\",\"white-listed\"],\"Alternatives\":[\"allowlist\"],\"Note\":\"\",\"Severity\":\"warning\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"IncludeNote\":null}},\"Finding\":\"whitelist\",\"Line\":\"this whitelist must change\",\"StartPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`whitelist` may be insensitive, use `allowlist` instead\"}]}\n"
expected := "{\"Filename\":\"foo.txt\",\"Results\":[{\"Rule\":{\"Name\":\"whitelist\",\"Terms\":[\"whitelist\",\"white-list\",\"whitelisted\",\"white-listed\"],\"Alternatives\":[\"allowlist\"],\"Note\":\"\",\"Severity\":\"warning\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"RegexTerms\":false,\"IncludeNote\":null}},\"Finding\":\"whitelist\",\"Line\":\"this whitelist must change\",\"StartPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`whitelist` may be insensitive, use `allowlist` instead\"}]}\n"
got := buf.String()
assert.Equal(t, expected, got)
}
Expand Down Expand Up @@ -57,6 +57,6 @@ func TestJSON_Multiple(t *testing.T) {
p.End()
got := buf.String()

expected := "{\"Filename\":\"foo.txt\",\"Results\":[{\"Rule\":{\"Name\":\"whitelist\",\"Terms\":[\"whitelist\",\"white-list\",\"whitelisted\",\"white-listed\"],\"Alternatives\":[\"allowlist\"],\"Note\":\"\",\"Severity\":\"warning\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"IncludeNote\":null}},\"Finding\":\"whitelist\",\"Line\":\"this whitelist must change\",\"StartPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`whitelist` may be insensitive, use `allowlist` instead\"}]}\n{\"Filename\":\"bar.txt\",\"Results\":[{\"Rule\":{\"Name\":\"slave\",\"Terms\":[\"slave\"],\"Alternatives\":[\"follower\"],\"Note\":\"\",\"Severity\":\"error\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"IncludeNote\":null}},\"Finding\":\"slave\",\"Line\":\"this slave term must change\",\"StartPosition\":{\"Filename\":\"bar.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"bar.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`slave` may be insensitive, use `follower` instead\"}]}\n{\"Filename\":\"barfoo.txt\",\"Results\":[{\"Rule\":{\"Name\":\"test\",\"Terms\":[\"test\"],\"Alternatives\":[\"alternative\"],\"Note\":\"\",\"Severity\":\"info\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"IncludeNote\":null}},\"Finding\":\"test\",\"Line\":\"this test must change\",\"StartPosition\":{\"Filename\":\"barfoo.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"barfoo.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`test` may be insensitive, use `alternative` instead\"}]}\n"
expected := "{\"Filename\":\"foo.txt\",\"Results\":[{\"Rule\":{\"Name\":\"whitelist\",\"Terms\":[\"whitelist\",\"white-list\",\"whitelisted\",\"white-listed\"],\"Alternatives\":[\"allowlist\"],\"Note\":\"\",\"Severity\":\"warning\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"RegexTerms\":false,\"IncludeNote\":null}},\"Finding\":\"whitelist\",\"Line\":\"this whitelist must change\",\"StartPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"foo.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`whitelist` may be insensitive, use `allowlist` instead\"}]}\n{\"Filename\":\"bar.txt\",\"Results\":[{\"Rule\":{\"Name\":\"slave\",\"Terms\":[\"slave\"],\"Alternatives\":[\"follower\"],\"Note\":\"\",\"Severity\":\"error\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"RegexTerms\":false,\"IncludeNote\":null}},\"Finding\":\"slave\",\"Line\":\"this slave term must change\",\"StartPosition\":{\"Filename\":\"bar.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"bar.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`slave` may be insensitive, use `follower` instead\"}]}\n{\"Filename\":\"barfoo.txt\",\"Results\":[{\"Rule\":{\"Name\":\"test\",\"Terms\":[\"test\"],\"Alternatives\":[\"alternative\"],\"Note\":\"\",\"Severity\":\"info\",\"Options\":{\"WordBoundary\":false,\"WordBoundaryStart\":false,\"WordBoundaryEnd\":false,\"RegexTerms\":false,\"IncludeNote\":null}},\"Finding\":\"test\",\"Line\":\"this test must change\",\"StartPosition\":{\"Filename\":\"barfoo.txt\",\"Offset\":0,\"Line\":1,\"Column\":6},\"EndPosition\":{\"Filename\":\"barfoo.txt\",\"Offset\":0,\"Line\":1,\"Column\":15},\"Reason\":\"`test` may be insensitive, use `alternative` instead\"}]}\n"
assert.Equal(t, expected, got)
}
1 change: 1 addition & 0 deletions pkg/rule/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,6 @@ type Options struct {
WordBoundary bool `yaml:"word_boundary"`
WordBoundaryStart bool `yaml:"word_boundary_start"`
WordBoundaryEnd bool `yaml:"word_boundary_end"`
RegexTerms bool `yaml:"regex_terms"`
IncludeNote *bool `yaml:"include_note"`
}
19 changes: 15 additions & 4 deletions pkg/rule/rule.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@ import (
"regexp"
"strings"

"github.com/rs/zerolog/log"

"github.com/get-woke/woke/pkg/util"
)

Expand Down Expand Up @@ -81,8 +83,13 @@ func (r *Rule) SetOptions(o Options) {
}

func (r *Rule) setRegex() {
group := strings.Join(escape(r.Terms), "|")
r.re = regexp.MustCompile(fmt.Sprintf(r.regexString(), group))
var err error
group := strings.Join(escape(r, r.Terms), "|")
r.re, err = regexp.Compile(fmt.Sprintf(r.regexString(), group))
if err != nil {
log.Error().Err(err).Str("Rule", r.Name).Msg("Unable to compile regular expression, disabling rule")
r.Terms = nil // Disable the rule
}
}

func (r *Rule) regexString() string {
Expand Down Expand Up @@ -189,9 +196,13 @@ func IsDirectiveOnlyLine(line string) bool {
return !util.ContainsAlphanumeric(leftText)
}

func escape(ss []string) []string {
func escape(r *Rule, ss []string) []string {
for i, s := range ss {
ss[i] = regexp.QuoteMeta(s)
if r.Options.RegexTerms {
ss[i] = s
} else {
ss[i] = regexp.QuoteMeta(s)
}
}
return ss
}
Expand Down
68 changes: 68 additions & 0 deletions pkg/rule/rule_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,32 @@ func testRule() Rule {
}
}

func testRegexRuleWithOptions(o Options) Rule {
r := testRegexRule()
r.SetOptions(o)
return r
}

func testRegexRule() Rule {
return Rule{
Name: "ruleregex",
Terms: []string{`\d+`},
Alternatives: []string{"alt-regex1", "alt-regex-1"},
Severity: SevWarn,
}
}

func testInvalidRegexRule() Rule {
r := Rule{
Name: "invalidrule",
Terms: []string{"("},
Alternatives: []string{"alt-rule1", "alt-rule-1"},
Severity: SevWarn,
}
r.SetOptions(Options{RegexTerms: true})
return r
}

func TestRule_FindMatchIndexes(t *testing.T) {
tests := []struct {
text string
Expand Down Expand Up @@ -49,6 +75,43 @@ func TestRule_FindMatchIndexes(t *testing.T) {
assert.Equal(t, [][]int(nil), e.FindMatchIndexes("rule1"))
}

func TestRule_InvalidRegexRule(t *testing.T) {
r := testInvalidRegexRule()

// Verify rule is compiled
r.setRegex()

// Validate that terms are now empty / rule is disabled
assert.Empty(t, r.Terms)
assert.True(t, r.Disabled())
}

func TestRule_FindMatchRegexIndexes(t *testing.T) {
tests := []struct {
text string
expected [][]int
expectedRe [][]int
}{
{"this string has 123456 and 56789 included", [][]int(nil), [][]int{{16, 22}, {27, 32}}},
{"this string does not have any findings", [][]int(nil), [][]int(nil)},
{`this string has finding with \d+ \d+`, [][]int{{29, 32}, {33, 36}}, [][]int(nil)},
}
for _, test := range tests {
r := testRegexRule() // Default to non regular expression matching
got := r.FindMatchIndexes(test.text)
assert.Equal(t, test.expected, got)
}

for _, test := range tests {
r := testRegexRuleWithOptions(Options{RegexTerms: true})
got := r.FindMatchIndexes(test.text)
assert.Equal(t, test.expectedRe, got)
}

e := Rule{Name: "rule1"}
assert.Equal(t, [][]int(nil), e.FindMatchIndexes("rule1"))
}

func TestRule_Reason(t *testing.T) {
r := testRule()
assert.Equal(t, "`rule-1` may be insensitive, use `alt-rule1`, `alt-rule-1` instead", r.Reason("rule-1"))
Expand Down Expand Up @@ -133,6 +196,11 @@ func TestRule_regexString(t *testing.T) {
rule: testRuleWithOptions(Options{WordBoundary: true}),
expected: `(?i)\b(%s)\b`,
},
{
desc: "regex rule",
rule: testRegexRuleWithOptions(Options{RegexTerms: true}),
expected: `(?i)(%s)`,
},
{
desc: "word boundary start",
rule: testRuleWithOptions(Options{WordBoundaryStart: true}),
Expand Down