Add ASCII-only option, to mimic default RE2 behavior (#1)
* add ASCII-only option, to mimic default RE2 behaviour

  This is a workaround, motivated by the difference in how Oniguruma handles invalid UTF-8 bytes compared to Go's default RE2. See src-d/enry#225 (comment)

  Summary of changes:
  - c: prevent `NewOnigRegex()` from hard-coding UTF8
  - c: `NewOnigRegex()` now properly calls `onig_initialize()` [1]
  - go: expose new `MustCompileASCII()`, with the default `\w` character class matching only ASCII
  - go: `MustCompile()` refactored, `initRegexp()` extracted for common UTF8/ASCII logic

  Encoding was intentionally not exposed at the Go API level for simplicity, to avoid introducing a complex struct type [2] to the API surface.

  1. https://github.com/kkos/oniguruma/blob/83572e983928243d741f61ac290fc057d69fefc3/doc/API#L6
  2. https://github.com/kkos/oniguruma/blob/83572e983928243d741f61ac290fc057d69fefc3/src/oniguruma.h#L121

  Signed-off-by: Alexander Bezzubov <[email protected]>

* ci: test on 2 latest go versions

  Signed-off-by: Alexander Bezzubov <[email protected]>

* ci: bump version of Oniguruma to 6.9.1

  Update deb to get the fix for https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/1730627

  Signed-off-by: Alexander Bezzubov <[email protected]>

* ci: refactor oniguruma installation

  Signed-off-by: Alexander Bezzubov <[email protected]>

* refactoring go part a bit, addressing review feedback

  Signed-off-by: Alexander Bezzubov <[email protected]>

* ci: fix typo in bash var substitution

  Signed-off-by: Alexander Bezzubov <[email protected]>

* cgo: simplify naive encoding init

  Signed-off-by: Alexander Bezzubov <[email protected]>

* go: doc syntax fix

  Signed-off-by: Alexander Bezzubov <[email protected]>

* tixing fypos

  Signed-off-by: Alexander Bezzubov <[email protected]>
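
For reference, the "default RE2 behaviour" that the ASCII-only option mimics can be seen with Go's standard `regexp` package, where `\w` is equivalent to `[0-9A-Za-z_]`. The sketch below uses only the standard library and not this package's API, since `MustCompileASCII()` needs cgo and an installed Oniguruma; the names `wordRE` and `leadingWord` are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordRE uses Go's standard regexp package (RE2 syntax), where \w is
// ASCII-only: it is equivalent to [0-9A-Za-z_].
// Hypothetical name, not part of the commit.
var wordRE = regexp.MustCompile(`^\w+`)

// leadingWord returns the longest run of \w characters at the start of s,
// or the empty string when s does not start with one.
// Hypothetical helper, not part of the commit.
func leadingWord(s string) string {
	return wordRE.FindString(s)
}

func main() {
	fmt.Printf("%q\n", leadingWord("hello world")) // "hello"
	fmt.Printf("%q\n", leadingWord("héllo"))       // "h": é is outside ASCII \w
	fmt.Printf("%q\n", leadingWord("éclair"))      // "": no ASCII word character at the start
}
```

Under the commit's description, a pattern built with `MustCompileASCII()` is expected to treat `\w` the same way, whereas the UTF-8 default of `MustCompile()` may also match non-ASCII word characters.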