cape-splitter

Functionality

Cape splitter provides the following functionality:

  • Split documents into groups, keeping full sentences and extracting overlapping text before and after the group.
  • Return batches, grouped by number of words.
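
The two features above can be illustrated with a minimal sketch. This is not the cape-splitter API: the function names (`split_sentences`, `split_into_groups`, `batch_by_words`), their parameters, and the naive regex sentence splitter are all assumptions made for illustration, and the library's actual tokenization is likely to differ.

```python
import re
from typing import Iterator, List, Tuple

# Hypothetical type: (overlap before, group text, overlap after).
Group = Tuple[str, str, str]


def split_sentences(text: str) -> List[str]:
    # Naive assumption: a sentence ends at ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_into_groups(text: str, words_per_group: int,
                      overlap_sentences: int = 1) -> List[Group]:
    """Split text into groups of whole sentences, with neighbouring overlap."""
    sentences = split_sentences(text)
    spans, start, count = [], 0, 0
    for i, sentence in enumerate(sentences):
        n = len(sentence.split())
        # Close the current group before it would exceed the word budget.
        if count and count + n > words_per_group:
            spans.append((start, i))
            start, count = i, 0
        count += n
    if start < len(sentences):
        spans.append((start, len(sentences)))

    groups = []
    for a, b in spans:
        before = " ".join(sentences[max(0, a - overlap_sentences):a])
        body = " ".join(sentences[a:b])
        after = " ".join(sentences[b:b + overlap_sentences])
        groups.append((before, body, after))
    return groups


def batch_by_words(groups: List[Group],
                   words_per_batch: int) -> Iterator[List[Group]]:
    """Yield lists of groups whose combined body word count fits the budget."""
    batch, count = [], 0
    for group in groups:
        n = len(group[1].split())
        if batch and count + n > words_per_batch:
            yield batch
            batch, count = [], 0
        batch.append(group)
        count += n
    if batch:
        yield batch


if __name__ == "__main__":
    doc = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
    groups = split_into_groups(doc, words_per_group=6)
    for batch in batch_by_words(groups, words_per_batch=6):
        print(batch)
```

In this sketch a single sentence longer than the word budget still forms its own group, since sentences are never cut; whether cape-splitter handles that case the same way is not stated here.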

Performance

Tokenization and splitting take 3.7 seconds for SQuAD on a mid-2015 MacBook Pro (2.2 GHz Intel Core i7).