
split-sequence

This is a utility library with a single purpose: to split, dissect, cut, cleave and partition sequences.

In the simplest form it is:

POFTHEDAY> (split-sequence:split-sequence
            #\Space
            "Bob loves Alice!")
("Bob" "loves" "Alice!")
16

It can also split only N times, split from the end, and remove empty subsequences:

POFTHEDAY> (split-sequence:split-sequence
            0
            #(1 2 3 4 0 5 6 7 0 8 9 0))
(#(1 2 3 4) #(5 6 7) #(8 9) #())

POFTHEDAY> (split-sequence:split-sequence
            0
            #(1 2 3 4 0 5 6 7 0 8 9 0)
            :remove-empty-subseqs t)
(#(1 2 3 4) #(5 6 7) #(8 9))

POFTHEDAY> (split-sequence:split-sequence
            0
            #(1 2 3 4 0 5 6 7 0 8 9 0)
            :remove-empty-subseqs t
            :from-end t
            :count 1)
(#(8 9))
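
The second value returned in the very first example (16) is the index where processing stopped. Together with :count and :start it lets you take a piece now and continue from the same place later. Here is a minimal sketch of that idea. The name split-first is my own hypothetical helper, not part of the library, and the quickload line assumes Quicklisp is installed:

```lisp
;; Assumption: Quicklisp is available. Load the library any other way if it is not.
(ql:quickload "split-sequence" :silent t)

(defun split-first (delimiter sequence)
  "Hypothetical helper: return the first chunk of SEQUENCE and the unsplit rest."
  (multiple-value-bind (chunks next-start)
      ;; :count 1 stops after one chunk; NEXT-START is the index to resume from.
      (split-sequence:split-sequence delimiter sequence :count 1)
    (values (first chunks)
            (subseq sequence next-start))))

(split-first #\Space "Bob loves Alice!")
;; => "Bob"
;;    "loves Alice!"
```

Passing the returned index back as :start gives the same effect without copying the rest of the sequence.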

There are also split-sequence-if and split-sequence-if-not:

POFTHEDAY> (defstruct word text)
POFTHEDAY> (defstruct white-space)

POFTHEDAY> (defmethod print-object ((obj word) stream)
             (format stream "<WORD ~A>" (word-text obj)))

POFTHEDAY> (defmethod print-object ((obj white-space) stream)
             (format stream "<SPACE>"))

POFTHEDAY> (defparameter *tokens*
             (list (make-word :text "Bob")
                   (make-white-space)
                   (make-word :text "loves")
                   (make-white-space)
                   (make-word :text "Alice")))
(<WORD Bob> <SPACE> <WORD loves> <SPACE> <WORD Alice>)

POFTHEDAY> (split-sequence:split-sequence-if
            (lambda (item)
              (typep item 'white-space))
            *tokens*)
((<WORD Bob>) (<WORD loves>) (<WORD Alice>))
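
The example above covers only split-sequence-if. Its counterpart, split-sequence-if-not, treats every element that fails the predicate as a delimiter. A small sketch on a plain string follows. The function name words is hypothetical, and the quickload line again assumes Quicklisp:

```lisp
;; Assumption: Quicklisp is available. Load the library any other way if it is not.
(ql:quickload "split-sequence" :silent t)

(defun words (text)
  "Hypothetical helper: split TEXT on every character that is not a letter."
  (split-sequence:split-sequence-if-not #'alpha-char-p text
                                        ;; Drop the empty chunk after the final \"!\".
                                        :remove-empty-subseqs t))

(words "Bob loves Alice!")
;; => ("Bob" "loves" "Alice")
```

Without :remove-empty-subseqs the result would end with an empty string, because the trailing "!" is itself a delimiter.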

By the way, cl-utilities, reviewed two days ago, and rutils, reviewed at the start of the week, also include these splitting functions, but their code is different. This is probably because split-sequence has evolved since it was copied into cl-utilities and rutils.

This simple search query on Ultralisp.org shows that this functionality is also available in some other Common Lisp libraries.

Update 1

@fwoaroof gave me a link to a split function optimized to work with very long (> 1G) strings.

Update 2

@stevelosh sent me some code that uses split-sequence to make an iterator:

(defun spliterator (delimiter sequence &key (test #'eql) (key #'identity))
  (let ((start 0)
        (length (length sequence)))
    (lambda ()
      (if (= start length)
          (values nil nil)
          (multiple-value-bind (next end)
              (split-sequence:split-sequence delimiter sequence
                                             :count 1 :start start
                                             :key key :test test)
            (setf start end)
            (values (first next) t))))))
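
The closure returned by spliterator yields two values on each call: the next chunk and a flag that becomes NIL when the sequence is exhausted. To show how such an iterator might be consumed, here is a small sketch. The name drain is my own hypothetical helper and is not part of Steve's code or of split-sequence:

```lisp
(defun drain (iterator)
  "Hypothetical helper: call ITERATOR until its second value is NIL.
Returns a list of all first values, in the order they were produced."
  (let ((items '()))
    (loop
      (multiple-value-bind (item more-p) (funcall iterator)
        (unless more-p
          (return (nreverse items)))
        (push item items)))))

;; With the SPLITERATOR function from above loaded:
;; (drain (spliterator #\Space "Bob loves Alice!"))
;; => ("Bob" "loves" "Alice!")
```

In real code you would usually process each chunk as it arrives instead of collecting everything, otherwise a plain split-sequence call does the same job.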

Thank you, Steve!