Text Range Module
Rangy's TextRange module provides various methods for navigating and manipulating the visible text (see below for a definition of this) on a page by character or word.
This module also provides an implementation of `innerText` (see this article by kangax for some background). It isn't strictly Rangy's area of concern but it comes virtually free from this module's functionality: an element's `innerText` can be considered the visible text of a range that encompasses the contents of the element, so that's what is provided.
The algorithm used to determine visible text is based on Aryeh Gregor's aborted `innerText` specification from 2011. This is no longer available on its original URL so here is a copy from the Rangy repository: https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm
Summary:
- Text inside a `<script>` or `<style>` element is not included
- Text inside any element hidden via CSS `display: none` or `visibility: hidden` is not included
- Collapsed white space (for example in HTML such as `<span>One    two</span>`) is considered as a single space character
- White space implied by block elements and `<br>` elements is included
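The collapsed white space rule can be approximated on a plain string. The sketch below is only an illustration of the idea and is not Rangy's algorithm, which works on DOM nodes and computed styles; the function name `collapseWhiteSpace` is hypothetical.

```javascript
// Illustrative only: approximates HTML white space collapsing on a plain string.
// Rangy itself inspects DOM nodes and computed CSS rather than raw strings.
function collapseWhiteSpace(text) {
  // Only space, tab, line feed, form feed and carriage return are collapsed;
  // a non-breaking space (U+00A0) is deliberately left untouched.
  return text.replace(/[ \t\n\f\r]+/g, " ");
}

console.log(collapseWhiteSpace("One \n\t  two")); // One two
```

An explicit character class is used instead of `\s` because `\s` also matches non-breaking spaces, which browsers do not collapse.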
Factors that are not taken into consideration:
- CSS `text-transform`. For example, the text inside `<span id="foo" style="text-transform: uppercase">hello</span>` will be rendered on the page as "HELLO" but will be considered as "hello" by Rangy. `rangy.innerText( document.getElementById("foo") )` will return "hello".
- Text hidden via `overflow` or `clip` CSS rules.
- CSS generated content. This content is generated by the CSS `content` property in conjunction with the `:before` and `:after` pseudo-elements. For example, with the style rule `#foo:after { content: "Two"; }`, the text inside `<span id="foo">One</span>Three` will be rendered on the page as "OneTwoThree" but will be considered as "OneThree" by Rangy.
Square brackets around one or more function or method parameters indicate that the parameter(s) is/are optional.
`wordOptions` is an optional object used within the `options` parameter of several methods below. It governs how words are identified and may have any combination of the following properties:
- `includeTrailingSpace`: Boolean specifying whether to include trailing space after a word within the word. Default is `false`.
- `wordRegex`: Regular expression object used to identify words. Default is `/[a-z0-9]+('[a-z0-9]+)*/gi`.
- `tokenizer`: Function used to tokenize text when identifying word boundaries (to be documented).
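To see what the default `wordRegex` treats as a word, it can be run against a plain string. The sketch below uses only standard JavaScript; the hyphen-aware pattern and the `listWords` helper are hypothetical examples and are not part of Rangy.

```javascript
// Default word-matching pattern quoted in the wordOptions description above.
const defaultWordRegex = /[a-z0-9]+('[a-z0-9]+)*/gi;

// Hypothetical alternative that also keeps hyphenated compounds together.
const hyphenAwareWordRegex = /[a-z0-9]+(['-][a-z0-9]+)*/gi;

function listWords(text, wordRegex) {
  // With a global regex, match() returns every full match, or null if there are none.
  return text.match(wordRegex) || [];
}

const sample = "Don't re-use Rangy's state-of-the-art API";
const defaultWords = listWords(sample, defaultWordRegex);
// ["Don't", "re", "use", "Rangy's", "state", "of", "the", "art", "API"]
const hyphenWords = listWords(sample, hyphenAwareWordRegex);
// ["Don't", "re-use", "Rangy's", "state-of-the-art", "API"]

// A wordOptions object as it might be supplied inside a method's options parameter.
const customWordOptions = { includeTrailingSpace: false, wordRegex: hyphenAwareWordRegex };
```

The default pattern keeps apostrophes inside a word but splits on hyphens, so a custom `wordRegex` is one way to change what counts as a single word unit.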
`characterOptions` is an optional object used within the `options` parameter of several methods below. It governs the treatment of space characters and may have any combination of the following properties:
- `includeBlockContentTrailingSpace`: Boolean specifying whether to include a trailing space within a block.
- `includeSpaceBeforeBr`: Boolean specifying whether to include an inline space immediately preceding a `<br>` element. Default is `true`.
- `includePreLineTrailingSpace`: Boolean specifying whether to include a trailing space immediately preceding a line break within an element whose `white-space` CSS property is set to `pre-line`. Default is `true`.
- `ignoreCharacters`: String containing characters that should be ignored. This can be used to ignore zero-width space characters, for example. Default is `""`.
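As a sketch of how `characterOptions` might be supplied, the following passes an object that ignores zero-width spaces to a boundary-moving call. It assumes `range` is a Rangy range on a page where the TextRange module is loaded; the helper name `extendEndIgnoringZeroWidth` is hypothetical.

```javascript
// characterOptions that skip zero-width spaces (U+200B) when counting characters.
const zeroWidthAwareOptions = {
  includeSpaceBeforeBr: true,        // documented default, repeated for clarity
  includePreLineTrailingSpace: true, // documented default, repeated for clarity
  ignoreCharacters: "\u200B"
};

// Hypothetical helper: extend the end of a Rangy range by `count` characters.
// `range` is assumed to be a Rangy range with the TextRange module loaded.
function extendEndIgnoringZeroWidth(range, count) {
  // characterOptions travels inside the method's options object.
  range.moveEnd("character", count, { characterOptions: zeroWidthAwareOptions });
  return range;
}
```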
The TextRange module adds the following to all Rangy Range objects. The API is based on methods from Internet Explorer's TextRange object.
`moveStart()` and `moveEnd()` move the start or end of the range, respectively, by the number of units specified by `count`. `unit` must be either "word" or "character".
`options` is an optional object parameter that governs how the range boundary move handles particular cases. It may have any combination of the following properties:
- `wordOptions`: See above.
- `characterOptions`: See above.
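A minimal sketch of moving both boundaries by words follows. It assumes `range` is a Rangy range with the TextRange module loaded and that a negative `count` moves a boundary backward through the text, as in Internet Explorer's TextRange API; `widenByWords` is a hypothetical helper name.

```javascript
// Hypothetical helper: widen a Rangy range by whole words on both sides.
// Assumes `range` is a Rangy range on a page with the TextRange module loaded.
function widenByWords(range, wordsBefore, wordsAfter) {
  // Assumption: a negative count moves the boundary backward through the text.
  range.moveStart("word", -wordsBefore);
  range.moveEnd("word", wordsAfter);
  return range;
}

// Possible browser usage, given an existing selection:
//   widenByWords(rangy.getSelection().getRangeAt(0), 1, 1);
```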
`move()` collapses the range to a single point and moves it by the number of units specified by `count`. `unit` must be either "word" or "character".
If `count` is negative, the range is collapsed to the start; otherwise it is collapsed to the end.
`options` is as per `moveStart()` and `moveEnd()` above.
`expand()` expands the range to completely encompass all units that it currently contains or partially contains. `unit` must be either "word" or "character", although this particular method is only really useful for words. If "character" is specified, this method is identical to calling `moveEnd("character", 1)`. If "word" is specified, the range is expanded to encompass all partially-selected word or non-word units, as defined by the `wordRegex` option.
`options` is an optional object parameter that governs how the range boundary move handles particular cases. It may have any combination of the following properties:
- `wordOptions`: See above.
- `characterOptions`: See above.
- `trim`: Boolean specifying whether to trim trailing and leading spaces from the final range. Default is `false`.
- `trimStart`: Boolean specifying whether to trim leading spaces from the expanded range. Only takes effect if the `trim` property is `true`. Default is `true`.
- `trimEnd`: Boolean specifying whether to trim trailing spaces from the expanded range. Only takes effect if the `trim` property is `true`. Default is `true`.
Returns the visible text contained in the range.
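Expanding to words and reading the visible text are often combined. The sketch below assumes `range` is a Rangy range with the TextRange module loaded and that the method returning the range's visible text is named `text()`, which the text above does not state; `wholeWordText` is a hypothetical helper.

```javascript
// Hypothetical helper: snap a Rangy range to whole words and return its visible text.
function wholeWordText(range) {
  // trim: true removes leading and trailing spaces from the expanded range
  // (trimStart and trimEnd both default to true once trim is enabled).
  range.expand("word", { trim: true });
  // Assumption: text() is the method that returns the range's visible text.
  return range.text();
}
```

For example, a range covering only part of a word would first grow to cover the whole word, and the returned string would then be that word without surrounding spaces.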
Moves the range to contain text within `containerNode` specified by character indices `startIndex` and `endIndex` within the visible text of `containerNode`.
Returns the range as a pair of character indices relative to the start of the visible text of `containerNode`. The returned value is an object with properties `start` and `end`.
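The two operations above can be paired to record a range as character indices and rebuild it later. In this sketch the method names `toCharacterRange()` and `selectCharacters()` are assumptions about the Rangy API, since the text above does not name them, and `captureOffsets` and `restoreOffsets` are hypothetical helpers.

```javascript
// Hypothetical helper: record a Rangy range as visible-text character indices.
// Assumption: toCharacterRange(containerNode) returns { start, end }.
function captureOffsets(range, containerNode) {
  const { start, end } = range.toCharacterRange(containerNode);
  return { start, end };
}

// Hypothetical helper: move a Rangy range back to previously recorded indices.
// Assumption: selectCharacters(containerNode, startIndex, endIndex) is the method name.
function restoreOffsets(range, containerNode, offsets) {
  range.selectCharacters(containerNode, offsets.start, offsets.end);
  return range;
}
```

Because the indices refer to visible text rather than DOM nodes, they should remain meaningful after markup changes that leave the visible text unchanged.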
Provides a means of searching text on a page, including using regular expressions. In conjunction with the Class Applier module, this can be used to create a custom page search facility (as demonstrated on the demo page).
This method searches the visible text of the document for the text specified by `searchTerm`, which may be either a string or a regular expression object. The search starts from the start or end of the range (depending on search direction). The range moves to encompass the first match.
`options` is an optional object parameter that allows flexible searching. Any combination of the following properties may be supplied:
- `caseSensitive`: Whether the search is case-sensitive. Default is `false`.
- `withinRange`: Specifies the scope of the search. If supplied, only the text within this range is searched. Default is `null`.
- `wholeWordsOnly`: Whether to match only whole words. Default is `false`.
- `wrap`: Whether the search should wrap around if the start or end of the search scope is reached without finding a match. Default is `false`.
- `direction`: String that specifies the direction of the search. Set it to "backward" to perform a backward search. Default is "forward".
- `wordOptions`: See above.
- `characterOptions`: See above.
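A page search typically repeats the search until no further match is found. The sketch below assumes the search method is named `findText()` and returns `true` while a match is found, neither of which is stated above; `rangy` is the global Rangy object with the TextRange module loaded, and `forEachMatch` and `onMatch` are hypothetical names.

```javascript
// Hypothetical helper: visit every match of searchTerm inside scopeNode.
// `onMatch` is a caller-supplied callback, e.g. one that applies a Class Applier.
function forEachMatch(rangy, scopeNode, searchTerm, onMatch) {
  // Range that limits the search to the contents of scopeNode.
  const scope = rangy.createRange();
  scope.selectNodeContents(scopeNode);

  // Working range, collapsed to the start of the scope.
  const range = rangy.createRange();
  range.selectNodeContents(scopeNode);
  range.collapse(true);

  const options = { caseSensitive: false, wholeWordsOnly: true, withinRange: scope };
  let count = 0;
  // Assumption: findText() returns true when it has moved the range to a match.
  while (range.findText(searchTerm, options)) {
    onMatch(range);
    range.collapse(false); // continue searching from the end of this match
    count++;
  }
  return count;
}
```

Leaving `wrap` at its default of `false` is what lets the loop end once the end of the search scope is reached.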
Replaces the contents of the range with HTML specified by `html`.
Collapses the selection to a single point and moves it by the number of units specified by `count`. `unit` must be either "word" or "character".
If `count` is negative, the selection is collapsed to the start; otherwise it is collapsed to the end.
See the Range `move()` documentation for details.
Expands the selection to completely encompass all units that it currently contains or partially contains. See the Range `expand()` documentation for details.
If the selection was originally backward, the expanded selection will also be backward, provided the browser supports programmatic creation of backward selections (in practice, all major browsers except IE).
Selects a range of characters within the visible text of `containerNode` specified by `startIndex` and `endIndex`. The selection direction is governed by `direction` (although in IE, the selection will always be created forwards).
`direction` may be any of the strings "forward", "forwards", "backward" or "backwards", or a Boolean (in which case `true` corresponds to "backwards").
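The accepted `direction` values can be summarised as a small function. This is only an illustration of the description above and is not Rangy's internal code; `isBackwardDirection` is a hypothetical name.

```javascript
// Illustrative only: reports which documented `direction` values mean "backwards".
function isBackwardDirection(direction) {
  if (typeof direction === "boolean") {
    return direction; // true corresponds to "backwards"
  }
  return direction === "backward" || direction === "backwards";
}

console.log(isBackwardDirection("backwards")); // true
console.log(isBackwardDirection("forward"));   // false
console.log(isBackwardDirection(true));        // true
```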
Returns an object specifying the selection as character indices within the visible text of `containerNode`. This object can later be used to restore the selection by passing it into `restoreCharacterRanges()`.
Restores a selection previously saved using `saveCharacterRanges()`.
These two methods provide a character index-based selection save and restore which is not vulnerable to formatting changes (unlike Rangy's existing selection save/restore module).
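A sketch of a save and restore around a formatting change follows. It assumes `rangy` is the global Rangy object with the TextRange module loaded, and that `restoreCharacterRanges()` takes the container node followed by the saved object; that argument order is an assumption and should be checked against the Rangy source. `preserveSelection` and `mutate` are hypothetical names.

```javascript
// Hypothetical helper: run a DOM-changing callback while keeping the selection.
function preserveSelection(rangy, containerNode, mutate) {
  const sel = rangy.getSelection();
  // Record the selection as character indices within containerNode's visible text.
  const saved = sel.saveCharacterRanges(containerNode);
  mutate(containerNode); // e.g. re-apply syntax highlighting markup
  // Assumption: the argument order is (containerNode, saved).
  sel.restoreCharacterRanges(containerNode, saved);
}
```

This is only expected to restore the same characters when `mutate` changes markup without changing the visible text.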
Returns the visible text for the element `el`.
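To see the difference between visible text and raw text content, `rangy.innerText()` can be compared with the DOM's `textContent`. The sketch assumes `rangy` is the global Rangy object with the TextRange module loaded; `describeText` is a hypothetical helper.

```javascript
// Hypothetical helper: compare Rangy's visible text with the DOM's raw text content.
function describeText(rangy, el) {
  return {
    visible: rangy.innerText(el), // omits hidden, <script> and <style> text
    raw: el.textContent           // includes hidden, <script> and <style> text
  };
}

// Possible browser usage:
//   describeText(rangy, document.getElementById("foo"));
```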