WebODF point translation system: PositionIterators, Filters, SelectionMovers and other rabbits...

13 Sep 2013

Hi all,

Based on some recent activity (and some questions from Friedrich), it seemed like
a good time to sit down and share some knowledge about a few of the more
confusing concepts, such as filtering, position iterators and other friends.

I suppose the first point to start at is what need is being solved. Anyone who has
worked with JavaScript for even a short time is likely familiar with DOM Ranges,
elements, and other DOM manipulation methods. A DOM Range shows quite clearly the
normal coordinate system that the browser uses internally: Node + Offset.
A particular node reference is guaranteed to be unique within the document. This
makes up the concept I'll refer to as the "DOM coordinate" system, as it uses
in-memory references to live nodes in the document.

For collaborative editing though, we need to be able to share information between
different computers. In this case, we can't use DOM coordinates because there is
no built-in way to serialize a reference to a particular DOM node and offset such
that another computer will be able to select the exact same point. So, another
coordinate system needs to be layered on top of the DOM coordinates. In WebODF,
this is most clearly seen on Operations, which carry plain integers (usually
position + length pairs). I'll refer to this concept as the "ODF coordinate" system.
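
A small sketch of why integers travel well and node references do not. The plain objects stand in for DOM nodes so the snippet runs outside a browser, and the operation's field names (optype, memberid, position, text) are illustrative assumptions rather than a statement of the real operation format.

```javascript
// A DOM coordinate holds a live object reference. The "node" here is a
// stand-in plain object, not a real DOM Text node.
const textNode = { nodeType: 3, data: "Hello world" };
const domCoordinate = { node: textNode, offset: 5 };

// Serializing copies the node's data, so its identity is lost: the receiver
// gets a brand new object, not a reference to a node in its own document.
const received = JSON.parse(JSON.stringify(domCoordinate));
const sameNode = received.node === textNode;

// An ODF coordinate is just an integer, so it survives the trip unchanged.
// The field names below are assumptions made for this illustration.
const op = { optype: "InsertText", memberid: "alice", position: 5, text: "!" };
const receivedOp = JSON.parse(JSON.stringify(op));

console.log(sameNode, receivedOp.position);
```

The receiving side still has to turn that integer back into a node + offset in its own DOM, which is the job of the classes described in the rest of this email.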

For WebODF, ODF coords are actually cursor positions. This wasn't a deliberate
design from what I can tell (and I have a few misgivings about it), but it is
functional enough to allow collaboration with the current codebase.

Now, I want to clarify up front that none of this is my design. I'm simply repeating
what I've discovered from the codebase, so please don't ask me to justify why it
was created this way.

Design
======
The high level classes used in translating between DOM / ODF coordinate systems:

---------------------
core.PositionIterator
---------------------
A relatively light wrapper around the standard HTML TreeWalker concept. This class
provides methods to navigate across available positions within a DOM (a position
being a DOM coordinate pair of a node + offset). It has some basic logic to skip
some types of equivalent positions during iteration.

This class takes a "node filter" that can be used to exclude certain nodes. It also
provides a setUnfilteredPosition method that will set the PositionIterator instance
to the next acceptable node based on the node filter (this is something
the normal TreeWalker.currentNode assignment does not guarantee).

Historically, this class also contained some other translation functions. These
were retired during the great coordinates purge of a few months back[1].

-------------------------------------------------
core.PositionIterator.FilteredEmptyTextNodeFilter
gui.SelectionMover.CursorFilter
-------------------------------------------------
These are examples of node filters, which are passed into the PositionIterator
on construction. A node filter decides whether a particular node may be
returned as a result of a call to one of the PositionIterator's navigation
functions (e.g., iterator.nextPosition).

For example, a CursorFilter is used to prevent the WebODF cursor element from being
returned as an available DOM coordinate.
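
As a rough illustration of the idea (not the actual WebODF CursorFilter), a node filter can be as small as the sketch below. The numeric results follow the standard DOM NodeFilter constants. The cursor namespace and element name are assumptions, and plain objects stand in for DOM nodes so the snippet runs without a browser.

```javascript
// Standard DOM NodeFilter result values.
const FILTER_ACCEPT = 1;
const FILTER_REJECT = 2;

// Assumption: the cursor is an element named "cursor" in a WebODF-specific
// namespace. Check the real CursorFilter for the exact test it performs.
const CURSOR_NS = "urn:webodf:names:cursor";

function ToyCursorFilter() {
    this.acceptNode = function (node) {
        if (node.namespaceURI === CURSOR_NS && node.localName === "cursor") {
            return FILTER_REJECT; // hide the cursor element from iteration
        }
        return FILTER_ACCEPT;
    };
}

// Stand-in nodes: one cursor element and one ordinary text:span element.
const cursorFilter = new ToyCursorFilter();
const cursorNode = { namespaceURI: CURSOR_NS, localName: "cursor" };
const spanNode = {
    namespaceURI: "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    localName: "span"
};

console.log(cursorFilter.acceptNode(cursorNode), cursorFilter.acceptNode(spanNode));
```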

------------------
gui.SelectionMover
------------------
The SelectionMover class is the heart & soul of the coordinate translation system.
The SelectionMover is bound to an OdtCursor. It provides a bunch of functions that
take a "position filter", and can do things such as move the cursor a set number
of valid positions in a direction, or can report the number of valid positions to
a particular point.

This is where the confusion starts. A "position filter" (not to be confused with
the "node filter" above) is an object that supplies a single function,
"acceptPosition", which when given a DOM coordinate reports whether that
coordinate is a valid position as far as the filter is concerned.
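
To make the shape of a position filter concrete, here is a deliberately naive sketch. It accepts any position inside a text node, which is far simpler than the real rules. The `node` and `offset` properties on the position object mirror the pseudocode in this email and are assumptions, as is the boolean return value; the real filters may return a filter-result constant instead.

```javascript
const TEXT_NODE = 3; // standard DOM nodeType value for text nodes

// Toy position filter: a position counts only when it lies inside a text node.
function ToyTextPositionFilter() {
    // `position` is assumed to expose `node` and `offset`; the real
    // PositionIterator API may look different.
    this.acceptPosition = function (position) {
        return position.node.nodeType === TEXT_NODE;
    };
}

// Stand-in positions built from plain objects rather than real DOM nodes.
const toyFilter = new ToyTextPositionFilter();
const insideText = { node: { nodeType: 3, data: "abc" }, offset: 1 };
const betweenElements = { node: { nodeType: 1, localName: "p" }, offset: 0 };

console.log(toyFilter.acceptPosition(insideText), toyFilter.acceptPosition(betweenElements));
```

The key difference from a node filter is the question being asked: a node filter asks "may the iterator visit this node at all?", while a position filter asks "does this particular node + offset count as a step?".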

The most commonly used "position filter" is ops.OdtDocument.TextPositionFilter, which
is an implementation of valid ODT cursor positions as described in README_cursorpositions.

The code for converting DOM to ODF coordinates is generally something like:

iterator = new core.PositionIterator(new gui.SelectionMover.CursorFilter())
iterator.setUnfilteredPosition(documentRoot, 0) // Starting at the beginning of the doc
positionFilter = new TextPositionFilter()
odfCoordinate = 0
// nextPosition uses the "node filter" to find the next non-cursor node
while (iterator.nextPosition() && hasNotPassed(DOMCoordinate)) {
    // the "position filter" is used to figure out how many valid positions have
    // been passed
    if (positionFilter.acceptPosition(iterator)) { 
        odfCoordinate += 1;
    }
}

Conversion from ODF to DOM coordinates is almost identical:

iterator = new core.PositionIterator(new gui.SelectionMover.CursorFilter())
iterator.setUnfilteredPosition(documentRoot, 0) // Starting at the beginning of the doc
positionFilter = new TextPositionFilter()
// nextPosition uses the "node filter" to find the next non-cursor node
while (odfCoordinate !== 0 && iterator.nextPosition()) {
    // the "position filter" is used to figure out how many valid positions have
    // been passed
    if (positionFilter.acceptPosition(iterator)) {
        odfCoordinate -= 1;
    }
    }
}
DOMCoordinate = (iterator.node, iterator.offset)
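
The two loops above can be tried out on a toy model. In the sketch below the document is flattened into a made-up list of DOM positions (already stripped of cursor nodes, which is the node filter's job), and a `valid` flag plays the role of the position filter. ToyIterator, domToOdf and odfToDom are hypothetical names invented for this example, and a DOM coordinate is represented by its index in the list.

```javascript
// Every DOM position in document order. `valid` marks the positions a
// position filter would accept. All of this data is made up.
const positions = [
    { node: "p",     offset: 0, valid: false }, // start of the document
    { node: "text1", offset: 0, valid: true  },
    { node: "text1", offset: 1, valid: true  },
    { node: "span",  offset: 0, valid: false },
    { node: "text2", offset: 0, valid: false },
    { node: "text2", offset: 1, valid: true  }
];

// Minimal stand-in for a position iterator over the list above.
function ToyIterator(list) {
    let index = 0;
    this.nextPosition = function () {
        if (index + 1 >= list.length) {
            return false; // already at the last position
        }
        index += 1;
        return true;
    };
    this.current = function () { return list[index]; };
    this.index = function () { return index; };
}

const toyPositionFilter = {
    acceptPosition: function (iterator) { return iterator.current().valid; }
};

// DOM -> ODF: count the accepted positions stepped over on the way to the target.
function domToOdf(list, targetIndex) {
    const iterator = new ToyIterator(list);
    let odfCoordinate = 0;
    while (iterator.index() < targetIndex && iterator.nextPosition()) {
        if (toyPositionFilter.acceptPosition(iterator)) {
            odfCoordinate += 1;
        }
    }
    return odfCoordinate;
}

// ODF -> DOM: step forward until the requested number of accepted positions
// has been used up, then report where the iterator stopped.
function odfToDom(list, odfCoordinate) {
    const iterator = new ToyIterator(list);
    let remaining = odfCoordinate;
    while (remaining !== 0 && iterator.nextPosition()) {
        if (toyPositionFilter.acceptPosition(iterator)) {
            remaining -= 1;
        }
    }
    return iterator.index();
}

console.log(domToOdf(positions, 5), odfToDom(positions, 3)); // prints "3 5"
```

Note that several DOM coordinates can map to the same ODF coordinate (indexes 2, 3 and 4 all give 2 here), while the reverse conversion always lands on one specific DOM coordinate. The sketch also does no bounds checking: asking for an ODF coordinate larger than the number of valid positions simply stops at the end of the list.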

---------------
ops.OdtDocument
---------------
This is the primary object for accessing cursors, and translating between ODF
and DOM coordinates. The most commonly used functions are:

getCursorPosition - Returns the ODF coordinate of a specific cursor
getCursorSelection - Returns the ODF coordinate of a specific cursor, including its selection
getDistanceFromCursor - Returns the difference in ODF coordinates between the cursor and
    the specified DOM coordinate

Other useful functions to know about:
getPositionFilter - Returns the "position filter" instance that is used for converting
    between ODF and DOM coordinates. This is an instance of the TextPositionFilter
    generally (with some extra magic around document roots, which I'm ignoring for now)

Summary
=======

So, to recap this architecture:
- A PositionIterator takes a NodeFilter as a construction argument.
- The SelectionMover has a PositionIterator, and supplies the appropriate NodeFilter
    to this iterator
- Each SelectionMover is bound to a single OdtCursor
- The SelectionMover uses a per-call PositionFilter to
    convert between ODF and DOM coordinates
- The OdtDocument creates all OdtCursors, SelectionMovers and PositionFilters,
    and links these all together appropriately
- The only used PositionFilter is a TextPositionFilter, as described in
    README_cursorpositions (shhh Adityab. Don't bring up the root filters yet :D)

Ok. I'll leave this email here at this point :-).

Any questions/queries/clarifications... don't hesitate to ask... but don't forget...
I didn't design this... I'm just the messenger ;-)

Cheers,

Philip

Footnotes
1. https://open.nlnet.nl/pipermail/webodf/2013-June/000029.html

Philip Peitsch
