Step iterators introduction & design concepts

20 Jan 2014

      Hi everyone,

I wanted to take a moment to introduce a new class I've recently
added to webodf to assist with cursor movement. I will explain some of
the reasoning behind it and provide examples of best practices.

First step is defining some terms. Most of these are drawn from
earlier discussions on this list[1]:

Position:     A DOM node + offset pair
Step:         A position returns FILTER_ACCEPT when passed to a specific
            filter

A new class has been added to core called a StepIterator. It provides
the following methods -

* roundToClosestStep
* roundToPreviousStep/roundToNextStep
* previousStep/nextStep
* isStep

New instances of this (as of pr#315) are most easily accessed via
OdtDocument.createStepIterator.

The key aims of this class are:
1. Make it easier to avoid iteration bugs (e.g., not starting step
   counting from a step)
2. Provide a semantically clearer interface for movement-by-step
   and movement within restricted containers (e.g., roots/paragraphs)
3. Provide an interface to make further performance improvements to

In practice, I'm intending step iterators to supplant the
SelectionMover class for all uses. I'm also intending to phase out
the recently introduced rounding constraints in the StepTranslator
class, as I have found usage of this to be very hard to use properly
and I tend to create bugs with it (oops :S).

In my mind, the step iterators provide a few strong advantages
over existing mechanisms:

* Step iterators are not tied to the focus node of a cursor. They
  can operate on arbitrary points and positions (e.g., shadow cursor)
* By allowing the developer to set the subtree over which step
  iteration will occur, iteration will complete far more quickly for
  queries that only want to check if there is another step available
  (e.g., move/extend cursor operations within an annotation)
* Limiting the iteration subtree also reduces the chance a dev will
  unintentionally walk a large number of steps (which is still slow!)
* Avoids confusion & performance problems with conversion directly
  between filters. Using DOM positions as input and output guides
  developers towards the right way to write performant code.
* By binding the iterator and filter together more tightly, major
  performance improvements can be realised (such as combined node +
  position filters). Brief testing has shown a 2-3x performance
  improvement might be available through combining these.

Now for some examples.

Check if there is at least one more step in the paragraph:

// Subtree is limited to the paragraph node. Guarantees the iterator
// will not walk more than the maximum paragraph length under any
// condition
var stepItr = odtDocument.createstepItr(
                        paragraphNode, paragraphNode.childNodes.length,
                        [textPositionFilter, rootFilter],
                        paragraphNode);

// step iterator is initialized at
// (paragraphNode, paragraphNode.childNodes.length)
if (stepItr.nextStep()) { console.log("Has another step!"); }

Find the last step within a paragraph:

var stepItr = odtDocument.createstepItr(
                        paragraphNode, paragraphNode.childNodes.length,
                        [textPositionFilter, rootFilter],
                        paragraphNode);

// step iterator is initialized at
// (paragraphNode, paragraphNode.childNodes.length)
// Under some circumstances, this will be a valid step,
// so just calling previousStep here might skip the last
// step. Instead, roundToPrevious can be used to only move
// to the previous step if the current position is unwalkable
runtime.assert(stepItr.roundToPreviousStep(), "No previous step available");

// Convert to an absolute cursor step using step cache
step = odtDocument.convertDomPointToCursorStep(stepItr.container(),
                                                stepItr.offset());

Find the last step just after a paragraph:

var subtree = odtDocument.getRootElement(paragraphNode),
    stepItr = odtDocument.createstepItr(
                        paragraphNode, paragraphNode.childNodes.length,
                        [textPositionFilter, rootFilter],
                        subtree);

// step iterator is initialized at
// (paragraphNode, paragraphNode.childNodes.length)
// Move to the next cursor step
runtime.assert(stepItr.nextStep(), "No next step available");

// Convert to an absolute cursor step using step cache
step = odtDocument.convertDomPointToCursorStep(stepItr.container(),
                                                stepItr.offset());

Best practices:
* Always make the subtree as small as possible. If you know you need
  a step within a particular container (E.g., paragraph, annotation),
  use this for the subtree. This provides very large performance
  when rounding to the closest step, or checking if there are more
  steps left within the subtree.
* Some filters are very expensive. Use the filters you need, but try
  and avoid direct access to the filters themselves. Instead, use
  stepItr.isStep as this is cached to the last position used.
* DON'T WALK LARGE NUMBERS OF STEPS. Step iteration is really meant
  to be fine-tuning of a rough position. Walking a full paragraph via
  step iteration can take upwards of 30-40ms per paragraph, which
  adds up very quickly. WebODF eventually needs to be able to perform
  hundreds of operations per second!

As ever, I'm keen for any feedback or improvements (or requested
examples) on this interface.

Cheers,

Philip

[1] https://open.nlnet.nl/pipermail/webodf/2013-October/000082.html

Philip Peitsch

tags

participants (1)