SelectionMover and standardising lingo
Hi all, Just quickly brain-dumping some recommendations for how to improve the clarity and reduce terminology confusion within SelectionMover. Some commonly repeated terms that appear badly overloaded in SelectionMover are steps, positions and points. Add to this, the functions available are also extremely confusing both in terms of their names, and the internal code (there is a patch coming today to help with the code side). So, new terms of reference I'm standardising on are: Points ====== Raw DOM positions. This is any valid node & offset pair. Related functions ----------------- SelectionMover.StepCounter.countStepsToPosition - converts the distance between two Points into to it's (roughly) equivalent number of Steps Positions ========= A set of positions as returned by the position iterator. This is a subset of points as these can be filtered at the node level via a NodeFilter. Related functions ----------------- SelectionMover.StepCounter.countForwardSteps SelectionMover.StepCounter.countBackwardSteps - converts the number of Steps into it's equivalent number of Positions SelectionMover.movePointForward SelectionMover.movePointBackward - Move the cursor the requested number of Positions in forward or backwards Steps ===== A filtered subset of positions as returned by the position iterator. This is controlled using a PositionFilter. Related functions ----------------- SelectionMover.StepCounter.countStepsToPosition - converts the distance between two Points into to it's (roughly) equivalent number of Steps SelectionMover.StepCounter.isPositionWalkable - returns true if the supplied iterator is a valid Step SelectionMover.StepCounter.convertForwardStepsBetweenFilters SelectionMover.StepCounter.convertBackwardStepsBetweenFilters - translates between one filtered set of Steps to a different filtered set of Steps. IMPORTANT: this function will only work filter1 only accepts a subset of filter2. countStepsToLineBoundary - convert the distance between the cursor and nearest line boundary into Steps countLinesSteps - convert the distance between the cursor and the same horizontal position X lines up/down into Steps My recommendations (I'm assuming you want to hear these :) ) 1. Standardise SelectionMover on Points and Steps only. Any attempts to move a cursor to a Position that is not also a Step is likely a bug. The only place where we deliberately move in non-Step increments is when trying to put the cursor back into a filtered Step. This same effect could be achieved by calling moveForwardStep(0, filter). 2. Remove or rewrite convert(Forward|Backward)StepsBetweenFilters. The current implementation will not work for arbitrary filter pairs. These should likely be re-written to return the number of filter1-steps when travelling filter2-steps in distance. I would also suggest this takes the same approach as countStepsToPosition and round the cursor back to the Point of the last filter2-step that is smaller-than or equal-to the current cursor Point. Cheers, Philip
Hi Philip, Am Freitag, 4. Oktober 2013, 05:31:20 schrieb Philip Peitsch:
Hi all,
Just quickly brain-dumping some recommendations for how to improve the clarity and reduce terminology confusion within SelectionMover.
Some commonly repeated terms that appear badly overloaded in SelectionMover are steps, positions and points.
Add to this, the functions available are also extremely confusing both in terms of their names, and the internal code (there is a patch coming today to help with the code side).
I agree that the current object and method names make a nice jungle, well, nice for people who like rather chaotic systems. Just I have to say that the proposed use of the lingo machete will still leave me in the danger of being lost and entangled by liana and consumed by selection snakes or cursor tigers. E.g. to me (as non-native speaker obviously) steps are a more relative measure, giving a distance, while position is a more absolute measure, giving a, well, position or location. Then also the meaning of "Point", "Position" and "Step" is slightly arbitrary and will have to be learned and cannot directly derived from the terms itself. So I will forget that again and again. The only one? Perhaps because my brain is structured by my German mother tongue, but I would prefer prefixes to mark the meaning of the positions. Let's see how that can be derived from the definitions of the terms you gave:
Points ====== Raw DOM positions. This is any valid node & offset pair.
Positions ========= A set of positions as returned by the position iterator. This is a subset of points as these can be filtered at the node level via a NodeFilter.
Steps ===== A filtered subset of positions as returned by the position iterator. This is controlled using a PositionFilter.
So "Positions" depend on the type of the node filter (+ built-in EmptyTextNodeFilter). Now given the variety of possible filters the semantics of "Positions" is quite broad, and I actually see no real advantage in having an explicit general distinctive name for a set of non-raw-DOM positions. Especially as the positions are not direct subsets, but actually in different address spaces, as the "offset" property depends on the "seen" nodes, so node:x, offset:y can mean something different with a node-filtered view on the DOM. I would propose to call real DOM positions always "DomPositions" and equally prefix the node-filtered positions by whatever xmltree-view the nodefilter is used for, e.g. with "ODF" or "WebODF" (webodf-extended ODF) to "OdfPosition" or "WebOdfPosition". That way it is more clear against what nodefilter view on the raw DOM a certain position is valid. Similar for "Steps" vs. "Position". While the position-filtered set of positions is in the same address space like the position-unfiltered set, the exact meaning depends on the type of the positionfilter, so that should be indicated by a proper prefix. They are still positions in a certain (virtual) node tree, and by their object type themselves not restricted to other positions. Actually in the current code steps are always used as distance between two (filtered) positions, with the single-numeric distance estimated from the mapping of the position to the number in the iterating sequence, so actually another address system, but with 1:1 mapping (you know that, just noting for myself for the record). Now SelectionMover operates on a certain node tree, defined by the passed root node, which then is node-filtered by the CursorFilter+EmptyTextNodeFilter. And thus has a certain address space, in which the used positions (by node & offset) are valid. Which makes me wonder if there are not incompatibilities, as the users of SelectionMover only pass in a position filter and get back distances which are only valid against that position filter and the address space of the SelectionMover node tree. Seems any results from the methods of SelectionMover are only used again with SelectionMover and thus the same address space, so possibly no problem exists. Okay. One thing I am missing: what is the purpose of the CursorFilter? Why does it filter only webodf-namespaced elements? And not more, like skipping HTML elements, as odf.OdfNodeFilter does? (hm, btw, could there already be alien elements matching HTML tags be loaded from the odt file? not expected, but might need handling WRT to our extension of )
My recommendations (I'm assuming you want to hear these :) )
1. Standardise SelectionMover on Points and Steps only. Any attempts to move a cursor to a Position that is not also a Step is likely a bug. The only place where we deliberately move in non-Step increments is when trying to put the cursor back into a filtered Step. This same effect could be achieved by calling moveForwardStep(0, filter).
2. Remove or rewrite convert(Forward|Backward)StepsBetweenFilters. The current implementation will not work for arbitrary filter pairs. These should likely be re-written to return the number of filter1-steps when travelling filter2-steps in distance. I would also suggest this takes the same approach as countStepsToPosition and round the cursor back to the Point of the last filter2-step that is smaller-than or equal-to the current cursor Point.
Still undecided on these recommendations (to be blamed on my lack of experience in this field and current state of my mind, thank you, cold). As if things are not confusing enough we also use the term "position" with regard to the cursor-walkable textpositions in the linear addressing system. Not only in the op specs, but also in the API of OdtDocument etc. Seems we will get out of words #) I need some more time for this. Cheers Friedrich PS: I also think in the var names we should always explicitly talk about nodeFilter or positionFilter, because that makes a difference as I only now fully realized :) -- Friedrich W. H. Kossebau // KO GmbH http://kogmbh.com/legal/
On 09/10/2013, at 9:10 AM, Friedrich W. H. Kossebau wrote:
E.g. to me (as non-native speaker obviously) steps are a more relative measure, giving a distance, while position is a more absolute measure, giving a, well, position or location. Then also the meaning of "Point", "Position" and "Step" is slightly arbitrary and will have to be learned and cannot directly derived from the terms itself. So I will forget that again and again. The only one?
I agree. I'm not attached to the chosen terms at all… but now that I've put some forward we can debate in an informed manner :)
Perhaps because my brain is structured by my German mother tongue, but I would prefer prefixes to mark the meaning of the positions. Let's see how that can be derived from the definitions of the terms you gave:
Points ====== Raw DOM positions. This is any valid node & offset pair.
Positions ========= A set of positions as returned by the position iterator. This is a subset of points as these can be filtered at the node level via a NodeFilter.
Steps ===== A filtered subset of positions as returned by the position iterator. This is controlled using a PositionFilter.
So "Positions" depend on the type of the node filter (+ built-in EmptyTextNodeFilter). Now given the variety of possible filters the semantics of "Positions" is quite broad, and I actually see no real advantage in having an explicit general distinctive name for a set of non-raw-DOM positions. Especially as the positions are not direct subsets, but actually in different address spaces, as the "offset" property depends on the "seen" nodes, so node:x, offset:y can mean something different with a node-filtered view on the DOM.
Even though the variety of filters is infinite in theory, in practice, there is only a single node filter (pair) in use in WebODF, being CursorFilter + EmptyTextNodeFilter.
I would propose to call real DOM positions always "DomPositions" and equally prefix the node-filtered positions by whatever xmltree-view the nodefilter is used for, e.g. with "ODF" or "WebODF" (webodf-extended ODF) to "OdfPosition" or "WebOdfPosition". That way it is more clear against what nodefilter view on the raw DOM a certain position is valid.
+1 on this. Good proposal.
Similar for "Steps" vs. "Position". While the position-filtered set of positions is in the same address space like the position-unfiltered set, the exact meaning depends on the type of the positionfilter, so that should be indicated by a proper prefix. They are still positions in a certain (virtual) node tree, and by their object type themselves not restricted to other positions. Actually in the current code steps are always used as distance between two (filtered) positions, with the single-numeric distance estimated from the mapping of the position to the number in the iterating sequence, so actually another address system, but with 1:1 mapping (you know that, just noting for myself for the record).
Right. I did a bad job explaining this. The key reason for Points vs. Positions (or DOM Positions vs OdfPositions) is that OdfPosition is actually an integer number. The integer number that means "starting at root, call nextPosition X times". Any time the code uses passes around direct container & offset pairs, I consider this speaking in DOM Positions. The key point of confusion is that there are two integer-based coord systems that mean drastically different things: OdfPosition (Positions in the original email) ================================= PositionIterator + CursorFilter + EmptyTextNodeFilter Integer number indicates how many times to call .nextPosition from root CursorStep (? Steps in the original email) =================================== (PositionIterator + CursorFilter + EmptyTextNodeFilter) + TextPositionFilter Integer number indicates how many times acceptPosition(iterator) will equal FILTER_ACCEPT before the PositionIterator arrives at the desired position.
Now SelectionMover operates on a certain node tree, defined by the passed root node, which then is node-filtered by the CursorFilter+EmptyTextNodeFilter. And thus has a certain address space, in which the used positions (by node & offset) are valid. Which makes me wonder if there are not incompatibilities, as the users of SelectionMover only pass in a position filter and get back distances which are only valid against that position filter and the address space of the SelectionMover node tree. Seems any results from the methods of SelectionMover are only used again with SelectionMover and thus the same address space, so possibly no problem exists. Okay.
The key problem with the current design is that SelectionMover mixes OdfPosition and CursorStep input & output. It returns or takes a number of "steps", but depending on which function depends on which coordinate system the integer is in :-/. The only place that currently uses an additional PositionFilter is the annotation walking, but the mechanism for translation in and out of this is somewhat painful to use (one of the recommendations highlights this).
One thing I am missing: what is the purpose of the CursorFilter? Why does it filter only webodf-namespaced elements? And not more, like skipping HTML elements, as odf.OdfNodeFilter does? (hm, btw, could there already be alien elements matching HTML tags be loaded from the odt file? not expected, but might need handling WRT to our extension of )
The next level of filtering you're referring to is taken care of in TextPositionFilter. This is actually one of the first places I will start to play with for some easy perf improvements.
My recommendations (I'm assuming you want to hear these :) )
1. Standardise SelectionMover on Points and Steps only. Any attempts to move a cursor to a Position that is not also a Step is likely a bug. The only place where we deliberately move in non-Step increments is when trying to put the cursor back into a filtered Step. This same effect could be achieved by calling moveForwardStep(0, filter).
2. Remove or rewrite convert(Forward|Backward)StepsBetweenFilters. The current implementation will not work for arbitrary filter pairs. These should likely be re-written to return the number of filter1-steps when travelling filter2-steps in distance. I would also suggest this takes the same approach as countStepsToPosition and round the cursor back to the Point of the last filter2-step that is smaller-than or equal-to the current cursor Point.
Still undecided on these recommendations (to be blamed on my lack of experience in this field and current state of my mind, thank you, cold).
You have a compromised immune system and you're willing to read emails like this? I salute you, you brave person :) I fear you passed that code on to me during our code review on IRC the other night :-(. I need better virus scanners in my chat client...
As if things are not confusing enough we also use the term "position" with regard to the cursor-walkable textpositions in the linear addressing system. Not only in the op specs, but also in the API of OdtDocument etc. Seems we will get out of words #)
My fear also :-/
I need some more time for this.
Cheers Friedrich
PS: I also think in the var names we should always explicitly talk about nodeFilter or positionFilter, because that makes a difference as I only now fully realized :)
I agree here too.
On Friday, October 04, 2013 05:31:20 AM Philip Peitsch wrote:
Hi all,
Just quickly brain-dumping some recommendations for how to improve the clarity and reduce terminology confusion within SelectionMover.
Some commonly repeated terms that appear badly overloaded in SelectionMover are steps, positions and points.
Add to this, the functions available are also extremely confusing both in terms of their names, and the internal code (there is a patch coming today to help with the code side).
So, new terms of reference I'm standardising on are:
Points ====== Raw DOM positions. This is any valid node & offset pair.
Related functions ----------------- SelectionMover.StepCounter.countStepsToPosition - converts the distance between two Points into to it's (roughly) equivalent number of Steps
Positions ========= A set of positions as returned by the position iterator. This is a subset of points as these can be filtered at the node level via a NodeFilter.
Related functions ----------------- SelectionMover.StepCounter.countForwardSteps SelectionMover.StepCounter.countBackwardSteps - converts the number of Steps into it's equivalent number of Positions
SelectionMover.movePointForward SelectionMover.movePointBackward - Move the cursor the requested number of Positions in forward or backwards
Steps ===== A filtered subset of positions as returned by the position iterator. This is controlled using a PositionFilter.
Related functions ----------------- SelectionMover.StepCounter.countStepsToPosition - converts the distance between two Points into to it's (roughly) equivalent number of Steps
SelectionMover.StepCounter.isPositionWalkable - returns true if the supplied iterator is a valid Step
SelectionMover.StepCounter.convertForwardStepsBetweenFilters SelectionMover.StepCounter.convertBackwardStepsBetweenFilters - translates between one filtered set of Steps to a different filtered set of Steps. IMPORTANT: this function will only work filter1 only accepts a subset of filter2.
countStepsToLineBoundary - convert the distance between the cursor and nearest line boundary into Steps
countLinesSteps - convert the distance between the cursor and the same horizontal position X lines up/down into Steps
My recommendations (I'm assuming you want to hear these :) )
1. Standardise SelectionMover on Points and Steps only. Any attempts to move a cursor to a Position that is not also a Step is likely a bug. The only place where we deliberately move in non-Step increments is when trying to put the cursor back into a filtered Step. This same effect could be achieved by calling moveForwardStep(0, filter).
100% Agreed.
2. Remove or rewrite convert(Forward|Backward)StepsBetweenFilters. The current implementation will not work for arbitrary filter pairs. These should likely be re-written to return the number of filter1-steps when travelling filter2-steps in distance. I would also suggest this takes the same approach as countStepsToPosition and round the cursor back to the Point of the last filter2-step that is smaller-than or equal-to the current cursor Point.
Good point; at the time when I wrote those I was only looking at a specific use-case - mapping from TextPositionFilter-space to (TextPositionFilter & RootFilter)-space, and that approach subconsciously manifested itself here. Agree that this should be made to work with *really* arbitrary filter pairs. In all, thanks for the detailing. We should put this in a README- SelectionMover.txt. Cheers, Aditya
participants (3)
-
Aditya Bhatt
-
Friedrich W. H. Kossebau
-
Philip Peitsch