Hi everyone,
I'm starting to work towards improving the paste support in webodf so that it can
do more than just paste plain text in a single line.
The first step here is obviously designing what might work. I've put some documents
together on this and have them sitting out in a branch (insert-fragment in my
repo if anyone is interested).
Here is where I'm at so far:
Both of these are available in the branch as well at
- https://gitorious.org/webodf/peitschie-webodf/source/insert-fragment:webodf…
- https://gitorious.org/webodf/peitschie-webodf/source/insert-fragment:webodf…
Copy & paste behaviour and support
==================================
Overview
--------
Paste support is extremely important to get right for an editor. Users have an expectation that data can be replicated
with a high degree of accuracy when copied and pasted. This includes things such as pasting images, styled text (e.g.,
bold, underline), paragraph breaks, lists etc.
This README does not cover details about how data is written to or retrieved from the clipboard. For that information, please
see README_clipboard.txt.
Desired paste support
---------------------
* Plain text with paragraphs, tabs, spaces
* HTML text with direct formatting
* HTML tables
* HTML table rows & columns
* Images (standalone)
* Images within mixed HTML fragments (e.g., HTML fragment with paragraphs & images etc.)
* Lists (bulleted & numbered)
Requirements
------------
1. Paste should be able to be undone/redone safely
2. Extra "formatting" steps should be able to be undone without removing the pasted content. E.g., automatically converting
to a list or table (optional advanced feature)
3. Want to avoid duplicating logic in other operations (e.g., paragraph splitting & merging behaviours, image insertion,
style addition, adding new list items etc.)
4. Any new operations must be able to be OT'ed easily
Design
------
There are effectively two opposing approaches that can be taken to handle pasting of new data:
1. Create a complex operation (e.g., OpPasteData) that is responsible for determining how to insert a fragment into the
document
2. Create a paste handler that attempts to break the paste fragment into a series of smaller operations
Option 1
- Pro: Easily integrates with existing undo/redo manager
- Pro: Fewer operations generated (less on-the-wire traffic)
- Con: OpPasteData is likely to duplicate a lot of logic from existing ops
Option 2
- Pro: Better re-use of existing operations
- Pro: Less complex operations required
- Con: Need significant re-work of undo operation grouping to allow paste to be undone/redone
OT adaptation of a paste command is relatively straightforward for both options, as both largely generate insert-only
operations. This means that usually only the start position needs to be shifted to cope with characters added or removed
by other operations.
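To make the position shifting concrete, here is a minimal sketch of transforming an insert-only op against another op that was applied first. The op shapes (`type`, `position`, `length`) and the function name `transformInsert` are assumptions made for illustration, not WebODF's actual operation specs, and the tie-breaking rule for two inserts at the same position is an arbitrary choice.

```javascript
// Hypothetical op shapes, for illustration only:
//   { type: "insert", position: number, length: number }
//   { type: "remove", position: number, length: number }

// Shift the start position of a pending insert so that it still points at
// the same place after another op has been applied first.
function transformInsert(pending, applied) {
    let position = pending.position;
    if (applied.type === "insert" && applied.position <= position) {
        // Text was added at or before our insert point: move right.
        // (Treating an equal position as "before" is an arbitrary tie-break.)
        position += applied.length;
    } else if (applied.type === "remove" && applied.position < position) {
        // Text was removed before our insert point: move left, but never
        // further than the start of the removed range.
        position -= Math.min(applied.length, position - applied.position);
    }
    return { ...pending, position: position };
}

const paste = { type: "insert", position: 10, length: 5 };
console.log(transformInsert(paste, { type: "insert", position: 4, length: 3 }).position);  // 13
console.log(transformInsert(paste, { type: "remove", position: 8, length: 6 }).position);  // 8
console.log(transformInsert(paste, { type: "remove", position: 12, length: 2 }).position); // 10
```

Because every op produced by a paste would be insert-like, the same shift could be applied to each op in the group in turn.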
Based on the pros and cons, Option 2 is the better approach for paste handling. The key argument is that it makes
better use of existing operations (requirement #3). The existing undo manager grouping logic is not very extensible, and
should be reworked anyway.
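As a rough sketch of what reworked grouping could look like, the class below keeps each undo step as a group of ops, so a paste made of many small ops is undone in one go while a later auto-format step stays separately undoable (requirements 1 and 2). `GroupingUndoManager` and its methods are hypothetical names; this is not the existing WebODF undo manager.

```javascript
// Hypothetical undo manager, for illustration only.
class GroupingUndoManager {
    constructor() {
        this.undoStack = [];   // each entry is one group: an array of ops
        this.redoStack = [];
        this.openGroup = null; // ops collected since beginGroup()
    }

    beginGroup() {
        this.endGroup();
        this.openGroup = [];
    }

    record(op) {
        if (this.openGroup) {
            this.openGroup.push(op);
        } else {
            this.undoStack.push([op]); // an ungrouped op is its own group
        }
        this.redoStack = []; // new edits invalidate the redo history
    }

    endGroup() {
        if (this.openGroup && this.openGroup.length > 0) {
            this.undoStack.push(this.openGroup);
        }
        this.openGroup = null;
    }

    // Returns the ops to revert, most recent first.
    undo() {
        this.endGroup();
        const group = this.undoStack.pop();
        if (!group) {
            return [];
        }
        this.redoStack.push(group);
        return group.slice().reverse();
    }

    // Returns the ops to re-apply, in their original order.
    redo() {
        const group = this.redoStack.pop();
        if (!group) {
            return [];
        }
        this.undoStack.push(group);
        return group.slice();
    }
}

const manager = new GroupingUndoManager();
manager.beginGroup();            // the paste itself
manager.record("split");
manager.record("insert");
manager.endGroup();
manager.beginGroup();            // optional auto-formatting
manager.record("convertToList");
manager.endGroup();
console.log(manager.undo());     // [ 'convertToList' ]
console.log(manager.undo());     // [ 'insert', 'split' ]
```

The first undo removes only the formatting step, leaving the pasted content in place; the second removes the whole paste.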
Example paste steps
-------------------
1. Extract data from clipboard. Order of preference is
- Custom webodf fragment ("application/vnd.webodf")
- LO/MSWord fragment (??)
- RTF fragment (??)
- HTML ("text/html")
- Plain text ("text/plain")
2. Convert data into webodf fragment (and associated styles) using appropriate import filter
3. Split the fragment up into separate paragraphs
4. Start a new transaction/undo group (this is a new feature...)
5. Add any new named styles (Op???)
6. Add any new auto styles (Op???)
7. For each paragraph
- start a new paragraph at the current position (OpSplitParagraph)
- insert the new paragraph (OpInsertFragment)
8. After all paragraphs have been inserted, remove the FIRST created split to merge the first inserted paragraph with
its previous sibling (allows the paragraphs to merge with the correct paragraph merge logic)
9. Finish transaction/undo group (Actually, this probably happens on the next edit op start)
10. Auto-convert things to lists, links, etc. (optional). This should be in a new transaction/undo group
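The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the op spec fields (`optype`, `position`, `text`), the `RemoveSplit` op name, and the way positions advance are all hypothetical, and the clipboard formats marked "??" are left out because their MIME types are not settled. Only `OpSplitParagraph` and `OpInsertFragment` come from the steps themselves.

```javascript
// Step 1: order of preference. Only the MIME types named in the steps are
// listed; the "??" formats are omitted.
const PREFERRED_TYPES = ["application/vnd.webodf", "text/html", "text/plain"];

function pickClipboardType(availableTypes) {
    const match = PREFERRED_TYPES.find((type) => availableTypes.includes(type));
    return match === undefined ? null : match;
}

// Steps 7 and 8: turn already-split paragraphs into a flat list of op specs.
// Positions are counted in cursor steps; how they advance is an assumption.
function createPasteOps(paragraphs, position) {
    const ops = [];
    let cursor = position;
    paragraphs.forEach((text) => {
        ops.push({ optype: "SplitParagraph", position: cursor });
        cursor += 1; // assume the split adds one step for the new boundary
        ops.push({ optype: "InsertFragment", position: cursor, text: text });
        cursor += text.length;
    });
    if (paragraphs.length > 0) {
        // Step 8: remove the first split so the first pasted paragraph
        // merges into its previous sibling. "RemoveSplit" is a placeholder.
        ops.push({ optype: "RemoveSplit", position: position });
    }
    return ops;
}

console.log(pickClipboardType(["text/plain", "text/html"])); // text/html
console.log(createPasteOps(["ab", "c"], 5).map((op) => op.optype + "@" + op.position).join(" "));
// SplitParagraph@5 InsertFragment@6 SplitParagraph@8 InsertFragment@9 RemoveSplit@5
```

All of the returned ops would be recorded inside a single transaction/undo group (steps 4 and 9), with any auto-conversion from step 10 going into a second group.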
Questions
---------
* Should pasting multiple paragraphs into a list result in a new list item per paragraph?
* Should links be automatically converted?
Anyone have thoughts, concerns, feedback or cookies (I'm a little hungry after all this research…)?
Cheers,
Philip
Hi everyone,
I have been looking at some funky "valid" ODT documents and discovering some fun
behaviours around cursor positioning. I thought this was a good time to bring up
some potential changes, as I noticed on IRC the other night that Friedrich had also
discovered that the user is sometimes unable to place the cursor inside a text:a block.
The existing rules for valid cursor positions only allow placing the cursor
inside what is known as a "grouping element", which is either a span, p or h. This
has already caused me some challenges, because for one of the highlight overlays
I'm doing on top of webodf I need to wrap document text in a normal HTML span,
which then prevents the cursor from entering.
In the situation Friedrich found, there is actually no requirement saying the text
content within a text:a element must be placed in a span.
The suggestion on IRC yesterday was to just add text:a into the grouping element
definitions (a reasonable one), but I got to thinking a little more about whether
this is the best long-term solution. Especially as there are other elements that
could possibly contain character data. And for extensibility purposes, ideally
we don't want to have to redo this core cursor positioning logic every time the
UI requires some extra containers and wrappers to help display things :).
Re-reading the ODF specs[1], the preferred approach laid out is actually a
blacklist, not a whitelist as we're currently doing. The blacklist required
for processing is already defined in StyleHelper.isAcceptedNode, and is used
by OpRemoveText and OpApplyDirectStyling.
Would anyone have any problems if I changed OdtDocument.TextPositionFilter to use
the blacklist approach instead? The blacklist will need an additional entry to
exclude the cursor as well, so I'll put that in also.
From my testing with this, it appears to function identically to the existing
approach, with the added bonus of being able to navigate within text:a tags
that don't contain a span :)
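The difference between the two approaches can be sketched as below. Each text position is modelled as the list of element names that contain it, from the paragraph down to the immediate parent. The blacklist entries are illustrative placeholders only and are not the actual contents of StyleHelper.isAcceptedNode; the function names are hypothetical as well.

```javascript
// Grouping elements from the current whitelist rule.
const GROUPING_ELEMENTS = new Set(["text:p", "text:h", "text:span"]);
// Illustrative blacklist entries only (assumption), plus the cursor itself.
const REJECTED_ELEMENTS = new Set(["text:note-body", "office:annotation", "cursor"]);

// Whitelist: the immediate container must be a known grouping element.
function whitelistAccepts(containers) {
    return GROUPING_ELEMENTS.has(containers[containers.length - 1]);
}

// Blacklist: any container is fine unless one of them is explicitly rejected.
function blacklistAccepts(containers) {
    return !containers.some((name) => REJECTED_ELEMENTS.has(name));
}

const cases = [
    ["text:p", "text:span"], // ordinary styled text
    ["text:p", "text:a"],    // link text without a span
    ["text:p", "html:span"], // UI wrapper such as a highlight overlay
    ["text:p", "cursor"]     // inside the cursor element
];
cases.forEach((containers) => {
    console.log(containers.join(" > "), whitelistAccepts(containers), blacklistAccepts(containers));
});
// text:p > text:span true true
// text:p > text:a false true
// text:p > html:span false true
// text:p > cursor false false
```

With the blacklist, new wrapper elements become navigable without touching the filter, while anything that must stay closed to the cursor has to be listed explicitly.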
Cheers,
Philip
P.S., sorry for all the list spam lately! Apparently I have too much time for
philosophy :)
[1] http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#F…
Hi all,
I'm just putting together a quick plan of what I'll be looking at with regards
to performance, to address some queries & requests raised as part of MR#114
(https://gitorious.org/webodf/webodf/merge_requests/114)
As part of this next cycle, my goal is to have editing of a 20 page ODT and
flat ODT with images up to a level that is responsive. At the moment, deleting a
single character at the end of a sample 11 page document produces an extremely
noticeable 500ms delay before the character is gone.
From my initial investigation so far my plan of attack for this is:
1. Improve obviously suboptimal paths:
OdtDocument
* TextPositionFilter - Most of the container checks should be filtered nodes
2. Reduce the number of times the average run needs to step through
the document to find a position:
* e.g., upgradeWhiteSpace is called at a specific position. Any Op using
this therefore steps through the document up to that specific position usually
2 or 3 times.
3. Implement a bookmark system to quickly retrieve iterators at specific
positions within the document.
I've used one of these internally for several months now at a different layer
above webodf, and have found this to be the most significant improvement.
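One possible shape for such a bookmark system is sketched below. The document is modelled as a flat array of node lengths (in cursor steps), and the walker remembers a checkpoint every `interval` steps so that later lookups start from the nearest checkpoint instead of the document start. `BookmarkedWalker` and its fields are hypothetical names, and the sketch ignores invalidating bookmarks after an edit, which a real implementation would need to handle.

```javascript
// Hypothetical bookmark cache, for illustration only.
class BookmarkedWalker {
    constructor(nodeLengths, interval) {
        this.nodeLengths = nodeLengths; // steps contained in each node
        this.interval = interval;       // minimum gap between bookmarks
        this.bookmarks = [{ steps: 0, nodeIndex: 0 }]; // sorted by steps
        this.nodesVisited = 0;          // counts the walking work done
    }

    // Binary search for the last bookmark at or before the target step.
    nearestBookmark(target) {
        let low = 0;
        let high = this.bookmarks.length - 1;
        while (low < high) {
            const mid = Math.ceil((low + high) / 2);
            if (this.bookmarks[mid].steps <= target) {
                low = mid;
            } else {
                high = mid - 1;
            }
        }
        return this.bookmarks[low];
    }

    // Find the node and offset holding the target step, walking forward
    // from the nearest bookmark and leaving new bookmarks along the way.
    locate(target) {
        let { steps, nodeIndex } = this.nearestBookmark(target);
        while (nodeIndex < this.nodeLengths.length
                && steps + this.nodeLengths[nodeIndex] <= target) {
            steps += this.nodeLengths[nodeIndex];
            nodeIndex += 1;
            this.nodesVisited += 1;
            const last = this.bookmarks[this.bookmarks.length - 1];
            if (steps - last.steps >= this.interval) {
                this.bookmarks.push({ steps: steps, nodeIndex: nodeIndex });
            }
        }
        return { nodeIndex: nodeIndex, offset: target - steps };
    }
}

const walker = new BookmarkedWalker(new Array(1000).fill(10), 100);
console.log(walker.locate(9995), walker.nodesVisited); // { nodeIndex: 999, offset: 5 } 999
console.log(walker.locate(9995), walker.nodesVisited); // { nodeIndex: 999, offset: 5 } 1008
```

In this toy run the first lookup near the end of the document visits 999 nodes, while repeating it visits only 9, which is the kind of saving that matters when one op looks up the same position 2 or 3 times.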
The plan with each of these is to introduce benchmarking numbers to allow the
performance improvements to be proven. As such, I don't plan on addressing
any of the performance related concerns in MR#114, as they are literally (from
initial profiling checks) drops in a very large ocean of improvement.
If people have other ideas, complaints, etc., as ever, I'm open to anything :)
Cheers,
Philip