How collaborative editing drove CKEditor 5’s architecture
Real-time collaboration (RTC) is one of the most in-demand features of modern digital platforms. But after researching the problem and observing failed attempts in other products, we concluded that full support for collaborative editing in a rich text scenario couldn’t simply be overlaid on existing capabilities. A purpose-built architecture had to be designed from the outset, with real-time collaboration treated as a first-class citizen throughout the entire project.
While it sounds simple, for CKSource it meant leaving behind years of WYSIWYG HTML editor experience and the rock-solid code base of CKEditor 4. Our customers valued that code base and we were proud of it – especially since it took an estimated 50+ person-years to build.
A person-month is equivalent to approximately 160 hours of labor – the amount of work performed by a single average worker in one month. So, a project that takes 12 person-months would take 4 developers working together for 3 months to finish. A person-year is equal to twelve person-months.
But the specter of Netscape’s infamous history loomed large over the RTC development plans for the new CKEditor 5 architecture. It’s a well-known example of planning to rewrite a popular program from scratch, and then failing to successfully release the new version. Fortunately, that’s not what happened to us.
After four years of development, the new CKEditor 5 architecture was successfully built – purpose-built for real-time collaboration from its very foundations.
Likewise, the platform’s integrity was further validated with the creation of CKEditor 5’s collaboration features – a collection of Premium plugins enabling users to co-create and edit content in a real-time collaborative environment, similar to Google Docs.
This article describes:
An overview of how collaborative editing was approached
The challenges faced during development
Why delivering real-time collaborative editing that’s capable of handling rich text is a technical challenge that few have mastered.
# Real real-time collaboration
Since collaborative editing is an in-demand feature, many projects claim to support it. However, on closer inspection it’s clear that few projects provide a complete, high-quality solution. Plus, because the terms “collaborative editing” and “collaboration” are broad and understood in a variety of ways, potential users end up even more confused.
In this article, we use the terms “collaboration” and “real-time collaborative editing” interchangeably to refer to our implementation of collaboration in CKEditor 5.
Our goal was to implement real real-time collaboration, without compromising on functionality. To understand what that means, let’s review some alternative solutions available for adding collaboration to an application.
# Issues with alternative collaborative editing solutions
There are numerous shortcuts that enable collaboration in an app that hasn’t been designed for RTC from the ground up. But they all result in a poor user experience (UX). Some of the most common issues are:
Full or partial content locking:
Only one user at a time can edit the document or a given part of the document (i.e., a block element: paragraph, table, list item, etc.).
Collaboration features enabled in read-only mode:
Users are able to make comments on text, but only while editing is disabled.
Manual conflict resolution:
Simultaneous edits to the same piece of content have to be resolved manually by one of the users.
Only basic features enabled in collaborative editing:
Text can be made bold or a heading can be created, but adding tables or nested lists isn’t possible.
Lack of intention preservation:
After conflicts are resolved, the user ends up with different content than what they had intended to create (in other words, poor conflict resolution).
During the development of CKEditor 5, it was crucial to avoid these pitfalls. The goal was to create a solution that delivered true real-time collaborative editing – enabling all users to create and edit content simultaneously, without limitations or missing features.
The guiding development principle was: CKEditor 5 should look, feel and behave the same, no matter whether collaborative editing is on or off.
# Conflict resolution is crucial to collaborative editing
During collaborative editing, users are constantly modifying their local editor content and synchronizing the changes. When two or more users edit the same part of the content, conflicts inevitably appear.
Conflict resolution is what makes or breaks a collaborative editing experience.
For example, when two users remove part of the same paragraph, their editors’ states need to be synchronized. However, this is problematic: when User A receives information from User B, this information is based on User B’s content – which is different from what User A is currently working on.
This is one of the simplest scenarios, but without proper conflict resolution in place, it leads to inconsistency between the content each user can see – which goes on to become a fundamental problem for any collaborative editing solution. Some editors introduce full or partial content locking to prevent this from happening, but this was not the kind of limitation that was acceptable for CKEditor 5.
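The divergence described above can be reproduced in a few lines. The following is a minimal sketch on plain text, not CKEditor 5 code: `DeleteOp`, `applyDelete`, and the sample sentence are hypothetical names chosen for illustration. Each editor applies its own deletion first and then the other user's deletion exactly as received, without any conflict resolution.

```typescript
// A delete expressed as absolute offsets into the text it was created against.
interface DeleteOp {
  offset: number; // index of the first removed character
  count: number;  // number of characters removed
}

function applyDelete(text: string, op: DeleteOp): string {
  return text.slice(0, op.offset) + text.slice(op.offset + op.count);
}

const sharedParagraph = "The quick brown fox";

// Both users start from the same paragraph and delete at the same time.
const deleteByA: DeleteOp = { offset: 4, count: 6 };  // User A removes "quick "
const deleteByB: DeleteOp = { offset: 10, count: 6 }; // User B removes "brown "

// Each editor applies its own change first, then the other user's change as received.
const editorA = applyDelete(applyDelete(sharedParagraph, deleteByA), deleteByB);
const editorB = applyDelete(applyDelete(sharedParagraph, deleteByB), deleteByA);

console.log(JSON.stringify(editorA)); // "The brown "
console.log(JSON.stringify(editorB)); // "The fox"
```

User B's offsets were computed against text that User A had already shortened, so User A's editor deletes the wrong characters and the two editors no longer show the same content.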
During development, it’s also often supposed that in a real-world use case, conflicts won’t happen frequently. This leads to the incorrect conclusion that because they’re hypothetically infrequent, a sophisticated solution isn’t needed – the system should simply reject changes when a conflict is discovered. But that logic is flawed.
In reality, conflicts occur frequently and rejecting one user’s changes to avoid conflicts leads to a poor user experience.
# CKEditor uses Operational Transformation for conflict resolution
There are two main approaches to implementing conflict resolution in real-time collaborative editing: Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDT).
A complete analysis of the OT vs CRDT advantage/disadvantage debate is beyond the scope of this article, but ultimately, CKEditor uses Operational Transformation to resolve conflicts.
# What is Operational Transformation?
Operational Transformation is based on a set of operations (objects describing changes) and algorithms that transform those operations accordingly, so that all users end up with the same editor content regardless of the order in which the operations were received.
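As a rough illustration of that definition, the sketch below transforms two concurrent text insertions so that both clients converge. It assumes a plain-text document and a hypothetical `clientId` tie-breaker; `InsertOp`, `applyInsert`, and `transformInsert` are illustrative names, not part of CKEditor 5.

```typescript
interface InsertOp {
  offset: number;
  text: string;
  clientId: number; // used only to break ties deterministically
}

function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.offset) + op.text + doc.slice(op.offset);
}

// Rewrites `op` so that it can be applied after `applied` has already been applied.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
  const appliedGoesFirst =
    applied.offset < op.offset ||
    (applied.offset === op.offset && applied.clientId < op.clientId);
  return appliedGoesFirst
    ? { ...op, offset: op.offset + applied.text.length }
    : op;
}

const startText = "Hello world";
const fromA: InsertOp = { offset: 5, text: ",", clientId: 1 };  // A wants "Hello, world"
const fromB: InsertOp = { offset: 11, text: "!", clientId: 2 }; // B wants "Hello world!"

// Each client applies its own operation, then the transformed remote one.
const seenByA = applyInsert(applyInsert(startText, fromA), transformInsert(fromB, fromA));
const seenByB = applyInsert(applyInsert(startText, fromB), transformInsert(fromA, fromB));

console.log(seenByA); // Hello, world!
console.log(seenByB); // Hello, world!
```

The operations arrive in a different order on each client, yet both end up with the same text because the later operation is adjusted to account for the earlier one.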
As a concept, Operational Transformation is well-described in IT literature and is proven by existing implementations (although there were none that could serve as a stable and powerful enough base for our needs). Therefore, in 2015, work commenced on CKEditor’s take on Operational Transformation implementation.
However, it quickly became apparent that basic Operational Transformation, as usually described and implemented, isn’t enough to provide a high-quality user experience for rich text editing.
Operational Transformation in its basic form defines three operations: insert, delete, and set attribute. These operations are meant to be executed on a linear data model. They are responsible for inserting text characters, removing text characters and changing their attributes (for example, making text bold). However, a powerful WYSIWYG HTML editor requires more than that.
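To make the three basic operations concrete, here is a small sketch of a linear model in which each character carries its own attributes. The type and function names (`Char`, `BasicOp`, `applyBasicOp`) are assumptions made for this example and do not come from any particular OT library.

```typescript
// One character of the linear model together with its formatting attributes.
interface Char {
  value: string;
  attrs: Record<string, boolean>;
}

type BasicOp =
  | { type: "insert"; offset: number; text: string }
  | { type: "delete"; offset: number; count: number }
  | { type: "setAttribute"; offset: number; count: number; key: string; value: boolean };

function applyBasicOp(doc: Char[], op: BasicOp): Char[] {
  switch (op.type) {
    case "insert": {
      const inserted = Array.from(op.text, (value): Char => ({ value, attrs: {} }));
      return [...doc.slice(0, op.offset), ...inserted, ...doc.slice(op.offset)];
    }
    case "delete":
      return [...doc.slice(0, op.offset), ...doc.slice(op.offset + op.count)];
    case "setAttribute":
      return doc.map((ch, i) =>
        i >= op.offset && i < op.offset + op.count
          ? { ...ch, attrs: { ...ch.attrs, [op.key]: op.value } }
          : ch
      );
  }
}

let linearDoc: Char[] = [];
linearDoc = applyBasicOp(linearDoc, { type: "insert", offset: 0, text: "rich text" });
linearDoc = applyBasicOp(linearDoc, { type: "setAttribute", offset: 0, count: 4, key: "bold", value: true });
linearDoc = applyBasicOp(linearDoc, { type: "delete", offset: 4, count: 1 });

const plainText = linearDoc.map((ch) => ch.value).join("");
const boldText = linearDoc.filter((ch) => ch.attrs["bold"]).map((ch) => ch.value).join("");
console.log(plainText, "/", boldText); // richtext / rich
```

Note that nothing in this model can express "a paragraph inside a block quote": every item is a character at a single flat offset.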
# Operational Transformation must support complex data structures for real collaborative editing
The linear data model usually used in OT is a simple data model that’s perfectly sufficient to represent plain text. However, HTML is a tree-based language, where an element can contain multiple other elements. In the browser, an HTML document is represented as the Document Object Model (DOM), which is tree-structured.
While it’s possible to represent simple, flat structured data in an OT linear model, this model falls short when it comes to complex data structures, like tables, captioned images or lists containing block elements.
In a linear model, elements simply cannot contain other elements. For example, a block quote can’t contain a list item or a heading. Our development therefore needed to go a step further and provide Operational Transformation algorithms that work on a tree data structure.
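For comparison with a flat list of characters, a tree model might look like the sketch below. It is an illustrative assumption rather than the actual CKEditor 5 data model: elements own an ordered list of children, and a location is addressed by a path (one child index per tree level) instead of a single offset.

```typescript
interface ModelText {
  kind: "text";
  data: string;
}

interface ModelElement {
  kind: "element";
  name: string;
  children: ModelNode[];
}

type ModelNode = ModelElement | ModelText;

const el = (name: string, ...children: ModelNode[]): ModelElement => ({ kind: "element", name, children });
const txt = (data: string): ModelText => ({ kind: "text", data });

// A block quote that contains a heading and a list item: impossible in a linear model.
const treeRoot = el(
  "root",
  el("paragraph", txt("Intro")),
  el("blockQuote", el("heading", txt("Quoted title")), el("listItem", txt("Quoted point")))
);

// Resolves a path such as [1, 1] ("second child of the root, then its second child").
function nodeAtPath(root: ModelElement, path: number[]): ModelNode {
  let node: ModelNode = root;
  for (const index of path) {
    if (node.kind !== "element") {
      throw new Error("Path descends into a text node");
    }
    const child: ModelNode | undefined = node.children[index];
    if (!child) {
      throw new Error("No child at index " + index);
    }
    node = child;
  }
  return node;
}

const quotedItem = nodeAtPath(treeRoot, [1, 1]);
console.log(quotedItem.kind === "element" ? quotedItem.name : quotedItem.data); // listItem
```

Once positions are paths, every transformation has to reason about several indices at once, which is part of why OT on trees is harder than OT on flat text.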
When work began in 2015, there was only one paper about Operational Transformation for trees and no evidence of anyone actively working on OT for trees. Our initial efforts were based on the little research we’d found, but the reality turned out even more challenging than expected.
The first implementation took over a year, with several significant reworks over the next two years. Those intense efforts, however, produced not only the engine for real-time collaboration, but also a complete end-user solution that validates the data – work that would otherwise be purely theoretical.
Figure 2 and Figure 3 below show:
Simple structured content, as it’s represented in a linear data model
A more complex piece of rich text, represented in a tree-structured data model.
# Advanced conflict resolution used in collaborative editing
However, on its own, switching to the tree data model wasn’t enough to ensure bulletproof real-time collaboration. It was quickly realized that the basic set of operations (insert, delete, set attribute) is insufficient to handle real-life scenarios in a graceful way.
While those three operations, built into Operational Transformation, may provide enough semantics to implement conflict resolution in a linear data model, they don’t satisfy the semantics of rich text editing.
Below are five example situations where users simultaneously perform an action on the same piece of content. Each scenario has a correct and an incorrect example of conflict resolution.
To properly handle these and many other situations, our Operational Transformation algorithms needed heavy enhancement. The most important enhancement made was adding a new set of operations: element rename, split, merge, insert text, marker.
The goal of these additions was to better express the semantics of any user changes. That, in turn, allowed the implementation of better conflict resolution algorithms. In further detail, the additional operations were:
The rename operation, to handle renaming elements – for example, to change a paragraph into a heading or a list item.
The split and merge operations, to better describe user intention.
The insert text operation, to differentiate between inserting text content and elements.
Plus, the marker operation, to conveniently mark given ranges of content (e.g. for the comments feature), although that’s unrelated to conflict resolution.
# Why add four new operations to Operational Transformation?
Thanks to the new operations, more contextual transformation algorithms can be written. This way, more complex use cases (like Scenarios 1–4 shown above) can be resolved.
To explain further: The rename, split, and merge actions can be executed by a combination of insert, move and remove operations. For example, splitting a paragraph can be represented as a combination of “insert a new paragraph” + “move a part of the old paragraph to the new paragraph”. However, the split operation is semantic-focused – it conveys the user’s intention. It carries more meaning than a simple sequence of insert + move, which are just two isolated actions, executed one after the other.
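The difference can be sketched on a toy document made of paragraph strings. All names here (`LowLevelOp`, `SplitOp`, `transformPositionBySplit`) are hypothetical and the model is deliberately simplified. Both routes produce the same content, but only the split operation tells a transformation function where the tail of the paragraph went.

```typescript
type Paragraphs = string[];

type LowLevelOp =
  | { type: "insertParagraph"; index: number }
  | { type: "moveText"; from: number; offset: number; to: number };

interface SplitOp {
  type: "split";
  paragraph: number;
  offset: number;
}

function applyLowLevel(doc: Paragraphs, op: LowLevelOp): Paragraphs {
  const next = [...doc];
  if (op.type === "insertParagraph") {
    next.splice(op.index, 0, "");
  } else {
    // Move everything from `offset` to the end of `from` to the start of `to`.
    const source = next[op.from] ?? "";
    next[op.from] = source.slice(0, op.offset);
    next[op.to] = source.slice(op.offset) + (next[op.to] ?? "");
  }
  return next;
}

function applySplit(doc: Paragraphs, op: SplitOp): Paragraphs {
  const source = doc[op.paragraph] ?? "";
  return [
    ...doc.slice(0, op.paragraph),
    source.slice(0, op.offset),
    source.slice(op.offset),
    ...doc.slice(op.paragraph + 1),
  ];
}

interface TextPosition {
  paragraph: number;
  offset: number;
}

// Knowing it was a split, a concurrent position can follow the text into the new paragraph.
function transformPositionBySplit(pos: TextPosition, split: SplitOp): TextPosition {
  if (pos.paragraph > split.paragraph) {
    return { paragraph: pos.paragraph + 1, offset: pos.offset };
  }
  if (pos.paragraph === split.paragraph && pos.offset >= split.offset) {
    return { paragraph: pos.paragraph + 1, offset: pos.offset - split.offset };
  }
  return pos;
}

const beforeSplit: Paragraphs = ["Hello world"];
const splitOp: SplitOp = { type: "split", paragraph: 0, offset: 6 };
const lowLevelOps: LowLevelOp[] = [
  { type: "insertParagraph", index: 1 },
  { type: "moveText", from: 0, offset: 6, to: 1 },
];

const viaSplit = applySplit(beforeSplit, splitOp);
const viaInsertAndMove = lowLevelOps.reduce(applyLowLevel, beforeSplit);
const movedCaret = transformPositionBySplit({ paragraph: 0, offset: 8 }, splitOp);

console.log(viaSplit, viaInsertAndMove, movedCaret);
```

A position that another user held at "Hello wo|rld" ends up at "wo|rld" in the new paragraph. With the isolated insert and move operations, the first transformation would only see an empty paragraph being inserted and would have no hint that text is about to follow.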
Side note: It’s believed that the set of necessary operations is strongly related to the semantics of the tree data that you’re representing. A rich text editor has a different nature than, for example, a genealogical tree, and hence requires a different set of operations.
# Further Operational Transformation extensions needed for collaborative editing
Adding the above new operations didn’t solve all the problems encountered. Our Operational Transformation implementation needed to be further extended, to handle the cumulative scenarios our team had experienced. The most significant additions made were:
The graveyard root:
A special data tree root where removed nodes are stored. It enables better conflict resolution in situations such as Scenario 5 (above), where User A changes some content and that content is simultaneously removed by User B.
Generalizing operations to work on ranges:
Operations work on ranges instead of singular nodes, for better processing and memory efficiency.
Sometimes, when being transformed, an operation needs to be broken into two operations, for example when part of the content is removed, as shown above in Scenario 5.
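The graveyard root and the breaking of range operations can be illustrated together. In this sketch, which uses hypothetical names and a flat list of characters instead of a real tree, User A makes a range bold while User B concurrently removes part of that range. Removed characters are moved to a `graveyard` list rather than destroyed, so User A's operation can be broken into two pieces: one for the content that is still in the document and one for the content now in the graveyard. The transformation assumes the incoming range targets the `main` root.

```typescript
type RootName = "main" | "graveyard";
interface StyledChar { value: string; bold: boolean }
interface Doc { main: StyledChar[]; graveyard: StyledChar[] }
interface RangeRef { root: RootName; offset: number; count: number }
interface RemoveOp { offset: number; count: number; graveyardOffset: number }

function setBold(doc: Doc, range: RangeRef): Doc {
  const mark = (chars: StyledChar[]): StyledChar[] =>
    chars.map((ch, i) =>
      i >= range.offset && i < range.offset + range.count ? { ...ch, bold: true } : ch
    );
  return range.root === "main"
    ? { main: mark(doc.main), graveyard: doc.graveyard }
    : { main: doc.main, graveyard: mark(doc.graveyard) };
}

// Removing content moves it to the graveyard instead of deleting it.
function removeToGraveyard(doc: Doc, op: RemoveOp): Doc {
  const main = [...doc.main];
  const removed = main.splice(op.offset, op.count);
  const graveyard = [...doc.graveyard];
  graveyard.splice(op.graveyardOffset, 0, ...removed);
  return { main, graveyard };
}

// Rewrites a range on "main" so it can be applied after `removed` has been applied.
function transformRangeByRemove(range: RangeRef, removed: RemoveOp): RangeRef[] {
  const rangeEnd = range.offset + range.count;
  const removedEnd = removed.offset + removed.count;
  const overlapStart = Math.max(range.offset, removed.offset);
  const overlap = Math.max(0, Math.min(rangeEnd, removedEnd) - overlapStart);

  // Where an offset in "main" ends up once the removed characters are gone.
  const shift = (offset: number): number =>
    offset <= removed.offset ? offset
    : offset >= removedEnd ? offset - removed.count
    : removed.offset;

  const pieces: RangeRef[] = [];
  if (range.count - overlap > 0) {
    pieces.push({ root: "main", offset: shift(range.offset), count: range.count - overlap });
  }
  if (overlap > 0) {
    pieces.push({
      root: "graveyard",
      offset: removed.graveyardOffset + (overlapStart - removed.offset),
      count: overlap,
    });
  }
  return pieces;
}

const initialDoc: Doc = {
  main: Array.from("abcdefghij", (value): StyledChar => ({ value, bold: false })),
  graveyard: [],
};
const boldByA: RangeRef = { root: "main", offset: 2, count: 6 }; // "cdefgh"
const removeByB: RemoveOp = { offset: 4, count: 2, graveyardOffset: 0 }; // "ef"

// User A: own change first, then B's removal.
const docAtA = removeToGraveyard(setBold(initialDoc, boldByA), removeByB);
// User B: own removal first, then A's change broken into pieces.
const brokenPieces = transformRangeByRemove(boldByA, removeByB);
const docAtB = brokenPieces.reduce(setBold, removeToGraveyard(initialDoc, removeByB));

console.log(brokenPieces.length, JSON.stringify(docAtA) === JSON.stringify(docAtB)); // 2 true
```

Because the removed characters still exist in the graveyard, User A's formatting is not lost: if the removal were later undone, the restored text would come back bold.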
Selective undo mechanisms:
The undo feature needs to be aware of collaborative editing, so, for example, a user is only able to undo their own changes, not someone else’s.
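One possible shape for selective undo is sketched below on plain text. It is an assumption-laden toy, not CKEditor 5's undo implementation: `SharedText`, `LoggedOp`, and `transformSpan` are hypothetical names, and only insertions can be undone. To undo a user's last insertion, the span it originally covered is carried forward through every later operation in the log, so that only that user's characters are removed.

```typescript
interface LoggedOp {
  user: string;
  kind: "insert" | "delete";
  offset: number;
  length: number;
  undone: boolean;
}

interface Span { start: number; end: number }

// Moves a span past one later operation; the result may be zero, one, or two spans.
function transformSpan(span: Span, later: LoggedOp): Span[] {
  if (later.kind === "insert") {
    if (later.offset <= span.start) {
      return [{ start: span.start + later.length, end: span.end + later.length }];
    }
    if (later.offset < span.end) {
      // Someone typed inside our text: keep their characters out of the undo.
      return [
        { start: span.start, end: later.offset },
        { start: later.offset + later.length, end: span.end + later.length },
      ];
    }
    return [span];
  }
  const deletedEnd = later.offset + later.length;
  const shift = (position: number): number =>
    position <= later.offset ? position
    : position >= deletedEnd ? position - later.length
    : later.offset;
  const moved = { start: shift(span.start), end: shift(span.end) };
  return moved.end > moved.start ? [moved] : [];
}

class SharedText {
  content = "";
  private readonly log: LoggedOp[] = [];

  insert(user: string, offset: number, text: string): void {
    this.content = this.content.slice(0, offset) + text + this.content.slice(offset);
    this.log.push({ user, kind: "insert", offset, length: text.length, undone: false });
  }

  private remove(user: string, offset: number, length: number): void {
    this.content = this.content.slice(0, offset) + this.content.slice(offset + length);
    this.log.push({ user, kind: "delete", offset, length, undone: false });
  }

  // Undoes the latest insertion made by `user`; other users' changes stay intact.
  undoLastInsert(user: string): boolean {
    let target: LoggedOp | undefined;
    let targetIndex = -1;
    for (let i = this.log.length - 1; i >= 0 && !target; i--) {
      const op = this.log[i];
      if (op && op.user === user && op.kind === "insert" && !op.undone) {
        target = op;
        targetIndex = i;
      }
    }
    if (!target) return false;

    let spans: Span[] = [{ start: target.offset, end: target.offset + target.length }];
    for (const later of this.log.slice(targetIndex + 1)) {
      const next: Span[] = [];
      for (const span of spans) next.push(...transformSpan(span, later));
      spans = next;
    }
    // Remove from right to left so earlier spans keep their offsets.
    for (const span of spans.reverse()) this.remove(user, span.start, span.end - span.start);
    target.undone = true;
    return true;
  }
}

const sharedText = new SharedText();
sharedText.insert("A", 0, "abcd");
sharedText.insert("B", 2, "XY"); // content is now "abXYcd"
sharedText.undoLastInsert("A");
console.log(sharedText.content); // XY
```

User B typed in the middle of User A's text, so User A's undo is itself broken into two removals that leave "XY" untouched.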
# Real-time collaborative editing in CKEditor 5
So far, we’ve discussed implementing real-time collaborative editing in general terms. Those low-level topics were platform-agnostic, but the second part of this article covers the specific end-user features and the CKEditor 5 architecture that allows us to implement them.
# 1. Support for rich text editing features
The new CKEditor 5 framework is built to support all rich text editor features in collaboration mode. These features range from simple ones (like text styling and image drag and drop) to complex ones (like undo and redo, nested lists or tables).
Since the mechanisms used in real-time collaborative editing lie within the very foundation of the CKEditor 5 architecture, any new feature releases for the rich text editor will also work in collaboration mode.
# 2. Support for third-party plugins
Usually, a WYSIWYG HTML editor is a component within a bigger platform or application, so the CKEditor 5 architecture needed to be flexible and easily extensible. This ensures any custom features created are as fully supported in a collaborative environment as the core CKEditor plugins.
As a result, if you develop your own piece of CKEditor functionality, it’s highly likely you won’t need to write even a single line of code to enable it for collaboration.
The CKEditor 5 framework makes it easy for developers to build custom features for real-time collaborative editing:
# Data abstraction (model-view-controller architecture)
The rich text editor content (the data) is abstracted from both the view and from the DOM (the browser’s content representation). This delivers an important benefit: abstract data is much easier to operate on. Therefore:
- A content element, for example, an image widget, can be represented as one element in the data model, instead of a few elements (as it is in the DOM or HTML). This means the feature code can be much simpler.
# A single entry point for changes
Every change performed on the editor data, internally, always results in creating at least one operation. Operations are atomic data objects, describing the change.
- These are then used to synchronize data between collaborating clients. This is what ensures every CKEditor 5 feature is supported out-of-the-box in real-time collaborative editing mode.
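The idea of a single entry point can be sketched with a toy model. `ToyModel`, `ToyWriter`, and `applyRemote` are hypothetical names for this illustration and are not the CKEditor 5 API. Every local change goes through `change()`, each writer call becomes an operation, and those operations are all that a second client needs in order to reproduce the same content. Concurrent edits, and therefore transformation, are left out of this sketch.

```typescript
type ModelOp =
  | { type: "insertText"; offset: number; text: string }
  | { type: "removeText"; offset: number; count: number };

type OperationListener = (op: ModelOp) => void;

interface ToyWriter {
  insertText(text: string, offset: number): void;
  remove(offset: number, count: number): void;
}

class ToyModel {
  text = "";
  private readonly listeners: OperationListener[] = [];

  onOperation(listener: OperationListener): void {
    this.listeners.push(listener);
  }

  // The only way to modify the data locally: every writer call becomes an operation.
  change(callback: (writer: ToyWriter) => void): void {
    callback({
      insertText: (text, offset) => this.execute({ type: "insertText", offset, text }, true),
      remove: (offset, count) => this.execute({ type: "removeText", offset, count }, true),
    });
  }

  // Applies an operation received from another client; it is not broadcast again.
  applyRemote(op: ModelOp): void {
    this.execute(op, false);
  }

  private execute(op: ModelOp, isLocal: boolean): void {
    this.text =
      op.type === "insertText"
        ? this.text.slice(0, op.offset) + op.text + this.text.slice(op.offset)
        : this.text.slice(0, op.offset) + this.text.slice(op.offset + op.count);
    if (isLocal) {
      for (const listener of this.listeners) listener(op);
    }
  }
}

const localModel = new ToyModel();
const remoteModel = new ToyModel();
const sentOps: string[] = [];

// Stand-in for the network: serialize each operation and replay it on the other client.
localModel.onOperation((op) => {
  const wire = JSON.stringify(op);
  sentOps.push(wire);
  remoteModel.applyRemote(JSON.parse(wire) as ModelOp);
});

localModel.change((writer) => {
  writer.insertText("Hello world", 0);
  writer.remove(5, 6);
  writer.insertText("!", 5);
});

console.log(localModel.text, remoteModel.text, sentOps.length); // Hello! Hello! 3
```

A feature written against the writer never has to know that a second client exists, which is the property the paragraph above describes.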
# A simple API built on a powerful foundation
All the mechanisms responsible for the collaborative magic are hidden from the developer. Instead, you get an API that resembles what you’re already used to.
- Changing the data tree is easy, thanks to intuitive methods that perform actions which are then translated into operations behind the scenes.
# Data conversion decoupled from data synchronization
After the editor data model is changed, the changes are converted to the editor view (a custom, DOM-like data structure) and then rendered to the real DOM. Most importantly, only the editor data is synchronized – the conversion is done on every client independently.
- This means even a complicated feature, if represented by an easy abstraction, is still supported in the collaborative environment.
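As a sketch of why this helps, consider a captioned image stored as a single model element. The structures below (`ImageModel`, `ViewElement`, `convertImage`, `renderView`) are assumptions for illustration and do not mirror CKEditor 5's conversion API. Only the compact model object would travel between clients; each client expands it into the richer view structure on its own. The renderer skips HTML escaping for brevity.

```typescript
// One abstract model element stands for the whole captioned image.
interface ImageModel {
  type: "image";
  src: string;
  alt: string;
  caption: string;
}

interface ViewElement {
  tag: string;
  attributes: Record<string, string>;
  children: Array<ViewElement | string>;
}

// Conversion: one model element becomes several view elements.
function convertImage(model: ImageModel): ViewElement {
  return {
    tag: "figure",
    attributes: { class: "image" },
    children: [
      { tag: "img", attributes: { src: model.src, alt: model.alt }, children: [] },
      { tag: "figcaption", attributes: {}, children: [model.caption] },
    ],
  };
}

// Rendering: turns the view structure into markup (no escaping in this sketch).
function renderView(node: ViewElement | string): string {
  if (typeof node === "string") return node;
  const attrs = Object.keys(node.attributes)
    .map((key) => ` ${key}="${node.attributes[key]}"`)
    .join("");
  if (node.tag === "img") return `<img${attrs}>`;
  return `<${node.tag}${attrs}>${node.children.map(renderView).join("")}</${node.tag}>`;
}

const imageOnClientA: ImageModel = { type: "image", src: "cat.png", alt: "A cat", caption: "Our cat" };

// Only the model data is synchronized; the JSON round trip stands in for the network.
const imageOnClientB = JSON.parse(JSON.stringify(imageOnClientA)) as ImageModel;

// Each client converts and renders independently.
const htmlOnClientA = renderView(convertImage(imageOnClientA));
const htmlOnClientB = renderView(convertImage(imageOnClientB));

console.log(htmlOnClientA === htmlOnClientB); // true
```

Changing the caption is a change to one model property; neither client has to synchronize the `figure`, `img`, or `figcaption` nodes themselves.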
Markers are ranges or selections on content that are trackable and automatically kept in sync while the data tree is being changed and also during collaboration.
- Thanks to them, creating features like user selection or adding a comment to the text is a breeze.
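A marker that follows its content can be sketched as a range that is transformed by every change. This toy works on plain text with hypothetical helper names (`Marker`, `transformMarkerByInsert`, `transformMarkerByRemove`); the choice that text inserted exactly at the marker's start lands outside the marker is an assumption of this example.

```typescript
interface Marker {
  name: string;
  start: number;
  end: number;
}

function transformMarkerByInsert(marker: Marker, offset: number, length: number): Marker {
  if (offset <= marker.start) {
    // Text inserted before the marker pushes the whole range to the right.
    return { ...marker, start: marker.start + length, end: marker.end + length };
  }
  if (offset < marker.end) {
    // Text inserted inside the marker makes the range grow.
    return { ...marker, end: marker.end + length };
  }
  return marker;
}

function transformMarkerByRemove(marker: Marker, offset: number, count: number): Marker {
  const removedEnd = offset + count;
  const shift = (position: number): number =>
    position <= offset ? position : position >= removedEnd ? position - count : offset;
  return { ...marker, start: shift(marker.start), end: shift(marker.end) };
}

const markedText = (text: string, marker: Marker): string => text.slice(marker.start, marker.end);

let noteText = "Please review this sentence today.";
let commentMarker: Marker = { name: "comment:1", start: 14, end: 27 }; // "this sentence"

// Another user inserts "kindly " before the commented text.
noteText = noteText.slice(0, 7) + "kindly " + noteText.slice(7);
commentMarker = transformMarkerByInsert(commentMarker, 7, 7);
const afterInsert = markedText(noteText, commentMarker);

// Then someone removes "this " from inside the commented text.
noteText = noteText.slice(0, 21) + noteText.slice(26);
commentMarker = transformMarkerByRemove(commentMarker, 21, 5);
const afterRemove = markedText(noteText, commentMarker);

console.log(afterInsert, "/", afterRemove); // this sentence / sentence
```

The feature that created the marker, such as a comment, only reads the marker's current range; it never recalculates positions itself.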
Post-fixers are callbacks that are called after the editor data changes. They’re not exclusive to collaboration, but can be used to fix the editor model if your feature is complicated.
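The post-fixer idea can be sketched as a loop that runs registered callbacks after each change until none of them reports a modification. `FixableDocument` and both fixers below are hypothetical examples on a list of paragraph strings, not CKEditor 5 code; the round limit is an added safeguard for this toy.

```typescript
// A post-fixer inspects the data and returns true if it had to change something.
type PostFixer = (paragraphs: string[]) => boolean;

class FixableDocument {
  readonly paragraphs: string[] = [""];
  private readonly postFixers: PostFixer[] = [];

  registerPostFixer(fixer: PostFixer): void {
    this.postFixers.push(fixer);
  }

  change(mutate: (paragraphs: string[]) => void): void {
    mutate(this.paragraphs);
    // Re-run all fixers until the data is stable (with a safety limit on rounds).
    for (let round = 0; round < 10; round++) {
      let changed = false;
      for (const fixer of this.postFixers) {
        if (fixer(this.paragraphs)) changed = true;
      }
      if (!changed) break;
    }
  }
}

const fixableDoc = new FixableDocument();

// Rule 1: the document must always contain at least one paragraph.
fixableDoc.registerPostFixer((paragraphs) => {
  if (paragraphs.length > 0) return false;
  paragraphs.push("");
  return true;
});

// Rule 2: paragraphs that contain only whitespace are not allowed.
fixableDoc.registerPostFixer((paragraphs) => {
  let changed = false;
  for (let i = paragraphs.length - 1; i >= 0; i--) {
    const paragraph = paragraphs[i];
    if (paragraph !== undefined && paragraph.length > 0 && paragraph.trim() === "") {
      paragraphs.splice(i, 1);
      changed = true;
    }
  }
  return changed;
});

// A change that leaves the data in an invalid state gets repaired automatically.
fixableDoc.change((paragraphs) => {
  paragraphs.splice(0, paragraphs.length, "   ", "\t");
});

console.log(JSON.stringify(fixableDoc.paragraphs)); // [""]
```

In a collaborative setting the same repair would run on every client after remote changes are applied, which is why fixers need to be deterministic.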
# 3. CKEditor 5’s real-time collaboration backend
Real-time collaboration requires a server (backend) to propagate changes between connected clients. A backend server also offers additional benefits:
Your changes are not lost if you accidentally close the document. A temporary backup in the cloud is always available.
Your changes are propagated to other connected users, even if you temporarily lose your internet connection.
The backend of CKEditor has been implemented as a SaaS solution, ready for zero-effort instant integration with your application. However, if you can’t use a cloud solution for any reason, an on-premise version of the collaboration server is also available.
Significant time and effort were spent on designing and implementing a highly optimized client-server communication protocol for real-time collaboration. Learn more about some of the optimizations implemented, including how server traffic was reduced 10-20 times, in our post on data compression.
# 4. Building dedicated collaboration features
Apart from enabling users to share and edit the same document simultaneously (try it for yourself in our real-time collaboration demo), dedicated collaboration features are constantly being added, to deliver the ultimate collaborative editing experience in the CKEditor 5 Ecosystem.
Here’s a list of our real-time collaboration features, including the ones that were rolled out after this post was first published:
User selection visibility – visual highlights at the exact place where another user is editing, to demonstrate collaboration and help users locate each other within the document. Released 2018.
Presence list – shows profile pictures or avatars of all users who are currently editing the document. Released 2018.
Mentions – with configurable autocompletion, provide a way to quickly insert and link names or phrases. Released April 2019.
Track Changes (Premium plugin) – edit content in suggestion mode and then accept or reject those changes. Released February 2019.
Revision History (Premium plugin) – create document versions and view, compare, restore, or rename them in a preview mode. Released June 2021.
# Next steps for collaborative editing in CKEditor 5
Building a next gen collaborative rich text editor began with the assumption that real-time collaborative editing had to be its core feature – lying at its very foundation. That also meant a complete rewrite and rebuild of the CKEditor architecture from scratch.
After four years of research and development, we created an Operational Transformation implementation that was extended to support tree-based data structures (necessary for rich text content) and to handle advanced conflict resolution. The successful implementation of CKEditor 5 with a collaboration-first ethos included the CKEditor 5 editor itself and CKEditor 5 Real-time Collaboration features.
Behind the scenes, the implementation involved a massive effort and exceeded our development time estimates by a factor of two, with:
Number of tickets closed: 5,700
Number of tests: 12,500
Code coverage: 100%
Development team: 25+
Estimated number of person-years spent on the project (to Sept 2018): 42
As the numbers convey, building real real-time collaboration into CKEditor 5 was an enormous undertaking. But it was worth the effort to master the challenge of collaborative editing in a rich text editor and to deliver exactly what our users needed and wanted.
See all the Premium collaboration features in action, including Comments, Track Changes, Revision History, and Real-time Collaborative Editing, in our complete Collaboration demo. To use Premium plugins, you need to purchase a CKEditor 5 Commercial License. CKEditor 5 with real-time collaboration can be smoothly integrated into any software solution to give teams a modern, powerful way to work together on content creation.
This post was originally published on