Why We Chose a Multirepo Architecture for CKEditor 5
Last week I tweeted about the lack of centralised GitHub statistics for CKEditor 5, which is a result of the multirepo architecture that we currently have:
“The thing that I miss most in #CKEditor5's multi-repo architecture is that there's no single contributors graph any more :D” (Piotrek Koszuliński, @reinmarpl, June 3, 2016)
In response, Jan Dudek pointed out that “some notable projects stay with monolithic repos”.
“Interesting that you chose to split. Some notable projects stay with monolithic repos. Where do you see the biggest advantages?” (Jan Dudek, @_janek, June 3, 2016)
I failed to fit my answer into a tweet, so here’s a short article.
# Thumbs Up for Monorepo
To explain why we chose to keep the CKEditor 5 code in multiple repositories, I should perhaps first explain what problems that brings. Or rather, what the advantages of keeping things in a single repository are.
What Jan Dudek might have had in mind is summarized in the “Why is Babel a monorepo?” article. As its authors point out, the topic has been discussed dozens of times, and the main pros of a monorepo architecture are:
- Single lint, build, test and release process.
- Easy to coordinate changes across modules.
- Single place to report issues.
- Easier to set up a development environment.
- Tests across modules are run together, which makes bugs that touch multiple modules easier to find.
It’s all true. We’ve been there and seen these things. However, most of the arguments (like linting, building, releasing, setting up a development environment, testing) are just a matter of creating and utilizing the right tools. We’ve done that, although I admit that it required some time because, as James Kyle writes, the platform (surprisingly) doesn’t support such an environment:
“It's amazing to me that for a community that loves millions of tiny little modules the DX of working across many node modules is so awful” (James Kyle, @thejameskyle, April 20, 2016)
The second group of arguments for a monorepo setup is related to maintenance: reporting issues and coordinating changes across packages. We understand that the community won’t know in which CKEditor repository a certain bug should be reported. Therefore, it’s clear to us that there must be a central repository where uncategorised and vague bugs can be reported. They can be moved to the proper repositories once the real cause or topic is identified. This will require some additional work, but we’ve always done it anyway. Working with a huge and diverse community requires it.
The tooling is a topic for a separate article, but I’ll never write one, so here’s an excerpt from it.
An obvious choice for a packaging system for the code kept in all these repositories was, of course, npm. It works great (although slooow) in a normal situation, when just one or two of your packages are in development mode and you just need to install their dependencies and keep them linked with npm link. Unfortunately, in the case of CKEditor, we may reach something like 100 or more packages in the future (plus, packages may depend on one another). Imagine now that you want to keep 10 of them in a development version, so whenever one of them changes, that change should also affect all dependent packages. This is where linking falls short, especially since npm commands such as install or update break when they encounter linked packages. We’ve seen all kinds of issues and random behaviours which we couldn’t even report. Additionally, it’s super inconvenient because you need to explicitly tell npm to install the globally kept dev versions of dependencies in all dependent packages.
Long story short, we needed to come up with quite a complicated mechanism for updating all these repositories and a builder which can find the proper packages in all this mess. This toolset is still young, but it works fine.
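To make the “a change should affect all dependent packages” problem more concrete, here is a minimal sketch. It is not our actual toolset: the `DependencyGraph` shape, the `findDependents` function and the package names in `exampleGraph` are all hypothetical and exist only for illustration. It computes which packages would need to see a development version of a changed package, directly or transitively.

```typescript
// Hypothetical shape: package name -> names of the packages it depends on.
type DependencyGraph = { [packageName: string]: string[] };

// Returns every package that depends on `changed`, directly or transitively.
// These are the packages that must resolve `changed` to its development copy.
function findDependents(graph: DependencyGraph, changed: string): string[] {
  const found: string[] = [];
  const queue: string[] = [changed];

  while (queue.length > 0) {
    const current = queue.shift() as string;

    for (const name of Object.keys(graph)) {
      const dependsOnCurrent = graph[name].indexOf(current) !== -1;

      // Visit each dependent only once, then look for its own dependents.
      if (dependsOnCurrent && found.indexOf(name) === -1) {
        found.push(name);
        queue.push(name);
      }
    }
  }

  return found.sort();
}

// Illustrative graph only; the real dependencies between packages differ.
const exampleGraph: DependencyGraph = {
  'utils': [],
  'engine': ['utils'],
  'ui': ['utils'],
  'basic-styles': ['engine', 'ui']
};

// A change in a low-level package reaches everything built on top of it.
const affectedByUtils = findDependents(exampleGraph, 'utils');
// affectedByUtils is ['basic-styles', 'engine', 'ui']
```

Even in this tiny graph, a change in one low-level package touches three others. With 100 packages and 10 of them in development mode, doing this by hand with npm link quickly becomes unmanageable.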
# Why Multirepo Then?
Usually we discuss such topics in issues in the CKEditor 5 design repository. This time, however, we didn’t open one, because it was something that we had decided before even starting the project. I guess that it’s high time that we explained our way of thinking.
There are obvious issues when dealing with dozens of repositories and npm packages. We can’t deny that. It slows us down a bit and may be confusing at times. It also required spending a significant amount of time on the toolset, but I still think that (in our case) the cost was worth it.
# It’s a Framework to Be Used by the Community
CKEditor 5 is an editing library that’s going to be used by the community to develop hundreds of features. Therefore, it’s also a framework in which these features can be created in a standardised and controlled way.
We (the core team) are going to create the most popular packages, but the community will create even more of them. We can keep our features together with the framework in one repository, but the community can’t. Other developers will need to work on their packages in their own repositories. That was the case in CKEditor 4 — the plugin architecture was in place, but development and maintenance of 3rd party add-ons was horrible because the entire workflow was designed around a single repository.
By using the same setup as the community we wanted to ensure that we’ll create proper tools for working with the project.
# Packages Are Independent Projects
This is controversial and it’s not black and white. CKEditor 5 is made of a core, an editing engine, a UI library and features. It’s a big project with multiple layers of architecture. We often tend to see such code bases as monoliths. An insignificant change in one module affects a couple of other ones, but it’s not a problem, because it’s all a single repo.
We’ve been there with CKEditor 4 and we’re there with CKEditor 5. It happens that a single change (especially in a lower level of your application) requires changes in a couple of other places. However, it’s often also a result of bad architecture caused by too tight coupling between modules.
Splitting your project into multiple repositories won’t magically fix your architecture, but it’ll force you to think about all the pieces separately. The more often you have to deal with a single change in one package which requires changes in 10 others, the more you’ll think about how to avoid such situations. And from what I can tell, it really helped us. Waterfall situations still happen, but the overall architecture is a lot better.
# Because a Single Repository Doesn’t Solve All Problems With npm
I had a moment of doubt when I came to Fred and said: “I’ve had enough of this, let’s merge all the repos. We only care about having multiple npm packages after all, just like Babel does, right?” But it was then that I understood that all the issues with npm would stay.
If you have one repository which contains multiple packages, you still have multiple package.json files and all those packages depend on each other. So you’re in exactly the same situation as we are: you need to install all of them using symlinks. I’ve seen somewhere in the Babel repository that they have some ugly scripts to do that.
So the only thing we would gain by merging the repositories (from the development tools perspective) would be getting rid of the need to clone and update all these repositories, which is fairly simple, compared to npm issues.
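To show why merging repositories doesn’t remove the symlinking step, here is a rough sketch of what any such script has to work out. It is neither Babel’s script nor ours: the `LocalPackage` and `Symlink` types, the `planSymlinks` function, the `packages/` layout and the `example-*` package names are assumptions made up for this illustration, and `lodash` only stands in for any dependency that comes from the npm registry.

```typescript
// Hypothetical, simplified view of one package.json inside a single repository.
interface LocalPackage {
  name: string;           // npm package name
  dir: string;            // directory relative to the repository root
  dependencies: string[]; // names of the packages it depends on
}

interface Symlink {
  from: string; // path of the link to create
  to: string;   // directory the link should point at
}

// Splits dependencies into those that live in the same repository (and must
// be symlinked) and those that can be installed from the registry as usual.
function planSymlinks(packages: LocalPackage[]): { symlinks: Symlink[]; external: string[] } {
  const dirByName: { [name: string]: string } = {};
  for (const pkg of packages) {
    dirByName[pkg.name] = pkg.dir;
  }

  const symlinks: Symlink[] = [];
  const external: string[] = [];

  for (const pkg of packages) {
    for (const dep of pkg.dependencies) {
      if (Object.prototype.hasOwnProperty.call(dirByName, dep)) {
        // Local package: link it into the dependent's node_modules.
        symlinks.push({ from: pkg.dir + '/node_modules/' + dep, to: dirByName[dep] });
      } else if (external.indexOf(dep) === -1) {
        // Registry package: record it once.
        external.push(dep);
      }
    }
  }

  return { symlinks: symlinks, external: external };
}

// Made-up repository with three packages that depend on each other.
const monorepo: LocalPackage[] = [
  { name: 'example-core', dir: 'packages/example-core', dependencies: ['lodash'] },
  { name: 'example-ui', dir: 'packages/example-ui', dependencies: ['example-core'] },
  { name: 'example-bold', dir: 'packages/example-bold', dependencies: ['example-core', 'example-ui', 'lodash'] }
];

const plan = planSymlinks(monorepo);
// plan.symlinks has 3 entries; plan.external is ['lodash']
```

Three small packages already need three symlinks, and the number grows with every dependency between local packages, no matter whether they live in one repository or in many.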
# Mind the Granularity
The Node.js community is known for its love of creating npm packages for even the tiniest pieces of code. I bet it’s fun, but at the same time it’s ridiculous if you use such packages and end up with hundreds of packages in your boilerplate code (seriously, I got a couple of hundred npm packages installed in a recruitment task that I’ve been reviewing).
We don’t plan to split CKEditor 5 packages that much. For instance, the ckeditor5-basic-styles package is going to contain all basic styling commands like bold, italic, underline, etc. Thanks to that, with something like 50,000 LOC (excluding tests), we have only 15 repositories.
# It’s Working
The number of repositories will grow significantly, but we can’t complain so far. Of course, things are slower (pulling changes to 15 repos or running npm install in newly installed ones takes time) and sometimes we need to touch a couple of repositories at once, but thanks to the development tools that we created (and which we’re going to improve as we go), all these problems are acceptable. At the same time, an inclusive, open platform and its effect on the architecture of the project are a nice return on investment. I believe that, especially in the longer term, we and the community will appreciate it.