A Simple Approach to Building a Real-Time Collaborative Text Editor

philmcc · on Oct 7, 2017

So... I'm a "learn by stackexchange" dev and really enjoyed tracking this person's thought process.

I find that most tutorials/explanations come fully baked, with the problems -predictively- solved as opposed to reactively solved, if that makes sense -- the difference between "In order to prevent this problem, we'll do X" and "Hmm, that's not working. Why is that? Ah. Okay, perhaps we can do W... no, that won't work and here's (y, zing) why, let's try X".

Author's blog notwithstanding, where are other places to read other developers essentially plain-speak their way through problem solving / program architecture in a similar fashion?

rudi-c · on Oct 7, 2017

Author here -- I'm glad you like the post! There are more like you describe out there, though they tend to be scattered around the Internet.

https://jvns.ca/ often gets mentioned on HN and has a large volume of posts that follow the format "let's assume both you and I don't know anything - how do we figure this out?". http://jamie-wong.com/, a coworker, has a few good ones on graphics/numerical stuff.

Characterizing unique IDs with emojis instead of an abstract number was inspired by Tim Babb who has a good blog post on Kalman Filters: http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures...

Explaining a concept in person really helps, as you get immediate feedback if they look confused and it helps you figure out in what order you need to introduce concepts. In the case of this project, I was working on it alongside other people who were working their side projects. So I had the chance to explain what I was working on very frequently as we asked each other what we were up to, but I'm still learning to write this way intentionally.

mesbahamin · on Oct 8, 2017

I wrote an article [1] in this style a while back.

I started with a problem that I had no idea how to solve (finding elemental spellings for words like "CrYPtOgRaPHEr"). I started with the most straightforward approach I could think of, and wrote the article gradually as I went through the process of finding better and better solutions.

I ended up learning about data structures, reasoning about time complexity, and using tools to profile Python programs.

There wouldn't have been much to write about if I had just 'gotten things to work', then moved on. I found that writing the article forced me to strive for much deeper understanding than I might have otherwise.

I'm eager to read the other links posted in response to your question.

[1]: https://www.amin.space/blog/2017/5/elemental_speller/

Swizec · on Oct 8, 2017

> where are other places to read other developers essentially plain-speak their way through problem solving / program architecture in a similar fashion?

Livecoding is becoming more and more of a trend. When I do it on my YouTube channel I follow exactly this approach.

I basically use it as an excuse to play with new tech or to try building stuff I need. Except I also turn on the stream and talk out loud like the audience was my rubber ducky.

mostafah · on Oct 8, 2017

Brent Simmons has an interesting habit of writing blog posts in order to help him think through problems. His blog is http://inessential.com, which includes some non-technical stuff too. For a start, open one of his “diary” pages from the bottom of the page, like “Vesper Sync Diary”.

jdlshore · on Oct 9, 2017

My for-fee screencast letscodejavascript.com is exactly that, with a focus on design and rigorous development practice, and coincidentally is about a collaborative drawing app.

JohnHammersley · on Oct 7, 2017

Assuming the author of the article spots this on HN, I'd just like to say a quick thanks for mentioning Overleaf [1] in this! It's a nice easy to read article and it reminded me that we were planning to write one along these lines but never got around to it :)

It's interesting to see how CRDT has been adopted -- we wrote our own OT implementation for Overleaf (back when we were writeLaTeX) and it's still serving us well. We took a lot of inspiration from Etherpad [2], which is still a great collaborative editor, especially for notes.

We're now going over much of our code as we work on the integration with ShareLaTeX [3], which also has a very nice real time track changes / commenting implementation [4]. This helps with the UI aspect of collaboration, which is important on top of the use of OT or CRDT to ensure no version conflicts.

Good luck in your next projects, and I look forward to reading write ups like this of those too :)

[1] https://www.overleaf.com/

[2] http://etherpad.org/

[3] https://www.overleaf.com/blog/518-exciting-news-sharelatex-i...

[4] https://www.sharelatex.com/blog/2017/03/09/track-changes-and...

(Note: Edited for grammar)

alalonde · on Oct 8, 2017

Very cool to see this on the front page. I would agree that for the use case of plain text, yes, the problem has been solved, but for just about anything more complicated it quickly becomes intractable. Rich Text, for example, is extremely difficult to get right (ask the ckeditor guys!)

For those who want real-time collaboration functionality in their apps but don't have the intellectual curiosity (or time!) to learn the ins and outs, we [1] built a general-purpose API for adding real-time collaboration to web apps. Think Firebase, but designed from the ground up for simultaneous editing and with additional first-class support for common UX needs such as shared cursors and selections. We agree that the web is moving in this direction and are excited to see what gets built!

[1] convergencelabs.com

codingdave · on Oct 7, 2017

The problem I ran into when I had to do this on a prior project wasn't the algorithm for worrying about individual characters. It was dealing with how to manage people highlighting entire paragraphs or documents, and pasting over them with new content, at the same time as the other editor(s) had been changing things.

Ultimately, while there were algorithmic answers to just about any scenario, we ended up just declaring a business answer instead - the editor only would let you into edit mode one paragraph at a time, and you locked that paragraph while you were in it. We found that almost nobody was actually editing the exact same paragraph at the same time anyway... they were working in different parts of the documents. So not only did that resolve all possible copy/paste insanity, the algorithm became brain-dead simple -- when you update or leave the paragraph, send those changes to everyone else's editor.
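
To make the paragraph-locking idea concrete, here is a minimal sketch in Python. It is not the code from that project; the class name, method names, and message shape are all hypothetical, and a real system would also need lock timeouts and a network layer.

```python
from typing import Dict, Optional


class ParagraphLocks:
    """Tracks which user currently holds the edit lock on each paragraph."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}  # paragraph id -> user id

    def acquire(self, paragraph_id: str, user_id: str) -> bool:
        """Enter edit mode; returns False if someone else is already editing."""
        holder = self._owner.setdefault(paragraph_id, user_id)
        return holder == user_id

    def release(self, paragraph_id: str, user_id: str, new_text: str) -> Optional[dict]:
        """Leave edit mode and return the update to broadcast, if any."""
        if self._owner.get(paragraph_id) != user_id:
            return None  # caller never held the lock, so nothing to send
        del self._owner[paragraph_id]
        # Hypothetical message shape: the whole paragraph is sent, not a diff.
        return {"paragraph": paragraph_id, "text": new_text, "by": user_id}
```

Because only one user can hold a paragraph at a time, the broadcast never has to be merged with a competing edit; receivers can simply overwrite their copy of that paragraph.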

matt4077 · on Oct 8, 2017

Back when SubEthaEdit was new, I was a writer on a comedy show. Many afternoons, our team of 4 to 6 writers would get into something like a competitive I-can-finish-your-joke-before-you-do edit war. I don't remember ever again being paid for having that much fun.

But that only worked because SubEthaEdit worked absolutely flawlessly with anything we threw at it. You'd have two people editing the same word, yet it never occurred to anybody that we weren't actually all playing with the same piece of clay.

(Moral of the story: "nobody would ever want to do that" is a self-fulfilling prophecy)

hokus · on Oct 8, 2017

That formula should be extended to involve a live audience, each member having a tomato-launching siege engine at the bottom of the window. From the audience's perspective the tomatoes are flying away; from the would-be comedian's perspective they are flying towards his screen, obscuring his vision proportionally.

Then add a gradual set of laugh samples to represent the number of observers who pressed the "yes, that was funny" button. For the jokes with the highest powerlevel animal sounds should be mixed in.

Let's call the invention The Funny Royal.

codingdave · on Oct 8, 2017

Yes, you are correct - I meant nobody in my specific user community. There certainly are other audiences who would edit documents differently than my authors did....

alalonde · on Oct 8, 2017

Yup, I just wrote an article [1] about just this. Simultaneous co-editing doesn't work unless the UX includes the appropriate cues to avoid this sort of thing. There are astoundingly few good examples of good (much less great!) real-time coediting UXes. This is the primary reason why we added first-class support for these cues so that it's not so damn hard!

[1] https://convergencelabs.com/blog/2017/09/what-makes-for-a-gr...

codingdave · on Oct 8, 2017

Agreed. My first UX attempt was to simply put a list of the current editors on a document in the upper-right corner, and highlight the field or paragraph being edited by each person. It worked well enough, but this was also a limited scale audience - just a few dozen legal document authors. A larger scale app might require more, but it met the need we had.

rudi-c · on Oct 8, 2017

Could you elaborate on the kind of bad UX you've encountered, and what solutions can be used to fix them? I can't think of anything that a simple cursor position indicator (and maybe a selection indicator - though that could get messy if the selection is large) doesn't solve.

EGreg · on Oct 8, 2017

That's how Wikimedia does it and it doesn't need all the fancy concurrent editing logic.

muxator · on Oct 8, 2017

MediaWiki locks the whole article (or just a section if the user happens to explicitly choose so), and has no way of doing conflict resolution. It only detects it.
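
As a rough illustration of "detect but don't resolve", here is a generic optimistic check keyed on the revision an edit started from. This is a sketch under my own assumptions, not MediaWiki's actual code; `Article`, `EditConflict`, and `base_revision` are hypothetical names.

```python
class EditConflict(Exception):
    """Raised when an edit was based on a revision that is no longer current."""


class Article:
    def __init__(self, text: str) -> None:
        self.text = text
        self.revision = 0  # bumped on every successful save

    def save(self, new_text: str, base_revision: int) -> int:
        """Accept the edit only if nobody else saved since base_revision."""
        if base_revision != self.revision:
            # The conflict is detected here; resolving it is left to the user.
            raise EditConflict(
                f"edit based on r{base_revision}, current is r{self.revision}"
            )
        self.text = new_text
        self.revision += 1
        return self.revision
```

Two editors who both start from revision 0 illustrate the point: the first save succeeds and moves the article to revision 1, and the second raises `EditConflict` instead of being merged.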

Actually, the way MediaWiki works was what inspired me to better study the problems of version control and concurrent editing in the first place.

EGreg · on Oct 8, 2017

But as codingdave said, there are actual person-level issues with editing the same text together, and it happens rarely anyway.

sophiebits · on Oct 8, 2017

Quip does it this way and I find it pretty frustrating. YMMV.

bla2 · on Oct 8, 2017

I liked Raph's writings on CRDTs: https://medium.com/@raphlinus/working-code-for-operational-t...

jiyinyiyong · on Oct 8, 2017

For the keys, I created a similar library called https://github.com/Cirru/bisection-key to generate string keys.
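
For readers unfamiliar with the idea, here is a minimal Python sketch of generating a string key that sorts between two existing keys. It is not the API or the algorithm of the library linked above; the alphabet, the function name `key_between`, and the rule that keys may not end in the lowest digit are all assumptions made for this illustration.

```python
from typing import Optional

DIGITS = "abcdefghijklmnopqrstuvwxyz"  # assumed alphabet; "a" acts as zero


def key_between(lo: str, hi: Optional[str]) -> str:
    """Return a key k with lo < k < hi under plain string comparison.

    lo == "" means "before everything"; hi is None means "after everything".
    """
    zero = DIGITS[0]
    if hi is not None and lo >= hi:
        raise ValueError("lo must sort before hi")
    if lo.endswith(zero) or (hi is not None and hi.endswith(zero)):
        raise ValueError("keys must not end in the lowest digit")
    if hi:
        # Skip the common prefix, treating a too-short lo as padded with zeros.
        n = 0
        while n < len(hi) and (lo[n] if n < len(lo) else zero) == hi[n]:
            n += 1
        if n > 0:
            return hi[:n] + key_between(lo[n:], hi[n:])
    digit_lo = DIGITS.index(lo[0]) if lo else 0
    digit_hi = DIGITS.index(hi[0]) if hi is not None else len(DIGITS)
    if digit_hi - digit_lo > 1:
        # Room for a single digit strictly between the two leading digits.
        return DIGITS[(digit_lo + digit_hi) // 2]
    if hi and len(hi) > 1:
        return hi[:1]  # a proper prefix of hi sorts just below it
    # Leading digits are adjacent: keep lo's digit and extend the key.
    return DIGITS[digit_lo] + key_between(lo[1:], None)
```

Keys grow by one character only when the gap between neighbors runs out, so an insert never requires renumbering the existing items.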

archagon · on Oct 8, 2017

CRDTs are awesome! I’m working on a similar article going in-depth on Victor Grishchenko’s Causal Tree[1] CRDT, which makes several ingenious decisions (making a tree out of atoms and their causes, storing the full ordered atom history for each site, treating the output data as a DFS traversal of the atom tree, sorting atoms by their on-creation awareness of other atoms) that allow the format to be extended to many other data types, all with generally O(N) performance and permitting git-like document history and per-change blame queries without any extra work. I've grown to think of it as the purest possible expression of the CRDT concept.
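
A heavily simplified Python sketch of the "tree of atoms, output as a DFS traversal" idea follows. It is my own illustration, not code from the paper or from the project below: the `Atom` fields, the (clock, site) identifier, and the newest-first sibling order are assumptions, and deletions are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

AtomId = Tuple[int, str]  # assumed identifier: (logical clock, site id)


@dataclass(frozen=True)
class Atom:
    id: AtomId
    cause: Optional[AtomId]  # the atom this one was typed after; None for the root
    value: str


def render(atoms: List[Atom]) -> str:
    """Rebuild the text as a pre-order DFS over the cause tree."""
    children: Dict[Optional[AtomId], List[Atom]] = {}
    for atom in atoms:
        children.setdefault(atom.cause, []).append(atom)
    out: List[str] = []
    stack = list(children.get(None, []))  # start from the root atom(s)
    while stack:
        atom = stack.pop()
        out.append(atom.value)
        # Push siblings in ascending id order so the newest is visited first.
        stack.extend(sorted(children.get(atom.id, []), key=lambda a: a.id))
    return "".join(out)
```

Since the traversal depends only on the set of atoms and not on the order they arrived in, two sites that have received the same atoms render the same text.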

In my example project[2] (NOT PRODUCTION READY!) I’ve already implemented text editing and basic vector drawing that synchronize over arbitrary network topologies. If you can massage your data into the CT format, you get real-time collaboration, offline mode, cloud sync, and decentralized network support for basically free, all in a thousand lines of comprehensible, functional code! You can even use it with a basic key-value store like CloudKit, which was admittedly one of my main reasons for diving into this class of algorithm.

Hope to finish v1 of the demo and post the article by the end of the month, but writing is very hard...

More on point, I really like how this article dissects and visualizes CRDTs, which can be a difficult topic to broach. If you're looking to implement something like this, however, it should be noted that Logoot has a pretty significant interleaving problem for concurrent operations if you use it to sync individual characters as opposed to lines: https://stackoverflow.com/questions/45722742/logoot-crdt-int...

I think for text this is a deal-breaker, because if one client goes offline for a while, they'll basically corrupt the document if anything they've been working on overlaps with other users' changes. You could solve this with a central server by forcing a re-sync and/or manual merge, but I feel that sort of defeats the purpose of using a CRDT.

[1]: https://ai2-s2-pdfs.s3.amazonaws.com/6534/c371ef78979d7ed84b...

[2]: https://github.com/archagon/crdt-playground

archagon · on Oct 8, 2017

Hmm, FYI, I seem to have broken it a little bit: https://i.imgur.com/qrNeZFJ.png