I actually have a demo of an earlier version up on a nameless Amazon AWS micro instance, built with node.js and redis and a rather terrifying amount of JavaScript code:
There are some weird browser compatibility issues, but it works in vaguely recent versions of Chrome and Safari and Firefox.
There are two parts to it. The core of what I’m making is, essentially, a synchronized string data type: a string where changes are immediately applied locally, and propagate across the network. The other part, necessary for actually using the thing, is making an application which maps documents onto strings. That’s not really hard so much as it is irritating, so I won’t talk more about it.
The synchronized string part is actually pretty cool. You apply some arbitrary patch of insertions and deletions, and send this out over the network. No matter what order these patches are applied in, they’re guaranteed to converge to the same ultimate result—it’s a lot like using a distributed version control system, except that you do a branch and merge on every keystroke, in a way that isn’t slow and doesn’t break horribly. :-)
You get the complete version history with the document, along with per-character records of who inserted it. If you want to have a server holding the canonical version of the document, you can do that, or go serverless. Technically, it’s based on the really clever idea of causal trees (PDF) invented by Victor Grishchenko; I’ve just been adding refinements so it goes fast and doesn’t involve terrifying regular expressions.
No matter what order these patches are applied in, they’re guaranteed to converge to the same ultimate result—it’s a lot like using a distributed version control system, except that you do a branch and merge on every keystroke, in a way that isn’t slow and doesn’t break horribly.
What’s the connection to Darcs? A faster less-powerful version since branches won’t be long-lived and not all that much commutation should be necessary?
It’s pretty similar to Darcs-style patch manipulation. I’ve only skimmed some introductory material on exactly how Darcs does it, but it looks a lot like what Wave does:
Make patches invertible.
Define a function T :: Patch -> Patch -> Patch that transforms one patch to apply to the document after a second patch has already been applied to it.
With these primitives, you can arbitrarily re-order patches. That’s not quite the way I’m doing it, though. I represent the document as a tree of atoms, each one with a unique id and a pointer to a predecessor. You get the document state by doing a pre-order depth first traversal of this tree, where siblings are traversed in an order that’s guaranteed to be the same on all clients, which I won’t define because it would take way too long to write out.
The patches are forests of atoms, and they get anchored to one or more nodes in the document tree. Atoms are never deleted, but they can be made invisible by attaching a special deletor atom to them. This guarantees that patches in the future will always be able to find their anchor points. I’ve tried both this and a system closer to Darcs’, and this was simpler to understand and implement.
In terms of power, this is equivalent to what Darcs does with permuting patches, and it can handle arbitrarily long-lived branches. My system’s patches commute naturally without any transformation, no matter how far document states may have diverged, which is a plus.
As for the speed, all operations are at most linear in the size of the document, with small constant factors, and I’m working on making the common cases all happen in logarithmic time, using monoid-cached trees—the finger tree is a prominent example. I don’t know how this compares with Darcs, but I wouldn’t expect them to be asymptotically faster here.
If I had more time and wasn’t so focused on just getting this done so I can get out of grad school, it would be interesting to make a DVCS based on this.
I actually have a demo of an earlier version up on a nameless Amazon AWS micro instance, built with node.js and redis and a rather terrifying amount of JavaScript code:
http://ec2-50-16-79-81.compute-1.amazonaws.com/notepad/931ab120-b66b-4275-a20c-e4d7b70b4c62
There are some weird browser compatibility issues, but it works in vaguely recent versions of Chrome and Safari and Firefox.
There are two parts to it. The core of what I’m making is, essentially, a synchronized string data type: a string where changes are immediately applied locally, and propagate across the network. The other part, necessary for actually using the thing, is making an application which maps documents onto strings. That’s not really hard so much as it is irritating, so I won’t talk more about it.
The synchronized string part is actually pretty cool. You apply some arbitrary patch of insertions and deletions, and send this out over the network. No matter what order these patches are applied in, they’re guaranteed to converge to the same ultimate result—it’s a lot like using a distributed version control system, except that you do a branch and merge on every keystroke, in a way that isn’t slow and doesn’t break horribly. :-)
You get the complete version history with the document, along with per-character records of who inserted it. If you want to have a server holding the canonical version of the document, you can do that, or go serverless. Technically, it’s based on the really clever idea of causal trees (PDF) invented by Victor Grishchenko; I’ve just been adding refinements so it goes fast and doesn’t involve terrifying regular expressions.
Since you mentioned Haskell...
What’s the connection to Darcs? A faster less-powerful version since branches won’t be long-lived and not all that much commutation should be necessary?
It’s pretty similar to Darcs-style patch manipulation. I’ve only skimmed some introductory material on exactly how Darcs does it, but it looks a lot like what Wave does:
Make patches invertible.
Define a function
T :: Patch -> Patch -> Patch
that transforms one patch to apply to the document after a second patch has already been applied to it.With these primitives, you can arbitrarily re-order patches. That’s not quite the way I’m doing it, though. I represent the document as a tree of atoms, each one with a unique id and a pointer to a predecessor. You get the document state by doing a pre-order depth first traversal of this tree, where siblings are traversed in an order that’s guaranteed to be the same on all clients, which I won’t define because it would take way too long to write out.
The patches are forests of atoms, and they get anchored to one or more nodes in the document tree. Atoms are never deleted, but they can be made invisible by attaching a special deletor atom to them. This guarantees that patches in the future will always be able to find their anchor points. I’ve tried both this and a system closer to Darcs’, and this was simpler to understand and implement.
In terms of power, this is equivalent to what Darcs does with permuting patches, and it can handle arbitrarily long-lived branches. My system’s patches commute naturally without any transformation, no matter how far document states may have diverged, which is a plus.
As for the speed, all operations are at most linear in the size of the document, with small constant factors, and I’m working on making the common cases all happen in logarithmic time, using monoid-cached trees—the finger tree is a prominent example. I don’t know how this compares with Darcs, but I wouldn’t expect them to be asymptotically faster here.
If I had more time and wasn’t so focused on just getting this done so I can get out of grad school, it would be interesting to make a DVCS based on this.
That is pretty neat.
Neat! Thanks for the info.