\documentclass{article}
\usepackage{hyperref}
\begin{document}

\title{Etherpad and EasySync Technical Manual}
\author{AppJet, Inc., with modifications by the Etherpad Foundation}
\date{\today}

\maketitle

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\tableofcontents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Documents}

\begin{itemize}
\item A document is a list of characters, or a string.
\item A document can also be represented as a list of \emph{changesets}.
\end{itemize}

\section{Changesets}

\begin{itemize}
\item A changeset represents a change to a document.
\item A changeset can be applied to a document to produce a new document.
\item When a document is represented as a list of changesets, it is assumed that the first changeset applies to the empty document, [].
\end{itemize}
\section{Changeset representation} \label{representation}

$$(\ell \rightarrow \ell')[c_1,c_2,c_3,\ldots]$$

where

\begin{itemize}
\item[] $\ell$ is the length of the document before the change,
\item[] $\ell'$ is the length of the document after the change,
\item[] $[c_1,c_2,c_3,\ldots]$ is an array of $\ell'$ entries that describes the document after the change.
\end{itemize}

Note that each $c_i$, $1 \leq i \leq \ell'$, is either an integer or a character.

\begin{itemize}
\item Integers represent retained characters: each integer is the zero-based index of a character in the original document.
\item Characters represent insertions.
\end{itemize}
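
\paragraph{Example}
As a small illustration of this representation (the document and changeset here are made up for the example), take the document ``hello'', of length $5$, and the changeset
$$C=(5\rightarrow 4)[``\mathit{j}", 1, 2, 3]$$
Reading the array from left to right, and assuming integers are zero-based indices into the original document: ``j'' is an insertion, while $1$, $2$ and $3$ retain the characters ``e'', ``l'' and ``l''. Applying $C$ therefore yields ``jell'', of length $\ell'=4$. The characters at indices $0$ and $4$ do not appear in the array, so they are deleted.
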
\section{Constraints on Changesets}

\begin{itemize}
\item Changesets are canonical and therefore comparable. When represented in computer memory, we always use the same representation for the same changeset. If the memory representations of two changesets differ, they must be different changesets.
\item Changesets are compact. Thus, if there are two ways to represent a changeset in computer memory, then we always use the representation that takes up the fewest bytes.
\end{itemize}

Later we will discuss optimizations to changeset
representation (using ``strips'' and other such
techniques). The two constraints must apply to any
representation of changesets.
\section{Notation}

\begin{itemize}
\item We use the algebraic multiplication notation to represent changeset application.
\item While changesets are defined as operations on documents, documents themselves are represented as a list of changesets, initially applying to the empty document.
\end{itemize}

\paragraph{Example}
$A=(0\rightarrow 5)[``\mathit{hello}"]$ \\
$B=(5\rightarrow 11)[0-4, ``\mathit{\ world}"]$

We can write the document ``hello world'' as $A\cdot B$ or
just $AB$. Note that the ``initial document'' can be made
into the changeset $(0\rightarrow
N)[``<\mathit{the\ document\ text}>"]$.

When $A$ and $B$ are changesets, we can also refer to $(AB)$ as ``the composition'' of $A$ and $B$. Changesets are closed under composition.

\section{Composition of Changesets}

For any two changesets $A$, $B$ such that

\begin{itemize}
\item[] $A=(n_1\rightarrow n_2)[\cdots]$
\item[] $B=(n_2\rightarrow n_3)[\cdots]$
\end{itemize}

it is clear that there is a third changeset $C=(n_1\rightarrow n_3)[\cdots]$ such that applying $C$ to a document $X$ yields the same resulting document as does applying $A$ and then $B$. In this case, we write $AB=C$.

Given the representation from Section \ref{representation}, it is straightforward to compute the composition of two changesets.
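
\paragraph{Example}
One way to spell out this computation is sketched here. The entry names $a_j$, $b_i$ and $c_i$ are introduced only for this illustration, and integers are read as zero-based indices. Write $A=(n_1\rightarrow n_2)[a_0,\ldots,a_{n_2-1}]$ and $B=(n_2\rightarrow n_3)[b_0,\ldots,b_{n_3-1}]$. Then $AB=(n_1\rightarrow n_3)[c_0,\ldots,c_{n_3-1}]$ with
$$c_i = \left\{ \begin{array}{ll} a_{b_i} & \mbox{if $b_i$ is an integer,} \\ b_i & \mbox{if $b_i$ is a character.} \end{array} \right.$$
For instance, with $A=(0\rightarrow 5)[``\mathit{hello}"]$ and the made-up changeset $B=(5\rightarrow 4)[``\mathit{j}",1,2,3]$, the integers $1$, $2$ and $3$ in $B$ are replaced by the entries ``e'', ``l'' and ``l'' of $A$, while the inserted ``j'' is kept, giving
$$AB=(0\rightarrow 4)[``\mathit{jell}"]$$
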
\section{Changeset Merging}

Now we come to realtime document editing. Suppose two different users make two different changes to the same document at the same time. It is impossible to compose these changes. For example, if we have the document $X$ of length $n$, we may have $A=(n\rightarrow n_a)[\ldots n_a\ \mathrm{characters}]$, $B=(n\rightarrow n_b)[\ldots n_b\ \mathrm{characters}]$ where $n\neq n_a\neq n_b$.

It is impossible to compute $(XA)B$ because $B$ can only be applied to a document of length $n$, and $(XA)$ has length $n_a$. Similarly, $A$ cannot be applied to $(XB)$ because $(XB)$ has length $n_b$.

This is where \emph{merging} comes in. Merging takes two changesets that apply to the same initial document (and that cannot be composed), and computes a single new changeset that preserves the intent of both changes. The merge of $A$ and $B$ is written as $m(A,B)$. For the Etherpad system to work, we require that $m(A,B)=m(B,A)$.

Aside from what we have said so far about merging, there are many different implementations that will lead to a workable system. We have created one implementation for text that has the following constraints.

\section{Follows} \label{follows}

When users $A$ and $B$ have the same document $X$ on their screens, and they proceed to make respective changesets $A$ and $B$, it is no use to compute $m(A,B)$, because $m(A,B)$ applies to document $X$, but the users are already looking at documents $XA$ and $XB$, respectively. What we really want is to compute $B'$ and $A'$ such that

$$XAB' = XBA' = Xm(A,B)$$

``Following'' computes these $B'$ and $A'$ changesets. The definition of the ``follow'' function $f$ is such that $Af(A,B)=Bf(B,A)=m(A,B)=m(B,A)$. When we compute $f(A,B)$:

\begin{itemize}
\item Insertions in $A$ become retained characters in $f(A,B)$.
\item Insertions in $B$ become insertions in $f(A,B)$.
\item Characters retained in \emph{both} $A$ and $B$ are retained in $f(A,B)$.
\end{itemize}
\paragraph{Example}
Suppose we have the initial document $X=(0\rightarrow 8)[``\mathit{baseball}"]$ and user $A$ changes it to ``basil'' with changeset $A$, and user $B$ changes it to ``below'' with changeset $B$.

We have

$X=(0\rightarrow 8)[``\mathit{baseball}"]$ \\
$A=(8\rightarrow 5)[0-1, ``\mathit{si}", 7]$ \\
$B=(8\rightarrow 5)[0, ``\mathit{e}", 6, ``\mathit{ow}"]$

First we compute the merge $m(A,B)=m(B,A)$ according to the constraints

$$m(A,B)=(8\rightarrow 6)[0, ``\mathit{e}", ``\mathit{si}", ``\mathit{ow}"] = (8\rightarrow 6)[0, ``\mathit{esiow}"]$$

Then we need to compute the follows $B'=f(A,B)$ and $A'=f(B,A)$.

$$B'=f(A,B)=(5\rightarrow 6)[0,``\mathit{e}",2,3,``\mathit{ow}"]$$

Note that the numbers $0$, $2$, and $3$ are indices into $A=(8\rightarrow 5)[0,1,``\mathit{si}",7]$:

\begin{tabular}{ccccc}
0 & 1 & 2 & 3 & 4 \\
0 & 1 & s & i & 7
\end{tabular}

$$A'=f(B,A)=(5\rightarrow 6)[0,1,``\mathit{si}",3,4]$$

We can now double check that $AB'=BA'=m(A,B)=(8\rightarrow 6)[0,``\mathit{esiow}"]$.

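
\paragraph{Checking the example}
The double check can be carried out by hand, reading composition as substitution (a sketch of the rule, which this manual does not spell out): every integer in the second changeset is replaced by the entry it indexes in the first, counting entries from $0$ as in the table above, and the characters of the second changeset are kept. For $AB'$, the integers $0$, $2$ and $3$ of $B'$ select the entries $0$, ``s'' and ``i'' of $A$:
$$AB'=(8\rightarrow 6)[0,``\mathit{e}",``\mathit{s}",``\mathit{i}",``\mathit{ow}"]=(8\rightarrow 6)[0,``\mathit{esiow}"]$$
For $BA'$, the integers $0$, $1$, $3$ and $4$ of $A'$ select the entries $0$, ``e'', ``o'' and ``w'' of $B=(8\rightarrow 5)[0,``\mathit{e}",6,``\mathit{ow}"]$:
$$BA'=(8\rightarrow 6)[0,``\mathit{e}",``\mathit{si}",``\mathit{o}",``\mathit{w}"]=(8\rightarrow 6)[0,``\mathit{esiow}"]$$
Both results agree with $m(A,B)$, and applying either one to ``baseball'' gives ``besiow''.
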

Now that the mathematical meaning of the preceding
sections is complete, we can build a client/server
system to support realtime editing by multiple users.

\section{System Overview}

There is a server that holds the current state of a
document. Clients (users) can connect to the server from
their web browsers. The clients and server maintain state
and can send messages to one another in real-time, but
because we are in a web browser scenario, clients cannot
send each other messages directly, and must always go
through the server. (This may distinguish the system from
prior art.)

The other critical design feature of the system is that
\emph{a client must always be able to edit their local
copy of the document, so the user is never blocked from
typing because of waiting to send or receive data.}
\section{Client State}

At any moment in time, a client maintains its state in the
form of 3 changesets. The client document looks like
$A\cdot X \cdot Y$, where

$A$ is the latest server version, the composition of all
changesets committed to the server, from this client or
from others, that the server has informed this client
about. Initially $A=(0\rightarrow N)[<\mathit{initial\ document\ text}>]$.

$X$ is the composition of all changesets this client has
submitted to the server but has not heard back about yet.
Initially $X=(N\rightarrow N)[0,1,2,\ldots, N-1]$, in
other words, the identity, henceforth denoted $I_N$.

$Y$ is the composition of all changesets this client has
made but has not yet submitted to the server.
Initially $Y=(N\rightarrow N)[0,1,2,\ldots, N-1]$.
\section{Client Operations}

A client can do 5 things.

\begin{enumerate}
\item Incorporate new typing into local state
\item Submit a changeset to the server
\item Hear back acknowledgement of a submitted changeset
\item Hear from the server about other clients' changesets
\item Connect to the server and request the initial document
\end{enumerate}

As these 5 events happen, the client updates its
representation $A\cdot X \cdot Y$ according to the
relations that follow. Changes ``move left'' as time goes
by: into $Y$ when the user types, into $X$ when
changesets are submitted to the server, and into $A$ when
the server acknowledges changesets.
\subsection{New local typing}

When a user makes an edit $E$ to the document, the client
computes the composition $(Y\cdot E)$ and updates its local
state, that is, $Y \leftarrow Y\cdot E$: the variable $Y$
holding local unsubmitted changes is assigned the new
value $(Y\cdot E)$.

\subsection{Submitting changesets to server}

When a client submits its local changes to the server, it
transmits a copy of $Y$ and then assigns $Y$ to $X$, and
assigns the identity to $Y$. I.e.,

\begin{enumerate}
\item Send $Y$ to server,
\item $X \leftarrow Y$
\item $Y \leftarrow I_N$
(the identity).
\end{enumerate}

This happens every 500\,ms, as long as the client has
received an acknowledgement of its previous submission:
it must receive an ACK before submitting again. Note that
$X$ is always equal to the identity before the second step
occurs, so no information is lost.

\subsection{Hear ACK from server}

When the client hears an ACK from the server, it sets

$A \leftarrow A\cdot X$ \\
$X \leftarrow I_N$
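
\paragraph{Example}
The three operations above can be traced on a small, made-up session. Suppose the client has connected and received the text ``hello'', so that $N=5$, $A=(0\rightarrow 5)[``\mathit{hello}"]$ and $X=Y=I_5$. In this sketch the identity is always taken on the current length of the document, which is $6$ once a character has been added; that reading is an assumption of the example.
\begin{enumerate}
\item The user types ``!'' at the end: $E=(5\rightarrow 6)[0-4,``\mathit{!}"]$, so $Y \leftarrow Y\cdot E=(5\rightarrow 6)[0-4,``\mathit{!}"]$.
\item The client submits: it sends $Y$ to the server, then sets $X \leftarrow (5\rightarrow 6)[0-4,``\mathit{!}"]$ and $Y \leftarrow I_6$.
\item The server acknowledges: $A \leftarrow A\cdot X=(0\rightarrow 6)[``\mathit{hello!}"]$ and $X \leftarrow I_6$.
\end{enumerate}
After each of the three steps the composition $A\cdot X\cdot Y$ still yields ``hello!''; only the place where the change is recorded moves, from $Y$ to $X$ to $A$.
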
\subsection{Hear about another client's changeset}

When a client hears about another client's changeset $B$,
it computes a new $A$, $X$, and $Y$, which we will call
$A'$, $X'$, and $Y'$ respectively. It also computes a
changeset $D$ which is applied to the current text view on
the client, $V$. Because $AXY$ must always equal the
current view, $AXY=V$ before the client hears about $B$,
and $A'X'Y'=VD$ after the computation is performed.

The steps are:

\begin{enumerate}
\item Compute $A' = AB$
\item Compute $X' = f(B,X)$
\item Compute $Y' = f(f(X,B), Y)$
\item Compute $D=f(Y,f(X,B))$
\item Assign $A \leftarrow A'$, $X \leftarrow X'$, $Y \leftarrow Y'$.
\item Apply $D$ to the current view of the document
displayed on the user's screen.
\end{enumerate}

In steps 2, 3, and 4, $f$ is the follow operation described
in Section \ref{follows}.

\paragraph{Proof that $\mathbf{AXY=V \Rightarrow A'X'Y'=VD}$.}
Substituting $A'X'Y'=(AB)(f(B,X))(f(f(X,B),Y))$, we
recall that merges are commutative. So for any two
changesets $P$ and $Q$,

$$m(P,Q)=m(Q,P)=Qf(Q,P)=Pf(P,Q)$$

Applying this to the relation above, first with $P=X$, $Q=B$ and then with $P=f(X,B)$, $Q=Y$, we see

\begin{eqnarray*}
A'X'Y'&=& AB f(B,X) f(f(X,B),Y) \\
&=&AX f(X,B) f(f(X,B),Y) \\
&=&A X Y f(Y, f(X,B)) \\
&=&A X Y D \\
&=&V D
\end{eqnarray*}

As claimed.
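
\paragraph{Example}
The steps can be checked on a small case. The changesets below are made up for illustration, and the follows are computed by hand from the rules in Section \ref{follows}. Suppose
$$A=(0\rightarrow 2)[``\mathit{ab}"],\quad X=(2\rightarrow 3)[0,1,``\mathit{x}"],\quad Y=(3\rightarrow 4)[0,1,2,``\mathit{y}"]$$
so that the view $V=AXY$ shows ``abxy'', and the client hears about $B=(2\rightarrow 3)[``\mathit{z}",0,1]$, which inserts ``z'' at the start of the server text. Then
\begin{eqnarray*}
f(X,B)&=&(3\rightarrow 4)[``\mathit{z}",0,1,2] \\
A'=AB&=&(0\rightarrow 3)[``\mathit{zab}"] \\
X'=f(B,X)&=&(3\rightarrow 4)[0,1,2,``\mathit{x}"] \\
Y'=f(f(X,B),Y)&=&(4\rightarrow 5)[0,1,2,3,``\mathit{y}"] \\
D=f(Y,f(X,B))&=&(4\rightarrow 5)[``\mathit{z}",0,1,2,3]
\end{eqnarray*}
Composing gives $A'X'Y'=(0\rightarrow 5)[``\mathit{zabxy}"]$, and applying $D$ to the view ``abxy'' also gives ``zabxy'', in agreement with $A'X'Y'=VD$.
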
\subsection{Connect to server}

When a client connects to the server for the first time,
it first generates a random unique ID and sends this to
the server. The client remembers this ID and sends it
with each changeset to the server.

The client receives the latest version of the document
from the server, called HEADTEXT. The client then sets

\begin{itemize}
\item[] $A \leftarrow \mathrm{HEADTEXT}$
\item[] $X \leftarrow I_N$
\item[] $Y \leftarrow I_N$
\end{itemize}

And finally, the client displays HEADTEXT on the screen.

\section{Server Overview}

Like the client(s), the server has state and performs
operations. Operations are only performed in response to
messages from clients.

\section{Server State}

The server maintains a document as an ordered list of
\emph{revision records}. A revision record is a data
structure that contains a changeset and authorship
information.

\begin{verbatim}
RevisionRecord = {
  ChangeSet,
  Source (unique ID),
  Revision Number (consecutive order, starting at 0)
}
\end{verbatim}

For efficiency, the server may also store a variable
called HEADTEXT, which is the composition of all
changesets in the list of revision records. This is an
optimization, because clearly this can be computed from
the set of revision records.
\section{Server Operations Overview}

The server does two things in addition to maintaining
state representing the set of connected clients and
remembering what revision number each client is up to date
with:

\begin{enumerate}
\item Respond to a client's connection requesting the initial document.
\item Respond to a client's submission of a new changeset.
\end{enumerate}

\subsection{Respond to client connect}

When a server receives a connection request from a client,
it receives the client's unique ID and stores that in the
server's set of connected clients. It then sends the
client the contents of HEADTEXT, and the corresponding
revision number. Finally the server notes that this
client is up to date with that revision number.

\subsection{Respond to client changeset}

When the server receives information from a client about
the client's changeset $C$, it does five things:

\begin{enumerate}
\item Notes that this change applies to revision number
$r_c$ (the client's latest revision).
\item Creates a new changeset $C'$ that is relative to the
server's most recent revision number, which we call
$r_H$ ($H$ for HEAD). $C'$ can be computed using
follows (Section \ref{follows}). Remember that the server has a series of
changesets,
$$S_0\rightarrow S_1\rightarrow \ldots \rightarrow S_{r_c}\rightarrow S_{r_c+1} \rightarrow \ldots \rightarrow S_{r_H} $$
$C$ is relative to $S_{r_c}$, but we need to compute $C'$ relative to $S_{r_H}$.
We can compute a new $C$ relative to $S_{r_c+1}$ by computing $f(S_{r_c+1},C)$. Similarly we can repeat for
$S_{r_c+2}$ and so forth until we have $C'$ represented relative to $S_{r_H}$.
\item Sends $C'$ to all other clients.
\item Sends an ACK back to the original client.
\item Adds $C'$ to the server's list of revision records by creating a new revision record out of this and the client's ID.
\end{enumerate}

\appendix
\section*{Additional topics}

\begin{enumerate}
\item Optimizations (strips, more caching, etc.)
\item Pseudocode for composition, merge, and follow
\item How authorship information is used to color-code the document based on who typed what
\item How persistent connections are maintained between client and server
\end{enumerate}
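
\paragraph{Example}
As an illustration of how the server computes $C'$ when responding to a client changeset, consider the following made-up history. The follows are computed by hand from the rules in Section \ref{follows}. Let the text at revision $r_c$ be ``ab'', and let the server have since accepted
$$S_{r_c+1}=(2\rightarrow 3)[``\mathit{z}",0,1],\quad S_{r_c+2}=(3\rightarrow 4)[0,1,2,``\mathit{!}"]$$
so that $r_H=r_c+2$ and HEADTEXT is ``zab!''. A client that is still at revision $r_c$ submits $C=(2\rightarrow 3)[0,``\mathit{x}",1]$, which would turn ``ab'' into ``axb''. The server follows $C$ through each newer revision in turn:
\begin{eqnarray*}
f(S_{r_c+1},C)&=&(3\rightarrow 4)[0,1,``\mathit{x}",2] \\
C'=f(S_{r_c+2},f(S_{r_c+1},C))&=&(4\rightarrow 5)[0,1,``\mathit{x}",2,3]
\end{eqnarray*}
Applying $C'$ to ``zab!'' gives ``zaxb!'', which keeps the client's insertion between ``a'' and ``b''.
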
\end{document}