first commit

2025-04-24 13:11:28 +08:00
commit ff9c54d5e4
5960 changed files with 834111 additions and 0 deletions

services/document-updater/.gitignore vendored Normal file

@@ -0,0 +1,52 @@
compileFolder
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so
# Packages #
############
# it's better to unpack these files and commit the raw source
# git has its own built in compression methods
*.7z
*.dmg
*.gz
*.iso
*.jar
*.rar
*.tar
*.zip
# Logs and databases #
######################
*.log
*.sql
*.sqlite
# OS generated files #
######################
.DS_Store?
ehthumbs.db
Icon?
Thumbs.db
/node_modules/*
forever/
**.swp
# Redis cluster
**/appendonly.aof
**/dump.rdb
**/nodes.conf
# managed by dev-environment$ bin/update_build_scripts
.npmrc

services/document-updater/.mocharc.json Normal file

@@ -0,0 +1,3 @@
{
"require": "test/setup.js"
}

services/document-updater/.nvmrc Normal file

@@ -0,0 +1 @@
20.18.2

services/document-updater/Dockerfile Normal file

@@ -0,0 +1,27 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
FROM node:20.18.2 AS base
WORKDIR /overleaf/services/document-updater
# Google Cloud Storage needs a writable $HOME/.config for resumable uploads
# (see https://googleapis.dev/nodejs/storage/latest/File.html#createWriteStream)
RUN mkdir /home/node/.config && chown node:node /home/node/.config
FROM base AS app
COPY package.json package-lock.json /overleaf/
COPY services/document-updater/package.json /overleaf/services/document-updater/
COPY libraries/ /overleaf/libraries/
COPY patches/ /overleaf/patches/
RUN cd /overleaf && npm ci --quiet
COPY services/document-updater/ /overleaf/services/document-updater/
FROM app
USER node
CMD ["node", "--expose-gc", "app.js"]

services/document-updater/LICENSE Normal file

@@ -0,0 +1,662 @@
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<http://www.gnu.org/licenses/>.

services/document-updater/Makefile Normal file

@@ -0,0 +1,156 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
BUILD_NUMBER ?= local
BRANCH_NAME ?= $(shell git rev-parse --abbrev-ref HEAD)
PROJECT_NAME = document-updater
BUILD_DIR_NAME = $(shell pwd | xargs basename | tr -cd '[a-zA-Z0-9_.\-]')
DOCKER_COMPOSE_FLAGS ?= -f docker-compose.yml
DOCKER_COMPOSE := BUILD_NUMBER=$(BUILD_NUMBER) \
BRANCH_NAME=$(BRANCH_NAME) \
PROJECT_NAME=$(PROJECT_NAME) \
MOCHA_GREP=${MOCHA_GREP} \
docker compose ${DOCKER_COMPOSE_FLAGS}
COMPOSE_PROJECT_NAME_TEST_ACCEPTANCE ?= test_acceptance_$(BUILD_DIR_NAME)
DOCKER_COMPOSE_TEST_ACCEPTANCE = \
COMPOSE_PROJECT_NAME=$(COMPOSE_PROJECT_NAME_TEST_ACCEPTANCE) $(DOCKER_COMPOSE)
COMPOSE_PROJECT_NAME_TEST_UNIT ?= test_unit_$(BUILD_DIR_NAME)
DOCKER_COMPOSE_TEST_UNIT = \
COMPOSE_PROJECT_NAME=$(COMPOSE_PROJECT_NAME_TEST_UNIT) $(DOCKER_COMPOSE)
clean:
-docker rmi ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-docker rmi us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
-$(DOCKER_COMPOSE_TEST_UNIT) down --rmi local
-$(DOCKER_COMPOSE_TEST_ACCEPTANCE) down --rmi local
HERE=$(shell pwd)
MONOREPO=$(shell cd ../../ && pwd)
# Run the linting commands in the scope of the monorepo.
# Eslint and prettier (plus some configs) are on the root.
RUN_LINTING = docker run --rm -v $(MONOREPO):$(MONOREPO) -w $(HERE) node:20.18.2 npm run --silent
RUN_LINTING_CI = docker run --rm --volume $(MONOREPO)/.editorconfig:/overleaf/.editorconfig --volume $(MONOREPO)/.eslintignore:/overleaf/.eslintignore --volume $(MONOREPO)/.eslintrc:/overleaf/.eslintrc --volume $(MONOREPO)/.prettierignore:/overleaf/.prettierignore --volume $(MONOREPO)/.prettierrc:/overleaf/.prettierrc --volume $(MONOREPO)/tsconfig.backend.json:/overleaf/tsconfig.backend.json ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) npm run --silent
# Same but from the top of the monorepo
RUN_LINTING_MONOREPO = docker run --rm -v $(MONOREPO):$(MONOREPO) -w $(MONOREPO) node:20.18.2 npm run --silent
SHELLCHECK_OPTS = \
--shell=bash \
--external-sources
SHELLCHECK_COLOR := $(if $(CI),--color=never,--color)
SHELLCHECK_FILES := { git ls-files "*.sh" -z; git grep -Plz "\A\#\!.*bash"; } | sort -zu
shellcheck:
@$(SHELLCHECK_FILES) | xargs -0 -r docker run --rm -v $(HERE):/mnt -w /mnt \
koalaman/shellcheck:stable $(SHELLCHECK_OPTS) $(SHELLCHECK_COLOR)
shellcheck_fix:
@$(SHELLCHECK_FILES) | while IFS= read -r -d '' file; do \
diff=$$(docker run --rm -v $(HERE):/mnt -w /mnt koalaman/shellcheck:stable $(SHELLCHECK_OPTS) --format=diff "$$file" 2>/dev/null); \
if [ -n "$$diff" ] && ! echo "$$diff" | patch -p1 >/dev/null 2>&1; then echo "\033[31m$$file\033[0m"; \
elif [ -n "$$diff" ]; then echo "$$file"; \
else echo "\033[2m$$file\033[0m"; fi \
done
format:
$(RUN_LINTING) format
format_ci:
$(RUN_LINTING_CI) format
format_fix:
$(RUN_LINTING) format:fix
lint:
$(RUN_LINTING) lint
lint_ci:
$(RUN_LINTING_CI) lint
lint_fix:
$(RUN_LINTING) lint:fix
typecheck:
$(RUN_LINTING) types:check
typecheck_ci:
$(RUN_LINTING_CI) types:check
test: format lint typecheck shellcheck test_unit test_acceptance
test_unit:
ifneq (,$(wildcard test/unit))
$(DOCKER_COMPOSE_TEST_UNIT) run --rm test_unit
$(MAKE) test_unit_clean
endif
test_clean: test_unit_clean
test_unit_clean:
ifneq (,$(wildcard test/unit))
$(DOCKER_COMPOSE_TEST_UNIT) down -v -t 0
endif
test_acceptance: test_acceptance_clean test_acceptance_pre_run test_acceptance_run
$(MAKE) test_acceptance_clean
test_acceptance_debug: test_acceptance_clean test_acceptance_pre_run test_acceptance_run_debug
$(MAKE) test_acceptance_clean
test_acceptance_run:
ifneq (,$(wildcard test/acceptance))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance
endif
test_acceptance_run_debug:
ifneq (,$(wildcard test/acceptance))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run -p 127.0.0.9:19999:19999 --rm test_acceptance npm run test:acceptance -- --inspect=0.0.0.0:19999 --inspect-brk
endif
test_clean: test_acceptance_clean
test_acceptance_clean:
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) down -v -t 0
test_acceptance_pre_run:
ifneq (,$(wildcard test/acceptance/js/scripts/pre-run))
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance test/acceptance/js/scripts/pre-run
endif
benchmarks:
$(DOCKER_COMPOSE_TEST_ACCEPTANCE) run --rm test_acceptance npm run benchmarks
build:
docker build \
--pull \
--build-arg BUILDKIT_INLINE_CACHE=1 \
--tag ci/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--tag us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER) \
--tag us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME) \
--cache-from us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):$(BRANCH_NAME) \
--cache-from us-east1-docker.pkg.dev/overleaf-ops/ol-docker/$(PROJECT_NAME):main \
--file Dockerfile \
../..
tar:
$(DOCKER_COMPOSE) up tar
publish:
docker push $(DOCKER_REPO)/$(PROJECT_NAME):$(BRANCH_NAME)-$(BUILD_NUMBER)
.PHONY: clean \
format format_fix \
lint lint_fix \
build_types typecheck \
lint_ci format_ci typecheck_ci \
shellcheck shellcheck_fix \
test test_clean test_unit test_unit_clean \
test_acceptance test_acceptance_debug test_acceptance_pre_run \
test_acceptance_run test_acceptance_run_debug test_acceptance_clean \
benchmarks \
build tar publish \
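A usage sketch, not part of the commit: every target runs inside Docker, so only make, git and Docker are assumed on the host.

make format lint typecheck   # static checks via the monorepo toolchain
make test                    # static checks plus unit and acceptance tests
make build                   # CI image tagged ci/document-updater:BRANCH_NAME-BUILD_NUMBER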

services/document-updater/README.md Normal file

@@ -0,0 +1,12 @@
overleaf/document-updater
===========================
An API for applying incoming updates to documents in real-time.
License
-------
The code in this repository is released under the GNU AFFERO GENERAL PUBLIC LICENSE, version 3. A copy can be found in the `LICENSE` file.
Copyright (c) Overleaf, 2014-2019.

services/document-updater/app.js Normal file

@@ -0,0 +1,299 @@
// Metrics must be initialized before importing anything else
require('@overleaf/metrics/initialize')
const Metrics = require('@overleaf/metrics')
const express = require('express')
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
logger.initialize('document-updater')
logger.logger.addSerializers(require('./app/js/LoggerSerializers'))
const RedisManager = require('./app/js/RedisManager')
const DispatchManager = require('./app/js/DispatchManager')
const DeleteQueueManager = require('./app/js/DeleteQueueManager')
const Errors = require('./app/js/Errors')
const HttpController = require('./app/js/HttpController')
const mongodb = require('./app/js/mongodb')
const async = require('async')
const bodyParser = require('body-parser')
Metrics.event_loop.monitor(logger, 100)
Metrics.open_sockets.monitor()
const app = express()
app.use(bodyParser.json({ limit: Settings.maxJsonRequestSize }))
Metrics.injectMetricsRoute(app)
DispatchManager.createAndStartDispatchers(Settings.dispatcherCount)
app.get('/status', (req, res) => {
if (Settings.shuttingDown) {
return res.sendStatus(503) // Service unavailable
} else {
return res.send('document updater is alive')
}
})
const pubsubClient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.pubsub
)
app.get('/health_check/redis', (req, res, next) => {
pubsubClient.healthCheck(error => {
if (error) {
logger.err({ err: error }, 'failed redis health check')
return res.sendStatus(500)
} else {
return res.sendStatus(200)
}
})
})
const docUpdaterRedisClient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
app.get('/health_check/redis_cluster', (req, res, next) => {
docUpdaterRedisClient.healthCheck(error => {
if (error) {
logger.err({ err: error }, 'failed redis cluster health check')
return res.sendStatus(500)
} else {
return res.sendStatus(200)
}
})
})
app.get('/health_check', (req, res, next) => {
async.series(
[
cb => {
pubsubClient.healthCheck(error => {
if (error) {
logger.err({ err: error }, 'failed redis health check')
}
cb(error)
})
},
cb => {
docUpdaterRedisClient.healthCheck(error => {
if (error) {
logger.err({ err: error }, 'failed redis cluster health check')
}
cb(error)
})
},
cb => {
mongodb.healthCheck(error => {
if (error) {
logger.err({ err: error }, 'failed mongo health check')
}
cb(error)
})
},
],
error => {
if (error) {
return res.sendStatus(500)
} else {
return res.sendStatus(200)
}
}
)
})
// record http metrics for the routes below this point
app.use(Metrics.http.monitor(logger))
app.param('project_id', (req, res, next, projectId) => {
if (projectId != null && projectId.match(/^[0-9a-f]{24}$/)) {
return next()
} else {
return next(new Error('invalid project id'))
}
})
app.param('doc_id', (req, res, next, docId) => {
if (docId != null && docId.match(/^[0-9a-f]{24}$/)) {
return next()
} else {
return next(new Error('invalid doc id'))
}
})
// Record requests that come in after we've started shutting down - for investigation.
app.use((req, res, next) => {
if (Settings.shuttingDown) {
logger.warn(
{ req, timeSinceShutdown: Date.now() - Settings.shutDownTime },
'request received after shutting down'
)
// We don't want keep-alive connections to be kept open when the server is shutting down.
res.set('Connection', 'close')
}
next()
})
app.get('/project/:project_id/doc/:doc_id', HttpController.getDoc)
app.get(
'/project/:project_id/doc/:doc_id/comment/:comment_id',
HttpController.getComment
)
app.get('/project/:project_id/doc/:doc_id/peek', HttpController.peekDoc)
// temporarily keep the GET method for backwards compatibility
app.get('/project/:project_id/doc', HttpController.getProjectDocsAndFlushIfOld)
// will migrate to the POST method of get_and_flush_if_old instead
app.post(
'/project/:project_id/get_and_flush_if_old',
HttpController.getProjectDocsAndFlushIfOld
)
app.get(
'/project/:project_id/last_updated_at',
HttpController.getProjectLastUpdatedAt
)
app.post('/project/:project_id/clearState', HttpController.clearProjectState)
app.post('/project/:project_id/doc/:doc_id', HttpController.setDoc)
app.post('/project/:project_id/doc/:doc_id/append', HttpController.appendToDoc)
app.post(
'/project/:project_id/doc/:doc_id/flush',
HttpController.flushDocIfLoaded
)
app.delete('/project/:project_id/doc/:doc_id', HttpController.deleteDoc)
app.delete('/project/:project_id', HttpController.deleteProject)
app.delete('/project', HttpController.deleteMultipleProjects)
app.post('/project/:project_id', HttpController.updateProject)
app.post(
'/project/:project_id/history/resync',
longerTimeout,
HttpController.resyncProjectHistory
)
app.post('/project/:project_id/flush', HttpController.flushProject)
app.post(
'/project/:project_id/doc/:doc_id/change/:change_id/accept',
HttpController.acceptChanges
)
app.post(
'/project/:project_id/doc/:doc_id/change/accept',
HttpController.acceptChanges
)
app.post(
'/project/:project_id/doc/:doc_id/comment/:comment_id/resolve',
HttpController.resolveComment
)
app.post(
'/project/:project_id/doc/:doc_id/comment/:comment_id/reopen',
HttpController.reopenComment
)
app.delete(
'/project/:project_id/doc/:doc_id/comment/:comment_id',
HttpController.deleteComment
)
app.post('/project/:project_id/block', HttpController.blockProject)
app.post('/project/:project_id/unblock', HttpController.unblockProject)
app.get('/flush_queued_projects', HttpController.flushQueuedProjects)
app.get('/total', (req, res, next) => {
const timer = new Metrics.Timer('http.allDocList')
RedisManager.getCountOfDocsInMemory((err, count) => {
if (err) {
return next(err)
}
timer.done()
res.send({ total: count })
})
})
app.use((error, req, res, next) => {
if (error instanceof Errors.NotFoundError) {
return res.sendStatus(404)
} else if (error instanceof Errors.OpRangeNotAvailableError) {
return res.status(422).json(error.info)
} else if (error instanceof Errors.FileTooLargeError) {
return res.sendStatus(413)
} else if (error.statusCode === 413) {
return res.status(413).send('request entity too large')
} else {
logger.error({ err: error, req }, 'request errored')
return res.status(500).send('Oops, something went wrong')
}
})
const shutdownCleanly = signal => () => {
logger.info({ signal }, 'received interrupt, cleaning up')
if (Settings.shuttingDown) {
logger.warn({ signal }, 'already shutting down, ignoring interrupt')
return
}
Settings.shuttingDown = true
// record the time we started shutting down
Settings.shutDownTime = Date.now()
setTimeout(() => {
logger.info({ signal }, 'shutting down')
process.exit()
}, Settings.gracefulShutdownDelayInMs)
}
const watchForEvent = eventName => {
docUpdaterRedisClient.on(eventName, e => {
console.log(`redis event: ${eventName} ${e}`) // eslint-disable-line no-console
})
}
const events = ['connect', 'ready', 'error', 'close', 'reconnecting', 'end']
for (const eventName of events) {
watchForEvent(eventName)
}
const port =
Settings.internal.documentupdater.port ||
(Settings.api &&
Settings.api.documentupdater &&
Settings.api.documentupdater.port) ||
3003
const host = Settings.internal.documentupdater.host || '127.0.0.1'
if (!module.parent) {
// Called directly
mongodb.mongoClient
.connect()
.then(() => {
app.listen(port, host, function (err) {
if (err) {
logger.fatal({ err }, `Cannot bind to ${host}:${port}. Exiting.`)
process.exit(1)
}
logger.info(
`Document-updater starting up, listening on ${host}:${port}`
)
if (Settings.continuousBackgroundFlush) {
logger.info('Starting continuous background flush')
DeleteQueueManager.startBackgroundFlush()
}
})
})
.catch(err => {
logger.fatal({ err }, 'Cannot connect to mongo. Exiting.')
process.exit(1)
})
}
module.exports = app
for (const signal of [
'SIGINT',
'SIGHUP',
'SIGQUIT',
'SIGUSR1',
'SIGUSR2',
'SIGTERM',
'SIGABRT',
]) {
process.on(signal, shutdownCleanly(signal))
}
function longerTimeout(req, res, next) {
res.setTimeout(6 * 60 * 1000)
next()
}
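
A smoke-test sketch (not part of the commit) against a locally running instance; the default bind of 127.0.0.1:3003 comes from the port and host fallbacks above.

// smoke-test.js
const http = require('http')
http.get('http://127.0.0.1:3003/status', res => {
  // expect 200 "document updater is alive", or 503 while shutting down
  console.log('GET /status ->', res.statusCode)
})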

services/document-updater/app/js/DeleteQueueManager.js Normal file

@@ -0,0 +1,145 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let DeleteQueueManager
const Settings = require('@overleaf/settings')
const RedisManager = require('./RedisManager')
const ProjectManager = require('./ProjectManager')
const logger = require('@overleaf/logger')
const metrics = require('./Metrics')
// Maintain a sorted set of project flushAndDelete requests, ordered by timestamp
// (ZADD), and process them from oldest to newest. A flushAndDelete request comes
// from real-time and is triggered when a user leaves a project.
//
// The aim is to remove the project from redis 5 minutes after the last request
// if there has been no activity (document updates) in that time. If there is
// activity we can expect a further flushAndDelete request when the editing user
// leaves the project.
//
// If a new flushAndDelete request comes in while an existing request is already
// in the queue we update the timestamp as we can postpone flushing further.
//
// Documents are processed by checking the queue, seeing if the first entry is
// older than 5 minutes, and popping it from the queue in that case.
module.exports = DeleteQueueManager = {
flushAndDeleteOldProjects(options, callback) {
const startTime = Date.now()
const cutoffTime =
startTime - options.min_delete_age + 100 * (Math.random() - 0.5)
let count = 0
const flushProjectIfNotModified = (projectId, flushTimestamp, cb) =>
ProjectManager.getProjectDocsTimestamps(
projectId,
function (err, timestamps) {
if (err != null) {
return callback(err)
}
if (timestamps.length === 0) {
logger.debug(
{ projectId },
'skipping flush of queued project - no timestamps'
)
return cb()
}
// are any of the timestamps newer than the time the project was flushed?
for (const timestamp of Array.from(timestamps)) {
if (timestamp > flushTimestamp) {
metrics.inc('queued-delete-skipped')
logger.debug(
{ projectId, timestamps, flushTimestamp },
'found newer timestamp, will skip delete'
)
return cb()
}
}
logger.debug({ projectId, flushTimestamp }, 'flushing queued project')
return ProjectManager.flushAndDeleteProjectWithLocks(
projectId,
{ skip_history_flush: false },
function (err) {
if (err != null) {
logger.err({ projectId, err }, 'error flushing queued project')
}
metrics.inc('queued-delete-completed')
return cb(null, true)
}
)
}
)
function flushNextProject() {
const now = Date.now()
if (now - startTime > options.timeout) {
logger.debug('hit time limit on flushing old projects')
return callback(null, count)
}
if (count > options.limit) {
logger.debug('hit count limit on flushing old projects')
return callback(null, count)
}
return RedisManager.getNextProjectToFlushAndDelete(
cutoffTime,
function (err, projectId, flushTimestamp, queueLength) {
if (err != null) {
return callback(err, count)
}
if (projectId == null) {
return callback(null, count)
}
logger.debug({ projectId, queueLength }, 'flushing queued project')
metrics.globalGauge('queued-flush-backlog', queueLength)
return flushProjectIfNotModified(
projectId,
flushTimestamp,
function (err, flushed) {
if (err) {
// Do not stop processing the queue in case the flush fails.
// Slowing down the processing can fill up redis.
metrics.inc('queued-delete-error')
}
if (flushed) {
count++
}
return flushNextProject()
}
)
}
)
}
return flushNextProject()
},
startBackgroundFlush() {
const SHORT_DELAY = 10
const LONG_DELAY = 1000
function doFlush() {
if (Settings.shuttingDown) {
logger.info('discontinuing background flush due to shutdown')
return
}
return DeleteQueueManager.flushAndDeleteOldProjects(
{
timeout: 1000,
min_delete_age: 3 * 60 * 1000,
limit: 1000, // high value, to ensure we always flush enough projects
},
(_err, flushed) =>
setTimeout(doFlush, flushed > 10 ? SHORT_DELAY : LONG_DELAY)
)
}
return doFlush()
},
}
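
A minimal sketch (not part of the commit) of the sorted-set protocol described in the header comment of this file, assuming an ioredis-style client and a hypothetical key name; the real commands live behind RedisManager.

const Settings = require('@overleaf/settings')
const rclient = require('@overleaf/redis-wrapper').createClient(
  Settings.redis.documentupdater
)
const QUEUE_KEY = 'flush-and-delete-queue' // hypothetical key name

// Real-time enqueues (or postpones) a flush: ZADD keeps one entry per
// project and bumps the score (timestamp) when the project is re-added.
async function queueFlushAndDelete(projectId) {
  await rclient.zadd(QUEUE_KEY, Date.now(), projectId)
}

// The background flusher pops the oldest entry, but only if it is older
// than the cutoff time.
async function getNextProjectToFlushAndDelete(cutoffTime) {
  const [projectId] = await rclient.zrangebyscore(
    QUEUE_KEY,
    0,
    cutoffTime,
    'LIMIT',
    0,
    1
  )
  if (projectId != null) {
    await rclient.zrem(QUEUE_KEY, projectId)
  }
  return projectId
}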

services/document-updater/app/js/DiffCodec.js Normal file

@@ -0,0 +1,40 @@
const DMP = require('diff-match-patch')
const dmp = new DMP()
// Do not attempt to produce a diff for more than 100ms
dmp.Diff_Timeout = 0.1
module.exports = {
ADDED: 1,
REMOVED: -1,
UNCHANGED: 0,
diffAsShareJsOp(before, after) {
const diffs = dmp.diff_main(before.join('\n'), after.join('\n'))
dmp.diff_cleanupSemantic(diffs)
const ops = []
let position = 0
for (const diff of diffs) {
const type = diff[0]
const content = diff[1]
if (type === this.ADDED) {
ops.push({
i: content,
p: position,
})
position += content.length
} else if (type === this.REMOVED) {
ops.push({
d: content,
p: position,
})
} else if (type === this.UNCHANGED) {
position += content.length
} else {
throw new Error('Unknown type')
}
}
return ops
},
}
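
For example (a sketch, not part of the commit), inserting a word yields a single ShareJS insert op; `p` counts characters across the lines joined with newlines, and delete ops keep `p` at the offset where the removed text began.

const DiffCodec = require('./app/js/DiffCodec')
const ops = DiffCodec.diffAsShareJsOp(['hello world'], ['hello brave world'])
console.log(ops) // => [ { i: 'brave ', p: 6 } ]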

services/document-updater/app/js/DispatchManager.js Normal file

@@ -0,0 +1,112 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS202: Simplify dynamic range loops
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let DispatchManager
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const Keys = require('./UpdateKeys')
const redis = require('@overleaf/redis-wrapper')
const Errors = require('./Errors')
const _ = require('lodash')
const UpdateManager = require('./UpdateManager')
const Metrics = require('./Metrics')
const RateLimitManager = require('./RateLimitManager')
module.exports = DispatchManager = {
createDispatcher(RateLimiter, queueShardNumber) {
let pendingListKey
if (queueShardNumber === 0) {
pendingListKey = 'pending-updates-list'
} else {
pendingListKey = `pending-updates-list-${queueShardNumber}`
}
const client = redis.createClient(Settings.redis.documentupdater)
const worker = {
client,
_waitForUpdateThenDispatchWorker(callback) {
if (callback == null) {
callback = function () {}
}
const timer = new Metrics.Timer('worker.waiting')
return worker.client.blpop(pendingListKey, 0, function (error, result) {
logger.debug(`getting ${queueShardNumber}`, error, result)
timer.done()
if (error != null) {
return callback(error)
}
if (result == null) {
return callback()
}
const [listName, docKey] = Array.from(result)
const [projectId, docId] = Array.from(
Keys.splitProjectIdAndDocId(docKey)
)
// Dispatch this in the background
const backgroundTask = cb =>
UpdateManager.processOutstandingUpdatesWithLock(
projectId,
docId,
function (error) {
// log everything except OpRangeNotAvailable errors, these are normal
if (error != null) {
// downgrade OpRangeNotAvailable and "Delete component" errors so they are not sent to sentry
const logAsDebug =
error instanceof Errors.OpRangeNotAvailableError ||
error instanceof Errors.DeleteMismatchError
if (logAsDebug) {
logger.debug(
{ err: error, projectId, docId },
'error processing update'
)
} else {
logger.error(
{ err: error, projectId, docId },
'error processing update'
)
}
}
return cb()
}
)
return RateLimiter.run(backgroundTask, callback)
})
},
run() {
if (Settings.shuttingDown) {
return
}
return worker._waitForUpdateThenDispatchWorker(error => {
if (error != null) {
logger.error({ err: error }, 'Error in worker process')
throw error
} else {
return worker.run()
}
})
},
}
return worker
},
createAndStartDispatchers(number) {
const RateLimiter = new RateLimitManager(number)
_.times(number, function (shardNumber) {
return DispatchManager.createDispatcher(RateLimiter, shardNumber).run()
})
},
}
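// Startup sketch (shard count illustrative): one RateLimitManager instance is
// shared by all dispatchers, and each worker blocks (BLPOP) on its own shard
// of the pending-updates list.
//
//   DispatchManager.createAndStartDispatchers(10)
//   // worker 0 blocks on "pending-updates-list"
//   // worker 1 blocks on "pending-updates-list-1"
//   // ...
//   // worker 9 blocks on "pending-updates-list-9"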

View File

@@ -0,0 +1,708 @@
const { callbackifyAll } = require('@overleaf/promise-utils')
const RedisManager = require('./RedisManager')
const ProjectHistoryRedisManager = require('./ProjectHistoryRedisManager')
const PersistenceManager = require('./PersistenceManager')
const DiffCodec = require('./DiffCodec')
const logger = require('@overleaf/logger')
const Metrics = require('./Metrics')
const HistoryManager = require('./HistoryManager')
const Errors = require('./Errors')
const RangesManager = require('./RangesManager')
const { extractOriginOrSource } = require('./Utils')
const { getTotalSizeOfLines } = require('./Limits')
const Settings = require('@overleaf/settings')
const MAX_UNFLUSHED_AGE = 300 * 1000 // 5 mins, document should be flushed to mongo this time after a change
const DocumentManager = {
async getDoc(projectId, docId) {
const {
lines,
version,
ranges,
resolvedCommentIds,
pathname,
projectHistoryId,
unflushedTime,
historyRangesSupport,
} = await RedisManager.promises.getDoc(projectId, docId)
if (lines == null || version == null) {
logger.debug(
{ projectId, docId },
'doc not in redis so getting from persistence API'
)
const {
lines,
version,
ranges,
resolvedCommentIds,
pathname,
projectHistoryId,
historyRangesSupport,
} = await PersistenceManager.promises.getDoc(projectId, docId)
logger.debug(
{
projectId,
docId,
lines,
ranges,
resolvedCommentIds,
version,
pathname,
projectHistoryId,
historyRangesSupport,
},
'got doc from persistence API'
)
await RedisManager.promises.putDocInMemory(
projectId,
docId,
lines,
version,
ranges,
resolvedCommentIds,
pathname,
projectHistoryId,
historyRangesSupport
)
return {
lines,
version,
ranges: ranges || {},
resolvedCommentIds,
pathname,
projectHistoryId,
unflushedTime: null,
alreadyLoaded: false,
historyRangesSupport,
}
} else {
return {
lines,
version,
ranges,
pathname,
projectHistoryId,
resolvedCommentIds,
unflushedTime,
alreadyLoaded: true,
historyRangesSupport,
}
}
},
async getDocAndRecentOps(projectId, docId, fromVersion) {
const { lines, version, ranges, pathname, projectHistoryId } =
await DocumentManager.getDoc(projectId, docId)
if (fromVersion === -1) {
return { lines, version, ops: [], ranges, pathname, projectHistoryId }
} else {
const ops = await RedisManager.promises.getPreviousDocOps(
docId,
fromVersion,
version
)
return {
lines,
version,
ops,
ranges,
pathname,
projectHistoryId,
}
}
},
async appendToDoc(projectId, docId, linesToAppend, originOrSource, userId) {
const { lines: currentLines } = await DocumentManager.getDoc(
projectId,
docId
)
const currentLineSize = getTotalSizeOfLines(currentLines)
const addedSize = getTotalSizeOfLines(linesToAppend)
const newlineSize = '\n'.length
if (currentLineSize + newlineSize + addedSize > Settings.max_doc_length) {
throw new Errors.FileTooLargeError(
'doc would become too large if appending this text'
)
}
return await DocumentManager.setDoc(
projectId,
docId,
currentLines.concat(linesToAppend),
originOrSource,
userId,
false,
false
)
},
async setDoc(
projectId,
docId,
newLines,
originOrSource,
userId,
undoing,
external
) {
if (newLines == null) {
throw new Error('No lines were provided to setDoc')
}
const UpdateManager = require('./UpdateManager')
const {
lines: oldLines,
version,
alreadyLoaded,
} = await DocumentManager.getDoc(projectId, docId)
if (oldLines != null && oldLines.length > 0 && oldLines[0].text != null) {
logger.debug(
{ docId, projectId, oldLines, newLines },
'document is JSON so not updating'
)
return
}
logger.debug(
{ docId, projectId, oldLines, newLines },
'setting a document via http'
)
const op = DiffCodec.diffAsShareJsOp(oldLines, newLines)
if (undoing) {
      // turn on the undo flag for each op, for track changes
      for (const o of op || []) {
        o.u = true
      }
}
const { origin, source } = extractOriginOrSource(originOrSource)
const update = {
doc: docId,
op,
v: version,
meta: {
user_id: userId,
},
}
if (external) {
update.meta.type = 'external'
}
if (origin) {
update.meta.origin = origin
} else if (source) {
update.meta.source = source
}
// Keep track of external updates, whether they are for live documents
// (flush) or unloaded documents (evict), and whether the update is a no-op.
Metrics.inc('external-update', 1, {
status: op.length > 0 ? 'diff' : 'noop',
method: alreadyLoaded ? 'flush' : 'evict',
path: source,
})
// Do not notify the frontend about a noop update.
// We still want to execute the code below
// to evict the doc if we loaded it into redis for
// this update, otherwise the doc would never be
// removed from redis.
if (op.length > 0) {
await UpdateManager.promises.applyUpdate(projectId, docId, update)
}
// If the document was loaded already, then someone has it open
// in a project, and the usual flushing mechanism will happen.
// Otherwise we should remove it immediately since nothing else
// is using it.
if (alreadyLoaded) {
return await DocumentManager.flushDocIfLoaded(projectId, docId)
} else {
try {
return await DocumentManager.flushAndDeleteDoc(projectId, docId, {})
} finally {
// There is no harm in flushing project history if the previous
// call failed and sometimes it is required
HistoryManager.flushProjectChangesAsync(projectId)
}
}
},
async flushDocIfLoaded(projectId, docId) {
const {
lines,
version,
ranges,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
} = await RedisManager.promises.getDoc(projectId, docId)
if (lines == null || version == null) {
Metrics.inc('flush-doc-if-loaded', 1, { status: 'not-loaded' })
logger.debug({ projectId, docId }, 'doc is not loaded so not flushing')
// TODO: return a flag to bail out, as we go on to remove doc from memory?
return
} else if (unflushedTime == null) {
Metrics.inc('flush-doc-if-loaded', 1, { status: 'unmodified' })
logger.debug({ projectId, docId }, 'doc is not modified so not flushing')
return
}
logger.debug({ projectId, docId, version }, 'flushing doc')
Metrics.inc('flush-doc-if-loaded', 1, { status: 'modified' })
const result = await PersistenceManager.promises.setDoc(
projectId,
docId,
lines,
version,
ranges,
lastUpdatedAt,
lastUpdatedBy || null
)
await RedisManager.promises.clearUnflushedTime(docId)
return result
},
async flushAndDeleteDoc(projectId, docId, options) {
let result
try {
result = await DocumentManager.flushDocIfLoaded(projectId, docId)
} catch (error) {
if (options.ignoreFlushErrors) {
logger.warn(
{ projectId, docId, err: error },
'ignoring flush error while deleting document'
)
} else {
throw error
}
}
await RedisManager.promises.removeDocFromMemory(projectId, docId)
return result
},
async acceptChanges(projectId, docId, changeIds) {
if (changeIds == null) {
changeIds = []
}
const {
lines,
version,
ranges,
pathname,
projectHistoryId,
historyRangesSupport,
} = await DocumentManager.getDoc(projectId, docId)
if (lines == null || version == null) {
throw new Errors.NotFoundError(`document not found: ${docId}`)
}
const newRanges = RangesManager.acceptChanges(
projectId,
docId,
changeIds,
ranges,
lines
)
await RedisManager.promises.updateDocument(
projectId,
docId,
lines,
version,
[],
newRanges,
{}
)
if (historyRangesSupport) {
const historyUpdates = RangesManager.getHistoryUpdatesForAcceptedChanges({
docId,
acceptedChangeIds: changeIds,
changes: ranges.changes || [],
lines,
pathname,
projectHistoryId,
})
if (historyUpdates.length === 0) {
return
}
await ProjectHistoryRedisManager.promises.queueOps(
projectId,
...historyUpdates.map(op => JSON.stringify(op))
)
}
},
async updateCommentState(projectId, docId, commentId, userId, resolved) {
const { lines, version, pathname, historyRangesSupport } =
await DocumentManager.getDoc(projectId, docId)
if (lines == null || version == null) {
throw new Errors.NotFoundError(`document not found: ${docId}`)
}
if (historyRangesSupport) {
await RedisManager.promises.updateCommentState(docId, commentId, resolved)
await ProjectHistoryRedisManager.promises.queueOps(
projectId,
JSON.stringify({
pathname,
commentId,
resolved,
meta: {
ts: new Date(),
user_id: userId,
},
})
)
}
},
async getComment(projectId, docId, commentId) {
const { ranges } = await DocumentManager.getDoc(projectId, docId)
const comment = ranges?.comments?.find(comment => comment.id === commentId)
if (!comment) {
throw new Errors.NotFoundError({
message: 'comment not found',
info: { commentId },
})
}
return { comment }
},
async deleteComment(projectId, docId, commentId, userId) {
const { lines, version, ranges, pathname, historyRangesSupport } =
await DocumentManager.getDoc(projectId, docId)
if (lines == null || version == null) {
throw new Errors.NotFoundError(`document not found: ${docId}`)
}
const newRanges = RangesManager.deleteComment(commentId, ranges)
await RedisManager.promises.updateDocument(
projectId,
docId,
lines,
version,
[],
newRanges,
{}
)
if (historyRangesSupport) {
await RedisManager.promises.updateCommentState(docId, commentId, false)
await ProjectHistoryRedisManager.promises.queueOps(
projectId,
JSON.stringify({
pathname,
deleteComment: commentId,
meta: {
ts: new Date(),
user_id: userId,
},
})
)
}
},
async renameDoc(projectId, docId, userId, update, projectHistoryId) {
await RedisManager.promises.renameDoc(
projectId,
docId,
userId,
update,
projectHistoryId
)
},
async getDocAndFlushIfOld(projectId, docId) {
const { lines, version, unflushedTime, alreadyLoaded } =
await DocumentManager.getDoc(projectId, docId)
// if doc was already loaded see if it needs to be flushed
if (
alreadyLoaded &&
unflushedTime != null &&
Date.now() - unflushedTime > MAX_UNFLUSHED_AGE
) {
await DocumentManager.flushDocIfLoaded(projectId, docId)
}
return { lines, version }
},
async resyncDocContents(projectId, docId, path, opts = {}) {
logger.debug({ projectId, docId, path }, 'start resyncing doc contents')
let {
lines,
ranges,
resolvedCommentIds,
version,
projectHistoryId,
historyRangesSupport,
} = await RedisManager.promises.getDoc(projectId, docId)
// To avoid issues where the same docId appears with different paths,
// we use the path from the resyncProjectStructure update. If we used
    // the path from the getDoc call to web then the two occurrences of the
// docId would map to the same path, and this would be rejected by
// project-history as an unexpected resyncDocContent update.
if (lines == null || version == null) {
logger.debug(
{ projectId, docId },
'resyncing doc contents - not found in redis - retrieving from web'
)
;({
lines,
ranges,
resolvedCommentIds,
version,
projectHistoryId,
historyRangesSupport,
} = await PersistenceManager.promises.getDoc(projectId, docId, {
peek: true,
}))
} else {
logger.debug(
{ projectId, docId },
'resyncing doc contents - doc in redis - will queue in redis'
)
}
if (opts.historyRangesMigration) {
historyRangesSupport = opts.historyRangesMigration === 'forwards'
}
await ProjectHistoryRedisManager.promises.queueResyncDocContent(
projectId,
projectHistoryId,
docId,
lines,
ranges ?? {},
resolvedCommentIds,
version,
// use the path from the resyncProjectStructure update
path,
historyRangesSupport
)
if (opts.historyRangesMigration) {
await RedisManager.promises.setHistoryRangesSupportFlag(
docId,
historyRangesSupport
)
}
},
async getDocWithLock(projectId, docId) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.getDoc,
projectId,
docId
)
},
async getCommentWithLock(projectId, docId, commentId) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.getComment,
projectId,
docId,
commentId
)
},
async getDocAndRecentOpsWithLock(projectId, docId, fromVersion) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.getDocAndRecentOps,
projectId,
docId,
fromVersion
)
},
async getDocAndFlushIfOldWithLock(projectId, docId) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.getDocAndFlushIfOld,
projectId,
docId
)
},
async setDocWithLock(
projectId,
docId,
lines,
source,
userId,
undoing,
external
) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.setDoc,
projectId,
docId,
lines,
source,
userId,
undoing,
external
)
},
async appendToDocWithLock(projectId, docId, lines, source, userId) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.appendToDoc,
projectId,
docId,
lines,
source,
userId
)
},
async flushDocIfLoadedWithLock(projectId, docId) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.flushDocIfLoaded,
projectId,
docId
)
},
async flushAndDeleteDocWithLock(projectId, docId, options) {
const UpdateManager = require('./UpdateManager')
return await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.flushAndDeleteDoc,
projectId,
docId,
options
)
},
async acceptChangesWithLock(projectId, docId, changeIds) {
const UpdateManager = require('./UpdateManager')
await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.acceptChanges,
projectId,
docId,
changeIds
)
},
async updateCommentStateWithLock(
projectId,
docId,
threadId,
userId,
resolved
) {
const UpdateManager = require('./UpdateManager')
await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.updateCommentState,
projectId,
docId,
threadId,
userId,
resolved
)
},
async deleteCommentWithLock(projectId, docId, threadId, userId) {
const UpdateManager = require('./UpdateManager')
await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.deleteComment,
projectId,
docId,
threadId,
userId
)
},
async renameDocWithLock(projectId, docId, userId, update, projectHistoryId) {
const UpdateManager = require('./UpdateManager')
await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.renameDoc,
projectId,
docId,
userId,
update,
projectHistoryId
)
},
async resyncDocContentsWithLock(projectId, docId, path, opts) {
const UpdateManager = require('./UpdateManager')
await UpdateManager.promises.lockUpdatesAndDo(
DocumentManager.resyncDocContents,
projectId,
docId,
path,
opts
)
},
}
module.exports = {
...callbackifyAll(DocumentManager, {
multiResult: {
getDoc: [
'lines',
'version',
'ranges',
'pathname',
'projectHistoryId',
'unflushedTime',
'alreadyLoaded',
'historyRangesSupport',
],
getDocWithLock: [
'lines',
'version',
'ranges',
'pathname',
'projectHistoryId',
'unflushedTime',
'alreadyLoaded',
'historyRangesSupport',
],
getDocAndFlushIfOld: ['lines', 'version'],
getDocAndFlushIfOldWithLock: ['lines', 'version'],
getDocAndRecentOps: [
'lines',
'version',
'ops',
'ranges',
'pathname',
'projectHistoryId',
],
getDocAndRecentOpsWithLock: [
'lines',
'version',
'ops',
'ranges',
'pathname',
'projectHistoryId',
],
getCommentWithLock: ['comment'],
},
}),
promises: DocumentManager,
}
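// Calling convention sketch (projectId and docId are placeholders): the
// callbackified API spreads the multiResult names above as positional
// arguments, while the promise API resolves to a single object (the await
// form assumes an enclosing async function).
//
//   const DocumentManager = require('./DocumentManager')
//   DocumentManager.getDocWithLock(projectId, docId, (err, lines, version) => {
//     // further positional results: ranges, pathname, projectHistoryId,
//     // unflushedTime, alreadyLoaded, historyRangesSupport
//   })
//   const { lines, version } = await DocumentManager.promises.getDocWithLock(
//     projectId,
//     docId
//   )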

View File

@@ -0,0 +1,15 @@
const OError = require('@overleaf/o-error')
class NotFoundError extends OError {}
class OpRangeNotAvailableError extends OError {}
class ProjectStateChangedError extends OError {}
class DeleteMismatchError extends OError {}
class FileTooLargeError extends OError {}
module.exports = {
NotFoundError,
OpRangeNotAvailableError,
ProjectStateChangedError,
DeleteMismatchError,
FileTooLargeError,
}

View File

@@ -0,0 +1,179 @@
// @ts-check
const _ = require('lodash')
const { isDelete } = require('./Utils')
/**
* @import { Comment, HistoryComment, HistoryRanges, HistoryTrackedChange } from './types'
* @import { Ranges, TrackedChange } from './types'
*/
/**
* Convert editor ranges to history ranges
*
* @param {Ranges} ranges
* @return {HistoryRanges}
*/
function toHistoryRanges(ranges) {
const changes = ranges.changes ?? []
const comments = (ranges.comments ?? []).slice()
// Changes are assumed to be sorted, but not comments
comments.sort((a, b) => a.op.p - b.op.p)
/**
* This will allow us to go through comments at a different pace as we loop
* through tracked changes
*/
const commentsIterator = new CommentsIterator(comments)
/**
* Current offset between editor pos and history pos
*/
let offset = 0
/**
* History comments that might overlap with the tracked change considered
*
* @type {HistoryComment[]}
*/
let pendingComments = []
/**
* The final history comments generated
*
* @type {HistoryComment[]}
*/
const historyComments = []
/**
* The final history tracked changes generated
*
* @type {HistoryTrackedChange[]}
*/
const historyChanges = []
for (const change of changes) {
historyChanges.push(toHistoryChange(change, offset))
// After this point, we're only interested in tracked deletes
if (!isDelete(change.op)) {
continue
}
// Fill pendingComments with new comments that start before this tracked
// delete and might overlap
for (const comment of commentsIterator.nextComments(change.op.p)) {
pendingComments.push(toHistoryComment(comment, offset))
}
// Save comments that are fully before this tracked delete
const newPendingComments = []
for (const historyComment of pendingComments) {
const commentEnd = historyComment.op.p + historyComment.op.c.length
if (commentEnd <= change.op.p) {
historyComments.push(historyComment)
} else {
newPendingComments.push(historyComment)
}
}
pendingComments = newPendingComments
// The rest of pending comments overlap with this tracked change. Adjust
// their history length.
for (const historyComment of pendingComments) {
historyComment.op.hlen =
(historyComment.op.hlen ?? historyComment.op.c.length) +
change.op.d.length
}
// Adjust the offset
offset += change.op.d.length
}
// Save the last pending comments
for (const historyComment of pendingComments) {
historyComments.push(historyComment)
}
// Save any comments that came after the last tracked change
for (const comment of commentsIterator.nextComments()) {
historyComments.push(toHistoryComment(comment, offset))
}
const historyRanges = {}
if (historyComments.length > 0) {
historyRanges.comments = historyComments
}
if (historyChanges.length > 0) {
historyRanges.changes = historyChanges
}
return historyRanges
}
class CommentsIterator {
/**
* Build a CommentsIterator
*
* @param {Comment[]} comments
*/
constructor(comments) {
this.comments = comments
this.currentIndex = 0
}
/**
* Generator that returns the next comments to consider
*
* @param {number} beforePos - only return comments that start before this position
* @return {Iterable<Comment>}
*/
*nextComments(beforePos = Infinity) {
while (this.currentIndex < this.comments.length) {
const comment = this.comments[this.currentIndex]
if (comment.op.p < beforePos) {
yield comment
this.currentIndex += 1
} else {
return
}
}
}
}
/**
* Convert an editor tracked change into a history tracked change
*
* @param {TrackedChange} change
* @param {number} offset - how much the history change is ahead of the
* editor change
* @return {HistoryTrackedChange}
*/
function toHistoryChange(change, offset) {
/** @type {HistoryTrackedChange} */
const historyChange = _.cloneDeep(change)
if (offset > 0) {
historyChange.op.hpos = change.op.p + offset
}
return historyChange
}
/**
* Convert an editor comment into a history comment
*
* @param {Comment} comment
* @param {number} offset - how much the history comment is ahead of the
* editor comment
* @return {HistoryComment}
*/
function toHistoryComment(comment, offset) {
/** @type {HistoryComment} */
const historyComment = _.cloneDeep(comment)
if (offset > 0) {
historyComment.op.hpos = comment.op.p + offset
}
return historyComment
}
module.exports = {
toHistoryRanges,
}
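// Worked example (a sketch, object shapes abbreviated): a tracked delete of
// "XYZ" at editor position 5 makes the history text 3 characters longer than
// the editor text, so a comment further along gains hpos = p + 3.
//
//   toHistoryRanges({
//     changes: [{ id: 'tc1', op: { p: 5, d: 'XYZ' } }],
//     comments: [{ id: 'cm1', op: { p: 10, c: 'world', t: 'cm1' } }],
//   })
//   // => {
//   //   comments: [{ id: 'cm1', op: { p: 10, c: 'world', t: 'cm1', hpos: 13 } }],
//   //   changes: [{ id: 'tc1', op: { p: 5, d: 'XYZ' } }],
//   // }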

View File

@@ -0,0 +1,143 @@
const async = require('async')
const logger = require('@overleaf/logger')
const { promisifyAll } = require('@overleaf/promise-utils')
const request = require('request')
const Settings = require('@overleaf/settings')
const ProjectHistoryRedisManager = require('./ProjectHistoryRedisManager')
const metrics = require('./Metrics')
const HistoryManager = {
// flush changes in the background
flushProjectChangesAsync(projectId) {
HistoryManager.flushProjectChanges(
projectId,
{ background: true },
function () {}
)
},
// flush changes and callback (for when we need to know the queue is flushed)
flushProjectChanges(projectId, options, callback) {
if (callback == null) {
callback = function () {}
}
if (options.skip_history_flush) {
logger.debug({ projectId }, 'skipping flush of project history')
return callback()
}
metrics.inc('history-flush', 1, { status: 'project-history' })
const url = `${Settings.apis.project_history.url}/project/${projectId}/flush`
const qs = {}
if (options.background) {
qs.background = true
} // pass on the background flush option if present
logger.debug({ projectId, url, qs }, 'flushing doc in project history api')
request.post({ url, qs }, function (error, res, body) {
if (error) {
logger.error({ error, projectId }, 'project history api request failed')
callback(error)
      } else if (res.statusCode < 200 || res.statusCode >= 300) {
        logger.error(
          { projectId },
          `project history api returned a failure status code: ${res.statusCode}`
        )
        callback(
          new Error(
            `project history api returned a failure status code: ${res.statusCode}`
          )
        )
} else {
callback()
}
})
},
FLUSH_DOC_EVERY_N_OPS: 100,
FLUSH_PROJECT_EVERY_N_OPS: 500,
recordAndFlushHistoryOps(projectId, ops, projectOpsLength) {
if (ops == null) {
ops = []
}
if (ops.length === 0) {
return
}
// record updates for project history
if (
HistoryManager.shouldFlushHistoryOps(
projectOpsLength,
ops.length,
HistoryManager.FLUSH_PROJECT_EVERY_N_OPS
)
) {
// Do this in the background since it uses HTTP and so may be too
// slow to wait for when processing a doc update.
logger.debug(
{ projectOpsLength, projectId },
'flushing project history api'
)
HistoryManager.flushProjectChangesAsync(projectId)
}
},
shouldFlushHistoryOps(length, opsLength, threshold) {
if (!length) {
return false
} // don't flush unless we know the length
// We want to flush every 100 ops, i.e. 100, 200, 300, etc
// Find out which 'block' (i.e. 0-99, 100-199) we were in before and after pushing these
// ops. If we've changed, then we've gone over a multiple of 100 and should flush.
// (Most of the time, we will only hit 100 and then flushing will put us back to 0)
const previousLength = length - opsLength
const prevBlock = Math.floor(previousLength / threshold)
const newBlock = Math.floor(length / threshold)
return newBlock !== prevBlock
},
MAX_PARALLEL_REQUESTS: 4,
resyncProjectHistory(
projectId,
projectHistoryId,
docs,
files,
opts,
callback
) {
ProjectHistoryRedisManager.queueResyncProjectStructure(
projectId,
projectHistoryId,
docs,
files,
opts,
function (error) {
if (error) {
return callback(error)
}
if (opts.resyncProjectStructureOnly) return callback()
const DocumentManager = require('./DocumentManager')
const resyncDoc = (doc, cb) => {
DocumentManager.resyncDocContentsWithLock(
projectId,
doc.doc,
doc.path,
opts,
cb
)
}
async.eachLimit(
docs,
HistoryManager.MAX_PARALLEL_REQUESTS,
resyncDoc,
callback
)
}
)
},
}
module.exports = HistoryManager
module.exports.promises = promisifyAll(HistoryManager, {
without: [
'flushProjectChangesAsync',
'recordAndFlushHistoryOps',
'shouldFlushHistoryOps',
],
})
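// Worked example of the block-boundary check above (values illustrative):
// pushing 3 ops that take a queue from 98 to 101 crosses the 100-op block
// boundary and triggers a flush; 101 -> 104 stays in the same block and does not.
//
//   HistoryManager.shouldFlushHistoryOps(101, 3, 100) // => true  (block 0 -> 1)
//   HistoryManager.shouldFlushHistoryOps(104, 3, 100) // => false (block 1 -> 1)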

View File

@@ -0,0 +1,559 @@
const DocumentManager = require('./DocumentManager')
const HistoryManager = require('./HistoryManager')
const ProjectManager = require('./ProjectManager')
const RedisManager = require('./RedisManager')
const Errors = require('./Errors')
const logger = require('@overleaf/logger')
const Settings = require('@overleaf/settings')
const Metrics = require('./Metrics')
const DeleteQueueManager = require('./DeleteQueueManager')
const { getTotalSizeOfLines } = require('./Limits')
const async = require('async')
function getDoc(req, res, next) {
let fromVersion
const docId = req.params.doc_id
const projectId = req.params.project_id
logger.debug({ projectId, docId }, 'getting doc via http')
const timer = new Metrics.Timer('http.getDoc')
if (req.query.fromVersion != null) {
fromVersion = parseInt(req.query.fromVersion, 10)
} else {
fromVersion = -1
}
DocumentManager.getDocAndRecentOpsWithLock(
projectId,
docId,
fromVersion,
(error, lines, version, ops, ranges, pathname) => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId, docId }, 'got doc via http')
if (lines == null || version == null) {
return next(new Errors.NotFoundError('document not found'))
}
res.json({
id: docId,
lines,
version,
ops,
ranges,
pathname,
ttlInS: RedisManager.DOC_OPS_TTL,
})
}
)
}
function getComment(req, res, next) {
const docId = req.params.doc_id
const projectId = req.params.project_id
const commentId = req.params.comment_id
logger.debug({ projectId, docId, commentId }, 'getting comment via http')
DocumentManager.getCommentWithLock(
projectId,
docId,
commentId,
(error, comment) => {
if (error) {
return next(error)
}
if (comment == null) {
return next(new Errors.NotFoundError('comment not found'))
}
res.json(comment)
}
)
}
// return the doc from redis if present, but don't load it from mongo
function peekDoc(req, res, next) {
const docId = req.params.doc_id
const projectId = req.params.project_id
logger.debug({ projectId, docId }, 'peeking at doc via http')
RedisManager.getDoc(projectId, docId, function (error, lines, version) {
if (error) {
return next(error)
}
if (lines == null || version == null) {
return next(new Errors.NotFoundError('document not found'))
}
res.json({ id: docId, lines, version })
})
}
function getProjectDocsAndFlushIfOld(req, res, next) {
const projectId = req.params.project_id
const projectStateHash = req.query.state
  // exclude is a comma-separated string of existing docs, "id:version,id:version,..."
const excludeItems =
req.query.exclude != null ? req.query.exclude.split(',') : []
logger.debug({ projectId, exclude: excludeItems }, 'getting docs via http')
const timer = new Metrics.Timer('http.getAllDocs')
const excludeVersions = {}
for (const item of excludeItems) {
const [id, version] = item.split(':')
excludeVersions[id] = version
}
logger.debug(
{ projectId, projectStateHash, excludeVersions },
'excluding versions'
)
ProjectManager.getProjectDocsAndFlushIfOld(
projectId,
projectStateHash,
excludeVersions,
(error, result) => {
timer.done()
if (error instanceof Errors.ProjectStateChangedError) {
res.sendStatus(409) // conflict
} else if (error) {
next(error)
} else {
logger.debug(
{
projectId,
result: result.map(doc => `${doc._id}:${doc.v}`),
},
'got docs via http'
)
res.send(result)
}
}
)
}
function getProjectLastUpdatedAt(req, res, next) {
const projectId = req.params.project_id
ProjectManager.getProjectDocsTimestamps(projectId, (err, timestamps) => {
if (err) return next(err)
// Filter out nulls. This can happen when
// - docs get flushed between the listing and getting the individual docs ts
// - a doc flush failed half way (doc keys removed, project tracking not updated)
timestamps = timestamps.filter(ts => !!ts)
timestamps = timestamps.map(ts => parseInt(ts, 10))
timestamps.sort((a, b) => (a > b ? 1 : -1))
res.json({ lastUpdatedAt: timestamps.pop() })
})
}
function clearProjectState(req, res, next) {
const projectId = req.params.project_id
const timer = new Metrics.Timer('http.clearProjectState')
logger.debug({ projectId }, 'clearing project state via http')
ProjectManager.clearProjectState(projectId, error => {
timer.done()
if (error) {
next(error)
} else {
res.sendStatus(200)
}
})
}
function setDoc(req, res, next) {
const docId = req.params.doc_id
const projectId = req.params.project_id
const { lines, source, user_id: userId, undoing } = req.body
const lineSize = getTotalSizeOfLines(lines)
if (lineSize > Settings.max_doc_length) {
logger.warn(
{ projectId, docId, source, lineSize, userId },
'document too large, returning 406 response'
)
return res.sendStatus(406)
}
logger.debug(
{ projectId, docId, lines, source, userId, undoing },
'setting doc via http'
)
const timer = new Metrics.Timer('http.setDoc')
DocumentManager.setDocWithLock(
projectId,
docId,
lines,
source,
userId,
undoing,
true,
(error, result) => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId, docId }, 'set doc via http')
res.json(result)
}
)
}
function appendToDoc(req, res, next) {
const docId = req.params.doc_id
const projectId = req.params.project_id
const { lines, source, user_id: userId } = req.body
const timer = new Metrics.Timer('http.appendToDoc')
DocumentManager.appendToDocWithLock(
projectId,
docId,
lines,
source,
userId,
(error, result) => {
timer.done()
if (error instanceof Errors.FileTooLargeError) {
        logger.warn(
          { projectId, docId },
          'refusing to append to doc, it would become too large'
        )
return res.sendStatus(422)
}
if (error) {
return next(error)
}
logger.debug(
{ projectId, docId, lines, source, userId },
        'appended to doc via http'
)
res.json(result)
}
)
}
function flushDocIfLoaded(req, res, next) {
const docId = req.params.doc_id
const projectId = req.params.project_id
logger.debug({ projectId, docId }, 'flushing doc via http')
const timer = new Metrics.Timer('http.flushDoc')
DocumentManager.flushDocIfLoadedWithLock(projectId, docId, error => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId, docId }, 'flushed doc via http')
res.sendStatus(204) // No Content
})
}
function deleteDoc(req, res, next) {
const docId = req.params.doc_id
const projectId = req.params.project_id
const ignoreFlushErrors = req.query.ignore_flush_errors === 'true'
const timer = new Metrics.Timer('http.deleteDoc')
logger.debug({ projectId, docId }, 'deleting doc via http')
DocumentManager.flushAndDeleteDocWithLock(
projectId,
docId,
{ ignoreFlushErrors },
error => {
timer.done()
// There is no harm in flushing project history if the previous call
// failed and sometimes it is required
HistoryManager.flushProjectChangesAsync(projectId)
if (error) {
return next(error)
}
logger.debug({ projectId, docId }, 'deleted doc via http')
res.sendStatus(204) // No Content
}
)
}
function flushProject(req, res, next) {
const projectId = req.params.project_id
logger.debug({ projectId }, 'flushing project via http')
const timer = new Metrics.Timer('http.flushProject')
ProjectManager.flushProjectWithLocks(projectId, error => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId }, 'flushed project via http')
res.sendStatus(204) // No Content
})
}
function deleteProject(req, res, next) {
const projectId = req.params.project_id
logger.debug({ projectId }, 'deleting project via http')
const options = {}
if (req.query.background) {
options.background = true
} // allow non-urgent flushes to be queued
if (req.query.shutdown) {
options.skip_history_flush = true
} // don't flush history when realtime shuts down
if (req.query.background) {
ProjectManager.queueFlushAndDeleteProject(projectId, error => {
if (error) {
return next(error)
}
logger.debug({ projectId }, 'queue delete of project via http')
      res.sendStatus(204) // No Content
    })
} else {
const timer = new Metrics.Timer('http.deleteProject')
ProjectManager.flushAndDeleteProjectWithLocks(projectId, options, error => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId }, 'deleted project via http')
res.sendStatus(204) // No Content
})
}
}
function deleteMultipleProjects(req, res, next) {
const projectIds = req.body.project_ids || []
logger.debug({ projectIds }, 'deleting multiple projects via http')
async.eachSeries(
projectIds,
(projectId, cb) => {
logger.debug({ projectId }, 'queue delete of project via http')
ProjectManager.queueFlushAndDeleteProject(projectId, cb)
},
error => {
if (error) {
return next(error)
}
res.sendStatus(204) // No Content
}
)
}
function acceptChanges(req, res, next) {
const { project_id: projectId, doc_id: docId } = req.params
let changeIds = req.body.change_ids
if (changeIds == null) {
changeIds = [req.params.change_id]
}
logger.debug(
{ projectId, docId },
`accepting ${changeIds.length} changes via http`
)
const timer = new Metrics.Timer('http.acceptChanges')
DocumentManager.acceptChangesWithLock(projectId, docId, changeIds, error => {
timer.done()
if (error) {
return next(error)
}
logger.debug(
{ projectId, docId },
`accepted ${changeIds.length} changes via http`
)
res.sendStatus(204) // No Content
})
}
function resolveComment(req, res, next) {
const {
project_id: projectId,
doc_id: docId,
comment_id: commentId,
} = req.params
const userId = req.body.user_id
logger.debug({ projectId, docId, commentId }, 'resolving comment via http')
DocumentManager.updateCommentStateWithLock(
projectId,
docId,
commentId,
userId,
true,
error => {
if (error) {
return next(error)
}
logger.debug({ projectId, docId, commentId }, 'resolved comment via http')
res.sendStatus(204) // No Content
}
)
}
function reopenComment(req, res, next) {
const {
project_id: projectId,
doc_id: docId,
comment_id: commentId,
} = req.params
const userId = req.body.user_id
logger.debug({ projectId, docId, commentId }, 'reopening comment via http')
DocumentManager.updateCommentStateWithLock(
projectId,
docId,
commentId,
userId,
false,
error => {
if (error) {
return next(error)
}
logger.debug({ projectId, docId, commentId }, 'reopened comment via http')
res.sendStatus(204) // No Content
}
)
}
function deleteComment(req, res, next) {
const {
project_id: projectId,
doc_id: docId,
comment_id: commentId,
} = req.params
const userId = req.body.user_id
logger.debug({ projectId, docId, commentId }, 'deleting comment via http')
const timer = new Metrics.Timer('http.deleteComment')
DocumentManager.deleteCommentWithLock(
projectId,
docId,
commentId,
userId,
error => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId, docId, commentId }, 'deleted comment via http')
res.sendStatus(204) // No Content
}
)
}
function updateProject(req, res, next) {
const timer = new Metrics.Timer('http.updateProject')
const projectId = req.params.project_id
const { projectHistoryId, userId, updates = [], version, source } = req.body
logger.debug({ projectId, updates, version }, 'updating project via http')
ProjectManager.updateProjectWithLocks(
projectId,
projectHistoryId,
userId,
updates,
version,
source,
error => {
timer.done()
if (error) {
return next(error)
}
logger.debug({ projectId }, 'updated project via http')
res.sendStatus(204) // No Content
}
)
}
function resyncProjectHistory(req, res, next) {
const projectId = req.params.project_id
const {
projectHistoryId,
docs,
files,
historyRangesMigration,
resyncProjectStructureOnly,
} = req.body
logger.debug(
{ projectId, docs, files },
'queuing project history resync via http'
)
const opts = {}
if (historyRangesMigration) {
opts.historyRangesMigration = historyRangesMigration
}
if (resyncProjectStructureOnly) {
opts.resyncProjectStructureOnly = resyncProjectStructureOnly
}
HistoryManager.resyncProjectHistory(
projectId,
projectHistoryId,
docs,
files,
opts,
error => {
if (error) {
return next(error)
}
logger.debug({ projectId }, 'queued project history resync via http')
res.sendStatus(204)
}
)
}
function flushQueuedProjects(req, res, next) {
res.setTimeout(10 * 60 * 1000)
const options = {
limit: req.query.limit || 1000,
timeout: 5 * 60 * 1000,
min_delete_age: req.query.min_delete_age || 5 * 60 * 1000,
}
DeleteQueueManager.flushAndDeleteOldProjects(options, (err, flushed) => {
if (err) {
logger.err({ err }, 'error flushing old projects')
res.sendStatus(500)
} else {
logger.info({ flushed }, 'flush of queued projects completed')
res.send({ flushed })
}
})
}
/**
* Block a project from getting loaded in docupdater
*
* The project is blocked only if it's not already loaded in docupdater. The
* response indicates whether the project has been blocked or not.
*/
function blockProject(req, res, next) {
const projectId = req.params.project_id
RedisManager.blockProject(projectId, (err, blocked) => {
if (err) {
return next(err)
}
res.json({ blocked })
})
}
/**
* Unblock a project
*/
function unblockProject(req, res, next) {
const projectId = req.params.project_id
RedisManager.unblockProject(projectId, (err, wasBlocked) => {
if (err) {
return next(err)
}
res.json({ wasBlocked })
})
}
module.exports = {
getDoc,
peekDoc,
getProjectDocsAndFlushIfOld,
getProjectLastUpdatedAt,
clearProjectState,
appendToDoc,
setDoc,
flushDocIfLoaded,
deleteDoc,
flushProject,
deleteProject,
deleteMultipleProjects,
acceptChanges,
resolveComment,
reopenComment,
deleteComment,
updateProject,
resyncProjectHistory,
flushQueuedProjects,
blockProject,
unblockProject,
getComment,
}
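// Example request (a sketch: the route definitions live in app.js, which is
// not part of this file; the path shape follows the req.params names used
// above, and the port depends on the service configuration):
//
//   curl "http://localhost:3003/project/<project_id>/doc/<doc_id>?fromVersion=-1"
//   // => { "id": "...", "lines": [...], "version": 42, "ops": [], ... }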

View File

@@ -0,0 +1,31 @@
module.exports = {
  // compute the total size of the document in characters, including newlines
getTotalSizeOfLines(lines) {
let size = 0
for (const line of lines) {
size += line.length + 1 // include the newline
}
return size
},
// check whether the total size of the document in characters exceeds the
// maxDocLength.
//
// The estimated size should be an upper bound on the true size, typically
// it will be the size of the JSON.stringified array of lines. If the
// estimated size is less than the maxDocLength then we know that the total
// size of lines will also be less than maxDocLength.
docIsTooLarge(estimatedSize, lines, maxDocLength) {
if (estimatedSize <= maxDocLength) {
return false // definitely under the limit, no need to calculate the total size
}
// calculate the total size, bailing out early if the size limit is reached
let size = 0
for (const line of lines) {
size += line.length + 1 // include the newline
if (size > maxDocLength) return true
}
// since we didn't hit the limit in the loop, the document is within the allowed length
return false
},
}
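// Usage sketch: callers pass a cheap upper bound (typically the length of the
// JSON.stringify'd lines array) so the exact character count is only computed
// when the estimate already exceeds the limit.
//
//   const { getTotalSizeOfLines, docIsTooLarge } = require('./Limits')
//   const lines = ['hello', 'world']
//   getTotalSizeOfLines(lines) // => 12 (5 + 1 + 5 + 1, newlines included)
//   const estimatedSize = JSON.stringify(lines).length // 17, an upper bound
//   docIsTooLarge(estimatedSize, lines, 10) // => true (12 > 10)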

View File

@@ -0,0 +1,18 @@
const Settings = require('@overleaf/settings')
const redis = require('@overleaf/redis-wrapper')
const rclient = redis.createClient(Settings.redis.lock)
const keys = Settings.redis.lock.key_schema
const RedisLocker = require('@overleaf/redis-wrapper/RedisLocker')
module.exports = new RedisLocker({
rclient,
getKey(docId) {
return keys.blockingKey({ doc_id: docId })
},
wrapTimeoutError(err, docId) {
err.doc_id = docId
return err
},
metricsPrefix: 'doc',
lockTTLSeconds: Settings.redisLockTTLSeconds,
})

View File

@@ -0,0 +1,51 @@
/* eslint-disable
no-return-assign,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const _ = require('lodash')
const showLength = function (thing) {
  if (thing?.length) {
    return thing.length
  } else {
    return thing
  }
}
const showUpdateLength = function (update) {
  if (update?.op instanceof Array) {
const copy = _.cloneDeep(update)
copy.op.forEach(function (element, index) {
if (element?.i?.length != null) {
copy.op[index].i = element.i.length
}
if (element?.d?.length != null) {
copy.op[index].d = element.d.length
}
if (element?.c?.length != null) {
        copy.op[index].c = element.c.length
}
})
return copy
} else {
return update
}
}
module.exports = {
// replace long values with their length
lines: showLength,
oldLines: showLength,
newLines: showLength,
docLines: showLength,
newDocLines: showLength,
ranges: showLength,
update: showUpdateLength,
}
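// Example of the update serializer (a sketch, assuming this file is the
// logger serializers module): long op payloads are replaced by their lengths
// so logged updates stay small.
//
//   module.exports.update({ doc: 'doc-id', op: [{ i: 'hello', p: 0 }, { d: 'abc', p: 9 }] })
//   // => { doc: 'doc-id', op: [{ i: 5, p: 0 }, { d: 3, p: 9 }] }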

View File

@@ -0,0 +1,3 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
module.exports = require('@overleaf/metrics')

View File

@@ -0,0 +1,196 @@
const { promisify } = require('node:util')
const { promisifyMultiResult } = require('@overleaf/promise-utils')
const Settings = require('@overleaf/settings')
const Errors = require('./Errors')
const Metrics = require('./Metrics')
const logger = require('@overleaf/logger')
const request = require('requestretry').defaults({
maxAttempts: 2,
retryDelay: 10,
})
// We have to be quick with HTTP calls because we're holding a lock that
// expires after 30 seconds. We can't let any errors in the rest of the stack
// hold us up, and need to bail out quickly if there is a problem.
const MAX_HTTP_REQUEST_LENGTH = 5000 // 5 seconds
function updateMetric(method, error, response) {
// find the status, with special handling for connection timeouts
// https://github.com/request/request#timeouts
let status
if (error && error.connect === true) {
status = `${error.code} (connect)`
} else if (error) {
status = error.code
} else if (response) {
status = response.statusCode
}
Metrics.inc(method, 1, { status })
if (error && error.attempts > 1) {
Metrics.inc(`${method}-retries`, 1, { status: 'error' })
}
if (response && response.attempts > 1) {
Metrics.inc(`${method}-retries`, 1, { status: 'success' })
}
}
function getDoc(projectId, docId, options = {}, _callback) {
const timer = new Metrics.Timer('persistenceManager.getDoc')
if (typeof options === 'function') {
_callback = options
options = {}
}
const callback = function (...args) {
timer.done()
_callback(...args)
}
const urlPath = `/project/${projectId}/doc/${docId}`
const requestParams = {
url: `${Settings.apis.web.url}${urlPath}`,
method: 'GET',
headers: {
accept: 'application/json',
},
auth: {
user: Settings.apis.web.user,
pass: Settings.apis.web.pass,
sendImmediately: true,
},
jar: false,
timeout: MAX_HTTP_REQUEST_LENGTH,
}
if (options.peek) {
requestParams.qs = { peek: 'true' }
}
request(requestParams, (error, res, body) => {
updateMetric('getDoc', error, res)
if (error) {
logger.error({ err: error, projectId, docId }, 'web API request failed')
return callback(new Error('error connecting to web API'))
}
if (res.statusCode >= 200 && res.statusCode < 300) {
try {
body = JSON.parse(body)
} catch (e) {
return callback(e)
}
if (body.lines == null) {
return callback(new Error('web API response had no doc lines'))
}
if (body.version == null) {
return callback(new Error('web API response had no valid doc version'))
}
if (body.pathname == null) {
return callback(new Error('web API response had no valid doc pathname'))
}
if (!body.pathname) {
logger.warn(
{ projectId, docId },
'missing pathname in PersistenceManager getDoc'
)
Metrics.inc('pathname', 1, {
path: 'PersistenceManager.getDoc',
status: body.pathname === '' ? 'zero-length' : 'undefined',
})
}
callback(
null,
body.lines,
body.version,
body.ranges,
body.pathname,
body.projectHistoryId?.toString(),
body.historyRangesSupport || false,
body.resolvedCommentIds || []
)
} else if (res.statusCode === 404) {
      callback(new Errors.NotFoundError(`doc not found: ${urlPath}`))
} else if (res.statusCode === 413) {
callback(
new Errors.FileTooLargeError(`doc exceeds maximum size: ${urlPath}`)
)
} else {
callback(
new Error(`error accessing web API: ${urlPath} ${res.statusCode}`)
)
}
})
}
function setDoc(
projectId,
docId,
lines,
version,
ranges,
lastUpdatedAt,
lastUpdatedBy,
_callback
) {
const timer = new Metrics.Timer('persistenceManager.setDoc')
const callback = function (...args) {
timer.done()
_callback(...args)
}
const urlPath = `/project/${projectId}/doc/${docId}`
request(
{
url: `${Settings.apis.web.url}${urlPath}`,
method: 'POST',
json: {
lines,
ranges,
version,
lastUpdatedBy,
lastUpdatedAt,
},
auth: {
user: Settings.apis.web.user,
pass: Settings.apis.web.pass,
sendImmediately: true,
},
jar: false,
timeout: MAX_HTTP_REQUEST_LENGTH,
},
(error, res, body) => {
updateMetric('setDoc', error, res)
if (error) {
logger.error({ err: error, projectId, docId }, 'web API request failed')
return callback(new Error('error connecting to web API'))
}
if (res.statusCode >= 200 && res.statusCode < 300) {
callback(null, body)
} else if (res.statusCode === 404) {
        callback(new Errors.NotFoundError(`doc not found: ${urlPath}`))
} else if (res.statusCode === 413) {
callback(
new Errors.FileTooLargeError(`doc exceeds maximum size: ${urlPath}`)
)
} else {
callback(
new Error(`error accessing web API: ${urlPath} ${res.statusCode}`)
)
}
}
)
}
module.exports = {
getDoc,
setDoc,
promises: {
getDoc: promisifyMultiResult(getDoc, [
'lines',
'version',
'ranges',
'pathname',
'projectHistoryId',
'historyRangesSupport',
'resolvedCommentIds',
]),
setDoc: promisify(setDoc),
},
}
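// Calling convention sketch (projectId and docId are placeholders, and the
// await form assumes an enclosing async function): the callback API spreads
// the positional results listed above, while the promise API resolves to a
// named object.
//
//   const PersistenceManager = require('./PersistenceManager')
//   const { lines, version, pathname } =
//     await PersistenceManager.promises.getDoc(projectId, docId, { peek: true })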

View File

@@ -0,0 +1,68 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS206: Consider reworking classes to avoid initClass
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let Profiler
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const deltaMs = function (ta, tb) {
const nanoSeconds = (ta[0] - tb[0]) * 1e9 + (ta[1] - tb[1])
const milliSeconds = Math.floor(nanoSeconds * 1e-6)
return milliSeconds
}
module.exports = Profiler = class Profiler {
constructor(name, args) {
this.name = name
this.args = args
this.t0 = this.t = process.hrtime()
this.start = new Date()
this.updateTimes = []
this.totalSyncTime = 0
}
log(label, options = {}) {
const t1 = process.hrtime()
const dtMilliSec = deltaMs(t1, this.t)
this.t = t1
this.totalSyncTime += options.sync ? dtMilliSec : 0
this.updateTimes.push([label, dtMilliSec]) // timings in ms
return this // make it chainable
}
end(message) {
const totalTime = deltaMs(this.t, this.t0)
const exceedsCutoff = totalTime > this.LOG_CUTOFF_TIME
const exceedsSyncCutoff = this.totalSyncTime > this.LOG_SYNC_CUTOFF_TIME
if (exceedsCutoff || exceedsSyncCutoff) {
// log anything greater than cutoffs
const args = {}
for (const k in this.args) {
const v = this.args[k]
args[k] = v
}
args.updateTimes = this.updateTimes
args.start = this.start
args.end = new Date()
args.status = { exceedsCutoff, exceedsSyncCutoff }
logger.warn(args, this.name)
}
return totalTime
}
}
Profiler.prototype.LOG_CUTOFF_TIME = 15 * 1000
Profiler.prototype.LOG_SYNC_CUTOFF_TIME = 1000
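// Usage sketch (the work functions are hypothetical): label each stage of an
// operation; end() logs a warning only when the total or synchronous time
// exceeds the cutoffs above.
//
//   const profile = new Profiler('processUpdate', { projectId, docId })
//   doSyncWork()
//   profile.log('syncWork', { sync: true })
//   doAsyncWork(() => {
//     profile.log('asyncWork')
//     const totalMs = profile.end()
//   })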

View File

@@ -0,0 +1,139 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const request = require('request')
const Settings = require('@overleaf/settings')
const RedisManager = require('./RedisManager')
const { rclient } = RedisManager
const docUpdaterKeys = Settings.redis.documentupdater.key_schema
const async = require('async')
const ProjectManager = require('./ProjectManager')
const _ = require('lodash')
const logger = require('@overleaf/logger')
const { promisifyAll } = require('@overleaf/promise-utils')
const ProjectFlusher = {
// iterate over keys asynchronously using redis scan (non-blocking)
// handle all the cluster nodes or single redis server
_getKeys(pattern, limit, callback) {
const nodes = (typeof rclient.nodes === 'function'
? rclient.nodes('master')
: undefined) || [rclient]
const doKeyLookupForNode = (node, cb) =>
ProjectFlusher._getKeysFromNode(node, pattern, limit, cb)
return async.concatSeries(nodes, doKeyLookupForNode, callback)
},
_getKeysFromNode(node, pattern, limit, callback) {
if (limit == null) {
limit = 1000
}
let cursor = 0 // redis iterator
const keySet = {} // use hash to avoid duplicate results
const batchSize = limit != null ? Math.min(limit, 1000) : 1000
// scan over all keys looking for pattern
const doIteration = (
cb // avoid hitting redis too hard
) =>
node.scan(
cursor,
'MATCH',
pattern,
'COUNT',
batchSize,
function (error, reply) {
let keys
if (error != null) {
return callback(error)
}
          ;[cursor, keys] = reply
          for (const key of keys) {
keySet[key] = true
}
keys = Object.keys(keySet)
const noResults = cursor === '0' // redis returns string results not numeric
const limitReached = limit != null && keys.length >= limit
if (noResults || limitReached) {
return callback(null, keys)
} else {
return setTimeout(doIteration, 10)
}
}
)
return doIteration()
},
// extract ids from keys like DocsWithHistoryOps:57fd0b1f53a8396d22b2c24b
// or docsInProject:{57fd0b1f53a8396d22b2c24b} (for redis cluster)
  _extractIds(keyList) {
    const ids = []
    for (const key of keyList) {
      const m = key.match(/:\{?([0-9a-f]{24})\}?/) // extract the object id
      if (m) {
        ids.push(m[1])
      }
    }
    return ids
  },
flushAllProjects(options, callback) {
logger.info({ options }, 'flushing all projects')
return ProjectFlusher._getKeys(
docUpdaterKeys.docsInProject({ project_id: '*' }),
options.limit,
function (error, projectKeys) {
if (error != null) {
logger.err({ err: error }, 'error getting keys for flushing')
return callback(error)
}
const projectIds = ProjectFlusher._extractIds(projectKeys)
if (options.dryRun) {
return callback(null, projectIds)
}
const jobs = _.map(
projectIds,
projectId => cb =>
ProjectManager.flushAndDeleteProjectWithLocks(
projectId,
{ background: true },
cb
)
)
return async.parallelLimit(
async.reflectAll(jobs),
options.concurrency,
function (error, results) {
const success = []
const failure = []
_.each(results, function (result, i) {
if (result.error != null) {
return failure.push(projectIds[i])
} else {
return success.push(projectIds[i])
}
})
logger.info(
{ successCount: success.length, failureCount: failure.length },
'finished flushing all projects'
)
return callback(error, { success, failure })
}
)
}
)
},
}
module.exports = ProjectFlusher
module.exports.promises = promisifyAll(ProjectFlusher)
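// Invocation sketch (option values illustrative): scan redis for loaded
// projects, then flush and delete each one with bounded concurrency.
//
//   ProjectFlusher.flushAllProjects(
//     { limit: 1000, concurrency: 5, dryRun: false },
//     (err, { success, failure }) => {
//       // success/failure are arrays of project ids
//     }
//   )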

View File

@@ -0,0 +1,245 @@
// @ts-check
const Settings = require('@overleaf/settings')
const { callbackifyAll } = require('@overleaf/promise-utils')
const projectHistoryKeys = Settings.redis?.project_history?.key_schema
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.project_history
)
const logger = require('@overleaf/logger')
const metrics = require('./Metrics')
const { docIsTooLarge } = require('./Limits')
const { addTrackedDeletesToContent, extractOriginOrSource } = require('./Utils')
const HistoryConversions = require('./HistoryConversions')
const OError = require('@overleaf/o-error')
/**
* @import { Ranges } from './types'
*/
const ProjectHistoryRedisManager = {
async queueOps(projectId, ...ops) {
// Record metric for ops pushed onto queue
for (const op of ops) {
metrics.summary('redis.projectHistoryOps', op.length, { status: 'push' })
}
// Make sure that this MULTI operation only operates on project
// specific keys, i.e. keys that have the project id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
const multi = rclient.multi()
// Push the ops onto the project history queue
multi.rpush(
projectHistoryKeys.projectHistoryOps({ project_id: projectId }),
...ops
)
// To record the age of the oldest op on the queue set a timestamp if not
// already present (SETNX).
multi.setnx(
projectHistoryKeys.projectHistoryFirstOpTimestamp({
project_id: projectId,
}),
Date.now()
)
const result = await multi.exec()
return result[0]
},
async queueRenameEntity(
projectId,
projectHistoryId,
entityType,
entityId,
userId,
projectUpdate,
originOrSource
) {
projectUpdate = {
pathname: projectUpdate.pathname,
new_pathname: projectUpdate.newPathname,
meta: {
user_id: userId,
ts: new Date(),
},
version: projectUpdate.version,
projectHistoryId,
}
projectUpdate[entityType] = entityId
const { origin, source } = extractOriginOrSource(originOrSource)
if (origin != null) {
projectUpdate.meta.origin = origin
if (origin.kind !== 'editor') {
projectUpdate.meta.type = 'external'
}
} else if (source != null) {
projectUpdate.meta.source = source
if (source !== 'editor') {
projectUpdate.meta.type = 'external'
}
}
logger.debug(
{ projectId, projectUpdate },
'queue rename operation to project-history'
)
const jsonUpdate = JSON.stringify(projectUpdate)
return await ProjectHistoryRedisManager.queueOps(projectId, jsonUpdate)
},
async queueAddEntity(
projectId,
projectHistoryId,
entityType,
entityId,
userId,
projectUpdate,
originOrSource
) {
let docLines = projectUpdate.docLines
let ranges
if (projectUpdate.historyRangesSupport && projectUpdate.ranges) {
docLines = addTrackedDeletesToContent(
docLines,
projectUpdate.ranges.changes ?? []
)
ranges = HistoryConversions.toHistoryRanges(projectUpdate.ranges)
}
projectUpdate = {
pathname: projectUpdate.pathname,
docLines,
url: projectUpdate.url,
meta: {
user_id: userId,
ts: new Date(),
},
version: projectUpdate.version,
hash: projectUpdate.hash,
metadata: projectUpdate.metadata,
projectHistoryId,
createdBlob: projectUpdate.createdBlob ?? false,
}
if (ranges) {
projectUpdate.ranges = ranges
}
projectUpdate[entityType] = entityId
const { origin, source } = extractOriginOrSource(originOrSource)
if (origin != null) {
projectUpdate.meta.origin = origin
if (origin.kind !== 'editor') {
projectUpdate.meta.type = 'external'
}
} else if (source != null) {
projectUpdate.meta.source = source
if (source !== 'editor') {
projectUpdate.meta.type = 'external'
}
}
logger.debug(
{ projectId, projectUpdate },
'queue add operation to project-history'
)
const jsonUpdate = JSON.stringify(projectUpdate)
return await ProjectHistoryRedisManager.queueOps(projectId, jsonUpdate)
},
async queueResyncProjectStructure(
projectId,
projectHistoryId,
docs,
files,
opts
) {
logger.debug({ projectId, docs, files }, 'queue project structure resync')
const projectUpdate = {
resyncProjectStructure: { docs, files },
projectHistoryId,
meta: {
ts: new Date(),
},
}
if (opts.resyncProjectStructureOnly) {
projectUpdate.resyncProjectStructureOnly = opts.resyncProjectStructureOnly
}
const jsonUpdate = JSON.stringify(projectUpdate)
return await ProjectHistoryRedisManager.queueOps(projectId, jsonUpdate)
},
/**
* Add a resync doc update to the project-history queue
*
* @param {string} projectId
* @param {string} projectHistoryId
* @param {string} docId
* @param {string[]} lines
* @param {Ranges} ranges
* @param {string[]} resolvedCommentIds
* @param {number} version
* @param {string} pathname
* @param {boolean} historyRangesSupport
* @return {Promise<number>} the number of ops added
*/
async queueResyncDocContent(
projectId,
projectHistoryId,
docId,
lines,
ranges,
resolvedCommentIds,
version,
pathname,
historyRangesSupport
) {
logger.debug(
{ projectId, docId, lines, version, pathname },
'queue doc content resync'
)
let content = lines.join('\n')
if (historyRangesSupport) {
content = addTrackedDeletesToContent(content, ranges.changes ?? [])
}
const projectUpdate = {
resyncDocContent: { content, version },
projectHistoryId,
path: pathname,
doc: docId,
meta: {
ts: new Date(),
},
}
if (historyRangesSupport) {
projectUpdate.resyncDocContent.ranges =
HistoryConversions.toHistoryRanges(ranges)
projectUpdate.resyncDocContent.resolvedCommentIds = resolvedCommentIds
}
const jsonUpdate = JSON.stringify(projectUpdate)
// Do an optimised size check on the docLines using the serialised
// project update length as an upper bound
const sizeBound = jsonUpdate.length
if (docIsTooLarge(sizeBound, lines, Settings.max_doc_length)) {
throw new OError(
'blocking resync doc content insert into project history queue: doc is too large',
{ projectId, docId, docSize: sizeBound }
)
}
return await ProjectHistoryRedisManager.queueOps(projectId, jsonUpdate)
},
}
module.exports = {
...callbackifyAll(ProjectHistoryRedisManager),
promises: ProjectHistoryRedisManager,
}
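// Usage sketch (payload shape illustrative): updates are queued as serialised
// JSON, and queueOps returns the first result of the MULTI (the RPUSH). The
// key names come from Settings.redis.project_history.key_schema (not shown
// here) and embed the project id in curly braces so a Redis cluster hashes
// both keys to the same slot, keeping the MULTI on one node.
//
//   await ProjectHistoryRedisManager.promises.queueOps(
//     projectId,
//     JSON.stringify({ pathname, docLines, meta: { ts: new Date() } })
//   )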

View File

@@ -0,0 +1,341 @@
const RedisManager = require('./RedisManager')
const ProjectHistoryRedisManager = require('./ProjectHistoryRedisManager')
const DocumentManager = require('./DocumentManager')
const HistoryManager = require('./HistoryManager')
const async = require('async')
const logger = require('@overleaf/logger')
const Metrics = require('./Metrics')
const Errors = require('./Errors')
const { promisifyAll } = require('@overleaf/promise-utils')
function flushProjectWithLocks(projectId, _callback) {
const timer = new Metrics.Timer('projectManager.flushProjectWithLocks')
const callback = function (...args) {
timer.done()
_callback(...args)
}
RedisManager.getDocIdsInProject(projectId, (error, docIds) => {
if (error) {
return callback(error)
}
const errors = []
const jobs = docIds.map(docId => callback => {
DocumentManager.flushDocIfLoadedWithLock(projectId, docId, error => {
if (error instanceof Errors.NotFoundError) {
logger.warn(
{ err: error, projectId, docId },
'found deleted doc when flushing'
)
callback()
} else if (error) {
logger.error({ err: error, projectId, docId }, 'error flushing doc')
errors.push(error)
callback()
} else {
callback()
}
})
})
logger.debug({ projectId, docIds }, 'flushing docs')
async.series(jobs, () => {
if (errors.length > 0) {
callback(new Error('Errors flushing docs. See log for details'))
} else {
callback(null)
}
})
})
}
function flushAndDeleteProjectWithLocks(projectId, options, _callback) {
const timer = new Metrics.Timer(
'projectManager.flushAndDeleteProjectWithLocks'
)
const callback = function (...args) {
timer.done()
_callback(...args)
}
RedisManager.getDocIdsInProject(projectId, (error, docIds) => {
if (error) {
return callback(error)
}
const errors = []
const jobs = docIds.map(docId => callback => {
DocumentManager.flushAndDeleteDocWithLock(projectId, docId, {}, error => {
if (error) {
logger.error({ err: error, projectId, docId }, 'error deleting doc')
errors.push(error)
}
callback()
})
})
logger.debug({ projectId, docIds }, 'deleting docs')
async.series(jobs, () =>
// When deleting the project here we want to ensure that project
// history is completely flushed because the project may be
// deleted in web after this call completes, and so further
// attempts to flush would fail after that.
HistoryManager.flushProjectChanges(projectId, options, error => {
if (errors.length > 0) {
callback(new Error('Errors deleting docs. See log for details'))
} else if (error) {
callback(error)
} else {
callback(null)
}
})
)
})
}
function queueFlushAndDeleteProject(projectId, callback) {
RedisManager.queueFlushAndDeleteProject(projectId, error => {
if (error) {
logger.error(
{ projectId, error },
'error adding project to flush and delete queue'
)
return callback(error)
}
Metrics.inc('queued-delete')
callback()
})
}
function getProjectDocsTimestamps(projectId, callback) {
RedisManager.getDocIdsInProject(projectId, (error, docIds) => {
if (error) {
return callback(error)
}
if (docIds.length === 0) {
return callback(null, [])
}
RedisManager.getDocTimestamps(docIds, (error, timestamps) => {
if (error) {
return callback(error)
}
callback(null, timestamps)
})
})
}
function getProjectDocsAndFlushIfOld(
projectId,
projectStateHash,
excludeVersions,
_callback
) {
const timer = new Metrics.Timer('projectManager.getProjectDocsAndFlushIfOld')
const callback = function (...args) {
timer.done()
_callback(...args)
}
RedisManager.checkOrSetProjectState(
projectId,
projectStateHash,
(error, projectStateChanged) => {
if (error) {
logger.error(
{ err: error, projectId },
'error getting/setting project state in getProjectDocsAndFlushIfOld'
)
return callback(error)
}
// we can't return docs if project structure has changed
if (projectStateChanged) {
return callback(
new Errors.ProjectStateChangedError('project state changed')
)
}
// project structure hasn't changed, return doc content from redis
RedisManager.getDocIdsInProject(projectId, (error, docIds) => {
if (error) {
logger.error(
{ err: error, projectId },
'error getting doc ids in getProjectDocs'
)
return callback(error)
}
// get the doc lines from redis
const jobs = docIds.map(docId => cb => {
DocumentManager.getDocAndFlushIfOldWithLock(
projectId,
docId,
(err, lines, version) => {
if (err) {
logger.error(
{ err, projectId, docId },
'error getting project doc lines in getProjectDocsAndFlushIfOld'
)
return cb(err)
}
const doc = { _id: docId, lines, v: version } // create a doc object to return
cb(null, doc)
}
)
})
async.series(jobs, (error, docs) => {
if (error) {
return callback(error)
}
callback(null, docs)
})
})
}
)
}
function clearProjectState(projectId, callback) {
RedisManager.clearProjectState(projectId, callback)
}
function updateProjectWithLocks(
projectId,
projectHistoryId,
userId,
updates,
projectVersion,
source,
_callback
) {
const timer = new Metrics.Timer('projectManager.updateProject')
const callback = function (...args) {
timer.done()
_callback(...args)
}
let projectSubversion = 0 // project versions can have multiple operations
let projectOpsLength = 0
function handleUpdate(update, cb) {
update.version = `${projectVersion}.${projectSubversion++}`
switch (update.type) {
case 'add-doc':
ProjectHistoryRedisManager.queueAddEntity(
projectId,
projectHistoryId,
'doc',
update.id,
userId,
update,
source,
(error, count) => {
projectOpsLength = count
cb(error)
}
)
break
case 'rename-doc':
if (!update.newPathname) {
// an empty newPathname signifies a delete, so there is no need to
// update the pathname in redis
ProjectHistoryRedisManager.queueRenameEntity(
projectId,
projectHistoryId,
'doc',
update.id,
userId,
update,
source,
(error, count) => {
projectOpsLength = count
cb(error)
}
)
} else {
// rename the doc in redis before queuing the update
DocumentManager.renameDocWithLock(
projectId,
update.id,
userId,
update,
projectHistoryId,
error => {
if (error) {
return cb(error)
}
ProjectHistoryRedisManager.queueRenameEntity(
projectId,
projectHistoryId,
'doc',
update.id,
userId,
update,
source,
(error, count) => {
projectOpsLength = count
cb(error)
}
)
}
)
}
break
case 'add-file':
ProjectHistoryRedisManager.queueAddEntity(
projectId,
projectHistoryId,
'file',
update.id,
userId,
update,
source,
(error, count) => {
projectOpsLength = count
cb(error)
}
)
break
case 'rename-file':
ProjectHistoryRedisManager.queueRenameEntity(
projectId,
projectHistoryId,
'file',
update.id,
userId,
update,
source,
(error, count) => {
projectOpsLength = count
cb(error)
}
)
break
default:
cb(new Error(`Unknown update type: ${update.type}`))
}
}
async.eachSeries(updates, handleUpdate, error => {
if (error) {
return callback(error)
}
if (
HistoryManager.shouldFlushHistoryOps(
projectOpsLength,
updates.length,
HistoryManager.FLUSH_PROJECT_EVERY_N_OPS
)
) {
HistoryManager.flushProjectChangesAsync(projectId)
}
callback()
})
}
module.exports = {
flushProjectWithLocks,
flushAndDeleteProjectWithLocks,
queueFlushAndDeleteProject,
getProjectDocsTimestamps,
getProjectDocsAndFlushIfOld,
clearProjectState,
updateProjectWithLocks,
}
module.exports.promises = promisifyAll(module.exports)
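// Illustrative usage sketch, not part of the production wiring: flushing all
// loaded docs in a project through the promisified API. The projectId is
// hypothetical.
// eslint-disable-next-line no-unused-vars
async function exampleFlushProject() {
  await module.exports.promises.flushProjectWithLocks(
    '507f1f77bcf86cd799439011'
  )
}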

View File

@@ -0,0 +1,577 @@
// @ts-check
const RangesTracker = require('@overleaf/ranges-tracker')
const logger = require('@overleaf/logger')
const OError = require('@overleaf/o-error')
const Metrics = require('./Metrics')
const _ = require('lodash')
const { isInsert, isDelete, isComment, getDocLength } = require('./Utils')
/**
* @import { Comment, CommentOp, InsertOp, DeleteOp, HistoryOp, Op } from './types'
* @import { HistoryCommentOp, HistoryDeleteOp, HistoryInsertOp, HistoryRetainOp } from './types'
* @import { HistoryDeleteTrackedChange, HistoryUpdate, Ranges, TrackedChange, Update } from './types'
*/
const RANGE_DELTA_BUCKETS = [0, 1, 2, 3, 4, 5, 10, 20, 50]
const RangesManager = {
MAX_COMMENTS: 500,
MAX_CHANGES: 2000,
/**
* Apply an update to the given doc (lines and ranges) and return new ranges
*
* @param {string} projectId
* @param {string} docId
* @param {Ranges} ranges - ranges before the updates were applied
* @param {Update[]} updates
* @param {string[]} newDocLines - the document lines after the updates were applied
* @param {object} opts
* @param {boolean} [opts.historyRangesSupport] - whether history ranges support is enabled
* @returns {{ newRanges: Ranges, rangesWereCollapsed: boolean, historyUpdates: HistoryUpdate[] }}
*/
applyUpdate(projectId, docId, ranges, updates, newDocLines, opts = {}) {
if (ranges == null) {
ranges = {}
}
if (updates == null) {
updates = []
}
const { changes, comments } = _.cloneDeep(ranges)
const rangesTracker = new RangesTracker(changes, comments)
const [emptyRangeCountBefore, totalRangeCountBefore] =
RangesManager._emptyRangesCount(rangesTracker)
const historyUpdates = []
for (const update of updates) {
const trackingChanges = Boolean(update.meta?.tc)
rangesTracker.track_changes = trackingChanges
if (update.meta?.tc) {
rangesTracker.setIdSeed(update.meta.tc)
}
const historyOps = []
for (const op of update.op) {
let croppedCommentOps = []
if (opts.historyRangesSupport) {
historyOps.push(
getHistoryOp(op, rangesTracker.comments, rangesTracker.changes)
)
if (isDelete(op) && trackingChanges) {
// If a tracked delete overlaps a comment, the comment must be
// cropped. The extent of the cropping is calculated before the
// delete is applied, but the cropping operations are applied
// later, after the delete is applied.
croppedCommentOps = getCroppedCommentOps(op, rangesTracker.comments)
}
} else if (isInsert(op) || isDelete(op)) {
historyOps.push(op)
}
rangesTracker.applyOp(op, { user_id: update.meta?.user_id })
if (croppedCommentOps.length > 0) {
historyOps.push(
...croppedCommentOps.map(op =>
getHistoryOpForComment(op, rangesTracker.changes)
)
)
}
}
if (historyOps.length > 0) {
historyUpdates.push({ ...update, op: historyOps })
}
}
if (
rangesTracker.changes?.length > RangesManager.MAX_CHANGES ||
rangesTracker.comments?.length > RangesManager.MAX_COMMENTS
) {
throw new Error('too many comments or tracked changes')
}
try {
// This is a consistency check that all of our ranges and
// comments still match the corresponding text
rangesTracker.validate(newDocLines.join('\n'))
} catch (err) {
logger.error(
{ err, projectId, docId, newDocLines, updates },
'error validating ranges'
)
throw err
}
const [emptyRangeCountAfter, totalRangeCountAfter] =
RangesManager._emptyRangesCount(rangesTracker)
const rangesWereCollapsed =
emptyRangeCountAfter > emptyRangeCountBefore ||
totalRangeCountAfter + 1 < totalRangeCountBefore // also include the case where multiple ranges were removed
// monitor the change in range count, we may want to snapshot before large decreases
if (totalRangeCountAfter < totalRangeCountBefore) {
Metrics.histogram(
'range-delta',
totalRangeCountBefore - totalRangeCountAfter,
RANGE_DELTA_BUCKETS,
{ status_code: rangesWereCollapsed ? 'saved' : 'unsaved' }
)
}
const newRanges = RangesManager._getRanges(rangesTracker)
logger.debug(
{
projectId,
docId,
changesCount: newRanges.changes?.length,
commentsCount: newRanges.comments?.length,
rangesWereCollapsed,
},
'applied updates to ranges'
)
return { newRanges, rangesWereCollapsed, historyUpdates }
},
acceptChanges(projectId, docId, changeIds, ranges, lines) {
const { changes, comments } = ranges
logger.debug(`accepting ${changeIds.length} changes in ranges`)
const rangesTracker = new RangesTracker(changes, comments)
rangesTracker.removeChangeIds(changeIds)
const newRanges = RangesManager._getRanges(rangesTracker)
return newRanges
},
deleteComment(commentId, ranges) {
const { changes, comments } = ranges
logger.debug({ commentId }, 'deleting comment in ranges')
const rangesTracker = new RangesTracker(changes, comments)
rangesTracker.removeCommentId(commentId)
const newRanges = RangesManager._getRanges(rangesTracker)
return newRanges
},
/**
*
* @param {object} args
* @param {string} args.docId
* @param {string[]} args.acceptedChangeIds
* @param {TrackedChange[]} args.changes
* @param {string} args.pathname
* @param {string} args.projectHistoryId
* @param {string[]} args.lines
*/
getHistoryUpdatesForAcceptedChanges({
docId,
acceptedChangeIds,
changes,
pathname,
projectHistoryId,
lines,
}) {
/** @type {(change: TrackedChange) => boolean} */
const isAccepted = change => acceptedChangeIds.includes(change.id)
const historyOps = []
// Keep ops in order of offset, with deletes before inserts
const sortedChanges = changes.slice().sort(function (c1, c2) {
const result = c1.op.p - c2.op.p
if (result !== 0) {
return result
} else if (isInsert(c1.op) && isDelete(c2.op)) {
return 1
} else if (isDelete(c1.op) && isInsert(c2.op)) {
return -1
} else {
return 0
}
})
const docLength = getDocLength(lines)
let historyDocLength = docLength
for (const change of sortedChanges) {
if (isDelete(change.op)) {
historyDocLength += change.op.d.length
}
}
let unacceptedDeletes = 0
for (const change of sortedChanges) {
/** @type {HistoryOp | undefined} */
let op
if (isDelete(change.op)) {
if (isAccepted(change)) {
op = {
p: change.op.p,
d: change.op.d,
}
if (unacceptedDeletes > 0) {
op.hpos = op.p + unacceptedDeletes
}
} else {
unacceptedDeletes += change.op.d.length
}
} else if (isInsert(change.op)) {
if (isAccepted(change)) {
op = {
p: change.op.p,
r: change.op.i,
tracking: { type: 'none' },
}
if (unacceptedDeletes > 0) {
op.hpos = op.p + unacceptedDeletes
}
}
}
if (!op) {
continue
}
/** @type {HistoryUpdate} */
const historyOp = {
doc: docId,
op: [op],
meta: {
...change.metadata,
ts: Date.now(),
doc_length: docLength,
pathname,
},
}
if (projectHistoryId) {
historyOp.projectHistoryId = projectHistoryId
}
if (historyOp.meta && historyDocLength !== docLength) {
historyOp.meta.history_doc_length = historyDocLength
}
historyOps.push(historyOp)
if (isDelete(change.op) && isAccepted(change)) {
historyDocLength -= change.op.d.length
}
}
return historyOps
},
_getRanges(rangesTracker) {
// Return the minimal data structure needed, since most documents won't have any
// changes or comments
const response = {}
if (rangesTracker.changes != null && rangesTracker.changes.length > 0) {
response.changes = rangesTracker.changes
}
if (rangesTracker.comments != null && rangesTracker.comments.length > 0) {
response.comments = rangesTracker.comments
}
return response
},
_emptyRangesCount(ranges) {
let emptyCount = 0
let totalCount = 0
for (const comment of ranges.comments || []) {
totalCount++
if (comment.op.c === '') {
emptyCount++
}
}
for (const change of ranges.changes || []) {
totalCount++
if (change.op.i != null) {
if (change.op.i === '') {
emptyCount++
}
}
}
return [emptyCount, totalCount]
},
}
/**
* Calculate ops to be sent to the history system.
*
* @param {Op} op - the editor op
* @param {Comment[]} comments - the list of comments in the document before
* the op is applied
* @param {TrackedChange[]} changes - the list of tracked changes in the
* document before the op is applied. That list, coming from
* RangesTracker, is ordered by position.
* @returns {HistoryOp}
*/
function getHistoryOp(op, comments, changes) {
if (isInsert(op)) {
return getHistoryOpForInsert(op, comments, changes)
} else if (isDelete(op)) {
return getHistoryOpForDelete(op, changes)
} else if (isComment(op)) {
return getHistoryOpForComment(op, changes)
} else {
throw new OError('Unrecognized op', { op })
}
}
/**
* Calculate history ops for an insert
*
* Inserts are moved forward by tracked deletes placed strictly before the
* op. When an insert is made at the same position as a tracked delete, the
* insert is placed before the tracked delete.
*
* We also add a commentIds property when inserts are made inside a comment.
* The current behaviour is to include the insert in the comment only if the
* insert is made strictly inside the comment. Inserts made at the edges are
* not included in the comment.
*
* @param {InsertOp} op
* @param {Comment[]} comments
* @param {TrackedChange[]} changes
* @returns {HistoryInsertOp}
*/
function getHistoryOpForInsert(op, comments, changes) {
let hpos = op.p
let trackedDeleteRejection = false
const commentIds = new Set()
for (const comment of comments) {
if (comment.op.p < op.p && op.p < comment.op.p + comment.op.c.length) {
// Insert is inside the comment; add the comment id
commentIds.add(comment.op.t)
}
}
// If it's determined that the op is a tracked delete rejection, we have to
// calculate its proper history position. If multiple tracked deletes are
// found at the same position as the insert, the tracked deletes that come
// before the tracked delete that was actually rejected offset the history
// position.
let trackedDeleteRejectionOffset = 0
for (const change of changes) {
if (!isDelete(change.op)) {
// We're only interested in tracked deletes
continue
}
if (change.op.p < op.p) {
// Tracked delete is before the op. Move the op forward.
hpos += change.op.d.length
} else if (change.op.p === op.p) {
// Tracked delete is at the same position as the op.
if (op.u && change.op.d.startsWith(op.i)) {
// We're undoing and the insert matches the start of the tracked
// delete. RangesManager treats this as a tracked delete rejection. We
// will note this in the op so that project-history can take the
// appropriate action.
trackedDeleteRejection = true
// The history must be updated to take into account all preceding
// tracked deletes at the same position
hpos += trackedDeleteRejectionOffset
// No need to continue. All subsequent tracked deletes are after the
// insert.
break
} else {
// This tracked delete does not match the insert. Note its length in
// case we find a tracked delete that matches later.
trackedDeleteRejectionOffset += change.op.d.length
}
} else {
// Tracked delete is after the insert. Tracked deletes are ordered, so
// we know that all subsequent tracked deletes will be after the insert
// and we can bail out.
break
}
}
/** @type {HistoryInsertOp} */
const historyOp = { ...op }
if (commentIds.size > 0) {
historyOp.commentIds = Array.from(commentIds)
}
if (hpos !== op.p) {
historyOp.hpos = hpos
}
if (trackedDeleteRejection) {
historyOp.trackedDeleteRejection = true
}
return historyOp
}
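// For example (illustrative values only): with a tracked delete of 'abc' at
// p=2, an editor insert { i: 'x', p: 5 } becomes the history op
// { i: 'x', p: 5, hpos: 8 }, because the history doc still contains the three
// deleted characters.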
/**
* Calculate history op for a delete
*
* Deletes are moved forward by tracked deletes placed before or at the position of the
* op. If a tracked delete is inside the delete, the delete is split in parts
* so that characters are deleted around the tracked delete, but the tracked
* delete itself is not deleted.
*
* @param {DeleteOp} op
* @param {TrackedChange[]} changes
* @returns {HistoryDeleteOp}
*/
function getHistoryOpForDelete(op, changes) {
let hpos = op.p
const opEnd = op.p + op.d.length
/** @type HistoryDeleteTrackedChange[] */
const changesInsideDelete = []
for (const change of changes) {
if (change.op.p <= op.p) {
if (isDelete(change.op)) {
// Tracked delete is before or at the position of the incoming delete.
// Move the op forward.
hpos += change.op.d.length
} else if (isInsert(change.op)) {
const changeEnd = change.op.p + change.op.i.length
const endPos = Math.min(changeEnd, opEnd)
if (endPos > op.p) {
// Part of the tracked insert is inside the delete
changesInsideDelete.push({
type: 'insert',
offset: 0,
length: endPos - op.p,
})
}
}
} else if (change.op.p < op.p + op.d.length) {
// Tracked change inside the deleted text. Record it for the history system.
if (isDelete(change.op)) {
changesInsideDelete.push({
type: 'delete',
offset: change.op.p - op.p,
length: change.op.d.length,
})
} else if (isInsert(change.op)) {
changesInsideDelete.push({
type: 'insert',
offset: change.op.p - op.p,
length: Math.min(change.op.i.length, opEnd - change.op.p),
})
}
} else {
// We've seen all tracked changes before or inside the delete
break
}
}
/** @type {HistoryDeleteOp} */
const historyOp = { ...op }
if (hpos !== op.p) {
historyOp.hpos = hpos
}
if (changesInsideDelete.length > 0) {
historyOp.trackedChanges = changesInsideDelete
}
return historyOp
}
/**
* Calculate history ops for a comment
*
* Comments are moved forward by tracked deletes placed before or at the
* position of the op. If a tracked delete is inside the comment, the length of
* the comment is extended to include the tracked delete.
*
* @param {CommentOp} op
* @param {TrackedChange[]} changes
* @returns {HistoryCommentOp}
*/
function getHistoryOpForComment(op, changes) {
let hpos = op.p
let hlen = op.c.length
for (const change of changes) {
if (!isDelete(change.op)) {
// We're only interested in tracked deletes
continue
}
if (change.op.p <= op.p) {
// Tracked delete is before or at the position of the incoming comment.
// Move the op forward.
hpos += change.op.d.length
} else if (change.op.p < op.p + op.c.length) {
// Tracked delete inside the comment. Extend the comment length
hlen += change.op.d.length
} else {
// We've seen all tracked deletes before or inside the comment
break
}
}
/** @type {HistoryCommentOp} */
const historyOp = { ...op }
if (hpos !== op.p) {
historyOp.hpos = hpos
}
if (hlen !== op.c.length) {
historyOp.hlen = hlen
}
return historyOp
}
/**
* Return the ops necessary to properly crop comments when a tracked delete is
* received
*
* The editor treats a tracked delete as a proper delete and updates the
* comment range accordingly. The history doesn't do that and remembers the
* extent of the comment in the tracked delete. In order to keep the history
* consistent with the editor, we'll send ops that will crop the comment in
* the history.
*
* @param {DeleteOp} op
* @param {Comment[]} comments
* @returns {CommentOp[]}
*/
function getCroppedCommentOps(op, comments) {
const deleteStart = op.p
const deleteLength = op.d.length
const deleteEnd = deleteStart + deleteLength
/** @type {CommentOp[]} */
const historyCommentOps = []
for (const comment of comments) {
const commentStart = comment.op.p
const commentLength = comment.op.c.length
const commentEnd = commentStart + commentLength
if (deleteStart <= commentStart && deleteEnd > commentStart) {
// The delete overlaps the start of the comment, or covers all of it.
const overlapLength = Math.min(deleteEnd, commentEnd) - commentStart
/** @type {CommentOp} */
const commentOp = {
p: deleteStart,
c: comment.op.c.slice(overlapLength),
t: comment.op.t,
}
if (comment.op.resolved) {
commentOp.resolved = true
}
historyCommentOps.push(commentOp)
} else if (
deleteStart > commentStart &&
deleteStart < commentEnd &&
deleteEnd >= commentEnd
) {
// The delete overlaps the end of the comment.
const overlapLength = commentEnd - deleteStart
/** @type {CommentOp} */
const commentOp = {
p: commentStart,
c: comment.op.c.slice(0, -overlapLength),
t: comment.op.t,
}
if (comment.op.resolved) {
commentOp.resolved = true
}
historyCommentOps.push(commentOp)
}
}
return historyCommentOps
}
module.exports = RangesManager
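// Illustrative sketch, not production wiring: applying one insert op to a doc
// with no pre-existing ranges. All ids are made up. Because meta.tc is set,
// the insert is recorded as a tracked change in newRanges.changes.
// eslint-disable-next-line no-unused-vars
function exampleApplyUpdate() {
  const update = {
    op: [{ i: 'Hello ', p: 0 }],
    v: 42,
    meta: { user_id: '507f191e810c19729de860ea', tc: 'some-id-seed' },
  }
  return RangesManager.applyUpdate(
    'project-id', // hypothetical
    'doc-id', // hypothetical
    {}, // ranges before the update
    [update],
    ['Hello world'], // doc lines after the update was applied
    { historyRangesSupport: false }
  )
}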

View File

@@ -0,0 +1,85 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let RateLimiter
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const Metrics = require('./Metrics')
module.exports = RateLimiter = class RateLimiter {
constructor(number) {
if (number == null) {
number = 10
}
this.ActiveWorkerCount = 0
this.CurrentWorkerLimit = number
this.BaseWorkerCount = number
}
_adjustLimitUp() {
this.CurrentWorkerLimit += 0.1 // allow target worker limit to increase gradually
return Metrics.gauge('currentLimit', Math.ceil(this.CurrentWorkerLimit))
}
_adjustLimitDown() {
this.CurrentWorkerLimit = Math.max(
this.BaseWorkerCount,
this.CurrentWorkerLimit * 0.9
)
logger.debug(
{ currentLimit: Math.ceil(this.CurrentWorkerLimit) },
'reducing rate limit'
)
return Metrics.gauge('currentLimit', Math.ceil(this.CurrentWorkerLimit))
}
_trackAndRun(task, callback) {
if (callback == null) {
callback = function () {}
}
this.ActiveWorkerCount++
Metrics.gauge('processingUpdates', this.ActiveWorkerCount)
return task(err => {
this.ActiveWorkerCount--
Metrics.gauge('processingUpdates', this.ActiveWorkerCount)
return callback(err)
})
}
run(task, callback) {
if (this.ActiveWorkerCount < this.CurrentWorkerLimit) {
// below the limit, just put the task in the background
this._trackAndRun(task, err => {
if (err) {
logger.error({ err }, 'error in background task')
}
})
callback() // return immediately
if (this.CurrentWorkerLimit > this.BaseWorkerCount) {
return this._adjustLimitDown()
}
} else {
logger.debug(
{
active: this.ActiveWorkerCount,
currentLimit: Math.ceil(this.CurrentWorkerLimit),
},
'hit rate limit'
)
return this._trackAndRun(task, err => {
if (err == null) {
this._adjustLimitUp()
} // don't increment rate limit if there was an error
return callback(err)
}) // only return after task completes
}
}
}
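// Illustrative usage sketch: wrapping a task in the limiter. Below the worker
// limit the outer callback returns immediately and the task runs in the
// background; at the limit the callback only fires once the task completes.
// eslint-disable-next-line no-unused-vars
function exampleRateLimiterRun() {
  const limiter = new RateLimiter(10)
  limiter.run(
    done => setTimeout(done, 100), // the task; it calls done(err) when finished
    err => {
      if (err) {
        logger.error({ err }, 'task failed')
      }
    }
  )
}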

View File

@@ -0,0 +1,136 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const Settings = require('@overleaf/settings')
const { promisifyAll } = require('@overleaf/promise-utils')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const pubsubClient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.pubsub
)
const Keys = Settings.redis.documentupdater.key_schema
const logger = require('@overleaf/logger')
const os = require('node:os')
const crypto = require('node:crypto')
const metrics = require('./Metrics')
const HOST = os.hostname()
const RND = crypto.randomBytes(4).toString('hex') // generate a random key for this process
let COUNT = 0
const MAX_OPS_PER_ITERATION = 8 // process a limited number of ops for safety
const RealTimeRedisManager = {
getPendingUpdatesForDoc(docId, callback) {
// Make sure that this MULTI operation only operates on doc
// specific keys, i.e. keys that have the doc id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
const multi = rclient.multi()
multi.llen(Keys.pendingUpdates({ doc_id: docId }))
multi.lrange(
Keys.pendingUpdates({ doc_id: docId }),
0,
MAX_OPS_PER_ITERATION - 1
)
multi.ltrim(
Keys.pendingUpdates({ doc_id: docId }),
MAX_OPS_PER_ITERATION,
-1
)
return multi.exec(function (error, replies) {
if (error != null) {
return callback(error)
}
const [llen, jsonUpdates, _trimResult] = replies
metrics.histogram(
'redis.pendingUpdates.llen',
llen,
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 75, 100]
)
for (const jsonUpdate of jsonUpdates) {
// record metric for each update removed from queue
metrics.summary('redis.pendingUpdates', jsonUpdate.length, {
status: 'pop',
})
}
const updates = []
for (const jsonUpdate of jsonUpdates) {
let update
try {
update = JSON.parse(jsonUpdate)
} catch (e) {
return callback(e)
}
updates.push(update)
}
return callback(error, updates)
})
},
getUpdatesLength(docId, callback) {
return rclient.llen(Keys.pendingUpdates({ doc_id: docId }), callback)
},
sendCanaryAppliedOp({ projectId, docId, op }) {
const ack = JSON.stringify({ v: op.v, doc: docId }).length
// Updates with op.dup===true will not get sent to other clients, they only get acked.
const broadcast = op.dup ? 0 : JSON.stringify(op).length
const payload = JSON.stringify({
message: 'canary-applied-op',
payload: {
ack,
broadcast,
docId,
projectId,
source: op.meta.source,
},
})
// Publish on the editor-events channel of the project as real-time already listens to that before completing the connection startup.
// publish on separate channels for individual projects and docs when
// configured (needs realtime to be configured for this too).
if (Settings.publishOnIndividualChannels) {
return pubsubClient.publish(`editor-events:${projectId}`, payload)
} else {
return pubsubClient.publish('editor-events', payload)
}
},
sendData(data) {
// create a unique message id using a counter
const messageId = `doc:${HOST}:${RND}-${COUNT++}`
if (data != null) {
data._id = messageId
}
const blob = JSON.stringify(data)
metrics.summary('redis.publish.applied-ops', blob.length)
// publish on separate channels for individual projects and docs when
// configured (needs realtime to be configured for this too).
if (Settings.publishOnIndividualChannels) {
return pubsubClient.publish(`applied-ops:${data.doc_id}`, blob)
} else {
return pubsubClient.publish('applied-ops', blob)
}
},
}
module.exports = RealTimeRedisManager
module.exports.promises = promisifyAll(RealTimeRedisManager, {
without: ['sendData'],
})

View File

@@ -0,0 +1,796 @@
const Settings = require('@overleaf/settings')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const logger = require('@overleaf/logger')
const OError = require('@overleaf/o-error')
const { promisifyAll } = require('@overleaf/promise-utils')
const metrics = require('./Metrics')
const Errors = require('./Errors')
const crypto = require('node:crypto')
const async = require('async')
const { docIsTooLarge } = require('./Limits')
// Sometimes Redis calls take an unexpectedly long time. We have to be
// quick with Redis calls because we're holding a lock that expires
// after 30 seconds. We can't let any errors in the rest of the stack
// hold us up, and need to bail out quickly if there is a problem.
const MAX_REDIS_REQUEST_LENGTH = 5000 // 5 seconds
const PROJECT_BLOCK_TTL_SECS = 30
// Make times easy to read
const minutes = 60 // seconds for Redis expire
const logHashReadErrors = Settings.documentupdater?.logHashErrors?.read
const MEGABYTES = 1024 * 1024
const MAX_RANGES_SIZE = 3 * MEGABYTES
const keys = Settings.redis.documentupdater.key_schema
const RedisManager = {
rclient,
putDocInMemory(
projectId,
docId,
docLines,
version,
ranges,
resolvedCommentIds,
pathname,
projectHistoryId,
historyRangesSupport,
_callback
) {
const timer = new metrics.Timer('redis.put-doc')
const callback = error => {
timer.done()
_callback(error)
}
const docLinesArray = docLines
docLines = JSON.stringify(docLines)
if (docLines.indexOf('\u0000') !== -1) {
const error = new Error('null bytes found in doc lines')
// this check was added to catch memory corruption in JSON.stringify.
// It sometimes returned null bytes at the end of the string.
logger.error({ err: error, docId, docLines }, error.message)
return callback(error)
}
// Do an optimised size check on the docLines using the serialised
// length as an upper bound
const sizeBound = docLines.length
if (docIsTooLarge(sizeBound, docLinesArray, Settings.max_doc_length)) {
const docSize = docLines.length
const err = new Error('blocking doc insert into redis: doc is too large')
logger.error({ projectId, docId, err, docSize }, err.message)
return callback(err)
}
const docHash = RedisManager._computeHash(docLines)
// record bytes sent to redis
metrics.summary('redis.docLines', docLines.length, { status: 'set' })
logger.debug(
{ projectId, docId, version, docHash, pathname, projectHistoryId },
'putting doc in redis'
)
RedisManager._serializeRanges(ranges, (error, ranges) => {
if (error) {
logger.error({ err: error, docId, projectId }, error.message)
return callback(error)
}
// update docsInProject set before writing doc contents
const multi = rclient.multi()
multi.exists(keys.projectBlock({ project_id: projectId }))
multi.sadd(keys.docsInProject({ project_id: projectId }), docId)
multi.exec((err, reply) => {
if (err) {
return callback(err)
}
const projectBlocked = reply[0] === 1
if (projectBlocked) {
// We don't clean up the spurious docId added in the docsInProject
// set. There is a risk that the docId was successfully added by a
// concurrent process. This set is used when unloading projects. An
// extra docId will not prevent the project from being unloaded, but
// a missing docId means that the doc might stay in Redis forever.
return callback(
new OError('Project blocked from loading docs', { projectId })
)
}
RedisManager.setHistoryRangesSupportFlag(
docId,
historyRangesSupport,
err => {
if (err) {
return callback(err)
}
if (!pathname) {
metrics.inc('pathname', 1, {
path: 'RedisManager.setDoc',
status: pathname === '' ? 'zero-length' : 'undefined',
})
}
// Make sure that this MULTI operation only operates on doc
// specific keys, i.e. keys that have the doc id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
const multi = rclient.multi()
multi.mset({
[keys.docLines({ doc_id: docId })]: docLines,
[keys.projectKey({ doc_id: docId })]: projectId,
[keys.docVersion({ doc_id: docId })]: version,
[keys.docHash({ doc_id: docId })]: docHash,
[keys.ranges({ doc_id: docId })]: ranges,
[keys.pathname({ doc_id: docId })]: pathname,
[keys.projectHistoryId({ doc_id: docId })]: projectHistoryId,
})
if (historyRangesSupport) {
multi.del(keys.resolvedCommentIds({ doc_id: docId }))
if (resolvedCommentIds.length > 0) {
multi.sadd(
keys.resolvedCommentIds({ doc_id: docId }),
...resolvedCommentIds
)
}
}
multi.exec(err => {
if (err) {
callback(
OError.tag(err, 'failed to write doc to Redis in MULTI', {
previousErrors: err.previousErrors.map(e => ({
name: e.name,
message: e.message,
command: e.command,
})),
})
)
} else {
callback()
}
})
}
)
})
})
},
removeDocFromMemory(projectId, docId, _callback) {
logger.debug({ projectId, docId }, 'removing doc from redis')
const callback = err => {
if (err) {
logger.err({ projectId, docId, err }, 'error removing doc from redis')
_callback(err)
} else {
logger.debug({ projectId, docId }, 'removed doc from redis')
_callback()
}
}
// Make sure that this MULTI operation only operates on doc
// specific keys, i.e. keys that have the doc id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
let multi = rclient.multi()
multi.strlen(keys.docLines({ doc_id: docId }))
multi.del(
keys.docLines({ doc_id: docId }),
keys.projectKey({ doc_id: docId }),
keys.docVersion({ doc_id: docId }),
keys.docHash({ doc_id: docId }),
keys.ranges({ doc_id: docId }),
keys.pathname({ doc_id: docId }),
keys.projectHistoryId({ doc_id: docId }),
keys.unflushedTime({ doc_id: docId }),
keys.lastUpdatedAt({ doc_id: docId }),
keys.lastUpdatedBy({ doc_id: docId }),
keys.resolvedCommentIds({ doc_id: docId })
)
multi.exec((error, response) => {
if (error) {
return callback(error)
}
const length = response?.[0]
if (length > 0) {
// record bytes freed in redis
metrics.summary('redis.docLines', length, { status: 'del' })
}
// Make sure that this MULTI operation only operates on project
// specific keys, i.e. keys that have the project id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
multi = rclient.multi()
multi.srem(keys.docsInProject({ project_id: projectId }), docId)
multi.del(keys.projectState({ project_id: projectId }))
multi.exec(err => {
if (err) {
return callback(err)
}
rclient.srem(keys.historyRangesSupport(), docId, callback)
})
})
},
checkOrSetProjectState(projectId, newState, callback) {
// Make sure that this MULTI operation only operates on project
// specific keys, i.e. keys that have the project id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
const multi = rclient.multi()
multi.getset(keys.projectState({ project_id: projectId }), newState)
multi.expire(keys.projectState({ project_id: projectId }), 30 * minutes)
multi.exec((error, response) => {
if (error) {
return callback(error)
}
logger.debug(
{ projectId, newState, oldState: response[0] },
'checking project state'
)
callback(null, response[0] !== newState)
})
},
clearProjectState(projectId, callback) {
rclient.del(keys.projectState({ project_id: projectId }), callback)
},
getDoc(projectId, docId, callback) {
const timer = new metrics.Timer('redis.get-doc')
const collectKeys = [
keys.docLines({ doc_id: docId }),
keys.docVersion({ doc_id: docId }),
keys.docHash({ doc_id: docId }),
keys.projectKey({ doc_id: docId }),
keys.ranges({ doc_id: docId }),
keys.pathname({ doc_id: docId }),
keys.projectHistoryId({ doc_id: docId }),
keys.unflushedTime({ doc_id: docId }),
keys.lastUpdatedAt({ doc_id: docId }),
keys.lastUpdatedBy({ doc_id: docId }),
]
rclient.mget(...collectKeys, (error, result) => {
if (error) {
return callback(error)
}
let [
docLines,
version,
storedHash,
docProjectId,
ranges,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
] = result
rclient.sismember(keys.historyRangesSupport(), docId, (error, result) => {
if (error) {
return callback(error)
}
rclient.smembers(
keys.resolvedCommentIds({ doc_id: docId }),
(error, resolvedCommentIds) => {
if (error) {
return callback(error)
}
const historyRangesSupport = result === 1
const timeSpan = timer.done()
// check if request took too long and bail out. only do this for
// get, because it is the first call in each update, so if this
// passes we'll assume others have a reasonable chance to succeed.
if (timeSpan > MAX_REDIS_REQUEST_LENGTH) {
error = new Error('redis getDoc exceeded timeout')
return callback(error)
}
// record bytes loaded from redis
if (docLines != null) {
metrics.summary('redis.docLines', docLines.length, {
status: 'get',
})
}
// check sha1 hash value if present
if (docLines != null && storedHash != null) {
const computedHash = RedisManager._computeHash(docLines)
if (logHashReadErrors && computedHash !== storedHash) {
logger.error(
{
projectId,
docId,
docProjectId,
computedHash,
storedHash,
docLines,
},
'hash mismatch on retrieved document'
)
}
}
try {
docLines = JSON.parse(docLines)
ranges = RedisManager._deserializeRanges(ranges)
} catch (e) {
return callback(e)
}
version = parseInt(version || 0, 10)
// check doc is in requested project
if (docProjectId != null && docProjectId !== projectId) {
logger.error(
{ projectId, docId, docProjectId },
'doc not in project'
)
return callback(new Errors.NotFoundError('document not found'))
}
if (docLines && version && !pathname) {
metrics.inc('pathname', 1, {
path: 'RedisManager.getDoc',
status: pathname === '' ? 'zero-length' : 'undefined',
})
}
callback(
null,
docLines,
version,
ranges,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
historyRangesSupport,
resolvedCommentIds
)
}
)
})
})
},
getDocVersion(docId, callback) {
rclient.mget(keys.docVersion({ doc_id: docId }), (error, result) => {
if (error) {
return callback(error)
}
let [version] = result || []
version = parseInt(version, 10)
callback(null, version)
})
},
getDocLines(docId, callback) {
rclient.get(keys.docLines({ doc_id: docId }), (error, docLines) => {
if (error) {
return callback(error)
}
callback(null, docLines)
})
},
getPreviousDocOps(docId, start, end, callback) {
const timer = new metrics.Timer('redis.get-prev-docops')
rclient.llen(keys.docOps({ doc_id: docId }), (error, length) => {
if (error) {
return callback(error)
}
rclient.get(keys.docVersion({ doc_id: docId }), (error, version) => {
if (error) {
return callback(error)
}
version = parseInt(version, 10)
const firstVersionInRedis = version - length
if (start < firstVersionInRedis || end > version) {
error = new Errors.OpRangeNotAvailableError(
'doc ops range is not loaded in redis',
{ firstVersionInRedis, version, ttlInS: RedisManager.DOC_OPS_TTL }
)
logger.debug(
{ err: error, docId, length, version, start, end },
'doc ops range is not loaded in redis'
)
return callback(error)
}
start = start - firstVersionInRedis
if (end > -1) {
end = end - firstVersionInRedis
}
if (isNaN(start) || isNaN(end)) {
error = new Error('inconsistent version or lengths')
logger.error(
{ err: error, docId, length, version, start, end },
'inconsistent version or length'
)
return callback(error)
}
rclient.lrange(
keys.docOps({ doc_id: docId }),
start,
end,
(error, jsonOps) => {
let ops
if (error) {
return callback(error)
}
try {
ops = jsonOps.map(jsonOp => JSON.parse(jsonOp))
} catch (e) {
return callback(e)
}
const timeSpan = timer.done()
if (timeSpan > MAX_REDIS_REQUEST_LENGTH) {
error = new Error('redis getPreviousDocOps exceeded timeout')
return callback(error)
}
callback(null, ops)
}
)
})
})
},
DOC_OPS_TTL: 60 * minutes,
DOC_OPS_MAX_LENGTH: 100,
updateDocument(
projectId,
docId,
docLines,
newVersion,
appliedOps,
ranges,
updateMeta,
callback
) {
if (appliedOps == null) {
appliedOps = []
}
RedisManager.getDocVersion(docId, (error, currentVersion) => {
if (error) {
return callback(error)
}
if (currentVersion + appliedOps.length !== newVersion) {
error = new Error(`Version mismatch. '${docId}' is corrupted.`)
logger.error(
{
err: error,
docId,
currentVersion,
newVersion,
opsLength: appliedOps.length,
},
'version mismatch'
)
return callback(error)
}
const jsonOps = appliedOps.map(op => JSON.stringify(op))
for (const op of jsonOps) {
if (op.indexOf('\u0000') !== -1) {
error = new Error('null bytes found in jsonOps')
// this check was added to catch memory corruption in JSON.stringify
logger.error({ err: error, docId, jsonOps }, error.message)
return callback(error)
}
}
const newDocLines = JSON.stringify(docLines)
if (newDocLines.indexOf('\u0000') !== -1) {
error = new Error('null bytes found in doc lines')
// this check was added to catch memory corruption in JSON.stringify
logger.error({ err: error, docId, newDocLines }, error.message)
return callback(error)
}
// Do an optimised size check on the docLines using the serialised
// length as an upper bound
const sizeBound = newDocLines.length
if (docIsTooLarge(sizeBound, docLines, Settings.max_doc_length)) {
const err = new Error('blocking doc update: doc is too large')
const docSize = newDocLines.length
logger.error({ projectId, docId, err, docSize }, err.message)
return callback(err)
}
const newHash = RedisManager._computeHash(newDocLines)
const opVersions = appliedOps.map(op => op?.v)
logger.debug(
{
docId,
version: newVersion,
hash: newHash,
opVersions,
},
'updating doc in redis'
)
// record bytes sent to redis in update
metrics.summary('redis.docLines', newDocLines.length, {
status: 'update',
})
RedisManager._serializeRanges(ranges, (error, ranges) => {
if (error) {
logger.error({ err: error, docId }, error.message)
return callback(error)
}
if (ranges && ranges.indexOf('\u0000') !== -1) {
error = new Error('null bytes found in ranges')
// this check was added to catch memory corruption in JSON.stringify
logger.error({ err: error, docId, ranges }, error.message)
return callback(error)
}
// Make sure that this MULTI operation only operates on doc
// specific keys, i.e. keys that have the doc id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
const multi = rclient.multi()
multi.mset({
[keys.docLines({ doc_id: docId })]: newDocLines,
[keys.docVersion({ doc_id: docId })]: newVersion,
[keys.docHash({ doc_id: docId })]: newHash,
[keys.ranges({ doc_id: docId })]: ranges,
[keys.lastUpdatedAt({ doc_id: docId })]: Date.now(),
[keys.lastUpdatedBy({ doc_id: docId })]:
updateMeta && updateMeta.user_id,
})
multi.ltrim(
keys.docOps({ doc_id: docId }),
-RedisManager.DOC_OPS_MAX_LENGTH,
-1
)
// push the ops last so their replies sit at the end of the MULTI result
if (jsonOps.length > 0) {
multi.rpush(keys.docOps({ doc_id: docId }), ...jsonOps)
// expire must come after rpush since before it will be a no-op if the list is empty
multi.expire(keys.docOps({ doc_id: docId }), RedisManager.DOC_OPS_TTL)
}
// Set the unflushed timestamp to the current time if not set ("NX" flag).
multi.set(keys.unflushedTime({ doc_id: docId }), Date.now(), 'NX')
multi.exec((error, result) => {
if (error) {
return callback(error)
}
callback()
})
})
})
},
renameDoc(projectId, docId, userId, update, projectHistoryId, callback) {
RedisManager.getDoc(projectId, docId, (error, lines, version) => {
if (error) {
return callback(error)
}
if (lines != null && version != null) {
if (!update.newPathname) {
logger.warn(
{ projectId, docId, update },
'missing pathname in RedisManager.renameDoc'
)
metrics.inc('pathname', 1, {
path: 'RedisManager.renameDoc',
status: update.newPathname === '' ? 'zero-length' : 'undefined',
})
}
rclient.set(
keys.pathname({ doc_id: docId }),
update.newPathname,
callback
)
} else {
callback()
}
})
},
clearUnflushedTime(docId, callback) {
rclient.del(keys.unflushedTime({ doc_id: docId }), callback)
},
updateCommentState(docId, commentId, resolved, callback) {
if (resolved) {
rclient.sadd(
keys.resolvedCommentIds({ doc_id: docId }),
commentId,
callback
)
} else {
rclient.srem(
keys.resolvedCommentIds({ doc_id: docId }),
commentId,
callback
)
}
},
getDocIdsInProject(projectId, callback) {
rclient.smembers(keys.docsInProject({ project_id: projectId }), callback)
},
/**
* Get lastupdatedat timestamps for an array of docIds
*/
getDocTimestamps(docIds, callback) {
async.mapSeries(
docIds,
(docId, cb) => rclient.get(keys.lastUpdatedAt({ doc_id: docId }), cb),
callback
)
},
/**
* Store the project id in a sorted set ordered by time with a random offset
* to smooth out spikes
*/
queueFlushAndDeleteProject(projectId, callback) {
const SMOOTHING_OFFSET =
Settings.smoothingOffset > 0
? Math.round(Settings.smoothingOffset * Math.random())
: 0
rclient.zadd(
keys.flushAndDeleteQueue(),
Date.now() + SMOOTHING_OFFSET,
projectId,
callback
)
},
/**
* Find the oldest queued flush that is before the cutoff time
*/
getNextProjectToFlushAndDelete(cutoffTime, callback) {
rclient.zrangebyscore(
keys.flushAndDeleteQueue(),
0,
cutoffTime,
'WITHSCORES',
'LIMIT',
0,
1,
(err, reply) => {
if (err) {
return callback(err)
}
// return if no projects ready to be processed
if (!reply || reply.length === 0) {
return callback()
}
// pop the oldest entry (get and remove in a multi)
const multi = rclient.multi()
// Poor man's version of ZPOPMIN, which is only available from Redis 5 onwards.
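// On Redis >= 5 this could likely be a single ZPOPMIN, with a separate
// ZCARD for the queue-length metric.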
multi.zrange(keys.flushAndDeleteQueue(), 0, 0, 'WITHSCORES')
multi.zremrangebyrank(keys.flushAndDeleteQueue(), 0, 0)
multi.zcard(keys.flushAndDeleteQueue()) // the total length of the queue (for metrics)
multi.exec((err, reply) => {
if (err) {
return callback(err)
}
if (!reply || reply.length === 0) {
return callback()
}
const [key, timestamp] = reply[0]
const queueLength = reply[2]
callback(null, key, timestamp, queueLength)
})
}
)
},
setHistoryRangesSupportFlag(docId, historyRangesSupport, callback) {
if (historyRangesSupport) {
rclient.sadd(keys.historyRangesSupport(), docId, callback)
} else {
rclient.srem(keys.historyRangesSupport(), docId, callback)
}
},
blockProject(projectId, callback) {
// Make sure that this MULTI operation only operates on project
// specific keys, i.e. keys that have the project id in curly braces.
// The curly braces identify a hash key for Redis and ensures that
// the MULTI's operations are all done on the same node in a
// cluster environment.
const multi = rclient.multi()
multi.setex(
keys.projectBlock({ project_id: projectId }),
PROJECT_BLOCK_TTL_SECS,
'1'
)
multi.scard(keys.docsInProject({ project_id: projectId }))
multi.exec((err, reply) => {
if (err) {
return callback(err)
}
const docsInProject = reply[1]
if (docsInProject > 0) {
// Too late to lock the project
rclient.del(keys.projectBlock({ project_id: projectId }), err => {
if (err) {
return callback(err)
}
callback(null, false)
})
} else {
callback(null, true)
}
})
},
unblockProject(projectId, callback) {
rclient.del(keys.projectBlock({ project_id: projectId }), (err, reply) => {
if (err) {
return callback(err)
}
const wasBlocked = reply === 1
callback(null, wasBlocked)
})
},
_serializeRanges(ranges, callback) {
let jsonRanges = JSON.stringify(ranges)
if (jsonRanges && jsonRanges.length > MAX_RANGES_SIZE) {
return callback(new Error('ranges are too large'))
}
if (jsonRanges === '{}') {
// Most docs will have empty ranges, so don't fill redis with lots of '{}' keys
jsonRanges = null
}
callback(null, jsonRanges)
},
_deserializeRanges(ranges) {
if (ranges == null || ranges === '') {
return {}
} else {
return JSON.parse(ranges)
}
},
_computeHash(docLines) {
// use sha1 checksum of doclines to detect data corruption.
//
// note: must specify 'utf8' encoding explicitly, as the default is
// binary in node < v5
return crypto.createHash('sha1').update(docLines, 'utf8').digest('hex')
},
}
module.exports = RedisManager
module.exports.promises = promisifyAll(RedisManager, {
without: ['_deserializeRanges', '_computeHash'],
multiResult: {
getDoc: [
'lines',
'version',
'ranges',
'pathname',
'projectHistoryId',
'unflushedTime',
'lastUpdatedAt',
'lastUpdatedBy',
'historyRangesSupport',
'resolvedCommentIds',
],
getNextProjectToFlushAndDelete: [
'projectId',
'flushTimestamp',
'queueLength',
],
},
})
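// Illustrative usage sketch, not production wiring. Since getDoc is listed in
// multiResult above, the promisified call resolves to an object keyed by those
// names. The ids are made up.
// eslint-disable-next-line no-unused-vars
async function exampleGetDoc() {
  const { lines, version, ranges, pathname } =
    await module.exports.promises.getDoc(
      '507f1f77bcf86cd799439011', // projectId (hypothetical)
      '507f191e810c19729de860ea' // docId (hypothetical)
    )
  return { lines, version, ranges, pathname }
}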

View File

@@ -0,0 +1,147 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let ShareJsDB
const logger = require('@overleaf/logger')
const Metrics = require('@overleaf/metrics')
const Keys = require('./UpdateKeys')
const RedisManager = require('./RedisManager')
const Errors = require('./Errors')
const TRANSFORM_UPDATES_COUNT_BUCKETS = [
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 75, 100,
// prepare buckets for full-project history/larger buffer experiments
150, 200, 300, 400,
]
module.exports = ShareJsDB = class ShareJsDB {
constructor(projectId, docId, lines, version) {
this.project_id = projectId
this.doc_id = docId
this.lines = lines
this.version = version
this.appliedOps = {}
// ShareJS calls this detached from the instance, so we need to
// bind it to keep the context that can access this.appliedOps
this.writeOp = this._writeOp.bind(this)
this.startTimeShareJsDB = performance.now()
}
getOps(docKey, start, end, callback) {
if (start === end || (start === this.version && end === null)) {
const status = 'is-up-to-date'
Metrics.inc('transform-updates', 1, {
status,
path: 'sharejs',
})
Metrics.histogram(
'transform-updates.count',
0,
TRANSFORM_UPDATES_COUNT_BUCKETS,
{ path: 'sharejs', status }
)
return callback(null, [])
}
// In redis, lrange values are inclusive.
if (end != null) {
end--
} else {
end = -1
}
const [projectId, docId] = Keys.splitProjectIdAndDocId(docKey)
const timer = new Metrics.Timer(
'transform-updates.timing',
1,
{ path: 'sharejs' },
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, 100, 200, 500, 1000]
)
RedisManager.getPreviousDocOps(docId, start, end, (err, ops) => {
let status
if (err) {
if (err instanceof Errors.OpRangeNotAvailableError) {
status = 'out-of-range'
} else {
status = 'error'
}
} else {
if (ops.length === 0) {
status = 'fetched-zero'
// The sharejs processing is happening under a lock.
// In case there are no other ops available, something bypassed the lock (or we overran it).
logger.warn(
{
projectId,
docId,
start,
end,
timeSinceShareJsDBInit:
performance.now() - this.startTimeShareJsDB,
},
'found zero docOps while transforming update'
)
} else {
status = 'fetched'
}
Metrics.histogram(
'transform-updates.count',
ops.length,
TRANSFORM_UPDATES_COUNT_BUCKETS,
{ path: 'sharejs', status }
)
}
timer.done({ status })
Metrics.inc('transform-updates', 1, { status, path: 'sharejs' })
callback(err, ops)
})
}
_writeOp(docKey, opData, callback) {
if (this.appliedOps[docKey] == null) {
this.appliedOps[docKey] = []
}
this.appliedOps[docKey].push(opData)
return callback()
}
getSnapshot(docKey, callback) {
if (
docKey !== Keys.combineProjectIdAndDocId(this.project_id, this.doc_id)
) {
return callback(
new Errors.NotFoundError(
`unexpected doc_key ${docKey}, expected ${Keys.combineProjectIdAndDocId(
this.project_id,
this.doc_id
)}`
)
)
} else {
return callback(null, {
snapshot: this.lines.join('\n'),
v: parseInt(this.version, 10),
type: 'text',
})
}
}
// To be able to remove a doc from the ShareJS memory
// we need to call Model::delete, which calls this
// method on the database. However, we will handle removing
// it from Redis ourselves
delete(docName, dbMeta, callback) {
return callback()
}
}
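// Illustrative sketch: the adapter is constructed per update with doc state
// already loaded from Redis, so getSnapshot simply hands that in-memory state
// back to ShareJS. Ids are made up.
// eslint-disable-next-line no-unused-vars
function exampleShareJsDBSnapshot() {
  const db = new ShareJsDB('project-id', 'doc-id', ['Hello', 'world'], 42)
  db.getSnapshot('project-id:doc-id', (err, data) => {
    if (err) throw err
    // data => { snapshot: 'Hello\nworld', v: 42, type: 'text' }
  })
}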

View File

@@ -0,0 +1,158 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const ShareJsModel = require('./sharejs/server/model')
const ShareJsDB = require('./ShareJsDB')
const logger = require('@overleaf/logger')
const Settings = require('@overleaf/settings')
const { promisifyAll } = require('@overleaf/promise-utils')
const Keys = require('./UpdateKeys')
const { EventEmitter } = require('node:events')
const util = require('node:util')
const RealTimeRedisManager = require('./RealTimeRedisManager')
const crypto = require('node:crypto')
const metrics = require('./Metrics')
const Errors = require('./Errors')
ShareJsModel.prototype = {}
util.inherits(ShareJsModel, EventEmitter)
const MAX_AGE_OF_OP = 80
const ShareJsUpdateManager = {
getNewShareJsModel(projectId, docId, lines, version) {
const db = new ShareJsDB(projectId, docId, lines, version)
const model = new ShareJsModel(db, {
maxDocLength: Settings.max_doc_length,
maximumAge: MAX_AGE_OF_OP,
})
model.db = db
return model
},
applyUpdate(projectId, docId, update, lines, version, callback) {
if (callback == null) {
callback = function () {}
}
logger.debug({ projectId, docId, update }, 'applying sharejs updates')
const jobs = []
// record the update version before it is modified
const incomingUpdateVersion = update.v
// We could use a global model for all docs, but we're hitting issues with the
// internal state of ShareJS not being accessible for clearing caches, and
// getting stuck due to queued callbacks (line 260 of sharejs/server/model.coffee)
// This adds a small but hopefully acceptable overhead (~12ms per 1000 updates on
// my 2009 MBP).
const model = this.getNewShareJsModel(projectId, docId, lines, version)
this._listenForOps(model)
const docKey = Keys.combineProjectIdAndDocId(projectId, docId)
return model.applyOp(docKey, update, function (error) {
if (error != null) {
if (error === 'Op already submitted') {
metrics.inc('sharejs.already-submitted')
logger.debug(
{ projectId, docId, update },
'op has already been submitted'
)
update.dup = true
ShareJsUpdateManager._sendOp(projectId, docId, update)
} else if (/^Delete component/.test(error)) {
metrics.inc('sharejs.delete-mismatch')
logger.debug(
{ projectId, docId, update, shareJsErr: error },
'sharejs delete does not match'
)
error = new Errors.DeleteMismatchError(
'Delete component does not match'
)
return callback(error)
} else {
metrics.inc('sharejs.other-error')
return callback(error)
}
}
logger.debug({ projectId, docId, error }, 'applied update')
return model.getSnapshot(docKey, (error, data) => {
if (error != null) {
return callback(error)
}
const docSizeAfter = data.snapshot.length
if (docSizeAfter > Settings.max_doc_length) {
const docSizeBefore = lines.join('\n').length
const err = new Error(
'blocking persistence of ShareJs update: doc size exceeds limits'
)
logger.error(
{ projectId, docId, err, docSizeBefore, docSizeAfter },
err.message
)
metrics.inc('sharejs.other-error')
const publicError = 'Update takes doc over max doc size'
return callback(publicError)
}
// only check hash when present and no other updates have been applied
if (update.hash != null && incomingUpdateVersion === version) {
const ourHash = ShareJsUpdateManager._computeHash(data.snapshot)
if (ourHash !== update.hash) {
metrics.inc('sharejs.hash-fail')
return callback(new Error('Invalid hash'))
} else {
metrics.inc('sharejs.hash-pass', 0.001)
}
}
const docLines = data.snapshot.split(/\r\n|\n|\r/)
return callback(
null,
docLines,
data.v,
model.db.appliedOps[docKey] || []
)
})
})
},
_listenForOps(model) {
return model.on('applyOp', function (docKey, opData) {
const [projectId, docId] = Keys.splitProjectIdAndDocId(docKey)
return ShareJsUpdateManager._sendOp(projectId, docId, opData)
})
},
_sendOp(projectId, docId, op) {
RealTimeRedisManager.sendData({
project_id: projectId,
doc_id: docId,
op,
})
RealTimeRedisManager.sendCanaryAppliedOp({
projectId,
docId,
op,
})
},
_computeHash(content) {
return crypto
.createHash('sha1')
.update('blob ' + content.length + '\x00')
.update(content, 'utf8')
.digest('hex')
},
}
module.exports = ShareJsUpdateManager
module.exports.promises = promisifyAll(ShareJsUpdateManager, {
without: ['getNewShareJsModel', '_listenForOps', '_sendOp', '_computeHash'],
multiResult: {
applyUpdate: ['updatedDocLines', 'version', 'appliedOps'],
},
})
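// Illustrative note: _computeHash produces a git-style blob hash, i.e.
// sha1('blob <length>\0<content>'), so for ASCII content it matches
// `git hash-object` of the same bytes. A client-side check could look like:
// eslint-disable-next-line no-unused-vars
function exampleHashMatches(update, snapshot) {
  return update.hash === ShareJsUpdateManager._computeHash(snapshot)
}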

View File

@@ -0,0 +1,83 @@
/* eslint-disable
no-return-assign,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const { promisifyAll } = require('@overleaf/promise-utils')
const { db, ObjectId } = require('./mongodb')
const SnapshotManager = {
recordSnapshot(projectId, docId, version, pathname, lines, ranges, callback) {
try {
projectId = new ObjectId(projectId)
docId = new ObjectId(docId)
} catch (error) {
return callback(error)
}
db.docSnapshots.insertOne(
{
project_id: projectId,
doc_id: docId,
version,
lines,
pathname,
ranges: SnapshotManager.jsonRangesToMongo(ranges),
ts: new Date(),
},
callback
)
},
// Suggested indexes:
// db.docSnapshots.createIndex({project_id:1, doc_id:1})
// db.docSnapshots.createIndex({ts:1},{expiresAfterSeconds: 30*24*3600)) # expires after 30 days
jsonRangesToMongo(ranges) {
if (ranges == null) {
return null
}
const updateMetadata = function (metadata) {
if ((metadata != null ? metadata.ts : undefined) != null) {
metadata.ts = new Date(metadata.ts)
}
if ((metadata != null ? metadata.user_id : undefined) != null) {
return (metadata.user_id = SnapshotManager._safeObjectId(
metadata.user_id
))
}
}
for (const change of Array.from(ranges.changes || [])) {
change.id = SnapshotManager._safeObjectId(change.id)
updateMetadata(change.metadata)
}
for (const comment of Array.from(ranges.comments || [])) {
comment.id = SnapshotManager._safeObjectId(comment.id)
if ((comment.op != null ? comment.op.t : undefined) != null) {
comment.op.t = SnapshotManager._safeObjectId(comment.op.t)
}
updateMetadata(comment.metadata)
}
return ranges
},
_safeObjectId(data) {
try {
return new ObjectId(data)
} catch (error) {
return data
}
},
}
module.exports = SnapshotManager
module.exports.promises = promisifyAll(SnapshotManager, {
without: ['jsonRangesToMongo', '_safeObjectId'],
})

View File

@@ -0,0 +1,10 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
module.exports = {
combineProjectIdAndDocId(projectId, docId) {
return `${projectId}:${docId}`
},
splitProjectIdAndDocId(projectAndDocId) {
return projectAndDocId.split(':')
},
}
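// Usage sketch (hypothetical ids, for illustration):
//   combineProjectIdAndDocId('aaa', 'bbb') // => 'aaa:bbb'
//   splitProjectIdAndDocId('aaa:bbb') // => ['aaa', 'bbb']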

View File

@@ -0,0 +1,378 @@
// @ts-check
const { callbackifyAll } = require('@overleaf/promise-utils')
const LockManager = require('./LockManager')
const RedisManager = require('./RedisManager')
const ProjectHistoryRedisManager = require('./ProjectHistoryRedisManager')
const RealTimeRedisManager = require('./RealTimeRedisManager')
const ShareJsUpdateManager = require('./ShareJsUpdateManager')
const HistoryManager = require('./HistoryManager')
const logger = require('@overleaf/logger')
const Metrics = require('./Metrics')
const Errors = require('./Errors')
const DocumentManager = require('./DocumentManager')
const RangesManager = require('./RangesManager')
const SnapshotManager = require('./SnapshotManager')
const Profiler = require('./Profiler')
const { isInsert, isDelete, getDocLength, computeDocHash } = require('./Utils')
/**
* @import { DeleteOp, InsertOp, Op, Ranges, Update, HistoryUpdate } from "./types"
*/
const UpdateManager = {
async processOutstandingUpdates(projectId, docId) {
const timer = new Metrics.Timer('updateManager.processOutstandingUpdates')
try {
await UpdateManager.fetchAndApplyUpdates(projectId, docId)
timer.done({ status: 'success' })
} catch (err) {
timer.done({ status: 'error' })
throw err
}
},
async processOutstandingUpdatesWithLock(projectId, docId) {
const profile = new Profiler('processOutstandingUpdatesWithLock', {
project_id: projectId,
doc_id: docId,
})
const lockValue = await LockManager.promises.tryLock(docId)
if (lockValue == null) {
return
}
profile.log('tryLock')
try {
await UpdateManager.processOutstandingUpdates(projectId, docId)
profile.log('processOutstandingUpdates')
} finally {
await LockManager.promises.releaseLock(docId, lockValue)
profile.log('releaseLock').end()
}
await UpdateManager.continueProcessingUpdatesWithLock(projectId, docId)
},
async continueProcessingUpdatesWithLock(projectId, docId) {
const length = await RealTimeRedisManager.promises.getUpdatesLength(docId)
if (length > 0) {
await UpdateManager.processOutstandingUpdatesWithLock(projectId, docId)
}
},
async fetchAndApplyUpdates(projectId, docId) {
const profile = new Profiler('fetchAndApplyUpdates', {
project_id: projectId,
doc_id: docId,
})
const updates =
await RealTimeRedisManager.promises.getPendingUpdatesForDoc(docId)
logger.debug(
{ projectId, docId, count: updates.length },
'processing updates'
)
if (updates.length === 0) {
return
}
profile.log('getPendingUpdatesForDoc')
for (const update of updates) {
await UpdateManager.applyUpdate(projectId, docId, update)
profile.log('applyUpdate')
}
profile.log('async done').end()
},
/**
* Apply an update to the given document
*
* @param {string} projectId
* @param {string} docId
* @param {Update} update
*/
async applyUpdate(projectId, docId, update) {
const profile = new Profiler('applyUpdate', {
project_id: projectId,
doc_id: docId,
})
UpdateManager._sanitizeUpdate(update)
profile.log('sanitizeUpdate', { sync: true })
try {
let {
lines,
version,
ranges,
pathname,
projectHistoryId,
historyRangesSupport,
} = await DocumentManager.promises.getDoc(projectId, docId)
profile.log('getDoc')
if (lines == null || version == null) {
throw new Errors.NotFoundError(`document not found: ${docId}`)
}
const previousVersion = version
const incomingUpdateVersion = update.v
let updatedDocLines, appliedOps
;({ updatedDocLines, version, appliedOps } =
await ShareJsUpdateManager.promises.applyUpdate(
projectId,
docId,
update,
lines,
version
))
profile.log('sharejs.applyUpdate', {
// only synchronous when the update applies directly to the
// doc version, otherwise getPreviousDocOps is called.
sync: incomingUpdateVersion === previousVersion,
})
const { newRanges, rangesWereCollapsed, historyUpdates } =
RangesManager.applyUpdate(
projectId,
docId,
ranges,
appliedOps,
updatedDocLines,
{ historyRangesSupport }
)
profile.log('RangesManager.applyUpdate', { sync: true })
await RedisManager.promises.updateDocument(
projectId,
docId,
updatedDocLines,
version,
appliedOps,
newRanges,
update.meta
)
profile.log('RedisManager.updateDocument')
UpdateManager._adjustHistoryUpdatesMetadata(
historyUpdates,
pathname,
projectHistoryId,
lines,
ranges,
updatedDocLines,
historyRangesSupport
)
if (historyUpdates.length > 0) {
Metrics.inc('history-queue', 1, { status: 'project-history' })
try {
const projectOpsLength =
await ProjectHistoryRedisManager.promises.queueOps(
projectId,
...historyUpdates.map(op => JSON.stringify(op))
)
HistoryManager.recordAndFlushHistoryOps(
projectId,
historyUpdates,
projectOpsLength
)
profile.log('recordAndFlushHistoryOps')
} catch (err) {
// The full project history can re-sync a project in case
// updates went missing.
// Just record the error here and acknowledge the write-op.
Metrics.inc('history-queue-error')
}
}
if (rangesWereCollapsed) {
Metrics.inc('doc-snapshot')
logger.debug(
{
projectId,
docId,
previousVersion,
lines,
ranges,
update,
},
'update collapsed some ranges, snapshotting previous content'
)
// Do this last, since it's a mongo call, and so potentially longest running
// If it overruns the lock, it's ok, since all of our redis work is done
await SnapshotManager.promises.recordSnapshot(
projectId,
docId,
previousVersion,
pathname,
lines,
ranges
)
}
} catch (error) {
RealTimeRedisManager.sendData({
project_id: projectId,
doc_id: docId,
error: error instanceof Error ? error.message : error,
})
profile.log('sendData')
throw error
} finally {
profile.end()
}
},
async lockUpdatesAndDo(method, projectId, docId, ...args) {
const profile = new Profiler('lockUpdatesAndDo', {
project_id: projectId,
doc_id: docId,
})
const lockValue = await LockManager.promises.getLock(docId)
profile.log('getLock')
let result
try {
await UpdateManager.processOutstandingUpdates(projectId, docId)
profile.log('processOutstandingUpdates')
result = await method(projectId, docId, ...args)
profile.log('method')
} finally {
await LockManager.promises.releaseLock(docId, lockValue)
profile.log('releaseLock').end()
}
// We held the lock for a while so updates might have queued up
UpdateManager.continueProcessingUpdatesWithLock(projectId, docId).catch(
err => {
// The processing may fail for invalid user updates.
// This can be very noisy, put them on level DEBUG
// and record a metric.
Metrics.inc('background-processing-updates-error')
logger.debug(
{ err, projectId, docId },
'error processing updates in background'
)
}
)
return result
},
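// Usage sketch for lockUpdatesAndDo (hypothetical caller): run a function
// under the doc lock, after draining any pending updates, e.g.
//   await UpdateManager.lockUpdatesAndDo(
//     DocumentManager.promises.getDoc, projectId, docId
//   )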
_sanitizeUpdate(update) {
// In Javascript, characters are 16-bits wide. It does not understand surrogates as characters.
//
// From Wikipedia (http://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane):
// "The High Surrogates (U+D800U+DBFF) and Low Surrogate (U+DC00U+DFFF) codes are reserved
// for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate
// and one Low Surrogate. A single surrogate code point will never be assigned a character.""
//
// The main offender seems to be \uD835 as a standalone character, which would be the first
// 16-bit character of a blackboard bold character (http://www.fileformat.info/info/unicode/char/1d400/index.htm).
// Something must be going on client side that is screwing up the encoding and splitting the
// two 16-bit characters so that \uD835 is standalone.
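//
// e.g. an insert op { i: 'x\uD835', p: 0 } is rewritten to { i: 'x\uFFFD', p: 0 }.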
for (const op of update.op || []) {
if (op.i != null) {
// Replace high and low surrogate characters with 'replacement character' (\uFFFD)
op.i = op.i.replace(/[\uD800-\uDFFF]/g, '\uFFFD')
}
}
return update
},
/**
* Add metadata that will be useful to project history
*
* @param {HistoryUpdate[]} updates
* @param {string} pathname
* @param {string} projectHistoryId
* @param {string[]} lines - document lines before updates were applied
* @param {Ranges} ranges - ranges before updates were applied
* @param {string[]} newLines - document lines after updates were applied
* @param {boolean} historyRangesSupport
*/
_adjustHistoryUpdatesMetadata(
updates,
pathname,
projectHistoryId,
lines,
ranges,
newLines,
historyRangesSupport
) {
let docLength = getDocLength(lines)
let historyDocLength = docLength
for (const change of ranges.changes ?? []) {
if ('d' in change.op) {
historyDocLength += change.op.d.length
}
}
for (const update of updates) {
update.projectHistoryId = projectHistoryId
if (!update.meta) {
update.meta = {}
}
update.meta.pathname = pathname
update.meta.doc_length = docLength
if (historyRangesSupport && historyDocLength !== docLength) {
update.meta.history_doc_length = historyDocLength
}
// Each update may contain multiple ops, i.e.
// [{
// ops: [{i: "foo", p: 4}, {d: "bar", p:8}]
// }, {
// ops: [{d: "baz", p: 40}, {i: "qux", p:8}]
// }]
// We want to include the doc_length at the start of each update,
// before its ops are applied. However, we need to track any
// changes to it for the next update.
for (const op of update.op) {
if (isInsert(op)) {
docLength += op.i.length
if (!op.trackedDeleteRejection) {
// Tracked delete rejections end up retaining characters rather
// than inserting
historyDocLength += op.i.length
}
}
if (isDelete(op)) {
docLength -= op.d.length
if (update.meta.tc) {
// This is a tracked delete. It will be translated into a retain in
// history, except any enclosed tracked inserts, which will be
// translated into regular deletes.
for (const change of op.trackedChanges ?? []) {
if (change.type === 'insert') {
historyDocLength -= change.length
}
}
} else {
// This is a regular delete. It will be translated to a delete in
// history.
historyDocLength -= op.d.length
}
}
}
if (!historyRangesSupport) {
// Prevent project-history from processing tracked changes
delete update.meta.tc
}
}
if (historyRangesSupport && updates.length > 0) {
const lastUpdate = updates[updates.length - 1]
lastUpdate.meta ??= {}
lastUpdate.meta.doc_hash = computeDocHash(newLines)
}
},
}
module.exports = { ...callbackifyAll(UpdateManager), promises: UpdateManager }

View File

@@ -0,0 +1,129 @@
// @ts-check
const { createHash } = require('node:crypto')
const _ = require('lodash')
/**
* @import { CommentOp, DeleteOp, InsertOp, Op, TrackedChange } from './types'
*/
/**
* Returns true if the op is an insert
*
* @param {Op} op
* @returns {op is InsertOp}
*/
function isInsert(op) {
return 'i' in op && op.i != null
}
/**
* Returns true if the op is an insert
*
* @param {Op} op
* @returns {op is DeleteOp}
*/
function isDelete(op) {
return 'd' in op && op.d != null
}
/**
* Returns true if the op is a comment
*
* @param {Op} op
* @returns {op is CommentOp}
*/
function isComment(op) {
return 'c' in op && op.c != null
}
/**
* Get the length of a document from its lines
*
* @param {string[]} lines
* @returns {number}
*/
function getDocLength(lines) {
let docLength = _.reduce(lines, (chars, line) => chars + line.length, 0)
// Add newline characters. Lines are joined by newlines, but the last line
// doesn't include a newline. We must make a special case for an empty list
// so that it doesn't report a doc length of -1.
docLength += Math.max(lines.length - 1, 0)
return docLength
}
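// e.g. getDocLength(['ab', 'c']) === 4 ('ab' + '\n' + 'c'), and getDocLength([]) === 0.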
/**
* Adds given tracked deletes to the given content.
*
* The history system includes tracked deletes in the document content.
*
* @param {string} content
* @param {TrackedChange[]} trackedChanges
* @return {string} content for the history service
*/
function addTrackedDeletesToContent(content, trackedChanges) {
let cursor = 0
let result = ''
for (const change of trackedChanges) {
if (isDelete(change.op)) {
// Add the content before the tracked delete
result += content.slice(cursor, change.op.p)
cursor = change.op.p
// Add the content of the tracked delete
result += change.op.d
}
}
// Add the content after all tracked deletes
result += content.slice(cursor)
return result
}
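// e.g. with content 'ace' and one tracked delete { op: { p: 1, d: 'b' } } (minimal
// shape for illustration), the history content becomes 'abce'.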
/**
* Compute the content hash for a doc
*
* This hash is sent to the history to validate updates.
*
* @param {string[]} lines
* @return {string} the doc hash
*/
function computeDocHash(lines) {
const hash = createHash('sha1')
if (lines.length > 0) {
for (const line of lines.slice(0, lines.length - 1)) {
hash.update(line)
hash.update('\n')
}
// The last line doesn't end with a newline
hash.update(lines[lines.length - 1])
}
return hash.digest('hex')
}
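// Equivalent to hashing lines.join('\n'), e.g.
//   computeDocHash(['foo', 'bar']) === createHash('sha1').update('foo\nbar').digest('hex')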
/**
* checks if the given originOrSource should be treated as a source or origin
* TODO: remove this hack and remove all "source" references
*/
function extractOriginOrSource(originOrSource) {
let source = null
let origin = null
if (typeof originOrSource === 'string') {
source = originOrSource
} else if (originOrSource && typeof originOrSource === 'object') {
origin = originOrSource
}
return { source, origin }
}
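// e.g. extractOriginOrSource('editor') => { source: 'editor', origin: null }, while
// extractOriginOrSource({ kind: 'history-resync' }) => { source: null, origin: { kind: 'history-resync' } }
// (both argument values are hypothetical).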
module.exports = {
isInsert,
isDelete,
isComment,
addTrackedDeletesToContent,
getDocLength,
computeDocHash,
extractOriginOrSource,
}

View File

@@ -0,0 +1,28 @@
const Metrics = require('@overleaf/metrics')
const Settings = require('@overleaf/settings')
const { MongoClient, ObjectId } = require('mongodb-legacy')
const mongoClient = new MongoClient(Settings.mongo.url, Settings.mongo.options)
const mongoDb = mongoClient.db()
const db = {
docs: mongoDb.collection('docs'),
docSnapshots: mongoDb.collection('docSnapshots'),
projects: mongoDb.collection('projects'),
}
async function healthCheck() {
const res = await mongoDb.command({ ping: 1 })
if (!res.ok) {
throw new Error('failed mongo ping')
}
}
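// Usage sketch: the exported healthCheck is callback style, e.g.
//   require('./mongodb').healthCheck(err => { if (err) { /* mongo ping failed */ } })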
Metrics.mongodb.monitor(mongoClient)
module.exports = {
db,
ObjectId,
mongoClient,
healthCheck: require('node:util').callbackify(healthCheck),
}

View File

@@ -0,0 +1,22 @@
Licensed under the standard MIT license:
Copyright 2011 Joseph Gentle.
Copyright 2012-2024 Overleaf.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@@ -0,0 +1,6 @@
This folder contains a modified version of the ShareJS source code, forked from [v0.5.0](https://github.com/josephg/ShareJS/tree/v0.5.0/).
The original CoffeeScript code has been decaffeinated to JavaScript, and further modified. Some folders have been removed. See https://github.com/josephg/ShareJS/blob/v0.5.0/src/types/README.md for the original README.
The original code, and the current modified code in this directory, are published under the MIT license.

View File

@@ -0,0 +1,895 @@
/* eslint-disable
no-console,
no-return-assign,
n/no-callback-literal,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS103: Rewrite code to no longer use __guard__
* DS104: Avoid inline assignments
* DS204: Change includes calls to have a more natural evaluation order
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// The model of all the ops. Responsible for applying & transforming remote deltas
// and managing the storage layer.
//
// Actual storage is handled by the database wrappers in db/*, wrapped by DocCache
let Model
const { EventEmitter } = require('node:events')
const queue = require('./syncqueue')
const types = require('../types')
const Profiler = require('../../Profiler')
const isArray = o => Object.prototype.toString.call(o) === '[object Array]'
// This constructor creates a new Model object. There will be one model object
// per server context.
//
// The model object is responsible for a lot of things:
//
// - It manages the interactions with the database
// - It maintains (in memory) a set of all active documents
// - It calls out to the OT functions when necessary
//
// The model is an event emitter. It emits the following events:
//
// create(docName, data): A document has been created with the specified name & data
module.exports = Model = function (db, options) {
// db can be null if the user doesn't want persistence.
let getOps
if (!(this instanceof Model)) {
return new Model(db, options)
}
const model = this
if (options == null) {
options = {}
}
// This is a cache of 'live' documents.
//
// The cache is a map from docName -> {
// ops:[{op, meta}]
// snapshot
// type
// v
// meta
// eventEmitter
// reapTimer
// committedVersion: v
// snapshotWriteLock: bool to make sure writeSnapshot isn't re-entrant
// dbMeta: database specific data
// opQueue: syncQueue for processing ops
// }
//
// The ops list contains the document's last options.numCachedOps ops. (Or all
// of them if we're using a memory store).
//
// Documents are stored in this set so long as the document has been accessed in
// the last few seconds (options.reapTime) OR at least one client has the document
// open. I don't know if I should keep open (but not being edited) documents live -
// maybe if a client has a document open but the document isn't being edited, I should
// flush it from the cache.
//
// In any case, the API to model is designed such that if we want to change that later
// it should be pretty easy to do so without any external-to-the-model code changes.
const docs = {}
// This is a map from docName -> [callback]. It is used when a document hasn't been
// cached and multiple getSnapshot() / getVersion() requests come in. All requests
// are added to the callback list and called when db.getSnapshot() returns.
//
// callback(error, snapshot data)
const awaitingGetSnapshot = {}
// The time that documents which no clients have open will stay in the cache.
// Should be > 0.
if (options.reapTime == null) {
options.reapTime = 3000
}
// The number of operations the cache holds before reusing the space
if (options.numCachedOps == null) {
options.numCachedOps = 10
}
// This option forces documents to be reaped, even when there's no database backend.
// This is useful when you don't care about persistence and don't want to gradually
// fill memory.
//
// You might want to set reapTime to a day or something.
if (options.forceReaping == null) {
options.forceReaping = false
}
// Until I come up with a better strategy, we'll save a copy of the document snapshot
// to the database every ~20 submitted ops.
if (options.opsBeforeCommit == null) {
options.opsBeforeCommit = 20
}
// It takes some processing time to transform client ops. The server will punt ops back to the
// client to transform if they're too old.
if (options.maximumAge == null) {
options.maximumAge = 40
}
// **** Cache API methods
// It's important that all ops are applied in order. This helper method creates the op submission queue
// for a single document. This contains the logic for transforming & applying ops.
const makeOpQueue = (docName, doc) =>
queue(function (opData, callback) {
if (!(opData.v >= 0)) {
return callback('Version missing')
}
if (opData.v > doc.v) {
return callback('Op at future version')
}
// Punt the transforming work back to the client if the op is too old.
if (opData.v + options.maximumAge < doc.v) {
return callback('Op too old')
}
if (!opData.meta) {
opData.meta = {}
}
opData.meta.ts = Date.now()
// We'll need to transform the op to the current version of the document. This
// calls the callback immediately if opVersion == doc.v.
return getOps(docName, opData.v, doc.v, function (error, ops) {
let snapshot
if (error) {
return callback(error)
}
if (doc.v - opData.v !== ops.length) {
// This should never happen. It indicates that we didn't get all the ops we
// asked for. It's important that the submitted op is correctly transformed.
console.error(
`Could not get old ops in model for document ${docName}`
)
console.error(
`Expected ops ${opData.v} to ${doc.v} and got ${ops.length} ops`
)
return callback('Internal error')
}
if (ops.length > 0) {
try {
const profile = new Profiler('model.transform')
// If there's enough ops, it might be worth spinning this out into a webworker thread.
for (const oldOp of Array.from(ops)) {
// Dup detection works by sending the id(s) the op has been submitted with previously.
// If the id matches, we reject it. The client can also detect the op has been submitted
// already if it sees its own previous id in the ops it sees when it does catchup.
if (
oldOp.meta.source &&
opData.dupIfSource &&
Array.from(opData.dupIfSource).includes(oldOp.meta.source)
) {
return callback('Op already submitted')
}
opData.op = doc.type.transform(opData.op, oldOp.op, 'left')
opData.v++
}
profile.log('transform', { sync: true }).end()
} catch (error1) {
error = error1
return callback(error.message)
}
}
try {
const profile = new Profiler('model.apply')
snapshot = doc.type.apply(doc.snapshot, opData.op)
profile.log('model.apply', { sync: true }).end()
} catch (error2) {
error = error2
return callback(error.message)
}
if (
options.maxDocLength != null &&
doc.snapshot.length > options.maxDocLength
) {
return callback('Update takes doc over max doc size')
}
// The op data should be at the current version, and the new document data should be at
// the next version.
//
// This should never happen in practice, but it's a nice little check to make sure everything
// is hunky-dory.
if (opData.v !== doc.v) {
// This should never happen.
console.error(
'Version mismatch detected in model. File a ticket - this is a bug.'
)
console.error(`Expecting ${opData.v} == ${doc.v}`)
return callback('Internal error')
}
// newDocData = {snapshot, type:type.name, v:opVersion + 1, meta:docData.meta}
const writeOp =
(db != null ? db.writeOp : undefined) ||
((docName, newOpData, callback) => callback())
return writeOp(docName, opData, function (error) {
if (error) {
// The user should probably know about this.
console.warn(`Error writing ops to database: ${error}`)
return callback(error)
}
__guardMethod__(options.stats, 'writeOp', o => o.writeOp())
// This is needed when we emit the 'change' event, below.
const oldSnapshot = doc.snapshot
// All the heavy lifting is now done. Finally, we'll update the cache with the new data
// and (maybe!) save a new document snapshot to the database.
doc.v = opData.v + 1
doc.snapshot = snapshot
doc.ops.push(opData)
if (db && doc.ops.length > options.numCachedOps) {
doc.ops.shift()
}
model.emit('applyOp', docName, opData, snapshot, oldSnapshot)
doc.eventEmitter.emit('op', opData, snapshot, oldSnapshot)
// The callback is called with the version of the document at which the op was applied.
// This is the op.v after transformation, and it's doc.v - 1.
callback(null, opData.v)
// I need a decent strategy here for deciding whether or not to save the snapshot.
//
// The 'right' strategy looks something like "Store the snapshot whenever the snapshot
// is smaller than the accumulated op data". For now, I'll just store it every 20
// ops or something. (Configurable with doc.committedVersion)
if (
!doc.snapshotWriteLock &&
doc.committedVersion + options.opsBeforeCommit <= doc.v
) {
return tryWriteSnapshot(docName, function (error) {
if (error) {
return console.warn(
`Error writing snapshot ${error}. This is nonfatal`
)
}
})
}
})
})
})
// Add the data for the given docName to the cache. The named document shouldn't already
// exist in the doc set.
//
// Returns the new doc.
const add = function (docName, error, data, committedVersion, ops, dbMeta) {
let callback, doc
const callbacks = awaitingGetSnapshot[docName]
delete awaitingGetSnapshot[docName]
if (error) {
if (callbacks) {
for (callback of Array.from(callbacks)) {
callback(error)
}
}
} else {
doc = docs[docName] = {
snapshot: data.snapshot,
v: data.v,
type: data.type,
meta: data.meta,
// Cache of ops
ops: ops || [],
eventEmitter: new EventEmitter(),
// Timer before the document will be invalidated from the cache (if the document has no
// listeners)
reapTimer: null,
// Version of the snapshot that's in the database
committedVersion: committedVersion != null ? committedVersion : data.v,
snapshotWriteLock: false,
dbMeta,
}
doc.opQueue = makeOpQueue(docName, doc)
refreshReapingTimeout(docName)
model.emit('add', docName, data)
if (callbacks) {
for (callback of Array.from(callbacks)) {
callback(null, doc)
}
}
}
return doc
}
// This is a little helper wrapper around db.getOps. It does two things:
//
// - If there's no database set, it returns an error to the callback
// - It adds version numbers to each op returned from the database
// (These can be inferred from context so the DB doesn't store them, but it's useful to have them).
const getOpsInternal = function (docName, start, end, callback) {
if (!db) {
return typeof callback === 'function'
? callback('Document does not exist')
: undefined
}
return db.getOps(docName, start, end, function (error, ops) {
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
let v = start
for (const op of Array.from(ops)) {
op.v = v++
}
return typeof callback === 'function' ? callback(null, ops) : undefined
})
}
// Load the named document into the cache. This function is re-entrant.
//
// The callback is called with (error, doc)
const load = function (docName, callback) {
if (docs[docName]) {
// The document is already loaded. Return immediately.
__guardMethod__(options.stats, 'cacheHit', o => o.cacheHit('getSnapshot'))
return callback(null, docs[docName])
}
// We're a memory store. If we don't have it, nobody does.
if (!db) {
return callback('Document does not exist')
}
const callbacks = awaitingGetSnapshot[docName]
// The document is being loaded already. Add ourselves as a callback.
if (callbacks) {
return callbacks.push(callback)
}
__guardMethod__(options.stats, 'cacheMiss', o1 =>
o1.cacheMiss('getSnapshot')
)
// The document isn't loaded and isn't being loaded. Load it.
awaitingGetSnapshot[docName] = [callback]
return db.getSnapshot(docName, function (error, data, dbMeta) {
if (error) {
return add(docName, error)
}
const type = types[data.type]
if (!type) {
console.warn(`Type '${data.type}' missing`)
return callback('Type not found')
}
data.type = type
const committedVersion = data.v
// The server can close without saving the most recent document snapshot.
// In this case, there are extra ops which need to be applied before
// returning the snapshot.
return getOpsInternal(docName, data.v, null, function (error, ops) {
if (error) {
return callback(error)
}
if (ops.length > 0) {
console.log(`Catchup ${docName} ${data.v} -> ${data.v + ops.length}`)
try {
for (const op of Array.from(ops)) {
data.snapshot = type.apply(data.snapshot, op.op)
data.v++
}
} catch (e) {
// This should never happen - it indicates that whats in the
// database is invalid.
console.error(`Op data invalid for ${docName}: ${e.stack}`)
return callback('Op data invalid')
}
}
model.emit('load', docName, data)
return add(docName, error, data, committedVersion, ops, dbMeta)
})
})
}
// This makes sure the cache contains a document. If the doc cache doesn't contain
// a document, it is loaded from the database and stored.
//
// Documents are stored so long as either:
// - They have been accessed within the past #{PERIOD}
// - At least one client has the document open
function refreshReapingTimeout(docName) {
const doc = docs[docName]
if (!doc) {
return
}
// I want to let the clients list be updated before this is called.
return process.nextTick(function () {
// This is an awkward way to find out the number of clients on a document. If this
// causes performance issues, add a numClients field to the document.
//
// The first check is because it's possible that between refreshReapingTimeout being called and this
// event being fired, someone called delete() on the document and hence the doc is something else now.
if (
doc === docs[docName] &&
doc.eventEmitter.listeners('op').length === 0 &&
(db || options.forceReaping) &&
doc.opQueue.busy === false
) {
let reapTimer
clearTimeout(doc.reapTimer)
return (doc.reapTimer = reapTimer =
setTimeout(
() =>
tryWriteSnapshot(docName, function () {
// If the reaping timeout has been refreshed while we're writing the snapshot, or if we're
// in the middle of applying an operation, don't reap.
if (
docs[docName].reapTimer === reapTimer &&
doc.opQueue.busy === false
) {
return delete docs[docName]
}
}),
options.reapTime
))
}
})
}
function tryWriteSnapshot(docName, callback) {
if (!db) {
return typeof callback === 'function' ? callback() : undefined
}
const doc = docs[docName]
// The doc is closed
if (!doc) {
return typeof callback === 'function' ? callback() : undefined
}
// The document is already saved.
if (doc.committedVersion === doc.v) {
return typeof callback === 'function' ? callback() : undefined
}
if (doc.snapshotWriteLock) {
return typeof callback === 'function'
? callback('Another snapshot write is in progress')
: undefined
}
doc.snapshotWriteLock = true
__guardMethod__(options.stats, 'writeSnapshot', o => o.writeSnapshot())
const writeSnapshot =
(db != null ? db.writeSnapshot : undefined) ||
((docName, docData, dbMeta, callback) => callback())
const data = {
v: doc.v,
meta: doc.meta,
snapshot: doc.snapshot,
// The database doesn't know about object types.
type: doc.type.name,
}
// Commit snapshot.
return writeSnapshot(docName, data, doc.dbMeta, function (error, dbMeta) {
doc.snapshotWriteLock = false
// We have to use data.v here because the version in the doc could
// have been updated between the call to writeSnapshot() and now.
doc.committedVersion = data.v
doc.dbMeta = dbMeta
return typeof callback === 'function' ? callback(error) : undefined
})
}
// *** Model interface methods
// Create a new document.
//
// data should be {snapshot, type, [meta]}. The version of a new document is 0.
this.create = function (docName, type, meta, callback) {
if (typeof meta === 'function') {
;[meta, callback] = Array.from([{}, meta])
}
if (docName.match(/\//)) {
return typeof callback === 'function'
? callback('Invalid document name')
: undefined
}
if (docs[docName]) {
return typeof callback === 'function'
? callback('Document already exists')
: undefined
}
if (typeof type === 'string') {
type = types[type]
}
if (!type) {
return typeof callback === 'function'
? callback('Type not found')
: undefined
}
const data = {
snapshot: type.create(),
type: type.name,
meta: meta || {},
v: 0,
}
const done = function (error, dbMeta) {
// dbMeta can be used to cache extra state needed by the database to access the document, like an ID or something.
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
// From here on we'll store the object version of the type name.
data.type = type
add(docName, null, data, 0, [], dbMeta)
model.emit('create', docName, data)
return typeof callback === 'function' ? callback() : undefined
}
if (db) {
return db.create(docName, data, done)
} else {
return done()
}
}
// Permanently deletes the specified document.
// If listeners are attached, they are removed.
//
// The callback is called with (error) if there was an error. If error is null / undefined, the
// document was deleted.
//
// WARNING: This isn't well supported throughout the code. (Eg, streaming clients aren't told about the
// deletion. Subsequent op submissions will fail).
this.delete = function (docName, callback) {
const doc = docs[docName]
if (doc) {
clearTimeout(doc.reapTimer)
delete docs[docName]
}
const done = function (error) {
if (!error) {
model.emit('delete', docName)
}
return typeof callback === 'function' ? callback(error) : undefined
}
if (db) {
return db.delete(docName, doc != null ? doc.dbMeta : undefined, done)
} else {
return done(!doc ? 'Document does not exist' : undefined)
}
}
// This gets all operations from [start...end]. (That is, it's not inclusive.)
//
// end can be null. This means 'get me all ops from start'.
//
// Each op returned is in the form {op:o, meta:m, v:version}.
//
// Callback is called with (error, [ops])
//
// If the document does not exist, getOps doesn't necessarily return an error. This is because
// it's awkward to figure out whether or not the document exists for things
// like the redis database backend. I guess it's a bit gross having this inconsistent
// with the other DB calls, but it's certainly convenient.
//
// Use getVersion() to determine if a document actually exists, if that's what you're
// after.
this.getOps = getOps = function (docName, start, end, callback) {
// getOps will only use the op cache if it's there. It won't fill the op cache in.
if (!(start >= 0)) {
throw new Error('start must be 0+')
}
if (typeof end === 'function') {
;[end, callback] = Array.from([null, end])
}
const ops = docs[docName] != null ? docs[docName].ops : undefined
if (ops) {
const version = docs[docName].v
// Ops contains an array of ops. The last op in the list is the last op applied
if (end == null) {
end = version
}
start = Math.min(start, end)
if (start === end) {
return callback(null, [])
}
// Base is the version number of the oldest op we have cached
const base = version - ops.length
// If the database is null, we'll trim to the ops we do have and hope that's enough.
if (start >= base || db === null) {
refreshReapingTimeout(docName)
if (options.stats != null) {
options.stats.cacheHit('getOps')
}
return callback(null, ops.slice(start - base, end - base))
}
}
if (options.stats != null) {
options.stats.cacheMiss('getOps')
}
return getOpsInternal(docName, start, end, callback)
}
// Gets the snapshot data for the specified document.
// getSnapshot(docName, callback)
// Callback is called with (error, {v: <version>, type: <type>, snapshot: <snapshot>, meta: <meta>})
this.getSnapshot = (docName, callback) =>
load(docName, (error, doc) =>
callback(
error,
doc
? { v: doc.v, type: doc.type, snapshot: doc.snapshot, meta: doc.meta }
: undefined
)
)
// Gets the latest version # of the document.
// getVersion(docName, callback)
// callback is called with (error, version).
this.getVersion = (docName, callback) =>
load(docName, (error, doc) =>
callback(error, doc != null ? doc.v : undefined)
)
// Apply an op to the specified document.
// The callback is passed (error, applied version #)
// opData = {op:op, v:v, meta:metadata}
//
// Ops are queued before being applied so that the following code applies op C before op B:
//   model.applyOp('doc', opA, () => model.applyOp('doc', opB))
//   model.applyOp('doc', opC)
this.applyOp = (
docName,
opData,
callback // All the logic for this is in makeOpQueue, above.
) =>
load(docName, function (error, doc) {
if (error) {
return callback(error)
}
return process.nextTick(() =>
doc.opQueue(opData, function (error, newVersion) {
refreshReapingTimeout(docName)
return typeof callback === 'function'
? callback(error, newVersion)
: undefined
})
)
})
// TODO: store (some) metadata in DB
// TODO: op and meta should be combineable in the op that gets sent
this.applyMetaOp = function (docName, metaOpData, callback) {
const { path, value } = metaOpData.meta
if (!isArray(path)) {
return typeof callback === 'function'
? callback('path should be an array')
: undefined
}
return load(docName, function (error, doc) {
if (error != null) {
return typeof callback === 'function' ? callback(error) : undefined
} else {
let applied = false
switch (path[0]) {
case 'shout':
doc.eventEmitter.emit('op', metaOpData)
applied = true
break
}
if (applied) {
model.emit('applyMetaOp', docName, path, value)
}
return typeof callback === 'function'
? callback(null, doc.v)
: undefined
}
})
}
// Listen to all ops from the specified version. If version is in the past, all
// ops since that version are sent immediately to the listener.
//
// The callback is called once the listener is attached, but before any ops have been passed
// to the listener.
//
// This will _not_ edit the document metadata.
//
// If there are any listeners, we don't purge the document from the cache. But be aware, this behaviour
// might change in a future version.
//
// version is the document version at which the document is opened. It can be left out if you want to open
// the document at the most recent version.
//
// listener is called with (opData) each time an op is applied.
//
// callback(error, openedVersion)
this.listen = function (docName, version, listener, callback) {
if (typeof version === 'function') {
;[version, listener, callback] = Array.from([null, version, listener])
}
return load(docName, function (error, doc) {
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
clearTimeout(doc.reapTimer)
if (version != null) {
return getOps(docName, version, null, function (error, data) {
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
doc.eventEmitter.on('op', listener)
if (typeof callback === 'function') {
callback(null, version)
}
return (() => {
const result = []
for (const op of Array.from(data)) {
let needle
listener(op)
// The listener may well remove itself during the catchup phase. If this happens, break early.
// This is done in a quite inefficient way. (O(n) where n = #listeners on doc)
if (
((needle = listener),
!Array.from(doc.eventEmitter.listeners('op')).includes(needle))
) {
break
} else {
result.push(undefined)
}
}
return result
})()
})
} else {
// Version is null / undefined. Just add the listener.
doc.eventEmitter.on('op', listener)
return typeof callback === 'function'
? callback(null, doc.v)
: undefined
}
})
}
// Remove a listener for a particular document.
//
// removeListener(docName, listener)
//
// This is synchronous.
this.removeListener = function (docName, listener) {
// The document should already be loaded.
const doc = docs[docName]
if (!doc) {
throw new Error('removeListener called but document not loaded')
}
doc.eventEmitter.removeListener('op', listener)
return refreshReapingTimeout(docName)
}
// Flush saves all snapshot data to the database. I'm not sure whether or not this is actually needed -
// sharejs will happily replay uncommitted ops when documents are re-opened anyway.
this.flush = function (callback) {
if (!db) {
return typeof callback === 'function' ? callback() : undefined
}
let pendingWrites = 0
for (const docName in docs) {
const doc = docs[docName]
if (doc.committedVersion < doc.v) {
pendingWrites++
// I'm hoping writeSnapshot will always happen in another thread.
tryWriteSnapshot(docName, () =>
process.nextTick(function () {
pendingWrites--
if (pendingWrites === 0) {
return typeof callback === 'function' ? callback() : undefined
}
})
)
}
}
// If nothing was queued, terminate immediately.
if (pendingWrites === 0) {
return typeof callback === 'function' ? callback() : undefined
}
}
// Close the database connection. This is needed so nodejs can shut down cleanly.
this.closeDb = function () {
__guardMethod__(db, 'close', o => o.close())
return (db = null)
}
}
// Model inherits from EventEmitter.
Model.prototype = new EventEmitter()
function __guardMethod__(obj, methodName, transform) {
if (
typeof obj !== 'undefined' &&
obj !== null &&
typeof obj[methodName] === 'function'
) {
return transform(obj, methodName)
} else {
return undefined
}
}

View File

@@ -0,0 +1,60 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// A synchronous processing queue. The queue calls process on the arguments,
// ensuring that process() is only executing once at a time.
//
// process(data, callback) _MUST_ eventually call its callback.
//
// Example:
//
//   const queue = require('./syncqueue')
//
//   const fn = queue((data, callback) =>
//     asyncthing(data, () => callback(321))
//   )
//
//   fn(1)
//   fn(2)
//   fn(3, result => console.log(result))
//
// ^--- async thing will only be running once at any time.
module.exports = function (process) {
if (typeof process !== 'function') {
throw new Error('process is not a function')
}
const queue = []
const enqueue = function (data, callback) {
queue.push([data, callback])
return flush()
}
enqueue.busy = false
function flush() {
if (enqueue.busy || queue.length === 0) {
return
}
enqueue.busy = true
const [data, callback] = Array.from(queue.shift())
return process(data, function (...result) {
// TODO: Make this not use varargs - varargs are really slow.
enqueue.busy = false
// This is called after busy = false so a user can check if enqueue.busy is set in the callback.
if (callback) {
callback.apply(null, result)
}
return flush()
})
}
return enqueue
}

View File

@@ -0,0 +1,48 @@
This directory contains all the operational transform code. Each file defines a type.
Most of the types in here are for testing or demonstration. The only types which are sent to the webclient
are `text` and `json`.
# An OT type
All OT types have the following fields:
`name`: _(string)_ Name of the type. Should match the filename.
`create() -> snapshot`: Function which creates and returns a new document snapshot
`apply(snapshot, op) -> snapshot`: A function which creates a new document snapshot with the op applied
`transform(op1, op2, side) -> op1'`: OT transform function.
Given op1 and op2, `apply(apply(s, op1), transform(op2, op1, 'right')) == apply(apply(s, op2), transform(op1, op2, 'left'))`.
Transform and apply must never modify their arguments.
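As a worked example with the trivial `count` type below (ops are `[expectedSnapshot, increment]`, and `side` is ignored): starting from snapshot `2`, `apply(apply(2, [2,3]), transform([2,5], [2,3], 'right'))` and `apply(apply(2, [2,5]), transform([2,3], [2,5], 'left'))` both yield `10`.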
Optional properties:
`tp2`: _(bool)_ True if the transform function supports TP2. This allows p2p architectures to work.
`compose(op1, op2) -> op`: Create and return a new op which has the same effect as op1 + op2.
`serialize(snapshot) -> JSON object`: Serialize a document to something we can JSON.stringify()
`deserialize(object) -> snapshot`: Deserialize a JSON object into the document's internal snapshot format
`prune(op1', op2, side) -> op1`: Inverse transform function. Only required for TP2 types.
`normalize(op) -> op`: Fix up an op to make it valid. Eg, remove skips of size zero.
`api`: _(object)_ Set of helper methods which will be mixed in to the client document object for manipulating documents. See below.
# Examples
`count` and `simple` are two trivial OT type definitions if you want to take a look. JSON defines
the ot-for-JSON type (see the wiki for documentation) and all the text types define different text
implementations. (I still have no idea which one I like the most, and they're fun to write!)
# API
Types can also define API functions. These methods are mixed into the client's Doc object when a document is created.
You can use them to help construct ops programmatically (so users don't need to understand how ops are structured).
For example, the three text types defined here (text, text-composable and text-tp2) all provide the text API, supplying
`.insert()`, `.del()`, `.getLength` and `.getText` methods.
See text-api.js for an example.

View File

@@ -0,0 +1,37 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// This is a simple type used for testing other OT code. Each op is [expectedSnapshot, increment]
exports.name = 'count'
exports.create = () => 1
exports.apply = function (snapshot, op) {
const [v, inc] = Array.from(op)
if (snapshot !== v) {
throw new Error(`Op ${v} != snapshot ${snapshot}`)
}
return snapshot + inc
}
// transform op1 by op2. Return transformed version of op1.
exports.transform = function (op1, op2) {
if (op1[0] !== op2[0]) {
throw new Error(`Op1 ${op1[0]} != op2 ${op2[0]}`)
}
return [op1[0] + op2[1], op1[1]]
}
exports.compose = function (op1, op2) {
if (op1[0] + op1[1] !== op2[0]) {
throw new Error(`Op1 ${op1} + 1 != op2 ${op2}`)
}
return [op1[0], op1[1] + op2[1]]
}
exports.generateRandomOp = doc => [[doc, 1], doc + 1]
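// Worked example: apply(1, [1, 2]) => 3; transform([1, 1], [1, 2]) => [3, 1];
// compose([1, 1], [2, 3]) => [1, 4]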

View File

@@ -0,0 +1,116 @@
/* eslint-disable
no-return-assign,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// These methods let you build a transform function from a transformComponent function
// for OT types like text and JSON in which operations are lists of components
// and transforming them requires N^2 work.
// Add transform and transformX functions for an OT type which has transformComponent defined.
// transformComponent(destination array, component, other component, side)
let bootstrapTransform
exports._bt = bootstrapTransform = function (
type,
transformComponent,
checkValidOp,
append
) {
let transformX
const transformComponentX = function (left, right, destLeft, destRight) {
transformComponent(destLeft, left, right, 'left')
return transformComponent(destRight, right, left, 'right')
}
// Transforms leftOp and rightOp against each other. Returns [leftOp', rightOp'].
type.transformX = transformX =
function (leftOp, rightOp) {
checkValidOp(leftOp)
checkValidOp(rightOp)
const newRightOp = []
for (let rightComponent of Array.from(rightOp)) {
// Generate newLeftOp by composing leftOp by rightComponent
const newLeftOp = []
let k = 0
while (k < leftOp.length) {
let l
const nextC = []
transformComponentX(leftOp[k], rightComponent, newLeftOp, nextC)
k++
if (nextC.length === 1) {
rightComponent = nextC[0]
} else if (nextC.length === 0) {
for (l of Array.from(leftOp.slice(k))) {
append(newLeftOp, l)
}
rightComponent = null
break
} else {
// Recurse.
const [l_, r_] = Array.from(transformX(leftOp.slice(k), nextC))
for (l of Array.from(l_)) {
append(newLeftOp, l)
}
for (const r of Array.from(r_)) {
append(newRightOp, r)
}
rightComponent = null
break
}
}
if (rightComponent != null) {
append(newRightOp, rightComponent)
}
leftOp = newLeftOp
}
return [leftOp, newRightOp]
}
// Transforms op with specified type ('left' or 'right') by otherOp.
return (type.transform = function (op, otherOp, type) {
let _
if (type !== 'left' && type !== 'right') {
throw new Error("type must be 'left' or 'right'")
}
if (otherOp.length === 0) {
return op
}
// TODO: Benchmark with and without this line. I _think_ it'll make a big difference...?
if (op.length === 1 && otherOp.length === 1) {
return transformComponent([], op[0], otherOp[0], type)
}
if (type === 'left') {
let left
;[left, _] = Array.from(transformX(op, otherOp))
return left
} else {
let right
;[_, right] = Array.from(transformX(otherOp, op))
return right
}
})
}
if (typeof WEB === 'undefined') {
exports.bootstrapTransform = bootstrapTransform
}

View File

@@ -0,0 +1,25 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const register = function (file) {
const type = require(file)
exports[type.name] = type
try {
return require(`${file}-api`)
} catch (error) {}
}
// Import all the built-in types.
register('./simple')
register('./count')
register('./text')
register('./text-composable')
register('./text-tp2')
register('./json')

View File

@@ -0,0 +1,356 @@
/* eslint-disable
no-undef,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// API for JSON OT
let json
if (typeof WEB === 'undefined') {
json = require('./json')
}
if (typeof WEB !== 'undefined' && WEB !== null) {
const { extendDoc } = exports
exports.extendDoc = function (name, fn) {
SubDoc.prototype[name] = fn
return extendDoc(name, fn)
}
}
const depath = function (path) {
if (path.length === 1 && path[0].constructor === Array) {
return path[0]
} else {
return path
}
}
class SubDoc {
constructor(doc, path) {
this.doc = doc
this.path = path
}
at(...path) {
return this.doc.at(this.path.concat(depath(path)))
}
get() {
return this.doc.getAt(this.path)
}
// for objects and lists
set(value, cb) {
return this.doc.setAt(this.path, value, cb)
}
// for strings and lists.
insert(pos, value, cb) {
return this.doc.insertAt(this.path, pos, value, cb)
}
// for strings
del(pos, length, cb) {
return this.doc.deleteTextAt(this.path, length, pos, cb)
}
// for objects and lists
remove(cb) {
return this.doc.removeAt(this.path, cb)
}
push(value, cb) {
return this.insert(this.get().length, value, cb)
}
move(from, to, cb) {
return this.doc.moveAt(this.path, from, to, cb)
}
add(amount, cb) {
return this.doc.addAt(this.path, amount, cb)
}
on(event, cb) {
return this.doc.addListener(this.path, event, cb)
}
removeListener(l) {
return this.doc.removeListener(l)
}
// text API compatibility
getLength() {
return this.get().length
}
getText() {
return this.get()
}
}
const traverse = function (snapshot, path) {
const container = { data: snapshot }
let key = 'data'
let elem = container
for (const p of Array.from(path)) {
elem = elem[key]
key = p
if (typeof elem === 'undefined') {
throw new Error('bad path')
}
}
return { elem, key }
}
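// e.g. traverse({ a: ['x', 'y'] }, ['a', 1]) returns { elem: ['x', 'y'], key: 1 }:
// the parent container plus the final key, so callers can read or assign elem[key].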
const pathEquals = function (p1, p2) {
if (p1.length !== p2.length) {
return false
}
for (let i = 0; i < p1.length; i++) {
const e = p1[i]
if (e !== p2[i]) {
return false
}
}
return true
}
json.api = {
provides: { json: true },
at(...path) {
return new SubDoc(this, depath(path))
},
get() {
return this.snapshot
},
set(value, cb) {
return this.setAt([], value, cb)
},
getAt(path) {
const { elem, key } = traverse(this.snapshot, path)
return elem[key]
},
setAt(path, value, cb) {
const { elem, key } = traverse(this.snapshot, path)
const op = { p: path }
if (elem.constructor === Array) {
op.li = value
if (typeof elem[key] !== 'undefined') {
op.ld = elem[key]
}
} else if (typeof elem === 'object') {
op.oi = value
if (typeof elem[key] !== 'undefined') {
op.od = elem[key]
}
} else {
throw new Error('bad path')
}
return this.submitOp([op], cb)
},
removeAt(path, cb) {
const { elem, key } = traverse(this.snapshot, path)
if (typeof elem[key] === 'undefined') {
throw new Error('no element at that path')
}
const op = { p: path }
if (elem.constructor === Array) {
op.ld = elem[key]
} else if (typeof elem === 'object') {
op.od = elem[key]
} else {
throw new Error('bad path')
}
return this.submitOp([op], cb)
},
insertAt(path, pos, value, cb) {
const { elem, key } = traverse(this.snapshot, path)
const op = { p: path.concat(pos) }
if (elem[key].constructor === Array) {
op.li = value
} else if (typeof elem[key] === 'string') {
op.si = value
}
return this.submitOp([op], cb)
},
moveAt(path, from, to, cb) {
const op = [{ p: path.concat(from), lm: to }]
return this.submitOp(op, cb)
},
addAt(path, amount, cb) {
const op = [{ p: path, na: amount }]
return this.submitOp(op, cb)
},
deleteTextAt(path, length, pos, cb) {
const { elem, key } = traverse(this.snapshot, path)
const op = [{ p: path.concat(pos), sd: elem[key].slice(pos, pos + length) }]
return this.submitOp(op, cb)
},
addListener(path, event, cb) {
const l = { path, event, cb }
this._listeners.push(l)
return l
},
removeListener(l) {
const i = this._listeners.indexOf(l)
if (i < 0) {
return false
}
this._listeners.splice(i, 1)
return true
},
_register() {
this._listeners = []
this.on('change', function (op) {
return (() => {
const result = []
for (const c of Array.from(op)) {
let i
if (c.na !== undefined || c.si !== undefined || c.sd !== undefined) {
// no change to structure
continue
}
const toRemove = []
for (i = 0; i < this._listeners.length; i++) {
// Transform a dummy op by the incoming op to work out what
// should happen to the listener.
const l = this._listeners[i]
const dummy = { p: l.path, na: 0 }
const xformed = this.type.transformComponent([], dummy, c, 'left')
if (xformed.length === 0) {
// The op was transformed to noop, so we should delete the listener.
toRemove.push(i)
} else if (xformed.length === 1) {
// The op remained, so grab its new path into the listener.
l.path = xformed[0].p
} else {
throw new Error(
"Bad assumption in json-api: xforming an 'si' op will always result in 0 or 1 components."
)
}
}
toRemove.sort((a, b) => b - a)
result.push(
(() => {
const result1 = []
for (i of Array.from(toRemove)) {
result1.push(this._listeners.splice(i, 1))
}
return result1
})()
)
}
return result
})()
})
return this.on('remoteop', function (op) {
return (() => {
const result = []
for (const c of Array.from(op)) {
const matchPath =
c.na === undefined ? c.p.slice(0, c.p.length - 1) : c.p
result.push(
(() => {
const result1 = []
for (const { path, event, cb } of Array.from(this._listeners)) {
let common
if (pathEquals(path, matchPath)) {
switch (event) {
case 'insert':
if (c.li !== undefined && c.ld === undefined) {
result1.push(cb(c.p[c.p.length - 1], c.li))
} else if (c.oi !== undefined && c.od === undefined) {
result1.push(cb(c.p[c.p.length - 1], c.oi))
} else if (c.si !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.si))
} else {
result1.push(undefined)
}
break
case 'delete':
if (c.li === undefined && c.ld !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.ld))
} else if (c.oi === undefined && c.od !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.od))
} else if (c.sd !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.sd))
} else {
result1.push(undefined)
}
break
case 'replace':
if (c.li !== undefined && c.ld !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.ld, c.li))
} else if (c.oi !== undefined && c.od !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.od, c.oi))
} else {
result1.push(undefined)
}
break
case 'move':
if (c.lm !== undefined) {
result1.push(cb(c.p[c.p.length - 1], c.lm))
} else {
result1.push(undefined)
}
break
case 'add':
if (c.na !== undefined) {
result1.push(cb(c.na))
} else {
result1.push(undefined)
}
break
default:
result1.push(undefined)
}
} else if (
(common = this.type.commonPath(matchPath, path)) != null
) {
if (event === 'child op') {
if (
matchPath.length === path.length &&
path.length === common
) {
throw new Error(
"paths match length and have commonality, but aren't equal?"
)
}
const childPath = c.p.slice(common + 1)
result1.push(cb(childPath, c))
} else {
result1.push(undefined)
}
} else {
result1.push(undefined)
}
}
return result1
})()
)
}
return result
})()
})
},
}

View File

@@ -0,0 +1,630 @@
/* eslint-disable
no-return-assign,
no-undef,
no-useless-catch,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// This is the implementation of the JSON OT type.
//
// Spec is here: https://github.com/josephg/ShareJS/wiki/JSON-Operations
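//
// For orientation, an op is a list of components, each addressed by a path
// `p`. An illustrative example (assumed data, following the spec above):
//
//   const snapshot = { title: 'hi', tags: ['a', 'b'] }
//   const op = [
//     { p: ['title', 2], si: '!' },  // string insert: 'hi' -> 'hi!'
//     { p: ['tags', 1], ld: 'b' },   // list delete: remove 'b'
//   ]
//   json.apply(snapshot, op)         // => { title: 'hi!', tags: ['a'] }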
let text
if (typeof WEB !== 'undefined' && WEB !== null) {
;({ text } = exports.types)
} else {
text = require('./text')
}
const json = {}
json.name = 'json'
json.create = () => null
json.invertComponent = function (c) {
const c_ = { p: c.p }
if (c.si !== undefined) {
c_.sd = c.si
}
if (c.sd !== undefined) {
c_.si = c.sd
}
if (c.oi !== undefined) {
c_.od = c.oi
}
if (c.od !== undefined) {
c_.oi = c.od
}
if (c.li !== undefined) {
c_.ld = c.li
}
if (c.ld !== undefined) {
c_.li = c.ld
}
if (c.na !== undefined) {
c_.na = -c.na
}
if (c.lm !== undefined) {
c_.lm = c.p[c.p.length - 1]
c_.p = c.p.slice(0, c.p.length - 1).concat([c.lm])
}
return c_
}
json.invert = op =>
Array.from(op.slice().reverse()).map(c => json.invertComponent(c))
json.checkValidOp = function (op) {}
const isArray = o => Object.prototype.toString.call(o) === '[object Array]'
json.checkList = function (elem) {
if (!isArray(elem)) {
throw new Error('Referenced element not a list')
}
}
json.checkObj = function (elem) {
if (elem.constructor !== Object) {
throw new Error(
`Referenced element not an object (it was ${JSON.stringify(elem)})`
)
}
}
json.apply = function (snapshot, op) {
json.checkValidOp(op)
op = clone(op)
const container = { data: clone(snapshot) }
try {
for (let i = 0; i < op.length; i++) {
const c = op[i]
let parent = null
let parentkey = null
let elem = container
let key = 'data'
for (const p of Array.from(c.p)) {
parent = elem
parentkey = key
elem = elem[key]
key = p
if (parent == null) {
throw new Error('Path invalid')
}
}
if (c.na !== undefined) {
// Number add
if (typeof elem[key] !== 'number') {
throw new Error('Referenced element not a number')
}
elem[key] += c.na
} else if (c.si !== undefined) {
// String insert
if (typeof elem !== 'string') {
throw new Error(
`Referenced element not a string (it was ${JSON.stringify(elem)})`
)
}
parent[parentkey] = elem.slice(0, key) + c.si + elem.slice(key)
} else if (c.sd !== undefined) {
// String delete
if (typeof elem !== 'string') {
throw new Error('Referenced element not a string')
}
if (elem.slice(key, key + c.sd.length) !== c.sd) {
throw new Error('Deleted string does not match')
}
parent[parentkey] = elem.slice(0, key) + elem.slice(key + c.sd.length)
} else if (c.li !== undefined && c.ld !== undefined) {
// List replace
json.checkList(elem)
// Should check the list element matches c.ld
elem[key] = c.li
} else if (c.li !== undefined) {
// List insert
json.checkList(elem)
elem.splice(key, 0, c.li)
} else if (c.ld !== undefined) {
// List delete
json.checkList(elem)
// Should check the list element matches c.ld here too.
elem.splice(key, 1)
} else if (c.lm !== undefined) {
// List move
json.checkList(elem)
if (c.lm !== key) {
const e = elem[key]
// Remove it...
elem.splice(key, 1)
// And insert it back.
elem.splice(c.lm, 0, e)
}
} else if (c.oi !== undefined) {
// Object insert / replace
json.checkObj(elem)
// Should check that elem[key] == c.od
elem[key] = c.oi
} else if (c.od !== undefined) {
// Object delete
json.checkObj(elem)
// Should check that elem[key] == c.od
delete elem[key]
} else {
throw new Error('invalid / missing instruction in op')
}
}
} catch (error) {
// TODO: Roll back all already applied changes. Write tests before implementing this code.
throw error
}
return container.data
}
// Checks whether two paths, p1 and p2, match.
json.pathMatches = function (p1, p2, ignoreLast) {
if (p1.length !== p2.length) {
return false
}
for (let i = 0; i < p1.length; i++) {
const p = p1[i]
if (p !== p2[i] && (!ignoreLast || i !== p1.length - 1)) {
return false
}
}
return true
}
json.append = function (dest, c) {
let last
c = clone(c)
if (
dest.length !== 0 &&
json.pathMatches(c.p, (last = dest[dest.length - 1]).p)
) {
if (last.na !== undefined && c.na !== undefined) {
return (dest[dest.length - 1] = { p: last.p, na: last.na + c.na })
} else if (
last.li !== undefined &&
c.li === undefined &&
c.ld === last.li
) {
// insert immediately followed by delete becomes a noop.
if (last.ld !== undefined) {
// leave the delete part of the replace
return delete last.li
} else {
return dest.pop()
}
} else if (
last.od !== undefined &&
last.oi === undefined &&
c.oi !== undefined &&
c.od === undefined
) {
return (last.oi = c.oi)
} else if (c.lm !== undefined && c.p[c.p.length - 1] === c.lm) {
return null // don't do anything
} else {
return dest.push(c)
}
} else {
return dest.push(c)
}
}
json.compose = function (op1, op2) {
json.checkValidOp(op1)
json.checkValidOp(op2)
const newOp = clone(op1)
for (const c of Array.from(op2)) {
json.append(newOp, c)
}
return newOp
}
json.normalize = function (op) {
const newOp = []
if (!isArray(op)) {
op = [op]
}
for (const c of Array.from(op)) {
if (c.p == null) {
c.p = []
}
json.append(newOp, c)
}
return newOp
}
// hax, copied from test/types/json. Apparently this is still the fastest way to deep clone an object, assuming
// we have browser support for JSON.
// http://jsperf.com/cloning-an-object/12
const clone = o => JSON.parse(JSON.stringify(o))
json.commonPath = function (p1, p2) {
p1 = p1.slice()
p2 = p2.slice()
p1.unshift('data')
p2.unshift('data')
p1 = p1.slice(0, p1.length - 1)
p2 = p2.slice(0, p2.length - 1)
if (p2.length === 0) {
return -1
}
let i = 0
while (p1[i] === p2[i] && i < p1.length) {
i++
if (i === p2.length) {
return i - 1
}
}
}
// transform c so it applies to a document with otherC applied.
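// For example (illustrative, per the list-insert rule below): transforming
//   c      = { p: [0], li: 'x' }
// against a concurrent
//   otherC = { p: [0], li: 'y' }
// with type 'right' yields { p: [1], li: 'x' }: in an insert-vs-insert tie,
// the 'left' op keeps the lower index.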
json.transformComponent = function (dest, c, otherC, type) {
let oc
c = clone(c)
if (c.na !== undefined) {
c.p.push(0)
}
if (otherC.na !== undefined) {
otherC.p.push(0)
}
const common = json.commonPath(c.p, otherC.p)
const common2 = json.commonPath(otherC.p, c.p)
const cplength = c.p.length
const otherCplength = otherC.p.length
if (c.na !== undefined) {
c.p.pop()
} // hax
if (otherC.na !== undefined) {
otherC.p.pop()
}
if (otherC.na) {
if (
common2 != null &&
otherCplength >= cplength &&
otherC.p[common2] === c.p[common2]
) {
if (c.ld !== undefined) {
oc = clone(otherC)
oc.p = oc.p.slice(cplength)
c.ld = json.apply(clone(c.ld), [oc])
} else if (c.od !== undefined) {
oc = clone(otherC)
oc.p = oc.p.slice(cplength)
c.od = json.apply(clone(c.od), [oc])
}
}
json.append(dest, c)
return dest
}
if (
common2 != null &&
otherCplength > cplength &&
c.p[common2] === otherC.p[common2]
) {
// transform based on c
if (c.ld !== undefined) {
oc = clone(otherC)
oc.p = oc.p.slice(cplength)
c.ld = json.apply(clone(c.ld), [oc])
} else if (c.od !== undefined) {
oc = clone(otherC)
oc.p = oc.p.slice(cplength)
c.od = json.apply(clone(c.od), [oc])
}
}
if (common != null) {
let from, p, to
const commonOperand = cplength === otherCplength
// transform based on otherC
if (otherC.na !== undefined) {
// this case is handled above due to icky path hax
} else if (otherC.si !== undefined || otherC.sd !== undefined) {
// String op vs string op - pass through to text type
if (c.si !== undefined || c.sd !== undefined) {
if (!commonOperand) {
throw new Error('must be a string?')
}
// Convert an op component to a text op component
const convert = function (component) {
const newC = { p: component.p[component.p.length - 1] }
if (component.si) {
newC.i = component.si
} else {
newC.d = component.sd
}
return newC
}
const tc1 = convert(c)
const tc2 = convert(otherC)
const res = []
text._tc(res, tc1, tc2, type)
for (const tc of Array.from(res)) {
const jc = { p: c.p.slice(0, common) }
jc.p.push(tc.p)
if (tc.i != null) {
jc.si = tc.i
}
if (tc.d != null) {
jc.sd = tc.d
}
json.append(dest, jc)
}
return dest
}
} else if (otherC.li !== undefined && otherC.ld !== undefined) {
if (otherC.p[common] === c.p[common]) {
// noop
if (!commonOperand) {
// we're below the deleted element, so -> noop
return dest
} else if (c.ld !== undefined) {
// we're trying to delete the same element, -> noop
if (c.li !== undefined && type === 'left') {
// we're both replacing one element with another. only one can
// survive!
c.ld = clone(otherC.li)
} else {
return dest
}
}
}
} else if (otherC.li !== undefined) {
if (
c.li !== undefined &&
c.ld === undefined &&
commonOperand &&
c.p[common] === otherC.p[common]
) {
// in li vs. li, left wins.
if (type === 'right') {
c.p[common]++
}
} else if (otherC.p[common] <= c.p[common]) {
c.p[common]++
}
if (c.lm !== undefined) {
if (commonOperand) {
// otherC edits the same list we edit
if (otherC.p[common] <= c.lm) {
c.lm++
}
}
}
// changing c.from is handled above.
} else if (otherC.ld !== undefined) {
if (c.lm !== undefined) {
if (commonOperand) {
if (otherC.p[common] === c.p[common]) {
// they deleted the thing we're trying to move
return dest
}
// otherC edits the same list we edit
p = otherC.p[common]
from = c.p[common]
to = c.lm
if (p < to || (p === to && from < to)) {
c.lm--
}
}
}
if (otherC.p[common] < c.p[common]) {
c.p[common]--
} else if (otherC.p[common] === c.p[common]) {
if (otherCplength < cplength) {
// we're below the deleted element, so -> noop
return dest
} else if (c.ld !== undefined) {
if (c.li !== undefined) {
// we're replacing, they're deleting. we become an insert.
delete c.ld
} else {
// we're trying to delete the same element, -> noop
return dest
}
}
}
} else if (otherC.lm !== undefined) {
if (c.lm !== undefined && cplength === otherCplength) {
// lm vs lm, here we go!
from = c.p[common]
to = c.lm
const otherFrom = otherC.p[common]
const otherTo = otherC.lm
if (otherFrom !== otherTo) {
// if otherFrom == otherTo, we don't need to change our op.
// where did my thing go?
if (from === otherFrom) {
// they moved it! tie break.
if (type === 'left') {
c.p[common] = otherTo
if (from === to) {
// ugh
c.lm = otherTo
}
} else {
return dest
}
} else {
// they moved around it
if (from > otherFrom) {
c.p[common]--
}
if (from > otherTo) {
c.p[common]++
} else if (from === otherTo) {
if (otherFrom > otherTo) {
c.p[common]++
if (from === to) {
// ugh, again
c.lm++
}
}
}
// step 2: where am i going to put it?
if (to > otherFrom) {
c.lm--
} else if (to === otherFrom) {
if (to > from) {
c.lm--
}
}
if (to > otherTo) {
c.lm++
} else if (to === otherTo) {
// if we're both moving in the same direction, tie break
if (
(otherTo > otherFrom && to > from) ||
(otherTo < otherFrom && to < from)
) {
if (type === 'right') {
c.lm++
}
} else {
if (to > from) {
c.lm++
} else if (to === otherFrom) {
c.lm--
}
}
}
}
}
} else if (c.li !== undefined && c.ld === undefined && commonOperand) {
// li
from = otherC.p[common]
to = otherC.lm
p = c.p[common]
if (p > from) {
c.p[common]--
}
if (p > to) {
c.p[common]++
}
} else {
// ld, ld+li, si, sd, na, oi, od, oi+od, any li on an element beneath
// the lm
//
// i.e. things care about where their item is after the move.
from = otherC.p[common]
to = otherC.lm
p = c.p[common]
if (p === from) {
c.p[common] = to
} else {
if (p > from) {
c.p[common]--
}
if (p > to) {
c.p[common]++
} else if (p === to) {
if (from > to) {
c.p[common]++
}
}
}
}
} else if (otherC.oi !== undefined && otherC.od !== undefined) {
if (c.p[common] === otherC.p[common]) {
if (c.oi !== undefined && commonOperand) {
// we inserted where someone else replaced
if (type === 'right') {
// left wins
return dest
} else {
// we win, make our op replace what they inserted
c.od = otherC.oi
}
} else {
// -> noop if the other component is deleting the same object (or any
// parent)
return dest
}
}
} else if (otherC.oi !== undefined) {
if (c.oi !== undefined && c.p[common] === otherC.p[common]) {
// left wins if we try to insert at the same place
if (type === 'left') {
json.append(dest, { p: c.p, od: otherC.oi })
} else {
return dest
}
}
} else if (otherC.od !== undefined) {
if (c.p[common] === otherC.p[common]) {
if (!commonOperand) {
return dest
}
if (c.oi !== undefined) {
delete c.od
} else {
return dest
}
}
}
}
json.append(dest, c)
return dest
}
if (typeof WEB !== 'undefined' && WEB !== null) {
if (!exports.types) {
exports.types = {}
}
// This is kind of awful - come up with a better way to hook this helper code up.
exports._bt(json, json.transformComponent, json.checkValidOp, json.append)
// [] is used to prevent closure from renaming types.text
exports.types.json = json
} else {
module.exports = json
require('./helpers').bootstrapTransform(
json,
json.transformComponent,
json.checkValidOp,
json.append
)
}

View File

@@ -0,0 +1,882 @@
/* eslint-disable
no-console,
no-return-assign,
n/no-callback-literal,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS103: Rewrite code to no longer use __guard__
* DS104: Avoid inline assignments
* DS204: Change includes calls to have a more natural evaluation order
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// The model of all the ops. Responsible for applying & transforming remote deltas
// and managing the storage layer.
//
// Actual storage is handled by the database wrappers in db/*, wrapped by DocCache
let Model
const { EventEmitter } = require('node:events')
const queue = require('./syncqueue')
const types = require('../types')
const isArray = o => Object.prototype.toString.call(o) === '[object Array]'
// This constructor creates a new Model object. There will be one model object
// per server context.
//
// The model object is responsible for a lot of things:
//
// - It manages the interactions with the database
// - It maintains (in memory) a set of all active documents
// - It calls out to the OT functions when necessary
//
// The model is an event emitter. It emits the following events:
//
// create(docName, data): A document has been created with the specified name & data
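//
// Illustrative usage (a sketch with an in-memory store; the db argument is
// null, which this constructor explicitly supports):
//
//   const Model = require('./model')
//   const model = new Model(null, { reapTime: 5000 })
//   model.create('doc', 'text', err => {
//     if (err) throw err
//     model.applyOp('doc', { op: [{ p: 0, i: 'hi' }], v: 0 }, (err, v) => {
//       // v is the version at which the op was applied (0 here)
//     })
//   })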
module.exports = Model = function (db, options) {
  // db can be null if the user doesn't want persistence.
let getOps
if (!(this instanceof Model)) {
return new Model(db, options)
}
const model = this
if (options == null) {
options = {}
}
// This is a cache of 'live' documents.
//
// The cache is a map from docName -> {
// ops:[{op, meta}]
// snapshot
// type
// v
// meta
// eventEmitter
// reapTimer
// committedVersion: v
// snapshotWriteLock: bool to make sure writeSnapshot isn't re-entrant
// dbMeta: database specific data
// opQueue: syncQueue for processing ops
// }
//
// The ops list contains the document's last options.numCachedOps ops. (Or all
// of them if we're using a memory store).
//
// Documents are stored in this set so long as the document has been accessed in
// the last few seconds (options.reapTime) OR at least one client has the document
// open. I don't know if I should keep open (but not being edited) documents live -
// maybe if a client has a document open but the document isn't being edited, I should
// flush it from the cache.
//
// In any case, the API to model is designed such that if we want to change that later
// it should be pretty easy to do so without any external-to-the-model code changes.
const docs = {}
// This is a map from docName -> [callback]. It is used when a document hasn't been
// cached and multiple getSnapshot() / getVersion() requests come in. All requests
// are added to the callback list and called when db.getSnapshot() returns.
//
// callback(error, snapshot data)
const awaitingGetSnapshot = {}
// The time that documents which no clients have open will stay in the cache.
// Should be > 0.
if (options.reapTime == null) {
options.reapTime = 3000
}
// The number of operations the cache holds before reusing the space
if (options.numCachedOps == null) {
options.numCachedOps = 10
}
// This option forces documents to be reaped, even when there's no database backend.
  // This is useful when you don't care about persistence and don't want to gradually
// fill memory.
//
// You might want to set reapTime to a day or something.
if (options.forceReaping == null) {
options.forceReaping = false
}
// Until I come up with a better strategy, we'll save a copy of the document snapshot
// to the database every ~20 submitted ops.
if (options.opsBeforeCommit == null) {
options.opsBeforeCommit = 20
}
// It takes some processing time to transform client ops. The server will punt ops back to the
// client to transform if they're too old.
if (options.maximumAge == null) {
options.maximumAge = 40
}
// **** Cache API methods
  // It's important that all ops are applied in order. This helper method creates the op submission queue
// for a single document. This contains the logic for transforming & applying ops.
const makeOpQueue = (docName, doc) =>
queue(function (opData, callback) {
if (!(opData.v >= 0)) {
return callback('Version missing')
}
if (opData.v > doc.v) {
return callback('Op at future version')
}
// Punt the transforming work back to the client if the op is too old.
if (opData.v + options.maximumAge < doc.v) {
return callback('Op too old')
}
if (!opData.meta) {
opData.meta = {}
}
opData.meta.ts = Date.now()
// We'll need to transform the op to the current version of the document. This
// calls the callback immediately if opVersion == doc.v.
return getOps(docName, opData.v, doc.v, function (error, ops) {
let snapshot
if (error) {
return callback(error)
}
if (doc.v - opData.v !== ops.length) {
// This should never happen. It indicates that we didn't get all the ops we
          // asked for. It's important that the submitted op is correctly transformed.
console.error(
`Could not get old ops in model for document ${docName}`
)
console.error(
`Expected ops ${opData.v} to ${doc.v} and got ${ops.length} ops`
)
return callback('Internal error')
}
if (ops.length > 0) {
try {
// If there's enough ops, it might be worth spinning this out into a webworker thread.
for (const oldOp of Array.from(ops)) {
// Dup detection works by sending the id(s) the op has been submitted with previously.
// If the id matches, we reject it. The client can also detect the op has been submitted
// already if it sees its own previous id in the ops it sees when it does catchup.
if (
oldOp.meta.source &&
opData.dupIfSource &&
Array.from(opData.dupIfSource).includes(oldOp.meta.source)
) {
return callback('Op already submitted')
}
opData.op = doc.type.transform(opData.op, oldOp.op, 'left')
opData.v++
}
} catch (error1) {
error = error1
return callback(error.message)
}
}
try {
snapshot = doc.type.apply(doc.snapshot, opData.op)
} catch (error2) {
error = error2
return callback(error.message)
}
// The op data should be at the current version, and the new document data should be at
// the next version.
//
        // This should never happen in practice, but it's a nice little check to make sure everything
// is hunky-dory.
if (opData.v !== doc.v) {
// This should never happen.
console.error(
'Version mismatch detected in model. File a ticket - this is a bug.'
)
console.error(`Expecting ${opData.v} == ${doc.v}`)
return callback('Internal error')
}
// newDocData = {snapshot, type:type.name, v:opVersion + 1, meta:docData.meta}
const writeOp =
(db != null ? db.writeOp : undefined) ||
((docName, newOpData, callback) => callback())
return writeOp(docName, opData, function (error) {
if (error) {
// The user should probably know about this.
console.warn(`Error writing ops to database: ${error}`)
return callback(error)
}
__guardMethod__(options.stats, 'writeOp', o => o.writeOp())
// This is needed when we emit the 'change' event, below.
const oldSnapshot = doc.snapshot
// All the heavy lifting is now done. Finally, we'll update the cache with the new data
// and (maybe!) save a new document snapshot to the database.
doc.v = opData.v + 1
doc.snapshot = snapshot
doc.ops.push(opData)
if (db && doc.ops.length > options.numCachedOps) {
doc.ops.shift()
}
model.emit('applyOp', docName, opData, snapshot, oldSnapshot)
doc.eventEmitter.emit('op', opData, snapshot, oldSnapshot)
// The callback is called with the version of the document at which the op was applied.
          // This is the op.v after transformation, and it's doc.v - 1.
callback(null, opData.v)
// I need a decent strategy here for deciding whether or not to save the snapshot.
//
// The 'right' strategy looks something like "Store the snapshot whenever the snapshot
// is smaller than the accumulated op data". For now, I'll just store it every 20
// ops or something. (Configurable with doc.committedVersion)
if (
!doc.snapshotWriteLock &&
doc.committedVersion + options.opsBeforeCommit <= doc.v
) {
return tryWriteSnapshot(docName, function (error) {
if (error) {
return console.warn(
`Error writing snapshot ${error}. This is nonfatal`
)
}
})
}
})
})
})
// Add the data for the given docName to the cache. The named document shouldn't already
// exist in the doc set.
//
// Returns the new doc.
const add = function (docName, error, data, committedVersion, ops, dbMeta) {
let callback, doc
const callbacks = awaitingGetSnapshot[docName]
delete awaitingGetSnapshot[docName]
if (error) {
if (callbacks) {
for (callback of Array.from(callbacks)) {
callback(error)
}
}
} else {
doc = docs[docName] = {
snapshot: data.snapshot,
v: data.v,
type: data.type,
meta: data.meta,
// Cache of ops
ops: ops || [],
eventEmitter: new EventEmitter(),
// Timer before the document will be invalidated from the cache (if the document has no
// listeners)
reapTimer: null,
        // Version of the snapshot that's in the database
committedVersion: committedVersion != null ? committedVersion : data.v,
snapshotWriteLock: false,
dbMeta,
}
doc.opQueue = makeOpQueue(docName, doc)
refreshReapingTimeout(docName)
model.emit('add', docName, data)
if (callbacks) {
for (callback of Array.from(callbacks)) {
callback(null, doc)
}
}
}
return doc
}
// This is a little helper wrapper around db.getOps. It does two things:
//
// - If there's no database set, it returns an error to the callback
// - It adds version numbers to each op returned from the database
  // (These can be inferred from context so the DB doesn't store them, but it's useful to have them).
const getOpsInternal = function (docName, start, end, callback) {
if (!db) {
return typeof callback === 'function'
? callback('Document does not exist')
: undefined
}
return db.getOps(docName, start, end, function (error, ops) {
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
let v = start
for (const op of Array.from(ops)) {
op.v = v++
}
return typeof callback === 'function' ? callback(null, ops) : undefined
})
}
// Load the named document into the cache. This function is re-entrant.
//
// The callback is called with (error, doc)
const load = function (docName, callback) {
if (docs[docName]) {
// The document is already loaded. Return immediately.
__guardMethod__(options.stats, 'cacheHit', o => o.cacheHit('getSnapshot'))
return callback(null, docs[docName])
}
// We're a memory store. If we don't have it, nobody does.
if (!db) {
return callback('Document does not exist')
}
const callbacks = awaitingGetSnapshot[docName]
// The document is being loaded already. Add ourselves as a callback.
if (callbacks) {
return callbacks.push(callback)
}
__guardMethod__(options.stats, 'cacheMiss', o1 =>
o1.cacheMiss('getSnapshot')
)
// The document isn't loaded and isn't being loaded. Load it.
awaitingGetSnapshot[docName] = [callback]
return db.getSnapshot(docName, function (error, data, dbMeta) {
if (error) {
return add(docName, error)
}
const type = types[data.type]
if (!type) {
console.warn(`Type '${data.type}' missing`)
return callback('Type not found')
}
data.type = type
const committedVersion = data.v
// The server can close without saving the most recent document snapshot.
// In this case, there are extra ops which need to be applied before
// returning the snapshot.
return getOpsInternal(docName, data.v, null, function (error, ops) {
if (error) {
return callback(error)
}
if (ops.length > 0) {
console.log(`Catchup ${docName} ${data.v} -> ${data.v + ops.length}`)
try {
for (const op of Array.from(ops)) {
data.snapshot = type.apply(data.snapshot, op.op)
data.v++
}
} catch (e) {
          // This should never happen - it indicates that what's in the
// database is invalid.
console.error(`Op data invalid for ${docName}: ${e.stack}`)
return callback('Op data invalid')
}
}
model.emit('load', docName, data)
return add(docName, error, data, committedVersion, ops, dbMeta)
})
})
}
// This makes sure the cache contains a document. If the doc cache doesn't contain
// a document, it is loaded from the database and stored.
//
// Documents are stored so long as either:
// - They have been accessed within the past #{PERIOD}
// - At least one client has the document open
function refreshReapingTimeout(docName) {
const doc = docs[docName]
if (!doc) {
return
}
// I want to let the clients list be updated before this is called.
return process.nextTick(function () {
// This is an awkward way to find out the number of clients on a document. If this
// causes performance issues, add a numClients field to the document.
//
      // The first check is because it's possible that between refreshReapingTimeout being called and this
// event being fired, someone called delete() on the document and hence the doc is something else now.
if (
doc === docs[docName] &&
doc.eventEmitter.listeners('op').length === 0 &&
(db || options.forceReaping) &&
doc.opQueue.busy === false
) {
let reapTimer
clearTimeout(doc.reapTimer)
return (doc.reapTimer = reapTimer =
setTimeout(
() =>
tryWriteSnapshot(docName, function () {
// If the reaping timeout has been refreshed while we're writing the snapshot, or if we're
// in the middle of applying an operation, don't reap.
if (
docs[docName].reapTimer === reapTimer &&
doc.opQueue.busy === false
) {
return delete docs[docName]
}
}),
options.reapTime
))
}
})
}
function tryWriteSnapshot(docName, callback) {
if (!db) {
return typeof callback === 'function' ? callback() : undefined
}
const doc = docs[docName]
// The doc is closed
if (!doc) {
return typeof callback === 'function' ? callback() : undefined
}
// The document is already saved.
if (doc.committedVersion === doc.v) {
return typeof callback === 'function' ? callback() : undefined
}
if (doc.snapshotWriteLock) {
return typeof callback === 'function'
? callback('Another snapshot write is in progress')
: undefined
}
doc.snapshotWriteLock = true
__guardMethod__(options.stats, 'writeSnapshot', o => o.writeSnapshot())
const writeSnapshot =
(db != null ? db.writeSnapshot : undefined) ||
((docName, docData, dbMeta, callback) => callback())
const data = {
v: doc.v,
meta: doc.meta,
snapshot: doc.snapshot,
// The database doesn't know about object types.
type: doc.type.name,
}
// Commit snapshot.
return writeSnapshot(docName, data, doc.dbMeta, function (error, dbMeta) {
doc.snapshotWriteLock = false
// We have to use data.v here because the version in the doc could
// have been updated between the call to writeSnapshot() and now.
doc.committedVersion = data.v
doc.dbMeta = dbMeta
return typeof callback === 'function' ? callback(error) : undefined
})
}
// *** Model interface methods
// Create a new document.
//
// data should be {snapshot, type, [meta]}. The version of a new document is 0.
this.create = function (docName, type, meta, callback) {
if (typeof meta === 'function') {
;[meta, callback] = Array.from([{}, meta])
}
if (docName.match(/\//)) {
return typeof callback === 'function'
? callback('Invalid document name')
: undefined
}
if (docs[docName]) {
return typeof callback === 'function'
? callback('Document already exists')
: undefined
}
if (typeof type === 'string') {
type = types[type]
}
if (!type) {
return typeof callback === 'function'
? callback('Type not found')
: undefined
}
const data = {
snapshot: type.create(),
type: type.name,
meta: meta || {},
v: 0,
}
const done = function (error, dbMeta) {
// dbMeta can be used to cache extra state needed by the database to access the document, like an ID or something.
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
// From here on we'll store the object version of the type name.
data.type = type
add(docName, null, data, 0, [], dbMeta)
model.emit('create', docName, data)
return typeof callback === 'function' ? callback() : undefined
}
if (db) {
return db.create(docName, data, done)
} else {
return done()
}
}
  // Permanently deletes the specified document.
// If listeners are attached, they are removed.
//
// The callback is called with (error) if there was an error. If error is null / undefined, the
// document was deleted.
//
// WARNING: This isn't well supported throughout the code. (Eg, streaming clients aren't told about the
// deletion. Subsequent op submissions will fail).
this.delete = function (docName, callback) {
const doc = docs[docName]
if (doc) {
clearTimeout(doc.reapTimer)
delete docs[docName]
}
const done = function (error) {
if (!error) {
model.emit('delete', docName)
}
return typeof callback === 'function' ? callback(error) : undefined
}
if (db) {
return db.delete(docName, doc != null ? doc.dbMeta : undefined, done)
} else {
return done(!doc ? 'Document does not exist' : undefined)
}
}
  // This gets all operations from [start...end). (That is, end is not included.)
//
// end can be null. This means 'get me all ops from start'.
//
// Each op returned is in the form {op:o, meta:m, v:version}.
//
// Callback is called with (error, [ops])
//
// If the document does not exist, getOps doesn't necessarily return an error. This is because
  // it's awkward to figure out whether or not the document exists for things
  // like the redis database backend. I guess it's a bit gross having this inconsistent
  // with the other DB calls, but it's certainly convenient.
//
  // Use getVersion() to determine if a document actually exists, if that's what you're
// after.
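  //
  // Illustrative call (assumed): fetch all ops from version 0 onwards:
  //   model.getOps('doc', 0, null, (error, ops) => { ... })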
this.getOps = getOps = function (docName, start, end, callback) {
    // getOps will only use the op cache if it's there. It won't fill the op cache in.
if (!(start >= 0)) {
throw new Error('start must be 0+')
}
if (typeof end === 'function') {
;[end, callback] = Array.from([null, end])
}
const ops = docs[docName] != null ? docs[docName].ops : undefined
if (ops) {
const version = docs[docName].v
// Ops contains an array of ops. The last op in the list is the last op applied
if (end == null) {
end = version
}
start = Math.min(start, end)
if (start === end) {
return callback(null, [])
}
// Base is the version number of the oldest op we have cached
const base = version - ops.length
      // If the database is null, we'll trim to the ops we do have and hope that's enough.
if (start >= base || db === null) {
refreshReapingTimeout(docName)
if (options.stats != null) {
options.stats.cacheHit('getOps')
}
return callback(null, ops.slice(start - base, end - base))
}
}
if (options.stats != null) {
options.stats.cacheMiss('getOps')
}
return getOpsInternal(docName, start, end, callback)
}
// Gets the snapshot data for the specified document.
// getSnapshot(docName, callback)
// Callback is called with (error, {v: <version>, type: <type>, snapshot: <snapshot>, meta: <meta>})
this.getSnapshot = (docName, callback) =>
load(docName, (error, doc) =>
callback(
error,
doc
? { v: doc.v, type: doc.type, snapshot: doc.snapshot, meta: doc.meta }
: undefined
)
)
// Gets the latest version # of the document.
// getVersion(docName, callback)
// callback is called with (error, version).
this.getVersion = (docName, callback) =>
load(docName, (error, doc) =>
callback(error, doc != null ? doc.v : undefined)
)
// Apply an op to the specified document.
// The callback is passed (error, applied version #)
// opData = {op:op, v:v, meta:metadata}
//
// Ops are queued before being applied so that the following code applies op C before op B:
// model.applyOp 'doc', OPA, -> model.applyOp 'doc', OPB
// model.applyOp 'doc', OPC
this.applyOp = (
docName,
opData,
callback // All the logic for this is in makeOpQueue, above.
) =>
load(docName, function (error, doc) {
if (error) {
return callback(error)
}
return process.nextTick(() =>
doc.opQueue(opData, function (error, newVersion) {
refreshReapingTimeout(docName)
return typeof callback === 'function'
? callback(error, newVersion)
: undefined
})
)
})
// TODO: store (some) metadata in DB
// TODO: op and meta should be combineable in the op that gets sent
this.applyMetaOp = function (docName, metaOpData, callback) {
const { path, value } = metaOpData.meta
if (!isArray(path)) {
return typeof callback === 'function'
? callback('path should be an array')
: undefined
}
return load(docName, function (error, doc) {
if (error != null) {
return typeof callback === 'function' ? callback(error) : undefined
} else {
let applied = false
switch (path[0]) {
case 'shout':
doc.eventEmitter.emit('op', metaOpData)
applied = true
break
}
if (applied) {
model.emit('applyMetaOp', docName, path, value)
}
return typeof callback === 'function'
? callback(null, doc.v)
: undefined
}
})
}
// Listen to all ops from the specified version. If version is in the past, all
// ops since that version are sent immediately to the listener.
//
// The callback is called once the listener is attached, but before any ops have been passed
// to the listener.
//
// This will _not_ edit the document metadata.
//
// If there are any listeners, we don't purge the document from the cache. But be aware, this behaviour
// might change in a future version.
//
// version is the document version at which the document is opened. It can be left out if you want to open
// the document at the most recent version.
//
// listener is called with (opData) each time an op is applied.
//
// callback(error, openedVersion)
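  //
  // Illustrative call (assumed): open at version 0, replaying any history:
  //   model.listen('doc', 0, opData => console.log('op', opData.v), (error, v) => { ... })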
this.listen = function (docName, version, listener, callback) {
if (typeof version === 'function') {
;[version, listener, callback] = Array.from([null, version, listener])
}
return load(docName, function (error, doc) {
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
clearTimeout(doc.reapTimer)
if (version != null) {
return getOps(docName, version, null, function (error, data) {
if (error) {
return typeof callback === 'function' ? callback(error) : undefined
}
doc.eventEmitter.on('op', listener)
if (typeof callback === 'function') {
callback(null, version)
}
return (() => {
const result = []
for (const op of Array.from(data)) {
let needle
listener(op)
// The listener may well remove itself during the catchup phase. If this happens, break early.
// This is done in a quite inefficient way. (O(n) where n = #listeners on doc)
if (
((needle = listener),
!Array.from(doc.eventEmitter.listeners('op')).includes(needle))
) {
break
} else {
result.push(undefined)
}
}
return result
})()
})
} else {
// Version is null / undefined. Just add the listener.
doc.eventEmitter.on('op', listener)
return typeof callback === 'function'
? callback(null, doc.v)
: undefined
}
})
}
// Remove a listener for a particular document.
//
// removeListener(docName, listener)
//
// This is synchronous.
this.removeListener = function (docName, listener) {
// The document should already be loaded.
const doc = docs[docName]
if (!doc) {
throw new Error('removeListener called but document not loaded')
}
doc.eventEmitter.removeListener('op', listener)
return refreshReapingTimeout(docName)
}
// Flush saves all snapshot data to the database. I'm not sure whether or not this is actually needed -
// sharejs will happily replay uncommitted ops when documents are re-opened anyway.
this.flush = function (callback) {
if (!db) {
return typeof callback === 'function' ? callback() : undefined
}
let pendingWrites = 0
for (const docName in docs) {
const doc = docs[docName]
if (doc.committedVersion < doc.v) {
pendingWrites++
// I'm hoping writeSnapshot will always happen in another thread.
tryWriteSnapshot(docName, () =>
process.nextTick(function () {
pendingWrites--
if (pendingWrites === 0) {
return typeof callback === 'function' ? callback() : undefined
}
})
)
}
}
// If nothing was queued, terminate immediately.
if (pendingWrites === 0) {
return typeof callback === 'function' ? callback() : undefined
}
}
// Close the database connection. This is needed so nodejs can shut down cleanly.
this.closeDb = function () {
__guardMethod__(db, 'close', o => o.close())
return (db = null)
}
}
// Model inherits from EventEmitter.
Model.prototype = new EventEmitter()
function __guardMethod__(obj, methodName, transform) {
if (
typeof obj !== 'undefined' &&
obj !== null &&
typeof obj[methodName] === 'function'
) {
return transform(obj, methodName)
} else {
return undefined
}
}

View File

@@ -0,0 +1,54 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// This is a really simple OT type. It's not compiled with the web client, but it could be.
//
// It's mostly included for demonstration purposes, and it's used in a lot of unit tests.
//
// This defines a really simple text OT type which only allows inserts. (No deletes).
//
// Ops look like:
// {position:#, text:"asdf"}
//
// Document snapshots look like:
// {str:string}
module.exports = {
// The name of the OT type. The type is stored in types[type.name]. The name can be
// used in place of the actual type in all the API methods.
name: 'simple',
// Create a new document snapshot
create() {
return { str: '' }
},
// Apply the given op to the document snapshot. Returns the new snapshot.
//
// The original snapshot should not be modified.
apply(snapshot, op) {
if (!(op.position >= 0 && op.position <= snapshot.str.length)) {
throw new Error('Invalid position')
}
let { str } = snapshot
str = str.slice(0, op.position) + op.text + str.slice(op.position)
return { str }
},
// transform op1 by op2. Return transformed version of op1.
  // sym describes the symmetry of the op. It's 'left' or 'right' depending on whether the
// op being transformed comes from the client or the server.
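  //
  // Illustrative example (assumed): concurrent inserts at position 5; the
  // transformed op is shifted past the other op's text:
  //   transform({ position: 5, text: 'x' }, { position: 5, text: 'yy' }, 'left')
  //   // => { position: 7, text: 'x' }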
transform(op1, op2, sym) {
let pos = op1.position
if (op2.position < pos || (op2.position === pos && sym === 'left')) {
pos += op2.text.length
}
return { position: pos, text: op1.text }
},
}

View File

@@ -0,0 +1,60 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// A synchronous processing queue. The queue calls process on the arguments,
// ensuring that process() is only executing once at a time.
//
// process(data, callback) _MUST_ eventually call its callback.
//
// Example:
//
// queue = require 'syncqueue'
//
// fn = queue (data, callback) ->
// asyncthing data, ->
// callback(321)
//
// fn(1)
// fn(2)
// fn(3, (result) -> console.log(result))
//
// ^--- async thing will only be running once at any time.
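//
// The same example in plain JavaScript (illustrative, assuming the same
// hypothetical asyncthing helper):
//
//   const queue = require('./syncqueue')
//   const fn = queue((data, callback) => {
//     asyncthing(data, () => callback(321))
//   })
//   fn(1)
//   fn(2)
//   fn(3, result => console.log(result))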
module.exports = function (process) {
if (typeof process !== 'function') {
throw new Error('process is not a function')
}
const queue = []
const enqueue = function (data, callback) {
queue.push([data, callback])
return flush()
}
enqueue.busy = false
function flush() {
if (enqueue.busy || queue.length === 0) {
return
}
enqueue.busy = true
const [data, callback] = Array.from(queue.shift())
return process(data, function (...result) {
// TODO: Make this not use varargs - varargs are really slow.
enqueue.busy = false
// This is called after busy = false so a user can check if enqueue.busy is set in the callback.
if (callback) {
callback.apply(null, result)
}
return flush()
})
}
return enqueue
}

View File

@@ -0,0 +1,52 @@
// TODO: This file was created by bulk-decaffeinate.
// Sanity-check the conversion and remove this comment.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// Text document API for text
let text
if (typeof WEB === 'undefined') {
text = require('./text')
}
text.api = {
provides: { text: true },
// The number of characters in the string
getLength() {
return this.snapshot.length
},
// Get the text contents of a document
getText() {
return this.snapshot
},
insert(pos, text, callback) {
const op = [{ p: pos, i: text }]
this.submitOp(op, callback)
return op
},
del(pos, length, callback) {
const op = [{ p: pos, d: this.snapshot.slice(pos, pos + length) }]
this.submitOp(op, callback)
return op
},
_register() {
return this.on('remoteop', function (op) {
return Array.from(op).map(component =>
component.i !== undefined
? this.emit('insert', component.p, component.i)
: this.emit('delete', component.p, component.d)
)
})
},
}
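// Illustrative usage (assumed, on a document bound to this API with
// snapshot 'hi'; cb is a hypothetical callback):
//   doc.insert(2, '!', cb)   // submits [{ p: 2, i: '!' }]
//   doc.del(0, 2, cb)        // submits [{ p: 0, d: 'hi' }]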

View File

@@ -0,0 +1,76 @@
/* eslint-disable
no-undef,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// Text document API for text-composable
let type
if (typeof WEB !== 'undefined' && WEB !== null) {
type = exports.types['text-composable']
} else {
type = require('./text-composable')
}
type.api = {
provides: { text: true },
// The number of characters in the string
getLength() {
return this.snapshot.length
},
// Get the text contents of a document
getText() {
return this.snapshot
},
insert(pos, text, callback) {
const op = type.normalize([pos, { i: text }, this.snapshot.length - pos])
this.submitOp(op, callback)
return op
},
del(pos, length, callback) {
const op = type.normalize([
pos,
{ d: this.snapshot.slice(pos, pos + length) },
this.snapshot.length - pos - length,
])
this.submitOp(op, callback)
return op
},
_register() {
return this.on('remoteop', function (op) {
let pos = 0
return (() => {
const result = []
for (const component of Array.from(op)) {
if (typeof component === 'number') {
result.push((pos += component))
} else if (component.i !== undefined) {
this.emit('insert', pos, component.i)
result.push((pos += component.i.length))
} else {
            // Delete. We don't increment pos, because the position
            // specified is after the delete has happened.
result.push(this.emit('delete', pos, component.d))
}
}
return result
})()
})
},
}

View File

@@ -0,0 +1,400 @@
/* eslint-disable
no-cond-assign,
no-return-assign,
no-undef,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// An alternate composable implementation for text. This is much closer
// to the implementation used by google wave.
//
// Ops are lists of components which iterate over the whole document.
// Components are either:
// A number N: Skip N characters in the original document
// {i:'str'}: Insert 'str' at the current position in the document
// {d:'str'}: Delete 'str', which appears at the current position in the document
//
// Eg: [3, {i:'hi'}, 5, {d:'internet'}]
//
// Snapshots are strings.
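//
// Illustrative example (assumed): applying the op above to a 16-character
// document:
//   apply('abcdefghinternet', [3, {i:'hi'}, 5, {d:'internet'}])
//   // => 'abchidefgh'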
let makeAppend
const p = function () {} // require('util').debug
const i = function () {} // require('util').inspect
const moduleExport =
typeof WEB !== 'undefined' && WEB !== null ? {} : module.exports
moduleExport.name = 'text-composable'
moduleExport.create = () => ''
// -------- Utility methods
const checkOp = function (op) {
if (!Array.isArray(op)) {
throw new Error('Op must be an array of components')
}
let last = null
return (() => {
const result = []
for (const c of Array.from(op)) {
if (typeof c === 'object') {
if (
(c.i == null || !(c.i.length > 0)) &&
(c.d == null || !(c.d.length > 0))
) {
throw new Error(`Invalid op component: ${i(c)}`)
}
} else {
if (typeof c !== 'number') {
throw new Error('Op components must be objects or numbers')
}
if (!(c > 0)) {
throw new Error('Skip components must be a positive number')
}
if (typeof last === 'number') {
throw new Error('Adjacent skip components should be added')
}
}
result.push((last = c))
}
return result
})()
}
// Makes a function for appending components to a given op.
// Exported for the randomOpGenerator.
moduleExport._makeAppend = makeAppend = op =>
function (component) {
if (component === 0 || component.i === '' || component.d === '') {
return
}
if (op.length === 0) {
return op.push(component)
} else if (
typeof component === 'number' &&
typeof op[op.length - 1] === 'number'
) {
return (op[op.length - 1] += component)
} else if (component.i != null && op[op.length - 1].i != null) {
return (op[op.length - 1].i += component.i)
} else if (component.d != null && op[op.length - 1].d != null) {
return (op[op.length - 1].d += component.d)
} else {
return op.push(component)
}
}
// checkOp op
// Makes 2 functions for taking components from the start of an op, and for peeking
// at the next op that could be taken.
const makeTake = function (op) {
// The index of the next component to take
let idx = 0
// The offset into the component
let offset = 0
// Take up to length n from the front of op. If n is null, take the next
// op component. If indivisableField == 'd', delete components won't be separated.
// If indivisableField == 'i', insert components won't be separated.
const take = function (n, indivisableField) {
let c
if (idx === op.length) {
return null
}
// assert.notStrictEqual op.length, i, 'The op is too short to traverse the document'
if (typeof op[idx] === 'number') {
if (n == null || op[idx] - offset <= n) {
c = op[idx] - offset
++idx
offset = 0
return c
} else {
offset += n
return n
}
} else {
// Take from the string
const field = op[idx].i ? 'i' : 'd'
c = {}
if (
n == null ||
op[idx][field].length - offset <= n ||
field === indivisableField
) {
c[field] = op[idx][field].slice(offset)
++idx
offset = 0
} else {
c[field] = op[idx][field].slice(offset, offset + n)
offset += n
}
return c
}
}
const peekType = () => op[idx]
return [take, peekType]
}
// Find and return the length of an op component
const componentLength = function (component) {
if (typeof component === 'number') {
return component
} else if (component.i != null) {
return component.i.length
} else {
return component.d.length
}
}
// Normalize an op, removing all empty skips and empty inserts / deletes. Concatenate
// adjacent inserts and deletes.
moduleExport.normalize = function (op) {
const newOp = []
const append = makeAppend(newOp)
for (const component of Array.from(op)) {
append(component)
}
return newOp
}
// Apply the op to the string. Returns the new string.
moduleExport.apply = function (str, op) {
p(`Applying ${i(op)} to '${str}'`)
if (typeof str !== 'string') {
throw new Error('Snapshot should be a string')
}
checkOp(op)
const newDoc = []
for (const component of Array.from(op)) {
if (typeof component === 'number') {
if (component > str.length) {
throw new Error('The op is too long for this document')
}
newDoc.push(str.slice(0, component))
str = str.slice(component)
} else if (component.i != null) {
newDoc.push(component.i)
} else {
if (component.d !== str.slice(0, component.d.length)) {
throw new Error(
`The deleted text '${
component.d
}' doesn't match the next characters in the document '${str.slice(
0,
component.d.length
)}'`
)
}
str = str.slice(component.d.length)
}
}
if (str !== '') {
throw new Error("The applied op doesn't traverse the entire document")
}
return newDoc.join('')
}
// transform op1 by op2. Return transformed version of op1.
// op1 and op2 are unchanged by transform.
moduleExport.transform = function (op, otherOp, side) {
if (side !== 'left' && side !== 'right') {
    throw new Error(`side (${side}) must be 'left' or 'right'`)
}
checkOp(op)
checkOp(otherOp)
const newOp = []
const append = makeAppend(newOp)
  const [take, peek] = Array.from(makeTake(op))
  let component
  for (component of Array.from(otherOp)) {
let chunk, length
if (typeof component === 'number') {
// Skip
length = component
while (length > 0) {
chunk = take(length, 'i')
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
append(chunk)
if (typeof chunk !== 'object' || chunk.i == null) {
length -= componentLength(chunk)
}
}
} else if (component.i != null) {
// Insert
if (side === 'left') {
// The left insert should go first.
const o = peek()
if (o != null ? o.i : undefined) {
append(take())
}
}
// Otherwise, skip the inserted text.
append(component.i.length)
} else {
// Delete.
// assert.ok component.d
;({ length } = component.d)
while (length > 0) {
chunk = take(length, 'i')
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
if (typeof chunk === 'number') {
length -= chunk
} else if (chunk.i != null) {
append(chunk)
} else {
// assert.ok chunk.d
// The delete is unnecessary now.
length -= chunk.d.length
}
}
}
}
// Append extras from op1
while ((component = take())) {
if ((component != null ? component.i : undefined) == null) {
throw new Error(`Remaining fragments in the op: ${i(component)}`)
}
append(component)
}
return newOp
}
// Compose 2 ops into 1 op.
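// Illustrative example (assumed): the delete in op2 cancels the tail of the
// insert in op1:
//   compose([3, { i: 'ab' }], [4, { d: 'b' }])   // => [3, { i: 'a' }]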
moduleExport.compose = function (op1, op2) {
p(`COMPOSE ${i(op1)} + ${i(op2)}`)
checkOp(op1)
checkOp(op2)
const result = []
const append = makeAppend(result)
  const [take, _] = Array.from(makeTake(op1))
  let component
  for (component of Array.from(op2)) {
let chunk, length
if (typeof component === 'number') {
// Skip
length = component
while (length > 0) {
chunk = take(length, 'd')
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
append(chunk)
if (typeof chunk !== 'object' || chunk.d == null) {
length -= componentLength(chunk)
}
}
} else if (component.i != null) {
// Insert
append({ i: component.i })
} else {
// Delete
let offset = 0
while (offset < component.d.length) {
chunk = take(component.d.length - offset, 'd')
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
        // If it's a delete, append it. If it's a skip, drop it and decrease length. If it's an insert, check the strings match, drop it and decrease length.
if (typeof chunk === 'number') {
append({ d: component.d.slice(offset, offset + chunk) })
offset += chunk
} else if (chunk.i != null) {
if (component.d.slice(offset, offset + chunk.i.length) !== chunk.i) {
throw new Error("The deleted text doesn't match the inserted text")
}
offset += chunk.i.length
// The ops cancel each other out.
} else {
// Delete
append(chunk)
}
}
}
}
// Append extras from op1
while ((component = take())) {
if ((component != null ? component.d : undefined) == null) {
throw new Error(`Trailing stuff in op1 ${i(component)}`)
}
append(component)
}
return result
}
const invertComponent = function (c) {
if (typeof c === 'number') {
return c
} else if (c.i != null) {
return { d: c.i }
} else {
return { i: c.d }
}
}
// Invert an op
moduleExport.invert = function (op) {
const result = []
const append = makeAppend(result)
for (const component of Array.from(op)) {
append(invertComponent(component))
}
return result
}
if (typeof window !== 'undefined' && window !== null) {
if (!window.ot) {
window.ot = {}
}
if (!window.ot.types) {
window.ot.types = {}
}
window.ot.types.text = moduleExport
}

View File

@@ -0,0 +1,133 @@
/* eslint-disable
no-undef,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// Text document API for text-tp2
let type
if (typeof WEB !== 'undefined' && WEB !== null) {
type = exports.types['text-tp2']
} else {
type = require('./text-tp2')
}
const { _takeDoc: takeDoc, _append: append } = type
const appendSkipChars = (op, doc, pos, maxlength) =>
(() => {
const result = []
while (
(maxlength === undefined || maxlength > 0) &&
pos.index < doc.data.length
) {
const part = takeDoc(doc, pos, maxlength, true)
if (maxlength !== undefined && typeof part === 'string') {
maxlength -= part.length
}
result.push(append(op, part.length || part))
}
return result
})()
type.api = {
provides: { text: true },
// The number of characters in the string
getLength() {
return this.snapshot.charLength
},
// Flatten a document into a string
getText() {
const strings = Array.from(this.snapshot.data).filter(
elem => typeof elem === 'string'
)
return strings.join('')
},
insert(pos, text, callback) {
if (pos === undefined) {
pos = 0
}
const op = []
const docPos = { index: 0, offset: 0 }
appendSkipChars(op, this.snapshot, docPos, pos)
append(op, { i: text })
appendSkipChars(op, this.snapshot, docPos)
this.submitOp(op, callback)
return op
},
del(pos, length, callback) {
const op = []
const docPos = { index: 0, offset: 0 }
appendSkipChars(op, this.snapshot, docPos, pos)
while (length > 0) {
const part = takeDoc(this.snapshot, docPos, length, true)
if (typeof part === 'string') {
append(op, { d: part.length })
length -= part.length
} else {
append(op, part)
}
}
appendSkipChars(op, this.snapshot, docPos)
this.submitOp(op, callback)
return op
},
_register() {
    // Interpret received ops + generate more detailed events for them
return this.on('remoteop', function (op, snapshot) {
let textPos = 0
const docPos = { index: 0, offset: 0 }
for (const component of Array.from(op)) {
let part, remainder
if (typeof component === 'number') {
// Skip
remainder = component
while (remainder > 0) {
part = takeDoc(snapshot, docPos, remainder)
if (typeof part === 'string') {
textPos += part.length
}
remainder -= part.length || part
}
} else if (component.i !== undefined) {
// Insert
if (typeof component.i === 'string') {
this.emit('insert', textPos, component.i)
textPos += component.i.length
}
} else {
// Delete
remainder = component.d
while (remainder > 0) {
part = takeDoc(snapshot, docPos, remainder)
if (typeof part === 'string') {
this.emit('delete', textPos, part)
}
remainder -= part.length || part
}
}
}
})
},
}

View File

@@ -0,0 +1,499 @@
/* eslint-disable
no-cond-assign,
no-return-assign,
no-undef,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS103: Rewrite code to no longer use __guard__
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// A TP2 implementation of text, following this spec:
// http://code.google.com/p/lightwave/source/browse/trunk/experimental/ot/README
//
// A document is made up of a string and a set of tombstones inserted throughout
// the string. For example, 'some ', (2 tombstones), 'string'.
//
// This is encoded in a document as: {s:'some string', t:[5, -2, 6]}
//
// Ops are lists of components which iterate over the whole document.
// Components are either:
// N: Skip N characters in the original document
// {i:'str'}: Insert 'str' at the current position in the document
// {i:N}: Insert N tombstones at the current position in the document
// {d:N}: Delete (tombstone) N characters at the current position in the document
//
// Eg: [3, {i:'hi'}, 5, {d:8}]
//
// Snapshots are lists with characters and tombstones. Characters are stored in strings
// and adjacent tombstones are flattened into numbers.
//
// Eg, the document: 'Hello .....world' ('.' denotes tombstoned (deleted) characters)
// would be represented by a document snapshot of ['Hello ', 5, 'world']
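//
// Illustrative example (assumed): applying [3, {i:'hi'}, 5, {d:8}] to that
// snapshot skips 'Hel', inserts 'hi', skips 'lo ' plus two tombstones, then
// tombstones the remaining eight positions (three tombstones plus 'world'),
// yielding ['Helhilo ', 10].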
let append, appendDoc, takeDoc
const type = {
name: 'text-tp2',
tp2: true,
create() {
return { charLength: 0, totalLength: 0, positionCache: [], data: [] }
},
serialize(doc) {
if (!doc.data) {
throw new Error('invalid doc snapshot')
}
return doc.data
},
deserialize(data) {
const doc = type.create()
doc.data = data
for (const component of Array.from(data)) {
if (typeof component === 'string') {
doc.charLength += component.length
doc.totalLength += component.length
} else {
doc.totalLength += component
}
}
return doc
},
}
const checkOp = function (op) {
if (!Array.isArray(op)) {
throw new Error('Op must be an array of components')
}
let last = null
return (() => {
const result = []
for (const c of Array.from(op)) {
if (typeof c === 'object') {
if (c.i !== undefined) {
if (
(typeof c.i !== 'string' || !(c.i.length > 0)) &&
(typeof c.i !== 'number' || !(c.i > 0))
) {
throw new Error('Inserts must insert a string or a +ive number')
}
} else if (c.d !== undefined) {
if (typeof c.d !== 'number' || !(c.d > 0)) {
throw new Error('Deletes must be a +ive number')
}
} else {
throw new Error('Operation component must define .i or .d')
}
} else {
if (typeof c !== 'number') {
throw new Error('Op components must be objects or numbers')
}
if (!(c > 0)) {
throw new Error('Skip components must be a positive number')
}
if (typeof last === 'number') {
throw new Error('Adjacent skip components should be combined')
}
}
result.push((last = c))
}
return result
})()
}
// Take the next part from the specified position in a document snapshot.
// position = {index, offset}. It will be updated.
type._takeDoc = takeDoc = function (
doc,
position,
maxlength,
tombsIndivisible
) {
if (position.index >= doc.data.length) {
throw new Error('Operation goes past the end of the document')
}
const part = doc.data[position.index]
// peel off a chunk of data[position.index]
const result =
typeof part === 'string'
? maxlength !== undefined
? part.slice(position.offset, position.offset + maxlength)
: part.slice(position.offset)
: maxlength === undefined || tombsIndivisible
? part - position.offset
: Math.min(maxlength, part - position.offset)
const resultLen = result.length || result
if ((part.length || part) - position.offset > resultLen) {
position.offset += resultLen
} else {
position.index++
position.offset = 0
}
return result
}
// Append a part to the end of a document
type._appendDoc = appendDoc = function (doc, p) {
if (p === 0 || p === '') {
return
}
if (typeof p === 'string') {
doc.charLength += p.length
doc.totalLength += p.length
} else {
doc.totalLength += p
}
const { data } = doc
if (data.length === 0) {
data.push(p)
} else if (typeof data[data.length - 1] === typeof p) {
data[data.length - 1] += p
} else {
data.push(p)
}
}
// Apply the op to the document. The document is not modified in the process.
type.apply = function (doc, op) {
if (
doc.totalLength === undefined ||
doc.charLength === undefined ||
doc.data.length === undefined
) {
throw new Error('Snapshot is invalid')
}
checkOp(op)
const newDoc = type.create()
const position = { index: 0, offset: 0 }
for (const component of Array.from(op)) {
let part, remainder
if (typeof component === 'number') {
remainder = component
while (remainder > 0) {
part = takeDoc(doc, position, remainder)
appendDoc(newDoc, part)
remainder -= part.length || part
}
} else if (component.i !== undefined) {
appendDoc(newDoc, component.i)
} else if (component.d !== undefined) {
remainder = component.d
while (remainder > 0) {
part = takeDoc(doc, position, remainder)
remainder -= part.length || part
}
appendDoc(newDoc, component.d)
}
}
return newDoc
}
// Append an op component to the end of the specified op.
// Exported for the randomOpGenerator.
type._append = append = function (op, component) {
if (
component === 0 ||
component.i === '' ||
component.i === 0 ||
component.d === 0
) {
return
}
if (op.length === 0) {
return op.push(component)
} else {
const last = op[op.length - 1]
if (typeof component === 'number' && typeof last === 'number') {
return (op[op.length - 1] += component)
} else if (
component.i !== undefined &&
last.i != null &&
typeof last.i === typeof component.i
) {
return (last.i += component.i)
} else if (component.d !== undefined && last.d != null) {
return (last.d += component.d)
} else {
return op.push(component)
}
}
}
// Makes 2 functions for taking components from the start of an op, and for peeking
// at the next op that could be taken.
const makeTake = function (op) {
// The index of the next component to take
let index = 0
// The offset into the component
let offset = 0
// Take up to length maxlength from the op. If maxlength is not defined, there is no max.
// If insertsIndivisible is true, inserts (& insert tombstones) won't be separated.
//
// Returns null when op is fully consumed.
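//
// Eg (illustrative): for the op [5, {i:'ab'}], take(3) returns the skip 3,
// the next take() returns the remaining skip 2, the one after returns
// {i:'ab'}, and a final take() returns null.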
const take = function (maxlength, insertsIndivisible) {
let current
if (index === op.length) {
return null
}
const e = op[index]
if (
typeof (current = e) === 'number' ||
typeof (current = e.i) === 'number' ||
(current = e.d) !== undefined
) {
let c
if (
maxlength == null ||
current - offset <= maxlength ||
(insertsIndivisible && e.i !== undefined)
) {
// Return the rest of the current element.
c = current - offset
++index
offset = 0
} else {
offset += maxlength
c = maxlength
}
if (e.i !== undefined) {
return { i: c }
} else if (e.d !== undefined) {
return { d: c }
} else {
return c
}
} else {
// Take from the inserted string
let result
if (
maxlength == null ||
e.i.length - offset <= maxlength ||
insertsIndivisible
) {
result = { i: e.i.slice(offset) }
++index
offset = 0
} else {
result = { i: e.i.slice(offset, offset + maxlength) }
offset += maxlength
}
return result
}
}
const peekType = () => op[index]
return [take, peekType]
}
// Find and return the length of an op component
const componentLength = function (component) {
if (typeof component === 'number') {
return component
} else if (typeof component.i === 'string') {
return component.i.length
} else {
// This should work because c.d and c.i must be +ive.
return component.d || component.i
}
}
// Normalize an op, removing all empty skips and empty inserts / deletes. Concatenate
// adjacent inserts and deletes.
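// Eg (illustrative): normalize([2, 3, {i:''}, {i:'ab'}]) returns [5, {i:'ab'}].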
type.normalize = function (op) {
const newOp = []
for (const component of Array.from(op)) {
append(newOp, component)
}
return newOp
}
// This is a helper method to transform and prune. goForwards is true for transform, false for prune.
const transformer = function (op, otherOp, goForwards, side) {
let component
checkOp(op)
checkOp(otherOp)
const newOp = []
const [take, peek] = Array.from(makeTake(op))
for (component of Array.from(otherOp)) {
let chunk
let length = componentLength(component)
if (component.i !== undefined) {
// Insert text or tombs
if (goForwards) {
// transform - insert skips over inserted parts
if (side === 'left') {
// The left insert should go first.
while (__guard__(peek(), x => x.i) !== undefined) {
append(newOp, take())
}
}
// In any case, skip the inserted text.
append(newOp, length)
} else {
// Prune. Remove skips for inserts.
while (length > 0) {
chunk = take(length, true)
if (chunk === null) {
throw new Error('The transformed op is invalid')
}
if (chunk.d !== undefined) {
throw new Error(
'The transformed op deletes locally inserted characters - it cannot be purged of the insert.'
)
}
if (typeof chunk === 'number') {
length -= chunk
} else {
append(newOp, chunk)
}
}
}
} else {
// Skip or delete
while (length > 0) {
chunk = take(length, true)
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
append(newOp, chunk)
if (!chunk.i) {
length -= componentLength(chunk)
}
}
}
}
// Append extras from op1
while ((component = take())) {
if (component.i === undefined) {
throw new Error(`Remaining fragments in the op: ${component}`)
}
append(newOp, component)
}
return newOp
}
// transform op1 by op2. Return transformed version of op1.
// op1 and op2 are unchanged by transform.
// side should be 'left' or 'right', depending on how op1.id compares to op2.id. 'left' == client op.
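// Eg (illustrative): transform([{i:'a'}], [{i:'b'}], 'left') yields
// [{i:'a'}, 1] (the left insert goes first), while side 'right' yields
// [1, {i:'a'}].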
type.transform = function (op, otherOp, side) {
if (side !== 'left' && side !== 'right') {
throw new Error(`side (${side}) should be 'left' or 'right'`)
}
return transformer(op, otherOp, true, side)
}
// Prune is the inverse of transform.
type.prune = (op, otherOp) => transformer(op, otherOp, false)
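// Eg (illustrative): prune(type.transform(op, otherOp, 'left'), otherOp)
// should recover the original op.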
// Compose 2 ops into 1 op.
type.compose = function (op1, op2) {
let component
if (op1 === null || op1 === undefined) {
return op2
}
checkOp(op1)
checkOp(op2)
const result = []
const [take, _] = Array.from(makeTake(op1))
for (component of Array.from(op2)) {
let chunk, length
if (typeof component === 'number') {
// Skip
// Just copy from op1.
length = component
while (length > 0) {
chunk = take(length)
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
append(result, chunk)
length -= componentLength(chunk)
}
} else if (component.i !== undefined) {
// Insert
append(result, { i: component.i })
} else {
// Delete
length = component.d
while (length > 0) {
chunk = take(length)
if (chunk === null) {
throw new Error(
'The op traverses more elements than the document has'
)
}
const chunkLength = componentLength(chunk)
if (chunk.i !== undefined) {
append(result, { i: chunkLength })
} else {
append(result, { d: chunkLength })
}
length -= chunkLength
}
}
}
// Append extras from op1
while ((component = take())) {
if (component.i === undefined) {
throw new Error(`Remaining fragments in op1: ${component}`)
}
append(result, component)
}
return result
}
if (typeof WEB !== 'undefined' && WEB !== null) {
exports.types['text-tp2'] = type
} else {
module.exports = type
}
function __guard__(value, transform) {
return typeof value !== 'undefined' && value !== null
? transform(value)
: undefined
}

View File

@@ -0,0 +1,387 @@
/* eslint-disable
no-return-assign,
no-undef,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
// A simple text implementation
//
// Operations are lists of components.
// Each component either inserts or deletes at a specified position in the document.
//
// Components are either:
// {i:'str', p:100}: Insert 'str' at position 100 in the document
// {d:'str', p:100}: Delete 'str' at position 100 in the document
//
// Components in an operation are executed sequentially, so the position of components
// assumes previous components have already executed.
//
// Eg: This op:
// [{i:'abc', p:0}]
// is equivalent to this op:
// [{i:'a', p:0}, {i:'b', p:1}, {i:'c', p:2}]
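//
// Eg (illustrative): transforming [{i:'x', p:5}] against a concurrent
// [{i:'ab', p:2}] shifts the insert to [{i:'x', p:7}].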
// NOTE: The global scope here is shared with other sharejs files when built with closure.
// Be careful what ends up in your namespace.
let append, transformComponent
const text = {}
text.name = 'text'
text.create = () => ''
const strInject = (s1, pos, s2) => s1.slice(0, pos) + s2 + s1.slice(pos)
const checkValidComponent = function (c) {
if (typeof c.p !== 'number') {
throw new Error('component missing position field')
}
const iType = typeof c.i
const dType = typeof c.d
const cType = typeof c.c
if (!((iType === 'string') ^ (dType === 'string') ^ (cType === 'string'))) {
throw new Error('component needs an i, d or c field')
}
if (!(c.p >= 0)) {
throw new Error('position cannot be negative')
}
}
const checkValidOp = function (op) {
for (const c of Array.from(op)) {
checkValidComponent(c)
}
return true
}
text.apply = function (snapshot, op) {
checkValidOp(op)
for (const component of Array.from(op)) {
if (component.i != null) {
snapshot = strInject(snapshot, component.p, component.i)
} else if (component.d != null) {
const deleted = snapshot.slice(
component.p,
component.p + component.d.length
)
if (component.d !== deleted) {
throw new Error(
`Delete component '${component.d}' does not match deleted text '${deleted}'`
)
}
snapshot =
snapshot.slice(0, component.p) +
snapshot.slice(component.p + component.d.length)
} else if (component.c != null) {
const comment = snapshot.slice(
component.p,
component.p + component.c.length
)
if (component.c !== comment) {
throw new Error(
`Comment component '${component.c}' does not match commented text '${comment}'`
)
}
} else {
throw new Error('Unknown op type')
}
}
return snapshot
}
// Exported for use by the random op generator.
//
// For simplicity, this version of append does not compress adjacent inserts and deletes of
// the same text. It would be nice to change that at some stage.
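//
// Eg (illustrative): appending {i:'c', p:1} to an op ending in {i:'ab', p:0}
// merges the two into the single component {i:'acb', p:0}.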
text._append = append = function (newOp, c) {
if (c.i === '' || c.d === '') {
return
}
if (newOp.length === 0) {
return newOp.push(c)
} else {
const last = newOp[newOp.length - 1]
// Compose the insert into the previous insert if possible
if (
last.i != null &&
c.i != null &&
last.p <= c.p &&
c.p <= last.p + last.i.length
) {
return (newOp[newOp.length - 1] = {
i: strInject(last.i, c.p - last.p, c.i),
p: last.p,
})
} else if (
last.d != null &&
c.d != null &&
c.p <= last.p &&
last.p <= c.p + c.d.length
) {
return (newOp[newOp.length - 1] = {
d: strInject(c.d, last.p - c.p, last.d),
p: c.p,
})
} else {
return newOp.push(c)
}
}
}
text.compose = function (op1, op2) {
checkValidOp(op1)
checkValidOp(op2)
const newOp = op1.slice()
for (const c of Array.from(op2)) {
append(newOp, c)
}
return newOp
}
// Attempt to compress the op components together 'as much as possible'.
// This implementation preserves order and preserves create/delete pairs.
text.compress = op => text.compose([], op)
text.normalize = function (op) {
const newOp = []
// Normalize should allow ops which are a single (unwrapped) component:
// {i:'asdf', p:23}.
// There's no good way to test if something is an array:
// http://perfectionkills.com/instanceof-considered-harmful-or-how-to-write-a-robust-isarray/
// so this is probably the least bad solution.
if (op.i != null || op.p != null) {
op = [op]
}
for (const c of Array.from(op)) {
if (c.p == null) {
c.p = 0
}
append(newOp, c)
}
return newOp
}
// This helper method transforms a position by an op component.
//
// If c is an insert, insertAfter specifies whether the transform
// is pushed after the insert (true) or before it (false).
//
// insertAfter is optional for deletes.
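//
// Eg (illustrative): transformPosition(5, {i:'ab', p:5}, true) returns 7,
// while insertAfter === false leaves the position at 5.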
const transformPosition = function (pos, c, insertAfter) {
if (c.i != null) {
if (c.p < pos || (c.p === pos && insertAfter)) {
return pos + c.i.length
} else {
return pos
}
} else if (c.d != null) {
// I think this could also be written as: Math.min(c.p, Math.min(c.p - otherC.p, otherC.d.length))
// but I think it's harder to read that way, and it compiles using ternary operators anyway
// so it's no slower written like this.
if (pos <= c.p) {
return pos
} else if (pos <= c.p + c.d.length) {
return c.p
} else {
return pos - c.d.length
}
} else if (c.c != null) {
return pos
} else {
throw new Error('unknown op type')
}
}
// Helper method to transform a cursor position as a result of an op.
//
// Like transformPosition above, if c is an insert, insertAfter specifies whether the cursor position
// is pushed after an insert (true) or before it (false).
text.transformCursor = function (position, op, side) {
const insertAfter = side === 'right'
for (const c of Array.from(op)) {
position = transformPosition(position, c, insertAfter)
}
return position
}
// Transform an op component by another op component. Asymmetric.
// The result will be appended to destination.
//
// exported for use in JSON type
text._tc = transformComponent = function (dest, c, otherC, side) {
let cIntersect, intersectEnd, intersectStart, newC, otherIntersect
checkValidOp([c])
checkValidOp([otherC])
if (c.i != null) {
append(dest, {
i: c.i,
p: transformPosition(c.p, otherC, side === 'right'),
})
} else if (c.d != null) {
// Delete
if (otherC.i != null) {
// delete vs insert
let s = c.d
if (c.p < otherC.p) {
append(dest, { d: s.slice(0, otherC.p - c.p), p: c.p })
s = s.slice(otherC.p - c.p)
}
if (s !== '') {
append(dest, { d: s, p: c.p + otherC.i.length })
}
} else if (otherC.d != null) {
// Delete vs delete
if (c.p >= otherC.p + otherC.d.length) {
append(dest, { d: c.d, p: c.p - otherC.d.length })
} else if (c.p + c.d.length <= otherC.p) {
append(dest, c)
} else {
// They overlap somewhere.
newC = { d: '', p: c.p }
if (c.p < otherC.p) {
newC.d = c.d.slice(0, otherC.p - c.p)
}
if (c.p + c.d.length > otherC.p + otherC.d.length) {
newC.d += c.d.slice(otherC.p + otherC.d.length - c.p)
}
// This is entirely optional - just for a check that the deleted
// text in the two ops matches
intersectStart = Math.max(c.p, otherC.p)
intersectEnd = Math.min(c.p + c.d.length, otherC.p + otherC.d.length)
cIntersect = c.d.slice(intersectStart - c.p, intersectEnd - c.p)
otherIntersect = otherC.d.slice(
intersectStart - otherC.p,
intersectEnd - otherC.p
)
if (cIntersect !== otherIntersect) {
throw new Error(
'Delete ops delete different text in the same region of the document'
)
}
if (newC.d !== '') {
// This could be rewritten similarly to insert v delete, above.
newC.p = transformPosition(newC.p, otherC)
append(dest, newC)
}
}
} else if (otherC.c != null) {
append(dest, c)
} else {
throw new Error('unknown op type')
}
} else if (c.c != null) {
// Comment
if (otherC.i != null) {
if (c.p < otherC.p && otherC.p < c.p + c.c.length) {
const offset = otherC.p - c.p
const newC = c.c.slice(0, offset) + otherC.i + c.c.slice(offset)
append(dest, { c: newC, p: c.p, t: c.t })
} else {
append(dest, {
c: c.c,
p: transformPosition(c.p, otherC, true),
t: c.t,
})
}
} else if (otherC.d != null) {
if (c.p >= otherC.p + otherC.d.length) {
append(dest, { c: c.c, p: c.p - otherC.d.length, t: c.t })
} else if (c.p + c.c.length <= otherC.p) {
append(dest, c)
} else {
// Delete overlaps comment
// They overlap somewhere.
newC = { c: '', p: c.p, t: c.t }
if (c.p < otherC.p) {
newC.c = c.c.slice(0, otherC.p - c.p)
}
if (c.p + c.c.length > otherC.p + otherC.d.length) {
newC.c += c.c.slice(otherC.p + otherC.d.length - c.p)
}
// This is entirely optional - just for a check that the deleted
// text in the two ops matches
intersectStart = Math.max(c.p, otherC.p)
intersectEnd = Math.min(c.p + c.c.length, otherC.p + otherC.d.length)
cIntersect = c.c.slice(intersectStart - c.p, intersectEnd - c.p)
otherIntersect = otherC.d.slice(
intersectStart - otherC.p,
intersectEnd - otherC.p
)
if (cIntersect !== otherIntersect) {
throw new Error(
'Delete ops delete different text in the same region of the document'
)
}
newC.p = transformPosition(newC.p, otherC)
append(dest, newC)
}
} else if (otherC.c != null) {
append(dest, c)
} else {
throw new Error('unknown op type')
}
}
return dest
}
const invertComponent = function (c) {
if (c.i != null) {
return { d: c.i, p: c.p }
} else {
return { i: c.d, p: c.p }
}
}
// No need to use append for invert, because the components won't be able to
// cancel with one another.
text.invert = op =>
Array.from(op.slice().reverse()).map(c => invertComponent(c))
if (typeof WEB !== 'undefined' && WEB !== null) {
if (!exports.types) {
exports.types = {}
}
// This is kind of awful - come up with a better way to hook this helper code up.
bootstrapTransform(text, transformComponent, checkValidOp, append)
// [] is used to prevent closure from renaming types.text
exports.types.text = text
} else {
module.exports = text
// The text type really shouldn't need this - it should be possible to define
// an efficient transform function by making a sort of transform map and passing each
// op component through it.
require('./helpers').bootstrapTransform(
text,
transformComponent,
checkValidOp,
append
)
}

View File

@@ -0,0 +1,14 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
// This is included at the top of each compiled type file for the web.
/**
@const
@type {boolean}
*/
const WEB = true
const exports = window.sharejs

View File

@@ -0,0 +1,136 @@
import {
TrackingPropsRawData,
ClearTrackingPropsRawData,
} from 'overleaf-editor-core/lib/types'
/**
* An update coming from the editor
*/
export type Update = {
doc: string
op: Op[]
v: number
meta?: {
tc?: boolean
user_id?: string
ts?: number
}
projectHistoryId?: string
}
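// Eg (illustrative values only): an update inserting text at the start of a doc:
// { doc: 'docId', op: [{ i: 'hello', p: 0 }], v: 42, meta: { user_id: 'userId', ts: 1700000000000 } }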
export type Op = InsertOp | DeleteOp | CommentOp | RetainOp
export type InsertOp = {
i: string
p: number
u?: boolean
}
export type RetainOp = {
r: string
p: number
}
export type DeleteOp = {
d: string
p: number
u?: boolean
}
export type CommentOp = {
c: string
p: number
t: string
u?: boolean
// Used by project-history when restoring CommentSnapshots
resolved?: boolean
}
/**
* Ranges record on a document
*/
export type Ranges = {
comments?: Comment[]
changes?: TrackedChange[]
}
export type Comment = {
id: string
op: CommentOp
metadata?: {
user_id: string
ts: string
}
}
export type TrackedChange = {
id: string
op: InsertOp | DeleteOp
metadata: {
user_id: string
ts: string
}
}
/**
* Updates sent to project-history
*/
export type HistoryUpdate = {
op: HistoryOp[]
doc: string
v?: number
meta?: {
ts?: number
pathname?: string
doc_length?: number
history_doc_length?: number
doc_hash?: string
tc?: boolean
user_id?: string
}
projectHistoryId?: string
}
export type HistoryOp =
| HistoryInsertOp
| HistoryDeleteOp
| HistoryCommentOp
| HistoryRetainOp
export type HistoryInsertOp = InsertOp & {
commentIds?: string[]
hpos?: number
trackedDeleteRejection?: boolean
}
export type HistoryRetainOp = RetainOp & {
hpos?: number
tracking?: TrackingPropsRawData | ClearTrackingPropsRawData
}
export type HistoryDeleteOp = DeleteOp & {
hpos?: number
trackedChanges?: HistoryDeleteTrackedChange[]
}
export type HistoryDeleteTrackedChange = {
type: 'insert' | 'delete'
offset: number
length: number
}
export type HistoryCommentOp = CommentOp & {
hpos?: number
hlen?: number
}
export type HistoryRanges = {
comments?: HistoryComment[]
changes?: HistoryTrackedChange[]
}
export type HistoryComment = Comment & { op: HistoryCommentOp }
export type HistoryTrackedChange = TrackedChange & {
op: HistoryInsertOp | HistoryDeleteOp
}

View File

@@ -0,0 +1,78 @@
const text = require('../app/js/sharejs/types/text.js')
const TEST_RUNS = 1_000_000
const MAX_OPS_BATCH_SIZE = 35
const KB = 1000
function runTestCase(testCase, documentSizeBytes) {
const initialText = 'A'.repeat(documentSizeBytes)
console.log(`test: ${testCase.name}`)
console.log(`opsBatchSize\topsPerSecond (${documentSizeBytes / KB}KB doc)`)
for (let i = 1; i <= MAX_OPS_BATCH_SIZE; i++) {
const ops = testCase(documentSizeBytes, i)
let timeTotal = 0
for (let run = 0; run < TEST_RUNS; run++) {
const start = performance.now()
try {
text.apply(initialText, ops)
} catch {
console.error(`test failed: ${testCase.name}, with ops:`)
console.error(ops)
return
}
const done = performance.now()
timeTotal += done - start
}
const opsPerSecond = TEST_RUNS / (timeTotal / 1000)
console.log(`${i}\t${opsPerSecond}`)
}
}
const randomAdditionTestCase = (docSize, opsSize) =>
Array.from({ length: opsSize }, () => ({
p: Math.floor(Math.random() * docSize),
i: 'B',
}))
const sequentialAdditionsTestCase = (docSize, opsSize) =>
Array.from({ length: opsSize }, (_, i) => ({ p: i + docSize, i: 'B' }))
const sequentialAdditionsInMiddleTestCase = (docSize, opsSize) =>
Array.from({ length: opsSize }, (_, i) => ({
p: Math.floor(docSize / 2) + i,
i: 'B',
}))
const randomDeletionTestCase = (docSize, opsSize) =>
Array.from({ length: opsSize }, (_, i) => ({
p: Math.floor(Math.random() * (docSize - 1 - i)),
d: 'A',
}))
const sequentialDeletionTestCase = (docSize, opsSize) =>
Array.from({ length: opsSize }, (_, i) => ({
p: docSize - 1 - i,
d: 'A',
}))
const sequentialDeletionInMiddleTestCase = (docSize, opsSize) =>
Array.from({ length: opsSize }, (_, i) => ({
p: Math.floor(docSize / 2),
d: 'A',
}))
for (const docSize of [10 * KB, 100 * KB]) {
for (const testCase of [
randomAdditionTestCase,
sequentialAdditionsTestCase,
sequentialAdditionsInMiddleTestCase,
randomDeletionTestCase,
sequentialDeletionTestCase,
sequentialDeletionInMiddleTestCase,
]) {
runTestCase(testCase, docSize)
}
}

View File

@@ -0,0 +1,188 @@
require "benchmark"
require "redis"
N = (ARGV.first || 1).to_i
DOC_ID = (ARGV.last || "606072b20bb4d3109fb5b122")
@r = Redis.new
def get
@r.get("doclines:{#{DOC_ID}}")
@r.get("DocVersion:{#{DOC_ID}}")
@r.get("DocHash:{#{DOC_ID}}")
@r.get("ProjectId:{#{DOC_ID}}")
@r.get("Ranges:{#{DOC_ID}}")
@r.get("Pathname:{#{DOC_ID}}")
@r.get("ProjectHistoryId:{#{DOC_ID}}")
@r.get("UnflushedTime:{#{DOC_ID}}")
@r.get("lastUpdatedAt:{#{DOC_ID}}")
@r.get("lastUpdatedBy:{#{DOC_ID}}")
end
def mget
@r.mget(
"doclines:{#{DOC_ID}}",
"DocVersion:{#{DOC_ID}}",
"DocHash:{#{DOC_ID}}",
"ProjectId:{#{DOC_ID}}",
"Ranges:{#{DOC_ID}}",
"Pathname:{#{DOC_ID}}",
"ProjectHistoryId:{#{DOC_ID}}",
"UnflushedTime:{#{DOC_ID}}",
"lastUpdatedAt:{#{DOC_ID}}",
"lastUpdatedBy:{#{DOC_ID}}",
)
end
def set
@r.set("doclines:{#{DOC_ID}}", "[\"@book{adams1995hitchhiker,\",\" title={The Hitchhiker's Guide to the Galaxy},\",\" author={Adams, D.},\",\" isbn={9781417642595},\",\" url={http://books.google.com/books?id=W-xMPgAACAAJ},\",\" year={1995},\",\" publisher={San Val}\",\"}\",\"\"]")
@r.set("DocVersion:{#{DOC_ID}}", "0")
@r.set("DocHash:{#{DOC_ID}}", "0075bb0629c6c13d0d68918443648bbfe7d98869")
@r.set("ProjectId:{#{DOC_ID}}", "606072b20bb4d3109fb5b11e")
@r.set("Ranges:{#{DOC_ID}}", "")
@r.set("Pathname:{#{DOC_ID}}", "/references.bib")
@r.set("ProjectHistoryId:{#{DOC_ID}}", "")
@r.set("UnflushedTime:{#{DOC_ID}}", "")
@r.set("lastUpdatedAt:{#{DOC_ID}}", "")
@r.set("lastUpdatedBy:{#{DOC_ID}}", "")
end
def mset
@r.mset(
"doclines:{#{DOC_ID}}", "[\"@book{adams1995hitchhiker,\",\" title={The Hitchhiker's Guide to the Galaxy},\",\" author={Adams, D.},\",\" isbn={9781417642595},\",\" url={http://books.google.com/books?id=W-xMPgAACAAJ},\",\" year={1995},\",\" publisher={San Val}\",\"}\",\"\"]",
"DocVersion:{#{DOC_ID}}", "0",
"DocHash:{#{DOC_ID}}", "0075bb0629c6c13d0d68918443648bbfe7d98869",
"ProjectId:{#{DOC_ID}}", "606072b20bb4d3109fb5b11e",
"Ranges:{#{DOC_ID}}", "",
"Pathname:{#{DOC_ID}}", "/references.bib",
"ProjectHistoryId:{#{DOC_ID}}", "",
"UnflushedTime:{#{DOC_ID}}", "",
"lastUpdatedAt:{#{DOC_ID}}", "",
"lastUpdatedBy:{#{DOC_ID}}", "",
)
end
def benchmark_multi_get(benchmark, i)
benchmark.report("#{i}: multi get") do
N.times do
@r.multi do
get
end
end
end
end
def benchmark_mget(benchmark, i)
benchmark.report("#{i}: mget") do
N.times do
mget
end
end
end
def benchmark_multi_set(benchmark, i)
benchmark.report("#{i}: multi set") do
N.times do
@r.multi do
set
end
end
end
end
def benchmark_mset(benchmark, i)
benchmark.report("#{i}: mset") do
N.times do
mset
end
end
end
# init
set
Benchmark.bmbm do |benchmark|
3.times do |i|
benchmark_multi_get(benchmark, i)
benchmark_mget(benchmark, i)
benchmark_multi_set(benchmark, i)
benchmark_mset(benchmark, i)
end
end
=begin
# Results
I could not max out the redis-server process with this benchmark.
The ruby process hit 100% of a modern i7 CPU thread and the redis-server process
barely hit 50% of a CPU thread.
Based on the timings below, mget is about 3 times faster and mset about 4 times
faster than multiple get/set commands in a multi.
=end
=begin
$ redis-server --version
Redis server v=5.0.7 sha=00000000:0 malloc=jemalloc-5.2.1 bits=64 build=636cde3b5c7a3923
$ ruby multi_vs_mget_mset.rb 100000
Rehearsal ------------------------------------------------
0: multi get 12.132423 4.246689 16.379112 ( 16.420069)
0: mget 4.499457 0.947556 5.447013 ( 6.274883)
0: multi set 12.685936 4.495241 17.181177 ( 17.225984)
0: mset 2.543401 0.913448 3.456849 ( 4.554799)
1: multi get 13.397207 4.581881 17.979088 ( 18.027755)
1: mget 4.551287 1.160531 5.711818 ( 6.579168)
1: multi set 13.018957 4.927175 17.946132 ( 17.987502)
1: mset 2.561096 1.048416 3.609512 ( 4.780087)
2: multi get 13.224422 5.014475 18.238897 ( 18.284152)
2: mget 4.664434 1.051083 5.715517 ( 6.592088)
2: multi set 12.972284 4.600422 17.572706 ( 17.613185)
2: mset 2.621344 0.984123 3.605467 ( 4.766855)
------------------------------------- total: 132.843288sec
user system total real
0: multi get 13.341552 4.900892 18.242444 ( 18.289912)
0: mget 5.056534 0.960954 6.017488 ( 6.971189)
0: multi set 12.989880 4.823793 17.813673 ( 17.858393)
0: mset 2.543434 1.025352 3.568786 ( 4.723040)
1: multi get 13.059379 4.674345 17.733724 ( 17.777859)
1: mget 4.698754 0.915637 5.614391 ( 6.489614)
1: multi set 12.608293 4.729163 17.337456 ( 17.372993)
1: mset 2.645290 0.940584 3.585874 ( 4.744134)
2: multi get 13.678224 4.732373 18.410597 ( 18.457525)
2: mget 4.716749 1.072064 5.788813 ( 6.697683)
2: multi set 13.058710 4.889801 17.948511 ( 17.988742)
2: mset 2.311854 0.989166 3.301020 ( 4.346467)
=end
=begin
# multi get/set run at about O(65'000) operations per second
$ redis-cli info | grep 'instantaneous_ops_per_sec'
instantaneous_ops_per_sec:65557
# mget runs at about O(15'000) operations per second
$ redis-cli info | grep 'instantaneous_ops_per_sec'
instantaneous_ops_per_sec:14580
# mset runs at about O(20'000) operations per second
$ redis-cli info | grep 'instantaneous_ops_per_sec'
instantaneous_ops_per_sec:20792
These numbers are pretty reasonable:
multi: 100'000 * 12 ops / 18s = 66'666 ops/s
mget : 100'000 * 1 ops / 7s = 14'285 ops/s
mset : 100'000 * 1 ops / 5s = 20'000 ops/s
Bonus: Running three benchmarks in parallel on different keys.
multi get: O(125'000) ops/s and 80% CPU load of redis-server
multi set: O(130'000) ops/s and 90% CPU load of redis-server
mget : O( 30'000) ops/s and 70% CPU load of redis-server
mset : O( 40'000) ops/s and 90% CPU load of redis-server
=end

View File

@@ -0,0 +1,9 @@
document-updater
--dependencies=mongo,redis
--docker-repos=us-east1-docker.pkg.dev/overleaf-ops/ol-docker
--env-add=
--env-pass-through=
--esmock-loader=False
--node-version=20.18.2
--public-repo=True
--script-version=4.7.0

View File

@@ -0,0 +1,187 @@
const http = require('node:http')
const https = require('node:https')
http.globalAgent.keepAlive = false
https.globalAgent.keepAlive = false
module.exports = {
internal: {
documentupdater: {
host: process.env.LISTEN_ADDRESS || '127.0.0.1',
port: 3003,
},
},
apis: {
web: {
url: `http://${
process.env.WEB_API_HOST || process.env.WEB_HOST || '127.0.0.1'
}:${process.env.WEB_API_PORT || process.env.WEB_PORT || 3000}`,
user: process.env.WEB_API_USER || 'overleaf',
pass: process.env.WEB_API_PASSWORD || 'password',
},
project_history: {
url: `http://${process.env.PROJECT_HISTORY_HOST || '127.0.0.1'}:3054`,
},
},
redis: {
pubsub: {
host:
process.env.PUBSUB_REDIS_HOST || process.env.REDIS_HOST || '127.0.0.1',
port: process.env.PUBSUB_REDIS_PORT || process.env.REDIS_PORT || '6379',
password:
process.env.PUBSUB_REDIS_PASSWORD || process.env.REDIS_PASSWORD || '',
maxRetriesPerRequest: parseInt(
process.env.REDIS_MAX_RETRIES_PER_REQUEST || '20'
),
},
history: {
port: process.env.HISTORY_REDIS_PORT || process.env.REDIS_PORT || '6379',
host:
process.env.HISTORY_REDIS_HOST || process.env.REDIS_HOST || '127.0.0.1',
password:
process.env.HISTORY_REDIS_PASSWORD || process.env.REDIS_PASSWORD || '',
maxRetriesPerRequest: parseInt(
process.env.REDIS_MAX_RETRIES_PER_REQUEST || '20'
),
},
project_history: {
port: process.env.HISTORY_REDIS_PORT || process.env.REDIS_PORT || '6379',
host:
process.env.HISTORY_REDIS_HOST || process.env.REDIS_HOST || '127.0.0.1',
password:
process.env.HISTORY_REDIS_PASSWORD || process.env.REDIS_PASSWORD || '',
maxRetriesPerRequest: parseInt(
process.env.REDIS_MAX_RETRIES_PER_REQUEST || '20'
),
key_schema: {
projectHistoryOps({ project_id: projectId }) {
return `ProjectHistory:Ops:{${projectId}}`
},
projectHistoryFirstOpTimestamp({ project_id: projectId }) {
return `ProjectHistory:FirstOpTimestamp:{${projectId}}`
},
},
},
lock: {
port: process.env.LOCK_REDIS_PORT || process.env.REDIS_PORT || '6379',
host:
process.env.LOCK_REDIS_HOST || process.env.REDIS_HOST || '127.0.0.1',
password:
process.env.LOCK_REDIS_PASSWORD || process.env.REDIS_PASSWORD || '',
maxRetriesPerRequest: parseInt(
process.env.REDIS_MAX_RETRIES_PER_REQUEST || '20'
),
key_schema: {
blockingKey({ doc_id: docId }) {
return `Blocking:{${docId}}`
},
},
},
documentupdater: {
port:
process.env.DOC_UPDATER_REDIS_PORT || process.env.REDIS_PORT || '6379',
host:
process.env.DOC_UPDATER_REDIS_HOST ||
process.env.REDIS_HOST ||
'127.0.0.1',
password:
process.env.DOC_UPDATER_REDIS_PASSWORD ||
process.env.REDIS_PASSWORD ||
'',
maxRetriesPerRequest: parseInt(
process.env.REDIS_MAX_RETRIES_PER_REQUEST || '20'
),
key_schema: {
blockingKey({ doc_id: docId }) {
return `Blocking:{${docId}}`
},
docLines({ doc_id: docId }) {
return `doclines:{${docId}}`
},
docOps({ doc_id: docId }) {
return `DocOps:{${docId}}`
},
docVersion({ doc_id: docId }) {
return `DocVersion:{${docId}}`
},
docHash({ doc_id: docId }) {
return `DocHash:{${docId}}`
},
projectKey({ doc_id: docId }) {
return `ProjectId:{${docId}}`
},
docsInProject({ project_id: projectId }) {
return `DocsIn:{${projectId}}`
},
ranges({ doc_id: docId }) {
return `Ranges:{${docId}}`
},
unflushedTime({ doc_id: docId }) {
return `UnflushedTime:{${docId}}`
},
pathname({ doc_id: docId }) {
return `Pathname:{${docId}}`
},
projectHistoryId({ doc_id: docId }) {
return `ProjectHistoryId:{${docId}}`
},
projectState({ project_id: projectId }) {
return `ProjectState:{${projectId}}`
},
projectBlock({ project_id: projectId }) {
return `ProjectBlock:{${projectId}}`
},
pendingUpdates({ doc_id: docId }) {
return `PendingUpdates:{${docId}}`
},
lastUpdatedBy({ doc_id: docId }) {
return `lastUpdatedBy:{${docId}}`
},
lastUpdatedAt({ doc_id: docId }) {
return `lastUpdatedAt:{${docId}}`
},
resolvedCommentIds({ doc_id: docId }) {
return `ResolvedCommentIds:{${docId}}`
},
flushAndDeleteQueue() {
return 'DocUpdaterFlushAndDeleteQueue'
},
historyRangesSupport() {
return 'HistoryRangesSupport'
},
},
},
},
max_doc_length: 2 * 1024 * 1024, // 2mb
maxJsonRequestSize:
parseInt(process.env.MAX_JSON_REQUEST_SIZE, 10) || 8 * 1024 * 1024,
dispatcherCount: parseInt(process.env.DISPATCHER_COUNT || '10', 10),
redisLockTTLSeconds: 30,
mongo: {
url:
process.env.MONGO_CONNECTION_STRING ||
`mongodb://${process.env.MONGO_HOST || '127.0.0.1'}/sharelatex`,
options: {
monitorCommands: true,
},
},
publishOnIndividualChannels:
process.env.PUBLISH_ON_INDIVIDUAL_CHANNELS === 'true',
continuousBackgroundFlush: process.env.CONTINUOUS_BACKGROUND_FLUSH === 'true',
smoothingOffset: parseInt(process.env.SMOOTHING_OFFSET || '1000', 10), // milliseconds
gracefulShutdownDelayInMs:
parseInt(process.env.GRACEFUL_SHUTDOWN_DELAY_SECONDS ?? '10', 10) * 1000,
}

View File

@@ -0,0 +1,65 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
version: "2.3"
services:
test_unit:
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
user: node
command: npm run test:unit:_run
environment:
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
test_acceptance:
build: .
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
environment:
ELASTIC_SEARCH_DSN: es:9200
REDIS_HOST: redis
QUEUES_REDIS_HOST: redis
HISTORY_REDIS_HOST: redis
ANALYTICS_QUEUES_REDIS_HOST: redis
MONGO_HOST: mongo
POSTGRES_HOST: postgres
MOCHA_GREP: ${MOCHA_GREP}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
depends_on:
mongo:
condition: service_started
redis:
condition: service_healthy
user: node
command: npm run test:acceptance
tar:
build: .
image: ci/$PROJECT_NAME:$BRANCH_NAME-$BUILD_NUMBER
volumes:
- ./:/tmp/build/
command: tar -czf /tmp/build/build.tar.gz --exclude=build.tar.gz --exclude-vcs .
user: root
redis:
image: redis
healthcheck:
test: ping="$$(redis-cli ping)" && [ "$$ping" = 'PONG' ]
interval: 1s
retries: 20
mongo:
image: mongo:6.0.13
command: --replSet overleaf
volumes:
- ../../bin/shared/mongodb-init-replica-set.js:/docker-entrypoint-initdb.d/mongodb-init-replica-set.js
environment:
MONGO_INITDB_DATABASE: sharelatex
extra_hosts:
# Required when using the automatic database setup for initializing the
# replica set. This override is not needed when running the setup after
# starting up mongo.
- mongo:127.0.0.1

View File

@@ -0,0 +1,69 @@
# This file was auto-generated, do not edit it directly.
# Instead run bin/update_build_scripts from
# https://github.com/overleaf/internal/
version: "2.3"
services:
test_unit:
image: node:20.18.2
volumes:
- .:/overleaf/services/document-updater
- ../../node_modules:/overleaf/node_modules
- ../../libraries:/overleaf/libraries
working_dir: /overleaf/services/document-updater
environment:
MOCHA_GREP: ${MOCHA_GREP}
LOG_LEVEL: ${LOG_LEVEL:-}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
command: npm run --silent test:unit
user: node
test_acceptance:
image: node:20.18.2
volumes:
- .:/overleaf/services/document-updater
- ../../node_modules:/overleaf/node_modules
- ../../libraries:/overleaf/libraries
working_dir: /overleaf/services/document-updater
environment:
ELASTIC_SEARCH_DSN: es:9200
REDIS_HOST: redis
HISTORY_REDIS_HOST: redis
QUEUES_REDIS_HOST: redis
ANALYTICS_QUEUES_REDIS_HOST: redis
MONGO_HOST: mongo
POSTGRES_HOST: postgres
MOCHA_GREP: ${MOCHA_GREP}
LOG_LEVEL: ${LOG_LEVEL:-}
NODE_ENV: test
NODE_OPTIONS: "--unhandled-rejections=strict"
user: node
depends_on:
mongo:
condition: service_started
redis:
condition: service_healthy
command: npm run --silent test:acceptance
redis:
image: redis
healthcheck:
test: ping=$$(redis-cli ping) && [ "$$ping" = 'PONG' ]
interval: 1s
retries: 20
mongo:
image: mongo:6.0.13
command: --replSet overleaf
volumes:
- ../../bin/shared/mongodb-init-replica-set.js:/docker-entrypoint-initdb.d/mongodb-init-replica-set.js
environment:
MONGO_INITDB_DATABASE: sharelatex
extra_hosts:
# Required when using the automatic database setup for initializing the
# replica set. This override is not needed when running the setup after
# starting up mongo.
- mongo:127.0.0.1

View File

@@ -0,0 +1,51 @@
{
"name": "@overleaf/document-updater",
"description": "An API for applying incoming updates to documents in real-time",
"private": true,
"main": "app.js",
"scripts": {
"start": "node app.js",
"test:acceptance:_run": "mocha --recursive --reporter spec --timeout 15000 --exit $@ test/acceptance/js",
"test:acceptance": "npm run test:acceptance:_run -- --grep=$MOCHA_GREP",
"test:unit:_run": "mocha --recursive --reporter spec $@ test/unit/js",
"test:unit": "npm run test:unit:_run -- --grep=$MOCHA_GREP",
"nodemon": "node --watch app.js",
"benchmark:apply": "node benchmarks/apply",
"lint": "eslint --max-warnings 0 --format unix .",
"format": "prettier --list-different $PWD/'**/*.*js'",
"format:fix": "prettier --write $PWD/'**/*.*js'",
"lint:fix": "eslint --fix .",
"types:check": "tsc --noEmit"
},
"dependencies": {
"@overleaf/logger": "*",
"@overleaf/metrics": "*",
"@overleaf/o-error": "*",
"@overleaf/promise-utils": "*",
"@overleaf/ranges-tracker": "*",
"@overleaf/redis-wrapper": "*",
"@overleaf/settings": "*",
"@types/chai-as-promised": "^7.1.8",
"async": "^3.2.5",
"body-parser": "^1.20.3",
"bunyan": "^1.8.15",
"diff-match-patch": "overleaf/diff-match-patch#89805f9c671a77a263fc53461acd62aa7498f688",
"express": "^4.21.2",
"lodash": "^4.17.21",
"minimist": "^1.2.8",
"mongodb-legacy": "6.1.3",
"request": "^2.88.2",
"requestretry": "^7.1.0"
},
"devDependencies": {
"chai": "^4.3.6",
"chai-as-promised": "^7.1.1",
"cluster-key-slot": "^1.0.5",
"mocha": "^11.1.0",
"sandboxed-module": "^2.0.4",
"sinon": "^9.2.4",
"sinon-chai": "^3.7.0",
"timekeeper": "^2.0.0",
"typescript": "^5.0.4"
}
}

View File

@@ -0,0 +1,425 @@
const fs = require('node:fs')
const Path = require('node:path')
const _ = require('lodash')
const logger = require('@overleaf/logger')
const OError = require('@overleaf/o-error')
const Errors = require('../app/js/Errors')
const LockManager = require('../app/js/LockManager')
const PersistenceManager = require('../app/js/PersistenceManager')
const ProjectFlusher = require('../app/js/ProjectFlusher')
const ProjectManager = require('../app/js/ProjectManager')
const RedisManager = require('../app/js/RedisManager')
const Settings = require('@overleaf/settings')
const request = require('requestretry').defaults({
maxAttempts: 2,
retryDelay: 10,
})
const AUTO_FIX_VERSION_MISMATCH =
process.env.AUTO_FIX_VERSION_MISMATCH === 'true'
const AUTO_FIX_PARTIALLY_DELETED_DOC_METADATA =
process.env.AUTO_FIX_PARTIALLY_DELETED_DOC_METADATA === 'true'
const SCRIPT_LOG_LEVEL = process.env.SCRIPT_LOG_LEVEL || 'warn'
const FLUSH_IN_SYNC_PROJECTS = process.env.FLUSH_IN_SYNC_PROJECTS === 'true'
const FOLDER =
process.env.FOLDER || '/tmp/overleaf-check-redis-mongo-sync-state'
const LIMIT = parseInt(process.env.LIMIT || '1000', 10)
const RETRIES = parseInt(process.env.RETRIES || '5', 10)
const WRITE_CONTENT = process.env.WRITE_CONTENT === 'true'
process.env.LOG_LEVEL = SCRIPT_LOG_LEVEL
logger.initialize('check-redis-mongo-sync-state')
const COMPARE_AND_SET =
'if redis.call("get", KEYS[1]) == ARGV[1] then return redis.call("set", KEYS[1], ARGV[2]) else return 0 end'
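// Illustrative usage (matching the eval call below): run with one key, eg
// rclient.eval(COMPARE_AND_SET, 1, docVersionKey, oldVersion, newVersion);
// the SET only happens if the stored value still equals oldVersion.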
/**
* @typedef {Object} Doc
* @property {number} version
* @property {Array<string>} lines
* @property {string} pathname
* @property {Object} ranges
* @property {boolean} [partiallyDeleted]
*/
class TryAgainError extends Error {}
/**
* @param {string} docId
* @param {Doc} redisDoc
* @param {Doc} mongoDoc
* @return {Promise<void>}
*/
async function updateDocVersionInRedis(docId, redisDoc, mongoDoc) {
const lockValue = await LockManager.promises.getLock(docId)
try {
const key = Settings.redis.documentupdater.key_schema.docVersion({
doc_id: docId,
})
const numberOfKeys = 1
const ok = await RedisManager.rclient.eval(
COMPARE_AND_SET,
numberOfKeys,
key,
redisDoc.version,
mongoDoc.version
)
if (!ok) {
throw new TryAgainError(
'document has been updated, aborting overwrite. Try again.'
)
}
} finally {
await LockManager.promises.releaseLock(docId, lockValue)
}
}
async function fixPartiallyDeletedDocMetadata(projectId, docId, pathname) {
await new Promise((resolve, reject) => {
request(
{
method: 'PATCH',
url: `http://${process.env.DOCSTORE_HOST || '127.0.0.1'}:3016/project/${projectId}/doc/${docId}`,
timeout: 60 * 1000,
json: {
name: Path.basename(pathname),
deleted: true,
deletedAt: new Date(),
},
},
(err, res, body) => {
if (err) return reject(err)
const { statusCode } = res
if (statusCode !== 204) {
return reject(
new OError('patch request to docstore failed', {
statusCode,
body,
})
)
}
resolve()
}
)
})
}
async function getDocFromMongo(projectId, docId) {
try {
return await PersistenceManager.promises.getDoc(projectId, docId)
} catch (err) {
if (!(err instanceof Errors.NotFoundError)) {
throw err
}
}
const docstoreDoc = await new Promise((resolve, reject) => {
request(
{
url: `http://${process.env.DOCSTORE_HOST || '127.0.0.1'}:3016/project/${projectId}/doc/${docId}/peek`,
timeout: 60 * 1000,
json: true,
},
(err, res, body) => {
if (err) return reject(err)
const { statusCode } = res
if (statusCode !== 200) {
return reject(
new OError('fallback request to docstore failed', {
statusCode,
body,
})
)
}
resolve(body)
}
)
})
const deletedDocName = await new Promise((resolve, reject) => {
request(
{
url: `http://${process.env.DOCSTORE_HOST || '127.0.0.1'}:3016/project/${projectId}/doc-deleted`,
timeout: 60 * 1000,
json: true,
},
(err, res, body) => {
if (err) return reject(err)
const { statusCode } = res
if (statusCode !== 200) {
return reject(
new OError('list deleted docs request to docstore failed', {
statusCode,
body,
})
)
}
resolve(body.find(doc => doc._id === docId)?.name)
}
)
})
if (docstoreDoc.deleted && deletedDocName) {
return {
...docstoreDoc,
pathname: deletedDocName,
}
}
return {
...docstoreDoc,
pathname: `/partially-deleted-doc-with-unknown-name-and-id-${docId}.txt`,
partiallyDeleted: true,
}
}
/**
* @param {string} projectId
* @param {string} docId
* @return {Promise<boolean>}
*/
async function processDoc(projectId, docId) {
const redisDoc = /** @type Doc */ await RedisManager.promises.getDoc(
projectId,
docId
)
const mongoDoc = /** @type Doc */ await getDocFromMongo(projectId, docId)
if (mongoDoc.partiallyDeleted) {
if (AUTO_FIX_PARTIALLY_DELETED_DOC_METADATA) {
console.log(
`Found partially deleted doc ${docId} in project ${projectId}: fixing metadata`
)
await fixPartiallyDeletedDocMetadata(projectId, docId, redisDoc.pathname)
} else {
console.log(
`Found partially deleted doc ${docId} in project ${projectId}: use AUTO_FIX_PARTIALLY_DELETED_DOC_METADATA=true to fix metadata`
)
}
}
if (mongoDoc.version < redisDoc.version) {
// mongo is behind, we can flush to mongo when all docs are processed.
return false
}
mongoDoc.snapshot = mongoDoc.lines.join('\n')
redisDoc.snapshot = redisDoc.lines.join('\n')
if (!mongoDoc.ranges) mongoDoc.ranges = {}
if (!redisDoc.ranges) redisDoc.ranges = {}
const sameLines = mongoDoc.snapshot === redisDoc.snapshot
const sameRanges = _.isEqual(mongoDoc.ranges, redisDoc.ranges)
if (sameLines && sameRanges) {
if (mongoDoc.version > redisDoc.version) {
// mongo is ahead, technically out of sync, but practically the content is identical
if (AUTO_FIX_VERSION_MISMATCH) {
console.log(
`Fixing out of sync doc version for doc ${docId} in project ${projectId}: mongo=${mongoDoc.version} > redis=${redisDoc.version}`
)
await updateDocVersionInRedis(docId, redisDoc, mongoDoc)
return false
} else {
console.error(
`Detected out of sync redis and mongo version for doc ${docId} in project ${projectId}, auto-fixable via AUTO_FIX_VERSION_MISMATCH=true`
)
return true
}
} else {
// same lines, same ranges, same version
return false
}
}
const dir = Path.join(FOLDER, projectId, docId)
console.error(
`Detected out of sync redis and mongo content for doc ${docId} in project ${projectId}`
)
if (!WRITE_CONTENT) return true
console.log(`pathname: ${mongoDoc.pathname}`)
if (mongoDoc.pathname !== redisDoc.pathname) {
console.log(`pathname redis: ${redisDoc.pathname}`)
}
console.log(`mongo version: ${mongoDoc.version}`)
console.log(`redis version: ${redisDoc.version}`)
await fs.promises.mkdir(dir, { recursive: true })
if (sameLines) {
console.log('mongo lines match redis lines')
} else {
console.log(
`mongo lines and redis lines out of sync, writing content into ${dir}`
)
await fs.promises.writeFile(
Path.join(dir, 'mongo-snapshot.txt'),
mongoDoc.snapshot
)
await fs.promises.writeFile(
Path.join(dir, 'redis-snapshot.txt'),
redisDoc.snapshot
)
}
if (sameRanges) {
console.log('mongo ranges match redis ranges')
} else {
console.log(
`mongo ranges and redis ranges out of sync, writing content into ${dir}`
)
await fs.promises.writeFile(
Path.join(dir, 'mongo-ranges.json'),
JSON.stringify(mongoDoc.ranges)
)
await fs.promises.writeFile(
Path.join(dir, 'redis-ranges.json'),
JSON.stringify(redisDoc.ranges)
)
}
console.log('---')
return true
}
/**
* @param {string} projectId
* @return {Promise<number>}
*/
async function processProject(projectId) {
const docIds = await RedisManager.promises.getDocIdsInProject(projectId)
let outOfSync = 0
for (const docId of docIds) {
let lastErr
for (let i = 0; i <= RETRIES; i++) {
try {
if (await processDoc(projectId, docId)) {
outOfSync++
}
break
} catch (err) {
lastErr = err
}
}
if (lastErr) {
throw OError.tag(lastErr, 'process doc', { docId })
}
}
if (outOfSync === 0 && FLUSH_IN_SYNC_PROJECTS) {
try {
await ProjectManager.promises.flushAndDeleteProjectWithLocks(
projectId,
{}
)
} catch (err) {
throw OError.tag(err, 'flush project with only in-sync docs')
}
}
return outOfSync
}
/**
* @param {Set<string>} processed
* @param {Set<string>} outOfSync
* @return {Promise<{perIterationOutOfSync: number, done: boolean}>}
*/
async function scanOnce(processed, outOfSync) {
const projectIds = await ProjectFlusher.promises.flushAllProjects({
limit: LIMIT,
dryRun: true,
})
let perIterationOutOfSync = 0
for (const projectId of projectIds) {
if (processed.has(projectId)) continue
processed.add(projectId)
let perProjectOutOfSync = 0
try {
perProjectOutOfSync = await processProject(projectId)
} catch (err) {
throw OError.tag(err, 'process project', { projectId })
}
perIterationOutOfSync += perProjectOutOfSync
if (perProjectOutOfSync > 0) {
outOfSync.add(projectId)
}
}
return { perIterationOutOfSync, done: projectIds.length < LIMIT }
}
/**
* @return {Promise<number>}
*/
async function main() {
if (!WRITE_CONTENT) {
console.warn()
console.warn(
` Use WRITE_CONTENT=true to write the content of out of sync docs to FOLDER=${FOLDER}`
)
console.warn()
} else {
console.log(
`Writing content for projects with out of sync docs into FOLDER=${FOLDER}`
)
await fs.promises.mkdir(FOLDER, { recursive: true })
const existing = await fs.promises.readdir(FOLDER)
if (existing.length > 0) {
console.warn()
console.warn(
` Found existing entries in FOLDER=${FOLDER}. Please delete or move these before running the script again.`
)
console.warn()
return 101
}
}
if (LIMIT < 100) {
console.warn()
console.warn(
` Using small LIMIT=${LIMIT}, this can take a while to SCAN in a large redis database.`
)
console.warn()
}
const processed = new Set()
const outOfSyncProjects = new Set()
let totalOutOfSyncDocs = 0
while (true) {
const before = processed.size
const { perIterationOutOfSync, done } = await scanOnce(
processed,
outOfSyncProjects
)
totalOutOfSyncDocs += perIterationOutOfSync
console.log(`Processed ${processed.size} projects`)
console.log(
`Found ${
outOfSyncProjects.size
} projects with ${totalOutOfSyncDocs} out of sync docs: ${JSON.stringify(
Array.from(outOfSyncProjects)
)}`
)
if (done) {
console.log('Finished iterating all projects in redis')
break
}
if (processed.size === before) {
console.error(
`Found too many un-flushed projects (LIMIT=${LIMIT}). Please fix the reported projects first, then try again.`
)
if (!FLUSH_IN_SYNC_PROJECTS) {
console.error(
'Use FLUSH_IN_SYNC_PROJECTS=true to flush projects that have been checked.'
)
}
return 2
}
}
return totalOutOfSyncDocs > 0 ? 1 : 0
}
main()
.then(code => {
process.exit(code)
})
.catch(error => {
console.error(OError.getFullStack(error))
console.error(OError.getFullInfo(error))
process.exit(1)
})

View File

@@ -0,0 +1,65 @@
const Settings = require('@overleaf/settings')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const keys = Settings.redis.documentupdater.key_schema
const async = require('async')
const RedisManager = require('../app/js/RedisManager')
const getKeysFromNode = function (node, pattern, callback) {
let cursor = 0 // redis iterator
const keySet = {} // use hash to avoid duplicate results
// scan over all keys looking for pattern
const doIteration = () =>
node.scan(cursor, 'MATCH', pattern, 'COUNT', 1000, function (error, reply) {
if (error) {
return callback(error)
}
let scannedKeys
;[cursor, scannedKeys] = reply
console.log('SCAN', scannedKeys.length)
for (const key of scannedKeys) {
keySet[key] = true
}
if (cursor === '0') {
// note redis returns string result not numeric
return callback(null, Object.keys(keySet))
} else {
return doIteration()
}
})
return doIteration()
}
const getKeys = function (pattern, callback) {
const nodes = (typeof rclient.nodes === 'function'
? rclient.nodes('master')
: undefined) || [rclient]
console.log('GOT NODES', nodes.length)
const doKeyLookupForNode = (node, cb) => getKeysFromNode(node, pattern, cb)
return async.concatSeries(nodes, doKeyLookupForNode, callback)
}
const expireDocOps = callback =>
getKeys(keys.docOps({ doc_id: '*' }), (error, keys) => {
if (error) return callback(error)
async.mapSeries(
keys,
function (key, cb) {
console.log(`EXPIRE ${key} ${RedisManager.DOC_OPS_TTL}`)
return rclient.expire(key, RedisManager.DOC_OPS_TTL, cb)
},
callback
)
})
setTimeout(
() =>
// Give redis a chance to connect
expireDocOps(function (error) {
if (error) {
throw error
}
return process.exit()
}),
1000
)

View File

@@ -0,0 +1,79 @@
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const keys = Settings.redis.documentupdater.key_schema
const ProjectFlusher = require('../app/js/ProjectFlusher')
const DocumentManager = require('../app/js/DocumentManager')
const util = require('node:util')
const flushAndDeleteDocWithLock = util.promisify(
DocumentManager.flushAndDeleteDocWithLock
)
async function flushAndDeleteDocs(dockeys, options) {
const docIds = ProjectFlusher._extractIds(dockeys)
for (const docId of docIds) {
const pathname = await rclient.get(keys.pathname({ doc_id: docId }))
if (!pathname) {
const projectId = await rclient.get(keys.projectKey({ doc_id: docId }))
if (!projectId) {
// await deleteDanglingDoc(projectId, docId, pathname, options)
logger.info(
{ projectId, docId, pathname },
'skipping doc with empty pathname and project id'
)
} else {
await flushAndDeleteDoc(projectId, docId, pathname, options)
}
}
}
}
async function flushAndDeleteDoc(projectId, docId, pathname, options) {
if (options.dryRun) {
logger.info(
{ projectId, docId, pathname },
'dry run mode - would flush doc with empty pathname'
)
return
}
logger.info(
{ projectId, docId, pathname },
'flushing doc with empty pathname'
)
try {
await flushAndDeleteDocWithLock(projectId, docId, {})
} catch (err) {
logger.error(
{ projectId, docId, pathname, err },
'error flushing and deleting doc without pathname'
)
}
}
async function cleanUpDocs(options) {
logger.info({ options }, 'cleaning up docs without pathnames')
let cursor = 0
do {
const [newCursor, doclinesKeys] = await rclient.scan(
cursor,
'MATCH',
keys.docLines({ doc_id: '*' }),
'COUNT',
options.limit
)
await flushAndDeleteDocs(doclinesKeys, options)
cursor = newCursor
} while (cursor !== '0')
}
cleanUpDocs({ limit: 1000, dryRun: process.env.DRY_RUN !== 'false' })
.then(result => {
rclient.quit()
console.log('DONE')
})
.catch(function (error) {
console.error(error)
process.exit(1)
})

View File

@@ -0,0 +1,87 @@
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const keys = Settings.redis.documentupdater.key_schema
const ProjectFlusher = require('../app/js/ProjectFlusher')
const DocumentManager = require('../app/js/DocumentManager')
const { mongoClient, db, ObjectId } = require('../app/js/mongodb')
const util = require('node:util')
const flushAndDeleteDocWithLock = util.promisify(
DocumentManager.flushAndDeleteDocWithLock
)
async function fixDocsWithMissingProjectIds(dockeys, options) {
const docIds = ProjectFlusher._extractIds(dockeys)
for (const docId of docIds) {
const projectId = await rclient.get(keys.projectKey({ doc_id: docId }))
logger.debug({ docId, projectId }, 'checking doc')
if (!projectId) {
try {
await insertMissingProjectId(docId, options)
} catch (err) {
logger.error({ docId, err }, 'error fixing doc without project id')
}
}
}
}
async function insertMissingProjectId(docId, options) {
const doc = await db.docs.findOne({ _id: new ObjectId(docId) })
if (!doc) {
logger.warn({ docId }, 'doc not found in mongo')
return
}
if (!doc.project_id) {
logger.error({ docId }, 'doc does not have project id in mongo')
return
}
logger.debug({ docId, doc }, 'found doc')
const projectIdFromMongo = doc.project_id.toString()
if (options.dryRun) {
logger.info(
{ projectIdFromMongo, docId },
'dry run mode - would insert project id in redis'
)
return
}
// set the project id for this doc
await rclient.set(keys.projectKey({ doc_id: docId }), projectIdFromMongo)
logger.debug({ docId, projectIdFromMongo }, 'inserted project id in redis')
if (projectIdFromMongo) {
await flushAndDeleteDocWithLock(projectIdFromMongo, docId, {})
logger.info(
{ docId, projectIdFromMongo },
'fixed doc with empty project id'
)
}
return projectIdFromMongo
}
async function findAndProcessDocs(options) {
logger.info({ options }, 'fixing docs with missing project id')
let cursor = 0
do {
const [newCursor, doclinesKeys] = await rclient.scan(
cursor,
'MATCH',
keys.docLines({ doc_id: '*' }),
'COUNT',
options.limit
)
await fixDocsWithMissingProjectIds(doclinesKeys, options)
cursor = newCursor
} while (cursor !== '0')
}
findAndProcessDocs({ limit: 1000, dryRun: process.env.DRY_RUN !== 'false' })
.then(result => {
rclient.quit()
mongoClient.close()
console.log('DONE')
})
.catch(function (error) {
console.error(error)
process.exit(1)
})

View File

@@ -0,0 +1,54 @@
const ProjectFlusher = require('../app/js/ProjectFlusher')
const minimist = require('minimist')
async function main() {
const argv = minimist(process.argv.slice(2), {
default: {
limit: 100000,
concurrency: 5,
'dry-run': false,
},
boolean: ['dry-run', 'help'],
alias: { h: 'help', n: 'dry-run', j: 'concurrency' },
})
if (argv.help) {
console.log(`
Usage: node scripts/flush_all.js [options]
Options:
--limit Number of projects to flush (default: 100000)
--concurrency, -j Number of concurrent flush operations (default: 5)
--dry-run, -n Perform a dry run without making any changes (default: false)
--help, -h Show this help message
`)
process.exit(0)
}
const options = {
limit: argv.limit,
concurrency: argv.concurrency,
dryRun: argv['dry-run'],
}
console.log('Flushing all projects with options:', options)
return await new Promise((resolve, reject) => {
ProjectFlusher.flushAllProjects(options, err => {
if (err) {
reject(err)
} else {
resolve()
}
})
})
}
main()
.then(() => {
console.log('Done flushing all projects')
process.exit(0)
})
.catch(error => {
console.error('There was an error flushing all projects', { error })
process.exit(1)
})

View File

@@ -0,0 +1,161 @@
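// Maintenance script: remove docs from redis whose doc record is gone from
// mongo. A doc is only deleted when its parent project has also been
// deleted; otherwise it is skipped and counted in the summary. Dry-run by
// default; set DRY_RUN=false to apply changes.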
const Settings = require('@overleaf/settings')
const logger = require('@overleaf/logger')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const keys = Settings.redis.documentupdater.key_schema
const ProjectFlusher = require('../app/js/ProjectFlusher')
const RedisManager = require('../app/js/RedisManager')
const { mongoClient, db, ObjectId } = require('../app/js/mongodb')
const util = require('node:util')
const getDoc = util.promisify((projectId, docId, cb) =>
RedisManager.getDoc(projectId, docId, (err, ...args) => cb(err, args))
)
const removeDocFromMemory = util.promisify(RedisManager.removeDocFromMemory)
const summary = { totalDocs: 0, deletedDocs: 0, skippedDocs: 0 }
async function removeDeletedDocs(dockeys, options) {
const docIds = ProjectFlusher._extractIds(dockeys)
for (const docId of docIds) {
summary.totalDocs++
const docCount = await db.docs.find({ _id: new ObjectId(docId) }).count()
if (!docCount) {
try {
await removeDeletedDoc(docId, options)
} catch (err) {
logger.error({ docId, err }, 'error removing deleted doc')
}
}
}
}
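// Only delete when the doc's project is gone too; refuse if the project
// still exists, and in particular if its file tree still references docId.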
async function removeDeletedDoc(docId, options) {
const projectId = await rclient.get(keys.projectKey({ doc_id: docId }))
const [
docLines,
version,
ranges,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
] = await getDoc(projectId, docId)
const project = await db.projects.findOne({ _id: new ObjectId(projectId) })
let status
if (project) {
const projectJSON = JSON.stringify(project.rootFolder)
const containsDoc = projectJSON.indexOf(docId) !== -1
if (containsDoc) {
logger.warn(
{
projectId,
docId,
docLinesBytes: docLines && docLines.length,
version,
rangesBytes: ranges && ranges.length,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
},
'refusing to delete doc, project contains docId'
)
summary.skippedDocs++
return
} else {
logger.warn(
{
projectId,
docId,
docLinesBytes: docLines && docLines.length,
version,
rangesBytes: ranges && ranges.length,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
},
'refusing to delete doc, project still exists'
)
summary.skippedDocs++
return
}
} else {
status = 'projectDeleted'
}
summary.deletedDocs++
if (options.dryRun) {
logger.info(
{
projectId,
docId,
docLinesBytes: docLines && docLines.length,
version,
rangesBytes: ranges && ranges.length,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
status,
summary,
},
'dry run mode - would remove doc from redis'
)
return
}
await removeDocFromMemory(projectId, docId)
logger.info(
{
projectId,
docId,
docLinesBytes: docLines && docLines.length,
version,
rangesBytes: ranges && ranges.length,
pathname,
projectHistoryId,
unflushedTime,
lastUpdatedAt,
lastUpdatedBy,
status,
summary,
},
'removed doc from redis'
)
}
async function findAndProcessDocs(options) {
logger.info({ options }, 'removing deleted docs')
let cursor = 0
do {
const [newCursor, doclinesKeys] = await rclient.scan(
cursor,
'MATCH',
keys.docLines({ doc_id: '*' }),
'COUNT',
options.limit
)
await removeDeletedDocs(doclinesKeys, options)
cursor = newCursor
} while (cursor !== '0')
}
findAndProcessDocs({ limit: 1000, dryRun: process.env.DRY_RUN !== 'false' })
.then(result => {
rclient.quit()
mongoClient.close()
console.log('DONE')
process.exit(0)
})
.catch(function (error) {
console.error(error)
process.exit(1)
})

View File

@@ -0,0 +1,723 @@
const sinon = require('sinon')
const { expect } = require('chai')
const async = require('async')
const Settings = require('@overleaf/settings')
const rclientProjectHistory = require('@overleaf/redis-wrapper').createClient(
Settings.redis.project_history
)
const rclientDU = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const Keys = Settings.redis.documentupdater.key_schema
const ProjectHistoryKeys = Settings.redis.project_history.key_schema
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
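// Updates use ShareJS-style text ops: { i, p } inserts string i at
// position p; { d, p } deletes string d at position p.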
describe('Applying updates to a doc', function () {
before(function (done) {
this.lines = ['one', 'two', 'three']
this.version = 42
this.op = {
i: 'one and a half\n',
p: 4,
}
this.update = {
doc: this.doc_id,
op: [this.op],
v: this.version,
}
this.result = ['one', 'one and a half', 'two', 'three']
DocUpdaterApp.ensureRunning(done)
})
describe('when the document is not loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
sinon.spy(MockWebApi, 'getDocument')
this.startTime = Date.now()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
after(function () {
MockWebApi.getDocument.restore()
})
it('should load the document from the web API', function () {
MockWebApi.getDocument
.calledWith(this.project_id, this.doc_id)
.should.equal(true)
})
it('should update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.result)
done()
}
)
})
it('should push the applied updates to the project history changes api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error != null) {
throw error
}
JSON.parse(updates[0]).op.should.deep.equal([this.op])
done()
}
)
})
it('should set the first op timestamp', function (done) {
rclientProjectHistory.get(
ProjectHistoryKeys.projectHistoryFirstOpTimestamp({
project_id: this.project_id,
}),
(error, result) => {
if (error != null) {
throw error
}
result = parseInt(result, 10)
result.should.be.within(this.startTime, Date.now())
this.firstOpTimestamp = result
done()
}
)
})
it('should yield last updated time', function (done) {
DocUpdaterClient.getProjectLastUpdatedAt(
this.project_id,
(error, res, body) => {
if (error != null) {
throw error
}
res.statusCode.should.equal(200)
body.lastUpdatedAt.should.be.within(this.startTime, Date.now())
done()
}
)
})
it('should yield no last updated time for another project', function (done) {
DocUpdaterClient.getProjectLastUpdatedAt(
DocUpdaterClient.randomId(),
(error, res, body) => {
if (error != null) {
throw error
}
res.statusCode.should.equal(200)
body.should.deep.equal({})
done()
}
)
})
describe('when sending another update', function () {
before(function (done) {
this.timeout(10000)
this.second_update = Object.assign({}, this.update)
this.second_update.v = this.version + 1
this.secondStartTime = Date.now()
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.second_update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal([
'one',
'one and a half',
'one and a half',
'two',
'three',
])
done()
}
)
})
it('should not change the first op timestamp', function (done) {
rclientProjectHistory.get(
ProjectHistoryKeys.projectHistoryFirstOpTimestamp({
project_id: this.project_id,
}),
(error, result) => {
if (error != null) {
throw error
}
result = parseInt(result, 10)
result.should.equal(this.firstOpTimestamp)
done()
}
)
})
it('should yield last updated time', function (done) {
DocUpdaterClient.getProjectLastUpdatedAt(
this.project_id,
(error, res, body) => {
if (error != null) {
throw error
}
res.statusCode.should.equal(200)
body.lastUpdatedAt.should.be.within(
this.secondStartTime,
Date.now()
)
done()
}
)
})
})
})
describe('when the document is loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error != null) {
throw error
}
sinon.spy(MockWebApi, 'getDocument')
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
})
after(function () {
MockWebApi.getDocument.restore()
})
it('should not need to call the web api', function () {
MockWebApi.getDocument.called.should.equal(false)
})
it('should update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.result)
done()
}
)
})
it('should push the applied updates to the project history changes api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) return done(error)
JSON.parse(updates[0]).op.should.deep.equal([this.op])
done()
}
)
})
})
describe('when the document is loaded and is using project-history only', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error != null) {
throw error
}
sinon.spy(MockWebApi, 'getDocument')
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
})
after(function () {
MockWebApi.getDocument.restore()
})
it('should update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.result)
done()
}
)
})
it('should push the applied updates to the project history changes api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) return done(error)
JSON.parse(updates[0]).op.should.deep.equal([this.op])
done()
}
)
})
})
describe('when the document has been deleted', function () {
describe('when the ops come in a single linear order', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
const lines = ['', '', '']
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines,
version: 0,
})
this.updates = [
{ doc_id: this.doc_id, v: 0, op: [{ i: 'h', p: 0 }] },
{ doc_id: this.doc_id, v: 1, op: [{ i: 'e', p: 1 }] },
{ doc_id: this.doc_id, v: 2, op: [{ i: 'l', p: 2 }] },
{ doc_id: this.doc_id, v: 3, op: [{ i: 'l', p: 3 }] },
{ doc_id: this.doc_id, v: 4, op: [{ i: 'o', p: 4 }] },
{ doc_id: this.doc_id, v: 5, op: [{ i: ' ', p: 5 }] },
{ doc_id: this.doc_id, v: 6, op: [{ i: 'w', p: 6 }] },
{ doc_id: this.doc_id, v: 7, op: [{ i: 'o', p: 7 }] },
{ doc_id: this.doc_id, v: 8, op: [{ i: 'r', p: 8 }] },
{ doc_id: this.doc_id, v: 9, op: [{ i: 'l', p: 9 }] },
{ doc_id: this.doc_id, v: 10, op: [{ i: 'd', p: 10 }] },
]
this.my_result = ['hello world', '', '']
done()
})
it('should be able to continue applying updates when the doc has been deleted', function (done) {
let update
const actions = []
for (update of this.updates.slice(0, 6)) {
;(update => {
actions.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
callback
)
)
})(update)
}
actions.push(callback =>
DocUpdaterClient.deleteDoc(this.project_id, this.doc_id, callback)
)
for (update of this.updates.slice(6)) {
;(update => {
actions.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
callback
)
)
})(update)
}
async.series(actions, error => {
if (error != null) {
throw error
}
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.my_result)
done()
}
)
})
})
it('should store the doc ops in the correct order', function (done) {
rclientDU.lrange(
Keys.docOps({ doc_id: this.doc_id }),
0,
-1,
(error, updates) => {
if (error) return done(error)
updates = updates.map(u => JSON.parse(u))
for (let i = 0; i < this.updates.length; i++) {
const appliedUpdate = this.updates[i]
appliedUpdate.op.should.deep.equal(updates[i].op)
}
done()
}
)
})
})
describe('when older ops come in after the delete', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
const lines = ['', '', '']
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines,
version: 0,
})
this.updates = [
{ doc_id: this.doc_id, v: 0, op: [{ i: 'h', p: 0 }] },
{ doc_id: this.doc_id, v: 1, op: [{ i: 'e', p: 1 }] },
{ doc_id: this.doc_id, v: 2, op: [{ i: 'l', p: 2 }] },
{ doc_id: this.doc_id, v: 3, op: [{ i: 'l', p: 3 }] },
{ doc_id: this.doc_id, v: 4, op: [{ i: 'o', p: 4 }] },
{ doc_id: this.doc_id, v: 0, op: [{ i: 'world', p: 1 }] },
]
this.my_result = ['hello', 'world', '']
done()
})
it('should be able to continue applying updates when the doc has been deleted', function (done) {
let update
const actions = []
for (update of this.updates.slice(0, 5)) {
;(update => {
actions.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
callback
)
)
})(update)
}
actions.push(callback =>
DocUpdaterClient.deleteDoc(this.project_id, this.doc_id, callback)
)
for (update of this.updates.slice(5)) {
;(update => {
actions.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
callback
)
)
})(update)
}
async.series(actions, error => {
if (error != null) {
throw error
}
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.my_result)
done()
}
)
})
})
})
})
describe('with a broken update', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
this.broken_update = {
doc_id: this.doc_id,
v: this.version,
op: [{ d: 'not the correct content', p: 0 }],
}
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.subscribeToAppliedOps(
(this.messageCallback = sinon.stub())
)
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.broken_update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should not update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.lines)
done()
}
)
})
it('should send a message with an error', function () {
this.messageCallback.called.should.equal(true)
const [channel, message] = this.messageCallback.args[0]
channel.should.equal('applied-ops')
JSON.parse(message).should.deep.include({
project_id: this.project_id,
doc_id: this.doc_id,
error: 'Delete component does not match',
})
})
})
describe('when there is no version in Mongo', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
})
const update = {
doc: this.doc_id,
op: this.update.op,
v: 0,
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should update the doc (using version = 0)', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.result)
done()
}
)
})
})
describe('when sending duplicate ops', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.subscribeToAppliedOps(
(this.messageCallback = sinon.stub())
)
// The same insert is sent twice from the same source; with dupIfSource set, the second op is detected as a duplicate and becomes a no-op.
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
{
doc: this.doc_id,
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: this.version,
meta: {
source: 'ikHceq3yfAdQYzBo4-xZ',
},
},
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
{
doc: this.doc_id,
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: this.version,
dupIfSource: ['ikHceq3yfAdQYzBo4-xZ'],
meta: {
source: 'ikHceq3yfAdQYzBo4-xZ',
},
},
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
}, 200)
}
)
})
it('should update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
doc.lines.should.deep.equal(this.result)
done()
}
)
})
it('should return a message about duplicate ops', function () {
this.messageCallback.calledTwice.should.equal(true)
this.messageCallback.args[0][0].should.equal('applied-ops')
expect(JSON.parse(this.messageCallback.args[0][1]).op.dup).to.be.undefined
this.messageCallback.args[1][0].should.equal('applied-ops')
expect(JSON.parse(this.messageCallback.args[1][1]).op.dup).to.equal(true)
})
})
describe('when sending updates for a non-existing doc id', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
this.non_existing = {
doc_id: this.doc_id,
v: this.version,
op: [{ d: 'content', p: 0 }],
}
DocUpdaterClient.subscribeToAppliedOps(
(this.messageCallback = sinon.stub())
)
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.non_existing,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should not update or create a doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
res.statusCode.should.equal(404)
done()
}
)
})
it('should send a message with an error', function () {
this.messageCallback.called.should.equal(true)
const [channel, message] = this.messageCallback.args[0]
channel.should.equal('applied-ops')
JSON.parse(message).should.deep.include({
project_id: this.project_id,
doc_id: this.doc_id,
error: `doc not not found: /project/${this.project_id}/doc/${this.doc_id}`,
})
})
})
})

View File

@@ -0,0 +1,671 @@
const sinon = require('sinon')
const Settings = require('@overleaf/settings')
const rclientProjectHistory = require('@overleaf/redis-wrapper').createClient(
Settings.redis.project_history
)
const ProjectHistoryKeys = Settings.redis.project_history.key_schema
const MockProjectHistoryApi = require('./helpers/MockProjectHistoryApi')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
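// Structure updates (adding, renaming and deleting docs/files) are not
// text ops; they are queued for the project-history service, which these
// tests observe via the projectHistoryOps list in redis.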
describe("Applying updates to a project's structure", function () {
before(function () {
this.user_id = 'user-id-123'
this.version = 1234
})
describe('renaming a file', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.fileUpdate = {
type: 'rename-file',
id: DocUpdaterClient.randomId(),
pathname: '/file-path',
newPathname: '/new-file-path',
}
this.updates = [this.fileUpdate]
DocUpdaterApp.ensureRunning(error => {
if (error) {
return done(error)
}
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
})
it('should push the applied file renames to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.file.should.equal(this.fileUpdate.id)
update.pathname.should.equal('/file-path')
update.new_pathname.should.equal('/new-file-path')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
describe('deleting a file', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.fileUpdate = {
type: 'rename-file',
id: DocUpdaterClient.randomId(),
pathname: '/file-path',
newPathname: '',
}
this.updates = [this.fileUpdate]
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
it('should push the applied file renames to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.file.should.equal(this.fileUpdate.id)
update.pathname.should.equal('/file-path')
update.new_pathname.should.equal('')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
describe('renaming a document', function () {
before(function () {
this.update = {
type: 'rename-doc',
id: DocUpdaterClient.randomId(),
pathname: '/doc-path',
newPathname: '/new-doc-path',
}
this.updates = [this.update]
})
describe('when the document is not loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
it('should push the applied doc renames to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.doc.should.equal(this.update.id)
update.pathname.should.equal('/doc-path')
update.new_pathname.should.equal('/new-doc-path')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
describe('when the document is loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.update.id, {})
DocUpdaterClient.preloadDoc(this.project_id, this.update.id, error => {
if (error) {
return done(error)
}
sinon.spy(MockWebApi, 'getDocument')
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
})
after(function () {
MockWebApi.getDocument.restore()
})
it('should update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.update.id,
(error, res, doc) => {
if (error) {
return done(error)
}
doc.pathname.should.equal(this.update.newPathname)
done()
}
)
})
it('should push the applied doc renames to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.doc.should.equal(this.update.id)
update.pathname.should.equal('/doc-path')
update.new_pathname.should.equal('/new-doc-path')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
})
describe('renaming multiple documents and files', function () {
before(function () {
this.docUpdate0 = {
type: 'rename-doc',
id: DocUpdaterClient.randomId(),
pathname: '/doc-path0',
newPathname: '/new-doc-path0',
}
this.docUpdate1 = {
type: 'rename-doc',
id: DocUpdaterClient.randomId(),
pathname: '/doc-path1',
newPathname: '/new-doc-path1',
}
this.fileUpdate0 = {
type: 'rename-file',
id: DocUpdaterClient.randomId(),
pathname: '/file-path0',
newPathname: '/new-file-path0',
}
this.fileUpdate1 = {
type: 'rename-file',
id: DocUpdaterClient.randomId(),
pathname: '/file-path1',
newPathname: '/new-file-path1',
}
this.updates = [
this.docUpdate0,
this.docUpdate1,
this.fileUpdate0,
this.fileUpdate1,
]
})
describe('when the documents are not loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
it('should push the applied doc renames to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
let update = JSON.parse(updates[0])
update.doc.should.equal(this.docUpdate0.id)
update.pathname.should.equal('/doc-path0')
update.new_pathname.should.equal('/new-doc-path0')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
update = JSON.parse(updates[1])
update.doc.should.equal(this.docUpdate1.id)
update.pathname.should.equal('/doc-path1')
update.new_pathname.should.equal('/new-doc-path1')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.1`)
update = JSON.parse(updates[2])
update.file.should.equal(this.fileUpdate0.id)
update.pathname.should.equal('/file-path0')
update.new_pathname.should.equal('/new-file-path0')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.2`)
update = JSON.parse(updates[3])
update.file.should.equal(this.fileUpdate1.id)
update.pathname.should.equal('/file-path1')
update.new_pathname.should.equal('/new-file-path1')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.3`)
done()
}
)
})
})
})
describe('deleting a document', function () {
before(function () {
this.update = {
type: 'rename-doc',
id: DocUpdaterClient.randomId(),
pathname: '/doc-path',
newPathname: '',
}
this.updates = [this.update]
})
describe('when the document is not loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
it('should push the applied doc update to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.doc.should.equal(this.update.id)
update.pathname.should.equal('/doc-path')
update.new_pathname.should.equal('')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
describe('when the document is loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.update.id, {})
DocUpdaterClient.preloadDoc(this.project_id, this.update.id, error => {
if (error) {
return done(error)
}
sinon.spy(MockWebApi, 'getDocument')
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
})
after(function () {
MockWebApi.getDocument.restore()
})
it('should not modify the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.update.id,
(error, res, doc) => {
if (error) {
return done(error)
}
doc.pathname.should.equal('/a/b/c.tex') // default pathname from MockWebApi
done()
}
)
})
it('should push the applied doc update to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.doc.should.equal(this.update.id)
update.pathname.should.equal('/doc-path')
update.new_pathname.should.equal('')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
})
describe('adding a file', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.fileUpdate = {
type: 'add-file',
id: DocUpdaterClient.randomId(),
pathname: '/file-path',
url: 'filestore.example.com',
}
this.updates = [this.fileUpdate]
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
it('should push the file addition to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.file.should.equal(this.fileUpdate.id)
update.pathname.should.equal('/file-path')
update.url.should.equal('filestore.example.com')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
describe('adding a doc', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.docUpdate = {
type: 'add-doc',
id: DocUpdaterClient.randomId(),
pathname: '/file-path',
docLines: 'a\nb',
}
this.updates = [this.docUpdate]
DocUpdaterClient.sendProjectUpdate(
this.project_id,
this.user_id,
this.updates,
this.version,
error => {
if (error) {
return done(error)
}
setTimeout(done, 200)
}
)
})
it('should push the doc addition to the project history api', function (done) {
rclientProjectHistory.lrange(
ProjectHistoryKeys.projectHistoryOps({ project_id: this.project_id }),
0,
-1,
(error, updates) => {
if (error) {
return done(error)
}
const update = JSON.parse(updates[0])
update.doc.should.equal(this.docUpdate.id)
update.pathname.should.equal('/file-path')
update.docLines.should.equal('a\nb')
update.meta.user_id.should.equal(this.user_id)
update.meta.ts.should.be.a('string')
update.version.should.equal(`${this.version}.0`)
done()
}
)
})
})
describe('with enough updates to flush to the history service', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.version0 = 12345
this.version1 = this.version0 + 1
const updates = []
for (let v = 0; v <= 599; v++) {
// Should flush after 500 ops
updates.push({
type: 'add-doc',
id: DocUpdaterClient.randomId(),
pathname: '/file-' + v,
docLines: 'a\nb',
})
}
sinon.spy(MockProjectHistoryApi, 'flushProject')
// Send updates in chunks to cause multiple flushes
const projectId = this.project_id
const userId = this.user_id
DocUpdaterClient.sendProjectUpdate(
projectId,
userId,
updates.slice(0, 250),
this.version0,
error => {
if (error) {
return done(error)
}
DocUpdaterClient.sendProjectUpdate(
projectId,
userId,
updates.slice(250),
this.version1,
error => {
if (error) {
return done(error)
}
setTimeout(done, 2000)
}
)
}
)
})
after(function () {
MockProjectHistoryApi.flushProject.restore()
})
it('should flush project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(true)
})
})
describe('with too few updates to flush to the history service', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.version0 = 12345
this.version1 = this.version0 + 1
const updates = []
for (let v = 0; v <= 42; v++) {
// Stays below the 500-op flush threshold
updates.push({
type: 'add-doc',
id: DocUpdaterClient.randomId(),
pathname: '/file-' + v,
docLines: 'a\nb',
})
}
sinon.spy(MockProjectHistoryApi, 'flushProject')
// Send updates in chunks
const projectId = this.project_id
const userId = this.user_id
DocUpdaterClient.sendProjectUpdate(
projectId,
userId,
updates.slice(0, 10),
this.version0,
error => {
if (error) {
return done(error)
}
DocUpdaterClient.sendProjectUpdate(
projectId,
userId,
updates.slice(10),
this.version1,
error => {
if (error) {
return done(error)
}
setTimeout(done, 2000)
}
)
}
)
})
after(function () {
MockProjectHistoryApi.flushProject.restore()
})
it('should not flush project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(false)
})
})
})

View File

@@ -0,0 +1,371 @@
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
const { promisify } = require('node:util')
const { exec } = require('node:child_process')
const { expect } = require('chai')
const Settings = require('@overleaf/settings')
const fs = require('node:fs')
const Path = require('node:path')
const MockDocstoreApi = require('./helpers/MockDocstoreApi')
const sinon = require('sinon')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
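// Runs scripts/check_redis_mongo_sync_state.js in a child process with
// the given environment options and asserts on exit code and output.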
describe('CheckRedisMongoSyncState', function () {
beforeEach(function (done) {
DocUpdaterApp.ensureRunning(done)
})
beforeEach(async function () {
await rclient.flushall()
})
let peekDocumentInDocstore
beforeEach(function () {
peekDocumentInDocstore = sinon.spy(MockDocstoreApi, 'peekDocument')
})
afterEach(function () {
peekDocumentInDocstore.restore()
})
async function runScript(options) {
let result
try {
result = await promisify(exec)(
Object.entries(options)
.map(([key, value]) => `${key}=${value}`)
.concat(['node', 'scripts/check_redis_mongo_sync_state.js'])
.join(' ')
)
} catch (error) {
// includes details like exit code, stdErr and stdOut
return error
}
result.code = 0
return result
}
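// e.g. runScript({ LIMIT: '4' }) executes:
//   LIMIT=4 node scripts/check_redis_mongo_sync_state.js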
describe('without projects', function () {
it('should work when in sync', async function () {
const result = await runScript({})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 0 projects')
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
})
})
describe('with a project', function () {
let projectId, docId
beforeEach(function (done) {
projectId = DocUpdaterClient.randomId()
docId = DocUpdaterClient.randomId()
MockWebApi.insertDoc(projectId, docId, {
lines: ['mongo', 'lines'],
version: 1,
})
DocUpdaterClient.getDoc(projectId, docId, done)
})
it('should work when in sync', async function () {
const result = await runScript({})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
expect(peekDocumentInDocstore).to.not.have.been.called
})
describe('with out of sync lines', function () {
beforeEach(function () {
MockWebApi.insertDoc(projectId, docId, {
lines: ['updated', 'mongo', 'lines'],
version: 1,
})
})
it('should detect the out of sync state', async function () {
const result = await runScript({})
expect(result.code).to.equal(1)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
'Found 1 projects with 1 out of sync docs'
)
})
})
describe('with out of sync ranges', function () {
beforeEach(function () {
MockWebApi.insertDoc(projectId, docId, {
lines: ['mongo', 'lines'],
version: 1,
ranges: { changes: ['FAKE CHANGE'] },
})
})
it('should detect the out of sync state', async function () {
const result = await runScript({})
expect(result.code).to.equal(1)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
'Found 1 projects with 1 out of sync docs'
)
})
})
describe('with out of sync version', function () {
beforeEach(function () {
MockWebApi.insertDoc(projectId, docId, {
lines: ['mongo', 'lines'],
version: 2,
})
})
it('should detect the out of sync state', async function () {
const result = await runScript({})
expect(result.code).to.equal(1)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
'Found 1 projects with 1 out of sync docs'
)
})
it('should auto-fix the out of sync state', async function () {
const result = await runScript({
AUTO_FIX_VERSION_MISMATCH: 'true',
})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
})
})
describe('with a project', function () {
let projectId2, docId2
beforeEach(function (done) {
projectId2 = DocUpdaterClient.randomId()
docId2 = DocUpdaterClient.randomId()
MockWebApi.insertDoc(projectId2, docId2, {
lines: ['mongo', 'lines'],
version: 1,
})
DocUpdaterClient.getDoc(projectId2, docId2, done)
})
it('should work when in sync', async function () {
const result = await runScript({})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 2 projects')
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
})
describe('with one out of sync', function () {
beforeEach(function () {
MockWebApi.insertDoc(projectId, docId, {
lines: ['updated', 'mongo', 'lines'],
version: 1,
})
})
it('should detect one project out of sync', async function () {
const result = await runScript({})
expect(result.code).to.equal(1)
expect(result.stdout).to.include('Processed 2 projects')
expect(result.stdout).to.include(
'Found 1 projects with 1 out of sync docs'
)
})
it('should write differences to disk', async function () {
const FOLDER = '/tmp/folder'
await fs.promises.rm(FOLDER, { recursive: true, force: true })
const result = await runScript({
WRITE_CONTENT: 'true',
FOLDER,
})
expect(result.code).to.equal(1)
expect(result.stdout).to.include('Processed 2 projects')
expect(result.stdout).to.include(
'Found 1 projects with 1 out of sync docs'
)
const dir = Path.join(FOLDER, projectId, docId)
expect(await fs.promises.readdir(FOLDER)).to.deep.equal([projectId])
expect(await fs.promises.readdir(dir)).to.deep.equal([
'mongo-snapshot.txt',
'redis-snapshot.txt',
])
expect(
await fs.promises.readFile(
Path.join(dir, 'mongo-snapshot.txt'),
'utf-8'
)
).to.equal('updated\nmongo\nlines')
expect(
await fs.promises.readFile(
Path.join(dir, 'redis-snapshot.txt'),
'utf-8'
)
).to.equal('mongo\nlines')
})
})
describe('with both out of sync', function () {
beforeEach(function () {
MockWebApi.insertDoc(projectId, docId, {
lines: ['updated', 'mongo', 'lines'],
version: 1,
})
MockWebApi.insertDoc(projectId2, docId2, {
lines: ['updated2', 'mongo', 'lines'],
version: 1,
})
})
it('should detect both projects out of sync', async function () {
const result = await runScript({})
expect(result.code).to.equal(1)
expect(result.stdout).to.include('Processed 2 projects')
expect(result.stdout).to.include(
'Found 2 projects with 2 out of sync docs'
)
})
})
})
})
describe('with more projects than the LIMIT', function () {
for (let i = 0; i < 20; i++) {
beforeEach(function (done) {
const projectId = DocUpdaterClient.randomId()
const docId = DocUpdaterClient.randomId()
MockWebApi.insertDoc(projectId, docId, {
lines: ['mongo', 'lines'],
version: 1,
})
DocUpdaterClient.getDoc(projectId, docId, done)
})
}
it('should flag limit', async function () {
const result = await runScript({ LIMIT: '4' })
expect(result.code).to.equal(2)
// A redis SCAN may return more than COUNT (aka LIMIT) entries. Match loosely.
expect(result.stdout).to.match(/Processed \d+ projects/)
expect(result.stderr).to.include(
'Found too many un-flushed projects (LIMIT=4). Please fix the reported projects first, then try again.'
)
})
it('should continue with auto-flush', async function () {
const result = await runScript({
LIMIT: '4',
FLUSH_IN_SYNC_PROJECTS: 'true',
})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 20 projects')
})
})
describe('with partially deleted doc', function () {
let projectId, docId
beforeEach(function (done) {
projectId = DocUpdaterClient.randomId()
docId = DocUpdaterClient.randomId()
MockWebApi.insertDoc(projectId, docId, {
lines: ['mongo', 'lines'],
version: 1,
})
MockDocstoreApi.insertDoc(projectId, docId, {
lines: ['mongo', 'lines'],
version: 1,
})
DocUpdaterClient.getDoc(projectId, docId, err => {
MockWebApi.clearDocs()
done(err)
})
})
describe('with only the file-tree entry deleted', function () {
it('should flag the partial deletion', async function () {
const result = await runScript({})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
`Found partially deleted doc ${docId} in project ${projectId}: use AUTO_FIX_PARTIALLY_DELETED_DOC_METADATA=true to fix metadata`
)
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
expect(MockDocstoreApi.getDoc(projectId, docId)).to.not.include({
deleted: true,
name: 'c.tex',
})
expect(peekDocumentInDocstore).to.have.been.called
})
it('should autofix the partial deletion', async function () {
const result = await runScript({
AUTO_FIX_PARTIALLY_DELETED_DOC_METADATA: 'true',
})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.include(
`Found partially deleted doc ${docId} in project ${projectId}: fixing metadata`
)
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
expect(MockDocstoreApi.getDoc(projectId, docId)).to.include({
deleted: true,
name: 'c.tex',
})
const result2 = await runScript({})
expect(result2.code).to.equal(0)
expect(result2.stdout).to.include('Processed 1 projects')
expect(result2.stdout).to.not.include(
`Found partially deleted doc ${docId} in project ${projectId}`
)
expect(result2.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
})
})
describe('with docstore metadata updated', function () {
beforeEach(function (done) {
MockDocstoreApi.patchDocument(
projectId,
docId,
{
deleted: true,
deletedAt: new Date(),
name: 'c.tex',
},
done
)
})
it('should work when in sync', async function () {
const result = await runScript({})
expect(result.code).to.equal(0)
expect(result.stdout).to.include('Processed 1 projects')
expect(result.stdout).to.not.include(
`Found partially deleted doc ${docId} in project ${projectId}`
)
expect(result.stdout).to.include(
'Found 0 projects with 0 out of sync docs'
)
expect(peekDocumentInDocstore).to.have.been.called
})
})
})
})

View File

@@ -0,0 +1,174 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const MockProjectHistoryApi = require('./helpers/MockProjectHistoryApi')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Deleting a document', function () {
before(function (done) {
this.lines = ['one', 'two', 'three']
this.version = 42
this.update = {
doc: this.doc_id,
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: this.version,
}
this.result = ['one', 'one and a half', 'two', 'three']
sinon.spy(MockProjectHistoryApi, 'flushProject')
DocUpdaterApp.ensureRunning(done)
})
after(function () {
MockProjectHistoryApi.flushProject.restore()
})
describe('when the updated doc exists in the doc updater', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
sinon.spy(MockWebApi, 'setDocument')
sinon.spy(MockWebApi, 'getDocument')
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error != null) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.deleteDoc(
this.project_id,
this.doc_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
setTimeout(done, 200)
}
)
}, 200)
}
)
})
})
after(function () {
MockWebApi.setDocument.restore()
MockWebApi.getDocument.restore()
})
it('should return a 204 status code', function () {
this.statusCode.should.equal(204)
})
it('should send the updated document and version to the web api', function () {
MockWebApi.setDocument
.calledWith(this.project_id, this.doc_id, this.result, this.version + 1)
.should.equal(true)
})
it('should need to reload the doc if read again', function (done) {
MockWebApi.getDocument.resetHistory()
MockWebApi.getDocument.called.should.equals(false)
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
MockWebApi.getDocument
.calledWith(this.project_id, this.doc_id)
.should.equal(true)
done()
}
)
})
it('should flush project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(true)
})
})
describe('when the doc is not in the doc updater', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
})
sinon.spy(MockWebApi, 'setDocument')
sinon.spy(MockWebApi, 'getDocument')
DocUpdaterClient.deleteDoc(
this.project_id,
this.doc_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
setTimeout(done, 200)
}
)
})
after(function () {
MockWebApi.setDocument.restore()
MockWebApi.getDocument.restore()
})
it('should return a 204 status code', function () {
this.statusCode.should.equal(204)
})
it('should not need to send the updated document to the web api', function () {
MockWebApi.setDocument.called.should.equal(false)
})
it('should need to reload the doc if read again', function (done) {
MockWebApi.getDocument.called.should.equals(false)
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
MockWebApi.getDocument
.calledWith(this.project_id, this.doc_id)
.should.equal(true)
done()
}
)
})
it('should flush project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(true)
})
})
})

View File

@@ -0,0 +1,357 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const async = require('async')
const MockProjectHistoryApi = require('./helpers/MockProjectHistoryApi')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
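// Deleting a project flushes each loaded doc back to the web API and
// flushes project history, unless the background=true variant is used,
// in which case the project is queued for a later flush.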
describe('Deleting a project', function () {
beforeEach(function (done) {
let docId0, docId1
this.project_id = DocUpdaterClient.randomId()
this.docs = [
{
id: (docId0 = DocUpdaterClient.randomId()),
lines: ['one', 'two', 'three'],
update: {
doc: docId0,
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: 0,
},
updatedLines: ['one', 'one and a half', 'two', 'three'],
},
{
id: (docId1 = DocUpdaterClient.randomId()),
lines: ['four', 'five', 'six'],
update: {
doc: docId1,
op: [
{
i: 'four and a half\n',
p: 5,
},
],
v: 0,
},
updatedLines: ['four', 'four and a half', 'five', 'six'],
},
]
for (const doc of Array.from(this.docs)) {
MockWebApi.insertDoc(this.project_id, doc.id, {
lines: doc.lines,
version: doc.update.v,
})
}
DocUpdaterApp.ensureRunning(done)
})
describe('without updates', function () {
beforeEach(function (done) {
sinon.spy(MockWebApi, 'setDocument')
sinon.spy(MockProjectHistoryApi, 'flushProject')
async.series(
this.docs.map(doc => {
return callback => {
DocUpdaterClient.preloadDoc(this.project_id, doc.id, error => {
callback(error)
})
}
}),
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.deleteProject(
this.project_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
done()
}
)
}, 200)
}
)
})
afterEach(function () {
MockWebApi.setDocument.restore()
MockProjectHistoryApi.flushProject.restore()
})
it('should return a 204 status code', function () {
this.statusCode.should.equal(204)
})
it('should not send any document to the web api', function () {
MockWebApi.setDocument.should.not.have.been.called
})
it('should need to reload the docs if read again', function (done) {
sinon.spy(MockWebApi, 'getDocument')
async.series(
this.docs.map(doc => {
return callback => {
MockWebApi.getDocument
.calledWith(this.project_id, doc.id)
.should.equal(false)
DocUpdaterClient.getDoc(
this.project_id,
doc.id,
(error, res, returnedDoc) => {
if (error) return done(error)
MockWebApi.getDocument
.calledWith(this.project_id, doc.id)
.should.equal(true)
callback()
}
)
}
}),
() => {
MockWebApi.getDocument.restore()
done()
}
)
})
it('should flush each doc in project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(true)
})
})
describe('with documents which have been updated', function () {
beforeEach(function (done) {
sinon.spy(MockWebApi, 'setDocument')
sinon.spy(MockProjectHistoryApi, 'flushProject')
async.series(
this.docs.map(doc => {
return callback => {
DocUpdaterClient.preloadDoc(this.project_id, doc.id, error => {
if (error != null) {
return callback(error)
}
DocUpdaterClient.sendUpdate(
this.project_id,
doc.id,
doc.update,
error => {
callback(error)
}
)
})
}
}),
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.deleteProject(
this.project_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
done()
}
)
}, 200)
}
)
})
afterEach(function () {
MockWebApi.setDocument.restore()
MockProjectHistoryApi.flushProject.restore()
})
it('should return a 204 status code', function () {
this.statusCode.should.equal(204)
})
it('should send each document to the web api', function () {
Array.from(this.docs).map(doc =>
MockWebApi.setDocument
.calledWith(this.project_id, doc.id, doc.updatedLines)
.should.equal(true)
)
})
it('should need to reload the docs if read again', function (done) {
sinon.spy(MockWebApi, 'getDocument')
async.series(
this.docs.map(doc => {
return callback => {
MockWebApi.getDocument
.calledWith(this.project_id, doc.id)
.should.equal(false)
DocUpdaterClient.getDoc(
this.project_id,
doc.id,
(error, res, returnedDoc) => {
if (error) return done(error)
MockWebApi.getDocument
.calledWith(this.project_id, doc.id)
.should.equal(true)
callback()
}
)
}
}),
() => {
MockWebApi.getDocument.restore()
done()
}
)
})
it('should flush each doc in project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(true)
})
})
describe('with the background=true parameter from realtime and no request to flush the queue', function () {
beforeEach(function (done) {
sinon.spy(MockWebApi, 'setDocument')
sinon.spy(MockProjectHistoryApi, 'flushProject')
async.series(
this.docs.map(doc => {
return callback => {
DocUpdaterClient.preloadDoc(this.project_id, doc.id, error => {
if (error != null) {
return callback(error)
}
DocUpdaterClient.sendUpdate(
this.project_id,
doc.id,
doc.update,
error => {
callback(error)
}
)
})
}
}),
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.deleteProjectOnShutdown(
this.project_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
done()
}
)
}, 200)
}
)
})
afterEach(function () {
MockWebApi.setDocument.restore()
MockProjectHistoryApi.flushProject.restore()
})
it('should return a 204 status code', function () {
this.statusCode.should.equal(204)
})
it('should not send any documents to the web api', function () {
MockWebApi.setDocument.called.should.equal(false)
})
it('should not flush to project history', function () {
MockProjectHistoryApi.flushProject.called.should.equal(false)
})
})
describe('with the background=true parameter from realtime and a request to flush the queue', function () {
beforeEach(function (done) {
sinon.spy(MockWebApi, 'setDocument')
sinon.spy(MockProjectHistoryApi, 'flushProject')
async.series(
this.docs.map(doc => {
return callback => {
DocUpdaterClient.preloadDoc(this.project_id, doc.id, error => {
if (error != null) {
return callback(error)
}
DocUpdaterClient.sendUpdate(
this.project_id,
doc.id,
doc.update,
error => {
callback(error)
}
)
})
}
}),
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.deleteProjectOnShutdown(
this.project_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
// after deleting the project and putting it in the queue, flush the queue
setTimeout(() => DocUpdaterClient.flushOldProjects(done), 2000)
}
)
}, 200)
}
)
})
afterEach(function () {
MockWebApi.setDocument.restore()
MockProjectHistoryApi.flushProject.restore()
})
it('should return a 204 status code', function () {
this.statusCode.should.equal(204)
})
it('should send each document to the web api', function () {
Array.from(this.docs).map(doc =>
MockWebApi.setDocument
.calledWith(this.project_id, doc.id, doc.updatedLines)
.should.equal(true)
)
})
it('should flush to project history', function () {
MockProjectHistoryApi.flushProject.called.should.equal(true)
})
})
})

View File

@@ -0,0 +1,141 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const async = require('async')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Flushing a project', function () {
before(function (done) {
let docId0, docId1
this.project_id = DocUpdaterClient.randomId()
this.docs = [
{
id: (docId0 = DocUpdaterClient.randomId()),
lines: ['one', 'two', 'three'],
update: {
doc: docId0,
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: 0,
},
updatedLines: ['one', 'one and a half', 'two', 'three'],
},
{
id: (docId1 = DocUpdaterClient.randomId()),
lines: ['four', 'five', 'six'],
update: {
doc: docId1,
op: [
{
i: 'four and a half\n',
p: 5,
},
],
v: 0,
},
updatedLines: ['four', 'four and a half', 'five', 'six'],
},
]
for (const doc of Array.from(this.docs)) {
MockWebApi.insertDoc(this.project_id, doc.id, {
lines: doc.lines,
version: doc.update.v,
})
}
return DocUpdaterApp.ensureRunning(done)
})
return describe('with documents which have been updated', function () {
before(function (done) {
sinon.spy(MockWebApi, 'setDocument')
return async.series(
this.docs.map(doc => {
return callback => {
return DocUpdaterClient.preloadDoc(
this.project_id,
doc.id,
error => {
if (error != null) {
return callback(error)
}
return DocUpdaterClient.sendUpdate(
this.project_id,
doc.id,
doc.update,
error => {
return callback(error)
}
)
}
)
}
}),
error => {
if (error != null) {
throw error
}
return setTimeout(() => {
return DocUpdaterClient.flushProject(
this.project_id,
(error, res, body) => {
if (error) return done(error)
this.statusCode = res.statusCode
return done()
}
)
}, 200)
}
)
})
after(function () {
return MockWebApi.setDocument.restore()
})
it('should return a 204 status code', function () {
return this.statusCode.should.equal(204)
})
it('should send each document to the web api', function () {
return Array.from(this.docs).map(doc =>
MockWebApi.setDocument
.calledWith(this.project_id, doc.id, doc.updatedLines)
.should.equal(true)
)
})
return it('should update the lines in the doc updater', function (done) {
return async.series(
this.docs.map(doc => {
return callback => {
return DocUpdaterClient.getDoc(
this.project_id,
doc.id,
(error, res, returnedDoc) => {
if (error) return done(error)
returnedDoc.lines.should.deep.equal(doc.updatedLines)
return callback()
}
)
}
}),
done
)
})
})
})

View File

@@ -0,0 +1,162 @@
/* eslint-disable
no-return-assign,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const { expect } = require('chai')
const async = require('async')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Flushing a doc to Mongo', function () {
before(function (done) {
this.lines = ['one', 'two', 'three']
this.version = 42
this.update = {
doc: this.doc_id,
meta: { user_id: 'last-author-fake-id' },
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: this.version,
}
this.result = ['one', 'one and a half', 'two', 'three']
return DocUpdaterApp.ensureRunning(done)
})
describe('when the updated doc exists in the doc updater', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
sinon.spy(MockWebApi, 'setDocument')
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.sendUpdates(
this.project_id,
this.doc_id,
[this.update],
error => {
if (error != null) {
throw error
}
return setTimeout(() => {
return DocUpdaterClient.flushDoc(this.project_id, this.doc_id, done)
}, 200)
}
)
})
after(function () {
return MockWebApi.setDocument.restore()
})
it('should flush the updated doc lines and version to the web api', function () {
return MockWebApi.setDocument
.calledWith(this.project_id, this.doc_id, this.result, this.version + 1)
.should.equal(true)
})
return it('should flush the last update author and time to the web api', function () {
const lastUpdatedAt = MockWebApi.setDocument.lastCall.args[5]
parseInt(lastUpdatedAt).should.be.closeTo(new Date().getTime(), 30000)
const lastUpdatedBy = MockWebApi.setDocument.lastCall.args[6]
return lastUpdatedBy.should.equal('last-author-fake-id')
})
})
describe('when the doc does not exist in the doc updater', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
})
sinon.spy(MockWebApi, 'setDocument')
return DocUpdaterClient.flushDoc(this.project_id, this.doc_id, done)
})
after(function () {
return MockWebApi.setDocument.restore()
})
return it('should not flush the doc to the web api', function () {
return MockWebApi.setDocument.called.should.equal(false)
})
})
return describe('when the web api http request takes a long time on first request', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
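// Stub setDocument so the first flush attempt stalls for 30s and any
// later call returns immediately; the flush logic presumably times out
// the first HTTP request and succeeds on a retry well within 20s.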
let t = 30000
sinon
.stub(MockWebApi, 'setDocument')
.callsFake(
(
projectId,
docId,
lines,
version,
ranges,
lastUpdatedAt,
lastUpdatedBy,
callback
) => {
if (callback == null) {
callback = function () {}
}
setTimeout(callback, t)
return (t = 0)
}
)
return DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, done)
})
after(function () {
return MockWebApi.setDocument.restore()
})
return it('should still work', function (done) {
const start = Date.now()
return DocUpdaterClient.flushDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
res.statusCode.should.equal(204)
const delta = Date.now() - start
expect(delta).to.be.below(20000)
return done()
}
)
})
})
})

View File

@@ -0,0 +1,293 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const { expect } = require('chai')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Getting a document', function () {
before(function (done) {
this.lines = ['one', 'two', 'three']
this.version = 42
return DocUpdaterApp.ensureRunning(done)
})
describe('when the document is not loaded', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
sinon.spy(MockWebApi, 'getDocument')
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, returnedDoc) => {
if (error) return done(error)
this.returnedDoc = returnedDoc
return done()
}
)
})
after(function () {
return MockWebApi.getDocument.restore()
})
it('should load the document from the web API', function () {
return MockWebApi.getDocument
.calledWith(this.project_id, this.doc_id)
.should.equal(true)
})
it('should return the document lines', function () {
return this.returnedDoc.lines.should.deep.equal(this.lines)
})
return it('should return the document at its current version', function () {
return this.returnedDoc.version.should.equal(this.version)
})
})
describe('when the document is already loaded', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.preloadDoc(
this.project_id,
this.doc_id,
error => {
if (error != null) {
throw error
}
sinon.spy(MockWebApi, 'getDocument')
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, returnedDoc) => {
if (error) return done(error)
this.returnedDoc = returnedDoc
return done()
}
)
}
)
})
after(function () {
return MockWebApi.getDocument.restore()
})
it('should not load the document from the web API', function () {
return MockWebApi.getDocument.called.should.equal(false)
})
return it('should return the document lines', function () {
return this.returnedDoc.lines.should.deep.equal(this.lines)
})
})
describe('when the request asks for some recent ops', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: (this.lines = ['one', 'two', 'three']),
})
this.updates = __range__(0, 199, true).map(v => ({
doc_id: this.doc_id,
op: [{ i: v.toString(), p: 0 }],
v,
}))
return DocUpdaterClient.sendUpdates(
this.project_id,
this.doc_id,
this.updates,
error => {
if (error != null) {
throw error
}
sinon.spy(MockWebApi, 'getDocument')
return done()
}
)
})
after(function () {
return MockWebApi.getDocument.restore()
})
describe('when the ops are loaded', function () {
before(function (done) {
return DocUpdaterClient.getDocAndRecentOps(
this.project_id,
this.doc_id,
190,
(error, res, returnedDoc) => {
if (error) return done(error)
this.returnedDoc = returnedDoc
return done()
}
)
})
return it('should return the recent ops', function () {
this.returnedDoc.ops.length.should.equal(10)
return Array.from(this.updates.slice(190, -1)).map((update, i) =>
this.returnedDoc.ops[i].op.should.deep.equal(update.op)
)
})
})
return describe('when the ops are not all loaded', function () {
before(function (done) {
// We only track 100 ops
return DocUpdaterClient.getDocAndRecentOps(
this.project_id,
this.doc_id,
10,
(error, res, returnedDoc) => {
if (error) return done(error)
this.res = res
this.returnedDoc = returnedDoc
return done()
}
)
})
return it('should return UnprocessableEntity', function () {
return this.res.statusCode.should.equal(422)
})
})
})
describe('when the document does not exist', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
this.statusCode = res.statusCode
return done()
}
)
})
return it('should return 404', function () {
return this.statusCode.should.equal(404)
})
})
describe('when the web api returns an error', function () {
before(function (done) {
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
sinon
.stub(MockWebApi, 'getDocument')
.callsFake((projectId, docId, callback) => {
if (callback == null) {
callback = function () {}
}
return callback(new Error('oops'))
})
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
this.statusCode = res.statusCode
return done()
}
)
})
after(function () {
return MockWebApi.getDocument.restore()
})
return it('should return 500', function () {
return this.statusCode.should.equal(500)
})
})
return describe('when the web api http request takes a long time', function () {
before(function (done) {
this.timeout(10000)

;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
sinon
.stub(MockWebApi, 'getDocument')
.callsFake((projectId, docId, callback) => {
if (callback == null) {
callback = function () {}
}
return setTimeout(callback, 30000)
})
return done()
})
after(function () {
return MockWebApi.getDocument.restore()
})
return it('should return quickly(ish)', function (done) {
const start = Date.now()
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
res.statusCode.should.equal(500)
const delta = Date.now() - start
expect(delta).to.be.below(20000)
return done()
}
)
})
})
})
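// Decaffeinate-generated helper standing in for CoffeeScript's range
// syntax: builds an ascending or descending integer range, including the
// right endpoint when `inclusive` is true.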
function __range__(left, right, inclusive) {
const range = []
const ascending = left < right
const end = !inclusive ? right : ascending ? right + 1 : right - 1
for (let i = left; ascending ? i < end : i > end; ascending ? i++ : i--) {
range.push(i)
}
return range
}

View File

@@ -0,0 +1,176 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const { expect } = require('chai')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Getting documents for project', function () {
before(function (done) {
this.lines = ['one', 'two', 'three']
this.version = 42
return DocUpdaterApp.ensureRunning(done)
})
describe('when project state hash does not match', function () {
before(function (done) {
this.projectStateHash = DocUpdaterClient.randomId()
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.preloadDoc(
this.project_id,
this.doc_id,
error => {
if (error != null) {
throw error
}
return DocUpdaterClient.getProjectDocs(
this.project_id,
this.projectStateHash,
(error, res, returnedDocs) => {
if (error) return done(error)
this.res = res
this.returnedDocs = returnedDocs
return done()
}
)
}
)
})
return it('should return a 409 Conflict response', function () {
return this.res.statusCode.should.equal(409)
})
})
describe('when project state hash matches', function () {
before(function (done) {
this.projectStateHash = DocUpdaterClient.randomId()
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.preloadDoc(
this.project_id,
this.doc_id,
error => {
if (error != null) {
throw error
}
return DocUpdaterClient.getProjectDocs(
this.project_id,
this.projectStateHash,
(error, res0, returnedDocs0) => {
if (error) return done(error)
// set the hash
this.res0 = res0
this.returnedDocs0 = returnedDocs0
return DocUpdaterClient.getProjectDocs(
this.project_id,
this.projectStateHash,
(error, res, returnedDocs) => {
if (error) return done(error)
// the hash should now match
this.res = res
this.returnedDocs = returnedDocs
return done()
}
)
}
)
}
)
})
it('should return a 200 response', function () {
return this.res.statusCode.should.equal(200)
})
return it('should return the documents', function () {
return this.returnedDocs.should.deep.equal([
{ _id: this.doc_id, lines: this.lines, v: this.version },
])
})
})
return describe('when the doc has been removed', function () {
before(function (done) {
this.projectStateHash = DocUpdaterClient.randomId()
;[this.project_id, this.doc_id] = Array.from([
DocUpdaterClient.randomId(),
DocUpdaterClient.randomId(),
])
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.preloadDoc(
this.project_id,
this.doc_id,
error => {
if (error != null) {
throw error
}
return DocUpdaterClient.getProjectDocs(
this.project_id,
this.projectStateHash,
(error, res0, returnedDocs0) => {
if (error) return done(error)
// set the hash
this.res0 = res0
this.returnedDocs0 = returnedDocs0
return DocUpdaterClient.deleteDoc(
this.project_id,
this.doc_id,
(error, res, body) => {
if (error) return done(error)
// delete the doc
return DocUpdaterClient.getProjectDocs(
this.project_id,
this.projectStateHash,
(error, res1, returnedDocs) => {
if (error) return done(error)
// the hash would match, but the doc has been deleted
this.res = res1
this.returnedDocs = returnedDocs
return done()
}
)
}
)
}
)
}
)
})
return it('should return a 409 Conflict response', function () {
return this.res.statusCode.should.equal(409)
})
})
})

View File

@@ -0,0 +1,100 @@
const sinon = require('sinon')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Peeking a document', function () {
before(function (done) {
this.lines = ['one', 'two', 'three']
this.version = 42
return DocUpdaterApp.ensureRunning(done)
})
describe('when the document is not loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
sinon.spy(MockWebApi, 'getDocument')
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.peekDoc(
this.project_id,
this.doc_id,
(error, res, returnedDoc) => {
this.error = error
this.res = res
this.returnedDoc = returnedDoc
return done()
}
)
})
after(function () {
return MockWebApi.getDocument.restore()
})
it('should return a 404 response', function () {
this.res.statusCode.should.equal(404)
})
it('should not load the document from the web API', function () {
return MockWebApi.getDocument.called.should.equal(false)
})
})
describe('when the document is already loaded', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
return DocUpdaterClient.preloadDoc(
this.project_id,
this.doc_id,
error => {
if (error != null) {
throw error
}
sinon.spy(MockWebApi, 'getDocument')
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, returnedDoc) => {
if (error) return done(error)
this.res = res
this.returnedDoc = returnedDoc
return done()
}
)
}
)
})
after(function () {
return MockWebApi.getDocument.restore()
})
it('should return a 200 response', function () {
this.res.statusCode.should.equal(200)
})
it('should return the document lines', function () {
return this.returnedDoc.lines.should.deep.equal(this.lines)
})
it('should return the document version', function () {
return this.returnedDoc.version.should.equal(this.version)
})
it('should not load the document from the web API', function () {
return MockWebApi.getDocument.called.should.equal(false)
})
})
})

View File

@@ -0,0 +1,882 @@
const sinon = require('sinon')
const { expect } = require('chai')
const async = require('async')
const { db, ObjectId } = require('../../../app/js/mongodb')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
const RangesManager = require('../../../app/js/RangesManager')
const sandbox = sinon.createSandbox()
describe('Ranges', function () {
before(function (done) {
DocUpdaterApp.ensureRunning(done)
})
describe('tracking changes from ops', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
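// Seed for tracked-change ids: sending meta.tc makes the updater mint
// change ids by appending a zero-padded counter, so the first change id
// should be id_seed + '000001'.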
this.id_seed = '587357bd35e64f6157'
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['aaa'],
}
this.updates = [
{
doc: this.doc.id,
op: [{ i: '123', p: 1 }],
v: 0,
meta: { user_id: this.user_id },
},
{
doc: this.doc.id,
op: [{ i: '456', p: 5 }],
v: 1,
meta: { user_id: this.user_id, tc: this.id_seed },
},
{
doc: this.doc.id,
op: [{ d: '12', p: 1 }],
v: 2,
meta: { user_id: this.user_id },
},
]
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
})
const jobs = []
for (const update of this.updates) {
jobs.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
update,
callback
)
)
}
DocUpdaterApp.ensureRunning(error => {
if (error != null) {
throw error
}
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
async.series(jobs, error => {
if (error != null) {
throw error
}
done()
})
})
})
})
it('should update the ranges', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
const change = ranges.changes[0]
change.op.should.deep.equal({ i: '456', p: 3 })
change.id.should.equal(this.id_seed + '000001')
change.metadata.user_id.should.equal(this.user_id)
done()
}
)
})
describe('Adding comments', function () {
describe('standalone', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['foo bar baz'],
}
this.updates = [
{
doc: this.doc.id,
op: [
{ c: 'bar', p: 4, t: (this.tid = DocUpdaterClient.randomId()) },
],
v: 0,
},
]
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
})
const jobs = []
for (const update of this.updates) {
jobs.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
update,
callback
)
)
}
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
async.series(jobs, error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
})
})
})
it('should update the ranges', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
const comment = ranges.comments[0]
comment.op.should.deep.equal({ c: 'bar', p: 4, t: this.tid })
comment.id.should.equal(this.tid)
done()
}
)
})
})
describe('with conflicting ops needing OT', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['foo bar baz'],
}
this.updates = [
{
doc: this.doc.id,
op: [{ i: 'ABC', p: 3 }],
v: 0,
meta: { user_id: this.user_id },
},
{
doc: this.doc.id,
op: [
{ c: 'bar', p: 4, t: (this.tid = DocUpdaterClient.randomId()) },
],
v: 0,
},
]
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
})
const jobs = []
for (const update of this.updates) {
jobs.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
update,
callback
)
)
}
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
async.series(jobs, error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
})
})
})
it('should update the comments with the OT shifted comment', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
const comment = ranges.comments[0]
comment.op.should.deep.equal({ c: 'bar', p: 7, t: this.tid })
done()
}
)
})
})
})
})
describe('Loading ranges from persistence layer', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.id_seed = '587357bd35e64f6157'
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['a123aa'],
}
this.update = {
doc: this.doc.id,
op: [{ i: '456', p: 5 }],
v: 0,
meta: { user_id: this.user_id, tc: this.id_seed },
}
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
ranges: {
changes: [
{
op: { i: '123', p: 1 },
metadata: {
user_id: this.user_id,
ts: new Date(),
},
},
],
},
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
})
it('should have preloaded the existing ranges', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { changes } = data.ranges
changes[0].op.should.deep.equal({ i: '123', p: 1 })
changes[1].op.should.deep.equal({ i: '456', p: 5 })
done()
}
)
})
it('should flush the ranges to the persistence layer again', function (done) {
DocUpdaterClient.flushDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
MockWebApi.getDocument(this.project_id, this.doc.id, (error, doc) => {
if (error) return done(error)
const { changes } = doc.ranges
changes[0].op.should.deep.equal({ i: '123', p: 1 })
changes[1].op.should.deep.equal({ i: '456', p: 5 })
done()
})
})
})
})
describe('accepting a change', function () {
beforeEach(function (done) {
sandbox.spy(MockWebApi, 'setDocument')
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.id_seed = '587357bd35e64f6157'
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['aaa'],
}
this.update = {
doc: this.doc.id,
op: [{ i: '456', p: 1 }],
v: 0,
meta: { user_id: this.user_id, tc: this.id_seed },
}
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
const change = ranges.changes[0]
change.op.should.deep.equal({ i: '456', p: 1 })
change.id.should.equal(this.id_seed + '000001')
change.metadata.user_id.should.equal(this.user_id)
done()
}
)
}, 200)
}
)
})
})
afterEach(function () {
sandbox.restore()
})
it('should remove the change after accepting', function (done) {
DocUpdaterClient.acceptChange(
this.project_id,
this.doc.id,
this.id_seed + '000001',
error => {
if (error != null) {
throw error
}
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
expect(data.ranges.changes).to.be.undefined
done()
}
)
}
)
})
it('should persist the ranges after accepting', function (done) {
DocUpdaterClient.flushDoc(this.project_id, this.doc.id, err => {
if (err) return done(err)
DocUpdaterClient.acceptChange(
this.project_id,
this.doc.id,
this.id_seed + '000001',
error => {
if (error != null) {
throw error
}
DocUpdaterClient.flushDoc(this.project_id, this.doc.id, err => {
if (err) return done(err)
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
expect(data.ranges.changes).to.be.undefined
MockWebApi.setDocument
.calledWith(this.project_id, this.doc.id, ['a456aa'], 1, {})
.should.equal(true)
done()
}
)
})
}
)
})
})
})
describe('accepting multiple changes', function () {
beforeEach(function (done) {
this.getHistoryUpdatesSpy = sandbox.spy(
RangesManager,
'getHistoryUpdatesForAcceptedChanges'
)
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['aaa', 'bbb', 'ccc', 'ddd', 'eee'],
}
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
historyRangesSupport: true,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
this.id_seed_1 = 'tc_1'
this.id_seed_2 = 'tc_2'
this.id_seed_3 = 'tc_3'
this.updates = [
{
doc: this.doc.id,
op: [{ d: 'bbb', p: 4 }],
v: 0,
meta: {
user_id: this.user_id,
tc: this.id_seed_1,
},
},
{
doc: this.doc.id,
op: [{ d: 'ccc', p: 5 }],
v: 1,
meta: {
user_id: this.user_id,
tc: this.id_seed_2,
},
},
{
doc: this.doc.id,
op: [{ d: 'ddd', p: 6 }],
v: 2,
meta: {
user_id: this.user_id,
tc: this.id_seed_3,
},
},
]
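// All three deletes carry meta.tc, so they are tracked rather than applied
// destructively: the visible doc shrinks while the history doc presumably
// keeps the deleted text until the changes are accepted, hence the
// differing doc_length and history_doc_length values asserted below.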
DocUpdaterClient.sendUpdates(
this.project_id,
this.doc.id,
this.updates,
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
const changeOps = ranges.changes
.map(change => change.op)
.flat()
changeOps.should.deep.equal([
{ d: 'bbb', p: 4 },
{ d: 'ccc', p: 5 },
{ d: 'ddd', p: 6 },
])
done()
}
)
}, 200)
}
)
})
})
afterEach(function () {
sandbox.restore()
})
it('accepting changes in order', function (done) {
DocUpdaterClient.acceptChanges(
this.project_id,
this.doc.id,
[
this.id_seed_1 + '000001',
this.id_seed_2 + '000001',
this.id_seed_3 + '000001',
],
error => {
if (error != null) {
throw error
}
const historyUpdates = this.getHistoryUpdatesSpy.returnValues[0]
expect(historyUpdates[0]).to.deep.equal({
doc: this.doc.id,
meta: {
pathname: '/a/b/c.tex',
doc_length: 10,
history_doc_length: 19,
ts: historyUpdates[0].meta.ts,
user_id: this.user_id,
},
op: [{ p: 4, d: 'bbb' }],
})
expect(historyUpdates[1]).to.deep.equal({
doc: this.doc.id,
meta: {
pathname: '/a/b/c.tex',
doc_length: 10,
history_doc_length: 16,
ts: historyUpdates[1].meta.ts,
user_id: this.user_id,
},
op: [{ p: 5, d: 'ccc' }],
})
expect(historyUpdates[2]).to.deep.equal({
doc: this.doc.id,
meta: {
pathname: '/a/b/c.tex',
doc_length: 10,
history_doc_length: 13,
ts: historyUpdates[2].meta.ts,
user_id: this.user_id,
},
op: [{ p: 6, d: 'ddd' }],
})
done()
}
)
})
it('accepting changes in reverse order', function (done) {
DocUpdaterClient.acceptChanges(
this.project_id,
this.doc.id,
[
this.id_seed_3 + '000001',
this.id_seed_2 + '000001',
this.id_seed_1 + '000001',
],
error => {
if (error != null) {
throw error
}
const historyUpdates = this.getHistoryUpdatesSpy.returnValues[0]
expect(historyUpdates[0]).to.deep.equal({
doc: this.doc.id,
meta: {
pathname: '/a/b/c.tex',
doc_length: 10,
history_doc_length: 19,
ts: historyUpdates[0].meta.ts,
user_id: this.user_id,
},
op: [{ p: 4, d: 'bbb' }],
})
expect(historyUpdates[1]).to.deep.equal({
doc: this.doc.id,
meta: {
pathname: '/a/b/c.tex',
doc_length: 10,
history_doc_length: 16,
ts: historyUpdates[1].meta.ts,
user_id: this.user_id,
},
op: [{ p: 5, d: 'ccc' }],
})
expect(historyUpdates[2]).to.deep.equal({
doc: this.doc.id,
meta: {
pathname: '/a/b/c.tex',
doc_length: 10,
history_doc_length: 13,
ts: historyUpdates[2].meta.ts,
user_id: this.user_id,
},
op: [{ p: 6, d: 'ddd' }],
})
done()
}
)
})
})
describe('deleting a comment range', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['foo bar'],
}
this.update = {
doc: this.doc.id,
op: [{ c: 'bar', p: 4, t: (this.tid = DocUpdaterClient.randomId()) }],
v: 0,
}
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
this.update,
error => {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
const change = ranges.comments[0]
change.op.should.deep.equal({ c: 'bar', p: 4, t: this.tid })
change.id.should.equal(this.tid)
done()
}
)
}, 200)
}
)
})
})
it('should remove the comment range', function (done) {
DocUpdaterClient.removeComment(
this.project_id,
this.doc.id,
this.tid,
(error, res) => {
if (error != null) {
throw error
}
expect(res.statusCode).to.equal(204)
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
expect(data.ranges.comments).to.be.undefined
done()
}
)
}
)
})
})
describe('tripping range size limit', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.id_seed = DocUpdaterClient.randomId()
this.doc = {
id: DocUpdaterClient.randomId(),
lines: ['aaa'],
}
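// An insert of roughly 3MB of text; the tracked-change metadata for an op
// this large presumably exceeds the ranges size limit, so the change
// should be discarded rather than stored.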
this.i = new Array(3 * 1024 * 1024).join('a')
this.updates = [
{
doc: this.doc.id,
op: [{ i: this.i, p: 1 }],
v: 0,
meta: { user_id: this.user_id, tc: this.id_seed },
},
]
MockWebApi.insertDoc(this.project_id, this.doc.id, {
lines: this.doc.lines,
version: 0,
})
const jobs = []
for (const update of this.updates) {
jobs.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc.id,
update,
callback
)
)
}
DocUpdaterClient.preloadDoc(this.project_id, this.doc.id, error => {
if (error != null) {
throw error
}
async.series(jobs, error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
})
})
})
it('should not update the ranges', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc.id,
(error, res, data) => {
if (error != null) {
throw error
}
const { ranges } = data
expect(ranges.changes).to.be.undefined
done()
}
)
})
})
describe('deleting text surrounding a comment', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.user_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: ['foo bar baz'],
version: 0,
ranges: {
comments: [
{
op: {
c: 'a',
p: 5,
tid: (this.tid = DocUpdaterClient.randomId()),
},
metadata: {
user_id: this.user_id,
ts: new Date(),
},
},
],
},
})
this.updates = [
{
doc: this.doc_id,
op: [{ d: 'foo ', p: 0 }],
v: 0,
meta: { user_id: this.user_id },
},
{
doc: this.doc_id,
op: [{ d: 'bar ', p: 0 }],
v: 1,
meta: { user_id: this.user_id },
},
]
const jobs = []
for (const update of this.updates) {
jobs.push(callback =>
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
callback
)
)
}
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error != null) {
throw error
}
async.series(jobs, function (error) {
if (error != null) {
throw error
}
setTimeout(() => {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, data) => {
if (error != null) {
throw error
}
done()
}
)
}, 200)
})
})
})
it('should write a snapshot from before the destructive change', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, data) => {
if (error != null) {
return done(error)
}
db.docSnapshots
.find({
project_id: new ObjectId(this.project_id),
doc_id: new ObjectId(this.doc_id),
})
.toArray((error, docSnapshots) => {
if (error != null) {
return done(error)
}
expect(docSnapshots.length).to.equal(1)
expect(docSnapshots[0].version).to.equal(1)
expect(docSnapshots[0].lines).to.deep.equal(['bar baz'])
expect(docSnapshots[0].ranges.comments[0].op).to.deep.equal({
c: 'a',
p: 1,
tid: this.tid,
})
done()
})
}
)
})
})
})

View File

@@ -0,0 +1,528 @@
const sinon = require('sinon')
const { expect } = require('chai')
const Settings = require('@overleaf/settings')
const docUpdaterRedis = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const Keys = Settings.redis.documentupdater.key_schema
const MockProjectHistoryApi = require('./helpers/MockProjectHistoryApi')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('Setting a document', function () {
let numberOfReceivedUpdates = 0
before(function (done) {
DocUpdaterClient.subscribeToAppliedOps(() => {
numberOfReceivedUpdates++
})
this.lines = ['one', 'two', 'three']
this.version = 42
this.update = {
doc: this.doc_id,
op: [
{
i: 'one and a half\n',
p: 4,
},
],
v: this.version,
}
this.result = ['one', 'one and a half', 'two', 'three']
this.newLines = ['these', 'are', 'the', 'new', 'lines']
this.source = 'dropbox'
this.user_id = 'user-id-123'
sinon.spy(MockProjectHistoryApi, 'flushProject')
sinon.spy(MockWebApi, 'setDocument')
DocUpdaterApp.ensureRunning(done)
})
after(function () {
MockProjectHistoryApi.flushProject.restore()
MockWebApi.setDocument.restore()
})
describe('when the updated doc exists in the doc updater', function () {
before(function (done) {
numberOfReceivedUpdates = 0
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error) {
throw error
}
setTimeout(() => {
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.newLines,
this.source,
this.user_id,
false,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
this.body = body
done()
}
)
}, 200)
}
)
})
})
after(function () {
MockProjectHistoryApi.flushProject.resetHistory()
MockWebApi.setDocument.resetHistory()
})
it('should return a 200 status code', function () {
this.statusCode.should.equal(200)
})
it('should emit two updates (from sendUpdate and setDocLines)', function () {
expect(numberOfReceivedUpdates).to.equal(2)
})
it('should send the updated doc lines and version to the web api', function () {
MockWebApi.setDocument
.calledWith(this.project_id, this.doc_id, this.newLines)
.should.equal(true)
})
it('should update the lines in the doc updater', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) {
return done(error)
}
doc.lines.should.deep.equal(this.newLines)
done()
}
)
})
it('should bump the version in the doc updater', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) {
return done(error)
}
doc.version.should.equal(this.version + 2)
done()
}
)
})
it('should leave the document in redis', function (done) {
docUpdaterRedis.get(
Keys.docLines({ doc_id: this.doc_id }),
(error, lines) => {
if (error) {
throw error
}
expect(JSON.parse(lines)).to.deep.equal(this.newLines)
done()
}
)
})
it('should return the mongo rev in the json response', function () {
this.body.should.deep.equal({ rev: '123' })
})
describe('when doc has the same contents', function () {
beforeEach(function (done) {
numberOfReceivedUpdates = 0
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.newLines,
this.source,
this.user_id,
false,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
this.body = body
done()
}
)
})
it('should not bump the version in doc updater', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) {
return done(error)
}
doc.version.should.equal(this.version + 2)
done()
}
)
})
it('should not emit any updates', function (done) {
setTimeout(() => {
expect(numberOfReceivedUpdates).to.equal(0)
done()
}, 100) // delay by 100ms: make sure we do not check too early!
})
})
})
describe('when the updated doc does not exist in the doc updater', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
numberOfReceivedUpdates = 0
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.newLines,
this.source,
this.user_id,
false,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
this.body = body
setTimeout(done, 200)
}
)
})
after(function () {
MockProjectHistoryApi.flushProject.resetHistory()
MockWebApi.setDocument.resetHistory()
})
it('should return a 200 status code', function () {
this.statusCode.should.equal(200)
})
it('should emit an update', function () {
expect(numberOfReceivedUpdates).to.equal(1)
})
it('should send the updated doc lines to the web api', function () {
MockWebApi.setDocument
.calledWith(this.project_id, this.doc_id, this.newLines)
.should.equal(true)
})
it('should flush project history', function () {
MockProjectHistoryApi.flushProject
.calledWith(this.project_id)
.should.equal(true)
})
it('should remove the document from redis', function (done) {
docUpdaterRedis.get(
Keys.docLines({ doc_id: this.doc_id }),
(error, lines) => {
if (error) {
throw error
}
expect(lines).to.not.exist
done()
}
)
})
it('should return the mongo rev in the json response', function () {
this.body.should.deep.equal({ rev: '123' })
})
})
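// Two distinct oversize failure modes: the JSON body parser rejects the
// request outright (413), while a doc that parses but exceeds
// max_doc_length is refused by the HTTP controller (406).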
const DOC_TOO_LARGE_TEST_CASES = [
{
desc: 'when the updated doc is too large for the body parser',
size: Settings.maxJsonRequestSize,
expectedStatusCode: 413,
},
{
desc: 'when the updated doc is larger than the HTTP controller limit',
size: Settings.max_doc_length,
expectedStatusCode: 406,
},
]
DOC_TOO_LARGE_TEST_CASES.forEach(testCase => {
describe(testCase.desc, function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
this.newLines = []
while (JSON.stringify(this.newLines).length <= testCase.size) {
this.newLines.push('(a long line of text)'.repeat(10000))
}
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.newLines,
this.source,
this.user_id,
false,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
setTimeout(done, 200)
}
)
})
after(function () {
MockProjectHistoryApi.flushProject.resetHistory()
MockWebApi.setDocument.resetHistory()
})
it(`should return a ${testCase.expectedStatusCode} status code`, function () {
this.statusCode.should.equal(testCase.expectedStatusCode)
})
it('should not send the updated doc lines to the web api', function () {
MockWebApi.setDocument.called.should.equal(false)
})
it('should not flush project history', function () {
MockProjectHistoryApi.flushProject.called.should.equal(false)
})
})
})
describe('when the updated doc is large but under the bodyParser and HTTPController size limit', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
this.newLines = []
while (JSON.stringify(this.newLines).length < 2 * 1024 * 1024) {
// limit in HTTPController
this.newLines.push('(a long line of text)'.repeat(10000))
}
this.newLines.pop() // remove the line which took it over the limit
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.newLines,
this.source,
this.user_id,
false,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
this.body = body
setTimeout(done, 200)
}
)
})
after(function () {
MockProjectHistoryApi.flushProject.resetHistory()
MockWebApi.setDocument.resetHistory()
})
it('should return a 200 status code', function () {
this.statusCode.should.equal(200)
})
it('should send the updated doc lines to the web api', function () {
MockWebApi.setDocument
.calledWith(this.project_id, this.doc_id, this.newLines)
.should.equal(true)
})
it('should return the mongo rev in the json response', function () {
this.body.should.deep.equal({ rev: '123' })
})
})
describe('with track changes', function () {
before(function () {
this.lines = ['one', 'one and a half', 'two', 'three']
this.id_seed = '587357bd35e64f6157'
this.update = {
doc: this.doc_id,
op: [
{
d: 'one and a half\n',
p: 4,
},
],
meta: {
tc: this.id_seed,
user_id: this.user_id,
},
v: this.version,
}
})
describe('with the undo flag', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error) {
throw error
}
// Go back to old lines, with undo flag
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.lines,
this.source,
this.user_id,
true,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
setTimeout(done, 200)
}
)
}
)
})
})
after(function () {
MockProjectHistoryApi.flushProject.resetHistory()
MockWebApi.setDocument.resetHistory()
})
it('should undo the tracked changes', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, data) => {
if (error) {
throw error
}
const { ranges } = data
expect(ranges.changes).to.be.undefined
done()
}
)
})
})
describe('without the undo flag', function () {
before(function (done) {
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
version: this.version,
})
DocUpdaterClient.preloadDoc(this.project_id, this.doc_id, error => {
if (error) {
throw error
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
this.update,
error => {
if (error) {
throw error
}
// Go back to old lines, without undo flag
DocUpdaterClient.setDocLines(
this.project_id,
this.doc_id,
this.lines,
this.source,
this.user_id,
false,
(error, res, body) => {
if (error) {
return done(error)
}
this.statusCode = res.statusCode
setTimeout(done, 200)
}
)
}
)
})
})
after(function () {
MockProjectHistoryApi.flushProject.resetHistory()
MockWebApi.setDocument.resetHistory()
})
it('should not undo the tracked changes', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, data) => {
if (error) {
throw error
}
const { ranges } = data
expect(ranges.changes.length).to.equal(1)
done()
}
)
})
})
})
})

View File

@@ -0,0 +1,194 @@
const { expect } = require('chai')
const Settings = require('@overleaf/settings')
const MockWebApi = require('./helpers/MockWebApi')
const DocUpdaterClient = require('./helpers/DocUpdaterClient')
const DocUpdaterApp = require('./helpers/DocUpdaterApp')
describe('SizeChecks', function () {
before(function (done) {
DocUpdaterApp.ensureRunning(done)
})
beforeEach(function () {
this.version = 0
this.update = {
doc: this.doc_id,
op: [
{
i: 'insert some more lines that will bring it above the limit\n',
p: 42,
},
],
v: this.version,
}
this.project_id = DocUpdaterClient.randomId()
this.doc_id = DocUpdaterClient.randomId()
})
describe('when a doc is above the doc size limit already', function () {
beforeEach(function () {
this.lines = ['x'.repeat(Settings.max_doc_length)] // including the extra newline, this will be over the limit
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
v: this.version,
})
})
it('should error when fetching the doc', function (done) {
DocUpdaterClient.getDoc(this.project_id, this.doc_id, (error, res) => {
if (error) return done(error)
expect(res.statusCode).to.equal(500)
done()
})
})
describe('when trying to update', function () {
beforeEach(function (done) {
const update = {
doc: this.doc_id,
op: this.update.op,
v: this.version,
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should still error when fetching the doc', function (done) {
DocUpdaterClient.getDoc(this.project_id, this.doc_id, (error, res) => {
if (error) return done(error)
expect(res.statusCode).to.equal(500)
done()
})
})
})
})
describe('when the stringified JSON is above the doc size limit but the doc character count is not', function () {
beforeEach(function () {
let charsRemaining = Settings.max_doc_length
this.lines = []
// Take the maximum allowed doc length and split it into N lines of 63 characters + a newline.
// The character count will be exactly max_doc_length
// The JSON stringified size will exceed max_doc_length, due to the JSON formatting of the array.
// This document should be allowed, because we use the character count as the limit, not the JSON size.
while (charsRemaining > 0) {
const charsToAdd = Math.min(charsRemaining - 1, 63) // allow for the additional newline
this.lines.push('x'.repeat(charsToAdd))
charsRemaining -= charsToAdd + 1
}
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
v: this.version,
})
})
it('should be able to fetch the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
expect(doc.lines).to.deep.equal(this.lines)
done()
}
)
})
describe('when trying to update', function () {
beforeEach(function (done) {
const update = {
doc: this.doc_id,
op: this.update.op,
v: this.version,
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should not update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
expect(doc.lines).to.deep.equal(this.lines)
done()
}
)
})
})
})
describe('when a doc is just below the doc size limit', function () {
beforeEach(function () {
this.lines = ['x'.repeat(Settings.max_doc_length - 1)] // character count is exactly max_doc_length after including the newline
MockWebApi.insertDoc(this.project_id, this.doc_id, {
lines: this.lines,
v: this.version,
})
})
it('should be able to fetch the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
expect(doc.lines).to.deep.equal(this.lines)
done()
}
)
})
describe('when trying to update', function () {
beforeEach(function (done) {
const update = {
doc: this.doc_id,
op: this.update.op,
v: this.version,
}
DocUpdaterClient.sendUpdate(
this.project_id,
this.doc_id,
update,
error => {
if (error != null) {
throw error
}
setTimeout(done, 200)
}
)
})
it('should not update the doc', function (done) {
DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, doc) => {
if (error) return done(error)
expect(doc.lines).to.deep.equal(this.lines)
done()
}
)
})
})
})
})

View File

@@ -0,0 +1,42 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const app = require('../../../../app')
module.exports = {
running: false,
initing: false,
callbacks: [],
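// Start the app at most once; callers arriving while it is still booting
// are queued and invoked together once the server is listening.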
ensureRunning(callback) {
if (callback == null) {
callback = function () {}
}
if (this.running) {
return callback()
} else if (this.initing) {
return this.callbacks.push(callback)
}
this.initing = true
this.callbacks.push(callback)
app.listen(3003, '127.0.0.1', error => {
if (error != null) {
throw error
}
this.running = true
for (const cb of this.callbacks) {
cb()
}
})
},
}

View File

@@ -0,0 +1,246 @@
let DocUpdaterClient
const Settings = require('@overleaf/settings')
const _ = require('lodash')
const rclient = require('@overleaf/redis-wrapper').createClient(
Settings.redis.documentupdater
)
const keys = Settings.redis.documentupdater.key_schema
const request = require('request').defaults({ jar: false })
const async = require('async')
const rclientSub = require('@overleaf/redis-wrapper').createClient(
Settings.redis.pubsub
)
rclientSub.subscribe('applied-ops')
rclientSub.setMaxListeners(0)
module.exports = DocUpdaterClient = {
randomId() {
let str = ''
for (let i = 0; i < 24; i++) {
str += Math.floor(Math.random() * 16).toString(16)
}
return str
},
subscribeToAppliedOps(callback) {
rclientSub.on('message', callback)
},
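// Pending updates are sharded across `dispatcherCount` Redis lists; shard 0
// keeps the legacy unsuffixed key name.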
_getPendingUpdateListKey() {
const shard = _.random(0, Settings.dispatcherCount - 1)
if (shard === 0) {
return 'pending-updates-list'
} else {
return `pending-updates-list-${shard}`
}
},
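// Presumably mirrors the production enqueue path: push the op onto the
// doc's pending list, record the doc in the pending set, then notify one
// dispatcher shard.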
sendUpdate(projectId, docId, update, callback) {
rclient.rpush(
keys.pendingUpdates({ doc_id: docId }),
JSON.stringify(update),
error => {
if (error) {
return callback(error)
}
const docKey = `${projectId}:${docId}`
rclient.sadd('DocsWithPendingUpdates', docKey, error => {
if (error) {
return callback(error)
}
rclient.rpush(
DocUpdaterClient._getPendingUpdateListKey(),
docKey,
callback
)
})
}
)
},
sendUpdates(projectId, docId, updates, callback) {
DocUpdaterClient.preloadDoc(projectId, docId, error => {
if (error) {
return callback(error)
}
const jobs = updates.map(update => callback => {
DocUpdaterClient.sendUpdate(projectId, docId, update, callback)
})
async.series(jobs, err => {
if (err) {
return callback(err)
}
DocUpdaterClient.waitForPendingUpdates(projectId, docId, callback)
})
})
},
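// Poll until the doc's pending-updates list drains (up to 30 tries at
// 100ms intervals), so tests can assert on fully applied state.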
waitForPendingUpdates(projectId, docId, callback) {
async.retry(
{ times: 30, interval: 100 },
cb =>
rclient.llen(keys.pendingUpdates({ doc_id: docId }), (err, length) => {
if (err) {
return cb(err)
}
if (length > 0) {
cb(new Error('updates still pending'))
} else {
cb()
}
}),
callback
)
},
getDoc(projectId, docId, callback) {
request.get(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}`,
(error, res, body) => {
if (body != null && res.statusCode >= 200 && res.statusCode < 300) {
body = JSON.parse(body)
}
callback(error, res, body)
}
)
},
getDocAndRecentOps(projectId, docId, fromVersion, callback) {
request.get(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}?fromVersion=${fromVersion}`,
(error, res, body) => {
if (body != null && res.statusCode >= 200 && res.statusCode < 300) {
body = JSON.parse(body)
}
callback(error, res, body)
}
)
},
getProjectLastUpdatedAt(projectId, callback) {
request.get(
`http://127.0.0.1:3003/project/${projectId}/last_updated_at`,
(error, res, body) => {
if (body != null && res.statusCode >= 200 && res.statusCode < 300) {
body = JSON.parse(body)
}
callback(error, res, body)
}
)
},
preloadDoc(projectId, docId, callback) {
DocUpdaterClient.getDoc(projectId, docId, callback)
},
peekDoc(projectId, docId, callback) {
request.get(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}/peek`,
(error, res, body) => {
if (body != null && res.statusCode >= 200 && res.statusCode < 300) {
body = JSON.parse(body)
}
callback(error, res, body)
}
)
},
flushDoc(projectId, docId, callback) {
request.post(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}/flush`,
(error, res, body) => callback(error, res, body)
)
},
setDocLines(projectId, docId, lines, source, userId, undoing, callback) {
request.post(
{
url: `http://127.0.0.1:3003/project/${projectId}/doc/${docId}`,
json: {
lines,
source,
user_id: userId,
undoing,
},
},
(error, res, body) => callback(error, res, body)
)
},
deleteDoc(projectId, docId, callback) {
request.del(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}`,
(error, res, body) => callback(error, res, body)
)
},
flushProject(projectId, callback) {
request.post(`http://127.0.0.1:3003/project/${projectId}/flush`, callback)
},
deleteProject(projectId, callback) {
request.del(`http://127.0.0.1:3003/project/${projectId}`, callback)
},
deleteProjectOnShutdown(projectId, callback) {
request.del(
`http://127.0.0.1:3003/project/${projectId}?background=true&shutdown=true`,
callback
)
},
flushOldProjects(callback) {
request.get(
'http://127.0.0.1:3003/flush_queued_projects?min_delete_age=1',
callback
)
},
acceptChange(projectId, docId, changeId, callback) {
request.post(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}/change/${changeId}/accept`,
callback
)
},
acceptChanges(projectId, docId, changeIds, callback) {
request.post(
{
url: `http://127.0.0.1:3003/project/${projectId}/doc/${docId}/change/accept`,
json: { change_ids: changeIds },
},
callback
)
},
removeComment(projectId, docId, comment, callback) {
request.del(
`http://127.0.0.1:3003/project/${projectId}/doc/${docId}/comment/${comment}`,
callback
)
},
getProjectDocs(projectId, projectStateHash, callback) {
request.get(
`http://127.0.0.1:3003/project/${projectId}/doc?state=${projectStateHash}`,
(error, res, body) => {
if (body != null && res.statusCode >= 200 && res.statusCode < 300) {
body = JSON.parse(body)
}
callback(error, res, body)
}
)
},
sendProjectUpdate(projectId, userId, updates, version, callback) {
request.post(
{
url: `http://127.0.0.1:3003/project/${projectId}`,
json: { userId, updates, version },
},
(error, res, body) => callback(error, res, body)
)
},
}

View File

@@ -0,0 +1,111 @@
const express = require('express')
const bodyParser = require('body-parser')
const app = express()
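// Presumably sized to accept a full 2MB document plus ~64kB of JSON
// overhead, doubled for headroom.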
const MAX_REQUEST_SIZE = 2 * (2 * 1024 * 1024 + 64 * 1024)
const MockDocstoreApi = {
docs: {},
clearDocs() {
this.docs = {}
},
getDoc(projectId, docId) {
return this.docs[`${projectId}:${docId}`]
},
insertDoc(projectId, docId, doc) {
if (doc.version == null) {
doc.version = 0
}
if (doc.lines == null) {
doc.lines = []
}
this.docs[`${projectId}:${docId}`] = doc
},
patchDocument(projectId, docId, meta, callback) {
Object.assign(this.docs[`${projectId}:${docId}`], meta)
callback(null)
},
peekDocument(projectId, docId, callback) {
callback(null, this.docs[`${projectId}:${docId}`])
},
getAllDeletedDocs(projectId, callback) {
callback(
null,
Object.entries(this.docs)
.filter(([key, doc]) => key.startsWith(projectId) && doc.deleted)
.map(([key, doc]) => {
return {
_id: key.split(':')[1],
name: doc.name,
deletedAt: doc.deletedAt,
}
})
)
},
run() {
app.get('/project/:project_id/doc-deleted', (req, res, next) => {
this.getAllDeletedDocs(req.params.project_id, (error, docs) => {
if (error) {
res.sendStatus(500)
} else {
res.json(docs)
}
})
})
app.get('/project/:project_id/doc/:doc_id/peek', (req, res, next) => {
this.peekDocument(
req.params.project_id,
req.params.doc_id,
(error, doc) => {
if (error) {
res.sendStatus(500)
} else if (doc) {
res.json(doc)
} else {
res.sendStatus(404)
}
}
)
})
app.patch(
'/project/:project_id/doc/:doc_id',
bodyParser.json({ limit: MAX_REQUEST_SIZE }),
(req, res, next) => {
MockDocstoreApi.patchDocument(
req.params.project_id,
req.params.doc_id,
req.body,
error => {
if (error) {
res.sendStatus(500)
} else {
res.sendStatus(204)
}
}
)
}
)
app
.listen(3016, error => {
if (error) {
throw error
}
})
.on('error', error => {
console.error('error starting MockDocstoreApi:', error.message)
process.exit(1)
})
},
}
MockDocstoreApi.run()
module.exports = MockDocstoreApi

View File

@@ -0,0 +1,40 @@
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let MockProjectHistoryApi
const express = require('express')
const app = express()
module.exports = MockProjectHistoryApi = {
flushProject(docId, callback) {
if (callback == null) {
callback = function () {}
}
return callback()
},
run() {
app.post('/project/:project_id/flush', (req, res, next) => {
return this.flushProject(req.params.project_id, error => {
if (error != null) {
return res.sendStatus(500)
} else {
return res.sendStatus(204)
}
})
})
return app.listen(3054, error => {
if (error != null) {
throw error
}
})
},
}
MockProjectHistoryApi.run()


@@ -0,0 +1,121 @@
/* eslint-disable
no-return-assign,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
let MockWebApi
const express = require('express')
const bodyParser = require('body-parser')
const app = express()
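// Size limit assumption: room for two payloads of a 2 MB doc plus 64 kB of JSON overhead each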
const MAX_REQUEST_SIZE = 2 * (2 * 1024 * 1024 + 64 * 1024)
module.exports = MockWebApi = {
docs: {},
clearDocs() {
return (this.docs = {})
},
insertDoc(projectId, docId, doc) {
if (doc.version == null) {
doc.version = 0
}
if (doc.lines == null) {
doc.lines = []
}
doc.pathname = '/a/b/c.tex'
return (this.docs[`${projectId}:${docId}`] = doc)
},
setDocument(
projectId,
docId,
lines,
version,
ranges,
lastUpdatedAt,
lastUpdatedBy,
callback
) {
if (callback == null) {
callback = function () {}
}
const doc =
this.docs[`${projectId}:${docId}`] ||
(this.docs[`${projectId}:${docId}`] = {})
doc.lines = lines
doc.version = version
doc.ranges = ranges
doc.pathname = '/a/b/c.tex'
doc.lastUpdatedAt = lastUpdatedAt
doc.lastUpdatedBy = lastUpdatedBy
return callback(null)
},
getDocument(projectId, docId, callback) {
if (callback == null) {
callback = function () {}
}
return callback(null, this.docs[`${projectId}:${docId}`])
},
run() {
app.get('/project/:project_id/doc/:doc_id', (req, res, next) => {
return this.getDocument(
req.params.project_id,
req.params.doc_id,
(error, doc) => {
if (error != null) {
return res.sendStatus(500)
} else if (doc != null) {
return res.send(JSON.stringify(doc))
} else {
return res.sendStatus(404)
}
}
)
})
app.post(
'/project/:project_id/doc/:doc_id',
bodyParser.json({ limit: MAX_REQUEST_SIZE }),
(req, res, next) => {
return MockWebApi.setDocument(
req.params.project_id,
req.params.doc_id,
req.body.lines,
req.body.version,
req.body.ranges,
req.body.lastUpdatedAt,
req.body.lastUpdatedBy,
error => {
if (error != null) {
return res.sendStatus(500)
} else {
return res.json({ rev: '123' })
}
}
)
}
)
return app
.listen(3000, error => {
if (error != null) {
throw error
}
})
.on('error', error => {
console.error('error starting MockWebApi:', error.message)
return process.exit(1)
})
},
}
MockWebApi.run()


@@ -0,0 +1,65 @@
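// Assumed purpose: manual check that BLPOP against a Redis cluster delivers
// pushed counters in order, with no gaps, between two separate clients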
let listenInBackground, sendPings
const redis = require('@overleaf/redis-wrapper')
const rclient1 = redis.createClient({
cluster: [
{
port: '7000',
host: '127.0.0.1',
},
],
})
const rclient2 = redis.createClient({
cluster: [
{
port: '7000',
host: '127.0.0.1',
},
],
})
let counter = 0
const sendPing = function (cb) {
if (cb == null) {
cb = function () {}
}
return rclient1.rpush('test-blpop', counter, error => {
if (error != null) {
console.error('[SENDING ERROR]', error.message)
}
if (error == null) {
counter += 1
}
return cb()
})
}
let previous = null
const listenForPing = cb =>
rclient2.blpop('test-blpop', 200, (error, result) => {
if (error != null) {
return cb(error)
}
let [, value] = Array.from(result)
value = parseInt(value, 10)
if (value % 10 === 0) {
console.log('.')
}
if (previous != null && value !== previous + 1) {
error = new Error(
`Counter not in order. Got ${value}, expected ${previous + 1}`
)
}
previous = value
return cb(error, value)
})
const PING_DELAY = 100
;(sendPings = () => sendPing(() => setTimeout(sendPings, PING_DELAY)))()
;(listenInBackground = () =>
listenForPing(error => {
if (error) {
console.error('[RECEIVING ERROR]', error.message)
}
return setTimeout(listenInBackground)
}))()


@@ -0,0 +1,54 @@
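// Assumed purpose: manual check that Redis cluster pub/sub delivers published
// counters in order, with no gaps, between two separate clients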
let sendPings
const redis = require('@overleaf/redis-wrapper')
const rclient1 = redis.createClient({
cluster: [
{
port: '7000',
host: '127.0.0.1',
},
],
})
const rclient2 = redis.createClient({
cluster: [
{
port: '7000',
host: '127.0.0.1',
},
],
})
let counter = 0
const sendPing = function (cb) {
if (cb == null) {
cb = function () {}
}
return rclient1.publish('test-pubsub', counter, error => {
if (error) {
console.error('[SENDING ERROR]', error.message)
}
if (error == null) {
counter += 1
}
return cb()
})
}
let previous = null
rclient2.subscribe('test-pubsub')
rclient2.on('message', (channel, value) => {
value = parseInt(value, 10)
if (value % 10 === 0) {
console.log('.')
}
if (previous != null && value !== previous + 1) {
console.error(
'[RECEIVING ERROR]',
`Counter not in order. Got ${value}, expected ${previous + 1}`
)
}
return (previous = value)
})
const PING_DELAY = 100
;(sendPings = () => sendPing(() => setTimeout(sendPings, PING_DELAY)))()


@@ -0,0 +1,52 @@
const chai = require('chai')
const chaiAsPromised = require('chai-as-promised')
const sinonChai = require('sinon-chai')
const SandboxedModule = require('sandboxed-module')
const sinon = require('sinon')
// ensure every ObjectId has the id string as a property for correct comparisons
require('mongodb-legacy').ObjectId.cacheHexString = true
// Chai configuration
chai.should()
chai.use(chaiAsPromised)
// Load sinon-chai assertions so expect(stubFn).to.have.been.calledWith('abc')
// has nicer failure messages
chai.use(sinonChai)
// Global stubs
const sandbox = sinon.createSandbox()
const stubs = {
logger: {
debug: sandbox.stub(),
log: sandbox.stub(),
warn: sandbox.stub(),
err: sandbox.stub(),
error: sandbox.stub(),
},
}
// SandboxedModule configuration
SandboxedModule.configure({
requires: {
'@overleaf/logger': stubs.logger,
'mongodb-legacy': require('mongodb-legacy'), // for ObjectId comparisons
},
globals: { Buffer, JSON, Math, console, process },
sourceTransformers: {
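// Strip the 'node:' prefix from requires, presumably so SandboxedModule
// resolves and stubs core modules consistently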
removeNodePrefix: function (source) {
return source.replace(/require\(['"]node:/g, "require('")
},
},
})
// Mocha hooks
exports.mochaHooks = {
beforeEach() {
this.logger = stubs.logger
},
afterEach() {
sandbox.reset()
},
}


@@ -0,0 +1,387 @@
/* eslint-disable
no-return-assign,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS101: Remove unnecessary use of Array.from
* DS102: Remove unnecessary code created because of implicit returns
* DS202: Simplify dynamic range loops
* DS205: Consider reworking code to avoid use of IIFEs
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const DocUpdaterClient = require('../../acceptance/js/helpers/DocUpdaterClient')
// MockWebApi = require "../../acceptance/js/helpers/MockWebApi"
const assert = require('node:assert')
const async = require('async')
const insert = function (string, pos, content) {
const result = string.slice(0, pos) + content + string.slice(pos)
return result
}
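// Minimal one-way transform for concurrent inserts: if the other op landed
// before ours, shift our insert right by its length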
const transform = function (op1, op2) {
if (op2.p < op1.p) {
return {
p: op1.p + op2.i.length,
i: op1.i,
}
} else {
return op1
}
}
class StressTestClient {
constructor(options) {
if (options == null) {
options = {}
}
this.options = options
if (this.options.updateDelay == null) {
this.options.updateDelay = 200
}
this.project_id = this.options.project_id || DocUpdaterClient.randomId()
this.doc_id = this.options.doc_id || DocUpdaterClient.randomId()
this.pos = this.options.pos || 0
this.content = this.options.content || ''
this.client_id = DocUpdaterClient.randomId()
this.version = this.options.version || 0
this.inflight_op = null
this.charCode = 0
this.counts = {
conflicts: 0,
local_updates: 0,
remote_updates: 0,
max_delay: 0,
}
DocUpdaterClient.subscribeToAppliedOps((channel, update) => {
update = JSON.parse(update)
if (update.error != null) {
console.error(new Error(`Error from server: '${update.error}'`))
return
}
if (update.doc_id === this.doc_id) {
return this.processReply(update)
}
})
}
sendUpdate() {
const data = String.fromCharCode(65 + (this.charCode++ % 26))
this.content = insert(this.content, this.pos, data)
this.inflight_op = {
i: data,
p: this.pos++,
}
this.resendUpdate()
return (this.inflight_op_sent = Date.now())
}
resendUpdate() {
assert(this.inflight_op != null)
DocUpdaterClient.sendUpdate(this.project_id, this.doc_id, {
doc: this.doc_id,
op: [this.inflight_op],
v: this.version,
meta: {
source: this.client_id,
},
dupIfSource: [this.client_id],
})
return (this.update_timer = setTimeout(() => {
console.log(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] WARN: Resending update after 5 seconds`
)
return this.resendUpdate()
}, 5000))
}
processReply(update) {
if (update.op.v !== this.version) {
if (update.op.v < this.version) {
console.log(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] WARN: Duplicate ack (already seen version)`
)
return
} else {
console.error(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] ERROR: Version jumped ahead (client: ${this.version}, op: ${
update.op.v
})`
)
}
}
this.version++
if (update.op.meta.source === this.client_id) {
if (this.inflight_op != null) {
this.counts.local_updates++
this.inflight_op = null
clearTimeout(this.update_timer)
const delay = Date.now() - this.inflight_op_sent
this.counts.max_delay = Math.max(this.counts.max_delay, delay)
return this.continue()
} else {
return console.log(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] WARN: Duplicate ack`
)
}
} else {
assert(update.op.op.length === 1)
this.counts.remote_updates++
let externalOp = update.op.op[0]
if (this.inflight_op != null) {
this.counts.conflicts++
this.inflight_op = transform(this.inflight_op, externalOp)
externalOp = transform(externalOp, this.inflight_op)
}
if (externalOp.p < this.pos) {
this.pos += externalOp.i.length
}
return (this.content = insert(this.content, externalOp.p, externalOp.i))
}
}
continue() {
if (this.updateCount > 0) {
this.updateCount--
return setTimeout(
() => {
return this.sendUpdate()
},
this.options.updateDelay * (0.5 + Math.random())
)
} else {
return this.updateCallback()
}
}
runForNUpdates(n, callback) {
if (callback == null) {
callback = function () {}
}
this.updateCallback = callback
this.updateCount = n
return this.continue()
}
check(callback) {
if (callback == null) {
callback = function () {}
}
return DocUpdaterClient.getDoc(
this.project_id,
this.doc_id,
(error, res, body) => {
if (error != null) {
throw error
}
if (body.lines == null) {
return console.error(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] ERROR: Invalid response from get doc (${this.doc_id})`,
body
)
}
const content = body.lines.join('\n')
const { version } = body
if (content !== this.content) {
if (version === this.version) {
console.error(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] Error: Client content does not match server.`
)
console.error(`Server: ${content.split('a')}`)
console.error(`Client: ${this.content.split('a')}`)
} else {
console.error(
`[${new Date()}] \t[${this.client_id.slice(
0,
4
)}] Error: Version mismatch (Server: '${version}', Client: '${
this.version
}')`
)
}
}
if (!this.isContentValid(this.content)) {
const iterable = this.content.split('')
for (let i = 0; i < iterable.length; i++) {
const chunk = iterable[i]
if (chunk != null && chunk !== 'a') {
console.log(chunk, i)
}
}
throw new Error('bad content')
}
return callback()
}
)
}
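// The doc is seeded with CLIENT_COUNT+1 'a' separators and each client types
// A, B, C, ... into its own slot, so every chunk between 'a's should be an
// in-order alphabetic run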
isChunkValid(chunk) {
for (let i = 0; i < chunk.length; i++) {
const letter = chunk[i]
if (letter.charCodeAt(0) !== 65 + (i % 26)) {
console.error(
`[${new Date()}] \t[${this.client_id.slice(0, 4)}] Invalid Chunk:`,
chunk
)
return false
}
}
return true
}
isContentValid(content) {
for (const chunk of Array.from(content.split('a'))) {
if (chunk != null && chunk !== '') {
if (!this.isChunkValid(chunk)) {
console.error(
`[${new Date()}] \t[${this.client_id.slice(0, 4)}] Invalid content`,
content
)
return false
}
}
}
return true
}
}
const checkDocument = function (projectId, docId, clients, callback) {
if (callback == null) {
callback = function () {}
}
const jobs = clients.map(client => cb => client.check(cb))
return async.parallel(jobs, callback)
}
const printSummary = function (docId, clients) {
const slot = require('cluster-key-slot')
const now = new Date()
console.log(
`[${now}] [${docId.slice(0, 4)} (slot: ${slot(docId)})] ${
clients.length
} clients...`
)
return (() => {
const result = []
for (const client of Array.from(clients)) {
console.log(
`[${now}] \t[${client.client_id.slice(0, 4)}] { local: ${
client.counts.local_updates
}, remote: ${client.counts.remote_updates}, conflicts: ${
client.counts.conflicts
}, max_delay: ${client.counts.max_delay} }`
)
result.push(
(client.counts = {
local_updates: 0,
remote_updates: 0,
conflicts: 0,
max_delay: 0,
})
)
}
return result
})()
}
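// CLI arguments: <client_count> <update_delay_ms> <sample_interval_ms> <project_id:doc_id> ...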
const CLIENT_COUNT = parseInt(process.argv[2], 10)
const UPDATE_DELAY = parseInt(process.argv[3], 10)
const SAMPLE_INTERVAL = parseInt(process.argv[4], 10)
for (const docAndProjectId of Array.from(process.argv.slice(5))) {
;(function (docAndProjectId) {
const [projectId, docId] = Array.from(docAndProjectId.split(':'))
console.log({ projectId, docId })
return DocUpdaterClient.setDocLines(
projectId,
docId,
[new Array(CLIENT_COUNT + 2).join('a')],
null,
null,
error => {
if (error != null) {
throw error
}
return DocUpdaterClient.getDoc(projectId, docId, (error, res, body) => {
let runBatch
if (error != null) {
throw error
}
if (body.lines == null) {
return console.error(
`[${new Date()}] ERROR: Invalid response from get doc (${docId})`,
body
)
}
const content = body.lines.join('\n')
const { version } = body
const clients = []
for (
let pos = 1, end = CLIENT_COUNT, asc = end >= 1;
asc ? pos <= end : pos >= end;
asc ? pos++ : pos--
) {
;(function (pos) {
const client = new StressTestClient({
doc_id: docId,
project_id: projectId,
content,
pos,
version,
updateDelay: UPDATE_DELAY,
})
return clients.push(client)
})(pos)
}
return (runBatch = function () {
const jobs = clients.map(
client => cb =>
client.runForNUpdates(SAMPLE_INTERVAL / UPDATE_DELAY, cb)
)
return async.parallel(jobs, error => {
if (error != null) {
throw error
}
printSummary(docId, clients)
return checkDocument(projectId, docId, clients, error => {
if (error != null) {
throw error
}
return runBatch()
})
})
})()
})
}
)
})(docAndProjectId)
}


@@ -0,0 +1,57 @@
const sinon = require('sinon')
const { expect } = require('chai')
const modulePath = '../../../../app/js/DiffCodec.js'
const SandboxedModule = require('sandboxed-module')
describe('DiffCodec', function () {
beforeEach(function () {
this.callback = sinon.stub()
this.DiffCodec = SandboxedModule.require(modulePath)
})
describe('diffAsShareJsOp', function () {
it('should insert new text correctly', function () {
this.before = ['hello world']
this.after = ['hello beautiful world']
const ops = this.DiffCodec.diffAsShareJsOp(this.before, this.after)
expect(ops).to.deep.equal([
{
i: 'beautiful ',
p: 6,
},
])
})
it('should shift later inserts by previous inserts', function () {
this.before = ['the boy played with the ball']
this.after = ['the tall boy played with the red ball']
const ops = this.DiffCodec.diffAsShareJsOp(this.before, this.after)
expect(ops).to.deep.equal([
{ i: 'tall ', p: 4 },
{ i: 'red ', p: 29 },
])
})
it('should delete text correctly', function () {
this.before = ['hello beautiful world']
this.after = ['hello world']
const ops = this.DiffCodec.diffAsShareJsOp(this.before, this.after)
expect(ops).to.deep.equal([
{
d: 'beautiful ',
p: 6,
},
])
})
it('should shift later deletes by the first deletes', function () {
this.before = ['the tall boy played with the red ball']
this.after = ['the boy played with the ball']
const ops = this.DiffCodec.diffAsShareJsOp(this.before, this.after)
expect(ops).to.deep.equal([
{ d: 'tall ', p: 4 },
{ d: 'red ', p: 24 },
])
})
})
})


@@ -0,0 +1,198 @@
/* eslint-disable
no-return-assign,
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS207: Consider shorter variations of null checks
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const modulePath = '../../../../app/js/DispatchManager.js'
const SandboxedModule = require('sandboxed-module')
const Errors = require('../../../../app/js/Errors.js')
describe('DispatchManager', function () {
beforeEach(function () {
let Timer
this.timeout(3000)
this.DispatchManager = SandboxedModule.require(modulePath, {
requires: {
'./UpdateManager': (this.UpdateManager = {}),
'@overleaf/settings': (this.settings = {
redis: {
documentupdater: {},
},
}),
'@overleaf/redis-wrapper': (this.redis = {}),
'./RateLimitManager': {},
'./Errors': Errors,
'./Metrics': (this.Metrics = {
Timer: (Timer = (function () {
Timer = class Timer {
static initClass() {
this.prototype.done = sinon.stub()
}
}
Timer.initClass()
return Timer
})()),
}),
},
})
this.callback = sinon.stub()
// Stub rate limiter: run each task immediately, without rate limiting
return (this.RateLimiter = {
run(task, cb) {
return task(cb)
},
})
})
return describe('each worker', function () {
beforeEach(function () {
this.client = { auth: sinon.stub() }
this.redis.createClient = sinon.stub().returns(this.client)
return (this.worker = this.DispatchManager.createDispatcher(
this.RateLimiter,
0
))
})
it('should create a new redis client', function () {
return this.redis.createClient.called.should.equal(true)
})
describe('_waitForUpdateThenDispatchWorker', function () {
beforeEach(function () {
this.project_id = 'project-id-123'
this.doc_id = 'doc-id-123'
this.doc_key = `${this.project_id}:${this.doc_id}`
return (this.client.blpop = sinon
.stub()
.callsArgWith(2, null, ['pending-updates-list', this.doc_key]))
})
describe('in the normal case', function () {
beforeEach(function () {
this.UpdateManager.processOutstandingUpdatesWithLock = sinon
.stub()
.callsArg(2)
return this.worker._waitForUpdateThenDispatchWorker(this.callback)
})
it('should call redis with BLPOP', function () {
return this.client.blpop
.calledWith('pending-updates-list', 0)
.should.equal(true)
})
it('should call processOutstandingUpdatesWithLock', function () {
return this.UpdateManager.processOutstandingUpdatesWithLock
.calledWith(this.project_id, this.doc_id)
.should.equal(true)
})
it('should not log any errors', function () {
this.logger.error.called.should.equal(false)
return this.logger.warn.called.should.equal(false)
})
return it('should call the callback', function () {
return this.callback.called.should.equal(true)
})
})
describe('with an error', function () {
beforeEach(function () {
this.UpdateManager.processOutstandingUpdatesWithLock = sinon
.stub()
.callsArgWith(2, new Error('a generic error'))
return this.worker._waitForUpdateThenDispatchWorker(this.callback)
})
it('should log an error', function () {
return this.logger.error.called.should.equal(true)
})
return it('should call the callback', function () {
return this.callback.called.should.equal(true)
})
})
describe("with a 'Delete component' error", function () {
beforeEach(function () {
this.UpdateManager.processOutstandingUpdatesWithLock = sinon
.stub()
.callsArgWith(2, new Errors.DeleteMismatchError())
return this.worker._waitForUpdateThenDispatchWorker(this.callback)
})
it('should log a debug message', function () {
return this.logger.debug.called.should.equal(true)
})
return it('should call the callback', function () {
return this.callback.called.should.equal(true)
})
})
describe('pending updates list with shard key', function () {
beforeEach(function (done) {
this.client = {
auth: sinon.stub(),
blpop: sinon.stub().callsArgWith(2),
}
this.redis.createClient = sinon.stub().returns(this.client)
this.queueShardNumber = 7
this.worker = this.DispatchManager.createDispatcher(
this.RateLimiter,
this.queueShardNumber
)
this.worker._waitForUpdateThenDispatchWorker(done)
})
it('should call redis with BLPOP with the correct key', function () {
this.client.blpop
.calledWith(`pending-updates-list-${this.queueShardNumber}`, 0)
.should.equal(true)
})
})
})
return describe('run', function () {
return it('should call _waitForUpdateThenDispatchWorker until shutting down', function (done) {
let callCount = 0
this.worker._waitForUpdateThenDispatchWorker = callback => {
if (callback == null) {
callback = function () {}
}
callCount++
if (callCount === 3) {
this.settings.shuttingDown = true
}
return setTimeout(() => callback(), 10)
}
sinon.spy(this.worker, '_waitForUpdateThenDispatchWorker')
this.worker.run()
const checkStatus = () => {
if (!this.settings.shuttingDown) {
// retry until shutdown
setTimeout(checkStatus, 100)
} else {
this.worker._waitForUpdateThenDispatchWorker.callCount.should.equal(
3
)
return done()
}
}
return checkStatus()
})
})
})
})

File diff suppressed because it is too large.


@@ -0,0 +1,117 @@
const _ = require('lodash')
const { expect } = require('chai')
const HistoryConversions = require('../../../app/js/HistoryConversions')
describe('HistoryConversions', function () {
describe('toHistoryRanges', function () {
it('handles empty ranges', function () {
expect(HistoryConversions.toHistoryRanges({})).to.deep.equal({})
})
it("doesn't modify comments when there are no tracked changes", function () {
const ranges = {
comments: [makeComment('comment1', 5, 12)],
}
const historyRanges = HistoryConversions.toHistoryRanges(ranges)
expect(historyRanges).to.deep.equal(ranges)
})
it('adjusts comments and tracked changes to account for tracked deletes', function () {
const comments = [
makeComment('comment0', 0, 1),
makeComment('comment1', 10, 12),
makeComment('comment2', 20, 10),
makeComment('comment3', 15, 3),
]
const changes = [
makeTrackedDelete('change0', 2, 5),
makeTrackedInsert('change1', 4, 5),
makeTrackedDelete('change2', 10, 10),
makeTrackedDelete('change3', 21, 6),
makeTrackedDelete('change4', 50, 7),
]
const ranges = { comments, changes }
const historyRanges = HistoryConversions.toHistoryRanges(ranges)
expect(historyRanges.comments).to.have.deep.members([
comments[0],
// shifted by change0 and change2, extended by change3
enrichOp(comments[1], {
hpos: 25, // 10 + 5 + 10
hlen: 18, // 12 + 6
}),
// shifted by change0 and change2, extended by change3
enrichOp(comments[2], {
hpos: 35, // 20 + 5 + 10
hlen: 16, // 10 + 6
}),
// shifted by change0 and change2
enrichOp(comments[3], {
hpos: 30, // 15 + 5 + 10
}),
])
expect(historyRanges.changes).to.deep.equal([
changes[0],
enrichOp(changes[1], {
hpos: 9, // 4 + 5
}),
enrichOp(changes[2], {
hpos: 15, // 10 + 5
}),
enrichOp(changes[3], {
hpos: 36, // 21 + 5 + 10
}),
enrichOp(changes[4], {
hpos: 71, // 50 + 5 + 10 + 6
}),
])
})
})
})
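// Fixture helpers: build comment and tracked-change ops of a given length at a
// given position, with stub metadata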
function makeComment(id, pos, length) {
return {
id,
op: {
c: 'c'.repeat(length),
p: pos,
t: id,
},
metadata: makeMetadata(),
}
}
function makeTrackedInsert(id, pos, length) {
return {
id,
op: {
i: 'i'.repeat(length),
p: pos,
},
metadata: makeMetadata(),
}
}
function makeTrackedDelete(id, pos, length) {
return {
id,
op: {
d: 'd'.repeat(length),
p: pos,
},
metadata: makeMetadata(),
}
}
function makeMetadata() {
return {
user_id: 'user-id',
ts: new Date().toISOString(),
}
}
function enrichOp(commentOrChange, extraFields) {
const result = _.cloneDeep(commentOrChange)
Object.assign(result.op, extraFields)
return result
}


@@ -0,0 +1,291 @@
const SandboxedModule = require('sandboxed-module')
const sinon = require('sinon')
const modulePath = require('node:path').join(
__dirname,
'../../../../app/js/HistoryManager'
)
describe('HistoryManager', function () {
beforeEach(function () {
this.HistoryManager = SandboxedModule.require(modulePath, {
requires: {
request: (this.request = {}),
'@overleaf/settings': (this.Settings = {
apis: {
project_history: {
url: 'http://project_history.example.com',
},
},
}),
'./DocumentManager': (this.DocumentManager = {}),
'./RedisManager': (this.RedisManager = {}),
'./ProjectHistoryRedisManager': (this.ProjectHistoryRedisManager = {}),
'./Metrics': (this.metrics = { inc: sinon.stub() }),
},
})
this.project_id = 'mock-project-id'
this.callback = sinon.stub()
})
describe('flushProjectChangesAsync', function () {
beforeEach(function () {
this.request.post = sinon
.stub()
.callsArgWith(1, null, { statusCode: 204 })
this.HistoryManager.flushProjectChangesAsync(this.project_id)
})
it('should send a request to the project history api', function () {
this.request.post
.calledWith({
url: `${this.Settings.apis.project_history.url}/project/${this.project_id}/flush`,
qs: { background: true },
})
.should.equal(true)
})
})
describe('flushProjectChanges', function () {
describe('in the normal case', function () {
beforeEach(function (done) {
this.request.post = sinon
.stub()
.callsArgWith(1, null, { statusCode: 204 })
this.HistoryManager.flushProjectChanges(
this.project_id,
{
background: true,
},
done
)
})
it('should send a request to the project history api', function () {
this.request.post
.calledWith({
url: `${this.Settings.apis.project_history.url}/project/${this.project_id}/flush`,
qs: { background: true },
})
.should.equal(true)
})
})
describe('with the skip_history_flush option', function () {
beforeEach(function (done) {
this.request.post = sinon.stub()
this.HistoryManager.flushProjectChanges(
this.project_id,
{
skip_history_flush: true,
},
done
)
})
it('should not send a request to the project history api', function () {
this.request.post.called.should.equal(false)
})
})
})
describe('recordAndFlushHistoryOps', function () {
beforeEach(function () {
this.ops = ['mock-ops']
this.project_ops_length = 10
this.HistoryManager.flushProjectChangesAsync = sinon.stub()
})
describe('with no ops', function () {
beforeEach(function () {
this.HistoryManager.recordAndFlushHistoryOps(
this.project_id,
[],
this.project_ops_length
)
})
it('should not flush project changes', function () {
this.HistoryManager.flushProjectChangesAsync.called.should.equal(false)
})
})
describe('with enough ops to flush project changes', function () {
beforeEach(function () {
this.HistoryManager.shouldFlushHistoryOps = sinon.stub()
this.HistoryManager.shouldFlushHistoryOps
.withArgs(this.project_ops_length)
.returns(true)
this.HistoryManager.recordAndFlushHistoryOps(
this.project_id,
this.ops,
this.project_ops_length
)
})
it('should flush project changes', function () {
this.HistoryManager.flushProjectChangesAsync
.calledWith(this.project_id)
.should.equal(true)
})
})
describe('with enough ops to flush doc changes', function () {
beforeEach(function () {
this.HistoryManager.shouldFlushHistoryOps = sinon.stub()
this.HistoryManager.shouldFlushHistoryOps
.withArgs(this.project_ops_length)
.returns(false)
this.HistoryManager.recordAndFlushHistoryOps(
this.project_id,
this.ops,
this.project_ops_length
)
})
it('should not flush project changes', function () {
this.HistoryManager.flushProjectChangesAsync.called.should.equal(false)
})
})
describe('shouldFlushHistoryOps', function () {
it('should return false if the number of ops is not known', function () {
this.HistoryManager.shouldFlushHistoryOps(
null,
['a', 'b', 'c'].length,
1
).should.equal(false)
})
it("should return false if the updates didn't take us past the threshold", function () {
// Currently there are 14 ops
// Previously we were on 11 ops
// We didn't pass over a multiple of 5
this.HistoryManager.shouldFlushHistoryOps(
14,
['a', 'b', 'c'].length,
5
).should.equal(false)
it('should return true if the updates took to the threshold', function () {})
// Currently there are 15 ops
// Previously we were on 12 ops
// We've reached a new multiple of 5
this.HistoryManager.shouldFlushHistoryOps(
15,
['a', 'b', 'c'].length,
5
).should.equal(true)
})
it('should return true if the updates took past the threshold', function () {
// Currently there are 19 ops
// Previously we were on 16 ops
// We didn't pass over a multiple of 5
this.HistoryManager.shouldFlushHistoryOps(
17,
['a', 'b', 'c'].length,
5
).should.equal(true)
})
})
})
describe('resyncProjectHistory', function () {
beforeEach(function () {
this.projectHistoryId = 'history-id-1234'
this.doc_id = 'mock-doc-id'
this.docs = [
{
doc: this.doc_id,
path: 'main.tex',
},
]
this.files = [
{
file: 'mock-file-id',
path: 'universe.png',
url: `www.filestore.test/${this.project_id}/mock-file-id`,
},
]
this.ProjectHistoryRedisManager.queueResyncProjectStructure = sinon
.stub()
.yields()
this.DocumentManager.resyncDocContentsWithLock = sinon.stub().yields()
})
describe('full sync', function () {
beforeEach(function () {
this.HistoryManager.resyncProjectHistory(
this.project_id,
this.projectHistoryId,
this.docs,
this.files,
{},
this.callback
)
})
it('should queue a project structure resync', function () {
this.ProjectHistoryRedisManager.queueResyncProjectStructure
.calledWith(
this.project_id,
this.projectHistoryId,
this.docs,
this.files
)
.should.equal(true)
})
it('should queue doc content resyncs', function () {
this.DocumentManager.resyncDocContentsWithLock
.calledWith(this.project_id, this.docs[0].doc, this.docs[0].path)
.should.equal(true)
})
it('should call the callback', function () {
this.callback.called.should.equal(true)
})
})
describe('resyncProjectStructureOnly=true', function () {
beforeEach(function () {
this.HistoryManager.resyncProjectHistory(
this.project_id,
this.projectHistoryId,
this.docs,
this.files,
{ resyncProjectStructureOnly: true },
this.callback
)
})
it('should queue a project structure resync', function () {
this.ProjectHistoryRedisManager.queueResyncProjectStructure
.calledWith(
this.project_id,
this.projectHistoryId,
this.docs,
this.files,
{ resyncProjectStructureOnly: true }
)
.should.equal(true)
})
it('should not queue doc content resyncs', function () {
this.DocumentManager.resyncDocContentsWithLock.called.should.equal(
false
)
})
it('should call the callback', function () {
this.callback.called.should.equal(true)
})
})
})
})

File diff suppressed because it is too large.


@@ -0,0 +1,84 @@
const { expect } = require('chai')
const modulePath = '../../../../app/js/Limits.js'
const SandboxedModule = require('sandboxed-module')
describe('Limits', function () {
beforeEach(function () {
return (this.Limits = SandboxedModule.require(modulePath))
})
describe('getTotalSizeOfLines', function () {
it('should compute the character count for a document with multiple lines', function () {
const count = this.Limits.getTotalSizeOfLines(['123', '4567'])
expect(count).to.equal(9)
})
it('should compute the character count for a document with a single line', function () {
const count = this.Limits.getTotalSizeOfLines(['123'])
expect(count).to.equal(4)
})
it('should compute the character count for an empty document', function () {
const count = this.Limits.getTotalSizeOfLines([])
expect(count).to.equal(0)
})
})
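// docIsTooLarge(estimatedSize, lines, limit) appears to use the cheap estimate
// as a first pass, then falls back to an exact count (characters plus one
// newline per line) before declaring a doc too large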
describe('docIsTooLarge', function () {
describe('when the estimated size is below the limit', function () {
it('should return false when the estimated size is below the limit', function () {
const result = this.Limits.docIsTooLarge(128, ['hello', 'world'], 1024)
expect(result).to.be.false
})
})
describe('when the estimated size is at the limit', function () {
it('should return false when the estimated size is at the limit', function () {
const result = this.Limits.docIsTooLarge(1024, ['hello', 'world'], 1024)
expect(result).to.be.false
})
})
describe('when the estimated size is above the limit', function () {
it('should return false when the actual character count is below the limit', function () {
const result = this.Limits.docIsTooLarge(2048, ['hello', 'world'], 1024)
expect(result).to.be.false
})
it('should return false when the actual character count is at the limit', function () {
const result = this.Limits.docIsTooLarge(2048, ['x'.repeat(1023)], 1024)
expect(result).to.be.false
})
it('should return true when the actual character count is above the limit by 1', function () {
const count = this.Limits.docIsTooLarge(2048, ['x'.repeat(1024)], 1024)
expect(count).to.be.true
})
it('should return true when the actual character count is above the limit', function () {
const count = this.Limits.docIsTooLarge(2048, ['x'.repeat(2000)], 1024)
expect(count).to.be.true
})
})
describe('when the document has many lines', function () {
it('should return false when the actual character count is below the limit', function () {
const count = this.Limits.docIsTooLarge(
2048,
'1234567890'.repeat(100).split('0'),
1024
)
expect(count).to.be.false
})
it('should return true when the actual character count is above the limit', function () {
const count = this.Limits.docIsTooLarge(
2048,
'1234567890'.repeat(2000).split('0'),
1024
)
expect(count).to.be.true
})
})
})
})


@@ -0,0 +1,65 @@
/* eslint-disable
no-unused-vars,
*/
// TODO: This file was created by bulk-decaffeinate.
// Fix any style issues and re-enable lint.
/*
* decaffeinate suggestions:
* DS102: Remove unnecessary code created because of implicit returns
* DS206: Consider reworking classes to avoid initClass
* Full docs: https://github.com/decaffeinate/decaffeinate/blob/master/docs/suggestions.md
*/
const sinon = require('sinon')
const assert = require('node:assert')
const path = require('node:path')
const modulePath = path.join(__dirname, '../../../../app/js/LockManager.js')
const projectId = 1234
const docId = 5678
const blockingKey = `Blocking:${docId}`
const SandboxedModule = require('sandboxed-module')
describe('LockManager - checking the lock', function () {
let Profiler
const existsStub = sinon.stub()
const mocks = {
'@overleaf/redis-wrapper': {
createClient() {
return {
auth() {},
exists: existsStub,
}
},
},
'@overleaf/metrics': { inc() {} },
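// Stub Profiler matching the shape LockManager expects: log() returns an
// object with end(), and end() finishes the profile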
'./Profiler': (Profiler = (function () {
Profiler = class Profiler {
static initClass() {
this.prototype.log = sinon.stub().returns({ end: sinon.stub() })
this.prototype.end = sinon.stub()
}
}
Profiler.initClass()
return Profiler
})()),
}
const LockManager = SandboxedModule.require(modulePath, { requires: mocks })
it('should return true if the key does not exist', function (done) {
existsStub.yields(null, '0')
return LockManager.checkLock(docId, (err, free) => {
if (err) return done(err)
free.should.equal(true)
return done()
})
})
return it('should return false if the key exists', function (done) {
existsStub.yields(null, '1')
return LockManager.checkLock(docId, (err, free) => {
if (err) return done(err)
free.should.equal(false)
return done()
})
})
})

Some files were not shown because too many files have changed in this diff.